Numerai model training in the cloud 🤖
numermatic is a platform for training Numerai models in the cloud. It provides a free-to-use API hosted in your own AWS account.
To work with the scripts in bin, you'll need to have the following installed:
- jq - version jq-1.6
- AWS CLI - version aws-cli/1.19.53 Python/3.8.10 Linux/5.11.0-36-generic botocore/1.20.53
- SAM CLI - version SAM CLI, version 1.22.0
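Before running the helper scripts, it can be handy to confirm that the three tools are on your PATH. The following is a minimal sketch, not part of this repository: it assumes the executables are named jq, aws, and sam, and the function name find_missing_tools is hypothetical. It only checks for presence and does not compare versions.

```python
import shutil

# Executable names assumed for the three tools listed above.
REQUIRED_TOOLS = ("jq", "aws", "sam")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default is enough.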
Make sure you have created both a "data" bucket and an "artifacts" bucket before you run the following commands. The "data" bucket should have the following structure and contain the Numerai data files.
bucket/
|_ training/
|  |_ numerai_training_data.csv
|_ transform/
   |_ numerai_tournament_data.csv
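The layout above can be checked before launching the stack. This is a rough sketch under a few assumptions: the two object keys are read directly from the structure shown, the helper names missing_data_keys and list_bucket_keys are hypothetical, and the listing helper needs boto3 plus working AWS credentials.

```python
# Object keys taken from the bucket layout shown above.
EXPECTED_KEYS = (
    "training/numerai_training_data.csv",
    "transform/numerai_tournament_data.csv",
)


def missing_data_keys(existing_keys, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent from `existing_keys`."""
    existing = set(existing_keys)
    return [key for key in expected if key not in existing]


def list_bucket_keys(bucket_name):
    """List every object key in a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the pure check above has no dependencies

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket_name):
        # "Contents" is absent from a page when the bucket is empty.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

For example, `missing_data_keys(list_bucket_keys("my-data-bucket"))` returns an empty list when both files are in place, where "my-data-bucket" is a placeholder for your own bucket name.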
Clone this repository and add a samconfig.toml file to the repository root directory. See etc/examples/samconfig.toml for an example.
After that, run the helper scripts below to set up the resources in your AWS account.
./bin/build_stack
./bin/launch_stack
Add a user to the application and get a user ID token.
./bin/add_user <email>
Upload a zipped model file to the application and begin a training execution. The ZIP file should contain both a model.py file and a requirements.txt file.
./bin/upload_model <user_id_token> <model_file_path>
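One way to produce the ZIP file for the upload command is sketched below. It is an illustration rather than part of this repository: the function name build_model_zip is hypothetical, and placing both files at the root of the archive is an assumption, since the expected internal layout is not documented here.

```python
import zipfile
from pathlib import Path

# The two files the application expects inside the ZIP.
REQUIRED_FILES = ("model.py", "requirements.txt")


def build_model_zip(source_dir, zip_path):
    """Bundle model.py and requirements.txt from `source_dir` into `zip_path`."""
    source_dir = Path(source_dir)
    missing = [name for name in REQUIRED_FILES if not (source_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing required file(s): {', '.join(missing)}")

    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in REQUIRED_FILES:
            # Store each file at the archive root (assumed layout).
            archive.write(source_dir / name, arcname=name)
    return Path(zip_path)
```

The returned path can then be passed as the `<model_file_path>` argument of ./bin/upload_model.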
⚠️ This application is still in development.
The "transform" step of the training flow is currently failing and is a help-wanted issue. The models and end Lambda functions still need to be wrapped up and incorporated into the full execution flow. Not all of the Lambdas have been comprehensively tested yet.
- Send an email notification with the presigned URLs for the trained model when training completes.
- Replace the hard-coded XGBoost with a custom image to accept more frameworks.
- Improve user experience with a UI element or simplified helper scripts.
- TBD
These are likely dependent on Numerai community support and/or code contributions.
Fork this repository and send a pull request. Follow Python best practices for structure and formatting! 🎉