We recommend running your development environment on a cloud instance (e.g. an AWS EC2 instance or a GCP VM), since the development workflow involves frequent pushes to a Docker registry. We've had a good experience using Mutagen to synchronize local and remote file systems.
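For example, a Mutagen session can be sketched like this (the session name, local path, and remote host are placeholders; the commands are skipped if mutagen isn't installed):

```shell
# Hypothetical Mutagen setup: two-way sync between a local checkout and a
# remote dev instance. The session name, local path, and remote host below
# are placeholders -- substitute your own.
if command -v mutagen >/dev/null 2>&1; then
  mutagen sync create --name=cortex ~/cortex user@your-dev-instance:~/cortex
  mutagen sync list
else
  echo "mutagen not installed (see mutagen.io)"
fi
```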
To install the necessary system packages on Ubuntu, you can run these commands:
sudo apt-get update
sudo apt install -y apt-transport-https ca-certificates software-properties-common gnupg-agent curl zip python3 python3-pip python3-dev build-essential jq tree
sudo python3 -m pip install --upgrade pip setuptools boto3
To install Go on Linux, run:
mkdir -p ~/bin && \
wget https://dl.google.com/go/go1.14.7.linux-amd64.tar.gz && \
sudo tar -xvf go1.14.7.linux-amd64.tar.gz && \
sudo mv go /usr/local && \
rm go1.14.7.linux-amd64.tar.gz && \
echo 'export PATH="/usr/local/go/bin:$HOME/go/bin:$PATH"' >> $HOME/.bashrc
And then log out and back in.
To install Docker on Ubuntu, run:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add - && \
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" && \
sudo apt update && \
sudo apt install -y docker-ce docker-ce-cli containerd.io && \
sudo usermod -aG docker $USER
And then log out and back in.
To install kubectl on Linux, run:
curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl && \
chmod +x ./kubectl && \
sudo mv ./kubectl /usr/local/bin/kubectl
To install eksctl, run:
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp && \
sudo mv /tmp/eksctl /usr/local/bin
Follow the official AWS documentation to install aws-cli (v1). For example, to install it globally, run:
sudo python3 -m pip install awscli
aws configure
Follow Google's documentation to install gcloud. For example:
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && \
sudo apt-get update && \
sudo apt-get install -y google-cloud-sdk
gcloud init
Clone the project:
git clone https://github.com/cortexlabs/cortex.git
cd cortex
Run the tests:
make test
Install development tools by running:
make tools
After the dependencies are installed, there may be a diff in `go.mod` and `go.sum`, which you can revert.
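One way to discard those changes is `git checkout -- go.mod go.sum` from the repo root. A self-contained sketch of what that does, demonstrated in a throwaway repo:

```shell
# Demonstrates reverting go.mod / go.sum with `git checkout --` in a
# throwaway repo; in the cortex repo you'd just run:
#   git checkout -- go.mod go.sum
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email dev@example.com
git config user.name dev
echo "module example.com/demo" > go.mod
echo "checksum" > go.sum
git add go.mod go.sum
git commit -qm "initial"
echo "tool-added line" >> go.mod      # simulate the diff left by `make tools`
git checkout -- go.mod go.sum         # restore the committed contents
cat go.mod
```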
Run the linter:
make lint
We use `gofmt` for formatting Go files, `black` for Python files (line length = 100), and the VS Code YAML extension for YAML files. It is recommended to enable these in your code editor, but you can also run the Go and Python formatters from the terminal:
make format
git diff # there should be no diff
These instructions assume you'll be creating clusters on AWS and GCP. You may skip some of the steps and configuration if you'll only be developing / testing on a single cloud provider.
Create a config directory in the repo's root directory:
mkdir dev/config
Create `dev/config/env.sh` with the following information:
# dev/config/env.sh
export AWS_ACCOUNT_ID="***" # you can find your account ID in the AWS web console; here is an example: 764403040417
export AWS_REGION="***" # you can use any AWS region you'd like, e.g. "us-west-2"
export AWS_ACCESS_KEY_ID="***"
export AWS_SECRET_ACCESS_KEY="***"
export GCP_PROJECT_ID="***"
export GOOGLE_APPLICATION_CREDENTIALS="***"
export GCR_HOST="gcr.io" # must be "gcr.io", "us.gcr.io", "eu.gcr.io", or "asia.gcr.io"
# export NUM_BUILD_PROCS=2 # optional; can be >2 if you have enough memory
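As an optional sanity check (a sketch, not part of the official setup), you can validate that `GCR_HOST` is one of the supported values; the value below is an example, and in real usage you'd source `dev/config/env.sh` first:

```shell
# Validate GCR_HOST against the supported registry hosts. The value here is
# an example -- source dev/config/env.sh first in real usage.
GCR_HOST="gcr.io"
case "$GCR_HOST" in
  gcr.io|us.gcr.io|eu.gcr.io|asia.gcr.io)
    echo "GCR_HOST ok: $GCR_HOST"
    ;;
  *)
    echo "unsupported GCR_HOST: $GCR_HOST" >&2
    exit 1
    ;;
esac
```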
Create the ECR registries:
make registry-create-aws
Create `dev/config/cluster-aws.yaml`. Paste the following config, and update `region` and all registry URLs (replace `<account_id>` with your AWS account ID, and replace `<region>` with your region):
# dev/config/cluster-aws.yaml
cluster_name: cortex
provider: aws
region: <region> # e.g. us-west-2
instance_type: m5.large
min_instances: 1
max_instances: 5
image_operator: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/operator:latest
image_manager: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/manager:latest
image_downloader: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/downloader:latest
image_request_monitor: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/request-monitor:latest
image_cluster_autoscaler: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/cluster-autoscaler:latest
image_metrics_server: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/metrics-server:latest
image_inferentia: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/inferentia:latest
image_neuron_rtd: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/neuron-rtd:latest
image_nvidia: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/nvidia:latest
image_fluentd: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/fluentd:latest
image_statsd: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/statsd:latest
image_istio_proxy: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/istio-proxy:latest
image_istio_pilot: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/istio-pilot:latest
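If you prefer not to fill in the placeholders by hand, a `sed` substitution can do it; this is an optional convenience, demonstrated here on a temp file (point it at `dev/config/cluster-aws.yaml` once you've pasted the config, and use your own account ID and region):

```shell
# Substitute the <account_id> and <region> placeholders with sed. The values
# below are examples; replace "$cfg" with dev/config/cluster-aws.yaml in
# real usage.
AWS_ACCOUNT_ID="764403040417"
AWS_REGION="us-west-2"
cfg=$(mktemp)
printf '%s\n' \
  'region: <region> # e.g. us-west-2' \
  'image_operator: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/operator:latest' \
  > "$cfg"
sed -i "s/<account_id>/${AWS_ACCOUNT_ID}/g; s/<region>/${AWS_REGION}/g" "$cfg"
cat "$cfg"
```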
Create `dev/config/cluster-gcp.yaml`. Paste the following config, and update `project`, `zone`, and all registry URLs (replace `<project_id>` with your project ID, and update `gcr.io` if you are using a different host):
# dev/config/cluster-gcp.yaml
project: <project_id>
zone: <zone> # e.g. us-central1-a
cluster_name: cortex
provider: gcp
instance_type: n1-standard-2
min_instances: 1
max_instances: 5
# accelerator_type: nvidia-tesla-k80 # optional
image_operator: gcr.io/<project_id>/cortexlabs/operator:latest
image_manager: gcr.io/<project_id>/cortexlabs/manager:latest
image_downloader: gcr.io/<project_id>/cortexlabs/downloader:latest
image_istio_proxy: gcr.io/<project_id>/cortexlabs/istio-proxy:latest
image_istio_pilot: gcr.io/<project_id>/cortexlabs/istio-pilot:latest
image_google_pause: gcr.io/<project_id>/cortexlabs/google-pause:latest
Add this to your bash profile (e.g. `~/.bash_profile`, `~/.profile`, or `~/.bashrc`), replacing the placeholders accordingly:
# set the default image for APIs
export CORTEX_DEV_DEFAULT_PREDICTOR_IMAGE_REGISTRY_AWS="<account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs"
export CORTEX_DEV_DEFAULT_PREDICTOR_IMAGE_REGISTRY_GCP="gcr.io/<project_id>/cortexlabs"
export CORTEX_DEV_DEFAULT_PREDICTOR_IMAGE_REGISTRY="cortexlabs"
# redirect analytics and error reporting to our dev environment
export CORTEX_TELEMETRY_SENTRY_DSN="https://c334df915c014ffa93f2076769e5b334@sentry.io/1848098"
export CORTEX_TELEMETRY_SEGMENT_WRITE_KEY="0WvoJyCey9z1W2EW7rYTPJUMRYat46dl"
alias cortex='$HOME/bin/cortex' # your path may be different depending on where you cloned the repo
Refresh your bash profile:
. ~/.bash_profile # or: `. ~/.bashrc`
Build the Cortex CLI:
make cli # the binary will be placed in <path/to/cortex>/bin/cortex
cortex version # should show "master"
Build and push all Cortex images:
# for AWS:
make images-all-aws
# for GCP:
make images-all-gcp
Here is the typical full dev workflow, which covers most cases (replace `aws` with `gcp` if desired):

1. `make cluster-up-aws` (creates a cluster using `dev/config/cluster-aws.yaml`)
2. `make devstart-aws` (deletes the in-cluster operator, builds the CLI, and starts the operator locally; file changes will trigger the CLI and operator to re-build)
3. Make your changes
4. `make images-dev-aws` (only necessary if API images or the manager are modified)
5. Test your changes, e.g. via `cortex deploy` (and repeat steps 3 and 4 as necessary)
6. `make cluster-down-aws` (deletes your cluster)
If you want to switch back to the in-cluster operator:

1. `<ctrl+c>` to stop your local operator
2. `make operator-start-aws` to restart the operator in your cluster
If you only want to test Cortex's local environment, here is the common workflow:

1. `make cli-watch` (builds the CLI and re-builds it when files are changed)
2. Make your changes
3. `make images-dev-local` (only necessary if API images or the manager are modified)
4. Test your changes, e.g. via `cortex deploy` (and repeat steps 2 and 3 as necessary)
If you are only modifying the CLI, `make cli-watch` will build the CLI and re-build it when files are changed. When doing this, you can leave the operator running in the cluster instead of running it locally.
If you are only modifying the operator, `make operator-local-aws` will build and start the operator locally, and build/restart it when files are changed.
If you are modifying code in the API images (i.e. any of the Python serving code), `make images-dev-aws` may build more images than you need during testing. For example, if you are only testing using the `python-predictor-cpu` image, you can run `./dev/registry.sh update-single python-predictor-cpu --provider aws` (or use `--provider local` if testing locally).
See the `Makefile` for additional dev commands.