Contributing

Remote development

We recommend running your development environment on a cloud instance (e.g. an AWS EC2 instance or GCP VM) due to frequent Docker registry pushing. We've had a good experience using Mutagen to synchronize local and remote file systems.
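As a sketch of how that synchronization might be set up with Mutagen (the session name, host alias, and paths below are placeholders, not part of this repo's tooling — this is a config fragment, not a runnable script):

```shell
# One-time setup: create a two-way sync session between your local clone
# and the remote instance ("dev-box" is a placeholder SSH host alias).
mutagen sync create --name=cortex ~/cortex dev-box:~/cortex

# Check session status and watch for sync conflicts
mutagen sync list
mutagen sync monitor cortex
```

With the session running, edits you make locally appear on the remote instance within a few seconds, so you can edit in your local editor while building and pushing images from the cloud instance.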

Prerequisites

System packages

To install the necessary system packages on Ubuntu, you can run these commands:

sudo apt-get update
sudo apt install -y apt-transport-https ca-certificates software-properties-common gnupg-agent curl zip python3 python3-pip python3-dev build-essential jq tree
sudo python3 -m pip install --upgrade pip setuptools boto3

Go

To install Go on Linux, run:

mkdir -p ~/bin && \
wget https://dl.google.com/go/go1.14.7.linux-amd64.tar.gz && \
sudo tar -xvf go1.14.7.linux-amd64.tar.gz && \
sudo mv go /usr/local && \
rm go1.14.7.linux-amd64.tar.gz && \
echo 'export PATH="/usr/local/go/bin:$HOME/go/bin:$PATH"' >> $HOME/.bashrc

And then log out and back in.

Docker

To install Docker on Ubuntu, run:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add - && \
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" && \
sudo apt update && \
sudo apt install -y docker-ce docker-ce-cli containerd.io && \
sudo usermod -aG docker $USER

And then log out and back in.

kubectl

To install kubectl on Linux, run:

curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl && \
chmod +x ./kubectl && \
sudo mv ./kubectl /usr/local/bin/kubectl

eksctl

To install eksctl run:

curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp && \
sudo mv /tmp/eksctl /usr/local/bin

aws-cli (v1)

Follow these instructions to install aws-cli (v1).

E.g. to install it globally, run:

sudo python3 -m pip install awscli

aws configure

gcloud

Follow these instructions to install gcloud.

For example:

echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && \
sudo apt-get update && \
sudo apt-get install -y google-cloud-sdk

gcloud init

Cortex dev environment

Clone the repo

Clone the project:

git clone https://github.com/cortexlabs/cortex.git
cd cortex

Run the tests:

make test

Dev tools

Install development tools by running:

make tools

After the dependencies are installed, there may be a diff in go.mod and go.sum; it is safe to revert it.
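For example, to discard that diff:

```shell
# discard any tool-install changes to the Go module files
git checkout -- go.mod go.sum
git status  # go.mod / go.sum should no longer appear as modified
```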

Run the linter:

make lint

We use gofmt for formatting Go files, black for Python files (line length = 100), and the VS Code yaml extension for YAML files. It is recommended to enable these in your code editor, but you can also run the Go and Python formatters from the terminal:

make format

git diff  # there should be no diff

Cluster configuration

These instructions assume you'll be creating clusters on AWS and GCP. You may skip some of the steps and configuration if you'll only be developing or testing on a single cloud provider.

Create a config directory in the repo's root directory:

mkdir dev/config

Create dev/config/env.sh with the following information:

# dev/config/env.sh

export AWS_ACCOUNT_ID="***"  # you can find your account ID in the AWS web console; here is an example: 764403040417
export AWS_REGION="***"  # you can use any AWS region you'd like, e.g. "us-west-2"
export AWS_ACCESS_KEY_ID="***"
export AWS_SECRET_ACCESS_KEY="***"

export GCP_PROJECT_ID="***"
export GOOGLE_APPLICATION_CREDENTIALS="***"
export GCR_HOST="gcr.io"  # must be "gcr.io", "us.gcr.io", "eu.gcr.io", or "asia.gcr.io"

# export NUM_BUILD_PROCS=2  # optional; can be >2 if you have enough memory

Create the ECR registries:

make registry-create-aws

Create dev/config/cluster-aws.yaml. Paste the following config, and update region and all registry URLs (replace <account_id> with your AWS account ID, and replace <region> with your region):

# dev/config/cluster-aws.yaml

cluster_name: cortex
provider: aws
region: <region>  # e.g. us-west-2
instance_type: m5.large
min_instances: 1
max_instances: 5

image_operator: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/operator:latest
image_manager: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/manager:latest
image_downloader: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/downloader:latest
image_request_monitor: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/request-monitor:latest
image_cluster_autoscaler: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/cluster-autoscaler:latest
image_metrics_server: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/metrics-server:latest
image_inferentia: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/inferentia:latest
image_neuron_rtd: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/neuron-rtd:latest
image_nvidia: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/nvidia:latest
image_fluentd: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/fluentd:latest
image_statsd: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/statsd:latest
image_istio_proxy: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/istio-proxy:latest
image_istio_pilot: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/istio-pilot:latest
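Rather than editing each registry URL by hand, you can substitute both placeholders in one pass (this sed one-liner is just a convenience, not part of the repo's tooling; the account ID and region shown are examples):

```shell
# fill in the <account_id> and <region> placeholders in place
sed -i 's/<account_id>/764403040417/g; s/<region>/us-west-2/g' dev/config/cluster-aws.yaml
```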

Create dev/config/cluster-gcp.yaml. Paste the following config, and update project, zone, and all registry URLs (replace <project_id> with your project ID, and update gcr.io if you are using a different host):

# dev/config/cluster-gcp.yaml

project: <project_id>
zone: <zone>  # e.g. us-central1-a
cluster_name: cortex
provider: gcp
instance_type: n1-standard-2
min_instances: 1
max_instances: 5
# accelerator_type: nvidia-tesla-k80  # optional

image_operator: gcr.io/<project_id>/cortexlabs/operator:latest
image_manager: gcr.io/<project_id>/cortexlabs/manager:latest
image_downloader: gcr.io/<project_id>/cortexlabs/downloader:latest
image_istio_proxy: gcr.io/<project_id>/cortexlabs/istio-proxy:latest
image_istio_pilot: gcr.io/<project_id>/cortexlabs/istio-pilot:latest
image_google_pause: gcr.io/<project_id>/cortexlabs/google-pause:latest

Building

Add this to your bash profile (e.g. ~/.bash_profile, ~/.profile or ~/.bashrc), replacing the placeholders accordingly:

# set the default image for APIs
export CORTEX_DEV_DEFAULT_PREDICTOR_IMAGE_REGISTRY_AWS="<account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs"
export CORTEX_DEV_DEFAULT_PREDICTOR_IMAGE_REGISTRY_GCP="gcr.io/<project_id>/cortexlabs"
export CORTEX_DEV_DEFAULT_PREDICTOR_IMAGE_REGISTRY="cortexlabs"

# redirect analytics and error reporting to our dev environment
export CORTEX_TELEMETRY_SENTRY_DSN="https://c334df915c014ffa93f2076769e5b334@sentry.io/1848098"
export CORTEX_TELEMETRY_SEGMENT_WRITE_KEY="0WvoJyCey9z1W2EW7rYTPJUMRYat46dl"

alias cortex='$HOME/bin/cortex'  # your path may be different depending on where you cloned the repo

Refresh your bash profile:

. ~/.bash_profile  # or: `. ~/.bashrc`

Build the Cortex CLI:

make cli  # the binary will be placed in <path/to/cortex>/bin/cortex
cortex version  # should show "master"

Build and push all Cortex images:

# for AWS:
make images-all-aws

# for GCP:
make images-all-gcp

Dev workflow

Here is the typical full dev workflow, which covers most cases (replace aws with gcp if desired):

  1. make cluster-up-aws (creates a cluster using dev/config/cluster-aws.yaml)
  2. make devstart-aws (deletes the in-cluster operator, builds the CLI, and starts the operator locally; file changes will trigger the CLI and operator to re-build)
  3. Make your changes
  4. make images-dev-aws (only necessary if API images or the manager are modified)
  5. Test your changes e.g. via cortex deploy (and repeat steps 3 and 4 as necessary)
  6. make cluster-down-aws (deletes your cluster)

If you want to switch back to the in-cluster operator:

  1. <ctrl+c> to stop your local operator
  2. make operator-start-aws to restart the operator in your cluster

If you only want to test Cortex's local environment, here is the common workflow:

  1. make cli-watch (builds the CLI and re-builds it when files are changed)
  2. Make your changes
  3. make images-dev-local (only necessary if API images or the manager are modified)
  4. Test your changes e.g. via cortex deploy (and repeat steps 2 and 3 as necessary)

Dev workflow optimizations

If you are only modifying the CLI, make cli-watch will build the CLI and re-build it when files are changed. When doing this, you can leave the operator running in the cluster instead of running it locally.

If you are only modifying the operator, make operator-local-aws will build and start the operator locally, and build/restart it when files are changed.

If you are modifying code in the API images (i.e. any of the Python serving code), make images-dev-aws may build more images than you need during testing. For example, if you are only testing using the python-predictor-cpu image, you can run ./dev/registry.sh update-single python-predictor-cpu --provider aws (or use --provider local if testing locally).

See Makefile for additional dev commands.