54 changes: 35 additions & 19 deletions README.md
@@ -15,25 +15,35 @@ Cortex is an open source platform for deploying machine learning models—trained

## Key features

* **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
* **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
* **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
* **Spot instances:** Cortex supports EC2 spot instances.
* **Rolling updates:** Cortex updates deployed APIs without any downtime.
* **Log streaming:** Cortex streams logs from deployed models to your CLI.
* **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
* **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.
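
For illustration, the single deployment file might look like the following sketch (the keys shown are assumptions for this example, not the authoritative `cortex.yaml` schema — see the deployment docs for the exact fields):

```yaml
# a hypothetical cortex.yaml -- field names are illustrative
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
  compute:
    cpu: 1
```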

<br>

## Spinning up a Cortex cluster

Cortex is designed to be self-hosted on any AWS account. You can spin up a Cortex cluster with a single command:

```bash
$ cortex cluster up

aws region: us-west-2
aws instance type: p2.xlarge
min instances: 0
max instances: 10
spot instances: yes

○ spinning up your cluster ...
your cluster is ready!
```

## Deploying a model

### Implement your predictor

@@ -98,17 +108,23 @@ negative 4

<br>

## What is Cortex an alternative to?

Cortex is an open source alternative to serving models with SageMaker or building your own model deployment platform on top of AWS services like Elastic Kubernetes Service (EKS), Elastic Container Service (ECS), Lambda, Fargate, and Elastic Compute Cloud (EC2) or open source projects like Docker, Kubernetes, and TensorFlow Serving.

<br>

## How does Cortex work?

The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.

<br>

## Examples of Cortex deployments

<!-- CORTEX_VERSION_README_MINOR x5 -->
* [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer): deploy a BERT model for sentiment analysis.
* [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier): deploy an Inception model to classify images.
* [Search completion](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/search-completer): deploy Facebook's RoBERTa model to complete search terms.
* [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator): deploy Hugging Face's DistilGPT2 model to generate text.
* [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier): deploy a scikit-learn model to classify iris flowers.
@@ -42,8 +42,8 @@ cortex get classifier

# classify a sample
curl -X POST -H "Content-Type: application/json" \
-d '{ "sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3 }' \
<API endpoint>
```
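
The same request can be made from Python. This sketch uses only the standard library; the endpoint URL is a placeholder — substitute the one printed by `cortex get classifier`:

```python
import json
from urllib import request

# Placeholder -- replace with the endpoint shown by `cortex get classifier`.
API_ENDPOINT = "http://localhost:8888/iris/classifier"

# the same sample the curl command above sends
sample = {
    "sepal_length": 5.2,
    "sepal_width": 3.6,
    "petal_length": 1.4,
    "petal_width": 0.3,
}

def classify(endpoint: str, sample: dict) -> str:
    """POST a JSON sample to a deployed API and return the raw response body."""
    req = request.Request(
        endpoint,
        data=json.dumps(sample).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read().decode()
```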

## Cleanup
File renamed without changes.
File renamed without changes.
File renamed without changes.
34 changes: 17 additions & 17 deletions docs/development.md → docs/contributing/development.md
@@ -1,13 +1,13 @@
# Development

## Prerequisites

1. Go (>=1.12.9)
2. Docker
3. eksctl
4. kubectl

## Cortex dev environment

Clone the project:

@@ -109,7 +109,7 @@ make cli # The binary will be placed in path/to/cortex/bin/cortex
path/to/cortex/bin/cortex configure
```

### Cortex cluster

Start Cortex:

@@ -123,35 +123,35 @@ Tear down the Cortex cluster:
make cortex-down
```

### Deploy an example

```bash
cd examples/iris-classifier
path/to/cortex/bin/cortex deploy
```

## Off-cluster operator

If you're making changes in the operator and want faster iterations, you can run an off-cluster operator.

1. `make operator-stop` to stop the in-cluster operator
2. `make devstart` to run the off-cluster operator (which rebuilds the CLI and restarts the Operator when files change)
3. `path/to/cortex/bin/cortex configure` (on a separate terminal) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`

Note: `make cortex-up-dev` will start Cortex without installing the operator.

If you want to switch back to the in-cluster operator:

1. `<ctrl+C>` to stop your off-cluster operator
2. `make operator-start` to install the operator in your cluster
3. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`

## Dev workflow

1. `make cortex-up-dev`
2. `make devstart`
3. Make changes
4. `make registry-dev`
5. Test your changes with projects in `examples` or your own

See `Makefile` for additional dev commands.
@@ -1,8 +1,8 @@
# System packages

Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to (e.g. [Docker Hub](https://hub.docker.com) or [AWS ECR](https://aws.amazon.com/ecr)).

See the `image paths` section in [cluster configuration](../cluster-management/config.md) for a complete list of customizable images.

## Create a custom image

2 changes: 1 addition & 1 deletion docs/deployments/compute.md
@@ -27,6 +27,6 @@ One unit of memory is one byte. Memory can be expressed as an integer or by using
## GPU

1. Make sure your AWS account is subscribed to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM).
2. You may need to [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
3. Set instance type to an AWS GPU instance (e.g. p2.xlarge) when installing Cortex.
4. Note that one unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed.
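
Given those constraints, a GPU request in an API's `compute` block might look like the following sketch (illustrative — whole-number `gpu` values only, per the note above):

```yaml
compute:
  gpu: 1   # must be a whole number; fractional GPU requests are not allowed
  cpu: 1
  mem: 4G
```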
2 changes: 1 addition & 1 deletion docs/deployments/onnx.md
@@ -26,7 +26,7 @@ Deploy ONNX models as web services.
mem: <string> # memory request per replica (default: Null)
```

See [packaging ONNX models](../packaging-models/onnx.md) for information about exporting ONNX models.

## Example

2 changes: 1 addition & 1 deletion docs/deployments/predictor.md
@@ -154,4 +154,4 @@ torchvision==0.4.2
xgboost==0.90
```

Learn how to install additional packages [here](../dependency-management/python-packages.md).
2 changes: 1 addition & 1 deletion docs/deployments/request-handlers.md
@@ -84,4 +84,4 @@ tensorflow-hub==0.7.0 # TensorFlow runtime only
tensorflow==2.0.0 # TensorFlow runtime only
```

Learn how to install additional packages [here](../dependency-management/python-packages.md).
2 changes: 1 addition & 1 deletion docs/deployments/tensorflow.md
@@ -27,7 +27,7 @@ Deploy TensorFlow models as web services.
mem: <string> # memory request per replica (default: Null)
```

See [packaging TensorFlow models](../packaging-models/tensorflow.md) for how to export a TensorFlow model.

## Example

4 changes: 2 additions & 2 deletions docs/packaging/onnx.md → docs/packaging-models/onnx.md
@@ -1,8 +1,8 @@
# ONNX

Export your trained model to the ONNX model format. Here is an example of an sklearn model being exported to ONNX:

```python
from sklearn.linear_model import LogisticRegression
from onnxmltools import convert_sklearn
from onnxconverter_common.data_types import FloatTensorType
@@ -1,9 +1,9 @@
# TensorFlow

<!-- CORTEX_VERSION_MINOR -->
Export your trained model and upload the export directory, or a checkpoint directory containing the export directory (which is usually the case if you used `estimator.train_and_evaluate`). An example is shown below (here is the [complete example](https://github.com/cortexlabs/cortex/blob/master/examples/tensorflow/sentiment-analyzer)):

```python
import tensorflow as tf

...
26 changes: 13 additions & 13 deletions docs/summary.md
@@ -1,7 +1,7 @@
# Table of contents

* [Deploy machine learning models in production](../README.md)
* [Install](cluster-management/install.md)
* [Tutorial](../examples/sklearn/iris-classifier/README.md)
* [GitHub](https://github.com/cortexlabs/cortex)
* [Examples](https://github.com/cortexlabs/cortex/tree/master/examples) <!-- CORTEX_VERSION_MINOR -->
@@ -19,28 +19,28 @@
* [Autoscaling](deployments/autoscaling.md)
* [Prediction monitoring](deployments/prediction-monitoring.md)
* [Compute](deployments/compute.md)
* [API statuses](deployments/statuses.md)
* [Python client](deployments/python-client.md)

## Packaging models

* [TensorFlow](packaging-models/tensorflow.md)
* [ONNX](packaging-models/onnx.md)

## Dependency management

* [Python packages](dependency-management/python-packages.md)
* [System packages](dependency-management/system-packages.md)

## Cluster management

* [CLI commands](cluster-management/cli.md)
* [Cluster configuration](cluster-management/config.md)
* [AWS credentials](cluster-management/aws.md)
* [Security](cluster-management/security.md)
* [Update](cluster-management/update.md)
* [Uninstall](cluster-management/uninstall.md)

## Contributing

* [Development](contributing/development.md)
20 changes: 10 additions & 10 deletions examples/README.md
@@ -4,29 +4,29 @@

- [Iris classification](tensorflow/iris-classifier): deploy a model to classify iris flowers.

- [Text generation](tensorflow/text-generator): deploy OpenAI's GPT-2 to generate text.

- [Sentiment analysis](tensorflow/sentiment-analyzer): deploy a BERT model for sentiment analysis.

- [Image classification](tensorflow/image-classifier): deploy an Inception model to classify images.

## PyTorch

- [Iris classification](pytorch/iris-classifier): deploy a model to classify iris flowers.

- [Search completion](pytorch/search-completer): deploy Facebook's RoBERTa model to complete search terms.

- [Text generation](pytorch/text-generator): deploy Hugging Face's DistilGPT2 model to generate text.

- [Answer generation](pytorch/answer-generator): deploy Microsoft's DialoGPT model to answer questions.

- [Text summarization](pytorch/text-summarizer): deploy a BERT model to summarize text.

- [Reading comprehension](pytorch/reading-comprehender): deploy an AllenNLP model for reading comprehension.

- [Language identification](pytorch/language-identifier): deploy a fastText model to identify languages.

- [Image classification](pytorch/image-classifier): deploy an AlexNet model from TorchVision to classify images.

## XGBoost
