From 06fabaea6d4d303d3606765578727dd563c5d54f Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 13:41:16 -0700 Subject: [PATCH 01/20] Update README.md --- README.md | 110 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 58 insertions(+), 52 deletions(-) diff --git a/README.md b/README.md index 58e71f3ff7..2d3978f322 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ Cortex is an open source platform for deploying machine learning models as produ -[install](https://cortex.dev/install) • [tutorial](https://cortex.dev/iris-classifier) • [docs](https://cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.15/examples) • [we're hiring](https://angel.co/cortex-labs-inc/jobs) • [email us](mailto:hello@cortex.dev) • [chat with us](https://gitter.im/cortexlabs/cortex)

+[install](https://cortex.dev/install) • [tutorial](https://cortex.dev/iris-classifier) • [docs](https://cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.15/examples) • [we're hiring](https://angel.co/cortex-labs-inc/jobs) • [chat with us](https://gitter.im/cortexlabs/cortex)

![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.13_2.gif) @@ -25,41 +25,6 @@ Cortex is an open source platform for deploying machine learning models as produ
-## Spinning up a cluster - -Cortex is designed to be self-hosted on any AWS account. You can spin up a cluster with a single command: - - -```bash -# install the CLI on your machine -$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.15/get-cli.sh)" - -# provision infrastructure on AWS and spin up a cluster -$ cortex cluster up - -aws region: us-west-2 -aws instance type: g4dn.xlarge -spot instances: yes -min instances: 0 -max instances: 5 - -aws resource cost per hour -1 eks cluster $0.10 -0 - 5 g4dn.xlarge instances for your apis $0.1578 - $0.526 each (varies based on spot price) -0 - 5 50gb ebs volumes for your apis $0.007 each -1 t3.medium instance for the operator $0.0416 -1 20gb ebs volume for the operator $0.003 -2 network load balancers $0.0225 each - -your cluster will cost $0.19 - $2.85 per hour based on cluster size and spot instance pricing/availability - -○ spinning up your cluster ... - -your cluster is ready! -``` - -
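The "$0.19 - $2.85 per hour" range in the cluster output above follows directly from the line items in the cost table; a quick sanity check in Python (prices copied from the table — actual spot pricing varies):

```python
# Fixed hourly costs from the cost table (USD/hour).
eks_cluster = 0.10
operator_instance = 0.0416   # 1 t3.medium for the operator
operator_volume = 0.003      # 1 x 20 GB EBS volume
load_balancers = 2 * 0.0225  # 2 network load balancers

fixed = eks_cluster + operator_instance + operator_volume + load_balancers

# Per-API-instance costs: g4dn.xlarge spot price range plus a 50 GB EBS volume.
instance_low, instance_high = 0.1578, 0.526
volume = 0.007

low = fixed + 0 * (instance_low + volume)    # cluster scaled to zero instances
high = fixed + 5 * (instance_high + volume)  # five instances at the max spot price

print(f"${low:.2f} - ${high:.2f} per hour")  # -> $0.19 - $2.85 per hour
```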
- ## Deploying a model ### Implement your predictor @@ -84,14 +49,12 @@ class PythonPredictor: predictor: type: python path: predictor.py - tracker: - model_type: classification compute: gpu: 1 mem: 4G ``` -### Deploy to AWS +### Deploy your model ```bash $ cortex deploy @@ -99,20 +62,57 @@ $ cortex deploy creating sentiment-classifier ``` -### Serve real-time predictions +### Serve predictions ```bash -$ curl http://***.amazonaws.com/sentiment-classifier \ +$ curl http://localhost:8888/sentiment-classifier \ -X POST -H "Content-Type: application/json" \ - -d '{"text": "the movie was amazing!"}' + -d '{"text": "serving models locally is cool!"}' positive ``` +
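The `PythonPredictor` shown in the patch is truncated by the diff context; a minimal sketch of the interface a Cortex Python predictor implements — an `__init__(self, config)` constructor that loads the model and a `predict(self, payload)` method that returns the prediction — with a trivial keyword rule standing in for the real sentiment model so the sketch is self-contained:

```python
class PythonPredictor:
    def __init__(self, config):
        # In the real example this would download and load the sentiment model;
        # a keyword list stands in here for illustration.
        self.negative_words = {"bad", "terrible", "awful", "boring"}

    def predict(self, payload):
        words = (w.strip("!.,") for w in payload["text"].lower().split())
        if any(w in self.negative_words for w in words):
            return "negative"
        return "positive"

predictor = PythonPredictor(config={})
print(predictor.predict({"text": "the movie was amazing!"}))  # -> positive
```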
+ +## Running Cortex in production + +### Spin up a cluster + +Cortex is designed to be self-hosted on any AWS account (GCP is coming soon). You can spin up a cluster with a single command: + + +```bash +# install the CLI on your machine +$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.15/get-cli.sh)" + +# provision infrastructure on AWS and spin up a cluster +$ cortex cluster up + +aws region: us-west-2 +aws instance type: g4dn.xlarge +spot instances: yes +min instances: 0 +max instances: 5 + +your cluster will cost $0.19 - $2.85 per hour based on cluster size and spot instance pricing/availability + +○ spinning up your cluster ... + +your cluster is ready! +``` + +### Deploy to your cluster with the same code and configuration + +```bash +$ cortex deploy + +creating sentiment-classifier +``` + ### Monitor your deployment ```bash -$ cortex get sentiment-classifier --watch +$ cortex get sentiment-classifier status up-to-date requested last update avg request 2XX live 1 1 8s 24ms 12 @@ -122,15 +122,17 @@ positive 8 negative 4 ``` -
+### Serve predictions at scale -## What is Cortex similar to? - -Cortex is an open source alternative to serving models with SageMaker or building your own model deployment platform on top of AWS services like Elastic Kubernetes Service (EKS), Elastic Container Service (ECS), Lambda, Fargate, and Elastic Compute Cloud (EC2) and open source projects like Docker, Kubernetes, and TensorFlow Serving. +```bash +$ curl http://***.amazonaws.com/sentiment-classifier \ + -X POST -H "Content-Type: application/json" \ + -d '{"text": "serving models at scale is cooler!"}' -
+positive +``` -## How does Cortex work? +## How it works The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch. @@ -138,11 +140,15 @@ Cortex manages its own Kubernetes cluster so that end-to-end functionality like
+## What is Cortex similar to? + +Cortex is an open source alternative to serving models with SageMaker or building your own model deployment platform on top of AWS services like Elastic Kubernetes Service (EKS), Lambda, or Fargate and open source projects like Docker, Kubernetes, TensorFlow Serving, and TorchServe. + +
+ ## Examples of Cortex deployments - -* [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.15/examples/tensorflow/sentiment-analyzer): deploy a BERT model for sentiment analysis. + * [Image classification](https://github.com/cortexlabs/cortex/tree/0.15/examples/tensorflow/image-classifier): deploy an Inception model to classify images. * [Search completion](https://github.com/cortexlabs/cortex/tree/0.15/examples/pytorch/search-completer): deploy Facebook's RoBERTa model to complete search terms. * [Text generation](https://github.com/cortexlabs/cortex/tree/0.15/examples/pytorch/text-generator): deploy Hugging Face's DistilGPT2 model to generate text. -* [Iris classification](https://github.com/cortexlabs/cortex/tree/0.15/examples/sklearn/iris-classifier): deploy a scikit-learn model to classify iris flowers. From 92a6e5dcb3a2ba63e38595b397ed3e1afb4e7ce8 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 13:45:08 -0700 Subject: [PATCH 02/20] Update README.md --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 2d3978f322..e4e7a6698c 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ Cortex is an open source platform for deploying machine learning models as produ -[install](https://cortex.dev/install) • [tutorial](https://cortex.dev/iris-classifier) • [docs](https://cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.15/examples) • [we're hiring](https://angel.co/cortex-labs-inc/jobs) • [chat with us](https://gitter.im/cortexlabs/cortex)

+[install](https://cortex.dev/install) • [docs](https://cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.15/examples) • [we're hiring](https://angel.co/cortex-labs-inc/jobs) • [chat with us](https://gitter.im/cortexlabs/cortex)

![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.13_2.gif) @@ -78,7 +78,7 @@ positive ### Spin up a cluster -Cortex is designed to be self-hosted on any AWS account (GCP is coming soon). You can spin up a cluster with a single command: +Cortex is designed to be self-hosted on any AWS account (GCP is coming soon). ```bash @@ -132,7 +132,7 @@ $ curl http://***.amazonaws.com/sentiment-classifier \ positive ``` -## How it works +### How it works The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch. From b849e815378a07e91f15290e631f1c6459678ef5 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 13:47:44 -0700 Subject: [PATCH 03/20] Update README.md --- README.md | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index e4e7a6698c..1528fcb6a5 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,5 @@ # Deploy machine learning models in production -Cortex is an open source platform for deploying machine learning models as production web services. -
@@ -13,6 +11,10 @@ Cortex is an open source platform for deploying machine learning models as produ
+Cortex is an open source platform for deploying machine learning models as production web services. + +
+ ## Key features * **Multi framework:** deploy TensorFlow, PyTorch, scikit-learn, and other models. @@ -109,6 +111,16 @@ $ cortex deploy creating sentiment-classifier ``` +### Serve predictions at scale + +```bash +$ curl http://***.amazonaws.com/sentiment-classifier \ + -X POST -H "Content-Type: application/json" \ + -d '{"text": "serving models at scale is cooler!"}' + +positive +``` + ### Monitor your deployment ```bash @@ -122,16 +134,6 @@ positive 8 negative 4 ``` -### Serve predictions at scale - -```bash -$ curl http://***.amazonaws.com/sentiment-classifier \ - -X POST -H "Content-Type: application/json" \ - -d '{"text": "serving models at scale is cooler!"}' - -positive -``` - ### How it works The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch. From 9dfc16f33a124757314c5bb6e137e77aee0e1441 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 13:48:23 -0700 Subject: [PATCH 04/20] Update README.md --- README.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/README.md b/README.md index 1528fcb6a5..6106298e63 100644 --- a/README.md +++ b/README.md @@ -11,10 +11,6 @@
-Cortex is an open source platform for deploying machine learning models as production web services. - -
- ## Key features * **Multi framework:** deploy TensorFlow, PyTorch, scikit-learn, and other models. From 25ddca8c3d69241b3f005cb8aa5807e62fa8eaf8 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 13:50:18 -0700 Subject: [PATCH 05/20] Update README.md --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 6106298e63..ae5a81f1ae 100644 --- a/README.md +++ b/README.md @@ -63,7 +63,7 @@ creating sentiment-classifier ### Serve predictions ```bash -$ curl http://localhost:8888/sentiment-classifier \ +$ curl http://localhost:8888 \ -X POST -H "Content-Type: application/json" \ -d '{"text": "serving models locally is cool!"}' @@ -112,7 +112,7 @@ creating sentiment-classifier ```bash $ curl http://***.amazonaws.com/sentiment-classifier \ -X POST -H "Content-Type: application/json" \ - -d '{"text": "serving models at scale is cooler!"}' + -d '{"text": "serving models at scale is really cool!"}' positive ``` From 87dd14ee0a3c17bdc75dfae235355d1a200eb375 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 13:58:56 -0700 Subject: [PATCH 06/20] Update README.md --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index ae5a81f1ae..a62a64a868 100644 --- a/README.md +++ b/README.md @@ -63,7 +63,7 @@ creating sentiment-classifier ### Serve predictions ```bash -$ curl http://localhost:8888 \ +$ curl http://localhost:12345 \ -X POST -H "Content-Type: application/json" \ -d '{"text": "serving models locally is cool!"}' @@ -76,7 +76,7 @@ positive ### Spin up a cluster -Cortex is designed to be self-hosted on any AWS account (GCP is coming soon). 
+Cortex clusters are designed to be self-hosted on any AWS account (GCP support is coming soon): ```bash From 5520c7b3b3cec1fc5f8c44a5c810fe9df52b3be8 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 14:06:24 -0700 Subject: [PATCH 07/20] Update README.md --- README.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index a62a64a868..0307a13331 100644 --- a/README.md +++ b/README.md @@ -25,6 +25,13 @@ ## Deploying a model +### Install the CLI + + +```bash +$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.15/get-cli.sh)" +``` + ### Implement your predictor ```python @@ -76,13 +83,9 @@ positive ### Spin up a cluster -Cortex clusters are designed to be self-hosted on any AWS account (GCP support is coming soon): +Cortex clusters are designed to be self-hosted on any AWS account (and GCP support is coming soon): - ```bash -# install the CLI on your machine -$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.15/get-cli.sh)" - # provision infrastructure on AWS and spin up a cluster $ cortex cluster up @@ -102,7 +105,7 @@ your cluster is ready! ### Deploy to your cluster with the same code and configuration ```bash -$ cortex deploy +$ cortex deploy --env aws creating sentiment-classifier ``` From 472cbc096f659fa6e004b56b73458cc994d36ded Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Thu, 30 Apr 2020 20:43:18 -0700 Subject: [PATCH 08/20] Update tagline --- README.md | 2 +- cli/cmd/root.go | 2 +- docs/summary.md | 2 +- examples/pytorch/language-identifier/sample.json | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 0307a13331..5e1c9f0780 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Deploy machine learning models in production +# Machine learning model serving platform
diff --git a/cli/cmd/root.go b/cli/cmd/root.go index 7adfc89d9a..c39251bd7d 100644 --- a/cli/cmd/root.go +++ b/cli/cmd/root.go @@ -121,7 +121,7 @@ func initTelemetry() { var _rootCmd = &cobra.Command{ Use: "cortex", Aliases: []string{"cx"}, - Short: "deploy machine learning models in production", + Short: "machine learning model serving platform", } func Execute() { diff --git a/docs/summary.md b/docs/summary.md index 7dcdd5851b..200d577b1e 100644 --- a/docs/summary.md +++ b/docs/summary.md @@ -1,6 +1,6 @@ # Table of contents -* [Deploy machine learning models in production](../README.md) +* [Machine learning model serving platform](../README.md) * [Install](cluster-management/install.md) * [Tutorial](../examples/sklearn/iris-classifier/README.md) * [GitHub](https://github.com/cortexlabs/cortex) diff --git a/examples/pytorch/language-identifier/sample.json b/examples/pytorch/language-identifier/sample.json index 2329a0a714..0e1392bb07 100644 --- a/examples/pytorch/language-identifier/sample.json +++ b/examples/pytorch/language-identifier/sample.json @@ -1,3 +1,3 @@ { - "text": "deploy machine learning models in production" + "text": "machine learning model serving platform" } From ed31ab404ff6599a948f17338d8f5f492a7fe1a0 Mon Sep 17 00:00:00 2001 From: Caleb Kaiser <42076840+caleb-kaiser@users.noreply.github.com> Date: Fri, 1 May 2020 16:03:17 -0400 Subject: [PATCH 09/20] Update README.md --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 5e1c9f0780..285095d77b 100644 --- a/README.md +++ b/README.md @@ -81,6 +81,8 @@ positive ## Running Cortex in production +Cortex can also automatically provision and manage a Kubernetes cluster for inference workloads, typically for production use cases. Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any additional DevOps work. 
+ ### Spin up a cluster Cortex clusters are designed to be self-hosted on any AWS account (and GCP support is coming soon): @@ -137,8 +139,6 @@ negative 4 The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch. -Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any additional DevOps work. -
## What is Cortex similar to? From 9d3ca67c599bb15e86001e8838e88d4e530e98ba Mon Sep 17 00:00:00 2001 From: David Eliahu Date: Fri, 1 May 2020 14:30:28 -0700 Subject: [PATCH 10/20] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 285095d77b..4ecd4d764b 100644 --- a/README.md +++ b/README.md @@ -137,7 +137,7 @@ negative 4 ### How it works -The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch. +The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using a Network Load Balancer (NLB) and FastAPI / TensorFlow Serving / ONNX Runtime (depending on the model type). The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
From 9f8431960eacf032b670061bb0da9283dbee5c73 Mon Sep 17 00:00:00 2001 From: Vishal Bollu Date: Fri, 1 May 2020 19:35:35 -0400 Subject: [PATCH 11/20] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 4ecd4d764b..030e2fc1ec 100644 --- a/README.md +++ b/README.md @@ -79,7 +79,7 @@ positive
-## Running Cortex in production +## Deploying models at scale Cortex can also automatically provision and manage a Kubernetes cluster for inference workloads, typically for production use cases. Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any additional DevOps work. From c84c60cc06a8e935722f9ab02c713eed5f9e557e Mon Sep 17 00:00:00 2001 From: David Eliahu Date: Fri, 1 May 2020 17:43:53 -0700 Subject: [PATCH 12/20] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 030e2fc1ec..3ed43b24e8 100644 --- a/README.md +++ b/README.md @@ -85,7 +85,7 @@ Cortex can also automatically provision and manage a Kubernetes cluster for infe ### Spin up a cluster -Cortex clusters are designed to be self-hosted on any AWS account (and GCP support is coming soon): +Cortex clusters are designed to be self-hosted on any AWS account (GCP support is coming soon): ```bash # provision infrastructure on AWS and spin up a cluster From 8ddb59764ba4e6d63ef5c6d508717aea78d38394 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Fri, 1 May 2020 21:06:22 -0700 Subject: [PATCH 13/20] Update README.md --- README.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/README.md b/README.md index 3ed43b24e8..f7d31eb7c2 100644 --- a/README.md +++ b/README.md @@ -81,8 +81,6 @@ positive ## Deploying models at scale -Cortex can also automatically provision and manage a Kubernetes cluster for inference workloads, typically for production use cases. Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any additional DevOps work. 
- ### Spin up a cluster Cortex clusters are designed to be self-hosted on any AWS account (GCP support is coming soon): From 0a1af5298b8f5712b8a6c6aea232b3a85b557e19 Mon Sep 17 00:00:00 2001 From: Omer Spillinger <42219498+ospillinger@users.noreply.github.com> Date: Fri, 1 May 2020 21:13:17 -0700 Subject: [PATCH 14/20] Update README.md --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index f7d31eb7c2..563dee27b2 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Machine learning model serving platform +# Cortex: machine learning model serving infrastructure
@@ -86,7 +86,6 @@ positive Cortex clusters are designed to be self-hosted on any AWS account (GCP support is coming soon): ```bash -# provision infrastructure on AWS and spin up a cluster $ cortex cluster up aws region: us-west-2 From 459c535c6833c41f191878a9130d7b61c5c7e94e Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Fri, 1 May 2020 21:14:28 -0700 Subject: [PATCH 15/20] Update tagline --- cli/cmd/root.go | 2 +- docs/summary.md | 2 +- examples/pytorch/language-identifier/sample.json | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/cli/cmd/root.go b/cli/cmd/root.go index c39251bd7d..750255c217 100644 --- a/cli/cmd/root.go +++ b/cli/cmd/root.go @@ -121,7 +121,7 @@ func initTelemetry() { var _rootCmd = &cobra.Command{ Use: "cortex", Aliases: []string{"cx"}, - Short: "machine learning model serving platform", + Short: "machine learning model serving infrastructure", } func Execute() { diff --git a/docs/summary.md b/docs/summary.md index 200d577b1e..da02fa89f7 100644 --- a/docs/summary.md +++ b/docs/summary.md @@ -1,6 +1,6 @@ # Table of contents -* [Machine learning model serving platform](../README.md) +* [Machine learning model serving infrastructure](../README.md) * [Install](cluster-management/install.md) * [Tutorial](../examples/sklearn/iris-classifier/README.md) * [GitHub](https://github.com/cortexlabs/cortex) diff --git a/examples/pytorch/language-identifier/sample.json b/examples/pytorch/language-identifier/sample.json index 0e1392bb07..76fce072eb 100644 --- a/examples/pytorch/language-identifier/sample.json +++ b/examples/pytorch/language-identifier/sample.json @@ -1,3 +1,3 @@ { - "text": "machine learning model serving platform" + "text": "machine learning model serving infrastructure" } From 8b13797cd87ee472c59c00b0bf05bbc458c7378d Mon Sep 17 00:00:00 2001 From: Omer Spillinger <42219498+ospillinger@users.noreply.github.com> Date: Fri, 1 May 2020 21:15:11 -0700 Subject: [PATCH 16/20] Update README.md --- README.md | 2 +- 1 file 
changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 563dee27b2..d4e339a14a 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# Cortex: machine learning model serving infrastructure +# Machine learning model serving infrastructure
From 586fd188c711c8675d4d01d7b5aa5437142b5712 Mon Sep 17 00:00:00 2001 From: Omer Spillinger <42219498+ospillinger@users.noreply.github.com> Date: Fri, 1 May 2020 21:23:54 -0700 Subject: [PATCH 17/20] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index d4e339a14a..8e74dc804b 100644 --- a/README.md +++ b/README.md @@ -144,7 +144,7 @@ Cortex is an open source alternative to serving models with SageMaker or buildin
-## Examples of Cortex deployments +## Examples * [Image classification](https://github.com/cortexlabs/cortex/tree/0.15/examples/tensorflow/image-classifier): deploy an Inception model to classify images. From a1345fc14dcd6b295be5cd5860655a7455a8ce1b Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Sun, 3 May 2020 21:08:51 -0700 Subject: [PATCH 18/20] Update install.md --- docs/cluster-management/install.md | 46 ++++++++++++------------------ 1 file changed, 19 insertions(+), 27 deletions(-) diff --git a/docs/cluster-management/install.md b/docs/cluster-management/install.md index 195372c28d..c54639d471 100644 --- a/docs/cluster-management/install.md +++ b/docs/cluster-management/install.md @@ -2,41 +2,37 @@ _WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ -## Prerequisites +## Running on your machine or a single instance -1. [Docker](https://docs.docker.com/install) -2. [AWS credentials](aws-credentials.md) +[Docker](https://docs.docker.com/install) is required to run Cortex locally. In addition, your machine (or your Docker Desktop for Mac users) should have at least 8GB of memory if you plan to deploy large deep learning models. -## Spin up a cluster - -See [cluster configuration](config.md) to learn how you can customize your cluster with `cluster.yaml` and see [EC2 instances](ec2-instances.md) for an overview of several EC2 instance types. To use GPU nodes, you may need to subscribe to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM) and [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type. 
+### Install the CLI ```bash -# install the CLI on your machine $ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/master/get-cli.sh)" +``` -# provision infrastructure on AWS and spin up a cluster -$ cortex cluster up - -aws resource cost per hour -1 eks cluster $0.10 -0 - 5 g4dn.xlarge instances for your apis $0.1578 - $0.526 each (varies based on spot price) -0 - 5 50gb ebs volumes for your apis $0.007 each -1 t3.medium instance for the operator $0.0416 -1 20gb ebs volume for the operator $0.003 -2 network load balancers $0.0225 each +## Running at scale on AWS -your cluster will cost $0.19 - $2.85 per hour based on cluster size and spot instance pricing/availability +[Docker](https://docs.docker.com/install) and valid [AWS credentials](aws-credentials.md) are required to run a Cortex cluster on AWS. -○ spinning up your cluster ... +### Spin up a cluster -your cluster is ready! -``` +See [cluster configuration](config.md) to learn how you can customize your cluster with `cluster.yaml` and see [EC2 instances](ec2-instances.md) for an overview of several EC2 instance types. -## Deploy a model +To use GPU nodes, you may need to subscribe to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM) and [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type. 
+```bash +# install the CLI on your machine +$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/master/get-cli.sh)" + +# provision infrastructure on AWS and spin up a cluster +$ cortex cluster up +``` + +## Deploy an example ```bash # clone the Cortex repository @@ -45,7 +41,7 @@ git clone -b master https://github.com/cortexlabs/cortex.git # navigate to the TensorFlow iris classification example cd cortex/examples/tensorflow/iris-classifier -# deploy the model to the cluster +# deploy the model cortex deploy # view the status of the api @@ -61,11 +57,7 @@ cortex get iris-classifier curl -X POST -H "Content-Type: application/json" \ -d '{ "sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3 }' \ -``` -## Cleanup - -```bash # delete the api cortex delete iris-classifier ``` From df76a5533a7a9b805bccf5c1f9a3986f449665a9 Mon Sep 17 00:00:00 2001 From: Omer Spillinger Date: Sun, 3 May 2020 21:15:54 -0700 Subject: [PATCH 19/20] Update README.md --- examples/sklearn/iris-classifier/README.md | 66 ++++++++++++++++------ 1 file changed, 48 insertions(+), 18 deletions(-) diff --git a/examples/sklearn/iris-classifier/README.md b/examples/sklearn/iris-classifier/README.md index 19df9e16b1..7056aac36a 100644 --- a/examples/sklearn/iris-classifier/README.md +++ b/examples/sklearn/iris-classifier/README.md @@ -1,4 +1,4 @@ -# Deploy a model as a web service +# Deploy models as a web APIs _WARNING: you are on the master branch, please refer to the examples on the branch that matches your `cortex version`_ @@ -119,9 +119,9 @@ Create a `cortex.yaml` file and add the configuration below and replace `cortex-
-## Deploy to AWS +## Deploy your model locally -`cortex deploy` takes the configuration from `cortex.yaml` and creates it on your cluster: +`cortex deploy` takes your model along with the configuration from `cortex.yaml` and creates a web API: ```bash $ cortex deploy @@ -129,7 +129,7 @@ $ cortex deploy creating iris-classifier ``` -Track the status of your api using `cortex get`: +Monitor the status of your API using `cortex get`: ```bash $ cortex get iris-classifier --watch @@ -137,7 +137,7 @@ $ cortex get iris-classifier --watch status up-to-date requested last update avg request 2XX live 1 1 1m - - -endpoint: http://***.amazonaws.com/iris-classifier +endpoint: http://localhost:8888 ``` The output above indicates that one replica of your API was requested and is available to serve predictions. Cortex will automatically launch more replicas if the load increases and spin down replicas if there is unused capacity. @@ -148,11 +148,37 @@ You can also stream logs from your API: $ cortex logs iris-classifier ``` +You can use `curl` to test your API: + +```bash +$ curl http://localhost:8888 \ + -X POST -H "Content-Type: application/json" \ + -d '{"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}' + +"setosa" +``` +
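To make the `"setosa"` response from the curl above concrete, here is a toy pure-Python stand-in for the classifier — hand-picked petal thresholds for illustration only; the example actually serves a trained scikit-learn model, not rules like these:

```python
def classify_iris(sepal_length, sepal_width, petal_length, petal_width):
    # Illustrative thresholds only -- the deployed API runs a trained model.
    if petal_length < 2.5:
        return "setosa"
    if petal_width < 1.7:
        return "versicolor"
    return "virginica"

# The sample payload from the curl command above.
print(classify_iris(5.2, 3.6, 1.4, 0.3))  # -> setosa
```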
-## Serve real-time predictions +## Deploy your model to AWS -You can use `curl` to test your prediction service: +Cortex can automatically provision infrastructure on your AWS account and deploy your models as production-ready web services: + +```bash +$ cortex cluster up +``` + +You can deploy the model using the same code and configuration to your cluster: + +```bash +$ cortex deploy --env aws + +creating iris-classifier +``` + +
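The curl commands in this example can equally be scripted; a short client sketch using only Python's standard library (the helper name `build_predict_request` is my own for illustration — swap in the cluster's `***.amazonaws.com` endpoint after deploying with `--env aws`):

```python
import json
from urllib import request

def build_predict_request(endpoint, payload):
    # Build (but do not send) a POST request equivalent to the curl commands
    # in this example; send it with request.urlopen(req) once the API is up.
    return request.Request(
        endpoint,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The local endpoint reported by `cortex get`.
req = build_predict_request(
    "http://localhost:8888",
    {"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3},
)
```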
+ +## Serve predictions in production ```bash $ curl http://***.amazonaws.com/iris-classifier \ @@ -185,7 +211,7 @@ Add a `tracker` to your `cortex.yaml` and specify that this is a classification Run `cortex deploy` again to perform a rolling update to your API with the new configuration: ```bash -$ cortex deploy +$ cortex deploy --env aws updating iris-classifier ``` @@ -193,7 +219,7 @@ updating iris-classifier After making more predictions, your `cortex get` command will show information about your API's past predictions: ```bash -$ cortex get iris-classifier --watch +$ cortex get --env aws iris-classifier --watch status up-to-date requested last update avg request 2XX live 1 1 1m 1.1 ms 14 @@ -230,7 +256,7 @@ This model is fairly small but larger models may require more compute resources. You could also configure GPU compute here if your cluster supports it. Adding compute resources may help reduce your inference latency. Run `cortex deploy` again to update your API with this configuration: ```bash -$ cortex deploy +$ cortex deploy --env aws updating iris-classifier ``` @@ -238,7 +264,7 @@ updating iris-classifier Run `cortex get` again: ```bash -$ cortex get iris-classifier --watch +$ cortex get --env aws iris-classifier --watch status up-to-date requested last update avg request 2XX live 1 1 1m 1.1 ms 14 @@ -288,7 +314,7 @@ If you trained another model and want to A/B test it with your previous model, s Run `cortex deploy` to create the new API: ```bash -$ cortex deploy +$ cortex deploy --env aws iris-classifier is up to date creating another-iris-classifier @@ -297,7 +323,7 @@ creating another-iris-classifier `cortex deploy` is declarative so the `iris-classifier` API is unchanged while `another-iris-classifier` is created: ```bash -$ cortex get --watch +$ cortex get --env aws --watch api status up-to-date requested last update iris-classifier live 1 1 5m @@ -388,7 +414,7 @@ Next, add the `api` to `cortex.yaml`: Run `cortex deploy` to create your batch API: 
```bash -$ cortex deploy +$ cortex deploy --env aws updating iris-classifier updating another-iris-classifier @@ -400,7 +426,7 @@ Since a new file was added to the directory, and all files in the directory cont `cortex get` should show all 3 APIs now: ```bash -$ cortex get --watch +$ cortex get --env aws --watch api status up-to-date requested last update iris-classifier live 1 1 1m @@ -446,15 +472,19 @@ $ curl http://***.amazonaws.com/batch-iris-classifier \ Run `cortex delete` to delete each API: ```bash -$ cortex delete iris-classifier +$ cortex delete --env local iris-classifier + +deleting iris-classifier + +$ cortex delete --env aws iris-classifier deleting iris-classifier -$ cortex delete another-iris-classifier +$ cortex delete --env aws another-iris-classifier deleting another-iris-classifier -$ cortex delete batch-iris-classifier +$ cortex delete --env aws batch-iris-classifier deleting batch-iris-classifier ``` From f1993efb8f7e29b6ec3d30aa4cdbce225f55dad8 Mon Sep 17 00:00:00 2001 From: David Eliahu Date: Sun, 3 May 2020 21:27:52 -0700 Subject: [PATCH 20/20] Update README.md --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 8e74dc804b..4e298e3eb3 100644 --- a/README.md +++ b/README.md @@ -70,7 +70,7 @@ creating sentiment-classifier ### Serve predictions ```bash -$ curl http://localhost:12345 \ +$ curl http://localhost:8888 \ -X POST -H "Content-Type: application/json" \ -d '{"text": "serving models locally is cool!"}' @@ -136,6 +136,8 @@ negative 4 The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using a Network Load Balancer (NLB) and FastAPI / TensorFlow Serving / ONNX Runtime (depending on the model type). 
The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch. +Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any additional DevOps work. +
## What is Cortex similar to?