diff --git a/.dockerignore b/.dockerignore
index 1b0561f446..3d39c7390f 100644
--- a/.dockerignore
+++ b/.dockerignore
@@ -2,7 +2,7 @@
 /bin/
 /dev/
 /docs/
-/examples/
+/test/
 **/.*
 **/*.md
diff --git a/.gitbook.yaml b/.gitbook.yaml
index 7a67fbf248..1977d6450c 100644
--- a/.gitbook.yaml
+++ b/.gitbook.yaml
@@ -1,13 +1,5 @@
 root: ./docs/
 
 structure:
-  readme: ../README.md
+  readme: ./tutorials/realtime.md
   summary: summary.md
-
-redirects:
-  tutorial: ../examples/pytorch/text-generator/README.md
-  tutorial/realtime: ../examples/pytorch/text-generator/README.md
-  tutorial/batch: ../examples/batch/image-classifier/README.md
-  install: ./cloud/install.md
-  uninstall: ./cloud/uninstall.md
-  update: ./cloud/update.md
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
deleted file mode 100644
index 425f0e1e73..0000000000
--- a/CODE_OF_CONDUCT.md
+++ /dev/null
@@ -1,76 +0,0 @@
-# Contributor Covenant Code of Conduct
-
-## Our Pledge
-
-In the interest of fostering an open and welcoming environment, we as
-contributors and maintainers pledge to making participation in our project and
-our community a harassment-free experience for everyone, regardless of age, body
-size, disability, ethnicity, sex characteristics, gender identity and expression,
-level of experience, education, socio-economic status, nationality, personal
-appearance, race, religion, or sexual identity and orientation.
-
-## Our Standards
-
-Examples of behavior that contributes to creating a positive environment
-include:
-
-* Using welcoming and inclusive language
-* Being respectful of differing viewpoints and experiences
-* Gracefully accepting constructive criticism
-* Focusing on what is best for the community
-* Showing empathy towards other community members
-
-Examples of unacceptable behavior by participants include:
-
-* The use of sexualized language or imagery and unwelcome sexual attention or
-  advances
-* Trolling, insulting/derogatory comments, and personal or political attacks
-* Public or private harassment
-* Publishing others' private information, such as a physical or electronic
-  address, without explicit permission
-* Other conduct which could reasonably be considered inappropriate in a
-  professional setting
-
-## Our Responsibilities
-
-Project maintainers are responsible for clarifying the standards of acceptable
-behavior and are expected to take appropriate and fair corrective action in
-response to any instances of unacceptable behavior.
-
-Project maintainers have the right and responsibility to remove, edit, or
-reject comments, commits, code, wiki edits, issues, and other contributions
-that are not aligned to this Code of Conduct, or to ban temporarily or
-permanently any contributor for other behaviors that they deem inappropriate,
-threatening, offensive, or harmful.
-
-## Scope
-
-This Code of Conduct applies both within project spaces and in public spaces
-when an individual is representing the project or its community. Examples of
-representing a project or community include using an official project e-mail
-address, posting via an official social media account, or acting as an appointed
-representative at an online or offline event. Representation of a project may be
-further defined and clarified by project maintainers.
-
-## Enforcement
-
-Instances of abusive, harassing, or otherwise unacceptable behavior may be
-reported by contacting the project team at contact@cortex.dev. All
-complaints will be reviewed and investigated and will result in a response that
-is deemed necessary and appropriate to the circumstances. The project team is
-obligated to maintain confidentiality with regard to the reporter of an incident.
-Further details of specific enforcement policies may be posted separately.
-
-Project maintainers who do not follow or enforce the Code of Conduct in good
-faith may face temporary or permanent repercussions as determined by other
-members of the project's leadership.
-
-## Attribution
-
-This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
-available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
-
-[homepage]: https://www.contributor-covenant.org
-
-For answers to common questions about this code of conduct, see
-https://www.contributor-covenant.org/faq
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
deleted file mode 100644
index facbf253e0..0000000000
--- a/CONTRIBUTING.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# Contributing
-
-Thank you for your interest in contributing to Cortex!
-
-- **Report a bug, request a feature, or share feedback:** please let us know via [email](mailto:hello@cortex.dev), [chat](https://gitter.im/cortexlabs/cortex), or [issues](https://github.com/cortexlabs/cortex/issues).
-
-- **Add an example:** we're always excited to see cool models deployed with Cortex. Please check out our [examples](examples) and feel free to add one by submitting a pull request.
-
-- **Implement a feature:** here are [instructions for setting up a development environment](docs/contributing/development.md). If you'd like to contribute significant code to the project, please reach out to us so we can work together on the design and make sure we're on the same page before you get started.
diff --git a/README.md b/README.md
index 53657f23f2..9e450eb2d0 100644
--- a/README.md
+++ b/README.md
@@ -3,10 +3,6 @@
-
-[install](https://docs.cortex.dev/install) • [documentation](https://docs.cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.23/examples) • [community](https://gitter.im/cortexlabs/cortex)
-
 # Deploy machine learning models to production
 
 Cortex is an open source platform for deploying, managing, and scaling machine learning in production.
@@ -56,8 +52,6 @@ cortex is ready!
 
 #### Implement a predictor
 
 ```python
-# predictor.py
-
 from transformers import pipeline
 
 class PythonPredictor:
@@ -74,20 +68,13 @@ class PythonPredictor:
 api_spec = {
   "name": "text-generator",
   "kind": "RealtimeAPI",
-  "predictor": {
-    "type": "python",
-    "path": "predictor.py"
-  },
   "compute": {
     "gpu": 1,
-    "mem": "8Gi",
+    "mem": "8Gi"
   },
   "autoscaling": {
     "min_replicas": 1,
     "max_replicas": 10
-  },
-  "networking": {
-    "api_gateway": "public"
   }
 }
 ```
@@ -108,19 +95,15 @@ api_spec = {
 
 import cortex
 
 cx = cortex.client("aws")
-cx.deploy(api_spec, project_dir=".")
+cx.create_api(api_spec, predictor=PythonPredictor)
 
 # creating https://example.com/text-generator
 
 #### Consume your API
 
-```python
-import requests
-
-endpoint = "https://example.com/text-generator"
-payload = {"text": "hello world"}
-prediction = requests.post(endpoint, payload)
+```bash
+$ curl https://example.com/text-generator -X POST -H "Content-Type: application/json" -d '{"text": "hello world"}'
 ```
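The README hunks above replace the file-based `cx.deploy(api_spec, project_dir=".")` with `cx.create_api(api_spec, predictor=PythonPredictor)`, and drop the `predictor`/`networking` sections from the spec accordingly. A standalone sketch of the shapes involved (hypothetical stand-ins: no cluster and no `cortex` or `transformers` installation is assumed; the echo predictor below replaces the README's transformers pipeline):

```python
# Sketch of the new create_api() flow: the predictor class is passed directly,
# so the api_spec no longer needs a "predictor" section with a "path".

class PythonPredictor:
    def __init__(self, config):
        # placeholder for model loading (the README loads a transformers pipeline here)
        self.prefix = "echo: "

    def predict(self, payload):
        # payload matches the curl body shown in the README: {"text": "..."}
        return self.prefix + payload["text"]

api_spec = {
    "name": "text-generator",
    "kind": "RealtimeAPI",
    "compute": {"gpu": 1, "mem": "8Gi"},
    "autoscaling": {"min_replicas": 1, "max_replicas": 10},
}

# offline sanity check of the shapes involved (no cluster required)
predictor = PythonPredictor(config={})
print(predictor.predict({"text": "hello world"}))  # echo: hello world
```

The design point of the change: the predictor code travels with the Python client call instead of being resolved from a `predictor.py` path inside a project directory.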
@@ -131,4 +114,4 @@ prediction = requests.post(endpoint, payload)
 pip install cortex
 ```
 
-See the [installation guide](https://docs.cortex.dev/install) for next steps.
+[Deploy models](https://docs.cortex.dev) and [join our community](https://gitter.im/cortexlabs/cortex).
diff --git a/build/lint.sh b/build/lint.sh
index cdd6b5d4cd..94dcd54465 100755
--- a/build/lint.sh
+++ b/build/lint.sh
@@ -74,7 +74,7 @@ output=$(cd "$ROOT" && find . -type f \
 ! -path "**/.idea/*" \
 ! -path "**/.history/*" \
 ! -path "**/__pycache__/*" \
-! -path "./examples/*" \
+! -path "./test/*" \
 ! -path "./dev/config/*" \
 ! -path "./bin/*" \
 ! -path "./.circleci/*" \
@@ -137,25 +137,12 @@ if [ "$is_release_branch" = "true" ]; then
     exit 1
   fi
 
-  # Check for version warning comments in examples
-  output=$(cd "$ROOT/examples" && find . -type f \
-  ! -name "README.md" \
-  ! -name "*.json" \
-  ! -name "*.txt" \
-  ! -name ".*" \
-  ! -name "*.bin" \
-  -exec grep -L -e "this is an example for cortex release ${git_branch} and may not deploy correctly on other releases of cortex" {} \;)
-  if [[ $output ]]; then
-    echo "examples file(s) are missing appropriate version comment:"
-    echo "$output"
-    exit 1
-  fi
-
 else
   # Check for version warning comments in docs
   output=$(cd "$ROOT/docs" && find . -type f \
   ! -path "./README.md" \
   ! -name "summary.md" \
+  ! -path "./tutorials/*" \
   ! -name "development.md" \
   ! -name "*.json" \
   ! -name "*.txt" \
@@ -167,21 +154,6 @@ else
     echo "$output"
     exit 1
   fi
-
-  # Check for version warning comments in examples
-  output=$(cd "$ROOT/examples" && find . -type f \
-  ! -path "./README.md" \
-  ! -path "**/__pycache__/*" \
-  ! -name "*.json" \
-  ! -name "*.txt" \
-  ! -name ".*" \
-  ! -name "*.bin" \
-  -exec grep -L "WARNING: you are on the master branch; please refer to examples on the branch corresponding to your \`cortex version\` (e\.g\. for version [0-9]*\.[0-9]*\.\*, run \`git checkout -b [0-9]*\.[0-9]*\` or switch to the \`[0-9]*\.[0-9]*\` branch on GitHub)" {} \;)
-  if [[ $output ]]; then
-    echo "example file(s) are missing version appropriate comment:"
-    echo "$output"
-    exit 1
-  fi
 fi
 
 # Check for trailing whitespace
diff --git a/build/test-examples.sh b/build/test-examples.sh
index 33b562f095..3b334f4d00 100755
--- a/build/test-examples.sh
+++ b/build/test-examples.sh
@@ -19,7 +19,7 @@ set -eou pipefail
 ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")"/.. >/dev/null && pwd)"
 CORTEX="$ROOT/bin/cortex"
 
-for example in $ROOT/examples/*/cortex.yaml; do
+for example in $ROOT/test/*/cortex.yaml; do
   timer=1200
   example_base_dir=$(dirname "${example}")
   retry="false"
diff --git a/cli/cluster/errors.go b/cli/cluster/errors.go
index a0a953428b..0256283da5 100644
--- a/cli/cluster/errors.go
+++ b/cli/cluster/errors.go
@@ -62,7 +62,7 @@ func ErrorFailedToConnectOperator(originalError error, envName string, operatorU
 	msg += fmt.Sprintf(" → otherwise you can ignore this message, and prevent it in the future with `cortex env delete %s`\n", envName)
 	msg += "\nif you have a cluster running:\n"
 	msg += fmt.Sprintf(" → run `cortex cluster info --configure-env %s` to update your environment (include `--config ` if you have a cluster configuration file)\n", envName)
-	msg += fmt.Sprintf(" → if you set `operator_load_balancer_scheme: internal` in your cluster configuration file, your CLI must run from within a VPC that has access to your cluster's VPC (see https://docs.cortex.dev/v/%s/aws/vpc-peering)\n", consts.CortexVersionMinor)
+	msg += fmt.Sprintf(" → if you set `operator_load_balancer_scheme: internal` in your cluster configuration file, your CLI must run from within a VPC that has access to your cluster's VPC (see https://docs.cortex.dev/v/%s/)\n", consts.CortexVersionMinor)
 	}
 
 	return errors.WithStack(&errors.Error{
diff --git a/cli/cmd/errors.go b/cli/cmd/errors.go
index 7757ed835b..bd94fa8b11 100644
--- a/cli/cmd/errors.go
+++ b/cli/cmd/errors.go
@@ -249,7 +249,7 @@ func ErrorMissingAWSCredentials() error {
 func ErrorCredentialsInClusterConfig(cmd string, path string) error {
 	return errors.WithStack(&errors.Error{
 		Kind: ErrCredentialsInClusterConfig,
-		Message: fmt.Sprintf("specifying credentials in the cluster configuration is no longer supported, please specify aws credentials using flags (e.g. cortex cluster %s --config %s --aws-key --aws-secret ) or set environment variables; see https://docs.cortex.dev/v/%s/aws/security#iam-permissions for more information", cmd, path, consts.CortexVersionMinor),
+		Message: fmt.Sprintf("specifying credentials in the cluster configuration is no longer supported, please specify aws credentials using flags (e.g. cortex cluster %s --config %s --aws-key --aws-secret ) or set environment variables; see https://docs.cortex.dev/v/%s/ for more information", cmd, path, consts.CortexVersionMinor),
 	})
 }
 
@@ -343,6 +343,6 @@ func ErrorDeployFromTopLevelDir(genericDirName string, providerType types.Provid
 	}
 	return errors.WithStack(&errors.Error{
 		Kind: ErrDeployFromTopLevelDir,
-		Message: fmt.Sprintf("cannot deploy from your %s directory - when deploying your API, cortex sends all files in your project directory (i.e. the directory which contains cortex.yaml) to your %s (see https://docs.cortex.dev/v/%s/deployments/realtime-api/predictors#project-files for Realtime API and https://docs.cortex.dev/v/%s/deployments/batch-api/predictors#project-files for Batch API); therefore it is recommended to create a subdirectory for your project files", genericDirName, targetStr, consts.CortexVersionMinor, consts.CortexVersionMinor),
+		Message: fmt.Sprintf("cannot deploy from your %s directory - when deploying your API, cortex sends all files in your project directory (i.e. the directory which contains cortex.yaml) to your %s (see https://docs.cortex.dev/v/%s/); therefore it is recommended to create a subdirectory for your project files", genericDirName, targetStr, consts.CortexVersionMinor),
 	})
 }
diff --git a/cli/cmd/lib_aws_creds.go b/cli/cmd/lib_aws_creds.go
index d2a8866393..99a49167db 100644
--- a/cli/cmd/lib_aws_creds.go
+++ b/cli/cmd/lib_aws_creds.go
@@ -69,7 +69,7 @@ func promptIfNotAdmin(awsClient *aws.Client, disallowPrompt bool) {
 	}
 
 	if !awsClient.IsAdmin() {
-		warningStr := fmt.Sprintf("warning: your IAM user%s does not have administrator access. This will likely prevent Cortex from installing correctly, so it is recommended to attach the AdministratorAccess policy to your IAM user (or to a group that your IAM user belongs to) via the AWS IAM console. If you'd like, you may provide separate credentials for your cluster to use after it's running (see https://docs.cortex.dev/v/%s/aws/security for instructions).\n\n", accessKeyMsg, consts.CortexVersionMinor)
+		warningStr := fmt.Sprintf("warning: your IAM user%s does not have administrator access. This will likely prevent Cortex from installing correctly, so it is recommended to attach the AdministratorAccess policy to your IAM user (or to a group that your IAM user belongs to) via the AWS IAM console. If you'd like, you may provide separate credentials for your cluster to use after it's running (see https://docs.cortex.dev/v/%s/).\n\n", accessKeyMsg, consts.CortexVersionMinor)
 		if disallowPrompt {
 			fmt.Print(warningStr)
 		} else {
diff --git a/cli/cmd/lib_cluster_config_aws.go b/cli/cmd/lib_cluster_config_aws.go
index 2971e47c19..020a9648ee 100644
--- a/cli/cmd/lib_cluster_config_aws.go
+++ b/cli/cmd/lib_cluster_config_aws.go
@@ -70,7 +70,7 @@ func readCachedClusterConfigFile(clusterConfig *clusterconfig.Config, filePath s
 func readUserClusterConfigFile(clusterConfig *clusterconfig.Config) error {
 	errs := cr.ParseYAMLFile(clusterConfig, clusterconfig.UserValidation, _flagClusterConfig)
 	if errors.HasError(errs) {
-		return errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/aws/install", consts.CortexVersionMinor))
+		return errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 	}
 
 	return nil
@@ -85,7 +85,7 @@ func getNewClusterAccessConfig(disallowPrompt bool) (*clusterconfig.AccessConfig
 	if _flagClusterConfig != "" {
 		errs := cr.ParseYAMLFile(accessConfig, clusterconfig.AccessValidation, _flagClusterConfig)
 		if errors.HasError(errs) {
-			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/aws/install", consts.CortexVersionMinor))
+			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		}
 	}
 
@@ -121,7 +121,7 @@ func getClusterAccessConfigWithCache(disallowPrompt bool) (*clusterconfig.Access
 	if _flagClusterConfig != "" {
 		errs := cr.ParseYAMLFile(accessConfig, clusterconfig.AccessValidation, _flagClusterConfig)
 		if errors.HasError(errs) {
-			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/aws/install", consts.CortexVersionMinor))
+			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		}
 	}
 
@@ -192,7 +192,7 @@ func getInstallClusterConfig(awsClient *aws.Client, awsCreds AWSCredentials, acc
 	err = clusterConfig.Validate(awsClient)
 	if err != nil {
-		err = errors.Append(err, fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/aws/install", consts.CortexVersionMinor))
+		err = errors.Append(err, fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		if _flagClusterConfig != "" {
 			err = errors.Wrap(err, _flagClusterConfig)
 		}
@@ -258,7 +258,7 @@ func getConfigureClusterConfig(cachedClusterConfig clusterconfig.Config, awsCred
 	err = userClusterConfig.Validate(awsClient)
 	if err != nil {
-		err = errors.Append(err, fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/aws/install", consts.CortexVersionMinor))
+		err = errors.Append(err, fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		if _flagClusterConfig != "" {
 			err = errors.Wrap(err, _flagClusterConfig)
 		}
@@ -542,23 +542,23 @@ func confirmInstallClusterConfig(clusterConfig *clusterconfig.Config, awsCreds A
 	fmt.Printf("cortex will also create an s3 bucket (%s) and a cloudwatch log group (%s)%s\n\n", clusterConfig.Bucket, clusterConfig.ClusterName, privateSubnetMsg)
 
 	if clusterConfig.APIGatewaySetting == clusterconfig.NoneAPIGatewaySetting {
-		fmt.Print(fmt.Sprintf("warning: you've disabled API Gateway cluster-wide, so APIs will not be able to create API Gateway endpoints (they will still be reachable via the API load balancer; see https://docs.cortex.dev/v/%s/aws/networking for more information)\n\n", consts.CortexVersionMinor))
+		fmt.Print(fmt.Sprintf("warning: you've disabled API Gateway cluster-wide, so APIs will not be able to create API Gateway endpoints (they will still be reachable via the API load balancer; see https://docs.cortex.dev/v/%s/ for more information)\n\n", consts.CortexVersionMinor))
 	}
 
 	if clusterConfig.OperatorLoadBalancerScheme == clusterconfig.InternalLoadBalancerScheme {
-		fmt.Print(fmt.Sprintf("warning: you've configured the operator load balancer to be internal; you must configure VPC Peering to connect your CLI to your cluster operator (see https://docs.cortex.dev/v/%s/aws/vpc-peering)\n\n", consts.CortexVersionMinor))
+		fmt.Print(fmt.Sprintf("warning: you've configured the operator load balancer to be internal; you must configure VPC Peering to connect your CLI to your cluster operator (see https://docs.cortex.dev/v/%s/)\n\n", consts.CortexVersionMinor))
 	}
 
 	if isSpot && clusterConfig.SpotConfig.OnDemandBackup != nil && !*clusterConfig.SpotConfig.OnDemandBackup {
 		if *clusterConfig.SpotConfig.OnDemandBaseCapacity == 0 && *clusterConfig.SpotConfig.OnDemandPercentageAboveBaseCapacity == 0 {
-			fmt.Printf("warning: you've disabled on-demand instances (%s=0 and %s=0); spot instances are not guaranteed to be available so please take that into account for production clusters; see https://docs.cortex.dev/v/%s/aws/spot for more information\n\n", clusterconfig.OnDemandBaseCapacityKey, clusterconfig.OnDemandPercentageAboveBaseCapacityKey, consts.CortexVersionMinor)
+			fmt.Printf("warning: you've disabled on-demand instances (%s=0 and %s=0); spot instances are not guaranteed to be available so please take that into account for production clusters; see https://docs.cortex.dev/v/%s/ for more information\n\n", clusterconfig.OnDemandBaseCapacityKey, clusterconfig.OnDemandPercentageAboveBaseCapacityKey, consts.CortexVersionMinor)
 		} else {
-			fmt.Printf("warning: you've enabled spot instances; spot instances are not guaranteed to be available so please take that into account for production clusters; see https://docs.cortex.dev/v/%s/aws/spot for more information\n\n", consts.CortexVersionMinor)
+			fmt.Printf("warning: you've enabled spot instances; spot instances are not guaranteed to be available so please take that into account for production clusters; see https://docs.cortex.dev/v/%s/ for more information\n\n", consts.CortexVersionMinor)
 		}
 	}
 
 	if !disallowPrompt {
-		exitMessage := fmt.Sprintf("cluster configuration can be modified via the cluster config file; see https://docs.cortex.dev/v/%s/aws/install for more information", consts.CortexVersionMinor)
+		exitMessage := fmt.Sprintf("cluster configuration can be modified via the cluster config file; see https://docs.cortex.dev/v/%s/ for more information", consts.CortexVersionMinor)
 		prompt.YesOrExit("would you like to continue?", "", exitMessage)
 	}
 }
@@ -567,7 +567,7 @@ func confirmConfigureClusterConfig(clusterConfig clusterconfig.Config, awsCreds
 	fmt.Println(clusterConfigConfirmationStr(clusterConfig, awsCreds, awsClient))
 
 	if !disallowPrompt {
-		exitMessage := fmt.Sprintf("cluster configuration can be modified via the cluster config file; see https://docs.cortex.dev/v/%s/aws/install for more information", consts.CortexVersionMinor)
+		exitMessage := fmt.Sprintf("cluster configuration can be modified via the cluster config file; see https://docs.cortex.dev/v/%s/ for more information", consts.CortexVersionMinor)
 		prompt.YesOrExit(fmt.Sprintf("your cluster named \"%s\" in %s will be updated according to the configuration above, are you sure you want to continue?", clusterConfig.ClusterName, *clusterConfig.Region), "", exitMessage)
 	}
 }
diff --git a/cli/cmd/lib_cluster_config_gcp.go b/cli/cmd/lib_cluster_config_gcp.go
index 4530bab618..62724d0616 100644
--- a/cli/cmd/lib_cluster_config_gcp.go
+++ b/cli/cmd/lib_cluster_config_gcp.go
@@ -66,7 +66,7 @@ func readCachedGCPClusterConfigFile(clusterConfig *clusterconfig.GCPConfig, file
 func readUserGCPClusterConfigFile(clusterConfig *clusterconfig.GCPConfig) error {
 	errs := cr.ParseYAMLFile(clusterConfig, clusterconfig.UserGCPValidation, _flagClusterGCPConfig)
 	if errors.HasError(errs) {
-		return errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/gcp/install", consts.CortexVersionMinor))
+		return errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 	}
 
 	return nil
@@ -81,7 +81,7 @@ func getNewGCPClusterAccessConfig(disallowPrompt bool) (*clusterconfig.GCPAccess
 	if _flagClusterGCPConfig != "" {
 		errs := cr.ParseYAMLFile(accessConfig, clusterconfig.GCPAccessValidation, _flagClusterGCPConfig)
 		if errors.HasError(errs) {
-			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/gcp/install", consts.CortexVersionMinor))
+			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		}
 	}
 
@@ -120,7 +120,7 @@ func getGCPClusterAccessConfigWithCache(disallowPrompt bool) (*clusterconfig.GCP
 	if _flagClusterGCPConfig != "" {
 		errs := cr.ParseYAMLFile(accessConfig, clusterconfig.GCPAccessValidation, _flagClusterGCPConfig)
 		if errors.HasError(errs) {
-			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/gcp/install", consts.CortexVersionMinor))
+			return nil, errors.Append(errors.FirstError(errs...), fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		}
 	}
 
@@ -196,7 +196,7 @@ func getGCPInstallClusterConfig(gcpClient *gcp.Client, accessConfig clusterconfi
 	err = clusterConfig.Validate(gcpClient)
 	if err != nil {
-		err = errors.Append(err, fmt.Sprintf("\n\ncluster configuration schema can be found here: https://docs.cortex.dev/v/%s/gcp/install", consts.CortexVersionMinor))
+		err = errors.Append(err, fmt.Sprintf("\n\ncluster configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		if _flagClusterGCPConfig != "" {
 			err = errors.Wrap(err, _flagClusterGCPConfig)
 		}
@@ -212,7 +212,7 @@ func confirmGCPInstallClusterConfig(clusterConfig *clusterconfig.GCPConfig, disa
 	fmt.Printf("a cluster named \"%s\" will be created in %s (zone: %s)\n\n", clusterConfig.ClusterName, *clusterConfig.Project, *clusterConfig.Zone)
 
 	if !disallowPrompt {
-		exitMessage := fmt.Sprintf("cluster configuration can be modified via the cluster config file; see https://docs.cortex.dev/v/%s/gcp/install for more information", consts.CortexVersionMinor)
+		exitMessage := fmt.Sprintf("cluster configuration can be modified via the cluster config file; see https://docs.cortex.dev/v/%s/ for more information", consts.CortexVersionMinor)
 		prompt.YesOrExit("would you like to continue?", "", exitMessage)
 	}
 }
diff --git a/cli/local/deploy.go b/cli/local/deploy.go
index 3f3407ab42..3a7741c9ba 100644
--- a/cli/local/deploy.go
+++ b/cli/local/deploy.go
@@ -101,7 +101,7 @@ func deploy(env cliconfig.Environment, apiConfigs []userconfig.API, projectFiles
 	models := []spec.CuratedModelResource{}
 	err = ValidateLocalAPIs(apiConfigs, &models, projectFiles, awsClient, gcpClient)
 	if err != nil {
-		err = errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Realtime API can be found at https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration", consts.CortexVersionMinor))
+		err = errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Realtime API can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		return nil, err
 	}
 
diff --git a/dev/generate_python_client_md.sh b/dev/generate_python_client_md.sh
index fd68e9130f..2aa4317250 100755
--- a/dev/generate_python_client_md.sh
+++ b/dev/generate_python_client_md.sh
@@ -30,38 +30,38 @@ cd $ROOT/pkg/workloads/cortex/client
 
 pip3 install -e .
 
-pydoc-markdown -m cortex -m cortex.client --render-toc > $ROOT/docs/miscellaneous/python-client.md
+pydoc-markdown -m cortex -m cortex.client --render-toc > $ROOT/docs/workloads/python-client.md
 
 # title
-sed -i "s/# Table of Contents/# Python client\n\n_WARNING: you are on the master branch, please refer to the docs on the branch that matches your \`cortex version\`_/g" $ROOT/docs/miscellaneous/python-client.md
+sed -i "s/# Table of Contents/# Python client\n\n_WARNING: you are on the master branch, please refer to the docs on the branch that matches your \`cortex version\`_/g" $ROOT/docs/workloads/python-client.md
 
 # delete links
-sed -i "//g" $ROOT/docs/miscellaneous/python-client.md
+sed -i "s/^## create\\\_api/## create\\\_api\n\n/g" $ROOT/docs/workloads/python-client.md
 
 pip3 uninstall -y cortex
 rm -rf $ROOT/pkg/workloads/cortex/client/cortex.egg-info
diff --git a/docs/deployments/gpus.md b/docs/aws/gpu.md
similarity index 91%
rename from docs/deployments/gpus.md
rename to docs/aws/gpu.md
index cc7af572b7..98d950cea3 100644
--- a/docs/deployments/gpus.md
+++ b/docs/aws/gpu.md
@@ -5,9 +5,9 @@ _WARNING: you are on the master branch, please refer to the docs on the branch t
 To use GPUs:
 
 1. Make sure your AWS account is subscribed to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM).
-2. You may need to [request a limit increase](https://console.aws.amazon.com/servicequotas/home?#!/services/ec2/quotas) for your desired instance type.
-3. Set instance type to an AWS GPU instance (e.g. `g4dn.xlarge`) when installing Cortex.
-4. Set the `gpu` field in the `compute` configuration for your API. One unit of GPU corresponds to one virtual GPU. Fractional requests are not allowed.
+1. You may need to [request a limit increase](https://console.aws.amazon.com/servicequotas/home?#!/services/ec2/quotas) for your desired instance type.
+1. Set instance type to an AWS GPU instance (e.g. `g4dn.xlarge`) when installing Cortex.
+1. Set the `gpu` field in the `compute` configuration for your API. One unit of GPU corresponds to one virtual GPU. Fractional requests are not allowed.
 
 ## Tips
diff --git a/docs/deployments/inferentia.md b/docs/aws/inferentia.md
similarity index 95%
rename from docs/deployments/inferentia.md
rename to docs/aws/inferentia.md
index 34390f3a57..cfd56efaa8 100644
--- a/docs/deployments/inferentia.md
+++ b/docs/aws/inferentia.md
@@ -66,11 +66,7 @@ model_neuron.save(compiled_model)
 
 The versions of `tensorflow-neuron` and `torch-neuron` that are used by Cortex are found in the [Realtime API pre-installed packages list](realtime-api/predictors.md#inferentia-equipped-apis) and [Batch API pre-installed packages list](batch-api/predictors.md#inferentia-equipped-apis). When installing these packages with `pip` to compile models of your own, use the extra index URL `--extra-index-url=https://pip.repos.neuron.amazonaws.com`.
 
-A list of model compilation examples for Inferentia can be found on the [`aws/aws-neuron-sdk`](https://github.com/aws/aws-neuron-sdk) repo for [TensorFlow](https://github.com/aws/aws-neuron-sdk/blob/master/docs/tensorflow-neuron/) and for [PyTorch](https://github.com/aws/aws-neuron-sdk/blob/master/docs/pytorch-neuron/README.md). Here are 2 examples implemented with Cortex:
-
-1. [ResNet50 in TensorFlow](https://github.com/cortexlabs/cortex/tree/master/examples/tensorflow/image-classifier-resnet50)
-1. [ResNet50 in PyTorch](https://github.com/cortexlabs/cortex/tree/master/examples/pytorch/image-classifier-resnet50)
+A list of model compilation examples for Inferentia can be found on the [`aws/aws-neuron-sdk`](https://github.com/aws/aws-neuron-sdk) repo for [TensorFlow](https://github.com/aws/aws-neuron-sdk/blob/master/docs/tensorflow-neuron/) and for [PyTorch](https://github.com/aws/aws-neuron-sdk/blob/master/docs/pytorch-neuron/README.md).
 
 ### Improving performance
diff --git a/docs/aws/install.md b/docs/aws/install.md
index 67bda7563b..bae9c28e4e 100644
--- a/docs/aws/install.md
+++ b/docs/aws/install.md
@@ -19,9 +19,6 @@ cortex cluster up # or: cortex cluster up --config cluster.yaml (see configurat
 cortex env default aws
 ```
 
-
-Try the [tutorial](../../examples/pytorch/text-generator/README.md) or deploy one of our [examples](https://github.com/cortexlabs/cortex/tree/master/examples).
-
 ## Configure Cortex
 
@@ -65,7 +62,7 @@ nat_gateway: none
 api_load_balancer_scheme: internet-facing
 
 # operator load balancer scheme [internet-facing | internal]
-# note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator (https://docs.cortex.dev/v/master/aws/vpc-peering)
+# note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator
 operator_load_balancer_scheme: internet-facing
 
 # API Gateway [public (API Gateway will be used by default, can be disabled per API) | none (API Gateway will be disabled for all APIs)]
 
@@ -103,8 +100,6 @@ image_istio_proxy: quay.io/cortexlabs/istio-proxy:master
 image_istio_pilot: quay.io/cortexlabs/istio-pilot:master
 ```
 
-The default docker images used for your Predictors are listed in the instructions for [system packages](../deployments/system-packages.md), and can be overridden in your [Realtime API configuration](../deployments/realtime-api/api-configuration.md) and in your [Batch API configuration](../deployments/batch-api/api-configuration.md).
-
 ## Advanced
 
 * [Security](security.md)
diff --git a/docs/aws/networking.md b/docs/aws/networking.md
index eddddb4a31..824abe0c1f 100644
--- a/docs/aws/networking.md
+++ b/docs/aws/networking.md
@@ -4,7 +4,7 @@ _WARNING: you are on the master branch, please refer to the docs on the branch t
 
 ![api architecture diagram](https://user-images.githubusercontent.com/808475/84695323-8507dd00-aeff-11ea-8b32-5a55cef76c79.png)
 
-APIs are deployed with a public API Gateway by default (the API Gateway forwards requests to the API load balancer). Each API can be independently configured to not create the API Gateway endpoint by setting `api_gateway: none` in the `networking` field of the [Realtime API configuration](../deployments/realtime-api/api-configuration.md) and [Batch API configuration](../deployments/batch-api/api-configuration.md). If the API Gateway endpoint is not created, your API can still be accessed via the API load balancer; `cortex get API_NAME` will show the load balancer endpoint if API Gateway is disabled. API Gateway is enabled by default, and is generally recommended unless it doesn't support your use case due to limitations such as the 29 second request timeout, or if you are keeping your APIs private to your VPC. See below for common configurations. To disable API Gateway cluster-wide (thereby enforcing that all APIs cannot create API Gateway endpoints), set `api_gateway: none` in your [cluster configuration](install.md) file (before creating your cluster).
+APIs are deployed with a public API Gateway by default (the API Gateway forwards requests to the API load balancer). Each API can be independently configured to not create the API Gateway endpoint by setting `api_gateway: none` in the `networking` field of the [Realtime API configuration](../workloads/realtime/configuration.md) and [Batch API configuration](../workloads/batch/configuration.md). If the API Gateway endpoint is not created, your API can still be accessed via the API load balancer; `cortex get API_NAME` will show the load balancer endpoint if API Gateway is disabled. API Gateway is enabled by default, and is generally recommended unless it doesn't support your use case due to limitations such as the 29 second request timeout, or if you are keeping your APIs private to your VPC. See below for common configurations. To disable API Gateway cluster-wide (thereby enforcing that all APIs cannot create API Gateway endpoints), set `api_gateway: none` in your [cluster configuration](install.md) file (before creating your cluster).
 
 By default, the API load balancer is public. You can configure your API load balancer to be private by setting `api_load_balancer_scheme: internal` in your [cluster configuration](install.md) file (before creating your cluster). This will force external traffic to go through your API Gateway endpoint, or if you disabled API Gateway for your API, it will make your API only accessible through VPC Peering. Note that if API Gateway is used, endpoints will be public regardless of `api_load_balancer_scheme`. See below for common configurations.
diff --git a/docs/aws/rest-api-gateway.md b/docs/aws/rest-api-gateway.md
index cc5a5ced37..0144df386d 100644
--- a/docs/aws/rest-api-gateway.md
+++ b/docs/aws/rest-api-gateway.md
@@ -17,7 +17,7 @@ If your API load balancer is internal (i.e. you set `api_load_balancer_scheme: i
 
 Disable the default API Gateway:
 
 * If you haven't created your cluster yet, you can set `api_gateway: none` in your [cluster configuration file](install.md) before creating your cluster.
-* If you have already created your cluster, you can set `api_gateway: none` in the `networking` field of your [Realtime API configuration](../deployments/realtime-api/api-configuration.md) and/or [Batch API configuration](../deployments/batch-api/api-configuration.md), and then re-deploy your API.
+* If you have already created your cluster, you can set `api_gateway: none` in the `networking` field of your [Realtime API configuration](../workloads/realtime/configuration.md) and/or [Batch API configuration](../workloads/batch/configuration.md), and then re-deploy your API. ### Step 2 @@ -96,7 +96,7 @@ Delete the API Gateway before spinning down your Cortex cluster: Disable the default API Gateway: * If you haven't created your cluster yet, you can set `api_gateway: none` in your [cluster configuration file](install.md) before creating your cluster. -* If you have already created your cluster, you can set `api_gateway: none` in the `networking` field of your [Realtime API configuration](../deployments/realtime-api/api-configuration.md) and/or [Batch API configuration](../deployments/batch-api/api-configuration.md), and then re-deploy your API. +* If you have already created your cluster, you can set `api_gateway: none` in the `networking` field of your [Realtime API configuration](../workloads/realtime/configuration.md) and/or [Batch API configuration](../workloads/batch/configuration.md), and then re-deploy your API. ### Step 2 diff --git a/docs/cloud/install.md b/docs/cloud/install.md deleted file mode 100644 index c210d7c2e4..0000000000 --- a/docs/cloud/install.md +++ /dev/null @@ -1,15 +0,0 @@ -# Install - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -## AWS - -To spin up Cortex using AWS as the cloud provider, follow [these instructions](../aws/install.md). - -## GCP - -To spin up Cortex using GCP as the cloud provider, follow [these instructions](../gcp/install.md). - -## Local - -If you'll only be using Cortex locally, install it with `pip install cortex`. 
diff --git a/docs/cloud/uninstall.md b/docs/cloud/uninstall.md deleted file mode 100644 index a162f34dbd..0000000000 --- a/docs/cloud/uninstall.md +++ /dev/null @@ -1,15 +0,0 @@ -# Uninstall - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -## AWS - -To spin down a Cortex cluster on AWS, follow [these instructions](../aws/uninstall.md). - -## GCP - -To spin down a Cortex cluster on GCP, follow [these instructions](../gcp/uninstall.md). - -## Local - -To uninstall the Cortex CLI, run `pip uninstall cortex`. diff --git a/docs/cloud/update.md b/docs/cloud/update.md deleted file mode 100644 index 1cf87cc8da..0000000000 --- a/docs/cloud/update.md +++ /dev/null @@ -1,11 +0,0 @@ -# Update - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -## AWS - -To update the configuration of a running Cortex cluster on AWS, follow [these instructions](../aws/update.md). - -## GCP - -It is currently not possible to update a Cortex cluster running on GCP. diff --git a/docs/contact.md b/docs/contact.md deleted file mode 100644 index 70a9748f34..0000000000 --- a/docs/contact.md +++ /dev/null @@ -1,19 +0,0 @@ -# Contact us - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -## Support - -[GitHub](https://github.com/cortexlabs/cortex/issues) - Submit feature requests, file bugs, and track issues. - -[Gitter](https://gitter.im/cortexlabs/cortex) - Chat with us in our community channel. - -[Email](mailto:hello@cortex.dev) - Email us at `hello@cortex.dev` to contact us privately. - -## Contributing - -Find instructions for how to set up your development environment in the [development guide](contributing/development.md). - -## We're hiring - -Interested in joining us? See our [job postings](https://angel.co/company/cortex-labs-inc/jobs). 
diff --git a/docs/deployments/batch-api.md b/docs/deployments/batch-api.md deleted file mode 100644 index a9b368be15..0000000000 --- a/docs/deployments/batch-api.md +++ /dev/null @@ -1,43 +0,0 @@ -# Batch API Overview - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -You can deploy your model as a Batch API to create a web service that can receive job requests and orchestrate offline batch inference on large datasets across multiple workers. - -## When should I use a Batch API - -You may want to deploy your model as a Batch API if any of the following scenarios apply to your use case: - -* inference will run on a large dataset and can be distributed across multiple workers -* job progress and status needs to be monitored -* inference is a part of internal data pipelines that may be chained together -* a small number of requests are received, but each request takes minutes or hours to complete - -You may want to consider deploying your model as a [Realtime API](realtime-api.md) if these scenarios don't apply to you. - -A Batch API deployed in Cortex will create/support the following: - -* a REST web service to receive job requests, manage running jobs, and retrieve job statuses -* an autoscaling worker pool that can scale to 0 -* log aggregation and streaming -* `on_job_complete` hook to for aggregation or triggering webhooks - -## How does it work - -You specify the following: - -* a Cortex Predictor class in Python that defines how to initialize your model run batch inference -* an API configuration YAML file that defines how your API will behave in production (parallelism, networking, compute, etc.) - -Once you've implemented your predictor and defined your API configuration, you can use the Cortex CLI to deploy a Batch API. The Cortex CLI will package your predictor implementation and the rest of the code and dependencies and upload it to the Cortex Cluster. 
The Cortex Cluster will setup an endpoint to a web service that can receive job submission requests and manage jobs. - -A job submission typically consists of an input dataset or the location of your input dataset, the number of workers for your job, and the batch size. When a job is submitted to your Batch API endpoint, you will immediately receive a Job ID that you can use to get the job's status and logs, and stop the job if necessary. Behind the scenes, your Batch API will break down the dataset into batches and push them onto a queue. Once all of the batches have been enqueued, the Cortex Cluster will spin up the requested number of workers and initialize them with your predictor implementation. Each worker will take one batch at a time from the queue and run your Predictor implementation. After all batches have been processed, the `on_job_complete` hook in your predictor implementation (if provided) will be executed by one of the workers. - -At any point, you can use the Job ID that was provided upon job submission to make requests to the Batch API endpoint to get job status, progress metrics, and worker statuses. Logs for each job are aggregated and are accessible via the Cortex CLI or in your AWS console. - -## Next steps - -* Try the [tutorial](../../examples/batch/image-classifier/README.md) to deploy a Batch API on your Cortex cluster. -* See our [exporting guide](../guides/exporting.md) for how to export your model to use in a Batch API. -* See the [Predictor docs](batch-api/predictors.md) for how to implement a Predictor class. -* See the [API configuration docs](batch-api/api-configuration.md) for a full list of features that can be used to deploy your Batch API. 
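The job submission flow described above (dataset + number of workers + batch size in, Job ID out) can be sketched in Python. This is an illustrative sketch only — the field names (`workers`, `item_list`, `batch_size`) and the S3 paths are assumptions, so consult the Batch API endpoint documentation for the exact request schema:

```python
import json

def make_job_payload(items, workers=2, batch_size=10):
    # Illustrative request body for submitting a batch job; the real field
    # names are defined by the Batch API endpoint documentation.
    return {
        "workers": workers,              # number of workers to spin up
        "item_list": {
            "items": items,              # the dataset (or dataset locations)
            "batch_size": batch_size,    # items per batch pushed onto the queue
        },
    }

payload = make_job_payload(
    ["s3://my-bucket/images/001.jpg", "s3://my-bucket/images/002.jpg"],
    workers=2,
    batch_size=1,
)
# this JSON body would be POSTed to the Batch API endpoint; the response
# contains the Job ID used for status, logs, and stopping the job
body = json.dumps(payload)
```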
diff --git a/docs/deployments/batch-api/deployment.md b/docs/deployments/batch-api/deployment.md deleted file mode 100644 index 81168fac42..0000000000 --- a/docs/deployments/batch-api/deployment.md +++ /dev/null @@ -1,127 +0,0 @@ -# Batch API deployment - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -Once your model is [exported](../../guides/exporting.md), you've implemented a [Predictor](predictors.md), and you've [configured your API](api-configuration.md), you're ready to deploy a Batch API. - -## `cortex deploy` - -The `cortex deploy` command collects your configuration and source code and deploys your API on your cluster: - -```bash -$ cortex deploy - -created image-classifier (BatchAPI) -``` - -APIs are declarative, so to update your API, you can modify your source code and/or configuration and run `cortex deploy` again. - -After deploying a Batch API you can use `cortex get ` to display the Batch API endpoint, which you can use to make the following requests: - -1. Submit a batch job -1. Get the status of a job -1. Stop a job - -You can find documentation for the Batch API endpoint [here](endpoints.md). - -## `cortex get` - -The `cortex get` command displays the status of all of your API: - -```bash -$ cortex get - -env batch api running jobs latest job id last update -aws image-classifier 1 69d9c0013c2d0d97 (submitted 30s ago) 46s -``` - -## `cortex get ` - -`cortex get ` shows additional information about a specific Batch API and lists a summary of all currently running / recently submitted jobs. 
- -```bash -$ cortex get image-classifier - -job id status progress failed start time duration -69d9c0013c2d0d97 running 1/24 0 29 Jul 2020 14:38:01 UTC 30s -69da5b1f8cd3b2d3 completed with failures 15/16 1 29 Jul 2020 13:38:01 UTC 5m20s -69da5bc32feb6aa0 succeeded 40/40 0 29 Jul 2020 12:38:01 UTC 10m21s -69da5bd5b2f87258 succeeded 34/34 0 29 Jul 2020 11:38:01 UTC 8m54s - -endpoint: http://***.amazonaws.com/image-classifier -... -``` - -Appending the `--watch` flag will re-run the `cortex get` command every 2 seconds. - -## Job commands - -Once a job has been submitted to your Batch API (see [here](endpoints.md#submit-a-job)), you can use the Job ID from job submission response to get the status, stream logs, and stop a running job using the CLI. - -### `cortex get ` - -After a submitting a job, you can use the `cortex get ` command to show information about the job: - -```bash -$ cortex get image-classifier 69d9c0013c2d0d97 - -job id: 69d9c0013c2d0d97 -status: running - -start time: 29 Jul 2020 14:38:01 UTC -end time: - -duration: 32s - -batch stats -total succeeded failed avg time per batch -24 1 0 20s - -worker stats -requested running failed succeeded -2 2 0 0 - -job endpoint: https://***..amazonaws.com/image-classifier/69d9c0013c2d0d97 -``` - -### `cortex logs ` - -You can use `cortex logs ` to stream logs from a job: - -```bash -$ cortex logs image-classifier 69d9c0013c2d0d97 - -started enqueuing batches -partitioning 240 items found in job submission into 24 batches of size 10 -completed enqueuing a total of 24 batches -spinning up workers... -2020-07-30 16:50:30.147522:cortex:pid-1:INFO:downloading the project code -2020-07-30 16:50:30.268987:cortex:pid-1:INFO:downloading the python serving image -.... 
-``` - -### `cortex delete ` - -You can use `cortex delete ` to stop a running job: - -```bash -$ cortex delete image-classifier 69d9c0013c2d0d97 - -stopped job 69d96a01ea55da8c -``` - -## `cortex delete` - -Use the `cortex delete` command to delete your API: - -```bash -$ cortex delete my-api - -deleting my-api -``` - -## Additional resources - - -* [Tutorial](../../../examples/batch/image-classifier/README.md) provides a step-by-step walkthrough of deploying an image classification batch API -* [CLI documentation](../../miscellaneous/cli.md) lists all CLI commands -* [Examples](https://github.com/cortexlabs/cortex/tree/master/examples/batch) demonstrate how to deploy models from common ML libraries diff --git a/docs/deployments/compute.md b/docs/deployments/compute.md deleted file mode 100644 index 7937ca00ab..0000000000 --- a/docs/deployments/compute.md +++ /dev/null @@ -1,36 +0,0 @@ -# Compute - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -Compute resource requests in Cortex follow the syntax and meaning of [compute resources in Kubernetes](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container). - -For example: - -```yaml -- name: my-api - ... - compute: - cpu: 1 - gpu: 1 - mem: 1G -``` - -CPU, GPU, Inf, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1 GPU, and 1G of memory are available on any instance, and it will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be (or may default to) `Null`. - -## CPU - -One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent). - -## GPU - -One unit of GPU corresponds to one virtual GPU. Fractional requests are not allowed. 
- -See [GPU documentation](gpus.md) for more information. - -## Memory - -One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`. - -## Inf - -One unit of Inf corresponds to one Inferentia ASIC with 4 NeuronCores *(not the same thing as `cpu`)* and 8GB of cache memory *(not the same thing as `mem`)*. Fractional requests are not allowed. diff --git a/docs/deployments/realtime-api.md b/docs/deployments/realtime-api.md deleted file mode 100644 index f90110690f..0000000000 --- a/docs/deployments/realtime-api.md +++ /dev/null @@ -1,46 +0,0 @@ -# Realtime API Overview - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -You can deploy a Realtime API on Cortex to serve your model via an HTTP endpoint for on-demand inferences. - -## When should I use a Realtime API - -You may want to deploy your model as a Realtime API if any of the following scenarios apply to your use case: - -* predictions are served on demand -* predictions need to be made in the time of a single web request -* predictions need to be made on an individual basis -* predictions are served directly to consumers - -You may want to consider deploying your model as a [Batch API](batch-api.md) if these scenarios don't apply to you. - -A Realtime API deployed in Cortex has the following features: - -* request-based autoscaling -* rolling updates to enable you to update the model/serving code without downtime -* realtime metrics collection -* log streaming -* multi-model serving -* server-side batching -* traffic splitting (e.g. 
for A/B testing) - -## How does it work - -You specify the following: - -* a Cortex Predictor class in Python that defines how to initialize and serve your model -* an API configuration YAML file that defines how your API will behave in production (autoscaling, monitoring, networking, compute, etc.) - -Once you've implemented your predictor and defined your API configuration, you can use the Cortex CLI to deploy a Realtime API. The Cortex CLI will package your predictor implementation and the rest of the code and dependencies and upload it to the Cortex Cluster. The Cortex Cluster will set up an HTTP endpoint that routes traffic to multiple replicas/copies of web servers initialized with your code. - -When a request is made to the HTTP endpoint, it gets routed to one your API's replicas (at random). The replica receives the request, parses the payload and executes the inference code you've defined in your predictor implementation and sends a response. - -The Cortex Cluster will automatically scale based on the incoming traffic and the autoscaling configuration you've defined. You can safely update your model or your code and use the Cortex CLI to deploy without experiencing downtime because updates to your API will be rolled out automatically. Request metrics and logs will automatically be aggregated and be accessible via the Cortex CLI or on your AWS console. - -## Next steps - -* Try the [tutorial](../../examples/pytorch/text-generator/README.md) to deploy a Realtime API locally or on AWS. -* See our [exporting guide](../guides/exporting.md) for how to export your model to use in a Realtime API. -* See the [Predictor docs](realtime-api/predictors.md) for how to implement a Predictor class. -* See the [API configuration docs](realtime-api/api-configuration.md) for a full list of features that can be used to deploy your Realtime API. 
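The Predictor interface described above (one-time initialization in `__init__`, per-request inference in `predict`) can be sketched minimally. This is a hedged toy example — the `prefix` config key and the string "model" stand in for real model loading, and the exact signatures are defined in the Predictor docs:

```python
class PythonPredictor:
    def __init__(self, config):
        # one-time initialization per replica; a real predictor would load
        # its model here (the "prefix" config key is a stand-in for that)
        self.prefix = config.get("prefix", "generated")

    def predict(self, payload):
        # called once per request with the parsed payload
        return f"{self.prefix}: {payload['text']}"

predictor = PythonPredictor({"prefix": "generated"})
result = predictor.predict({"text": "hello"})
```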
diff --git a/docs/deployments/realtime-api/deployment.md b/docs/deployments/realtime-api/deployment.md deleted file mode 100644 index a7c0a09a4c..0000000000 --- a/docs/deployments/realtime-api/deployment.md +++ /dev/null @@ -1,68 +0,0 @@ -# API deployment - -_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ - -Once your model is [exported](../../guides/exporting.md), you've implemented a [Predictor](predictors.md), and you've [configured your API](api-configuration.md), you're ready to deploy! - -## `cortex deploy` - -The `cortex deploy` command collects your configuration and source code and deploys your API on your cluster: - -```bash -$ cortex deploy - -creating my-api (RealtimeAPI) -``` - -APIs are declarative, so to update your API, you can modify your source code and/or configuration and run `cortex deploy` again. - -## `cortex get` - -The `cortex get` command displays the status of your APIs, and `cortex get ` shows additional information about a specific API. - -```bash -$ cortex get my-api - -status up-to-date requested last update avg request 2XX -live 1 1 1m - - - -endpoint: http://***.amazonaws.com/text-generator -... -``` - -Appending the `--watch` flag will re-run the `cortex get` command every 2 seconds. 
- -## `cortex logs` - -You can stream logs from your API using the `cortex logs` command: - -```bash -$ cortex logs my-api -``` - -## Making a prediction - -You can use `curl` to test your prediction service, for example: - -```bash -$ curl http://***.amazonaws.com/my-api \ - -X POST -H "Content-Type: application/json" \ - -d '{"key": "value"}' -``` - -## `cortex delete` - -Use the `cortex delete` command to delete your API: - -```bash -$ cortex delete my-api - -deleting my-api -``` - -## Additional resources - - -* [Tutorial](../../../examples/pytorch/text-generator/README.md) provides a step-by-step walkthrough of deploying a text generation API -* [CLI documentation](../../miscellaneous/cli.md) lists all CLI commands -* [Examples](https://github.com/cortexlabs/cortex/tree/master/examples) demonstrate how to deploy models from common ML libraries diff --git a/docs/gcp/install.md b/docs/gcp/install.md index e65520c4b4..263267fb78 100644 --- a/docs/gcp/install.md +++ b/docs/gcp/install.md @@ -19,12 +19,8 @@ cortex cluster-gcp up # or: cortex cluster-gcp up --config cluster.yaml (see co cortex env default gcp ``` - -Try the [tutorial](../../examples/pytorch/text-generator/README.md). - ## Configure Cortex - ```yaml # cluster.yaml @@ -62,5 +58,3 @@ image_istio_proxy: quay.io/cortexlabs/istio-proxy:master image_istio_pilot: quay.io/cortexlabs/istio-pilot:master image_pause: quay.io/cortexlabs/pause:master ``` - -The default docker images used for your Predictors are listed in the instructions for [system packages](../deployments/system-packages.md), and can be overridden in your [Realtime API configuration](../deployments/realtime-api/api-configuration.md). 
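The `curl` prediction request shown in the deployment docs above can be mirrored with Python's standard library. A minimal sketch — the endpoint URL is a placeholder (use the one printed by `cortex get`), and the payload shape depends entirely on your predictor:

```python
import json
from urllib import request

API_ENDPOINT = "http://localhost:8888/my-api"  # placeholder; use the endpoint from `cortex get`

def build_request(endpoint, payload):
    # equivalent of: curl <endpoint> -X POST -H "Content-Type: application/json" -d '<payload>'
    return request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(endpoint, payload):
    # send the request and parse the JSON response
    with request.urlopen(build_request(endpoint, payload)) as resp:
        return json.loads(resp.read())

req = build_request(API_ENDPOINT, {"key": "value"})
```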
diff --git a/docs/contributing/development.md b/docs/guides/contributing.md similarity index 98% rename from docs/contributing/development.md rename to docs/guides/contributing.md index 1d480675a6..6eb6d76d88 100644 --- a/docs/contributing/development.md +++ b/docs/guides/contributing.md @@ -1,4 +1,6 @@ -# Development +# Contributing + +_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ ## Remote development diff --git a/docs/guides/docker-hub-rate-limiting.md b/docs/guides/docker-hub-rate-limiting.md index 98d9ff3570..378919686d 100644 --- a/docs/guides/docker-hub-rate-limiting.md +++ b/docs/guides/docker-hub-rate-limiting.md @@ -64,7 +64,7 @@ Once you've updated your cluster configuration file, you can spin up your cluste ### Update your API configuration file(s) -To configure your APIs to use the Quay images, you cna update your [API configuration files](../deployments/realtime-api/api-configuration.md). The image paths are specified in `predictor.image` (and `predictor.tensorflow_serving_image` for APIs with `kind: tensorflow`). Be advised that by default, the Docker Hub images are used for your predictors, so you will need to specify the Quay image paths for all of your APIs. +To configure your APIs to use the Quay images, you can update your [API configuration files](../workloads/realtime/configuration.md). The image paths are specified in `predictor.image` (and `predictor.tensorflow_serving_image` for APIs with `kind: tensorflow`). Be advised that by default, the Docker Hub images are used for your predictors, so you will need to specify the Quay image paths for all of your APIs. 
Here is a list of available images (make sure to set `` to your cluster's version): diff --git a/docs/guides/exporting.md b/docs/guides/exporting.md index 05823382e9..b34e6c5b82 100644 --- a/docs/guides/exporting.md +++ b/docs/guides/exporting.md @@ -10,10 +10,7 @@ Here are examples for some common ML libraries: ### `torch.save()` -The recommended approach is export your PyTorch model with [torch.save()](https://pytorch.org/docs/stable/torch.html?highlight=save#torch.save). Here is PyTorch's documentation on [saving and loading models](https://pytorch.org/tutorials/beginner/saving_loading_models.html). - - -[examples/pytorch/iris-classifier](https://github.com/cortexlabs/cortex/blob/master/examples/pytorch/iris-classifier) exports its trained model like this: +The recommended approach is export your PyTorch model with [torch.save()](https://pytorch.org/docs/stable/torch.html?highlight=save#torch.save). Here is PyTorch's documentation on [saving and loading models](https://pytorch.org/tutorials/beginner/saving_loading_models.html). For example: ```python torch.save(model.state_dict(), "weights.pth") @@ -23,10 +20,7 @@ For Inferentia-equipped instances, check the [Inferentia instructions](inferenti ### ONNX -It may also be possible to export your PyTorch model into the ONNX format using [torch.onnx.export()](https://pytorch.org/docs/stable/onnx.html#torch.onnx.export). - - -For example, if [examples/pytorch/iris-classifier](https://github.com/cortexlabs/cortex/blob/master/examples/pytorch/iris-classifier) were to export the model to ONNX, it would look like this: +It may also be possible to export your PyTorch model into the ONNX format using [torch.onnx.export()](https://pytorch.org/docs/stable/onnx.html#torch.onnx.export). For example: ```python placeholder = torch.randn(1, 4) @@ -63,8 +57,7 @@ A TensorFlow `SavedModel` directory should have this structure: └── variables.data-00002-of-... ``` - -Most of the TensorFlow examples use this approach. 
Here is the relevant code from [examples/tensorflow/sentiment-analyzer](https://github.com/cortexlabs/cortex/blob/master/examples/tensorflow/sentiment-analyzer): +For example: ```python import tensorflow as tf @@ -101,24 +94,15 @@ zip -r bert.zip 1568244606 aws s3 cp bert.zip s3://my-bucket/bert.zip ``` - -[examples/tensorflow/iris-classifier](https://github.com/cortexlabs/cortex/blob/master/examples/tensorflow/iris-classifier) also use the `SavedModel` approach, and includes a Python notebook demonstrating how it was exported. - ### Other model formats There are other ways to export Keras or TensorFlow models, and as long as they can be loaded and used to make predictions in Python, they will be supported by Cortex. - -For example, the `crnn` API in [examples/tensorflow/license-plate-reader](https://github.com/cortexlabs/cortex/blob/master/examples/tensorflow/license-plate-reader) uses this approach. - ## Scikit-learn ### `pickle` -Scikit-learn models are typically exported using `pickle`. Here is [Scikit-learn's documentation](https://scikit-learn.org/stable/modules/model_persistence.html). - - -[examples/sklearn/iris-classifier](https://github.com/cortexlabs/cortex/blob/master/examples/sklearn/iris-classifier) uses this approach. Here is the relevant code: +Scikit-learn models are typically exported using `pickle`. Here is [Scikit-learn's documentation](https://scikit-learn.org/stable/modules/model_persistence.html). For example: ```python pickle.dump(model, open("model.pkl", "wb")) @@ -126,7 +110,7 @@ pickle.dump(model, open("model.pkl", "wb")) ### ONNX -It is also possible to export a scikit-learn model to the ONNX format using [onnxmltools](https://github.com/onnx/onnxmltools). Here is an example: +It is also possible to export a scikit-learn model to the ONNX format using [onnxmltools](https://github.com/onnx/onnxmltools). 
For example: ```python from sklearn.linear_model import LogisticRegression @@ -168,10 +152,7 @@ model.save_model("model.bin") ### ONNX -It is also possible to export an XGBoost model to the ONNX format using [onnxmltools](https://github.com/onnx/onnxmltools). - - -[examples/onnx/iris-classifier](https://github.com/cortexlabs/cortex/blob/master/examples/onnx/iris-classifier) uses this approach. Here is the relevant code: +It is also possible to export an XGBoost model to the ONNX format using [onnxmltools](https://github.com/onnx/onnxmltools). For example: ```python from onnxmltools.convert import convert_xgboost diff --git a/docs/guides/multi-model.md b/docs/guides/multi-model.md index 79ed0507ab..7ea950b1f7 100644 --- a/docs/guides/multi-model.md +++ b/docs/guides/multi-model.md @@ -9,9 +9,6 @@ It is possible to serve multiple models in the same Cortex API using any type of ### Specifying models in API config - -The following template is based on the [live-reloading/python/mpg-estimator](https://github.com/cortexlabs/cortex/tree/master/examples/live-reloading/python/mpg-estimator) example. - #### `cortex.yaml` Even though it looks as if there's only a single model served, there are actually 4 different versions saved in `s3://cortex-examples/sklearn/mpg-estimator/linreg/`. @@ -79,9 +76,6 @@ $ curl "${api_endpoint}?version=2" -X POST -H "Content-Type: application/json" - For the Python Predictor, the API configuration for a multi-model API is similar to single-model APIs. The Predictor's `config` field can be used to customize the behavior of the `predictor.py` implementation. - -The following template is based on the [pytorch/multi-model-text-analyzer](https://github.com/cortexlabs/cortex/tree/master/examples/pytorch/multi-model-text-analyzer) example. 
- #### `cortex.yaml` ```yaml @@ -157,9 +151,6 @@ Machine learning is the study of algorithms and statistical models that computer For the TensorFlow Predictor, a multi-model API is configured by placing the list of models in the Predictor's `models` field (each model will specify its own unique name). The `predict()` method of the `tensorflow_client` object expects a second argument that represents the name of the model that will be used for inference. - -The following template is based on the [tensorflow/multi-model-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/tensorflow/multi-model-classifier) example. - ### `cortex.yaml` ```yaml @@ -241,9 +232,6 @@ $ curl "${ENDPOINT}?model=inception" -X POST -H "Content-Type: application/json" For the ONNX Predictor, a multi-model API is configured by placing the list of models in the Predictor's `models` field (each model will specify its own unique name). The `predict()` method of the `onnx_client` object expects a second argument that represents the name of the model that will be used for inference. - -The following template is based on the [onnx/multi-model-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/onnx/multi-model-classifier) example. - ### `cortex.yaml` ```yaml diff --git a/docs/guides/production.md b/docs/guides/production.md index bd2f259826..2f3eb6c0f0 100644 --- a/docs/guides/production.md +++ b/docs/guides/production.md @@ -10,24 +10,24 @@ _WARNING: you are on the master branch, please refer to the docs on the branch t **Additional tips for realtime APIs** -* Consider tuning `processes_per_replica` and `threads_per_process` in your [Realtime API configuration](../deployments/realtime-api/api-configuration.md). Each model behaves differently, so the best way to find a good value is to run a load test on a single replica (you can set `min_replicas` to 1 to avoid autocaling). 
Here is [additional information](../deployments/realtime-api/parallelism.md#concurrency) about these fields.
+* Consider tuning `processes_per_replica` and `threads_per_process` in your [Realtime API configuration](../workloads/realtime/configuration.md). Each model behaves differently, so the best way to find a good value is to run a load test on a single replica (you can set `min_replicas` to 1 to avoid autoscaling). Here is [additional information](../workloads/realtime/parallelism.md#concurrency) about these fields.

-* You may wish to customize the autoscaler for your APIs. The [autoscaling documentation](../deployments/realtime-api/autoscaling.md) describes each of the parameters that can be configured.
+* You may wish to customize the autoscaler for your APIs. The [autoscaling documentation](../workloads/realtime/autoscaling.md) describes each of the parameters that can be configured.

* When creating an API that you will send large amounts of traffic to all at once, set `min_replicas` at (or slightly above) the number of replicas you expect will be necessary to handle the load at steady state. After traffic has been fully shifted to your API, `min_replicas` can be reduced to allow automatic downscaling.

-* [Traffic splitters](./deployments/realtime-api/traffic-splitter.md) can be used to route a subset of traffic to an updated API. For example, you can create a traffic splitter named `my-api`, and route requests to `my-api` to any number of Realtime APIs (e.g. `my-api_v1`, `my-api_v2`, etc). The percentage of traffic that the traffic splitter routes to each API can be updated on the fly.
+* [Traffic splitters](../workloads/realtime/traffic-splitter.md) can be used to route a subset of traffic to an updated API. For example, you can create a traffic splitter named `my-api`, and route requests to `my-api` to any number of Realtime APIs (e.g. `my-api_v1`, `my-api_v2`, etc). 
The percentage of traffic that the traffic splitter routes to each API can be updated on the fly. -* If initialization of your API replicas takes a while (e.g. due to downloading large models from slow hosts or installing dependencies), and responsive autoscaling is important to you, consider pre-building your API's Docker image. See [here](../deployments/system-packages.md#custom-docker-image) for instructions. +* If initialization of your API replicas takes a while (e.g. due to downloading large models from slow hosts or installing dependencies), and responsive autoscaling is important to you, consider pre-building your API's Docker image. See [here](../workloads/system-packages.md#custom-docker-image) for instructions. -* If your API is receiving many queries per second and you are using the TensorFlow Predictor, consider enabling [server-side batching](../deployments/realtime-api/parallelism.md#server-side-batching). +* If your API is receiving many queries per second and you are using the TensorFlow Predictor, consider enabling [server-side batching](../workloads/realtime/parallelism.md#server-side-batching). -* [Overprovisioning](../deployments/realtime-api/autoscaling.md#overprovisioning) can be used to reduce the chance of large queues building up. This can be especially important when inferences take a long time. +* [Overprovisioning](../workloads/realtime/autoscaling.md#overprovisioning) can be used to reduce the chance of large queues building up. This can be especially important when inferences take a long time. **Additional tips for inferences that take a long time:** -* Consider using [GPUs](../deployments/gpus.md) or [Inferentia](../deployments/inferentia.md) to speed up inference. +* Consider using [GPUs](../aws/gpu.md) or [Inferentia](../aws/inferentia.md) to speed up inference. 
 
-* Consider setting a low value for `max_replica_concurrency`, since if there are many requests in the queue, it will take a long time until newly received requests are processed. See [autoscaling docs](../deployments/realtime-api/autoscaling.md) for more details.
+* Consider setting a low value for `max_replica_concurrency`, since if there are many requests in the queue, it will take a long time until newly received requests are processed. See [autoscaling docs](../workloads/realtime/autoscaling.md) for more details.
 
-* Keep in mind that API Gateway has a 29 second timeout; if your requests take longer (due to a long inference time and/or long request queues), you will need to disable API Gateway for your API by setting `api_gateway: none` in the `networking` config in your [Realtime API configuration](../deployments/realtime-api/api-configuration.md) and/or [Batch API configuration](../deployments/batch-api/api-configuration.md). Alternatively, you can disable API gateway for all APIs in your cluster by setting `api_gateway: none` in your [cluster configuration file](../aws/install.md) before creating your cluster.
+* Keep in mind that API Gateway has a 29 second timeout; if your requests take longer (due to a long inference time and/or long request queues), you will need to disable API Gateway for your API by setting `api_gateway: none` in the `networking` config in your [Realtime API configuration](../workloads/realtime/configuration.md) and/or [Batch API configuration](../workloads/batch/configuration.md). Alternatively, you can disable API gateway for all APIs in your cluster by setting `api_gateway: none` in your [cluster configuration file](../aws/install.md) before creating your cluster.
diff --git a/docs/guides/self-hosted-images.md b/docs/guides/self-hosted-images.md
index 61f298eaf0..916dd4a2ca 100644
--- a/docs/guides/self-hosted-images.md
+++ b/docs/guides/self-hosted-images.md
@@ -131,7 +131,7 @@ echo "-----------------------------------------------"
 
 The first list of images that were printed (the cluster images) can be directly copy-pasted in your [cluster configuration file](../aws/install.md) before spinning up your cluster.
 
-The second list of images that were printed (the API images) can be used in your [API configuration files](../deployments/realtime-api/api-configuration.md). The image paths are specified in `predictor.image` (and `predictor.tensorflow_serving_image` for APIs with `kind: tensorflow`). Be advised that by default, the public images offered by Cortex are used for your predictors, so you will need to specify your ECR image paths for all of your APIs.
+The second list of images that were printed (the API images) can be used in your [API configuration files](../workloads/realtime/configuration.md). The image paths are specified in `predictor.image` (and `predictor.tensorflow_serving_image` for APIs with `kind: tensorflow`). Be advised that by default, the public images offered by Cortex are used for your predictors, so you will need to specify your ECR image paths for all of your APIs.
 
 ## Step 5
diff --git a/docs/guides/single-node-deployment.md b/docs/guides/single-node-deployment.md
index 7a21e560fe..6a949bb48c 100644
--- a/docs/guides/single-node-deployment.md
+++ b/docs/guides/single-node-deployment.md
@@ -101,7 +101,7 @@ $ sudo groupadd docker; sudo gpasswd -a $USER docker
 $ logout
 ```
 
-If you have installed Docker correctly, you should be able to run docker commands such as `docker run hello-world` without running into permission issues or needing `sudo`.
+If you have installed Docker correctly, you should be able to run docker commands such as `docker run hello-world` without running into permission issues or needing `sudo`.
 
 ### Step 12
 
@@ -114,26 +114,4 @@ $ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/master
 
 ### Step 13
 
-You can now use Cortex to deploy your model:
-
-
-```bash
-$ git clone -b master https://github.com/cortexlabs/cortex.git
-
-$ cd cortex/examples/pytorch/text-generator
-
-$ cortex deploy
-
-# take note of the curl command
-$ cortex get text-generator
-```
-
-### Step 14
-
-Make requests by replacing "localhost" in the curl command with your instance's public DNS:
-
-```bash
-$ curl : \
-    -X POST -H "Content-Type: application/json" \
-    -d '{"text": "machine learning is"}'
-```
+You can now use Cortex to deploy your model.
diff --git a/docs/miscellaneous/architecture.md b/docs/miscellaneous/architecture.md
deleted file mode 100644
index 88940898ec..0000000000
--- a/docs/miscellaneous/architecture.md
+++ /dev/null
@@ -1,7 +0,0 @@
-# Architecture diagram
-
-_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
-
-![architecture diagram](https://user-images.githubusercontent.com/808475/83995909-92c1cf00-a90f-11ea-983f-c96117e42aa3.png)
-
-_note: this diagram is simplified for illustrative purposes_
diff --git a/docs/miscellaneous/cli.md b/docs/miscellaneous/cli.md
deleted file mode 100644
index 580afd9fc9..0000000000
--- a/docs/miscellaneous/cli.md
+++ /dev/null
@@ -1,338 +0,0 @@
-# CLI commands
-
-_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
-
-## Install the CLI
-
-```bash
-pip install cortex
-```
-
-## Install the CLI without Python Client
-
-### Mac/Linux OS
-
-```bash
-# Replace `INSERT_CORTEX_VERSION` with the complete CLI version (e.g. 0.18.1):
-$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/vINSERT_CORTEX_VERSION/get-cli.sh)"
-
-# For example to download CLI version 0.18.1 (Note the "v"):
-$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/v0.18.1/get-cli.sh)"
-```
-
-By default, the Cortex CLI is installed at `/usr/local/bin/cortex`. To install the executable elsewhere, export the `CORTEX_INSTALL_PATH` environment variable to your desired location before running the command above.
-
-By default, the Cortex CLI creates a directory at `~/.cortex/` and uses it to store environment configuration. To use a different directory, export the `CORTEX_CLI_CONFIG_DIR` environment variable before running a `cortex` command.
-
-### Windows
-
-To install the Cortex CLI on a Windows machine, follow [this guide](../guides/windows-cli.md).
-
-## Command overview
-
-### deploy
-
-```text
-create or update apis
-
-Usage:
-  cortex deploy [CONFIG_FILE] [flags]
-
-Flags:
-  -e, --env string      environment to use (default "local")
-  -f, --force           override the in-progress api update
-  -y, --yes             skip prompts
-  -o, --output string   output format: one of pretty|json (default "pretty")
-  -h, --help            help for deploy
-```
-
-### get
-
-```text
-get information about apis or jobs
-
-Usage:
-  cortex get [API_NAME] [JOB_ID] [flags]
-
-Flags:
-  -e, --env string      environment to use (default "local")
-  -w, --watch           re-run the command every 2 seconds
-  -o, --output string   output format: one of pretty|json (default "pretty")
-  -v, --verbose         show additional information (only applies to pretty output format)
-  -h, --help            help for get
-```
-
-### logs
-
-```text
-stream logs from an api
-
-Usage:
-  cortex logs API_NAME [JOB_ID] [flags]
-
-Flags:
-  -e, --env string   environment to use (default "local")
-  -h, --help         help for logs
-```
-
-### patch
-
-```text
-update API configuration for a deployed API
-
-Usage:
-  cortex patch [CONFIG_FILE] [flags]
-
-Flags:
-  -e, --env string      environment to use (default "local")
-  -f, --force           override the in-progress api update
-  -o, --output string   output format: one of pretty|json (default "pretty")
-  -h, --help            help for patch
-```
-
-### refresh
-
-```text
-restart all replicas for an api (without downtime)
-
-Usage:
-  cortex refresh API_NAME [flags]
-
-Flags:
-  -e, --env string      environment to use (default "local")
-  -f, --force           override the in-progress api update
-  -o, --output string   output format: one of pretty|json (default "pretty")
-  -h, --help            help for refresh
-```
-
-### predict
-
-```text
-make a prediction request using a json file
-
-Usage:
-  cortex predict API_NAME JSON_FILE [flags]
-
-Flags:
-  -e, --env string   environment to use (default "local")
-  -h, --help         help for predict
-```
-
-### delete
-
-```text
-delete any kind of api or stop a batch job
-
-Usage:
-  cortex delete API_NAME [JOB_ID] [flags]
-
-Flags:
-  -e, --env string      environment to use (default "local")
-  -f, --force           delete the api without confirmation
-  -c, --keep-cache      keep cached data for the api
-  -o, --output string   output format: one of pretty|json (default "pretty")
-  -h, --help            help for delete
-```
-
-### cluster up
-
-```text
-spin up a cluster on aws
-
-Usage:
-  cortex cluster up [flags]
-
-Flags:
-  -c, --config string               path to a cluster configuration file
-      --aws-key string              aws access key id
-      --aws-secret string           aws secret access key
-      --cluster-aws-key string      aws access key id to be used by the cluster
-      --cluster-aws-secret string   aws secret access key to be used by the cluster
-  -e, --configure-env string        name of environment to configure (default "aws")
-  -y, --yes                         skip prompts
-  -h, --help                        help for up
-```
-
-### cluster info
-
-```text
-get information about a cluster
-
-Usage:
-  cortex cluster info [flags]
-
-Flags:
-  -c, --config string          path to a cluster configuration file
-  -n, --name string            name of the cluster
-  -r, --region string          aws region of the cluster
-      --aws-key string         aws access key id
-      --aws-secret string      aws secret access key
-  -e, --configure-env string   name of environment to configure
-  -d, --debug                  save the current cluster state to a file
-  -y, --yes                    skip prompts
-  -h, --help                   help for info
-```
-
-### cluster configure
-
-```text
-update a cluster's configuration
-
-Usage:
-  cortex cluster configure [flags]
-
-Flags:
-  -c, --config string               path to a cluster configuration file
-      --aws-key string              aws access key id
-      --aws-secret string           aws secret access key
-      --cluster-aws-key string      aws access key id to be used by the cluster
-      --cluster-aws-secret string   aws secret access key to be used by the cluster
-  -e, --configure-env string        name of environment to configure
-  -y, --yes                         skip prompts
-  -h, --help                        help for configure
-```
-
-### cluster down
-
-```text
-spin down a cluster
-
-Usage:
-  cortex cluster down [flags]
-
-Flags:
-  -c, --config string       path to a cluster configuration file
-  -n, --name string         name of the cluster
-  -r, --region string       aws region of the cluster
-      --aws-key string      aws access key id
-      --aws-secret string   aws secret access key
-  -y, --yes                 skip prompts
-  -h, --help                help for down
-```
-
-### cluster export
-
-```text
-download the code and configuration for APIs
-
-Usage:
-  cortex cluster export [API_NAME] [API_ID] [flags]
-
-Flags:
-  -c, --config string       path to a cluster configuration file
-  -n, --name string         name of the cluster
-  -r, --region string       aws region of the cluster
-      --aws-key string      aws access key id
-      --aws-secret string   aws secret access key
-  -h, --help                help for export
-```
-
-### env configure
-
-```text
-configure an environment
-
-Usage:
-  cortex env configure [ENVIRONMENT_NAME] [flags]
-
-Flags:
-  -p, --provider string                set the provider without prompting
-  -o, --operator-endpoint string       set the operator endpoint without prompting
-  -k, --aws-access-key-id string       set the aws access key id without prompting
-  -s, --aws-secret-access-key string   set the aws secret access key without prompting
-  -r, --aws-region string              set the aws region without prompting
-  -h, --help                           help for configure
-```
-
-### env list
-
-```text
-list all configured environments
-
-Usage:
-  cortex env list [flags]
-
-Flags:
-  -o, --output string   output format: one of pretty|json (default "pretty")
-  -h, --help            help for list
-```
-
-### env default
-
-```text
-set the default environment
-
-Usage:
-  cortex env default [ENVIRONMENT_NAME] [flags]
-
-Flags:
-  -h, --help   help for default
-```
-
-### env delete
-
-```text
-delete an environment configuration
-
-Usage:
-  cortex env delete [ENVIRONMENT_NAME] [flags]
-
-Flags:
-  -h, --help   help for delete
-```
-
-### version
-
-```text
-print the cli and cluster versions
-
-Usage:
-  cortex version [flags]
-
-Flags:
-  -e, --env string   environment to use (default "local")
-  -h, --help         help for version
-```
-
-### completion
-
-```text
-generate shell completion scripts
-
-to enable cortex shell completion:
-    bash:
-        add this to ~/.bash_profile (mac) or ~/.bashrc (linux):
-            source <(cortex completion bash)
-
-        note: bash-completion must be installed on your system; example installation instructions:
-            mac:
-                1) install bash completion:
-                     brew install bash-completion
-                2) add this to your ~/.bash_profile:
-                     source $(brew --prefix)/etc/bash_completion
-                3) log out and back in, or close your terminal window and reopen it
-            ubuntu:
-                1) install bash completion:
-                     apt update && apt install -y bash-completion  # you may need sudo
-                2) open ~/.bashrc and uncomment the bash completion section, or add this:
-                     if [ -f /etc/bash_completion ] && ! shopt -oq posix; then . /etc/bash_completion; fi
-                3) log out and back in, or close your terminal window and reopen it
-
-    zsh:
-        option 1:
-            add this to ~/.zshrc:
-                source <(cortex completion zsh)
-            if that failed, you can try adding this line (above the source command you just added):
-                autoload -Uz compinit && compinit
-        option 2:
-            create a _cortex file in your fpath, for example:
-                cortex completion zsh > /usr/local/share/zsh/site-functions/_cortex
-
-Note: this will also add the "cx" alias for cortex for convenience
-
-Usage:
-  cortex completion SHELL [flags]
-
-Flags:
-  -h, --help   help for completion
-```
diff --git a/docs/summary.md b/docs/summary.md
index 5c2967a71c..1aac625cd9 100644
--- a/docs/summary.md
+++ b/docs/summary.md
@@ -1,18 +1,24 @@
 # Table of contents
 
-* [Deploy machine learning models to production](../README.md)
-* [Install](cloud/install.md)
-* [Tutorial](https://docs.cortex.dev/v/master/deployments/realtime-api/text-generator)
-* [GitHub](https://github.com/cortexlabs/cortex)
-* [Examples](https://github.com/cortexlabs/cortex/tree/master/examples)
-* [Contact us](contact.md)
+* [Get started](tutorials/realtime.md)
+* [Chat with us](https://gitter.im/cortexlabs/cortex)
 
-## Running Cortex on AWS
+## Tutorials
+
+* [Realtime API](tutorials/realtime.md)
+* [Batch API](tutorials/batch.md)
+* [Multi-model API](tutorials/multi-model.md)
+* [Traffic splitter](tutorials/traffic-splitter.md)
+* [Project directory](tutorials/project.md)
+
+## Running on AWS
 
 * [Install](aws/install.md)
 * [Credentials](aws/credentials.md)
 * [Security](aws/security.md)
 * [Spot instances](aws/spot.md)
+* [GPUs](aws/gpu.md)
+* [Inferentia](aws/inferentia.md)
 * [Networking](aws/networking.md)
 * [VPC peering](aws/vpc-peering.md)
 * [Custom domain](aws/custom-domain.md)
@@ -21,48 +27,33 @@
 * [Update](aws/update.md)
 * [Uninstall](aws/uninstall.md)
 
-## Running Cortex on GCP
+## Running on GCP
 
 * [Install](gcp/install.md)
 * [Credentials](gcp/credentials.md)
 * [Uninstall](gcp/uninstall.md)
 
-## Deployments
-
-* [Realtime API](deployments/realtime-api.md)
-  * [Predictor implementation](deployments/realtime-api/predictors.md)
-  * [API configuration](deployments/realtime-api/api-configuration.md)
-  * [API deployment](deployments/realtime-api/deployment.md)
-  * [API statuses](deployments/realtime-api/statuses.md)
-  * [Models](deployments/realtime-api/models.md)
-  * [Parallelism](deployments/realtime-api/parallelism.md)
-  * [Autoscaling](deployments/realtime-api/autoscaling.md)
-  * [Prediction monitoring](deployments/realtime-api/prediction-monitoring.md)
-  * [Traffic Splitter](deployments/realtime-api/traffic-splitter.md)
-  * [Realtime API tutorial](../examples/pytorch/text-generator/README.md)
-* [Batch API](deployments/batch-api.md)
-  * [Predictor implementation](deployments/batch-api/predictors.md)
-  * [API configuration](deployments/batch-api/api-configuration.md)
-  * [API deployment](deployments/batch-api/deployment.md)
-  * [Endpoints](deployments/batch-api/endpoints.md)
-  * [Job statuses](deployments/batch-api/statuses.md)
-  * [Batch API tutorial](../examples/batch/image-classifier/README.md)
-
-## Advanced
-
-* [Compute](deployments/compute.md)
-* [Using GPUs](deployments/gpus.md)
-* [Using Inferentia](deployments/inferentia.md)
-* [Python packages](deployments/python-packages.md)
-* [System packages](deployments/system-packages.md)
-
-## Miscellaneous
-
-* [CLI commands](miscellaneous/cli.md)
-* [Python client](miscellaneous/python-client.md)
-* [Environments](miscellaneous/environments.md)
-* [Architecture diagram](miscellaneous/architecture.md)
-* [Telemetry](miscellaneous/telemetry.md)
+## Workloads
+
+* [Realtime API](workloads/realtime.md)
+  * [Predictor implementation](workloads/realtime/predictors.md)
+  * [API configuration](workloads/realtime/configuration.md)
+  * [API statuses](workloads/realtime/statuses.md)
+  * [Models](workloads/realtime/models.md)
+  * [Parallelism](workloads/realtime/parallelism.md)
+  * [Autoscaling](workloads/realtime/autoscaling.md)
+  * [Prediction monitoring](workloads/realtime/prediction-monitoring.md)
+  * [Traffic Splitter](workloads/realtime/traffic-splitter.md)
+* [Batch API](workloads/batch.md)
+  * [Predictor implementation](workloads/batch/predictors.md)
+  * [API configuration](workloads/batch/configuration.md)
+  * [Endpoints](workloads/batch/endpoints.md)
+  * [Job statuses](workloads/batch/statuses.md)
+* [Python client](workloads/python-client.md)
+* [Python packages](workloads/python-packages.md)
+* [System packages](workloads/system-packages.md)
+* [Environments](workloads/environments.md)
+* [Telemetry](workloads/telemetry.md)
 
 ## Troubleshooting
 
@@ -70,7 +61,7 @@
 * [404/503 API responses](troubleshooting/api-request-errors.md)
 * [NVIDIA runtime not found](troubleshooting/nvidia-container-runtime-not-found.md)
 * [TF session in predict()](troubleshooting/tf-session-in-predict.md)
-* [Serving-side batching errors](troubleshooting/server-side-batching-errors.md)
+* [Server-side batching errors](troubleshooting/server-side-batching-errors.md)
 
 ## Guides
 
@@ -85,7 +76,4 @@
 * [Docker Hub rate limiting](guides/docker-hub-rate-limiting.md)
 * [Private docker registry](guides/private-docker.md)
 * [Install CLI on Windows](guides/windows-cli.md)
-
-## Contributing
-
-* [Development](contributing/development.md)
+* [Contributing](guides/contributing.md)
diff --git a/docs/troubleshooting/server-side-batching-errors.md b/docs/troubleshooting/server-side-batching-errors.md
index 4740d903fa..df03f75b06 100644
--- a/docs/troubleshooting/server-side-batching-errors.md
+++ b/docs/troubleshooting/server-side-batching-errors.md
@@ -2,7 +2,7 @@
 
 _WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
 
-When `max_batch_size` and `batch_interval` fields are set for the [Realtime API TensorFlow Predictor](../deployments/realtime-api/predictors.md#tensorflow-predictor), errors can be encountered if the associated model hasn't been built for batching.
+When `max_batch_size` and `batch_interval` fields are set for the [Realtime API TensorFlow Predictor](../workloads/realtime/predictors.md#tensorflow-predictor), errors can be encountered if the associated model hasn't been built for batching.
 
 The following error is an example of what happens when the input shape doesn't accommodate batching - e.g. when its shape is `[height, width, 3]` instead of `[batch_size, height, width, 3]`:
diff --git a/docs/troubleshooting/tf-session-in-predict.md b/docs/troubleshooting/tf-session-in-predict.md
index c8e1d56218..fa0f2d6b49 100644
--- a/docs/troubleshooting/tf-session-in-predict.md
+++ b/docs/troubleshooting/tf-session-in-predict.md
@@ -2,7 +2,7 @@
 
 _WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
 
-When doing inferences with TensorFlow using the [Realtime API Python Predictor](../deployments/realtime-api/predictors.md#python-predictor) or [Batch API Python Predictor](../deployments/batch-api/predictors.md#python-predictor), it should be noted that your Python Predictor's `__init__()` constructor is only called on one thread, whereas its `predict()` method can run on any of the available threads (which is configured via the `threads_per_process` field in the API's `predictor` configuration). If `threads_per_process` is set to `1` (the default value), then there is no concern, since `__init__()` and `predict()` will run on the same thread. However, if `threads_per_process` is greater than `1`, then only one of the inference threads will have executed the `__init__()` function. This can cause issues with TensorFlow because the default graph is a property of the current thread, so if `__init__()` initializes the TensorFlow graph, only the thread that executed `__init__()` will have the default graph set.
+When doing inferences with TensorFlow using the [Realtime API Python Predictor](../workloads/realtime/predictors.md#python-predictor) or [Batch API Python Predictor](../workloads/batch/predictors.md#python-predictor), it should be noted that your Python Predictor's `__init__()` constructor is only called on one thread, whereas its `predict()` method can run on any of the available threads (which is configured via the `threads_per_process` field in the API's `predictor` configuration). If `threads_per_process` is set to `1` (the default value), then there is no concern, since `__init__()` and `predict()` will run on the same thread. However, if `threads_per_process` is greater than `1`, then only one of the inference threads will have executed the `__init__()` function. This can cause issues with TensorFlow because the default graph is a property of the current thread, so if `__init__()` initializes the TensorFlow graph, only the thread that executed `__init__()` will have the default graph set.
 
 The error you may see if the default graph is not set (as a consequence of `__init__()` and `predict()` running in separate threads) is:
diff --git a/docs/tutorials/batch.md b/docs/tutorials/batch.md
new file mode 100644
index 0000000000..b10691f806
--- /dev/null
+++ b/docs/tutorials/batch.md
@@ -0,0 +1,156 @@
+# Deploy a batch API
+
+Deploy models as batch APIs that can orchestrate distributed batch inference jobs on large datasets.
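As a rough sketch of the orchestration described above (an illustration, not Cortex's actual implementation): the submitted item list is split into batches of at most `batch_size` items, and each batch becomes one payload passed to a worker's `predict()`:

```python
def make_batches(items, batch_size):
    """Split a flat item list into payloads of at most `batch_size` items."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]

# five image URLs with batch_size=2 yield three batches: 2 + 2 + 1
urls = ["img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg", "img5.jpg"]
batches = make_batches(urls, 2)
print(batches)  # → [['img1.jpg', 'img2.jpg'], ['img3.jpg', 'img4.jpg'], ['img5.jpg']]
```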
+
+## Key features
+
+* Distributed inference
+* Fault tolerance with queues
+* Metrics and log aggregation
+* `on_job_complete` webhook
+* Scale to 0
+
+## How it works
+
+### Install cortex
+
+```bash
+$ pip install cortex
+```
+
+### Spin up a cluster on AWS
+
+```bash
+$ cortex cluster up
+```
+
+### Define a batch API
+
+```python
+# batch.py
+
+import cortex
+
+class PythonPredictor:
+    def __init__(self, config, job_spec):
+        from torchvision import transforms
+        import torchvision
+        import requests
+        import boto3
+        import os
+        import re
+
+        self.model = torchvision.models.alexnet(pretrained=True).eval()
+        self.labels = requests.get(config["labels"]).text.split("\n")[1:]
+
+        normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+        self.preprocess = transforms.Compose(
+            [transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(), normalize]
+        )
+
+        self.s3 = boto3.client("s3")  # initialize S3 client to save results
+        self.bucket, self.key = re.match("s3://(.+?)/(.+)", config["dest_s3_dir"]).groups()
+        self.key = os.path.join(self.key, job_spec["job_id"])
+
+    def predict(self, payload, batch_id):
+        import json
+        import torch
+        from PIL import Image
+        from io import BytesIO
+        import requests
+
+        tensor_list = []
+        for image_url in payload:  # download and preprocess each image
+            img_pil = Image.open(BytesIO(requests.get(image_url).content))
+            tensor_list.append(self.preprocess(img_pil))
+
+        img_tensor = torch.stack(tensor_list)
+        with torch.no_grad():  # classify the batch of images
+            prediction = self.model(img_tensor)
+            _, indices = prediction.max(1)
+
+        results = [{"url": payload[i], "class": self.labels[class_idx]} for i, class_idx in enumerate(indices)]
+        self.s3.put_object(Bucket=self.bucket, Key=f"{self.key}/{batch_id}.json", Body=json.dumps(results))
+
+requirements = ["torch", "boto3", "pillow", "torchvision", "requests"]
+
+api_spec = {
+    "name": "image-classifier",
+    "kind": "BatchAPI",
+    "predictor": {
+        "config": {
+            "labels": "https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt"
+        }
+    }
+}
+
+cx = cortex.client("aws")
+cx.create_api(api_spec, predictor=PythonPredictor, requirements=requirements)
+```
+
+### Deploy to your Cortex cluster on AWS
+
+```bash
+$ python batch.py
+```
+
+### Describe the Batch API
+
+```bash
+$ cortex get image-classifier --env aws
+```
+
+### Submit a job
+
+```python
+import cortex
+import requests
+
+cx = cortex.client("aws")
+batch_endpoint = cx.get_api("image-classifier")["endpoint"]
+
+dest_s3_dir =  # specify S3 directory for the results (make sure your cluster has access to this bucket)
+
+job_spec = {
+    "workers": 1,
+    "item_list": {
+        "items": [
+            "https://i.imgur.com/PzXprwl.jpg",
+            "https://i.imgur.com/E4cOSLw.jpg",
+            "https://user-images.githubusercontent.com/4365343/96516272-d40aa980-1234-11eb-949d-8e7e739b8345.jpg",
+            "https://i.imgur.com/jDimNTZ.jpg",
+            "https://i.imgur.com/WqeovVj.jpg"
+        ],
+        "batch_size": 2
+    },
+    "config": {
+        "dest_s3_dir": dest_s3_dir
+    }
+}
+
+response = requests.post(batch_endpoint, json=job_spec)
+
+print(response.text)
+# > {"job_id":"69b183ed6bdf3e9b","api_name":"image-classifier", "config": {"dest_s3_dir": ...}}
+```
+
+### Monitor the job
+
+```bash
+$ cortex get image-classifier 69b183ed6bdf3e9b
+```
+
+### Stream job logs
+
+```bash
+$ cortex logs image-classifier 69b183ed6bdf3e9b
+```
+
+### View the results
+
+Once the job is complete, you should be able to find the results of the batch job in the S3 directory you've specified.
+
+### Delete the Batch API
+
+```bash
+$ cortex delete image-classifier --env aws
+```
diff --git a/docs/tutorials/multi-model.md b/docs/tutorials/multi-model.md
new file mode 100644
index 0000000000..043ea6c6ad
--- /dev/null
+++ b/docs/tutorials/multi-model.md
@@ -0,0 +1,43 @@
+# Deploy a multi-model API
+
+Deploy several models in a single API to improve resource utilization efficiency.
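The dispatch pattern used below — routing each request to a model based on the `model` query parameter — can be exercised without a cluster. A minimal sketch with the two models stubbed out (the stub return values are made up for illustration):

```python
class StubPredictor:
    """Simplified stand-in for the Predictor's dispatch logic; models are stubbed."""

    def __init__(self):
        # stand-ins for the sentiment analyzer and language identifier
        self.analyzer = lambda text: [{"label": "POSITIVE", "score": 0.99}]
        self.language_identifier = lambda text: "en"

    def predict(self, query_params, payload):
        # the `model` query parameter selects which model serves the request
        model = query_params.get("model")
        if model == "sentiment":
            return self.analyzer(payload["text"])[0]
        elif model == "language":
            return self.language_identifier(payload["text"])

predictor = StubPredictor()
print(predictor.predict({"model": "language"}, {"text": "hello"}))  # → en
```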
+
+### Define a multi-model API
+
+```python
+# multi_model.py
+
+import cortex
+
+class PythonPredictor:
+    def __init__(self, config):
+        from transformers import pipeline
+        self.analyzer = pipeline(task="sentiment-analysis")
+
+        import wget
+        import fasttext
+        wget.download(
+            "https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin", "/tmp/model"
+        )
+        self.language_identifier = fasttext.load_model("/tmp/model")
+
+    def predict(self, query_params, payload):
+        model = query_params.get("model")
+        if model == "sentiment":
+            return self.analyzer(payload["text"])[0]
+        elif model == "language":
+            return self.language_identifier.predict(payload["text"])[0][0][-2:]
+
+requirements = ["tensorflow", "transformers", "wget", "fasttext"]
+
+api_spec = {"name": "multi-model", "kind": "RealtimeAPI"}
+
+cx = cortex.client("aws")
+cx.create_api(api_spec, predictor=PythonPredictor, requirements=requirements)
+```
+
+### Deploy
+
+```bash
+$ python multi_model.py
+```
diff --git a/docs/tutorials/project.md b/docs/tutorials/project.md
new file mode 100644
index 0000000000..dc512bfb49
--- /dev/null
+++ b/docs/tutorials/project.md
@@ -0,0 +1,61 @@
+# Deploy a project
+
+You can deploy an API by providing a project directory. Cortex will save the project directory and make it available during API initialization.
+
+```bash
+project/
+  ├── model.py
+  ├── util.py
+  ├── predictor.py
+  ├── requirements.txt
+  └── ...
+```
+
+You can define your Predictor class in a separate Python file and import code from your project.
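For illustration, `model.py` might define the `MyModel` class that the Predictor imports (a hypothetical stand-in for your actual model code):

```python
# model.py (hypothetical example)

class MyModel:
    def __init__(self):
        # load weights / build the model here
        self.prefix = "generated: "

    def __call__(self, payload):
        # run inference on the payload and return the result
        return self.prefix + str(payload)
```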
+
+```python
+# predictor.py
+
+from model import MyModel
+
+class PythonPredictor:
+    def __init__(self, config):
+        self.model = MyModel()
+
+    def predict(self, payload):
+        return self.model(payload)
+```
+
+## Deploy using the Python Client
+
+```python
+import cortex
+
+api_spec = {
+    "name": "text-generator",
+    "kind": "RealtimeAPI",
+    "predictor": {
+        "type": "python",
+        "path": "predictor.py"
+    }
+}
+
+cx = cortex.client("aws")
+cx.create_api(api_spec, project_dir=".")
+```
+
+## Deploy using the CLI
+
+```yaml
+# api.yaml
+
+- name: text-generator
+  kind: RealtimeAPI
+  predictor:
+    type: python
+    path: predictor.py
+```
+
+```bash
+$ cortex deploy api.yaml -e aws
+```
diff --git a/docs/tutorials/realtime.md b/docs/tutorials/realtime.md
new file mode 100644
index 0000000000..5befb96bc8
--- /dev/null
+++ b/docs/tutorials/realtime.md
@@ -0,0 +1,106 @@
+# Deploy a realtime API
+
+Deploy models as realtime APIs that can respond to prediction requests on demand.
+
+## Key features
+
+* Request-based autoscaling
+* Multi-model endpoints
+* Server-side batching
+* Metrics and log aggregation
+* Rolling updates
+
+## How it works
+
+### Install cortex
+
+```bash
+$ pip install cortex
+```
+
+### Define a realtime API
+
+```python
+# text_generator.py
+
+import cortex
+
+class PythonPredictor:
+    def __init__(self, config):
+        from transformers import pipeline
+
+        self.model = pipeline(task="text-generation")
+
+    def predict(self, payload):
+        return self.model(payload["text"])[0]
+
+requirements = ["tensorflow", "transformers"]
+
+api_spec = {"name": "text-generator", "kind": "RealtimeAPI"}
+
+cx = cortex.client("local")
+cx.create_api(api_spec, predictor=PythonPredictor, requirements=requirements)
+```
+
+### Test locally (requires Docker)
+
+```bash
+$ python text_generator.py
+```
+
+### Monitor
+
+```bash
+$ cortex get text-generator --watch
+```
+
+### Make a request
+
+```bash
+$ curl http://localhost:8889 -X POST -H "Content-Type: application/json" -d '{"text": "hello world"}'
+```
+
+### Stream logs
+
+```bash
+$ cortex logs text-generator
+```
+
+### Spin up a cluster on AWS
+
+```bash
+$ cortex cluster up
+```
+
+### Edit `text_generator.py`
+
+```python
+# cx = cortex.client("local")
+cx = cortex.client("aws")
+```
+
+### Deploy to AWS
+
+```bash
+$ python text_generator.py
+```
+
+### Monitor
+
+```bash
+$ cortex get text-generator --env aws --watch
+```
+
+### Make a request
+
+```bash
+$ curl https://***.execute-api.us-west-2.amazonaws.com/text-generator -X POST -H "Content-Type: application/json" -d '{"text": "hello world"}'
+```
+
+### Delete the APIs
+
+```bash
+$ cortex delete text-generator --env local
+
+$ cortex delete text-generator --env aws
+```
diff --git a/docs/tutorials/traffic-splitter.md b/docs/tutorials/traffic-splitter.md
new file mode 100644
index 0000000000..1191ab6c91
--- /dev/null
+++ b/docs/tutorials/traffic-splitter.md
@@ -0,0 +1,68 @@
+# Traffic splitter
+
+A Traffic Splitter can be used to expose multiple APIs as a single endpoint. The percentage of traffic routed to each API can be controlled. This can be useful when performing A/B tests, setting up multi-armed bandits or performing canary deployments.
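Conceptually, the splitter performs weighted random selection over its APIs; a toy sketch (not Cortex's actual implementation) of how a 50/50 split distributes requests:

```python
import random

def route(apis):
    """Pick an API name according to its weight (weights sum to 100)."""
    names = [api["name"] for api in apis]
    weights = [api["weight"] for api in apis]
    return random.choices(names, weights=weights, k=1)[0]

apis = [
    {"name": "text-generator-cpu", "weight": 50},
    {"name": "text-generator-gpu", "weight": 50},
]

# simulate 10,000 requests against the splitter
counts = {"text-generator-cpu": 0, "text-generator-gpu": 0}
for _ in range(10_000):
    counts[route(apis)] += 1

print(counts)  # each API receives roughly half of the requests
```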
+ +**Note: Traffic Splitter is only supported on a Cortex cluster** + +## Deploy APIs + +```python +import cortex + +class PythonPredictor: + def __init__(self, config): + from transformers import pipeline + self.model = pipeline(task="text-generation") + + def predict(self, payload): + return self.model(payload["text"])[0] + +requirements = ["tensorflow", "transformers"] + +api_spec_cpu = { + "name": "text-generator-cpu", + "kind": "RealtimeAPI", + "compute": { + "cpu": 1, + }, +} + +api_spec_gpu = { + "name": "text-generator-gpu", + "kind": "RealtimeAPI", + "compute": { + "gpu": 1, + }, +} + +cx = cortex.client("aws") +cx.create_api(api_spec_cpu, predictor=PythonPredictor, requirements=requirements) +cx.create_api(api_spec_gpu, predictor=PythonPredictor, requirements=requirements) +``` + +## Deploy a traffic splitter + +```python +traffic_splitter_spec = { + "name": "text-generator", + "kind": "TrafficSplitter", + "apis": [ + {"name": "text-generator-cpu", "weight": 50}, + {"name": "text-generator-gpu", "weight": 50}, + ], +} + +cx.create_api(traffic_splitter_spec) +``` + +## Update the weights of the traffic splitter + +```python +traffic_splitter_spec = cx.get_api("text-generator")["spec"]["submitted_api_spec"] + +# send 99% of the traffic to text-generator-gpu +traffic_splitter_spec["apis"][0]["weight"] = 1 +traffic_splitter_spec["apis"][1]["weight"] = 99 + +cx.patch(traffic_splitter_spec) +``` diff --git a/docs/deployments/batch-api/api-configuration.md b/docs/workloads/batch/configuration.md similarity index 87% rename from docs/deployments/batch-api/api-configuration.md rename to docs/workloads/batch/configuration.md index e45ff2a850..ad5d216710 100644 --- a/docs/deployments/batch-api/api-configuration.md +++ b/docs/workloads/batch/configuration.md @@ -1,13 +1,7 @@ -# API configuration +# Batch API configuration _WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ -Once your model is 
[exported](../../guides/exporting.md) and you've implemented a [Predictor](predictors.md), you can configure your API via a YAML file (typically named `cortex.yaml`). - -Reference the section below which corresponds to your Predictor type: [Python](#python-predictor), [TensorFlow](#tensorflow-predictor), or [ONNX](#onnx-predictor). - -**Batch APIs are only supported on a Cortex cluster (in AWS).** - ## Python Predictor @@ -31,8 +25,6 @@ Reference the section below which corresponds to your Predictor type: [Python](# mem: # memory request per worker, e.g. 200Mi or 1Gi (default: Null) ``` -See additional documentation for [compute](../compute.md), [networking](../../aws/networking.md), and [overriding API images](../system-packages.md). - ## TensorFlow Predictor @@ -67,8 +59,6 @@ See additional documentation for [compute](../compute.md), [networking](../../aw mem: # memory request per worker, e.g. 200Mi or 1Gi (default: Null) ``` -See additional documentation for [compute](../compute.md), [networking](../../aws/networking.md), and [overriding API images](../system-packages.md). - ## ONNX Predictor @@ -96,5 +86,3 @@ See additional documentation for [compute](../compute.md), [networking](../../aw gpu: # GPU request per worker (default: 0) mem: # memory request per worker, e.g. 200Mi or 1Gi (default: Null) ``` - -See additional documentation for [compute](../compute.md), [networking](../../aws/networking.md), and [overriding API images](../system-packages.md). 
diff --git a/docs/deployments/batch-api/endpoints.md b/docs/workloads/batch/endpoints.md similarity index 100% rename from docs/deployments/batch-api/endpoints.md rename to docs/workloads/batch/endpoints.md diff --git a/docs/deployments/batch-api/predictors.md b/docs/workloads/batch/predictors.md similarity index 95% rename from docs/deployments/batch-api/predictors.md rename to docs/workloads/batch/predictors.md index bd42cd464a..ada4321a20 100644 --- a/docs/deployments/batch-api/predictors.md +++ b/docs/workloads/batch/predictors.md @@ -94,11 +94,6 @@ class PythonPredictor: For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as from where to download the model and initialization files, or any configurable model parameters. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor. The `config` parameters in the `API configuration` can be overridden by providing `config` in the job submission requests. -### Examples - - -You can find an example of a BatchAPI using a PythonPredictor in [examples/batch/image-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/batch/image-classifier). - ### Pre-installed packages The following Python packages are pre-installed in Python Predictors and can be used in your implementations: @@ -233,11 +228,6 @@ When multiple models are defined using the Predictor's `models` field, the `tens For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as from where to download the model and initialization files, or any configurable model parameters. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor. The `config` parameters in the `API configuration` can be overridden by providing `config` in the job submission requests. 
-### Examples - - -You can find an example of a BatchAPI using a TensorFlowPredictor in [examples/batch/tensorflow](https://github.com/cortexlabs/cortex/tree/master/examples/batch/tensorflow). - ### Pre-installed packages The following Python packages are pre-installed in TensorFlow Predictors and can be used in your implementations: @@ -321,11 +311,6 @@ When multiple models are defined using the Predictor's `models` field, the `onnx For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as from where to download the model and initialization files, or any configurable model parameters. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor. The `config` parameters in the `API configuration` can be overridden by providing `config` in the job submission requests. -### Examples - - -You can find an example of a BatchAPI using an ONNXPredictor in [examples/batch/onnx](https://github.com/cortexlabs/cortex/tree/master/examples/batch/onnx). - ### Pre-installed packages The following Python packages are pre-installed in ONNX Predictors and can be used in your implementations: diff --git a/docs/deployments/batch-api/statuses.md b/docs/workloads/batch/statuses.md similarity index 100% rename from docs/deployments/batch-api/statuses.md rename to docs/workloads/batch/statuses.md diff --git a/docs/miscellaneous/environments.md b/docs/workloads/environments.md similarity index 100% rename from docs/miscellaneous/environments.md rename to docs/workloads/environments.md diff --git a/docs/miscellaneous/python-client.md b/docs/workloads/python-client.md similarity index 91% rename from docs/miscellaneous/python-client.md rename to docs/workloads/python-client.md index 6c98188c56..3af866f68c 100644 --- a/docs/miscellaneous/python-client.md +++ b/docs/workloads/python-client.md @@ -116,13 +116,8 @@ Deploy an API. 
**Arguments**: -- `api_spec` - A dictionary defining a single Cortex API. Schema can be found here: - → Realtime API: https://docs.cortex.dev/v/master/deployments/realtime-api/api-configuration - → Batch API: https://docs.cortex.dev/v/master/deployments/batch-api/api-configuration - → Traffic Splitter: https://docs.cortex.dev/v/master/deployments/realtime-api/traffic-splitter +- `api_spec` - A dictionary defining a single Cortex API. See https://docs.cortex.dev/v/master/ for schema. - `predictor` - A Cortex Predictor class implementation. Not required when deploying a traffic splitter. - → Realtime API: https://docs.cortex.dev/v/master/deployments/realtime-api/predictors - → Batch API: https://docs.cortex.dev/v/master/deployments/batch-api/predictors - `requirements` - A list of PyPI dependencies that will be installed before the predictor class implementation is invoked. - `conda_packages` - A list of Conda dependencies that will be installed before the predictor class implementation is invoked. - `project_dir` - Path to a python project. 
diff --git a/docs/deployments/python-packages.md b/docs/workloads/python-packages.md similarity index 100% rename from docs/deployments/python-packages.md rename to docs/workloads/python-packages.md diff --git a/docs/deployments/realtime-api/autoscaling.md b/docs/workloads/realtime/autoscaling.md similarity index 100% rename from docs/deployments/realtime-api/autoscaling.md rename to docs/workloads/realtime/autoscaling.md diff --git a/docs/deployments/realtime-api/api-configuration.md b/docs/workloads/realtime/configuration.md similarity index 93% rename from docs/deployments/realtime-api/api-configuration.md rename to docs/workloads/realtime/configuration.md index b70e2d9f69..7b1d95dd2e 100644 --- a/docs/deployments/realtime-api/api-configuration.md +++ b/docs/workloads/realtime/configuration.md @@ -2,10 +2,6 @@ _WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_ -Once your model is [exported](../../guides/exporting.md) and you've implemented a [Predictor](predictors.md), you can configure your API via a YAML file (typically named `cortex.yaml`). - -Reference the section below which corresponds to your Predictor type: [Python](#python-predictor), [TensorFlow](#tensorflow-predictor), or [ONNX](#onnx-predictor). - ## Python Predictor @@ -60,8 +56,6 @@ Reference the section below which corresponds to your Predictor type: [Python](# max_unavailable: # maximum number of replicas that can be unavailable during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%) ``` -See additional documentation for [models](models.md), [parallelism](parallelism.md), [autoscaling](autoscaling.md), [compute](../compute.md), [networking](../../aws/networking.md), [prediction monitoring](prediction-monitoring.md), and [overriding API images](../system-packages.md). 
- ## TensorFlow Predictor @@ -123,8 +117,6 @@ See additional documentation for [models](models.md), [parallelism](parallelism. max_unavailable: # maximum number of replicas that can be unavailable during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%) ``` -See additional documentation for [models](models.md), [parallelism](parallelism.md), [autoscaling](autoscaling.md), [compute](../compute.md), [networking](../../aws/networking.md), [prediction monitoring](prediction-monitoring.md), and [overriding API images](../system-packages.md). - ## ONNX Predictor @@ -178,5 +170,3 @@ See additional documentation for [models](models.md), [parallelism](parallelism. max_surge: # maximum number of replicas that can be scheduled above the desired number of replicas during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%) (set to 0 to disable rolling updates) max_unavailable: # maximum number of replicas that can be unavailable during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%) ``` - -See additional documentation for [models](models.md), [parallelism](parallelism.md), [autoscaling](autoscaling.md), [compute](../compute.md), [networking](../../aws/networking.md), [prediction monitoring](prediction-monitoring.md), and [overriding API images](../system-packages.md). 
diff --git a/docs/deployments/realtime-api/models.md b/docs/workloads/realtime/models.md similarity index 95% rename from docs/deployments/realtime-api/models.md rename to docs/workloads/realtime/models.md index 07fdc2ce1a..ab88460a78 100644 --- a/docs/deployments/realtime-api/models.md +++ b/docs/workloads/realtime/models.md @@ -168,9 +168,6 @@ When using the `models.dir` field, the directory provided may contain multiple s In this case, there are two models in the directory, one of which is named "text-generator", and the other is named "sentiment-analyzer". - -Additional examples can be seen in the [multi model guide](../../guides/multi-model.md) and in [examples/model-caching](https://github.com/cortexlabs/cortex/tree/master/examples/model-caching) (remove the `cache_size` and `disk_cache_size` configurations in `cortex.yaml` to disable [multi model caching](#multi-model-caching)). - ## Live model reloading Live model reloading is a mechanism that periodically checks for updated models in the model path(s) provided in `predictor.model_path` or `predictor.models`. It is automatically enabled for all predictor types, including the Python predictor type (as long as model paths are specified via `model_path` or `models` in the `predictor` configuration). @@ -182,9 +179,6 @@ The following is a list of events that will trigger the API to update its model( * A model changes its directory structure. * A file in the model directory is updated in-place. - -Examples can be seen in [examples/live-reloading](https://github.com/cortexlabs/cortex/tree/master/examples/live-reloading). - Usage varies based on the predictor type: ### Python @@ -393,9 +387,6 @@ The model cache is a two-layer cache, configured by the following parameters in Both of these fields must be specified, in addition to either the `dir` or `paths` field (which specifies the model paths, see above for documentation). 
Multi model caching is only supported if `predictor.processes_per_replica` is set to 1 (the default value). - -See [examples/model-caching](https://github.com/cortexlabs/cortex/tree/master/examples/model-caching) for examples. - ### Caveats Cortex periodically runs a background script (every 10 seconds) that counts the number of models in memory and on disk, and evicts the least recently used models if the count exceeds `cache_size` / `disk_cache_size`. diff --git a/docs/deployments/realtime-api/parallelism.md b/docs/workloads/realtime/parallelism.md similarity index 95% rename from docs/deployments/realtime-api/parallelism.md rename to docs/workloads/realtime/parallelism.md index ad44641ff8..3ec3ca7854 100644 --- a/docs/deployments/realtime-api/parallelism.md +++ b/docs/workloads/realtime/parallelism.md @@ -47,6 +47,3 @@ When optimizing for maximum throughput, a good rule of thumb is to follow these 1. Multiply the maximum throughput from step 1 by the `batch_interval` from step 2. The result is a number which you can assign to `max_batch_size`. 1. Run the load test again. If the inference fails with that batch size (e.g. due to running out of GPU or RAM memory), then reduce `max_batch_size` to a level that works (reduce `batch_interval` by the same factor). 1. Use the load test to determine the peak throughput of the API replica. Multiply the observed throughput by the `batch_interval` to calculate the average batch size. If the average batch size coincides with `max_batch_size`, then it might mean that the throughput could still be further increased by increasing `max_batch_size`. If it's lower, then it means that `batch_interval` is triggering the inference before `max_batch_size` requests have been aggregated. If modifying both `max_batch_size` and `batch_interval` doesn't improve the throughput, then the service may be bottlenecked by something else (e.g. CPU, network IO, `processes_per_replica`, `threads_per_process`, etc). 
- - -An example of server-side batching for the TensorFlow Predictor that has been benchmarked is found in [ResNet50 in TensorFlow](https://github.com/cortexlabs/cortex/tree/master/examples/tensorflow/image-classifier-resnet50#throughput-test). diff --git a/docs/deployments/realtime-api/prediction-monitoring.md b/docs/workloads/realtime/prediction-monitoring.md similarity index 100% rename from docs/deployments/realtime-api/prediction-monitoring.md rename to docs/workloads/realtime/prediction-monitoring.md diff --git a/docs/deployments/realtime-api/predictors.md b/docs/workloads/realtime/predictors.md similarity index 88% rename from docs/deployments/realtime-api/predictors.md rename to docs/workloads/realtime/predictors.md index 627f37b73a..b9aa1fdfa8 100644 --- a/docs/deployments/realtime-api/predictors.md +++ b/docs/workloads/realtime/predictors.md @@ -134,64 +134,6 @@ Your API can accept requests with different types of payloads such as `JSON`-par Your `predictor` method can return different types of objects such as `JSON`-parseable, `string`, and `bytes` objects. Navigate to the [API responses](#api-responses) section to learn about how to configure your `predictor` method to respond with different response codes and content-types. -### Examples - - -Many of the [examples](https://github.com/cortexlabs/cortex/tree/master/examples) use the Python Predictor, including all of the PyTorch examples. 
- - -Here is the Predictor for [examples/pytorch/text-generator](https://github.com/cortexlabs/cortex/tree/master/examples/pytorch/text-generator): - -```python -import torch -from transformers import GPT2Tokenizer, GPT2LMHeadModel - - -class PythonPredictor: - def __init__(self, config): - self.device = "cuda" if torch.cuda.is_available() else "cpu" - print(f"using device: {self.device}") - self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2") - self.model = GPT2LMHeadModel.from_pretrained("gpt2").to(self.device) - - def predict(self, payload): - input_length = len(payload["text"].split()) - tokens = self.tokenizer.encode(payload["text"], return_tensors="pt").to(self.device) - prediction = self.model.generate(tokens, max_length=input_length + 20, do_sample=True) - return self.tokenizer.decode(prediction[0]) -``` - - -Here is the Predictor for [examples/live-reloading/python/mpg-estimator](https://github.com/cortexlabs/cortex/tree/feature/master/examples/live-reloading/python/mpg-estimator): - -```python -import mlflow.sklearn -import numpy as np - - -class PythonPredictor: - def __init__(self, config, python_client): - self.client = python_client - - def load_model(self, model_path): - return mlflow.sklearn.load_model(model_path) - - def predict(self, payload, query_params): - model_version = query_params.get("version") - - model = self.client.get_model(model_version=model_version) - model_input = [ - payload["cylinders"], - payload["displacement"], - payload["horsepower"], - payload["weight"], - payload["acceleration"], - ] - result = model.predict([model_input]).item() - - return {"prediction": result, "model": {"version": model_version}} -``` - ### Pre-installed packages The following Python packages are pre-installed in Python Predictors and can be used in your implementations: @@ -335,27 +277,6 @@ Your API can accept requests with different types of payloads such as `JSON`-par Your `predictor` method can return different types of objects such as 
`JSON`-parseable, `string`, and `bytes` objects. Navigate to the [API responses](#api-responses) section to learn about how to configure your `predictor` method to respond with different response codes and content-types. -### Examples - - -Most of the examples in [examples/tensorflow](https://github.com/cortexlabs/cortex/tree/master/examples/tensorflow) use the TensorFlow Predictor. - - -Here is the Predictor for [examples/tensorflow/iris-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/tensorflow/iris-classifier): - -```python -labels = ["setosa", "versicolor", "virginica"] - -class TensorFlowPredictor: - def __init__(self, tensorflow_client, config): - self.client = tensorflow_client - - def predict(self, payload): - prediction = self.client.predict(payload) - predicted_class_id = int(prediction["class_ids"][0]) - return labels[predicted_class_id] -``` - ### Pre-installed packages The following Python packages are pre-installed in TensorFlow Predictors and can be used in your implementations: @@ -448,31 +369,6 @@ Your API can accept requests with different types of payloads such as `JSON`-par Your `predictor` method can return different types of objects such as `JSON`-parseable, `string`, and `bytes` objects. Navigate to the [API responses](#api-responses) section to learn about how to configure your `predictor` method to respond with different response codes and content-types. 
-### Examples - - -[examples/onnx/iris-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/onnx/iris-classifier) uses the ONNX Predictor: - -```python -labels = ["setosa", "versicolor", "virginica"] - -class ONNXPredictor: - def __init__(self, onnx_client, config): - self.client = onnx_client - - def predict(self, payload): - model_input = [ - payload["sepal_length"], - payload["sepal_width"], - payload["petal_length"], - payload["petal_width"], - ] - - prediction = self.client.predict(model_input) - predicted_class_id = prediction[0][0] - return labels[predicted_class_id] -``` - ### Pre-installed packages The following Python packages are pre-installed in ONNX Predictors and can be used in your implementations: diff --git a/docs/deployments/realtime-api/statuses.md b/docs/workloads/realtime/statuses.md similarity index 100% rename from docs/deployments/realtime-api/statuses.md rename to docs/workloads/realtime/statuses.md diff --git a/docs/deployments/realtime-api/traffic-splitter.md b/docs/workloads/realtime/traffic-splitter.md similarity index 89% rename from docs/deployments/realtime-api/traffic-splitter.md rename to docs/workloads/realtime/traffic-splitter.md index 3a8a004da1..adfee17215 100644 --- a/docs/deployments/realtime-api/traffic-splitter.md +++ b/docs/workloads/realtime/traffic-splitter.md @@ -73,9 +73,3 @@ deleted traffic-splitter ``` Note that this will not delete the Realtime APIs targeted by the Traffic Splitter. 
- -## Additional resources - -* [Traffic Splitter Tutorial](../../../examples/traffic-splitter/README.md) provides a step-by-step walkthrough for deploying an Traffic Splitter -* [Realtime API Tutorial](../../../examples/pytorch/text-generator/README.md) provides a step-by-step walkthrough of deploying a realtime API for text generation -* [CLI documentation](../../miscellaneous/cli.md) lists all CLI commands diff --git a/docs/deployments/system-packages.md b/docs/workloads/system-packages.md similarity index 100% rename from docs/deployments/system-packages.md rename to docs/workloads/system-packages.md diff --git a/docs/miscellaneous/telemetry.md b/docs/workloads/telemetry.md similarity index 50% rename from docs/miscellaneous/telemetry.md rename to docs/workloads/telemetry.md index b26f2ece87..0c9c3f4821 100644 --- a/docs/miscellaneous/telemetry.md +++ b/docs/workloads/telemetry.md @@ -6,11 +6,7 @@ By default, Cortex sends anonymous usage data to Cortex Labs. ## What data is collected? -If telemetry is enabled, events and errors are collected. Each time you run a command an event will be sent with a randomly generated unique CLI ID and the name of the command. For example, if you run `cortex deploy`, Cortex Labs will receive an event of the structure `{id: 1234, command: "deploy"}`. In addition, the operator sends heartbeats that include cluster metrics like the types of instances running in your cluster. - -## Why is this data being collected? - -Telemetry helps us make Cortex better. For example, we discovered that people are running `cortex delete` more times than we expected and realized that our documentation doesn't explain clearly that `cortex deploy` is declarative and can be run consecutively without deleting APIs. +If telemetry is enabled, events and errors are collected. Each time you run a command an event will be sent with a randomly generated unique CLI ID and the name of the command. 
For example, if you run `cortex get`, Cortex Labs will receive an event of the structure `{id: 1234, command: "get"}`. In addition, the operator sends heartbeats that include cluster metrics like the types of instances running in your cluster. ## How do I opt out? diff --git a/manager/debug.sh b/manager/debug.sh index 46c3a6e03c..0b292b5fed 100755 --- a/manager/debug.sh +++ b/manager/debug.sh @@ -27,7 +27,7 @@ if ! eksctl utils describe-stacks --cluster=$CORTEX_CLUSTER_NAME --region=$CORTE fi eksctl utils write-kubeconfig --cluster=$CORTEX_CLUSTER_NAME --region=$CORTEX_REGION | grep -v "saved kubeconfig as" | grep -v "using region" | grep -v "eksctl version" || true -out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/aws/security#running-cortex-cluster-commands-from-different-iam-users"; exit 1; fi +out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/"; exit 1; fi echo -n "gathering cluster data" diff --git a/manager/info.sh b/manager/info.sh index c754737605..a682d17f8b 100755 --- a/manager/info.sh +++ b/manager/info.sh @@ -36,7 +36,7 @@ if ! 
eksctl utils describe-stacks --cluster=$CORTEX_CLUSTER_NAME --region=$CORTE fi eksctl utils write-kubeconfig --cluster=$CORTEX_CLUSTER_NAME --region=$CORTEX_REGION | grep -v "saved kubeconfig as" | grep -v "using region" | grep -v "eksctl version" || true -out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/aws/security#running-cortex-cluster-commands-from-different-iam-users"; exit 1; fi +out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/"; exit 1; fi operator_endpoint=$(get_operator_endpoint) api_load_balancer_endpoint=$(get_api_load_balancer_endpoint) diff --git a/manager/install.sh b/manager/install.sh index cb0b9616ea..aeddadbc3a 100755 --- a/manager/install.sh +++ b/manager/install.sh @@ -97,7 +97,7 @@ function cluster_up_aws() { echo -e "\ncortex is ready!" 
if [ "$CORTEX_OPERATOR_LOAD_BALANCER_SCHEME" == "internal" ]; then - echo -e "note: you will need to configure VPC Peering to connect to your cluster: https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/aws/vpc-peering" + echo -e "note: you will need to configure VPC Peering to connect to your cluster: https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/" fi print_endpoints_aws @@ -242,7 +242,7 @@ function check_eks() { function write_kubeconfig() { eksctl utils write-kubeconfig --cluster=$CORTEX_CLUSTER_NAME --region=$CORTEX_REGION | grep -v "saved kubeconfig as" | grep -v "using region" | grep -v "eksctl version" || true - out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/aws/security#running-cortex-cluster-commands-from-different-iam-users"; exit 1; fi + out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/"; exit 1; fi } function setup_configmap() { diff --git a/manager/refresh.sh b/manager/refresh.sh index ce1389cdd5..42595008c4 100755 --- a/manager/refresh.sh +++ b/manager/refresh.sh @@ -27,7 +27,7 @@ if ! 
eksctl utils describe-stacks --cluster=$CORTEX_CLUSTER_NAME --region=$CORTE fi eksctl utils write-kubeconfig --cluster=$CORTEX_CLUSTER_NAME --region=$CORTEX_REGION | grep -v "saved kubeconfig as" | grep -v "using region" | grep -v "eksctl version" || true -out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/aws/security#running-cortex-cluster-commands-from-different-iam-users"; exit 1; fi +out=$(kubectl get pods 2>&1 || true); if [[ "$out" == *"must be logged in to the server"* ]]; then echo "error: your aws iam user does not have access to this cluster; to grant access, see https://docs.cortex.dev/v/${CORTEX_VERSION_MINOR}/"; exit 1; fi kubectl get -n=default configmap cluster-config -o yaml >> cluster_configmap.yaml python refresh_cluster_config.py cluster_configmap.yaml tmp_cluster_config.yaml diff --git a/pkg/lib/docker/errors.go b/pkg/lib/docker/errors.go index fe7483e41b..d12033731a 100644 --- a/pkg/lib/docker/errors.go +++ b/pkg/lib/docker/errors.go @@ -81,7 +81,7 @@ func ErrorImageInaccessible(image string, providerType types.ProviderType, cause } case types.AWSProviderType: if strings.Contains(cause.Error(), "authorized") || strings.Contains(cause.Error(), "authentication") { - message += fmt.Sprintf("\n\nif you would like to use a private docker registry, see https://docs.cortex.dev/v/%s/guides/private-docker", consts.CortexVersionMinor) + message += fmt.Sprintf("\n\nif you would like to use a private docker registry, see https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor) } } diff --git a/pkg/lib/k8s/errors.go b/pkg/lib/k8s/errors.go index d47a644f4f..0fd3ff0f69 100644 --- a/pkg/lib/k8s/errors.go +++ b/pkg/lib/k8s/errors.go @@ -63,6 +63,6 @@ func ErrorParseAnnotation(annotationName string, annotationVal string, desiredTy func ErrorParseQuantity(qtyStr string) 
error { return errors.WithStack(&errors.Error{ Kind: ErrParseQuantity, - Message: fmt.Sprintf("%s: invalid kubernetes quantity, some valid examples are 1, 200m, 500Mi, 2G (see here for more information: https://docs.cortex.dev/v/%s/advanced/compute)", qtyStr, consts.CortexVersionMinor), + Message: fmt.Sprintf("%s: invalid kubernetes quantity, some valid examples are 1, 200m, 500Mi, 2G (see here for more information: https://docs.cortex.dev/v/%s/)", qtyStr, consts.CortexVersionMinor), }) } diff --git a/pkg/operator/endpoints/errors.go b/pkg/operator/endpoints/errors.go index 061df2e0bf..b4ba8ccfaa 100644 --- a/pkg/operator/endpoints/errors.go +++ b/pkg/operator/endpoints/errors.go @@ -42,7 +42,7 @@ const ( func ErrorAPIVersionMismatch(operatorVersion string, clientVersion string) error { return errors.WithStack(&errors.Error{ Kind: ErrAPIVersionMismatch, - Message: fmt.Sprintf("your CLI version (%s) doesn't match your Cortex operator version (%s); please update your cluster by following the instructions at https://docs.cortex.dev/update, or update your CLI (pip install cortex==%s)", clientVersion, operatorVersion, operatorVersion), + Message: fmt.Sprintf("your CLI version (%s) doesn't match your Cortex operator version (%s); please update your cluster by following the instructions at https://docs.cortex.dev, or update your CLI (pip install cortex==%s)", clientVersion, operatorVersion, operatorVersion), }) } diff --git a/pkg/operator/endpoints/submit_job.go b/pkg/operator/endpoints/submit_job.go index 44bc1f606c..e3eff9bd63 100644 --- a/pkg/operator/endpoints/submit_job.go +++ b/pkg/operator/endpoints/submit_job.go @@ -60,7 +60,7 @@ func SubmitJob(w http.ResponseWriter, r *http.Request) { err = json.Unmarshal(bodyBytes, &submission) if err != nil { - respondError(w, r, errors.Append(err, fmt.Sprintf("\n\njob submission schema can be found at https://docs.cortex.dev/v/%s/deployments/batch-api/endpoints", consts.CortexVersionMinor))) + respondError(w, r, 
errors.Append(err, fmt.Sprintf("\n\njob submission schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor)))
 		return
 	}
diff --git a/pkg/operator/resources/batchapi/validations.go b/pkg/operator/resources/batchapi/validations.go
index 13323161b6..e1e7ebf84c 100644
--- a/pkg/operator/resources/batchapi/validations.go
+++ b/pkg/operator/resources/batchapi/validations.go
@@ -86,7 +86,7 @@ func validateJobSubmissionSchema(submission *schema.JobSubmission) error {
 func validateJobSubmission(submission *schema.JobSubmission) error {
 	err := validateJobSubmissionSchema(submission)
 	if err != nil {
-		return errors.Append(err, fmt.Sprintf("\n\njob submission schema can be found at https://docs.cortex.dev/v/%s/deployments/batch-api/endpoints", consts.CortexVersionMinor))
+		return errors.Append(err, fmt.Sprintf("\n\njob submission schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 	}
 
 	if submission.FilePathLister != nil {
diff --git a/pkg/operator/resources/resources.go b/pkg/operator/resources/resources.go
index fb98a2244e..13c0bfe750 100644
--- a/pkg/operator/resources/resources.go
+++ b/pkg/operator/resources/resources.go
@@ -101,7 +101,7 @@ func Deploy(projectBytes []byte, configFileName string, configBytes []byte, forc
 	err = ValidateClusterAPIs(apiConfigs, projectFiles)
 	if err != nil {
-		err = errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found here:\n → Realtime API: https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration\n → Batch API: https://docs.cortex.dev/v/%s/deployments/batch-api/api-configuration\n → Traffic Splitter: https://docs.cortex.dev/v/%s/deployments/realtime-api/traffic-splitter", consts.CortexVersionMinor, consts.CortexVersionMinor, consts.CortexVersionMinor))
+		err = errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		return nil, err
 	}
 
@@ -252,7 +252,7 @@ func patchAPI(apiConfig *userconfig.API, configFileName string, force bool) (*sp
 	err = ValidateClusterAPIs([]userconfig.API{*apiConfig}, projectFiles)
 	if err != nil {
-		err = errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found here:\n → Realtime API: https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration\n → Batch API: https://docs.cortex.dev/v/%s/deployments/batch-api/api-configuration\n → Traffic Splitter: https://docs.cortex.dev/v/%s/deployments/realtime-api/traffic-splitter", consts.CortexVersionMinor, consts.CortexVersionMinor, consts.CortexVersionMinor))
+		err = errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found here:\n → Realtime API: https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		return nil, "", err
 	}
 
diff --git a/pkg/types/spec/errors.go b/pkg/types/spec/errors.go
index 937c60f6a7..db4cd568dc 100644
--- a/pkg/types/spec/errors.go
+++ b/pkg/types/spec/errors.go
@@ -100,14 +100,14 @@ var _modelCurrentStructure = `
 func ErrorMalformedConfig() error {
 	return errors.WithStack(&errors.Error{
 		Kind:    ErrMalformedConfig,
-		Message: fmt.Sprintf("cortex YAML configuration files must contain a list of maps (see https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration for Realtime API documentation and see https://docs.cortex.dev/v/%s/deployments/batch-api/api-configuration for Batch API documentation)", consts.CortexVersionMinor, consts.CortexVersionMinor),
+		Message: fmt.Sprintf("cortex YAML configuration files must contain a list of maps (see https://docs.cortex.dev/v/%s/ for api configuration schema)", consts.CortexVersionMinor),
 	})
 }
 
 func ErrorNoAPIs() error {
 	return errors.WithStack(&errors.Error{
 		Kind:    ErrNoAPIs,
-		Message: fmt.Sprintf("at least one API must be configured (see https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration for Realtime API documentation and see https://docs.cortex.dev/v/%s/deployments/batch-api/api-configuration for Batch API documentation)", consts.CortexVersionMinor, consts.CortexVersionMinor),
+		Message: fmt.Sprintf("at least one API must be configured (see https://docs.cortex.dev/v/%s/ for api configuration schema)", consts.CortexVersionMinor),
 	})
 }
 
diff --git a/pkg/types/spec/validations.go b/pkg/types/spec/validations.go
index 46dcb5ae23..1522416d98 100644
--- a/pkg/types/spec/validations.go
+++ b/pkg/types/spec/validations.go
@@ -641,14 +641,7 @@ func ExtractAPIConfigs(
 			kindString, _ := data[userconfig.KindKey].(string)
 			kind := userconfig.KindFromString(kindString)
 			err = errors.Wrap(errors.FirstError(errs...), userconfig.IdentifyAPI(configFileName, name, kind, i))
-			switch provider {
-			case types.LocalProviderType:
-				return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Realtime APIs can be found at https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration", consts.CortexVersionMinor))
-			case types.AWSProviderType:
-				return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found here:\n → Realtime API: https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration\n → Batch API: https://docs.cortex.dev/v/%s/deployments/batch-api/api-configuration\n → Traffic Splitter: https://docs.cortex.dev/v/%s/deployments/realtime-api/traffic-splitter", consts.CortexVersionMinor, consts.CortexVersionMinor, consts.CortexVersionMinor))
-			case types.GCPProviderType:
-				return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Realtime APIs can be found at https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration", consts.CortexVersionMinor))
-			}
+			return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		}
 
 		if resourceStruct.Kind == userconfig.BatchAPIKind || resourceStruct.Kind == userconfig.TrafficSplitterKind {
@@ -663,14 +656,7 @@ func ExtractAPIConfigs(
 			kindString, _ := data[userconfig.KindKey].(string)
 			kind := userconfig.KindFromString(kindString)
 			err = errors.Wrap(errors.FirstError(errs...), userconfig.IdentifyAPI(configFileName, name, kind, i))
-			switch kind {
-			case userconfig.RealtimeAPIKind:
-				return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Realtime API can be found at https://docs.cortex.dev/v/%s/deployments/realtime-api/api-configuration", consts.CortexVersionMinor))
-			case userconfig.BatchAPIKind:
-				return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Batch API can be found at https://docs.cortex.dev/v/%s/deployments/batch-api/api-configuration", consts.CortexVersionMinor))
-			case userconfig.TrafficSplitterKind:
-				return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema for Traffic Splitter can be found at https://docs.cortex.dev/v/%s/deployments/realtime-api/traffic-splitter", consts.CortexVersionMinor))
-			}
+			return nil, errors.Append(err, fmt.Sprintf("\n\napi configuration schema can be found at https://docs.cortex.dev/v/%s/", consts.CortexVersionMinor))
 		}
 
 		api.Index = i
 		api.FileName = configFileName
diff --git a/pkg/workloads/cortex/client/cortex/client.py b/pkg/workloads/cortex/client/cortex/client.py
index 27722bcaa2..e1af803e5a 100644
--- a/pkg/workloads/cortex/client/cortex/client.py
+++ b/pkg/workloads/cortex/client/cortex/client.py
@@ -44,7 +44,7 @@ def __init__(self, env: dict):
         self.env = env
         self.env_name = env["name"]
 
-    # CORTEX_VERSION_MINOR x5
+    # CORTEX_VERSION_MINOR
     def create_api(
         self,
         api_spec: dict,
@@ -59,13 +59,8 @@ def create_api(
         Deploy an API.
 
         Args:
-            api_spec: A dictionary defining a single Cortex API. Schema can be found here:
-                → Realtime API: https://docs.cortex.dev/v/master/deployments/realtime-api/api-configuration
-                → Batch API: https://docs.cortex.dev/v/master/deployments/batch-api/api-configuration
-                → Traffic Splitter: https://docs.cortex.dev/v/master/deployments/realtime-api/traffic-splitter
+            api_spec: A dictionary defining a single Cortex API. See https://docs.cortex.dev/v/master/ for schema.
             predictor: A Cortex Predictor class implementation. Not required when deploying a traffic splitter.
-                → Realtime API: https://docs.cortex.dev/v/master/deployments/realtime-api/predictors
-                → Batch API: https://docs.cortex.dev/v/master/deployments/batch-api/predictors
             requirements: A list of PyPI dependencies that will be installed before the predictor class implementation is invoked.
             conda_packages: A list of Conda dependencies that will be installed before the predictor class implementation is invoked.
             project_dir: Path to a python project.
diff --git a/pkg/workloads/cortex/serve/init/bootloader.sh b/pkg/workloads/cortex/serve/init/bootloader.sh
index d119fdf0ff..7ea38ff92f 100755
--- a/pkg/workloads/cortex/serve/init/bootloader.sh
+++ b/pkg/workloads/cortex/serve/init/bootloader.sh
@@ -21,9 +21,9 @@ export EXPECTED_CORTEX_VERSION=master
 
 if [ "$CORTEX_VERSION" != "$EXPECTED_CORTEX_VERSION" ]; then
     if [ "$CORTEX_PROVIDER" == "local" ]; then
-        echo "error: your Cortex CLI version ($CORTEX_VERSION) doesn't match your predictor image version ($EXPECTED_CORTEX_VERSION); please update your predictor image by modifying the \`image\` field in your API configuration file (e.g. cortex.yaml) and re-running \`cortex deploy\`, or update your CLI by following the instructions at https://docs.cortex.dev/update"
+        echo "error: your Cortex CLI version ($CORTEX_VERSION) doesn't match your predictor image version ($EXPECTED_CORTEX_VERSION); please update your predictor image by modifying the \`image\` field in your API configuration file (e.g. cortex.yaml) and re-running \`cortex deploy\`, or update your CLI by following the instructions at https://docs.cortex.dev/"
     else
-        echo "error: your Cortex operator version ($CORTEX_VERSION) doesn't match your predictor image version ($EXPECTED_CORTEX_VERSION); please update your predictor image by modifying the \`image\` field in your API configuration file (e.g. cortex.yaml) and re-running \`cortex deploy\`, or update your cluster by following the instructions at https://docs.cortex.dev/update"
+        echo "error: your Cortex operator version ($CORTEX_VERSION) doesn't match your predictor image version ($EXPECTED_CORTEX_VERSION); please update your predictor image by modifying the \`image\` field in your API configuration file (e.g. cortex.yaml) and re-running \`cortex deploy\`, or update your cluster by following the instructions at https://docs.cortex.dev/"
     fi
     exit 1
 fi
diff --git a/examples/README.md b/test/README.md
similarity index 100%
rename from examples/README.md
rename to test/README.md
diff --git a/examples/batch/image-classifier/README.md b/test/batch/image-classifier/README.md
similarity index 99%
rename from examples/batch/image-classifier/README.md
rename to test/batch/image-classifier/README.md
index 03cc827d35..3d62908e52 100644
--- a/examples/batch/image-classifier/README.md
+++ b/test/batch/image-classifier/README.md
@@ -105,7 +105,7 @@ class PythonPredictor:
         )
 ```
 
-Here are the complete [Predictor docs](../../../docs/deployments/batch-api/predictors.md).
+Here are the complete [Predictor docs](../../../docs/workloads/batch/predictors.md).
@@ -140,7 +140,7 @@ Create a `cortex.yaml` file and add the configuration below. An `api` with `kind
       cpu: 1
 ```
 
-Here are the complete [API configuration docs](../../../docs/deployments/batch-api/api-configuration.md).
+Here are the complete [API configuration docs](../../../docs/workloads/batch/configuration.md).
diff --git a/examples/batch/image-classifier/cortex.yaml b/test/batch/image-classifier/cortex.yaml
similarity index 100%
rename from examples/batch/image-classifier/cortex.yaml
rename to test/batch/image-classifier/cortex.yaml
diff --git a/examples/batch/image-classifier/predictor.py b/test/batch/image-classifier/predictor.py
similarity index 100%
rename from examples/batch/image-classifier/predictor.py
rename to test/batch/image-classifier/predictor.py
diff --git a/examples/batch/image-classifier/requirements.txt b/test/batch/image-classifier/requirements.txt
similarity index 100%
rename from examples/batch/image-classifier/requirements.txt
rename to test/batch/image-classifier/requirements.txt
diff --git a/examples/batch/image-classifier/sample.json b/test/batch/image-classifier/sample.json
similarity index 100%
rename from examples/batch/image-classifier/sample.json
rename to test/batch/image-classifier/sample.json
diff --git a/examples/batch/onnx/README.md b/test/batch/onnx/README.md
similarity index 100%
rename from examples/batch/onnx/README.md
rename to test/batch/onnx/README.md
diff --git a/examples/batch/onnx/cortex.yaml b/test/batch/onnx/cortex.yaml
similarity index 100%
rename from examples/batch/onnx/cortex.yaml
rename to test/batch/onnx/cortex.yaml
diff --git a/examples/batch/onnx/predictor.py b/test/batch/onnx/predictor.py
similarity index 100%
rename from examples/batch/onnx/predictor.py
rename to test/batch/onnx/predictor.py
diff --git a/examples/batch/onnx/requirements.txt b/test/batch/onnx/requirements.txt
similarity index 100%
rename from examples/batch/onnx/requirements.txt
rename to test/batch/onnx/requirements.txt
diff --git a/examples/batch/tensorflow/README.md b/test/batch/tensorflow/README.md
similarity index 100%
rename from examples/batch/tensorflow/README.md
rename to test/batch/tensorflow/README.md
diff --git a/examples/batch/tensorflow/cortex.yaml b/test/batch/tensorflow/cortex.yaml
similarity index 100%
rename from examples/batch/tensorflow/cortex.yaml
rename to test/batch/tensorflow/cortex.yaml
diff --git a/examples/batch/tensorflow/predictor.py b/test/batch/tensorflow/predictor.py
similarity index 100%
rename from examples/batch/tensorflow/predictor.py
rename to test/batch/tensorflow/predictor.py
diff --git a/examples/batch/tensorflow/requirements.txt b/test/batch/tensorflow/requirements.txt
similarity index 100%
rename from examples/batch/tensorflow/requirements.txt
rename to test/batch/tensorflow/requirements.txt
diff --git a/examples/keras/document-denoiser/README.md b/test/keras/document-denoiser/README.md
similarity index 100%
rename from examples/keras/document-denoiser/README.md
rename to test/keras/document-denoiser/README.md
diff --git a/examples/keras/document-denoiser/cortex.yaml b/test/keras/document-denoiser/cortex.yaml
similarity index 100%
rename from examples/keras/document-denoiser/cortex.yaml
rename to test/keras/document-denoiser/cortex.yaml
diff --git a/examples/keras/document-denoiser/predictor.py b/test/keras/document-denoiser/predictor.py
similarity index 100%
rename from examples/keras/document-denoiser/predictor.py
rename to test/keras/document-denoiser/predictor.py
diff --git a/examples/keras/document-denoiser/requirements.txt b/test/keras/document-denoiser/requirements.txt
similarity index 100%
rename from examples/keras/document-denoiser/requirements.txt
rename to test/keras/document-denoiser/requirements.txt
diff --git a/examples/keras/document-denoiser/sample.json b/test/keras/document-denoiser/sample.json
similarity index 100%
rename from examples/keras/document-denoiser/sample.json
rename to test/keras/document-denoiser/sample.json
diff --git a/examples/keras/document-denoiser/trainer.ipynb b/test/keras/document-denoiser/trainer.ipynb
similarity index 100%
rename from examples/keras/document-denoiser/trainer.ipynb
rename to test/keras/document-denoiser/trainer.ipynb
diff --git a/examples/live-reloading/onnx/README.md b/test/live-reloading/onnx/README.md
similarity index 100%
rename from examples/live-reloading/onnx/README.md
rename to test/live-reloading/onnx/README.md
diff --git a/examples/live-reloading/python/mpg-estimator/cortex.yaml b/test/live-reloading/python/mpg-estimator/cortex.yaml
similarity index 100%
rename from examples/live-reloading/python/mpg-estimator/cortex.yaml
rename to test/live-reloading/python/mpg-estimator/cortex.yaml
diff --git a/examples/live-reloading/python/mpg-estimator/predictor.py b/test/live-reloading/python/mpg-estimator/predictor.py
similarity index 100%
rename from examples/live-reloading/python/mpg-estimator/predictor.py
rename to test/live-reloading/python/mpg-estimator/predictor.py
diff --git a/examples/live-reloading/python/mpg-estimator/requirements.txt b/test/live-reloading/python/mpg-estimator/requirements.txt
similarity index 100%
rename from examples/live-reloading/python/mpg-estimator/requirements.txt
rename to test/live-reloading/python/mpg-estimator/requirements.txt
diff --git a/examples/live-reloading/python/mpg-estimator/sample.json b/test/live-reloading/python/mpg-estimator/sample.json
similarity index 100%
rename from examples/live-reloading/python/mpg-estimator/sample.json
rename to test/live-reloading/python/mpg-estimator/sample.json
diff --git a/examples/live-reloading/tensorflow/README.md b/test/live-reloading/tensorflow/README.md
similarity index 100%
rename from examples/live-reloading/tensorflow/README.md
rename to test/live-reloading/tensorflow/README.md
diff --git a/examples/model-caching/onnx/multi-model-classifier/README.md b/test/model-caching/onnx/multi-model-classifier/README.md
similarity index 100%
rename from examples/model-caching/onnx/multi-model-classifier/README.md
rename to test/model-caching/onnx/multi-model-classifier/README.md
diff --git a/examples/model-caching/onnx/multi-model-classifier/cortex.yaml b/test/model-caching/onnx/multi-model-classifier/cortex.yaml
similarity index 100%
rename from examples/model-caching/onnx/multi-model-classifier/cortex.yaml
rename to test/model-caching/onnx/multi-model-classifier/cortex.yaml
diff --git a/examples/model-caching/onnx/multi-model-classifier/predictor.py b/test/model-caching/onnx/multi-model-classifier/predictor.py
similarity index 100%
rename from examples/model-caching/onnx/multi-model-classifier/predictor.py
rename to test/model-caching/onnx/multi-model-classifier/predictor.py
diff --git a/examples/model-caching/onnx/multi-model-classifier/requirements.txt b/test/model-caching/onnx/multi-model-classifier/requirements.txt
similarity index 100%
rename from examples/model-caching/onnx/multi-model-classifier/requirements.txt
rename to test/model-caching/onnx/multi-model-classifier/requirements.txt
diff --git a/examples/model-caching/onnx/multi-model-classifier/sample.json b/test/model-caching/onnx/multi-model-classifier/sample.json
similarity index 100%
rename from examples/model-caching/onnx/multi-model-classifier/sample.json
rename to test/model-caching/onnx/multi-model-classifier/sample.json
diff --git a/examples/model-caching/python/mpg-estimator/README.md b/test/model-caching/python/mpg-estimator/README.md
similarity index 100%
rename from examples/model-caching/python/mpg-estimator/README.md
rename to test/model-caching/python/mpg-estimator/README.md
diff --git a/examples/model-caching/python/mpg-estimator/cortex.yaml b/test/model-caching/python/mpg-estimator/cortex.yaml
similarity index 100%
rename from examples/model-caching/python/mpg-estimator/cortex.yaml
rename to test/model-caching/python/mpg-estimator/cortex.yaml
diff --git a/examples/model-caching/python/mpg-estimator/predictor.py b/test/model-caching/python/mpg-estimator/predictor.py
similarity index 100%
rename from examples/model-caching/python/mpg-estimator/predictor.py
rename to test/model-caching/python/mpg-estimator/predictor.py
diff --git a/examples/model-caching/python/mpg-estimator/requirements.txt b/test/model-caching/python/mpg-estimator/requirements.txt
similarity index 100%
rename from examples/model-caching/python/mpg-estimator/requirements.txt
rename to test/model-caching/python/mpg-estimator/requirements.txt
diff --git a/examples/model-caching/python/mpg-estimator/sample.json b/test/model-caching/python/mpg-estimator/sample.json
similarity index 100%
rename from examples/model-caching/python/mpg-estimator/sample.json
rename to test/model-caching/python/mpg-estimator/sample.json
diff --git a/examples/model-caching/tensorflow/multi-model-classifier/README.md b/test/model-caching/tensorflow/multi-model-classifier/README.md
similarity index 100%
rename from examples/model-caching/tensorflow/multi-model-classifier/README.md
rename to test/model-caching/tensorflow/multi-model-classifier/README.md
diff --git a/examples/model-caching/tensorflow/multi-model-classifier/cortex.yaml b/test/model-caching/tensorflow/multi-model-classifier/cortex.yaml
similarity index 100%
rename from examples/model-caching/tensorflow/multi-model-classifier/cortex.yaml
rename to test/model-caching/tensorflow/multi-model-classifier/cortex.yaml
diff --git a/examples/model-caching/tensorflow/multi-model-classifier/predictor.py b/test/model-caching/tensorflow/multi-model-classifier/predictor.py
similarity index 100%
rename from examples/model-caching/tensorflow/multi-model-classifier/predictor.py
rename to test/model-caching/tensorflow/multi-model-classifier/predictor.py
diff --git a/examples/model-caching/tensorflow/multi-model-classifier/requirements.txt b/test/model-caching/tensorflow/multi-model-classifier/requirements.txt
similarity index 100%
rename from examples/model-caching/tensorflow/multi-model-classifier/requirements.txt
rename to test/model-caching/tensorflow/multi-model-classifier/requirements.txt
diff --git a/examples/model-caching/tensorflow/multi-model-classifier/sample-image.json b/test/model-caching/tensorflow/multi-model-classifier/sample-image.json
similarity index 100%
rename from examples/model-caching/tensorflow/multi-model-classifier/sample-image.json
rename to test/model-caching/tensorflow/multi-model-classifier/sample-image.json
diff --git a/examples/model-caching/tensorflow/multi-model-classifier/sample-iris.json b/test/model-caching/tensorflow/multi-model-classifier/sample-iris.json
similarity index 100%
rename from examples/model-caching/tensorflow/multi-model-classifier/sample-iris.json
rename to test/model-caching/tensorflow/multi-model-classifier/sample-iris.json
diff --git a/examples/onnx/iris-classifier/README.md b/test/onnx/iris-classifier/README.md
similarity index 100%
rename from examples/onnx/iris-classifier/README.md
rename to test/onnx/iris-classifier/README.md
diff --git a/examples/onnx/iris-classifier/cortex.yaml b/test/onnx/iris-classifier/cortex.yaml
similarity index 100%
rename from examples/onnx/iris-classifier/cortex.yaml
rename to test/onnx/iris-classifier/cortex.yaml
diff --git a/examples/onnx/iris-classifier/predictor.py b/test/onnx/iris-classifier/predictor.py
similarity index 100%
rename from examples/onnx/iris-classifier/predictor.py
rename to test/onnx/iris-classifier/predictor.py
diff --git a/examples/onnx/iris-classifier/sample.json b/test/onnx/iris-classifier/sample.json
similarity index 100%
rename from examples/onnx/iris-classifier/sample.json
rename to test/onnx/iris-classifier/sample.json
diff --git a/examples/onnx/iris-classifier/xgboost.ipynb b/test/onnx/iris-classifier/xgboost.ipynb
similarity index 100%
rename from examples/onnx/iris-classifier/xgboost.ipynb
rename to test/onnx/iris-classifier/xgboost.ipynb
diff --git a/examples/onnx/multi-model-classifier/README.md b/test/onnx/multi-model-classifier/README.md
similarity index 100%
rename from examples/onnx/multi-model-classifier/README.md
rename to test/onnx/multi-model-classifier/README.md
diff --git a/examples/onnx/multi-model-classifier/cortex.yaml b/test/onnx/multi-model-classifier/cortex.yaml
similarity index 100%
rename from examples/onnx/multi-model-classifier/cortex.yaml
rename to test/onnx/multi-model-classifier/cortex.yaml
diff --git a/examples/onnx/multi-model-classifier/predictor.py b/test/onnx/multi-model-classifier/predictor.py
similarity index 100%
rename from examples/onnx/multi-model-classifier/predictor.py
rename to test/onnx/multi-model-classifier/predictor.py
diff --git a/examples/onnx/multi-model-classifier/requirements.txt b/test/onnx/multi-model-classifier/requirements.txt
similarity index 100%
rename from examples/onnx/multi-model-classifier/requirements.txt
rename to test/onnx/multi-model-classifier/requirements.txt
diff --git a/examples/onnx/multi-model-classifier/sample.json b/test/onnx/multi-model-classifier/sample.json
similarity index 100%
rename from examples/onnx/multi-model-classifier/sample.json
rename to test/onnx/multi-model-classifier/sample.json
diff --git a/examples/onnx/yolov5-youtube/README.md b/test/onnx/yolov5-youtube/README.md
similarity index 100%
rename from examples/onnx/yolov5-youtube/README.md
rename to test/onnx/yolov5-youtube/README.md
diff --git a/examples/onnx/yolov5-youtube/conda-packages.txt b/test/onnx/yolov5-youtube/conda-packages.txt
similarity index 100%
rename from examples/onnx/yolov5-youtube/conda-packages.txt
rename to test/onnx/yolov5-youtube/conda-packages.txt
diff --git a/examples/onnx/yolov5-youtube/cortex.yaml b/test/onnx/yolov5-youtube/cortex.yaml
similarity index 100%
rename from examples/onnx/yolov5-youtube/cortex.yaml
rename to test/onnx/yolov5-youtube/cortex.yaml
diff --git a/examples/onnx/yolov5-youtube/labels.json b/test/onnx/yolov5-youtube/labels.json
similarity index 100%
rename from examples/onnx/yolov5-youtube/labels.json
rename to test/onnx/yolov5-youtube/labels.json
diff --git a/examples/onnx/yolov5-youtube/predictor.py b/test/onnx/yolov5-youtube/predictor.py
similarity index 100%
rename from examples/onnx/yolov5-youtube/predictor.py
rename to test/onnx/yolov5-youtube/predictor.py
diff --git a/examples/onnx/yolov5-youtube/requirements.txt b/test/onnx/yolov5-youtube/requirements.txt
similarity index 100%
rename from examples/onnx/yolov5-youtube/requirements.txt
rename to test/onnx/yolov5-youtube/requirements.txt
diff --git a/examples/onnx/yolov5-youtube/sample.json b/test/onnx/yolov5-youtube/sample.json
similarity index 100%
rename from examples/onnx/yolov5-youtube/sample.json
rename to test/onnx/yolov5-youtube/sample.json
diff --git a/examples/onnx/yolov5-youtube/utils.py b/test/onnx/yolov5-youtube/utils.py
similarity index 100%
rename from examples/onnx/yolov5-youtube/utils.py
rename to test/onnx/yolov5-youtube/utils.py
diff --git a/examples/pytorch/answer-generator/README.md b/test/pytorch/answer-generator/README.md
similarity index 100%
rename from examples/pytorch/answer-generator/README.md
rename to test/pytorch/answer-generator/README.md
diff --git a/examples/pytorch/answer-generator/cortex.yaml b/test/pytorch/answer-generator/cortex.yaml
similarity index 100%
rename from examples/pytorch/answer-generator/cortex.yaml
rename to test/pytorch/answer-generator/cortex.yaml
diff --git a/examples/pytorch/answer-generator/generator.py b/test/pytorch/answer-generator/generator.py
similarity index 100%
rename from examples/pytorch/answer-generator/generator.py
rename to test/pytorch/answer-generator/generator.py
diff --git a/examples/pytorch/answer-generator/predictor.py b/test/pytorch/answer-generator/predictor.py
similarity index 100%
rename from examples/pytorch/answer-generator/predictor.py
rename to test/pytorch/answer-generator/predictor.py
diff --git a/examples/pytorch/answer-generator/requirements.txt b/test/pytorch/answer-generator/requirements.txt
similarity index 100%
rename from examples/pytorch/answer-generator/requirements.txt
rename to test/pytorch/answer-generator/requirements.txt
diff --git a/examples/pytorch/answer-generator/sample.json b/test/pytorch/answer-generator/sample.json
similarity index 100%
rename from examples/pytorch/answer-generator/sample.json
rename to test/pytorch/answer-generator/sample.json
diff --git a/examples/pytorch/image-classifier-alexnet/README.md b/test/pytorch/image-classifier-alexnet/README.md
similarity index 100%
rename from examples/pytorch/image-classifier-alexnet/README.md
rename to test/pytorch/image-classifier-alexnet/README.md
diff --git a/examples/pytorch/image-classifier-alexnet/cortex.yaml b/test/pytorch/image-classifier-alexnet/cortex.yaml
similarity index 100%
rename from examples/pytorch/image-classifier-alexnet/cortex.yaml
rename to test/pytorch/image-classifier-alexnet/cortex.yaml
diff --git a/examples/pytorch/image-classifier-alexnet/predictor.py b/test/pytorch/image-classifier-alexnet/predictor.py
similarity index 100%
rename from examples/pytorch/image-classifier-alexnet/predictor.py
rename to test/pytorch/image-classifier-alexnet/predictor.py
diff --git a/examples/pytorch/image-classifier-alexnet/requirements.txt b/test/pytorch/image-classifier-alexnet/requirements.txt
similarity index 100%
rename from examples/pytorch/image-classifier-alexnet/requirements.txt
rename to test/pytorch/image-classifier-alexnet/requirements.txt
diff --git a/examples/pytorch/image-classifier-alexnet/sample.json b/test/pytorch/image-classifier-alexnet/sample.json
similarity index 100%
rename from examples/pytorch/image-classifier-alexnet/sample.json
rename to test/pytorch/image-classifier-alexnet/sample.json
diff --git a/examples/pytorch/image-classifier-resnet50/README.md b/test/pytorch/image-classifier-resnet50/README.md
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/README.md
rename to test/pytorch/image-classifier-resnet50/README.md
diff --git a/examples/pytorch/image-classifier-resnet50/cortex.yaml b/test/pytorch/image-classifier-resnet50/cortex.yaml
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/cortex.yaml
rename to test/pytorch/image-classifier-resnet50/cortex.yaml
diff --git a/examples/pytorch/image-classifier-resnet50/cortex_gpu.yaml b/test/pytorch/image-classifier-resnet50/cortex_gpu.yaml
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/cortex_gpu.yaml
rename to test/pytorch/image-classifier-resnet50/cortex_gpu.yaml
diff --git a/examples/pytorch/image-classifier-resnet50/cortex_inf.yaml b/test/pytorch/image-classifier-resnet50/cortex_inf.yaml
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/cortex_inf.yaml
rename to test/pytorch/image-classifier-resnet50/cortex_inf.yaml
diff --git a/examples/pytorch/image-classifier-resnet50/generate_resnet50_models.ipynb b/test/pytorch/image-classifier-resnet50/generate_resnet50_models.ipynb
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/generate_resnet50_models.ipynb
rename to test/pytorch/image-classifier-resnet50/generate_resnet50_models.ipynb
diff --git a/examples/pytorch/image-classifier-resnet50/predictor.py b/test/pytorch/image-classifier-resnet50/predictor.py
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/predictor.py
rename to test/pytorch/image-classifier-resnet50/predictor.py
diff --git a/examples/pytorch/image-classifier-resnet50/sample.json b/test/pytorch/image-classifier-resnet50/sample.json
similarity index 100%
rename from examples/pytorch/image-classifier-resnet50/sample.json
rename to test/pytorch/image-classifier-resnet50/sample.json
diff --git a/examples/pytorch/iris-classifier/README.md b/test/pytorch/iris-classifier/README.md
similarity index 100%
rename from examples/pytorch/iris-classifier/README.md
rename to test/pytorch/iris-classifier/README.md
diff --git a/examples/pytorch/iris-classifier/cortex.yaml b/test/pytorch/iris-classifier/cortex.yaml
similarity index 100%
rename from examples/pytorch/iris-classifier/cortex.yaml
rename to test/pytorch/iris-classifier/cortex.yaml
diff --git a/examples/pytorch/iris-classifier/model.py b/test/pytorch/iris-classifier/model.py
similarity index 100%
rename from examples/pytorch/iris-classifier/model.py
rename to test/pytorch/iris-classifier/model.py
diff --git a/examples/pytorch/iris-classifier/predictor.py b/test/pytorch/iris-classifier/predictor.py
similarity index 100%
rename from examples/pytorch/iris-classifier/predictor.py
rename to test/pytorch/iris-classifier/predictor.py
diff --git a/examples/pytorch/iris-classifier/requirements.txt b/test/pytorch/iris-classifier/requirements.txt
similarity index 100%
rename from examples/pytorch/iris-classifier/requirements.txt
rename to test/pytorch/iris-classifier/requirements.txt
diff --git a/examples/pytorch/iris-classifier/sample.json b/test/pytorch/iris-classifier/sample.json
similarity index 100%
rename from examples/pytorch/iris-classifier/sample.json
rename to test/pytorch/iris-classifier/sample.json
diff --git a/examples/pytorch/language-identifier/README.md b/test/pytorch/language-identifier/README.md
similarity index 100%
rename from examples/pytorch/language-identifier/README.md
rename to test/pytorch/language-identifier/README.md
diff --git a/examples/pytorch/language-identifier/cortex.yaml b/test/pytorch/language-identifier/cortex.yaml
similarity index 100%
rename from examples/pytorch/language-identifier/cortex.yaml
rename to test/pytorch/language-identifier/cortex.yaml
diff --git a/examples/pytorch/language-identifier/predictor.py b/test/pytorch/language-identifier/predictor.py
similarity index 100%
rename from examples/pytorch/language-identifier/predictor.py
rename to test/pytorch/language-identifier/predictor.py
diff --git a/examples/pytorch/language-identifier/requirements.txt b/test/pytorch/language-identifier/requirements.txt
similarity index 100%
rename from examples/pytorch/language-identifier/requirements.txt
rename to test/pytorch/language-identifier/requirements.txt
diff --git a/examples/pytorch/language-identifier/sample.json b/test/pytorch/language-identifier/sample.json
similarity index 100%
rename from examples/pytorch/language-identifier/sample.json
rename to test/pytorch/language-identifier/sample.json
diff --git a/examples/pytorch/multi-model-text-analyzer/README.md b/test/pytorch/multi-model-text-analyzer/README.md
similarity index 100%
rename from examples/pytorch/multi-model-text-analyzer/README.md
rename to test/pytorch/multi-model-text-analyzer/README.md
diff --git a/examples/pytorch/multi-model-text-analyzer/cortex.yaml b/test/pytorch/multi-model-text-analyzer/cortex.yaml
similarity index 100%
rename from examples/pytorch/multi-model-text-analyzer/cortex.yaml
rename to test/pytorch/multi-model-text-analyzer/cortex.yaml
diff --git a/examples/pytorch/multi-model-text-analyzer/predictor.py b/test/pytorch/multi-model-text-analyzer/predictor.py
similarity index 100%
rename from examples/pytorch/multi-model-text-analyzer/predictor.py
rename to test/pytorch/multi-model-text-analyzer/predictor.py
diff --git a/examples/pytorch/multi-model-text-analyzer/requirements.txt b/test/pytorch/multi-model-text-analyzer/requirements.txt
similarity index 100%
rename from examples/pytorch/multi-model-text-analyzer/requirements.txt
rename to test/pytorch/multi-model-text-analyzer/requirements.txt
diff --git a/examples/pytorch/multi-model-text-analyzer/sample-sentiment.json b/test/pytorch/multi-model-text-analyzer/sample-sentiment.json
similarity index 100%
rename from examples/pytorch/multi-model-text-analyzer/sample-sentiment.json
rename to test/pytorch/multi-model-text-analyzer/sample-sentiment.json
diff --git a/examples/pytorch/multi-model-text-analyzer/sample-summarizer.json b/test/pytorch/multi-model-text-analyzer/sample-summarizer.json
similarity index 100%
rename from examples/pytorch/multi-model-text-analyzer/sample-summarizer.json
rename to test/pytorch/multi-model-text-analyzer/sample-summarizer.json
diff --git a/examples/pytorch/object-detector/README.md b/test/pytorch/object-detector/README.md
similarity index 100%
rename from examples/pytorch/object-detector/README.md
rename to test/pytorch/object-detector/README.md
diff --git a/examples/pytorch/object-detector/coco_labels.txt b/test/pytorch/object-detector/coco_labels.txt
similarity index 100%
rename from examples/pytorch/object-detector/coco_labels.txt
rename to test/pytorch/object-detector/coco_labels.txt
diff --git a/examples/pytorch/object-detector/cortex.yaml b/test/pytorch/object-detector/cortex.yaml
similarity index 100%
rename from examples/pytorch/object-detector/cortex.yaml
rename to test/pytorch/object-detector/cortex.yaml
diff --git a/examples/pytorch/object-detector/predictor.py b/test/pytorch/object-detector/predictor.py
similarity index 100%
rename from examples/pytorch/object-detector/predictor.py
rename to test/pytorch/object-detector/predictor.py
diff --git a/examples/pytorch/object-detector/requirements.txt b/test/pytorch/object-detector/requirements.txt
similarity index 100%
rename from examples/pytorch/object-detector/requirements.txt
rename to test/pytorch/object-detector/requirements.txt
diff --git a/examples/pytorch/object-detector/sample.json b/test/pytorch/object-detector/sample.json
similarity index 100%
rename from examples/pytorch/object-detector/sample.json
rename to test/pytorch/object-detector/sample.json
diff --git a/examples/pytorch/question-generator/cortex.yaml b/test/pytorch/question-generator/cortex.yaml
similarity index 100%
rename from examples/pytorch/question-generator/cortex.yaml
rename to test/pytorch/question-generator/cortex.yaml
diff --git a/examples/pytorch/question-generator/dependencies.sh b/test/pytorch/question-generator/dependencies.sh
similarity index 100%
rename from examples/pytorch/question-generator/dependencies.sh
rename to test/pytorch/question-generator/dependencies.sh
diff --git a/examples/pytorch/question-generator/predictor.py b/test/pytorch/question-generator/predictor.py
similarity index 100%
rename from examples/pytorch/question-generator/predictor.py
rename to test/pytorch/question-generator/predictor.py
diff --git a/examples/pytorch/question-generator/requirements.txt b/test/pytorch/question-generator/requirements.txt
similarity index 100%
rename from examples/pytorch/question-generator/requirements.txt
rename to test/pytorch/question-generator/requirements.txt
diff --git a/examples/pytorch/question-generator/sample.json b/test/pytorch/question-generator/sample.json
similarity index 100%
rename from examples/pytorch/question-generator/sample.json
rename to test/pytorch/question-generator/sample.json
diff --git a/examples/pytorch/reading-comprehender/README.md b/test/pytorch/reading-comprehender/README.md
similarity index 100%
rename from examples/pytorch/reading-comprehender/README.md
rename to test/pytorch/reading-comprehender/README.md
diff --git a/examples/pytorch/reading-comprehender/cortex.yaml b/test/pytorch/reading-comprehender/cortex.yaml
similarity index 100%
rename from examples/pytorch/reading-comprehender/cortex.yaml
rename to test/pytorch/reading-comprehender/cortex.yaml
diff --git a/examples/pytorch/reading-comprehender/predictor.py b/test/pytorch/reading-comprehender/predictor.py
similarity index 100%
rename from examples/pytorch/reading-comprehender/predictor.py
rename to test/pytorch/reading-comprehender/predictor.py
diff --git a/examples/pytorch/reading-comprehender/requirements.txt b/test/pytorch/reading-comprehender/requirements.txt
similarity index 100%
rename from examples/pytorch/reading-comprehender/requirements.txt
rename to test/pytorch/reading-comprehender/requirements.txt
diff --git a/examples/pytorch/reading-comprehender/sample.json b/test/pytorch/reading-comprehender/sample.json
similarity index 100%
rename from examples/pytorch/reading-comprehender/sample.json
rename to test/pytorch/reading-comprehender/sample.json
diff --git a/examples/pytorch/search-completer/README.md b/test/pytorch/search-completer/README.md
similarity index 100%
rename from examples/pytorch/search-completer/README.md
rename to test/pytorch/search-completer/README.md diff --git a/examples/pytorch/search-completer/cortex.yaml b/test/pytorch/search-completer/cortex.yaml similarity index 100% rename from examples/pytorch/search-completer/cortex.yaml rename to test/pytorch/search-completer/cortex.yaml diff --git a/examples/pytorch/search-completer/predictor.py b/test/pytorch/search-completer/predictor.py similarity index 100% rename from examples/pytorch/search-completer/predictor.py rename to test/pytorch/search-completer/predictor.py diff --git a/examples/pytorch/search-completer/requirements.txt b/test/pytorch/search-completer/requirements.txt similarity index 100% rename from examples/pytorch/search-completer/requirements.txt rename to test/pytorch/search-completer/requirements.txt diff --git a/examples/pytorch/search-completer/sample.json b/test/pytorch/search-completer/sample.json similarity index 100% rename from examples/pytorch/search-completer/sample.json rename to test/pytorch/search-completer/sample.json diff --git a/examples/pytorch/sentiment-analyzer/README.md b/test/pytorch/sentiment-analyzer/README.md similarity index 100% rename from examples/pytorch/sentiment-analyzer/README.md rename to test/pytorch/sentiment-analyzer/README.md diff --git a/examples/pytorch/sentiment-analyzer/cortex.yaml b/test/pytorch/sentiment-analyzer/cortex.yaml similarity index 100% rename from examples/pytorch/sentiment-analyzer/cortex.yaml rename to test/pytorch/sentiment-analyzer/cortex.yaml diff --git a/examples/pytorch/sentiment-analyzer/predictor.py b/test/pytorch/sentiment-analyzer/predictor.py similarity index 100% rename from examples/pytorch/sentiment-analyzer/predictor.py rename to test/pytorch/sentiment-analyzer/predictor.py diff --git a/examples/pytorch/sentiment-analyzer/requirements.txt b/test/pytorch/sentiment-analyzer/requirements.txt similarity index 100% rename from examples/pytorch/sentiment-analyzer/requirements.txt rename to 
test/pytorch/sentiment-analyzer/requirements.txt diff --git a/examples/pytorch/sentiment-analyzer/sample.json b/test/pytorch/sentiment-analyzer/sample.json similarity index 100% rename from examples/pytorch/sentiment-analyzer/sample.json rename to test/pytorch/sentiment-analyzer/sample.json diff --git a/examples/pytorch/text-generator/README.md b/test/pytorch/text-generator/README.md similarity index 100% rename from examples/pytorch/text-generator/README.md rename to test/pytorch/text-generator/README.md diff --git a/examples/pytorch/text-generator/deploy.ipynb b/test/pytorch/text-generator/deploy.ipynb similarity index 100% rename from examples/pytorch/text-generator/deploy.ipynb rename to test/pytorch/text-generator/deploy.ipynb diff --git a/examples/pytorch/text-generator/predictor.py b/test/pytorch/text-generator/predictor.py similarity index 100% rename from examples/pytorch/text-generator/predictor.py rename to test/pytorch/text-generator/predictor.py diff --git a/examples/pytorch/text-generator/requirements.txt b/test/pytorch/text-generator/requirements.txt similarity index 100% rename from examples/pytorch/text-generator/requirements.txt rename to test/pytorch/text-generator/requirements.txt diff --git a/examples/pytorch/text-summarizer/README.md b/test/pytorch/text-summarizer/README.md similarity index 100% rename from examples/pytorch/text-summarizer/README.md rename to test/pytorch/text-summarizer/README.md diff --git a/examples/pytorch/text-summarizer/cortex.yaml b/test/pytorch/text-summarizer/cortex.yaml similarity index 100% rename from examples/pytorch/text-summarizer/cortex.yaml rename to test/pytorch/text-summarizer/cortex.yaml diff --git a/examples/pytorch/text-summarizer/predictor.py b/test/pytorch/text-summarizer/predictor.py similarity index 100% rename from examples/pytorch/text-summarizer/predictor.py rename to test/pytorch/text-summarizer/predictor.py diff --git a/examples/pytorch/text-summarizer/requirements.txt 
b/test/pytorch/text-summarizer/requirements.txt similarity index 100% rename from examples/pytorch/text-summarizer/requirements.txt rename to test/pytorch/text-summarizer/requirements.txt diff --git a/examples/pytorch/text-summarizer/sample.json b/test/pytorch/text-summarizer/sample.json similarity index 100% rename from examples/pytorch/text-summarizer/sample.json rename to test/pytorch/text-summarizer/sample.json diff --git a/examples/sklearn/iris-classifier/README.md b/test/sklearn/iris-classifier/README.md similarity index 100% rename from examples/sklearn/iris-classifier/README.md rename to test/sklearn/iris-classifier/README.md diff --git a/examples/sklearn/iris-classifier/cortex.yaml b/test/sklearn/iris-classifier/cortex.yaml similarity index 100% rename from examples/sklearn/iris-classifier/cortex.yaml rename to test/sklearn/iris-classifier/cortex.yaml diff --git a/examples/sklearn/iris-classifier/predictor.py b/test/sklearn/iris-classifier/predictor.py similarity index 100% rename from examples/sklearn/iris-classifier/predictor.py rename to test/sklearn/iris-classifier/predictor.py diff --git a/examples/sklearn/iris-classifier/requirements.txt b/test/sklearn/iris-classifier/requirements.txt similarity index 100% rename from examples/sklearn/iris-classifier/requirements.txt rename to test/sklearn/iris-classifier/requirements.txt diff --git a/examples/sklearn/iris-classifier/sample.json b/test/sklearn/iris-classifier/sample.json similarity index 100% rename from examples/sklearn/iris-classifier/sample.json rename to test/sklearn/iris-classifier/sample.json diff --git a/examples/sklearn/iris-classifier/trainer.py b/test/sklearn/iris-classifier/trainer.py similarity index 100% rename from examples/sklearn/iris-classifier/trainer.py rename to test/sklearn/iris-classifier/trainer.py diff --git a/examples/sklearn/mpg-estimator/README.md b/test/sklearn/mpg-estimator/README.md similarity index 100% rename from examples/sklearn/mpg-estimator/README.md rename to 
test/sklearn/mpg-estimator/README.md diff --git a/examples/sklearn/mpg-estimator/cortex.yaml b/test/sklearn/mpg-estimator/cortex.yaml similarity index 100% rename from examples/sklearn/mpg-estimator/cortex.yaml rename to test/sklearn/mpg-estimator/cortex.yaml diff --git a/examples/sklearn/mpg-estimator/predictor.py b/test/sklearn/mpg-estimator/predictor.py similarity index 100% rename from examples/sklearn/mpg-estimator/predictor.py rename to test/sklearn/mpg-estimator/predictor.py diff --git a/examples/sklearn/mpg-estimator/requirements.txt b/test/sklearn/mpg-estimator/requirements.txt similarity index 100% rename from examples/sklearn/mpg-estimator/requirements.txt rename to test/sklearn/mpg-estimator/requirements.txt diff --git a/examples/sklearn/mpg-estimator/sample.json b/test/sklearn/mpg-estimator/sample.json similarity index 100% rename from examples/sklearn/mpg-estimator/sample.json rename to test/sklearn/mpg-estimator/sample.json diff --git a/examples/sklearn/mpg-estimator/trainer.py b/test/sklearn/mpg-estimator/trainer.py similarity index 100% rename from examples/sklearn/mpg-estimator/trainer.py rename to test/sklearn/mpg-estimator/trainer.py diff --git a/examples/spacy/entity-recognizer/README.md b/test/spacy/entity-recognizer/README.md similarity index 100% rename from examples/spacy/entity-recognizer/README.md rename to test/spacy/entity-recognizer/README.md diff --git a/examples/spacy/entity-recognizer/cortex.yaml b/test/spacy/entity-recognizer/cortex.yaml similarity index 100% rename from examples/spacy/entity-recognizer/cortex.yaml rename to test/spacy/entity-recognizer/cortex.yaml diff --git a/examples/spacy/entity-recognizer/predictor.py b/test/spacy/entity-recognizer/predictor.py similarity index 100% rename from examples/spacy/entity-recognizer/predictor.py rename to test/spacy/entity-recognizer/predictor.py diff --git a/examples/spacy/entity-recognizer/requirements.txt b/test/spacy/entity-recognizer/requirements.txt similarity index 100% 
rename from examples/spacy/entity-recognizer/requirements.txt rename to test/spacy/entity-recognizer/requirements.txt diff --git a/examples/spacy/entity-recognizer/sample.json b/test/spacy/entity-recognizer/sample.json similarity index 100% rename from examples/spacy/entity-recognizer/sample.json rename to test/spacy/entity-recognizer/sample.json diff --git a/examples/tensorflow/image-classifier-inception/README.md b/test/tensorflow/image-classifier-inception/README.md similarity index 100% rename from examples/tensorflow/image-classifier-inception/README.md rename to test/tensorflow/image-classifier-inception/README.md diff --git a/examples/tensorflow/image-classifier-inception/cortex.yaml b/test/tensorflow/image-classifier-inception/cortex.yaml similarity index 100% rename from examples/tensorflow/image-classifier-inception/cortex.yaml rename to test/tensorflow/image-classifier-inception/cortex.yaml diff --git a/examples/tensorflow/image-classifier-inception/cortex_server_side_batching.yaml b/test/tensorflow/image-classifier-inception/cortex_server_side_batching.yaml similarity index 100% rename from examples/tensorflow/image-classifier-inception/cortex_server_side_batching.yaml rename to test/tensorflow/image-classifier-inception/cortex_server_side_batching.yaml diff --git a/examples/tensorflow/image-classifier-inception/inception.ipynb b/test/tensorflow/image-classifier-inception/inception.ipynb similarity index 100% rename from examples/tensorflow/image-classifier-inception/inception.ipynb rename to test/tensorflow/image-classifier-inception/inception.ipynb diff --git a/examples/tensorflow/image-classifier-inception/predictor.py b/test/tensorflow/image-classifier-inception/predictor.py similarity index 100% rename from examples/tensorflow/image-classifier-inception/predictor.py rename to test/tensorflow/image-classifier-inception/predictor.py diff --git a/examples/tensorflow/image-classifier-inception/requirements.txt 
b/test/tensorflow/image-classifier-inception/requirements.txt similarity index 100% rename from examples/tensorflow/image-classifier-inception/requirements.txt rename to test/tensorflow/image-classifier-inception/requirements.txt diff --git a/examples/tensorflow/image-classifier-inception/sample.json b/test/tensorflow/image-classifier-inception/sample.json similarity index 100% rename from examples/tensorflow/image-classifier-inception/sample.json rename to test/tensorflow/image-classifier-inception/sample.json diff --git a/examples/tensorflow/image-classifier-resnet50/README.md b/test/tensorflow/image-classifier-resnet50/README.md similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/README.md rename to test/tensorflow/image-classifier-resnet50/README.md diff --git a/examples/tensorflow/image-classifier-resnet50/cortex.yaml b/test/tensorflow/image-classifier-resnet50/cortex.yaml similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/cortex.yaml rename to test/tensorflow/image-classifier-resnet50/cortex.yaml diff --git a/examples/tensorflow/image-classifier-resnet50/cortex_gpu.yaml b/test/tensorflow/image-classifier-resnet50/cortex_gpu.yaml similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/cortex_gpu.yaml rename to test/tensorflow/image-classifier-resnet50/cortex_gpu.yaml diff --git a/examples/tensorflow/image-classifier-resnet50/cortex_gpu_server_side_batching.yaml b/test/tensorflow/image-classifier-resnet50/cortex_gpu_server_side_batching.yaml similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/cortex_gpu_server_side_batching.yaml rename to test/tensorflow/image-classifier-resnet50/cortex_gpu_server_side_batching.yaml diff --git a/examples/tensorflow/image-classifier-resnet50/cortex_inf.yaml b/test/tensorflow/image-classifier-resnet50/cortex_inf.yaml similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/cortex_inf.yaml rename to 
test/tensorflow/image-classifier-resnet50/cortex_inf.yaml diff --git a/examples/tensorflow/image-classifier-resnet50/cortex_inf_server_side_batching.yaml b/test/tensorflow/image-classifier-resnet50/cortex_inf_server_side_batching.yaml similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/cortex_inf_server_side_batching.yaml rename to test/tensorflow/image-classifier-resnet50/cortex_inf_server_side_batching.yaml diff --git a/examples/tensorflow/image-classifier-resnet50/generate_gpu_resnet50_model.ipynb b/test/tensorflow/image-classifier-resnet50/generate_gpu_resnet50_model.ipynb similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/generate_gpu_resnet50_model.ipynb rename to test/tensorflow/image-classifier-resnet50/generate_gpu_resnet50_model.ipynb diff --git a/examples/tensorflow/image-classifier-resnet50/generate_resnet50_models.ipynb b/test/tensorflow/image-classifier-resnet50/generate_resnet50_models.ipynb similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/generate_resnet50_models.ipynb rename to test/tensorflow/image-classifier-resnet50/generate_resnet50_models.ipynb diff --git a/examples/tensorflow/image-classifier-resnet50/predictor.py b/test/tensorflow/image-classifier-resnet50/predictor.py similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/predictor.py rename to test/tensorflow/image-classifier-resnet50/predictor.py diff --git a/examples/tensorflow/image-classifier-resnet50/requirements.txt b/test/tensorflow/image-classifier-resnet50/requirements.txt similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/requirements.txt rename to test/tensorflow/image-classifier-resnet50/requirements.txt diff --git a/examples/tensorflow/image-classifier-resnet50/sample.bin b/test/tensorflow/image-classifier-resnet50/sample.bin similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/sample.bin rename to 
test/tensorflow/image-classifier-resnet50/sample.bin diff --git a/examples/tensorflow/image-classifier-resnet50/sample.json b/test/tensorflow/image-classifier-resnet50/sample.json similarity index 100% rename from examples/tensorflow/image-classifier-resnet50/sample.json rename to test/tensorflow/image-classifier-resnet50/sample.json diff --git a/examples/tensorflow/iris-classifier/README.md b/test/tensorflow/iris-classifier/README.md similarity index 100% rename from examples/tensorflow/iris-classifier/README.md rename to test/tensorflow/iris-classifier/README.md diff --git a/examples/tensorflow/iris-classifier/cortex.yaml b/test/tensorflow/iris-classifier/cortex.yaml similarity index 100% rename from examples/tensorflow/iris-classifier/cortex.yaml rename to test/tensorflow/iris-classifier/cortex.yaml diff --git a/examples/tensorflow/iris-classifier/predictor.py b/test/tensorflow/iris-classifier/predictor.py similarity index 100% rename from examples/tensorflow/iris-classifier/predictor.py rename to test/tensorflow/iris-classifier/predictor.py diff --git a/examples/tensorflow/iris-classifier/sample.json b/test/tensorflow/iris-classifier/sample.json similarity index 100% rename from examples/tensorflow/iris-classifier/sample.json rename to test/tensorflow/iris-classifier/sample.json diff --git a/examples/tensorflow/iris-classifier/tensorflow.ipynb b/test/tensorflow/iris-classifier/tensorflow.ipynb similarity index 100% rename from examples/tensorflow/iris-classifier/tensorflow.ipynb rename to test/tensorflow/iris-classifier/tensorflow.ipynb diff --git a/examples/tensorflow/license-plate-reader/README.md b/test/tensorflow/license-plate-reader/README.md similarity index 100% rename from examples/tensorflow/license-plate-reader/README.md rename to test/tensorflow/license-plate-reader/README.md diff --git a/examples/tensorflow/license-plate-reader/config.json b/test/tensorflow/license-plate-reader/config.json similarity index 100% rename from 
examples/tensorflow/license-plate-reader/config.json rename to test/tensorflow/license-plate-reader/config.json diff --git a/examples/tensorflow/license-plate-reader/cortex_full.yaml b/test/tensorflow/license-plate-reader/cortex_full.yaml similarity index 100% rename from examples/tensorflow/license-plate-reader/cortex_full.yaml rename to test/tensorflow/license-plate-reader/cortex_full.yaml diff --git a/examples/tensorflow/license-plate-reader/cortex_lite.yaml b/test/tensorflow/license-plate-reader/cortex_lite.yaml similarity index 100% rename from examples/tensorflow/license-plate-reader/cortex_lite.yaml rename to test/tensorflow/license-plate-reader/cortex_lite.yaml diff --git a/examples/tensorflow/license-plate-reader/predictor_crnn.py b/test/tensorflow/license-plate-reader/predictor_crnn.py similarity index 100% rename from examples/tensorflow/license-plate-reader/predictor_crnn.py rename to test/tensorflow/license-plate-reader/predictor_crnn.py diff --git a/examples/tensorflow/license-plate-reader/predictor_lite.py b/test/tensorflow/license-plate-reader/predictor_lite.py similarity index 100% rename from examples/tensorflow/license-plate-reader/predictor_lite.py rename to test/tensorflow/license-plate-reader/predictor_lite.py diff --git a/examples/tensorflow/license-plate-reader/predictor_yolo.py b/test/tensorflow/license-plate-reader/predictor_yolo.py similarity index 100% rename from examples/tensorflow/license-plate-reader/predictor_yolo.py rename to test/tensorflow/license-plate-reader/predictor_yolo.py diff --git a/examples/tensorflow/license-plate-reader/requirements.txt b/test/tensorflow/license-plate-reader/requirements.txt similarity index 100% rename from examples/tensorflow/license-plate-reader/requirements.txt rename to test/tensorflow/license-plate-reader/requirements.txt diff --git a/examples/tensorflow/license-plate-reader/sample_inference.py b/test/tensorflow/license-plate-reader/sample_inference.py similarity index 100% rename from 
examples/tensorflow/license-plate-reader/sample_inference.py rename to test/tensorflow/license-plate-reader/sample_inference.py diff --git a/examples/tensorflow/license-plate-reader/utils/__init__.py b/test/tensorflow/license-plate-reader/utils/__init__.py similarity index 100% rename from examples/tensorflow/license-plate-reader/utils/__init__.py rename to test/tensorflow/license-plate-reader/utils/__init__.py diff --git a/examples/tensorflow/license-plate-reader/utils/bbox.py b/test/tensorflow/license-plate-reader/utils/bbox.py similarity index 100% rename from examples/tensorflow/license-plate-reader/utils/bbox.py rename to test/tensorflow/license-plate-reader/utils/bbox.py diff --git a/examples/tensorflow/license-plate-reader/utils/colors.py b/test/tensorflow/license-plate-reader/utils/colors.py similarity index 100% rename from examples/tensorflow/license-plate-reader/utils/colors.py rename to test/tensorflow/license-plate-reader/utils/colors.py diff --git a/examples/tensorflow/license-plate-reader/utils/preprocess.py b/test/tensorflow/license-plate-reader/utils/preprocess.py similarity index 100% rename from examples/tensorflow/license-plate-reader/utils/preprocess.py rename to test/tensorflow/license-plate-reader/utils/preprocess.py diff --git a/examples/tensorflow/license-plate-reader/utils/utils.py b/test/tensorflow/license-plate-reader/utils/utils.py similarity index 100% rename from examples/tensorflow/license-plate-reader/utils/utils.py rename to test/tensorflow/license-plate-reader/utils/utils.py diff --git a/examples/tensorflow/multi-model-classifier/README.md b/test/tensorflow/multi-model-classifier/README.md similarity index 100% rename from examples/tensorflow/multi-model-classifier/README.md rename to test/tensorflow/multi-model-classifier/README.md diff --git a/examples/tensorflow/multi-model-classifier/cortex.yaml b/test/tensorflow/multi-model-classifier/cortex.yaml similarity index 100% rename from 
examples/tensorflow/multi-model-classifier/cortex.yaml rename to test/tensorflow/multi-model-classifier/cortex.yaml diff --git a/examples/tensorflow/multi-model-classifier/predictor.py b/test/tensorflow/multi-model-classifier/predictor.py similarity index 100% rename from examples/tensorflow/multi-model-classifier/predictor.py rename to test/tensorflow/multi-model-classifier/predictor.py diff --git a/examples/tensorflow/multi-model-classifier/requirements.txt b/test/tensorflow/multi-model-classifier/requirements.txt similarity index 100% rename from examples/tensorflow/multi-model-classifier/requirements.txt rename to test/tensorflow/multi-model-classifier/requirements.txt diff --git a/examples/tensorflow/multi-model-classifier/sample-image.json b/test/tensorflow/multi-model-classifier/sample-image.json similarity index 100% rename from examples/tensorflow/multi-model-classifier/sample-image.json rename to test/tensorflow/multi-model-classifier/sample-image.json diff --git a/examples/tensorflow/multi-model-classifier/sample-iris.json b/test/tensorflow/multi-model-classifier/sample-iris.json similarity index 100% rename from examples/tensorflow/multi-model-classifier/sample-iris.json rename to test/tensorflow/multi-model-classifier/sample-iris.json diff --git a/examples/tensorflow/sentiment-analyzer/README.md b/test/tensorflow/sentiment-analyzer/README.md similarity index 100% rename from examples/tensorflow/sentiment-analyzer/README.md rename to test/tensorflow/sentiment-analyzer/README.md diff --git a/examples/tensorflow/sentiment-analyzer/bert.ipynb b/test/tensorflow/sentiment-analyzer/bert.ipynb similarity index 100% rename from examples/tensorflow/sentiment-analyzer/bert.ipynb rename to test/tensorflow/sentiment-analyzer/bert.ipynb diff --git a/examples/tensorflow/sentiment-analyzer/cortex.yaml b/test/tensorflow/sentiment-analyzer/cortex.yaml similarity index 100% rename from examples/tensorflow/sentiment-analyzer/cortex.yaml rename to 
test/tensorflow/sentiment-analyzer/cortex.yaml diff --git a/examples/tensorflow/sentiment-analyzer/predictor.py b/test/tensorflow/sentiment-analyzer/predictor.py similarity index 100% rename from examples/tensorflow/sentiment-analyzer/predictor.py rename to test/tensorflow/sentiment-analyzer/predictor.py diff --git a/examples/tensorflow/sentiment-analyzer/requirements.txt b/test/tensorflow/sentiment-analyzer/requirements.txt similarity index 100% rename from examples/tensorflow/sentiment-analyzer/requirements.txt rename to test/tensorflow/sentiment-analyzer/requirements.txt diff --git a/examples/tensorflow/sentiment-analyzer/sample.json b/test/tensorflow/sentiment-analyzer/sample.json similarity index 100% rename from examples/tensorflow/sentiment-analyzer/sample.json rename to test/tensorflow/sentiment-analyzer/sample.json diff --git a/examples/tensorflow/text-generator/README.md b/test/tensorflow/text-generator/README.md similarity index 100% rename from examples/tensorflow/text-generator/README.md rename to test/tensorflow/text-generator/README.md diff --git a/examples/tensorflow/text-generator/cortex.yaml b/test/tensorflow/text-generator/cortex.yaml similarity index 100% rename from examples/tensorflow/text-generator/cortex.yaml rename to test/tensorflow/text-generator/cortex.yaml diff --git a/examples/tensorflow/text-generator/encoder.py b/test/tensorflow/text-generator/encoder.py similarity index 100% rename from examples/tensorflow/text-generator/encoder.py rename to test/tensorflow/text-generator/encoder.py diff --git a/examples/tensorflow/text-generator/gpt-2.ipynb b/test/tensorflow/text-generator/gpt-2.ipynb similarity index 100% rename from examples/tensorflow/text-generator/gpt-2.ipynb rename to test/tensorflow/text-generator/gpt-2.ipynb diff --git a/examples/tensorflow/text-generator/predictor.py b/test/tensorflow/text-generator/predictor.py similarity index 100% rename from examples/tensorflow/text-generator/predictor.py rename to 
test/tensorflow/text-generator/predictor.py diff --git a/examples/tensorflow/text-generator/requirements.txt b/test/tensorflow/text-generator/requirements.txt similarity index 100% rename from examples/tensorflow/text-generator/requirements.txt rename to test/tensorflow/text-generator/requirements.txt diff --git a/examples/tensorflow/text-generator/sample.json b/test/tensorflow/text-generator/sample.json similarity index 100% rename from examples/tensorflow/text-generator/sample.json rename to test/tensorflow/text-generator/sample.json diff --git a/examples/traffic-splitter/README.md b/test/traffic-splitter/README.md similarity index 100% rename from examples/traffic-splitter/README.md rename to test/traffic-splitter/README.md diff --git a/examples/traffic-splitter/cortex.yaml b/test/traffic-splitter/cortex.yaml similarity index 100% rename from examples/traffic-splitter/cortex.yaml rename to test/traffic-splitter/cortex.yaml diff --git a/examples/traffic-splitter/model.py b/test/traffic-splitter/model.py similarity index 100% rename from examples/traffic-splitter/model.py rename to test/traffic-splitter/model.py diff --git a/examples/traffic-splitter/onnx_predictor.py b/test/traffic-splitter/onnx_predictor.py similarity index 100% rename from examples/traffic-splitter/onnx_predictor.py rename to test/traffic-splitter/onnx_predictor.py diff --git a/examples/traffic-splitter/pytorch_predictor.py b/test/traffic-splitter/pytorch_predictor.py similarity index 100% rename from examples/traffic-splitter/pytorch_predictor.py rename to test/traffic-splitter/pytorch_predictor.py diff --git a/examples/traffic-splitter/sample.json b/test/traffic-splitter/sample.json similarity index 100% rename from examples/traffic-splitter/sample.json rename to test/traffic-splitter/sample.json diff --git a/examples/utils/README.md b/test/utils/README.md similarity index 100% rename from examples/utils/README.md rename to test/utils/README.md diff --git a/examples/utils/throughput_test.py 
b/test/utils/throughput_test.py similarity index 100% rename from examples/utils/throughput_test.py rename to test/utils/throughput_test.py