Production infrastructure for machine learning at scale

Website • Slack • Docs

Deploy, manage, and scale machine learning models in production.

Serverless workloads

Realtime - respond to requests in real time and autoscale based on in-flight request volume.

Async - process requests asynchronously and autoscale based on request queue length.

Batch - run distributed, fault-tolerant batch processing jobs on demand.

Autoscaling - elastically scale clusters with CPU and GPU instances.

Spot instances - run workloads on spot instances with automated on-demand backups.

Environments - create multiple clusters with different configurations.

Provisioning - provision clusters with declarative configuration or a Terraform provider.

Metrics - send metrics to any monitoring tool or use pre-built Grafana dashboards.

Logs - stream logs to any log management tool or use the pre-built CloudWatch integration.

EKS - Cortex runs on top of EKS to scale workloads reliably and cost-effectively.

VPC - deploy clusters into a VPC on your AWS account to keep your data private.

IAM - integrate with IAM for authentication and authorization workflows.
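
To make the autoscaling signals for Realtime and Async workloads more concrete, here is a minimal sketch of load-based replica sizing. It is a generic illustration and not Cortex's actual autoscaling algorithm: the function `desired_replicas` and the parameters `target_per_replica`, `min_replicas`, and `max_replicas` are hypothetical names chosen for this example.

```python
import math


def desired_replicas(load, target_per_replica, min_replicas=1, max_replicas=10):
    """Return how many replicas are needed so that each one handles at most
    `target_per_replica` units of load, clamped to [min_replicas, max_replicas].

    Hypothetical helper for illustration only; not part of any Cortex API.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    needed = math.ceil(load / target_per_replica)
    return max(min_replicas, min(max_replicas, needed))


# Realtime-style signal: `load` is the number of in-flight requests.
realtime_replicas = desired_replicas(load=45, target_per_replica=8)

# Async-style signal: `load` is the request queue length.
# The raw result (12) exceeds max_replicas, so it is clamped to 8.
async_replicas = desired_replicas(load=120, target_per_replica=10, max_replicas=8)

# With no load, the count never drops below min_replicas.
idle_replicas = desired_replicas(load=0, target_per_replica=8)

print(realtime_replicas, async_replicas, idle_replicas)
```

The same function covers both workload types because only the meaning of `load` changes: in-flight requests for Realtime, queue length for Async. A production autoscaler would typically also smooth the signal over a time window and rate-limit scale-down to avoid thrashing.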
