Flyte is a workflow automation platform for complex, mission-critical data and ML processes at scale
Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable and maintainable workflows for Machine Learning and Data Processing. It is a fabric that connects disparate computation backends using a type safe data dependency graph. It records all changes to a pipeline, making it possible to rewind time. It also stores a history of all executions and provides an intuitive UI, CLI and REST/gRPC API to interact with the computation.
Flyte is more than a workflow engine -- it uses a workflow as a core concept and a task (a single unit of execution) as a top level concept. Multiple tasks arranged in a data producer-consumer order create a workflow.
Workflows and Tasks can be written in any language, with out of the box support for Python, Java and Scala.
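To make the task and workflow idea concrete, here is a minimal sketch in plain Python. It does not use flytekit or any Flyte API; the `Task` class and `connect` helper are hypothetical names invented for this illustration. It only shows how tasks with typed interfaces can be wired in producer-consumer order, and how a mismatched connection can be rejected when the pipeline is declared rather than when it runs.

```python
from typing import get_type_hints


class Task:
    """Hypothetical stand-in for a task: one unit of execution with a typed interface."""

    def __init__(self, fn):
        self.fn = fn
        self.name = fn.__name__
        hints = get_type_hints(fn)
        # The return annotation is the task's output type; the rest are its inputs.
        self.output = hints.pop("return")
        self.inputs = hints

    def __call__(self, **kwargs):
        return self.fn(**kwargs)


def connect(producer: Task, consumer: Task, param: str):
    """Feed the producer's output into one consumer input.

    The type check runs here, at declaration time, before anything executes.
    """
    expected = consumer.inputs[param]
    if producer.output is not expected:
        raise TypeError(
            f"{producer.name} produces {producer.output.__name__}, "
            f"but {consumer.name}.{param} expects {expected.__name__}"
        )

    def pipeline(**kwargs):
        return consumer(**{param: producer(**kwargs)})

    return pipeline


@Task
def count_words(text: str) -> int:
    return len(text.split())


@Task
def describe(count: int) -> str:
    return f"{count} words"


@Task
def shout(text: str) -> str:
    return text.upper()


# A two-task workflow: count_words produces an int that describe consumes.
word_report = connect(count_words, describe, "count")
print(word_report(text="type safe data dependency graph"))  # prints "5 words"
```

Calling `connect(count_words, shout, "text")` raises a `TypeError` immediately, because `count_words` produces an `int` while `shout` expects a `str`. No task code runs before the error is reported.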
- Kubernetes-Native Workflow Automation Platform
- Ergonomic SDKs in Python, Java & Scala
- Versioned & Auditable
- Reproducible Pipelines
- Strong Data Typing
With Docker installed, run the following command:
docker run --rm --privileged -p 30081:30081 -p 30084:30084 cr.flyte.org/flyteorg/flyte-sandbox
This creates a local Flyte sandbox. Once the sandbox is ready, you should see the following message: Flyte is ready! Flyte UI is available at http://localhost:30081/console.
Visit http://localhost:30081/console to view the Flyte dashboard.
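If you would rather script the wait than watch the container output, the following is a small sketch using only the Python standard library. It assumes the console URL above answers with HTTP 200 once the sandbox is up, which is an assumption rather than documented behavior. The helper names `http_probe` and `wait_until_ready` are made up for this example.

```python
import time
import urllib.error
import urllib.request

CONSOLE_URL = "http://localhost:30081/console"


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url: str = CONSOLE_URL, probe=http_probe,
                     attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll the URL until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        # Sleep between attempts, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

With the defaults, `wait_until_ready()` polls for roughly a minute and returns `True` as soon as the console responds. The `probe` parameter exists so the polling logic can be exercised without a running sandbox.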
Here's a quick visual tour of the console.
To dig deeper into Flyte, refer to the Documentation.
- Used at Scale in production by 500+ users at Lyft with more than 1 million executions and 40+ million container executions per month
- A data aware platform
- Enables collaboration across your organization by:
  - Executing distributed data pipelines/workflows
  - Reusing tasks across projects, users, and workflows
  - Making it easy to stitch together workflows from different teams and domain experts
  - Backtracing to a specified workflow
  - Comparing results of training workflows over time and across pipelines
  - Sharing workflows and tasks across your teams
  - Simplifying the complexity of multi-step, multi-owner workflows
- Quick registration -- start locally and scale to the cloud instantly
- Centralized Inventory constituting Tasks, Workflows and Executions
- gRPC / REST interface to define and execute tasks and workflows
- Type safe construction of pipelines -- each task has an interface which is characterized by its input and output, so illegal construction of pipelines fails during declaration rather than at runtime
- Supports multiple data types for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps, etc.
- Memoization and Lineage tracking
- Provides logging and observability
- Workflow features:
  - Start with one task, convert to a pipeline, attach multiple schedules, trigger using a programmatic API, or on-demand
  - Parallel step execution
  - Extensible backend to add customized plugin experience (with simplified user experience)
  - Branching
  - Inline subworkflows (a workflow can be embedded within one node of the top level workflow)
  - Distributed remote child workflows (a remote workflow can be triggered and statically verified at compile time)
  - Array Tasks (map a function over a large dataset -- ensures controlled execution of thousands of containers)
  - Dynamic workflow creation and execution with runtime type safety
  - Container side plugins with first class support in Python
  - PreAlpha: Arbitrary flytekit-less containers supported (RawContainer)
- Guaranteed reproducibility of pipelines via:
  - Versioned data, code and models
  - Automatically tracked executions
  - Declarative pipelines
- Multi cloud support (AWS, GCP and others)
- Extensible core, modularized, and deep observability
- No single point of failure; resilient by design
- Automated notifications to Slack, Email, and Pagerduty
- Multi K8s cluster support
- Out of the box support to run Spark jobs on K8s, Hive queries, etc.
- Snappy Console
- Python CLI and Golang CLI (flytectl)
- Written in Golang and optimized for the performance of large running jobs
- Grafana templates (user/system observability)
- Helm chart for Flyte (coming soon - June)
- Flink-K8s (coming soon - June)
- One click deploy to AWS
- Reactive pipelines & Events
- Containers
- K8s Pods
- AWS Batch Arrays
- K8s Pod Arrays
- K8s Spark (native Pyspark and Java/Scala)
- AWS Athena
- Qubole Hive
- Presto Queries
- Distributed Pytorch (K8s Native) -- Pytorch Operator
- Sagemaker (builtin algorithms & custom models)
- Distributed Tensorflow (K8s Native) -- TFOperator
- Papermill notebook execution (Python and Spark)
- Type-safe data checking for Pandas DataFrames using Pandera
- Versioned datastores using DoltHub and Dolt
- Use SQLAlchemy to query any relational database
- Build your own plugins that use library containers
Repo | Language | Purpose | Status |
---|---|---|---|
flyte | Kustomize,RST | deployment, documentation, issues | Production-grade |
flyteidl | Protobuf | interface definitions | Production-grade |
flytepropeller | Go | execution engine | Production-grade |
flyteadmin | Go | control plane | Production-grade |
flytekit | Python | python SDK and tools | Production-grade |
flyteconsole | Typescript | admin console | Production-grade |
datacatalog | Go | manage input & output artifacts | Production-grade |
flyteplugins | Go | flyte plugins | Production-grade |
flytestdlib | Go | standard library | Production-grade |
flytesnacks | Python | examples, tips, and tricks | Incubating |
flytekit-java | Java/Scala | Java & Scala SDK for authoring Flyte workflows | Incubating |
flytectl | Go | A standalone Flyte CLI | Incomplete |
Repo | Language | Purpose |
---|---|---|
Spark | Go | Apache Spark batch |
Flink | Go | Apache Flink streaming |
Here are some resources to help you learn more about Flyte.
- 📣 Flyte OSS Community Sync happens every other Tuesday, 9am-10am PDT (check out the events calendar). Here's the zoom link.
- Meeting notes and backlog of topics are captured in doc.
- If you'd like to revisit any community sync meeting that has happened, you can access the video recordings.
- Kubecon 2019 - Flyte: Cloud Native Machine Learning and Data Processing Platform video | deck
- Kubecon 2019 - Running LargeScale Stateful workloads on Kubernetes at Lyft video
- re:invent 2019 - Implementing ML workflows with Kubernetes and Amazon Sagemaker video
- Cloud-native machine learning at Lyft with AWS Batch and Amazon EKS video
- OSS + ELC NA 2020 splash
- Datacouncil video | splash
- FB AI@Scale Making MLOps & DataOps a reality
- GAIC 2020
- TWIML&AI - Scalable and Maintainable ML Workflows at Lyft - Flyte
- Software Engineering Daily - Flyte: Lyft Data Processing Platform
- MLOps Coffee session - Flyte: an open-source tool for scalable, extensible, and portable workflows
A big thank you to the community for making Flyte possible!
- @wild-endeavor
- @katrogan
- @EngHabu
- @akhurana001
- @anandswaminathan
- @kanterov
- @honnix
- @jeevb
- @jonathanburns
- @migueltol22
- @varshaparthay
- @pingsutw
- @narape
- @lu4nm3
- @bnsblue
- @RubenBarragan
- @schottra
- @evalsocket
- @matthewphsmith
- @slai
- @derwiki
- @tnsetting
- @jbrambleDC
- @igorvalko
- @chanadian
- @surindersinghp
- @vsbus
- @catalinii
- @kumare3