Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rearrangement and Toctree updates #1200

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
4 changes: 2 additions & 2 deletions docs/bioinformatics_examples.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Flyte very much supports running your bioinformatics applications. Dive deeper i
:header-rows: 0
:widths: 20 30

* - {doc}`Nucleotide Sequence Querying with BLASTX <auto_examples/blast/index>`
* - {doc}`Nucleotide Sequence Querying with BLASTX <examples/blast/README>`
- Use BLASTX to Query a Nucleotide Sequence Against a Local Protein Database
```

Expand All @@ -19,5 +19,5 @@ Flyte very much supports running your bioinformatics applications. Dive deeper i
:caption: Contents
:hidden:

auto_examples/blast/index
examples/blast/README
```
2 changes: 1 addition & 1 deletion docs/environment_setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -170,4 +170,4 @@ image you want to use with the `--image` option in `pyflyte run`.

## What's next?

Try out the examples in the {doc}`Basics <auto_examples/basics/index>` section.
Try out the examples in the {doc}`Basics <examples/basics/README>` section.
21 changes: 21 additions & 0 deletions docs/examples/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# Flyte Examples

This directory contains a series of example projects that demonstrate how to use
Flyte. The basic structure of the examples is as follows:

```{code-block} bash
example_project
├── README.md # High-level description of the example project
├── Dockerfile # Dockerfile for packaging up the project requirements
├── requirements.in # Minimal python requirements for the project
├── requirements.txt # Compiled python requirements using pip-compile
└── example_project # Python package containing examples with the same name as the project
   ├── __init__.py
   ├── example_01.py
   ├── example_02.py
   ├── ...
   └── example_n.py
```

These example projects are meant to be stand-alone projects that can be built
and run by themselves.
33 changes: 33 additions & 0 deletions docs/examples/advanced_composition/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# ######################
# NOTE: For CI/CD only #
########################
FROM python:3.11-slim-buster
LABEL org.opencontainers.image.source https://github.com/flyteorg/flytesnacks

WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root

# This is necessary for opencv to work
RUN apt-get update && apt-get install -y libsm6 libxext6 libxrender-dev ffmpeg build-essential curl

WORKDIR /root

# Virtual environment
ENV VENV /opt/venv
RUN python3 -m venv ${VENV}
ENV PATH="${VENV}/bin:$PATH"

# Install Python dependencies
COPY requirements.txt /root
RUN pip install -r /root/requirements.txt

# Copy the actual code
COPY . /root

# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
23 changes: 23 additions & 0 deletions docs/examples/advanced_composition/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
(advanced_composition)=

# Advanced Composition

This section of the user guide introduces the advanced features of the flytekit Python SDK.
These examples cover more complex aspects of Flyte, including conditions, subworkflows,
dynamic workflows, map tasks, gate nodes and more.

```{auto-examples-toc}
files
folders
conditions
chain_entities
subworkflows
dynamics
map_task
merge_sort
eager_workflows
decorating_tasks
decorating_workflows
checkpoint
waiting_for_external_inputs
```
Empty file.
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
# %% [markdown]
# (chain_flyte_entities)=
#
# # Chain Flyte Entities
#
# ```{eval-rst}
# .. tags:: Basic
# ```
#
# flytekit provides a mechanism to chain Flyte entities using the `>>` operator.
#
# ## Tasks
#
# Let's enforce an order for `t1()` to happen after `t0()`, and for `t2()` to happen after `t1()`.
#
# Import the necessary dependencies.
# %%
from flytekit import task, workflow


@task
def t2():
pass


@task
def t1():
pass


@task
def t0():
pass


# %% [markdown]
# We want to enforce an order here: `t0()` followed by `t1()` followed by `t2()`.
# %%
@workflow
def chain_tasks_wf():
t2_promise = t2()
t1_promise = t1()
t0_promise = t0()

t0_promise >> t1_promise
t1_promise >> t2_promise


# %% [markdown]
# ## Chain SubWorkflows
#
# Similar to tasks, you can chain {ref}`subworkflows <subworkflows>`.
# %%
@workflow
def sub_workflow_1():
t1()


@workflow
def sub_workflow_0():
t0()


# %% [markdown]
# Use `>>` to chain the subworkflows.
# %%
@workflow
def chain_workflows_wf():
sub_wf1 = sub_workflow_1()
sub_wf0 = sub_workflow_0()

sub_wf0 >> sub_wf1


# %% [markdown]
# Run the workflows locally.
#
# %%
if __name__ == "__main__":
print(f"Running {__file__} main...")
print(f"Running chain_tasks_wf()... {chain_tasks_wf()}")
print(f"Running chain_workflows_wf()... {chain_workflows_wf()}")
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# %% [markdown]
# # Intratask Checkpoints
#
# ```{eval-rst}
# .. tags:: MachineLearning, Intermediate
# ```
#
# :::{note}
# This feature is available from Flytekit version 0.30.0b6+ and needs a Flyte backend version of at least 0.19.0+.
# :::
#
# A checkpoint recovers a task from a previous failure by recording the state of a task before the failure and
# resuming from the latest recorded state.
#
# ## Why Intra-task Checkpoints?
#
# Flyte, at its core, is a workflow engine. Workflows provide a way to break up an operation/program/idea
# logically into smaller tasks. If a task fails, the workflow does not need to run the previously completed tasks. It can
# simply retry the task that failed. Eventually, when the task succeeds, it will not run again. Thus, task boundaries
# naturally serve as checkpoints.
#
# There are cases where it is not easy or desirable to break a task into smaller tasks, because running a task
# adds to the overhead. This is true when running a large computation in a tight-loop. In such cases, users can
# split each loop iteration into its own task using {ref}`dynamic workflows <Dynamic Workflows>`, but the overhead of spawning new tasks, recording
# intermediate results, and reconstituting the state can be expensive.
#
# ### Model-training Use Case
#
# An example of this case is model training. Running multiple epochs or different iterations with the same
# dataset can take a long time, but the bootstrap time may be high and creating task boundaries can be expensive.
#
# To tackle this, Flyte offers a way to checkpoint progress within a task execution as a file or a set of files. These
# checkpoints can be written synchronously or asynchronously. In case of failure, the checkpoint file can be re-read to resume
# most of the state without re-running the entire task. This opens up the opportunity to use alternate compute systems with
# lower guarantees like [AWS Spot Instances](https://aws.amazon.com/ec2/spot/), [GCP Pre-emptible Instances](https://cloud.google.com/compute/docs/instances/preemptible), etc.
#
# These instances offer great performance at much lower price-points as compared to their on-demand or reserved alternatives.
# This is possible if you construct the tasks in a fault-tolerant manner. In most cases, when the task runs for a short duration,
# e.g., less than 10 minutes, the potential of failure is insignificant and task-boundary-based recovery offers
# significant fault-tolerance to ensure successful completion.
#
# But as the time for a task increases, the cost of re-running it increases, and reduces the chances of successful
# completion. This is where Flyte's intra-task checkpointing truly shines.
#
# Let's look at an example of how to develop tasks which utilize intra-task checkpointing. It only provides the low-level API, though. We intend to integrate
# higher-level checkpointing APIs available in popular training frameworks like Keras, Pytorch, Scikit-learn, and
# big-data frameworks like Spark and Flink to supercharge their fault-tolerance.

# %%
from flytekit import current_context, task, workflow
from flytekit.exceptions.user import FlyteRecoverableException

RETRIES = 3


# %% [markdown]
# This task shows how checkpoints can help resume execution in case of a failure. This is an example task and shows the API for
# the checkpointer. The checkpoint system exposes other APIs. For a detailed understanding, refer to the [checkpointer code](https://github.com/flyteorg/flytekit/blob/master/flytekit/core/checkpointer.py).
#
# The goal of this method is to loop for exactly n_iterations, checkpointing state and recovering from simualted failures.
# %%
@task(retries=RETRIES)
def use_checkpoint(n_iterations: int) -> int:
cp = current_context().checkpoint
prev = cp.read()
start = 0
if prev:
start = int(prev.decode())

# create a failure interval so we can create failures for across 'n' iterations and then succeed after
# configured retries
failure_interval = n_iterations // RETRIES
i = 0
for i in range(start, n_iterations):
# simulate a deterministic failure, for demonstration. We want to show how it eventually completes within
# the given retries
if i > start and i % failure_interval == 0:
raise FlyteRecoverableException(f"Failed at iteration {i}, failure_interval {failure_interval}")
# save progress state. It is also entirely possible save state every few intervals.
cp.write(f"{i + 1}".encode())

return i


# %% [markdown]
# The workflow here simply calls the task. The task itself
# will be retried for the {ref}`FlyteRecoverableException <flytekit:exception_handling>`.
# %%
@workflow
def example(n_iterations: int) -> int:
return use_checkpoint(n_iterations=n_iterations)


# %% [markdown]
# The checkpoint is stored locally, but it is not used since retries are not supported.
#
# %%
if __name__ == "__main__":
try:
example(n_iterations=10)
except RuntimeError as e: # noqa : F841
# no retries are performed, so an exception is expected when run locally.
pass
Loading