In [1]:
!pip install txtai[all] > /dev/null

Workflows are a simple yet powerful construct that takes a callable and returns elements. Workflows operate well with pipelines but can work with any callable object. Workflows are streaming and work on data in batches, allowing large volumes of data to be processed efficiently.
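
The batching idea can be illustrated without the library. The following is a minimal pure-Python sketch, not txtai's implementation; the helper names `batches` and `run` are hypothetical and exist only for this example.

```python
from itertools import islice


def batches(elements, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(elements)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run(action, elements, size=2):
    """Apply `action` to each batch and stream the results one at a time."""
    for batch in batches(elements, size):
        yield from action(batch)


# The input is a generator, so only one batch is held in memory at a time
results = list(run(lambda x: [y * 2 for y in x], (n for n in range(5))))
print(results)
```

Because `run` is itself a generator, nothing is computed until the caller iterates over it.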

Workflows combine machine-learning pipelines to aggregate logic. This application provides a number of pre-configured workflows to give a feel for how they work. Workflows can be exported and run locally through FastAPI. Read more on GitHub and in the Docs.

In [None]:
mult_2 = lambda x: [y * 2 for y in x]

In [None]:
from txtai.workflow import Task, Workflow

workflow = Workflow([Task(mult_2)])
list(workflow([1, 2, 3]))

[2, 4, 6]

In [None]:
from txtai.pipeline import Summary

summary = Summary()
new_task = Task(summary)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


In [None]:
workflow = Workflow([new_task])

list(workflow(["Very long text here"]))

['Very long text here']

In [None]:
workflow = Workflow([Task([lambda x: [y * 3 for y in x], 
                           lambda x: [y - 1 for y in x]],
                           unpack=False, column={0:0, 1:1})])
list(workflow([(2, 8),(4, 18)]))

[(6, 7), (12, 17)]
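
The output above is consistent with each action receiving one column of the input and the results being joined back into rows. The following pure-Python sketch reproduces that result under this assumption; `run_actions` is a hypothetical helper, not a txtai function, and the library internals may differ.

```python
def run_actions(actions, columns, rows):
    """Run each action on its mapped column, then zip the outputs back into rows."""
    outputs = []
    for index, action in enumerate(actions):
        # columns maps an action index to the input column it reads
        column = [row[columns[index]] for row in rows]
        outputs.append(action(column))
    return list(zip(*outputs))


actions = [lambda x: [y * 3 for y in x], lambda x: [y - 1 for y in x]]
merged = run_actions(actions, {0: 0, 1: 1}, [(2, 8), (4, 18)])
print(merged)  # [(6, 7), (12, 17)]
```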

In [None]:
workflow = Workflow([Task([lambda x: [y * 3 for y in x], 
                           lambda x: [y - 1 for y in x]],
                           unpack=True, column={0:0, 1:1})])
list(workflow([(2, 8),(4, 18)]))

[(2, (24, 7)), (4, (54, 17))]
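
With `unpack=True`, the output above is consistent with each tuple being treated as an `(id, data)` pair: only the data value is passed to the actions and the id is reattached afterwards. The sketch below reproduces the result under that assumption; `run_unpacked` is a hypothetical helper and not part of txtai.

```python
def run_unpacked(actions, rows):
    """Treat each row as (id, data), run every action on the data, then reattach ids."""
    ids = [uid for uid, _ in rows]
    data = [value for _, value in rows]

    # Every action sees the same unpacked data values
    outputs = [action(data) for action in actions]

    # Join the per-action outputs into one tuple per row, then pair with the ids
    merged = list(zip(*outputs))
    return list(zip(ids, merged))


actions = [lambda x: [y * 3 for y in x], lambda x: [y - 1 for y in x]]
result = run_unpacked(actions, [(2, 8), (4, 18)])
print(result)  # [(2, (24, 7)), (4, (54, 17))]
```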

The Console Task prints task inputs and outputs to standard output. This task is mainly used for debugging and can be added at any point in a workflow.

In [None]:
from txtai.workflow import ConsoleTask, Workflow

workflow = Workflow([ConsoleTask()])
# Calling a workflow returns a lazy generator; wrap the call in list() to run the task
workflow(["Input 1", "Input2"])

<generator object Workflow.__call__ at 0x7fb02c5f02e0>
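
The generator object shown above means that no task has run yet. The following sketch uses a hypothetical stand-in function named `console` (not the txtai `ConsoleTask`) to show how a generator defers its work until it is consumed.

```python
seen = []


def console(elements):
    """Hypothetical stand-in: record and print each element as it passes through."""
    for element in elements:
        seen.append(element)
        print(element)
        yield element


pending = console(["Input 1", "Input2"])

# Nothing has been printed or recorded at this point
before = list(seen)

# Consuming the generator runs the body and prints both inputs
outputs = list(pending)
```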

The File Task validates that a file exists. It handles both file paths and local file URLs. Note that this task only works with local files.



In [None]:
from txtai.workflow import FileTask, Workflow

workflow = Workflow([FileTask()])
workflow(["/path/to/file", "file:///path/to/file"])
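
As a rough illustration of this kind of check, the sketch below strips a leading `file://` prefix and tests whether the remaining path exists. It is an approximation written for this example; `accept_file` is a hypothetical name and the actual task may apply additional rules.

```python
import os
import re
import tempfile


def accept_file(element):
    """Return True when element is a string naming an existing local path."""
    if not isinstance(element, str):
        return False

    # "file:///path/to/file" becomes "/path/to/file"
    path = re.sub(r"^file://", "", element)
    return os.path.exists(path)


# Check a real temporary file, both as a plain path and as a file URL
with tempfile.NamedTemporaryFile() as handle:
    checks = [accept_file(handle.name), accept_file("file://" + handle.name)]

missing = accept_file("/path/that/does/not/exist")
print(checks, missing)
```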

The Image Task reads file paths, checks that each file is an image and opens it as an Image object. Note that this task only works with local files.

In [None]:
from txtai.workflow import ImageTask, Workflow

workflow = Workflow([ImageTask()])
workflow(["image.jpg", "image.gif"])

The Retrieve Task connects to a URL and downloads the content locally. This task is helpful when working with actions that require data to be available locally.

In [None]:
from txtai.workflow import RetrieveTask, Workflow

workflow = Workflow([RetrieveTask(directory="/tmp")])
workflow(["https://file.to.download", 
          "/local/file/to/copy"])

The Service Task extracts content from an HTTP service.

In [None]:
from txtai.workflow import ServiceTask, Workflow

workflow = Workflow([ServiceTask(url="https://service.url/action")])
workflow(["parameter"])

The Storage Task expands a local directory or cloud storage bucket into a list of URLs to process.

In [None]:
from txtai.workflow import StorageTask, Workflow

workflow = Workflow([StorageTask()])
workflow(["s3://path/to/bucket", "local://local/directory"])

The Template Task generates text from a template and task inputs. Templates can be used to prepare data for a number of tasks including generating large language model (LLM) prompts.

In [None]:
from txtai.workflow import TemplateTask, Workflow

workflow = Workflow([TemplateTask(template="This is a {text} task")])
workflow([{"text": "template"}])
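
The substitution itself resembles standard Python string formatting. The sketch below shows the idea with `str.format` for dictionary inputs only; `render` is a hypothetical helper and the task may handle other input types differently.

```python
def render(template, elements):
    """Fill the template once per input dictionary."""
    return [template.format(**element) for element in elements]


prompts = render("This is a {text} task", [{"text": "template"}])
print(prompts)  # ['This is a template task']
```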

The Url Task validates that inputs start with a URL prefix.



In [None]:
from txtai.workflow import UrlTask, Workflow

workflow = Workflow([UrlTask()])
workflow(["https://file.to.download", "file:////local/file/to/copy"])
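
A prefix check of this kind can be sketched with a regular expression. The pattern below assumes that a URL prefix means a scheme followed by `://`; that is an assumption made for this example, and `accept_url` is a hypothetical helper rather than a txtai function.

```python
import re

# Assumption: a URL prefix is one or more word characters followed by "://"
URL_PREFIX = re.compile(r"^\w+://")


def accept_url(element):
    """Return True when element is a string that starts with a scheme prefix."""
    return isinstance(element, str) and URL_PREFIX.match(element) is not None


inputs = ["https://file.to.download", "file:////local/file/to/copy", "/local/file/to/copy"]
accepted = [accept_url(x) for x in inputs]
print(accepted)  # [True, True, False]
```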

The Workflow Task runs another workflow, which allows creating workflows of workflows.

In [None]:
from txtai.workflow import WorkflowTask, Workflow

# otherworkflow is a placeholder for a previously created Workflow instance
workflow = Workflow([WorkflowTask(otherworkflow)])
workflow(["input data"])