In the previous tutorial, you learned how to create a workflow that downloads a single .zip file and extracts it to Steep's output directory. We will now extend this workflow to download and extract multiple files.
In certain situations, it can be beneficial to perform operations in parallel. For example, running multiple downloads at the same time can improve bandwidth usage. Since a .zip file can only be extracted with a single thread, running multiple unzip instances in parallel can improve CPU utilization. Also, through horizontal scaling, CPU-intensive tasks can be distributed to multiple machines.
This tutorial will show you how to write workflows where the individual actions are executed in parallel. Note that you don't have to model parallelization yourself. Steep is able to automatically detect independent branches of the workflow graph (so-called process chains) and execute them in parallel. Read more about the approach to parallelization and process chain generation in How does Steep work?
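To build some intuition for what "independent branches" means, here is a minimal Python sketch. It is not Steep's actual algorithm or API: the function `independent_chains`, the action names, and the variable names are all hypothetical and chosen only for illustration. The sketch assumes that two actions depend on each other only when one consumes a variable that the other produces, and it groups actions accordingly.

```python
from collections import defaultdict


def independent_chains(actions):
    """Group actions into chains that have no data dependency on each other.

    `actions` maps a (hypothetical) action name to a pair of lists:
    (input variable names, output variable names).
    """
    # Union-find structure: every action starts in its own group.
    parent = {name: name for name in actions}

    def find(name):
        # Follow parent links to the group's representative,
        # shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Remember which action produces each variable.
    producer = {}
    for name, (_inputs, outputs) in actions.items():
        for var in outputs:
            producer[var] = name

    # An action that consumes a produced variable joins its producer's group.
    for name, (inputs, _outputs) in actions.items():
        for var in inputs:
            if var in producer:
                parent[find(name)] = find(producer[var])

    # Collect the members of each group.
    chains = defaultdict(list)
    for name in actions:
        chains[find(name)].append(name)
    return sorted(chains.values())


# Two download-and-extract pipelines that share no variables (assumed names).
actions = {
    "download_a": (["url_a"], ["zip_a"]),
    "unzip_a": (["zip_a"], ["dir_a"]),
    "download_b": (["url_b"], ["zip_b"]),
    "unzip_b": (["zip_b"], ["dir_b"]),
}

print(independent_chains(actions))
```

In this example the two pipelines end up in separate groups, so a scheduler could run them at the same time, while the download and unzip steps inside each group still have to run in order.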