# Tasks with Dependencies

Some tasks must run only after other tasks have finished. In these cases, declare task dependencies when submitting tasks to the job. Without them, a task could start before the task it relies on has completed, which can cause failures or incorrect results.

In the following example, suppose we have population data for several states and need to model the number of people affected in each state. Each state is modeled in its own task, and a final task compiles the results in a separate step once all of the state tasks have finished.

This example assumes a pool has already been created with the appropriate Python libraries installed and with the Blob container "input-test" mounted.

In [None]:
# import and initialize CloudClient with Managed identity
from cfa.cloudops import CloudClient

cc = CloudClient()

In [None]:
# upload the relevant files
cc.upload_files(
    files=["states.py", "compile.py", "us_pop_by_state.csv"],
    container_name="input-test",
)

In [None]:
job_name = "sample_job_w_deps"
cc.create_job(job_name, pool_name="rr-test-pool")

In [None]:
# list out the states we want to process
states = ["CA", "AZ", "NY", "MD", "PA"]

# empty list to hold our task ID references
task_list = []

# iterate through the states and add tasks for each state
for state in states:
    # pass the job name so each task is added to the job created above
    task_list.append(
        cc.add_task(job_name, f"python3 /input-test/states.py -s {state}")
    )
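
The command line above passes a `-s` flag to `states.py`, whose contents are not shown here. As a rough illustration only, a script accepting that flag might parse it along these lines. The function names, the `--state` long option, and the placeholder body are assumptions made for this sketch and are not taken from the real `states.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line arguments for a single-state run (hypothetical)."""
    parser = argparse.ArgumentParser(description="Model one state (sketch)")
    # "-s" matches the flag used in the task command line; "--state" is assumed
    parser.add_argument(
        "-s", "--state", required=True, help="state abbreviation, e.g. CA"
    )
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder: a real script would load the population data and run the model here
    print(f"Processing state: {args.state}")
    return args.state


# Simulate the call made by one task: python3 states.py -s CA
state = main(["-s", "CA"])
```

Because each task receives a different `-s` value, the same script can be reused for every state without modification.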

In [None]:
# now add the final compile task that depends on all previous tasks
cc.add_task(job_name, "python3 /input-test/compile.py", depends_on=task_list)
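
The dependency structure built above is a fan-in: five independent state tasks feed one compile task. The sketch below reproduces that ordering constraint locally with Python's standard-library `graphlib` module (Python 3.9+). It does not submit anything to a pool and is not how the service schedules tasks internally; the task names `state-CA`, `compile`, and so on are hypothetical labels used only for this illustration.

```python
from graphlib import TopologicalSorter

states = ["CA", "AZ", "NY", "MD", "PA"]

# Hypothetical task labels, used only for this local illustration
state_tasks = [f"state-{s}" for s in states]

# Map each task to the set of tasks it depends on
graph = {task: set() for task in state_tasks}
graph["compile"] = set(state_tasks)

sorter = TopologicalSorter(graph)
sorter.prepare()

# First batch: tasks with no unmet dependencies (could run in parallel)
first_batch = sorted(sorter.get_ready())
sorter.done(*first_batch)

# Second batch: becomes ready only after every state task is marked done
second_batch = list(sorter.get_ready())

print(first_batch)
print(second_batch)
```

The compile task appears only in the second batch, which mirrors what `depends_on=task_list` is meant to guarantee for the real job.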

In [None]:
# monitor the job, downloading task output when complete
cc.monitor_job(job_name, download_task_output=True)