<center> <h1> <span style="color:black"> IA|BE Data Science Certificate - Module 3 - Computer lab 3 </span> </h1> </center>



<center> <h2> <span style="color:red"> MLOps workflow and popular tools </span> </h2> </center>

MLOps (Machine Learning Operations) is loosely defined as a set of practices that aims to deploy and maintain ML models in production. MLOps lies at the intersection of Machine Learning, Data Engineering and DevOps. MLOps *provides an end-to-end machine learning development process to design, build and manage reproducible, testable, and evolvable ML-powered software* [(ml-ops.org)](https://ml-ops.org).

Some key elements to introduce MLOps:
* DevOps deals with code, MLOps deals with code, data, models and their interactions [(motivation)](https://ml-ops.org/content/motivation)
* Stop thinking about models, start thinking about workflows [(phase 0)](https://ml-ops.org/content/phase-zero)
* One step deeper even, start thinking about end-to-end workflows [(E2E workflow)](https://ml-ops.org/content/end-to-end-ml-workflow)
* MLOps puts focus on elements such as automation, continuous X (CI/CD), versioning, experiment tracking, testing and monitoring [(principles)](https://ml-ops.org/content/mlops-principles)

This workshop will cover the following elements from the MLOps realm:
* **Version Control** 
  * goal: keep track of every modification to your code in a special kind of database
  * tools: git and GitHub
* **Containers**
  * goal: run your software application reliably when moved from one computing environment to another
  * tools: Docker and Docker Hub
* **ML Workflows**
  * goal: orchestration and scheduling of all the different steps in your ML process
  * tool: Apache Airflow

A lot of tools will be introduced and used in this workshop, so the first thing to do is install all of these such that you can follow along on your own machine. The following section walks you through the different steps, good luck!

Most of this workshop will take place in your command line tool, for example the Terminal (macOS) or PowerShell (Windows). By default (if not specified differently), all commands should be run there. 

# Technical prerequisites

This section lists the installation process for all the tools that we will be using in this workshop.

## Install Anaconda (optional)

This section covers the installation of Anaconda Distribution:
* **Installation**: select the appropriate [installer](https://www.anaconda.com/products/distribution#Downloads) based on your operating system
* **Documentation**: the process is fairly straightforward, but additional [documentation](https://docs.anaconda.com/anaconda/install/) is available for each operating system

Anaconda Distribution installs Python, package managers (pip & conda) and numerous Data Science Python packages like numpy, pandas, matplotlib, scikit-learn and more.

Test that both python and pip are installed by running the following commands in your Terminal (macOS) or PowerShell (Windows):
* `pip --version`
* `python --version` (you might need to try `py --version` in Windows)

## Install git

This section covers the installation of git for different operating systems:

* **Windows**:
  * download the [.exe file](https://git-scm.com/download/win)
  * run the installation wizard
* **macOS**:
  * open your terminal
  * install [homebrew](https://brew.sh) by running the following command from the homebrew link in the terminal:
    * `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"` 
  * install git via homebrew by running `brew install git` in the terminal
* **Linux**:
  * choose the appropriate way of installation based on your distribution from [here](https://git-scm.com/download/linux)

Test that git is installed by running the following command in your Terminal (macOS) or PowerShell (Windows):
* `git --version`

## Create GitHub account and generate your PAT

This section covers the setup of your GitHub account:

* Create a GitHub account via the [sign-up](https://github.com/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F&source=header-home) page
* Once signed in to your account, generate your personal access token (PAT) by following this [process](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token)
  * Be sure to save your PAT somewhere, because you will only be able to see it this one time.
  * You need this PAT whenever you perform git actions via the command line that require authentication. When GitHub asks for your password, you provide your PAT instead. This is required because GitHub tightened the security of these actions and no longer accepts account passwords for them, only PATs.

## Install Docker

This section covers the installation of Docker Desktop for different operating systems:

* **Windows**:
  * download the [.exe file](https://docs.docker.com/desktop/windows/install/) and run the installation wizard
* **macOS**:
  * download the [.dmg file](https://docs.docker.com/desktop/mac/install/) and run the installation wizard
* **Linux**:
  * choose the appropriate way of installation based on your distribution from [here](https://docs.docker.com/desktop/linux/install/)
  * if your distribution does not support Docker Desktop yet, then you can install Docker Engine Server from [here](https://docs.docker.com/engine/install/)

Test that Docker is installed by running the following command in your Terminal (macOS) or PowerShell (Windows):
* `docker --version`

Docker Desktop normally includes Docker Compose, which we will be using during this workshop to run Airflow. Make sure that Docker Compose is installed by running the command below in your Terminal (macOS) or PowerShell (Windows). If Docker Compose is not installed, you can follow these [instructions](https://docs.docker.com/compose/install/) based on your operating system.
* `docker-compose --version`

## Create a Docker Hub account

This section covers the setup of your Docker Hub account:

* Create a Docker Hub account via the [sign-up](https://hub.docker.com/signup) page

## Install Airflow

You could try to install Airflow locally by following these [instructions](https://airflow.apache.org/docs/apache-airflow/stable/start/local.html). However, a local installation of Airflow often runs into errors and takes a lot of effort to fix and maintain over time due to frequently updated dependencies. For that reason we will leverage Docker (the tool you just installed and will learn more about during the workshop) to demonstrate the use of Airflow. So no installation is needed for now!

If you tried the local install, then run the following commands in your Terminal (macOS) or PowerShell (Windows) to verify your success:
* `airflow version` (without the double dash this time)
* `airflow standalone` to launch the Airflow webserver

Do note that this workshop will not use the local installation of Airflow, so no worries if the above airflow commands don't work. 
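
The individual version checks from the sections above can also be collected in one small script. The sketch below is only a convenience, assuming the tools are expected under the command names used in this workshop; the list `REQUIRED_TOOLS` and the function `find_tools` are hypothetical names chosen for illustration. On Windows the Python launcher may be called `py` instead of `python`.

```python
import shutil

# Command-line tools installed in the sections above (names as used in this workshop).
REQUIRED_TOOLS = ["git", "docker", "docker-compose", "pip", "python"]


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its executable path, or None if it is not on the PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_tools().items():
        status = f"found at {path}" if path else "NOT FOUND"
        print(f"{tool:15} {status}")
```

A tool reported as `NOT FOUND` is either not installed or not on your `PATH`; the corresponding installation section above is the place to look.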


## You are all set and good to go for the workshop, have fun! 🚀

# Version Control with Git and GitHub

Doing a large project in your company with a lot of contributors comes with a lot of challenges:
* efficient collaboration among people
* storing different versions of the project
* being able to rollback to previous versions and backup in case of system failure
* figuring out why v1 worked well, but v2 results in errors all over the place

Version Control to the rescue! The basic idea of version control is to track changes in your (and your collaborators') files. VC systems set up all the technical details needed for doing exactly that. This comes with a lot of benefits:
* managing and protecting the developers' source code by keeping track of all source code changes
* allowing collaboration among developers such that they are always working on the latest version of the code
* go back in time and figure out what changed in the source code to troubleshoot issues
* always have a backup available with the total history of the project

This section provides an introduction to version control with git and GitHub, but what are these tools used for? **Git** is a local open-source tool that helps developers to manage source code. **GitHub** is an online service where developers can host and share their repositories using git.

## Configure git

You can configure git such that it (and other developers) know who you are and how to reach you by running the following terminal commands:

* set your name: `git config --global user.name "Your Name"`
* set your email: `git config --global user.email your_name@mail.com`
* check the configuration: `git config -l`
* another way to check the configuration via the .gitconfig file: `cat ~/.gitconfig`

We used the `--global` flag to set these configurations globally for all your repositories. You can set the configuration for each repo separately by navigating into that repo and running the command without this flag.

## Basic git commands

We will cover the following git commands:
* `git init`: initialize a git repo
* `git status`: check the status of your repo
* `git add`: stage files for git to start tracking changes
* `git commit`: commit changes to your repo
* `git log`: get a log overview of all your commits

This script can be followed to play along with some basic git operations:

* `cd Desktop`: go to your desktop directory (might be different depending on where your command line tool situates you)
* `mkdir demo-git`: create an empty directory "demo-git"
* `cd demo-git`: go into that empty directory "demo-git"
* `ls`: list all the files to verify that the directory is empty
* `git init`: **initialize a new repository inside "demo-git" (this creates a .git directory in the repo used for git metadata)**
* `ls -a`: list all files (also hidden ones)
* `git status`: **so far nothing to commit or track**
* `echo "hi there" > hello.txt`: create a simple hello.txt file with the text "hi there"
* `git status`: **git tells us that we have one untracked file (hello.txt)**
* `git add hello.txt`: **tell git to track changes in this file**
* `git status`: **git tells us that we have one new file being tracked (hello.txt)**
* `git commit -m "our first commit"`: **this commits our staged changes to the repo**
* `git log`: **shows an overview of all your commits**
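
The walkthrough above can also be replayed from Python, which is handy when you want to repeat it in a fresh directory. This is a minimal sketch that simply calls the git command line through `subprocess`; the function names and the `run` parameter (a hook that makes it possible to try the function without executing git) are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def first_commit_commands(filename="hello.txt", message="our first commit"):
    """Return the git commands from the walkthrough as argument lists."""
    return [
        ["git", "init"],
        ["git", "status"],
        ["git", "add", filename],
        ["git", "commit", "-m", message],
        ["git", "log", "--oneline"],
    ]


def replay(repo_dir, run=subprocess.run):
    """Create hello.txt in repo_dir and run the walkthrough commands there."""
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)
    (repo / "hello.txt").write_text("hi there\n")
    for command in first_commit_commands():
        # check=True stops at the first git command that fails
        run(command, cwd=repo, check=True)
```

Calling `replay("demo-git-2")` from your desktop directory would create a second demo repo with one commit, provided git is installed and your name and email are configured.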


## Branching and merging

We will cover the following git commands:
* `git branch`: create a new branch in your repo
* `git checkout`: switch to another branch in your repo
* `git merge`: merge another branch into your current branch

This script can be followed to see the effect of branches and merges:

* `git branch dev-test`: **create a new branch with the name "dev-test"**
* `git checkout dev-test`: **switch the git HEAD to our new branch**
* `git log`: **notice that our original commit to master is also part of this new branch**
* `echo "general kenobi" > hello2.txt`: create a new hello2.txt file
* `git add .`: **stage these changes for git to track**
* `git commit -m "second commit"`: **commit these changes on our dev-test branch**
* `git log`: **we now see 2 commits in our logs**
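* `git checkout master`: **switch back to the master branch** (depending on your git configuration the default branch may be called `main` instead of `master`; use that name here and below)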
* `git log`: **we only see our first commit on the master branch**
* `git merge dev-test`: **merge the dev-test branch into the master branch**
* `git log`: **now we see both commits in the master branch, great!**
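
To see why the logs change the way they do, it can help to think of a branch as a name that points at a history of commits. The sketch below is a toy model written for this explanation, not how git is implemented; the class `ToyRepo` and its methods are hypothetical. It replays the branch script above and supports only the simple "fast-forward" case in which the target branch has no commits of its own.

```python
class ToyRepo:
    """A toy model of git branches: each branch name points at a list of commits."""

    def __init__(self):
        self.branches = {"master": []}
        self.head = "master"

    def commit(self, message):
        # Build a new list so that other branches keep their own history.
        self.branches[self.head] = self.branches[self.head] + [message]

    def branch(self, name):
        # A new branch starts from the history of the current branch.
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        self.head = name

    def merge(self, name):
        # Fast-forward only: the other branch must start with our history.
        ours, theirs = self.branches[self.head], self.branches[name]
        if theirs[:len(ours)] != ours:
            raise ValueError("histories diverged; a real merge commit would be needed")
        self.branches[self.head] = theirs

    def log(self):
        return list(self.branches[self.head])


repo = ToyRepo()
repo.commit("our first commit")
repo.branch("dev-test")
repo.checkout("dev-test")
repo.commit("second commit")
repo.checkout("master")
before_merge = repo.log()   # only the first commit
repo.merge("dev-test")
after_merge = repo.log()    # both commits
print(before_merge)
print(after_merge)
```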

## Remote repositories on GitHub

We will cover the following git commands:
* `git clone`: clone a remote repo to your local filesystem
* `git push`: push changes from your local repo to the remote repo
* `git pull`: pull changes from a remote repo to your local repo

Let us create our very first GitHub repo:

* On your GitHub account homepage, go to the tab "Repositories" and click the green "New" button
* Give your repo a name "demo-test" and (optional) description
* Optionally you can add a README file, .gitignore template and a license
* Click "Create repository"

Cool! We now have our repo publicly available on GitHub.

We can now "clone" this repo onto our local machine to start working with it following these steps:

* `cd ~/Desktop`: we go to our desktop folder and will clone our git repo here
* `git clone https://github.com/rhen-pl/demo-test.git`: **this command clones the remote repo to our desktop**
  * replace my username "rhen-pl" with your username if you created a repo on GitHub
  * you can optionally specify another path at the end of this command for clone destination (default to your work dir)
  * the link we used can be found via "code > HTTPS" on your remote repo page
* `cd demo-test`: let us move into our local repo
* `git status`: **our branch is up to date with 'origin/main'**
* `echo "test" > test.txt`: create a simple text file in your local repo
* `git status`: **we now have untracked changes like we have seen before**
* `git add .`: **stage your changes**
* `git commit -m "my first commit"`: **commit your changes**
* `git status`: **our branch is ahead of 'origin/main' by 1 commit**
* `git push origin main`: **push your local changes into the remote repo**
  * GitHub will ask for your username and password
  * Be sure to put your PAT as password
  * If you now go to your remote GitHub repo, you will see the "test.txt" file. Cool, right?!
* `git pull`: **pull changes from the remote repo to your local one**
  * git will tell you *Already up to date.*
  * This makes sense since we just pushed our local changes into the remote repo
  * Let's add a file by clicking *Add file* in the remote repo on GitHub
  * Specify the name (something.txt) with some random content and click *Commit new file*
* `git pull`: **now we pulled that new file from the remote repo to the local one**
  * You should see this file in your local directory now
  * `ls` to list the files, is it there?
  * `cat something.txt` to see the content of the file

What if you want to start experimenting with some publicly available repo on GitHub? You can clone a public repo, but you won't be able to push to someone else's work without their permission. To solve this issue, you can "Fork" any GitHub repo:

* Go to the page of the repo with which you want to experiment
* Click the "Fork" button on the top right of the page
* This creates a copy of this repo in your own repository list
* Now you can experiment freely without affecting the original owner's work

# Containers with Docker and Docker Hub

At some point in time, the code you have written in **development** will move to **production**. Now let's imagine the following situation:

> You're a programmer and you've just written some code. It seems to work fine locally, all your unit tests pass and everything. Time to put stuff to production! So you contact your infrastructure team to set up a production server for you, install an OS there, configure the environment, install dependencies, and so on. After that you finally deploy your code to prod. Hooray, right? Wrong! You see you developed in a Windows environment, and the server machine is running on Linux. And because of environment differences, your unit tests now fail...

This unfortunately happens from time to time, and it happened even more often before **virtual machines** came into existence. A virtual machine (VM) is a computer inside your computer. A VM emulates a whole system from hardware up, so you get your own CPU power, hard drive space and everything on top - an operating system with all its processes, anything you wish to install on it, your programming environment etc. Having one virtual machine per process/environment is the perfect way to isolate that process/environment. However, VMs are expensive to store, to run, and to modify. Enter containers!

For all intents and purposes, you can think of a **container** as a lighter, faster, much easier (to spin up, operate and change) version of a VM. Containers share the underlying operating system resources, but above that the environment is completely isolated. Because containers are lightweight and easy to build, they're also much easier to share than VMs, so you can ensure that everyone in your team works in exactly the same environment, or that the environment on your machine is identical to the one on the production server. That way, tests run locally during development give an accurate picture of how the code will behave in production.

There are 3 basic building blocks to container based development: an **image**, a **container**, and a container **engine**. 
* An ***image*** is a snapshot of your runtime environment. Think of a powered-down virtual machine. It has all the prerequisites that you need to run your code: the environment, dependencies, the code itself and instructions on how to run the code when it's booted.
* A ***container*** is a running instance of an image, so basically an up and running VM.
* A container ***engine*** is software that runs and manages containers in an environment (e.g., Docker Engine, which is included in Docker Desktop).

This section provides an introduction to containers with Docker. Be sure to start Docker Desktop before continuing.

## Hello Docker

Run the following command in your terminal and let's observe what happens: `docker run hello-world`

Wow, look at that! You just interacted with your first Docker image and spun up a container from that image. But how could we access this image without having it on our local machine? Docker actually tells us this in the output:

> Unable to find image 'hello-world:latest' locally

> latest: Pulling from library/hello-world

> 7050e35b49f5: Pull complete 

Docker pulled the official `hello-world` [image](https://hub.docker.com/_/hello-world) from Docker Hub. The output lists the steps that Docker takes under the hood:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.


## Listing images and containers

Try the following commands in the terminal to list images and containers:
* `docker image ls`: list all your Docker **images**
  * do you spot the hello-world image?
* `docker container ls`: list all your **running containers**
  * can you see the hello-world container? Why (not)?
* `docker container ls -a`: list **all containers**, also the stopped ones
  * now we can see the hello-world container listed
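
The same listing can be read from Python by asking the Docker CLI for machine-readable output via its `--format "{{json .}}"` option, which prints one JSON object per container. The sketch below assumes the output contains the fields `Image`, `Names` and `Status`; the function names and the sample line are hypothetical and only serve as illustration.

```python
import json
import subprocess


def parse_container_lines(output):
    """Parse the JSON-lines output of `docker container ls --format "{{json .}}"`."""
    return [json.loads(line) for line in output.splitlines() if line.strip()]


def list_containers(include_stopped=True):
    """Ask the Docker CLI for containers; needs Docker to be installed and running."""
    command = ["docker", "container", "ls"]
    if include_stopped:
        command.append("-a")
    command += ["--format", "{{json .}}"]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_container_lines(result.stdout)


# Hypothetical sample line, shortened to three fields for illustration.
sample = '{"Image": "hello-world", "Names": "some_name", "Status": "Exited (0) 2 minutes ago"}\n'
containers = parse_container_lines(sample)
print(containers[0]["Image"], "-", containers[0]["Status"])
```

With Docker Desktop running, `list_containers()` returns the stopped hello-world container as well, while `list_containers(include_stopped=False)` mirrors `docker container ls`.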

## Hello Python

Run the following command in your terminal and let's observe what happens: `docker run python`

This will pull the official Python [image](https://hub.docker.com/_/python) from Docker Hub (note that this download might take a minute). Once this pull is completed, nothing seems to have happened. Let's list our images and containers:
* `docker image ls`: we can see both the hello-world and python images
* `docker container ls`: there seem to be 0 containers running
* `docker container ls -a`: the python container is indeed in a stopped state

The container executes the python3 command and then exits. If you want to run a container in interactive mode, run the following command: `docker run -it python` (with the interactive and tty flag).

Now you arrive in an interactive Python prompt where you can execute Python commands:
* `print('Hello there')`
* `1+2`
* `[1,2,3]`
* `exit()` to exit

When you now list your containers, you should see two stopped python containers. Notice the weird random names that are assigned? This might get confusing fast once you have more containers from the same image to track. Run a container with a specific name as follows: `docker run --name my-python-container python`. Can you spot this one in your container listing?


## Container management

We have multiple containers from the python image. Run the `docker image ls` command to verify that there is still only one python image in our listing. All our python containers are separate instances created from the same image.

Let's clean up! Run `docker container prune` and confirm with `y`. This removes all stopped containers, but not the images. Confirm this yourself with the listing commands. Notes: 
* run `docker container prune --force` without the need for confirmation
* run `docker run --rm python` if you want to remove the container automatically once it's stopped

You can run a container in the background in detached mode as follows: `docker run -dt --name my-python python`.
  * list the *running* containers with `docker container ls`, what do you see?
  * stop the container via `docker stop my-python`
  * restart the container via `docker restart my-python`


## Images and the Dockerfile

A Docker image is the source of a container, sort of a blueprint of what needs to happen at runtime. The Dockerfile can be seen as the step-by-step plan of instructions which specifies all the layers like code, runtime, libraries, variables and configurations in the image.


### Base images

The [Dockerfile](https://github.com/docker-library/hello-world/blob/3332fbee4210b41738d83f6cfdc301a42b96e30d/amd64/hello-world/Dockerfile) from the hello-world image has the following three lines of code:
```
FROM scratch
COPY hello /
CMD ["/hello"]
```
* The first line `FROM scratch` indicates that this is a **base image** which literally starts from *scratch*. 
* Next, a hello binary that exists in the git [directory](https://github.com/docker-library/hello-world/tree/3332fbee4210b41738d83f6cfdc301a42b96e30d/amd64/hello-world) is copied in the root file system
* Finally, the CMD is run when the container is spun up (and the container exits after executing this)

### Custom image

There is of course no need to reinvent the wheel every time. You can easily base your image on another **parent image** and start building from there. An example Dockerfile is as follows:



```
# syntax=docker/dockerfile:1
FROM python
WORKDIR /my_awesome_app
COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt
COPY . .
CMD ["python3", "hello_world.py"]
```
* Let Docker know which version of syntax you're using
* Start from the **python** image as parent image
* Create folder **my_awesome_app** at the root of the image & make it a working directory
* Copy dependency **requirements** from your local machine to the image
* Install dependencies by running the **pip3 install** command
* **Copy** your entire project into the image
* Run **hello_world.py** with `python3` on container startup

## Building your image

Create a directory "demo-docker" with the following three files and run `cd demo-docker`:
* **Dockerfile** with the above commands
* **requirements.txt** with *numpy* as package
* **hello_world.py** with *print("Hello there")*
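
If you also want to confirm that the `pip3 install` step really ran inside the image, a slightly longer `hello_world.py` can report whether numpy is importable. This is an optional variant of the one-line script above, not the version used in the workshop; the function name `greeting` is hypothetical.

```python
import importlib.util


def greeting():
    """Build the greeting and report whether numpy can be imported."""
    numpy_installed = importlib.util.find_spec("numpy") is not None
    status = "available" if numpy_installed else "missing"
    return f"Hello there (numpy is {status})"


if __name__ == "__main__":
    print(greeting())
```

Run inside a container built from the Dockerfile above, the message should say that numpy is available; run on a machine without numpy, it says the package is missing.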


Run the following commands to build the image and run a container:
* `docker build . -t my-awesome-image`: build the image in the directory where your Dockerfile is located
  * the `.` is the path to the directory that contains your Dockerfile; if you run the command from another directory, replace `.` with that path
  * the `-t` flag is used to tag your image with a name
* `docker image ls`: verify your image is built
* `docker run --name my-awesome-container my-awesome-image`: run a container from your custom image

## Publishing your image

Run the following commands:
* `docker login --username=<username>`: login to Docker Hub by specifying your own username
* `docker tag my-awesome-image <username>/my-awesome-repo:1.0`: tag your image for a repository
* `docker push <username>/my-awesome-repo:1.0`: push your image to the repository
  * now check your Docker Hub account, is the repo there?
* `docker logout`: log out from Docker Hub
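
The name used in the `docker tag` and `docker push` commands above follows the pattern `<username>/<repository>:<tag>`. The sketch below builds such a reference and the two matching commands; the function names and the username `jane` are hypothetical, and the validation is deliberately simplistic.

```python
def image_reference(username, repository, tag="latest"):
    """Build a Docker Hub image reference of the form <username>/<repository>:<tag>."""
    for part in (username, repository, tag):
        if not part or "/" in part or ":" in part:
            raise ValueError(f"invalid part: {part!r}")
    return f"{username}/{repository}:{tag}"


def publish_commands(local_image, username, repository, tag):
    """Return the tag and push commands from the list above as argument lists."""
    target = image_reference(username, repository, tag)
    return [
        ["docker", "tag", local_image, target],
        ["docker", "push", target],
    ]


commands = publish_commands("my-awesome-image", "jane", "my-awesome-repo", "1.0")
for command in commands:
    print(" ".join(command))
```

The default tag `latest` mirrors what Docker assumes when a tag is omitted, as seen earlier with `hello-world:latest`.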

# Scheduling workflows with Apache Airflow

Airflow is an orchestration tool that lets you author, schedule and monitor **workflows** programmatically, i.e. via code. But what is a workflow? A workflow is a **Directed Acyclic Graph** (or **DAG**) with multiple **tasks** that can be executed independently. Directed means that the flow goes in a certain direction, while acyclic means that loops are not allowed in the flow. So, a DAG connects tasks together, specifies their relationships (what runs before what) and their dependencies (task B is run only if task A has completed successfully).
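
The two properties "directed" and "acyclic" can be illustrated without Airflow, using the `graphlib` module from the Python standard library (Python 3.9 or later). This is a sketch of the concept only: the task names are hypothetical, and Airflow uses its own machinery to order tasks.

```python
from graphlib import CycleError, TopologicalSorter

# Map each task to the set of tasks that must finish before it (hypothetical names).
dependencies = {
    "train_model": {"load_data"},
    "evaluate_model": {"train_model"},
    "send_report": {"evaluate_model"},
}

# A valid execution order respects every "runs before" relationship.
order = list(TopologicalSorter(dependencies).static_order())
print(order)

# Adding an edge from the last task back to the first creates a loop,
# so the graph is no longer acyclic and no valid order exists.
cyclic = dict(dependencies, load_data={"send_report"})
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```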

## Programmatic workflow

Programmatically, DAGs are python files (`.py`) stored in a `/dags` folder of your project. They are defined (and scheduled) as follows:
```
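from datetime import datetime
from airflow.models import DAG

# start_date is an illustrative value; Airflow needs one before it can schedule runs
with DAG("ml_pipeline", description='End-to-end ML pipeline example',
         schedule_interval='@daily', start_date=datetime(2022, 1, 1)) as dag:

    pass  # define your pipeline steps (tasks) here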
```

Every step of a pipeline is either a **task** (a single operation) or a **task group** (a bunch of tasks executed in parallel). 

To create a task in Airflow, you use an operator. There are multiple operator classes, with the default ones being PythonOperator (for executing Python code), EmailOperator (sending emails) and BashOperator (executing bash scripts). In code this looks as follows:

```
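from datetime import datetime
from airflow.models import DAG
from airflow.operators.email import EmailOperator
from airflow.operators.python import PythonOperator

def hello():
    print('hello')

# start_date is an illustrative value; Airflow needs one before it can schedule runs
with DAG("ml_pipeline", description='End-to-end ML pipeline example',
         schedule_interval='@daily', start_date=datetime(2022, 1, 1)) as dag:

    hello_task = PythonOperator(task_id='hello_task', python_callable=hello)
    # EmailOperator also needs a task_id and a message body (html_content)
    email_task = EmailOperator(task_id='email_task', to="admin@example.com",
                               subject="hello ran", html_content="hello_task finished")

    hello_task >> email_task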
```
The last line specifies the order of execution of tasks, and that's how you chain tasks in general, i.e. `first_task >> second_task >> … >> last_task`. These are executed sequentially: `second_task` will start running only after `first_task` has completed successfully. To run tasks in parallel, you can combine them into task groups as follows:

```
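from datetime import datetime
from airflow.models import DAG
from airflow.operators.email import EmailOperator
from airflow.operators.python import PythonOperator
from airflow.utils.task_group import TaskGroup

def hello():
    print('hello')

def bye():
    print('bye')

# start_date is an illustrative value; Airflow needs one before it can schedule runs
with DAG("ml_pipeline", description='End-to-end ML pipeline example',
         schedule_interval='@daily', start_date=datetime(2022, 1, 1)) as dag:

    ######## task group ########
    # tasks in the group have no dependencies on each other, so they can run in parallel
    with TaskGroup('hello_bye') as hello_bye:

        hello_task = PythonOperator(task_id='hello_task', python_callable=hello)
        bye_task = PythonOperator(task_id='bye_task', python_callable=bye)

    # EmailOperator also needs a task_id and a message body (html_content)
    email_task = EmailOperator(task_id='email_task', to="admin@example.com",
                               subject="hello_bye ran", html_content="hello_bye finished")

    hello_bye >> email_task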
```
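
The `>>` chaining used in the examples above is ordinary Python operator overloading. The sketch below shows the idea with a toy class; it is not Airflow's implementation, and `ToyTask` and its `downstream` attribute are hypothetical names used only for this illustration.

```python
class ToyTask:
    """A toy stand-in for an Airflow operator that only records dependencies."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # `a >> b` registers b as downstream of a and returns b,
        # so that `a >> b >> c` continues the chain from b.
        self.downstream.append(other)
        return other


first_task = ToyTask("first_task")
second_task = ToyTask("second_task")
last_task = ToyTask("last_task")

first_task >> second_task >> last_task

for task in (first_task, second_task, last_task):
    print(task.task_id, "->", [t.task_id for t in task.downstream])
```

Returning the right-hand task from `__rshift__` is what makes long chains read from left to right in execution order.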

## Starting Airflow

Perform the following steps to launch Airflow via Docker Compose:
* start Docker Desktop
* `cd Desktop`: go to your desktop directory (not the Docker one)
* `git clone https://github.com/ProphecyLabs/demo_airflow.git`: clone the Prophecy Labs demo_airflow git [repo](https://github.com/ProphecyLabs/demo_airflow)
* `cd demo_airflow`: change into the airflow demo directory
* `docker-compose -f docker-compose.yaml up -d`: launch the airflow container via Docker Compose
* After docker-compose finishes, go to [localhost:8080](http://localhost:8080) in your browser to see the Airflow UI. Enter login and password (airflow/airflow).

On Linux you might need to perform the following steps before running the `docker-compose` command:
* `mkdir ./dags ./data ./models ./logs ./plugins`: create directories necessary for demo
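* `echo -e "AIRFLOW_UID=$(id -u)" > .env`: store your host user ID in a `.env` file, so that files created by the Airflow containers in the mounted folders belong to your user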
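
Besides opening the UI in your browser, you can check from Python whether the Airflow services are up. The sketch below reads the webserver's `/health` endpoint; the shape of the sample payload (one entry per component with a `status` field) is an assumption about the response, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def unhealthy_components(health):
    """Return the names of components whose status is not "healthy"."""
    return [name for name, info in health.items() if info.get("status") != "healthy"]


def fetch_health(base_url="http://localhost:8080"):
    """Read the webserver's /health endpoint; needs the Airflow containers to be running."""
    with urlopen(f"{base_url}/health", timeout=5) as response:
        return json.load(response)


# Hypothetical payload in the shape the endpoint is assumed to return.
sample = {"metadatabase": {"status": "healthy"}, "scheduler": {"status": "unhealthy"}}
print(unhealthy_components(sample))
```

Once the containers are running, `unhealthy_components(fetch_health())` should return an empty list.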

## What just happened?

You're running an Airflow application, meaning you spun up:

* Airflow webserver, the very tool at [localhost:8080](http://localhost:8080), giving you a nice UI to see job executions, logs, graphs of your pipelines etc.
* Airflow triggerer, a service that runs the triggers of deferrable tasks (tasks that wait for an external event before they continue)
* Airflow scheduler, a tool to schedule pipeline execution
* Airflow init, a platform initialization service
* Airflow database, a DB that stores configurations such as variables, policies, links between tasks, permissions etc.

The components above are essential to any Airflow run, and they all need to be present for Airflow to work. This is why you used docker-compose (runs multi-container apps) and not docker run (runs a single container).

Other than spinning up Airflow, there're a few things the [docker-compose.yaml](https://github.com/ProphecyLabs/demo_airflow/blob/main/docker-compose.yaml) file did, namely:
* created three folders in the root folder of your code repo, and mounted them as volumes inside your Airflow containers:
```
volumes:
  - ./dags:/opt/airflow/dags
  - ./data:/opt/airflow/data
  - ./models:/opt/airflow/models
```
* added a pip dependency on scikit-learn (sklearn). Since we're building an ML pipeline, it is reasonable to assume that we'll need some sort of ML library:
```
_PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-scikit-learn}
```
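
The `${VARIABLE:-default}` syntax in the line above means: use the value of the environment variable if it is set and not empty, otherwise fall back to the default. A rough Python equivalent, with the hypothetical function name `pip_additional_requirements`, looks like this:

```python
import os


def pip_additional_requirements(environ=os.environ):
    """Mimic ${_PIP_ADDITIONAL_REQUIREMENTS:-scikit-learn}: default when unset or empty."""
    return environ.get("_PIP_ADDITIONAL_REQUIREMENTS") or "scikit-learn"


# Variable not set: the default is used.
print(pip_additional_requirements({}))
# Variable set: its value replaces the default.
print(pip_additional_requirements({"_PIP_ADDITIONAL_REQUIREMENTS": "scikit-learn pandas"}))
```

In practice this means you can request extra packages for the demo by setting `_PIP_ADDITIONAL_REQUIREMENTS` (for example in the `.env` file) without editing the compose file.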


## Demo time

Visit [localhost:8080](http://localhost:8080) to get access to the Airflow UI for the demo. We will cover the three DAGs in the [dags](https://github.com/ProphecyLabs/demo_airflow/tree/main/dags) repo folder.


### ml_pipeline

We will run this [DAG](https://github.com/ProphecyLabs/demo_airflow/blob/main/dags/ml_pipeline.py).

* Click on the ml_pipeline DAG to see the different tasks in this pipeline (either in Tree or Graph view)
* Trigger the pipeline by clicking on the play-button in the top-right and then "trigger DAG"
* Now you can follow the execution of the pipeline in real time
* After completion, you can visualise the time evolution of the different tasks in the Gantt tab
* After completion, you can access the logs of each task by clicking on the task and then "Log"


### broken_code

We will (try to) run this [DAG](https://github.com/ProphecyLabs/demo_airflow/blob/main/dags/broken_code.py), but it will fail due to an error in the task code.

* Click on the broken_code DAG to see the different tasks in this pipeline (either in Tree or Graph view)
* Trigger the pipeline by clicking on the play-button in the top-right and then "trigger DAG"
* Now you can follow the execution of the pipeline in real time
* Ouch, we seem to have a failed run
* In the logs we can find the root cause: *NameError: name 'np' is not defined*

### broken_dag

Airflow can't run (or even show) this [DAG](https://github.com/ProphecyLabs/demo_airflow/blob/main/dags/broken_dag.py) because the DAG file itself contains an error. Airflow tells us this at the top of the UI. To see the full stack trace:
* Open Docker Desktop
* Navigate to Containers > demo_airflow > any container > CLI
* `cd dags`: go into dags folder
* `python broken_dag.py`: run the broken DAG
* See the error *NameError: name 'DAG' is not defined*
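
The two failures look similar, since both are a `NameError`, but they happen at different moments. The sketch below reproduces the difference in plain Python. The two code snippets are hypothetical stand-ins, not the actual contents of `broken_code.py` and `broken_dag.py`; only the error messages are taken from the demo.

```python
# Case 1 (like broken_code): the missing import sits inside a task function.
# Defining the function works; the NameError only appears when the task runs.
task_source = """
def transform():
    return np.zeros(3)  # `np` was never imported
"""
namespace = {}
exec(task_source, namespace)  # loading the file succeeds
try:
    namespace["transform"]()
    runtime_error = None
except NameError as error:
    runtime_error = str(error)

# Case 2 (like broken_dag): the missing name is used at module level.
# Executing the file fails immediately, so no DAG object is ever created.
dag_source = 'dag = DAG("broken_dag")  # `DAG` was never imported\n'
try:
    exec(dag_source, {})
    parse_error = None
except NameError as error:
    parse_error = str(error)

print("fails when the task runs:", runtime_error)
print("fails when the file is loaded:", parse_error)
```

This is why the broken_code DAG shows up in the UI and only fails once triggered, while the broken DAG file is reported as an import error and never appears in the DAG list.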



## Terminate the Airflow demo

Run the command `docker-compose down` to terminate the Airflow services started with Docker Compose.