# Airflow Scheduling

## Scheduling Basics

```python
from datetime import datetime
from airflow import DAG

with DAG(
    dag_id="my_amazing_dag",
    start_date=datetime(2025, 1, 1),  # January 1, 2025 at 12:00AM (midnight)
    schedule="@daily",
):
    # do something amazing here
    pass
```

### First DAG Run occurs...

![Execution times](./images/ExecutionTimes.jpeg)
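
The timing can be sketched with the standard library alone. This is a rough illustration, assuming Airflow 2.x data-interval semantics, where a scheduled run is triggered once the interval it covers has ended. The variable names below are illustrative and are not Airflow APIs.

```python
from datetime import datetime, timedelta

start_date = datetime(2025, 1, 1)  # same start_date as the DAG above
interval = timedelta(days=1)       # what "@daily" amounts to

# The first data interval covers [start_date, start_date + interval).
first_interval_start = start_date
first_interval_end = start_date + interval

# Assumption: the scheduler triggers the run when its interval ends.
first_run = first_interval_end
print(first_run)  # 2025-01-02 00:00:00
```

Under that assumption, the first run covers January 1 but does not start until midnight on January 2.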

## Cron

### Standard

Interactive cron expression editor: https://crontab.guru/

```
┌── minute (0-59)
| ┌── hour (0-23)
| | ┌── day of the month (1-31)
| | | ┌── month (1-12)   
| | | | ┌── day of the week (0-6) (Sunday to Saturday) 
* * * * *
```
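
To make the field order concrete, here is a small sketch that labels the five fields of a standard cron expression. The helper `label_cron_fields` is hypothetical and only splits the string; it does not validate the values.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression: str) -> dict[str, str]:
    """Pair each whitespace-separated cron field with its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


labeled = label_cron_fields("45 15 * * 4")
print(labeled["hour"], labeled["minute"], labeled["day_of_week"])  # 15 45 4
```

Read left to right, `45 15 * * 4` means 15:45 every Thursday.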

### Preset

| Preset        | Meaning                                                    | Cron          |
| ------------- | ---------------------------------------------------------- | ------------- |
| `@once`       | Schedule once and only once                                |               |
| `@continuous` | Run as soon as the previous run finishes                   |               |
| `@hourly`     | Run once an hour at the end of the hour                    | `0 * * * *`   |
| `@daily`      | Run once a day at midnight                                 | `0 0 * * *`   |
| `@weekly`     | Run once a week at midnight on Sunday                      | `0 0 * * 0`   |
| `@monthly`    | Run once a month at midnight of the first day of the month | `0 0 1 * *`   |
| `@quarterly`  | Run once a quarter at midnight on the first day            | `0 0 1 */3 *` |
| `@yearly`     | Run once a year at midnight of January 1                   | `0 0 1 1 *`   |

### Extended

#### 2nd Friday of each Month?

Use the hash symbol (`#`)! `5#2` means the 2nd Friday of the month.

| Schedule        | Meaning                      |
| --------------- | ---------------------------- |
| `30 14 * * 5`   | Every Friday at 14:30        |
| `30 14 * * 5#2` | 2nd Friday of month at 14:30 |
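
To check which date a pattern like `5#2` should land on, the n-th weekday of a month can be computed with the standard library. This is a sketch: `nth_weekday` is a hypothetical helper, not part of Airflow or any cron library. Note that Python counts weekdays from Monday = 0, so Friday is 4 here, while cron uses 5.

```python
import calendar
from datetime import date


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th given weekday of a month (Monday=0 ... Sunday=6)."""
    days = [
        d
        for d in calendar.Calendar().itermonthdates(year, month)
        if d.month == month and d.weekday() == weekday
    ]
    return days[n - 1]


second_friday = nth_weekday(2025, 1, calendar.FRIDAY, 2)
print(second_friday)  # 2025-01-10
```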

#### Every 10 minutes?

List every minute explicitly, or use a step value (`*/10`)! Both expressions below are equivalent:

- `0,10,20,30,40,50 * * * *`
- `*/10 * * * *`
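
The equivalence of the two forms can be sketched in a few lines. The helper `expand_step` is hypothetical and only handles the `*/n` form over a field's full range.

```python
def expand_step(step: int, low: int = 0, high: int = 59) -> list[int]:
    """Expand a cron step such as */10 over the inclusive range low..high."""
    return list(range(low, high + 1, step))


minutes = expand_step(10)
print(minutes)  # [0, 10, 20, 30, 40, 50]
```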

## Timedelta

#### Every 10 minutes... Again?

```python
from datetime import datetime, timedelta
from airflow import DAG

with DAG(
    dag_id="my_frequency_based_dag",
    start_date=datetime(2025, 1, 1),
    schedule=timedelta(minutes=10),  # every 10 minutes after 1/1/25
):
    # do stuff
    pass
```

#### Every 4 days?

    $ cal -A 1 -B 1
       December 2024          January 2025         February 2025
    Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
     1  2  3  4  5  6  7            1  2  3  4                     1
     8  9 10 11 12 13 14   5  6  7  8  9 10 11   2  3  4  5  6  7  8
    15 16 17 18 19 20 21  12 13 14 15 16 17 18   9 10 11 12 13 14 15
    22 23 24 25 26 27 28  19 20 21 22 23 24 25  16 17 18 19 20 21 22
    29 30 31              26 27 28 29 30 31     23 24 25 26 27 28


```python
from datetime import datetime, timedelta
from airflow import DAG

with DAG(
    dag_id="my_frequency_based_dag",
    start_date=datetime(2025, 1, 1),
    schedule=timedelta(days=4),  # every 4 days after 1/1/25
):
    # do stuff
    pass
```
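
A fixed 4-day interval is not the same as the cron expression `0 0 */4 * *`, because a cron day-of-month step restarts at the 1st of every month. The sketch below compares the two sets of dates using only the standard library. It lists schedule boundaries, not Airflow's actual run times, and the variable names are illustrative.

```python
from datetime import date, timedelta

start = date(2025, 1, 1)

# timedelta(days=4): fixed spacing that ignores month boundaries.
every_4_days = [start + timedelta(days=4 * i) for i in range(10)]

# Cron "0 0 */4 * *" for comparison: days 1, 5, 9, ... restart each month.
days_in_month = {1: 31, 2: 28}  # January and February 2025
cron_style = [
    date(2025, month, day)
    for month in (1, 2)
    for day in range(1, 32, 4)
    if day <= days_in_month[month]
]

print(every_4_days[7], every_4_days[8])  # 2025-01-29 2025-02-02
print(cron_style[7], cron_style[8])      # 2025-01-29 2025-02-01
```

The two schedules agree through January 29 and then drift apart: the `timedelta` version continues to February 2, while the cron version jumps to February 1.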

## Dataset

```python
from datetime import datetime
from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.bash import BashOperator

my_dataset = Dataset(uri="/opt/airflow/my_file.txt")  # <--- 1. define Dataset

with DAG(
    dag_id="produce_dataset",
    start_date=datetime(2025, 1, 1),
    schedule="45 15 * * 4",
):
    # ...
    create_dataset = BashOperator(
        task_id="create_dataset",
        bash_command=f"echo 'Keep it secret, Keep it safe' > {my_dataset.uri}",
        outlets=[my_dataset],    # <--- 2. reference Dataset in outlet of this task
    )
    # ...


with DAG(
    dag_id="consume_dataset",
    start_date=datetime(2025, 1, 1),
    schedule=my_dataset,         # <--- 3. use Dataset in schedule of downstream DAG
):
    #...
    read_dataset = BashOperator(
        task_id="read_dataset",
        bash_command=f"cat {my_dataset.uri}",
    )
    # ...
```

## Timetable

```python
from datetime import datetime
import pendulum

from airflow import DAG
from airflow.timetables.events import EventsTimetable

my_events = EventsTimetable(  # <--- define a Timetable from explicit event dates
    event_dates=[
        pendulum.datetime(2025, 1, 1), # New Year's Day
        pendulum.datetime(2025, 1, 20), # MLK Jr Day
        pendulum.datetime(2025, 2, 17), # Presidents' Day
        pendulum.datetime(2025, 5, 26), # Memorial Day
        pendulum.datetime(2025, 6, 19), # Juneteenth
        pendulum.datetime(2025, 7, 4), # Independence Day
        pendulum.datetime(2025, 7, 31), # Harry Potter's Birthday
        pendulum.datetime(2025, 9, 1), # Labor Day
        pendulum.datetime(2025, 11, 11), # Veterans Day
        pendulum.datetime(2025, 11, 27), # Thanksgiving Day
        pendulum.datetime(2025, 12, 25), # Christmas Day
    ],
)

with DAG(
    dag_id="my_timetable_dag",
    start_date=datetime(2025, 1, 1),
    schedule=my_events,    # <--- use the Timetable as schedule
):
    # do stuff
    pass
```
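
An event-based schedule runs only on the listed dates. As a toy illustration of that idea, the sketch below finds the next event at or after a given moment. It is not Airflow's implementation; `next_event` is a hypothetical helper, and the list is a shortened copy of the dates above.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Optional

# A few of the event dates from the timetable above, kept sorted.
event_dates = sorted(
    [
        datetime(2025, 1, 1),    # New Year's Day
        datetime(2025, 7, 4),    # Independence Day
        datetime(2025, 12, 25),  # Christmas Day
    ]
)


def next_event(after: datetime) -> Optional[datetime]:
    """Return the first event at or after `after`, or None if none remain."""
    index = bisect_left(event_dates, after)
    return event_dates[index] if index < len(event_dates) else None


print(next_event(datetime(2025, 3, 1)))  # 2025-07-04 00:00:00
print(next_event(datetime(2026, 1, 1)))  # None
```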