# Install prefect
see https://docs.prefect.io/v3/get-started/install

## note on prefect:
there are three components in a prefect setup:
1. Development Environment
    - `prefect deploy ...` (deploy flow to orchestration environment)
    - `prefect deployment run ...` (manually start a run)
2. Orchestration Environment `prefect server start` or `prefect cloud login`
    - deployment
    - flow run
    - work pool (requests worker to start working when/where/how)
3. Execution Environment `prefect worker start -p module-03-pool`
    - start worker which polls the specified work pool

In [1]:
!pip install -U prefect
!prefect version

Version:             3.4.7
API version:         0.8.4
Python version:      3.12.1
Git commit:          cd81d15a
Built:               Thu, Jun 26, 2025 09:16 PM
OS/Arch:             linux/x86_64
Profile:             local
Server type:         ephemeral
Pydantic version:    2.11.7
Server:
  Database:          sqlite
  SQLite version:    3.45.1


# Quickstart (Open Source version)
see https://docs.prefect.io/v3/get-started/quickstart#open-source
1. run `prefect server start`
2. look at server UI
3. run the getting started script

In [2]:
!python 01_getting_started.py

13:58:49.716 | INFO    | prefect - Starting temporary server on http://127.0.0.1:8098
See https://docs.prefect.io/3.0/manage/self-host#self-host-a-prefect-server for more information on running a dedicated Prefect server.
13:58:52.963 | INFO    | Flow run 'hospitable-angelfish' - Beginning flow run 'hospitable-angelfish' for flow 'main'
13:58:53.147 | INFO    | Task run 'get_customer_ids-0bd' - Finished in state Completed()
13:58:53.539 | INFO    | Task run 'process_customer-a51' - Finished in state Completed()
13:58:53.556 | INFO    | Task run 'process_customer-d71' - Finished in state Completed()
13:58:53.564 | INFO    | Task run 'process_customer-833' - Finished in state Completed()
13:58:53.566 | INFO    | Task run 'process_customer-56c' - Finished in state Completed()
13:58:53.571 | INFO    | Task run 'process_customer-

# Orchestrate pipeline code from class
- add `@flow` and `@task` decorators to the respective functions
- run the following:

start mlflow in the folder 03-orchestration/module-3.3-orch

`mlflow server --backend-store-uri sqlite:///mlflow.db`

In [7]:
!python duration-prediction.py --year 2021 --month 1

14:52:04.836 | INFO    | prefect - Starting temporary server on http://127.0.0.1:8472
See https://docs.prefect.io/3.0/manage/self-host#self-host-a-prefect-server for more information on running a dedicated Prefect server.
14:52:08.141 | INFO    | Flow run 'brainy-panther' - Beginning flow run 'brainy-panther' for flow 'run'
14:52:08.798 | INFO    | Task run 'read_dataframe-df8' - Finished in state Completed()
14:52:09.153 | INFO    | Task run 'read_dataframe-08a' - Finished in state Completed()
14:52:09.513 | INFO    | Task run 'create_X-0b7' - Finished in state Completed()
14:52:09.814 | INFO    | Task run 'create_X-9d1' - Finished in state Completed()
  self.starting_round = model.num_boosted_rounds()
14:52:12.762 | INFO    | Task run 'train_model-71e' - [0]       validation-rmse:11.44482
14:52:14.419 | INFO    | Task run 'train_mo

# Parametrizing the Workflow

## General
run the following commands to
1. create a work pool
    - where (locally, Docker, EC2)
    - how (settings, environment variables, resource limits)
    - when (schedule)
2. configure deployment

In [5]:
!pwd

/workspaces/mlops-zoomcamp/03-orchestration/module-3.3-orch


- in the directory `mlops-zoomcamp/03-orchestration/module-3.3-orch` run the following command to configure a deployment

`prefect deploy duration-prediction.py:run -n taxi-1 -p module-03-pool`

1. configure it to take the local code (no pull)
2. configure schedule using cron string: `0 16 1 * *` (At 04:00 PM, on day 1 of the month)
3. configure time zone CET
4. activate schedule: no
5. add another schedule: no
6. save configuration: yes (create `prefect.yaml`)
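
To sanity-check the cron string, the next firing times can be worked out by hand. This is a minimal sketch in plain Python, assuming wall-clock times in the deployment's time zone (no time zone or DST handling). `next_monthly_runs` is a hypothetical helper for illustration, not part of Prefect:

```python
from datetime import datetime


def next_monthly_runs(after: datetime, count: int = 3, hour: int = 16) -> list[datetime]:
    """Next `count` firing times of the cron string `0 <hour> 1 * *` after `after`."""
    runs: list[datetime] = []
    year, month = after.year, after.month
    while len(runs) < count:
        candidate = datetime(year, month, 1, hour, 0)  # day 1 of the month, minute 0
        if candidate > after:
            runs.append(candidate)
        # Step to the next calendar month, rolling over into the next year.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return runs


if __name__ == "__main__":
    for run in next_monthly_runs(datetime(2025, 7, 3, 13, 14)):
        print(run.isoformat(sep=" ", timespec="minutes"))
```

Starting from 3 July 2025, this lists 16:00 on the first of August, September and October 2025.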

## activate monthly schedule

In [16]:
!prefect deployment schedule ls run/taxi-1

13:14:47.265 | INFO    | prefect - Starting temporary server on http://127.0.0.1:8404
See https://docs.prefect.io/3.0/manage/self-host#self-host-a-prefect-server for more information on running a dedicated Prefect server.
                        Deployment Schedules
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ ID                                   ┃ Schedule         ┃ Active ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ 4cbfb9af-9212-42ad-85ec-f31c34257653 │ cron: 0 16 1 * * │ True   │
└──────────────────────────────────────┴──────────────────┴────────┘
13:14:50.681 | INFO    | prefect - Stopping temporary server on http://127.0.0.1:8404


In [17]:
!prefect deployment schedule resume run/taxi-1 4cbfb9af-9212-42ad-85ec-f31c34257653

13:15:00.187 | INFO    | prefect - Starting temporary server on http://127.0.0.1:8974
See https://docs.prefect.io/3.0/manage/self-host#self-host-a-prefect-server for more information on running a dedicated Prefect server.
Deployment schedule 4cbfb9af-9212-42ad-85ec-f31c34257653 is already active
13:15:03.475 | INFO    | prefect - Stopping temporary server on http://127.0.0.1:8974


## change script to use previous 2 months
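
A scheduled run gets no explicit `year`/`month`, so the script has to derive the two preceding months from the run date. This is a minimal sketch of that date arithmetic. `previous_months` is a hypothetical helper, and how the two months are used (e.g. older one for training, newer one for validation) is an assumption, not something the script is confirmed to do:

```python
from datetime import date


def previous_months(reference: date, count: int = 2) -> list[tuple[int, int]]:
    """Return the `count` calendar months before `reference` as (year, month), oldest first."""
    # Put all months on one axis (year * 12 + zero-based month) so that
    # stepping back across a year boundary needs no special case.
    index = reference.year * 12 + (reference.month - 1)
    months = []
    for offset in range(count, 0, -1):
        year, month0 = divmod(index - offset, 12)
        months.append((year, month0 + 1))
    return months


if __name__ == "__main__":
    # A run on 2025-07-01 would use May and June 2025.
    print(previous_months(date(2025, 7, 1)))
    # A run in January reaches back into the previous year.
    print(previous_months(date(2025, 1, 1)))
```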
## if you want to redeploy after code change:
`prefect deploy`
- select configuration interactively

## start a worker which polls the work pool (EXECUTION ENVIRONMENT)
```
export PREFECT_API_URL=http://127.0.0.1:4200/api
prefect worker start -p module-03-pool
```
- now the flow can run, started by the UI or `prefect deployment run run/taxi-1`
- you can list the deployment via `prefect deployment ls`
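
Before starting the worker, it can help to confirm that `PREFECT_API_URL` points at a running server. This is a minimal sketch using only the standard library. `api_is_reachable` is a hypothetical helper, and the `/health` path under the API URL is an assumption about the Prefect server's REST API:

```python
import os
import urllib.request


def api_is_reachable(api_url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `<api_url>/health` answers with HTTP 200."""
    health_url = f"{api_url.rstrip('/')}/health"  # assumed health endpoint
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts all derive from OSError.
        return False


if __name__ == "__main__":
    # Same default as the export above; override via the environment variable.
    url = os.environ.get("PREFECT_API_URL", "http://127.0.0.1:4200/api")
    print(url, "reachable:", api_is_reachable(url))
```

If this reports `False`, the worker would have nothing to poll, so start `prefect server start` first.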

Why do work pools stay online for a short while after the worker is gone?

# Backfilling
- as this does not work with a Prefect schedule using rrule, here is the manual way via the CLI
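
For more than a handful of months, the commands in the following cell can be generated in a loop instead of typed out. This is a minimal sketch. `backfill_commands` is a hypothetical helper, and it only builds the same `prefect deployment run ... --params` invocations used in this notebook:

```python
import json
import shlex


def backfill_commands(deployment: str, year: int, months: range) -> list[list[str]]:
    """Build one `prefect deployment run` command (as an argv list) per month."""
    return [
        ["prefect", "deployment", "run", deployment,
         "--params", json.dumps({"year": year, "month": month})]
        for month in months
    ]


if __name__ == "__main__":
    for cmd in backfill_commands("run/taxi-1", 2024, range(1, 4)):
        # Print the shell-quoted command so it can be reviewed or pasted.
        print(shlex.join(cmd))
        # To create the flow runs for real (needs the prefect CLI on PATH):
        # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as an argv list avoids shell quoting problems with the JSON parameters.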

In [10]:
!prefect deployment run run/taxi-1 --params '{"year": 2024, "month": 1}'
!prefect deployment run run/taxi-1 --params '{"year": 2024, "month": 2}'
!prefect deployment run run/taxi-1 --params '{"year": 2024, "month": 3}'

12:43:39.306 | INFO    | prefect - Starting temporary server on http://127.0.0.1:8758
See https://docs.prefect.io/3.0/manage/self-host#self-host-a-prefect-server for more information on running a dedicated Prefect server.
Creating flow run for deployment 'run/taxi-1'...
Created flow run 'dark-weasel'.
└── UUID: a044f97d-9409-4896-8a32-2fc3c09bed38
└── Parameters: {'year': 2024, 'month': 1}
└── Job Variables: {}
└── Scheduled start time: 2025-07-03 12:43:38 UTC (now)
└── URL: <no dashboard available>
12:43:42.594 | INFO    | prefect - Stopping temporary server on http://127.0.0.1:8758
12:43:48.039 | INFO    | prefect - Starting temporary server on http://127.0.0.1:8704
See https://docs.prefect.io/3.0/manage/self-host#self-host-a-prefect-server for more information on running a dedicated Prefect server.
Creating flow run for deployment 'run/taxi-1'...
Created flow run 'prompt-bison'.
└── UUID: c714d508-f4cb-4884-934f