# 1st Lesson: [Negative Engineering and Workflow Orchestration](https://www.youtube.com/watch?v=eKzCjNXoCTc&list=PL3MmuxUbc_hIUISrluw_A7wDSmfOhErJK&index=20)

### Orchestration with Prefect - [Docs](https://orion-docs.prefect.io/)

- Automate the different steps: workflow orchestration with Prefect;
- Machine Learning Pipeline with Kubeflow;

**Extra:** [Fugue](https://github.com/fugue-project/fugue) - unified interface for distributed computing that lets users execute Python, pandas, and SQL code on Spark and Dask without rewrites;


### Workflow Orchestration

- A set of tools that schedule and monitor the work one wants to accomplish
- Minimize the impact of errors that are "normal" in a workflow
	- Failure mechanisms

**Negative Engineering**

About 90% of engineering time is spent on negative engineering, i.e. coding defensively against failure:

- Retries when APIs go down;
- Malformed Data;
- Notifications;
- Observability into Failure;
- Conditional Failure Logic;
- Timeouts
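
To make the first item concrete, here is a minimal sketch of the kind of retry logic one ends up writing by hand without an orchestrator. It uses only the standard library; the `retry` decorator and the flaky `fetch_data` function are hypothetical names for illustration, not part of Prefect.

```python
import time
from functools import wraps


def retry(max_attempts=3, delay_seconds=0.0, exceptions=(Exception,)):
    """Re-run the wrapped function until it succeeds or attempts run out."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the failure
                    time.sleep(delay_seconds)  # wait before trying again
        return wrapper
    return decorator


calls = {"count": 0}


@retry(max_attempts=3, exceptions=(ConnectionError,))
def fetch_data():
    """Hypothetical flaky API call: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("API is down")
    return {"status": "ok"}


result = fetch_data()
print(result, "after", calls["count"], "attempts")
```

Every item in the list above needs similar defensive code, which is the work an orchestrator is meant to take over.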

**NOTE:** Prefect helps to reduce the share of time spent on negative engineering to 70-80%, leaving you with more time to work on modelling.

# 2nd Lesson: [Introduction to Prefect 2.0](https://www.youtube.com/watch?v=Yb6NJwI7bXw&list=PL3MmuxUbc_hIUISrluw_A7wDSmfOhErJK&index=20)

#### Introduction to Prefect - Goal: eliminate negative engineering

- open-source;
- python-based;
- modern data stack;
- native dask integration;
- very active community;
- prefect cloud/server; - AWS
- prefect orion (prefect 2.0) - AWS


#### Filesystem organization:
- ``orchestration.py`` - is the goal;
- ``requirements.txt``;

### Steps:
1. Copy last week's duration-prediction code into a Python script named ``model_training.py``;
2. Test it in your environment to check that it works properly;
3. Set the MLflow tracking URI: ``mlflow.set_tracking_uri("sqlite:///backend.db")``;
4. Run in the terminal: ``mlflow server --backend-store-uri sqlite:///backend.db --default-artifact-root ./artifacts_local``;
5. Run the `model_training.py` script in the terminal: ```python model_training.py```

In [1]:
! ls -la

total 244
drwxr-xr-x 7 fdelca fdelca   4096 Jun  2 15:11 .
drwxr-xr-x 8 fdelca fdelca   4096 Jun  2 11:18 ..
-rw-r--r-- 1 fdelca fdelca     67 Jun  2 11:24 .gitignore
drwxr-xr-x 2 fdelca fdelca   4096 Jun  2 11:27 .ipynb_checkpoints
-rw-r--r-- 1 fdelca fdelca   1394 Jun  2 11:24 README.md
-rw-r--r-- 1 fdelca fdelca  16106 Jun  2 15:09 Week3-LearningNotes.ipynb
-rw-r--r-- 1 fdelca fdelca 143360 Jun  2 15:11 backend.db
drwxr-xr-x 2 fdelca fdelca   4096 Jun  2 12:41 data
-rw-r--r-- 1 fdelca fdelca     11 Jun  2 11:24 homework.md
drwxr-xr-x 2 fdelca fdelca   4096 Jun  2 11:24 images
-rw-r--r-- 1 fdelca fdelca    910 Jun  2 11:24 meta.json
drwxr-xr-x 3 fdelca fdelca   4096 Jun  2 15:11 mlruns
-rw-r--r-- 1 fdelca fdelca   6185 Jun  2 15:07 model_training.py
drwxr-xr-x 2 fdelca fdelca   4096 Jun  2 15:11 models
-rw-r--r-- 1 fdelca fdelca   5051 Jun  2 11:24 orchestration.py
-rw-r--r-- 1 fdelca fdelca   5023 Jun  2 11:24 prefect_deploy.py
-rw-r--r-- 1 fdelca fdelca   4674 Jun 

# 3rd Lesson: [First Prefect flow and basics](https://www.youtube.com/watch?v=MCFpURG506w&list=PL3MmuxUbc_hIUISrluw_A7wDSmfOhErJK&index=21)

Topics:
- Sample use-cases with MLflow;
- Installing prefect on local;
- Add them all to a flow and run locally;
- Turn functions into tasks;
- Parameters and Type Checking;
- Show the Local UI and the flow run information

In [4]:
# ! pip install prefect==2.0b5

**Notes:**

1. Once a function is decorated with `@task`, calling it returns a future rather than the value itself, so we need to call the ``.result()`` method on what it returns (e.g. the function ``add_features()`` in ``prefect_flow.py``);

2. One can start the local UI by running `prefect orion start` in the terminal, similar to `mlflow`;

3. By default **Prefect** will try to run the tasks concurrently, so tasks that depend on each other may not run properly. To control this, pass a task runner to the ``@flow`` decorator, like this:
    - ``@flow(task_runner=SequentialTaskRunner())`` or ``@flow(task_runner=ConcurrentTaskRunner())``
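
The `.result()` pattern from note 1 can be illustrated with an analogy from the standard library. This is not Prefect's API: it is a sketch using `concurrent.futures`, where submitting work also returns a future that has to be resolved before the value can be used. `read_data` is a hypothetical helper, and `add_features` here is a toy stand-in for the function in the notes.

```python
from concurrent.futures import ThreadPoolExecutor


def read_data(n):
    """Hypothetical loader: returns n rows."""
    return list(range(n))


def add_features(rows):
    """Toy stand-in for a feature-engineering step."""
    return [r * 2 for r in rows]


with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(read_data, 3)  # a Future, not the list itself
    rows = future.result()              # block until the value is ready
    # The next step depends on `rows`, so it can only start after .result()
    features = pool.submit(add_features, rows).result()

print(features)
```

Resolving each future before the next dependent step starts is what forces the steps to run in order.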

# 4th Lesson: [Remote Prefect Orion deployment](https://www.youtube.com/watch?v=ComkSIAB0k4&list=PL3MmuxUbc_hIUISrluw_A7wDSmfOhErJK&index=22)

1. [Create an instance](https://www.youtube.com/watch?v=IXSiYkP23zo&list=PL3MmuxUbc_hIUISrluw_A7wDSmfOhErJK&index=3)
2. Change the inbound rules, as mentioned in the lesson's video;
3. Access the instance remotely.
    - If there are problems with key-pair permissions, use this command: `sudo chmod 600 ~/.ssh/id_rsa`;
    - Connect to the machine with this command: `ssh -i <PATH_TO_KEY_PAIR> ubuntu@<PUBLIC_IP_ADDRESS>` or define a `config` file inside the `.ssh` directory;
4. Install Prefect and run the first command from this [link](https://discourse.prefect.io/t/hosting-an-orion-instance-on-a-cloud-vm/967)
5. Check that it is set correctly, using `prefect config view`
6. Then start Prefect: `prefect orion start --host 0.0.0.0`

Prefect is now ready to accept connections from your local computer.

Now, on our local machine, we set the remote Prefect location, so Prefect knows where to log everything:
- `prefect config set PREFECT_API_URL="http://<external-ip>:4200/api"`
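
Before setting the URL, it can help to check that the port is actually reachable, since a closed inbound rule is a common cause of connection problems. The sketch below uses only the standard library. `build_api_url` and `is_port_open` are hypothetical helpers, not Prefect functions, and the demonstration probes a throwaway local listener rather than a real Orion server.

```python
import socket


def build_api_url(host, port=4200):
    """Build the API URL in the form used by PREFECT_API_URL above."""
    return f"http://{host}:{port}/api"


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Local demonstration: listen on a free port and probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
reachable = is_port_open("127.0.0.1", demo_port)
listener.close()

print(build_api_url("<external-ip>"), reachable)
```

In practice one would call `is_port_open("<external-ip>", 4200)` from the local machine. A `False` result suggests checking the inbound rules or the `--host 0.0.0.0` flag.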


After pointing your local computer at the remote address, you can run the script and the workflow runs will be recorded on the remote machine.

# 5th Lesson: [Deployment of Prefect flow](https://www.youtube.com/watch?v=xw9JfaWPPps&list=PL3MmuxUbc_hIUISrluw_A7wDSmfOhErJK&index=24)

- Show remote hosted UI;
- Deploy flow on schedule on remote Prefect Orion;
- Work queues and agents;
- Next Steps;

### Storage

Flows need to be stored somewhere, so we must define a storage (on the remote machine):
1. Check whether a storage is already defined: `prefect storage ls`;
2. Run the following command to create a storage:
    - `prefect storage create`;
    - Then choose the type of storage; in this tutorial we use the local filesystem (option `3`);
    - Define a path to store the flows (note: it should be the full path): `.prefect`

**NOTE:** One can run it on Docker, Kubernetes, or locally.

3. Then change the script so that it can be deployed. Add this to the end of your script (locally):

```python
# Setting deployment specifications
from datetime import timedelta

from prefect.deployments import DeploymentSpec
from prefect.flow_runners import SubprocessFlowRunner
from prefect.orion.schemas.schedules import IntervalSchedule

DeploymentSpec(
    flow=main,  # the @flow-decorated function defined earlier in this script
    name='model_training',
    schedule=IntervalSchedule(interval=timedelta(minutes=5)),  # run every 5 minutes
    flow_runner=SubprocessFlowRunner(),  # run as a local subprocess, without a container
    tags=['ml'],
)
```

4. Afterwards, create the deployment with: `prefect deployment create prefect_deploy.py` (locally)
    - Runs will be scheduled every `5 minutes`;
5. Finally, create a `work queue` to pick up the scheduled flow runs and start an agent on it, passing the work queue ID (the ID will differ in your setup):
    - `prefect agent start a8aa05de-7da0-4f96-ba7f-99f706eb63a1`
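
To see what a 5-minute interval schedule means in terms of run times, here is a minimal standard-library sketch. It is not Prefect's scheduler: `next_runs` is a hypothetical helper, the start time is arbitrary, and treating the first run as one interval after the start is an assumption made for illustration.

```python
from datetime import datetime, timedelta


def next_runs(start, interval, count):
    """Return `count` run times, spaced `interval` apart, after `start`."""
    return [start + interval * i for i in range(1, count + 1)]


start = datetime(2022, 6, 2, 15, 0)  # arbitrary example start time
runs = next_runs(start, timedelta(minutes=5), 3)

for run in runs:
    print(run.isoformat())
```

Scheduled runs like these wait in the work queue until an agent picks them up and executes them.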