In [1]:
# | default_exp workflow
%load_ext lab_black
# nb_black if running in jupyter
%load_ext autoreload
# automatically reload (local) python modules if they are updated
%autoreload 2

In [2]:
# | hide
from nbdev.showdoc import *

# Workflow

    Define workflow for automatically updating, training and deploying your ML model!


***input:*** data, model & loss notebooks and related modules

***output:*** script for executing the ML model update workflow

***description:***

An ML model update workflow lets you automatically reload your data and then train, evaluate, and deploy your model.
Note that by following the notebook templates you have already done most of the work - the notebooks **are** the workflow!

So, in this notebook you define a script that automatically executes the other notebooks with the [papermill](https://papermill.readthedocs.io/) tool. Note that you can pass parameters to the notebooks!

You can define either a static workflow, where every step is re-run on every update,
or a dynamic workflow, where only the steps affected by changes since the last model update are re-run.
For dynamic workflows we encourage utilizing the [Snakemake](https://snakemake.readthedocs.io/) tool.
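
To make the static/dynamic distinction concrete, here is a minimal sketch of the idea behind a dynamic workflow, based only on file modification times. It is not Snakemake and not part of this template: the helper names `is_stale` and `plan_dynamic_run`, and the assumption that the notebooks form a simple linear chain, are made up for illustration.

```python
from pathlib import Path


def is_stale(source, output):
    """Return True if the output is missing or older than its source."""
    if not output.exists():
        return True
    return source.stat().st_mtime > output.stat().st_mtime


def plan_dynamic_run(notebooks, source_dir, output_dir):
    """List the notebooks that need re-running, in order.

    Assumes a linear chain: once one notebook is stale,
    every notebook after it is re-run as well.
    """
    to_run = []
    upstream_rerun = False
    for name in notebooks:
        source = Path(source_dir) / name
        output = Path(output_dir) / ("_" + name)
        if upstream_rerun or is_stale(source, output):
            to_run.append(name)
            upstream_rerun = True
    return to_run
```

The returned list could be passed to the `papermill.execute_notebook` loop in place of the full notebook list. Snakemake tracks dependencies far more thoroughly, so treat this only as an illustration of the principle.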

Here we present a super simple static workflow example that you can build upon in your project. 

Edit this and other text cells to describe your project. 

Remember that you can use the `#| export` directive to export cell commands to `[your_module]/workflow.py`.

## Import relevant modules

In [3]:
from datetime import datetime
import papermill
from pathlib import Path
import os

# your code here

## Define notebook parameters

In [4]:
# this cell is tagged with 'parameters'
seed = 0
backup_name_base = "run_"
timestamp = True
run_id = "setup_1"
# your code here

Make direct derivations from the parameters:

In [5]:
# your code here
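
As one possible derivation, the backup folder name used later in the workflow can be computed from the parameters above. The function name `build_run_name` and its `now` argument are hypothetical, added here only to make the logic easy to test; the naming rule itself mirrors the workflow cell.

```python
from datetime import datetime


def build_run_name(backup_name_base, run_id, timestamp, now=None):
    """Derive the backup folder name from the notebook parameters."""
    run_name = backup_name_base + run_id + "_"
    if timestamp:
        # 'now' can be injected to make the result reproducible
        now = now or datetime.now()
        run_name += now.strftime("%Y-%m-%d_%H-%M-%S")
    # without a timestamp, drop the trailing separator
    return run_name.rstrip("_")


print(build_run_name("run_", "setup_1", timestamp=False))
# prints: run_setup_1
```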

## Define workflow

Here we present a tiny example you can try running yourself and then extend to your needs.

Note that if you run `nbdev_export`, the script is exported to `[your_module]/workflow.py`.

Then, you can run `python [your_module]/workflow.py` to run the workflow automatically!

In [6]:
"""
A workflow to re-run your machine learning workflow automatically.

This example script will
- rebuild your python module
- run data notebook to reload and clean data
- run model notebook to software-test your model
- run loss notebook to train and evaluate your model with full data,
    and save or deploy it for further use

Feel free to edit!
"""

# create name for backup folder
run_name = backup_name_base + run_id + "_"
if timestamp:
    run_name += datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
run_name = run_name.rstrip("_")

# create backup folder if it does not exist
cwd = Path.cwd()
backup_path = cwd.parent / "local_data" / "ml_pipe_backups" / run_name

try:
    backup_path.mkdir(parents=True, exist_ok=False)
except FileExistsError:  # do not overwrite
    pass

# make sure changes are updated to module
# (this will do nothing if you run the workflow from inside a notebook)
os.system("nbdev_export")

# run workflow
for notebook in [
    "00_data_etl.ipynb",
    "01_model_class.ipynb",
    "02_train_test_validate.ipynb",
]:
    papermill.execute_notebook(
        notebook,  # this notebook will be executed
        backup_path
        / ("_" + notebook),  # this is where the executed notebook will be saved
        # (notebooks named with a '_' prefix are ignored by nbdev when exporting the library and building docs!)
        parameters={"seed": 1},  # you can change notebook parameters
        # kernel_name="python38myenv",
    )  # note: change kernel according to your project setup!

Input notebook does not contain a cell with tag 'parameters'
  from .autonotebook import tqdm as notebook_tqdm
Executing: 100%|██████████| 30/30 [00:01<00:00, 19.31cell/s]
Executing: 100%|██████████| 25/25 [00:01<00:00, 20.83cell/s]
Executing: 100%|██████████| 31/31 [00:01<00:00, 25.48cell/s]
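
The first line of the output above is papermill's warning that an input notebook has no cell tagged `parameters`. In that case papermill still injects the parameters, but in a new cell at the top of the notebook, so later assignments in the notebook can silently override them. Below is a small sketch for checking the tag ahead of time, using only the standard library (an `.ipynb` file is JSON); the helper name `has_parameters_cell` is hypothetical.

```python
import json
from pathlib import Path


def has_parameters_cell(notebook_path):
    """Return True if any cell in the notebook carries the 'parameters' tag."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    return any(
        "parameters" in cell.get("metadata", {}).get("tags", [])
        for cell in nb.get("cells", [])
    )
```

Calling this on each notebook in the list before the loop would show which ones need a tagged cell for the `seed` parameter to take effect as intended.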


You can also define your workflow in a language other than Python and write it to a file from this notebook using the `%%writefile` magic.
This way your script is still included in the documentation without copy-pasting.
You can also add the generated script to `.gitignore` to avoid tracking it twice.

## You can now move on to the API notebook!