# Using papermill to execute notebooks

Papermill lets you execute the contents of Jupyter notebooks. It also lets you define parameters as inputs so that you can build pipelines of execution. We'll make use of scrapbook to record data in each notebook and print the results here.

This is a demo to show off some basic functionality of Papermill. We'll demonstrate how to prepare a notebook for
execution with Papermill, how to execute the notebook with various APIs, and how to look at the results.

## Our goal

Let's say that we have a dataset that covers a span of time, and a notebook that does the following
things:

1. Read in the dataset with Pandas
2. Plot the entire dataset over time
3. Select a subset of dates and highlight them in the visualization

Let's see how we can accomplish this programmatically with papermill and scrapbook.

<a href="highlight_dates.ipynb"><button type="button">Link to input notebook!</button></a>


## Prepare the input notebook

First, we'll prepare the input notebook for use with papermill. We've done three things to it:

1. Added a `parameters` cell so that we can define inputs at runtime
2. Used the `scrapbook.glue()` function to store how many data points were highlighted when the notebook was run
3. Used the `scrapbook.glue()` function to store the plot we generate for later inspection.
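
To make the first step concrete: papermill locates the parameters cell by a `parameters` tag in the cell's metadata. The snippet below is a minimal sketch of what such a cell looks like in the notebook's JSON, assuming the nbformat 4 layout. The helper name `make_parameters_cell` and the default dates are hypothetical, since the real defaults live in the input notebook and are not shown here.

```python
import json


def make_parameters_cell(defaults):
    """Build a code cell (nbformat 4 JSON layout) tagged `parameters`."""
    # One assignment per parameter, e.g. start_date = '2010-01-01'
    source = "\n".join(f"{name} = {value!r}" for name, value in defaults.items())
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["parameters"]},  # the tag papermill looks for
        "outputs": [],
        "source": source,
    }


# Hypothetical defaults, for illustration only.
cell = make_parameters_cell({"start_date": "2010-01-01", "stop_date": "2010-06-01"})
print(json.dumps(cell, indent=2))
```

In practice you would normally add the tag through the Jupyter interface rather than by editing JSON; the sketch only shows what ends up in the file.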

## Execute the notebook

There are two primary ways to execute a notebook with `papermill`:

* A command-line interface (`papermill <input-notebook> <output-notebook>`)
* An interactive interface (`pm.execute_notebook()`)

We'll cover each below.

### Command-line interface

First we'll use the command line interface. By supplying the parameters with `-p <param-name> <param-value>` we
override the defaults specified in the input notebook (in the cell with the `parameters` tag).

In [None]:
%%bash
papermill highlight_dates.ipynb ./highlight_dates_run_cli.ipynb -p start_date 2011-01-01 -p stop_date 2014-02-02

This generated the output notebook `highlight_dates_run_cli.ipynb`, along with metadata about the state of
the notebook after it was executed.
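
If the parameters come from a script rather than being typed by hand, the same command line can be assembled from a dictionary. This is a sketch under stated assumptions: `build_papermill_command` is a hypothetical helper (not part of papermill), and it only uses the `-p <param-name> <param-value>` form described above.

```python
import shlex


def build_papermill_command(input_nb, output_nb, params):
    """Return the argv list for a papermill CLI call with `-p` overrides."""
    argv = ["papermill", input_nb, output_nb]
    for name, value in params.items():
        argv += ["-p", name, str(value)]
    return argv


argv = build_papermill_command(
    "highlight_dates.ipynb",
    "./highlight_dates_run_cli.ipynb",
    {"start_date": "2011-01-01", "stop_date": "2014-02-02"},
)
print(shlex.join(argv))

# To actually run it (assumes papermill is installed and on PATH):
# import subprocess
# subprocess.run(argv, check=True)
```

Building an argument list instead of a single string avoids shell-quoting problems if a parameter value ever contains spaces.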

We can then use the interactive `scrapbook` API to inspect the output notebook.

First, we'll inspect some metadata about the notebook. This includes the values of parameters that were
specified when we ran the notebook, as well as the value that we stored with `sb.glue()`.

In [None]:
import scrapbook as sb
out = sb.read_notebook('./highlight_dates_run_cli.ipynb')
out.papermill_dataframe
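
The recorded parameters can also be read without `scrapbook`, because an `.ipynb` file is plain JSON. The sketch below assumes papermill stores them under `metadata.papermill.parameters` in the output notebook, which is the layout in the versions we are aware of. `recorded_parameters` is a hypothetical helper, and the `example` dictionary is a trimmed stand-in for the real file contents.

```python
import json


def recorded_parameters(notebook_json):
    """Return the parameters recorded in the notebook metadata, or {} if absent."""
    return notebook_json.get("metadata", {}).get("papermill", {}).get("parameters", {})


# Trimmed, hypothetical stand-in for the parsed output notebook. With the real file:
# with open("./highlight_dates_run_cli.ipynb") as f:
#     example = json.load(f)
example = {
    "metadata": {
        "papermill": {
            "parameters": {"start_date": "2011-01-01", "stop_date": "2014-02-02"}
        }
    },
    "cells": [],
}
print(recorded_parameters(example))
```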

We can also display the output that we stored from the input notebook (with the `sb.glue(encoder='display')` function).

In [None]:
out.reglue('highlight_dates_fig')

Finally, you can also explore the output notebook, which has all of the above information and more embedded in it.

<a href="highlight_dates_run_cli.ipynb"><button type="button">Link to output notebook!</button></a>


### Interactive interface

Next we'll show execution with the interactive interface. We'll use the `pm.execute_notebook()` function,
and provide a dictionary of new dates that we'd like to use to run the notebook.

Let's first execute the notebook!

In [None]:
import papermill as pm
# Parameter overrides, passed as a dictionary instead of `-p` flags
new_dates = {'start_date': '2014-01-01', 'stop_date': '2015-02-02'}
# Same kind of call as the CLI command above, but with different dates
pm.execute_notebook('./highlight_dates.ipynb', './highlight_dates_run_interactive.ipynb', new_dates);

Now we can read in the notebook, display the metadata that was generated by papermill, and visualize the figure
we've created once more. Note that the highlighted area has changed because we've changed the input
parameters!

In [None]:
out = sb.read_notebook('./highlight_dates_run_interactive.ipynb')
out.papermill_dataframe

In [None]:
out.reglue('highlight_dates_fig')

Again, we can also explore the output notebook:

<a href="highlight_dates_run_interactive.ipynb"><button type="button">Link to output notebook!</button></a>
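
Because `pm.execute_notebook()` takes its parameters as a dictionary, it is easy to run the same notebook over several date ranges, which is the kind of pipeline mentioned at the start. The following is a minimal sketch, not a prescribed workflow: `plan_runs` and `execute_runs` are hypothetical helpers, and the output file naming scheme and the first date range are assumptions made for illustration.

```python
def plan_runs(input_nb, date_ranges):
    """Return (output_path, parameters) pairs, one per date range."""
    stem = input_nb.rsplit(".ipynb", 1)[0]
    runs = []
    for start, stop in date_ranges:
        output_nb = f"{stem}_run_{start}_{stop}.ipynb"
        runs.append((output_nb, {"start_date": start, "stop_date": stop}))
    return runs


def execute_runs(input_nb, runs):
    """Execute each planned run; requires papermill to be installed."""
    import papermill as pm  # imported lazily so planning works without it

    for output_nb, params in runs:
        pm.execute_notebook(input_nb, output_nb, parameters=params)


runs = plan_runs(
    "./highlight_dates.ipynb",
    [("2011-01-01", "2012-01-01"), ("2014-01-01", "2015-02-02")],
)
for output_nb, params in runs:
    print(output_nb, params)

# execute_runs("./highlight_dates.ipynb", runs)  # uncomment to run the notebooks
```

Each output notebook can then be read back with `sb.read_notebook()` in the same way as the single runs above.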
