# Using scrapbook to read data from executed notebooks

Scrapbook lets you record data in a notebook and recall it later. Papermill lets you execute Jupyter notebooks with parameters. We'll use the two together to generate some outcomes and recall them in this parent notebook.

## Our goal

Let's say you have a dataset that covers a span of time, and a notebook that does the following:

1. Read in the dataset with Pandas
2. Plot the entire dataset over time
3. Select a subset of dates and highlight them in the visualization

Let's see how we can accomplish this programmatically with Scrapbook and Papermill.

<a href="highlight_dates.ipynb"><button type="button">Link to input notebook!</button></a>


## Prepare the input notebook

First, we'll prepare the input notebook for use with Papermill. We've done three things to it:

1. Added a `parameters` cell so that we can define inputs at runtime
2. Used the `scrapbook.glue()` function to store how many datapoints were highlighted when the notebook was run
3. Used `scrapbook.glue()` again, this time on the figure, to store the plot we generate for later inspection (it is recalled below as `highlight_dates_fig`)
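
Papermill finds the `parameters` cell through a cell tag. As a minimal sketch, assuming the usual nbformat 4 JSON layout in which tags live under each cell's `metadata.tags`, the hypothetical helper `find_parameters_cell` below checks whether a notebook has been tagged. The in-memory notebook and its default dates are illustrative only and are not taken from `highlight_dates.ipynb`.

```python
from typing import Optional


def find_parameters_cell(notebook: dict) -> Optional[int]:
    """Return the index of the first cell tagged `parameters`, or None."""
    for index, cell in enumerate(notebook.get("cells", [])):
        tags = cell.get("metadata", {}).get("tags", [])
        if "parameters" in tags:
            return index
    return None


# Illustrative in-memory notebook; a real one would be loaded from .ipynb JSON.
sample_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Highlight dates"},
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            # Hypothetical defaults that a run would override.
            "source": 'start_date = "2011-01-01"\nstop_date = "2011-02-01"',
            "outputs": [],
            "execution_count": None,
        },
    ],
}

print(find_parameters_cell(sample_notebook))
```

If the helper returns `None`, the notebook has no tagged cell for the runtime values to target.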

## Execute the notebook

There are two primary ways to execute a notebook with `papermill`:

* A command-line interface (`papermill <input-notebook> <output-notebook>`)
* An interactive interface (`pm.execute_notebook()`)

We'll cover each below.
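
For the command-line route, here is a minimal sketch of how such a call could be assembled. Papermill's `-p` option sets one parameter per use. The helper name `build_papermill_command` and the output filename are illustrative assumptions, and the command is only printed here, not executed.

```python
import shlex


def build_papermill_command(input_path, output_path, parameters):
    """Build the argument list for a papermill command-line run."""
    command = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        # `-p NAME VALUE` sets a single notebook parameter.
        command += ["-p", name, str(value)]
    return command


command = build_papermill_command(
    "./highlight_dates.ipynb",
    "./outcomes/highlight_dates_run_cli.ipynb",  # hypothetical output name
    {"start_date": "2014-01-01", "stop_date": "2015-02-02"},
)
print(shlex.join(command))
```

To actually run it, the list can be handed to `subprocess.run(command, check=True)`, or the printed line can be pasted into a shell.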

Running the notebook generates an output notebook, `highlight_dates_run_<#num>.ipynb`, along with metadata about
the state of the notebook after it was executed.

We can then use the interactive `scrapbook` API to inspect information about the output notebook.

First, we'll inspect some metadata about the notebook. This includes the values of the parameters that were
specified when we ran the notebook, as well as the value that we stored with `sb.glue()`.
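
The parameter values can also be read straight from the output file. This sketch assumes that papermill stores the parameters it injected under `metadata.papermill.parameters` in the notebook JSON. The helper name `read_papermill_parameters` is hypothetical. For glued values, the `scrapbook` API is the better tool.

```python
import json
from pathlib import Path


def read_papermill_parameters(notebook_path):
    """Return the parameters recorded by papermill, or {} if none are present."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    # Assumed layout: notebook["metadata"]["papermill"]["parameters"]
    return notebook.get("metadata", {}).get("papermill", {}).get("parameters", {})
```

Calling `read_papermill_parameters('./outcomes/highlight_dates_run_one.ipynb')` after a run should return that run's parameter dictionary if the assumed layout holds.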

## Interactive interface

Next we'll show execution with the interactive interface. We'll use the `pm.execute_notebook()` function,
and provide a dictionary of new dates that we'd like to use to run the notebook.

Let's first execute the notebook!

In [None]:
import os
import papermill as pm
import scrapbook as sb

In [None]:
# Ensure our outcome folder exists (no error if it is already there)
os.makedirs('./outcomes', exist_ok=True)

In [None]:
# Parameters injected into the notebook's `parameters` cell for the first run
new_dates_one = {'start_date': '2014-01-01', 'stop_date': '2015-02-02'}
pm.execute_notebook('./highlight_dates.ipynb', './outcomes/highlight_dates_run_one.ipynb', parameters=new_dates_one);

In [None]:
# Same input notebook, different date range for the second run
new_dates_two = {'start_date': '2014-07-01', 'stop_date': '2015-10-02'}
pm.execute_notebook('./highlight_dates.ipynb', './outcomes/highlight_dates_run_two.ipynb', parameters=new_dates_two);
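
The two runs above follow the same pattern, so they can be driven from a single mapping. This is a sketch under assumptions: the helper names `plan_runs` and `run_all` are hypothetical, and `pm.execute_notebook` is called with the same arguments as in the cells above.

```python
from pathlib import Path


def plan_runs(runs, input_path="./highlight_dates.ipynb", out_dir="./outcomes"):
    """Pair each run label with its output path and parameter dict."""
    stem = Path(input_path).stem
    return [
        (label, str(Path(out_dir) / f"{stem}_run_{label}.ipynb"), parameters)
        for label, parameters in runs.items()
    ]


def run_all(runs, input_path="./highlight_dates.ipynb", out_dir="./outcomes"):
    """Execute the input notebook once per entry in `runs`."""
    import papermill as pm  # imported here so plan_runs works without papermill

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for _label, output_path, parameters in plan_runs(runs, input_path, out_dir):
        pm.execute_notebook(input_path, output_path, parameters=parameters)


runs = {
    "one": {"start_date": "2014-01-01", "stop_date": "2015-02-02"},
    "two": {"start_date": "2014-07-01", "stop_date": "2015-10-02"},
}

# Preview the planned runs without executing anything.
for label, output_path, parameters in plan_runs(runs):
    print(label, output_path, parameters)
```

Calling `run_all(runs)` would then execute both runs. Keeping the planning step separate makes it easy to check the output paths before any notebook is run.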

Now we can read in the notebook, display the metadata that scrapbook generated, and visualize the figure
we created once more. Note that the highlighted area has changed because we've changed the input
parameters!

In [None]:
out_one = sb.read_notebook('./outcomes/highlight_dates_run_one.ipynb')
out_one.scrap_dataframe

In [None]:
out_one.reglue('highlight_dates_fig')

Again, we can also explore the output notebook:

<a href="./outcomes/highlight_dates_run_one.ipynb"><button type="button">Link to output notebook!</button></a>


In [None]:
out_two = sb.read_notebook('./outcomes/highlight_dates_run_two.ipynb')
out_two.scrap_dataframe

In [None]:
out_two.reglue('highlight_dates_fig')

And the output notebook for the second run:

<a href="./outcomes/highlight_dates_run_two.ipynb"><button type="button">Link to output notebook!</button></a>

We can also collect all the notebooks from a path and report on their values as a group.

In [None]:
import scrapbook as sb
# Read every notebook in the folder into a single collection
book = sb.read_notebooks('./outcomes')
book.scraps_report(include_data=True)