# PersIst Extension Tutorial

PersIst is a JupyterLab extension that makes interactions with cell outputs persistent. Persist tracks interactions with compatible outputs in a provenance graph.

Capturing the interaction provenance allows us to save our interactions with the notebook and replay them at a later point. You can even rerun the interactions if the value of the variable in the persisted cell has changed. Persist will try its best to apply the interactions to the new variable value.
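
To make the idea of replaying interactions on changed data concrete, here is a minimal conceptual sketch in plain Python. It is not Persist's implementation: the interaction records (`rename_column`, `filter_range`) and the helpers `apply_interaction` and `replay` are hypothetical names invented for illustration. The sketch assumes a "best effort" policy in which a step that no longer fits the data is skipped.

```python
# Conceptual sketch only; the record format and function names are hypothetical.

def apply_interaction(rows, step):
    """Apply one recorded interaction, skipping it if it no longer fits the data."""
    if step["type"] == "rename_column":
        old, new = step["old"], step["new"]
        if not rows or old not in rows[0]:
            return rows  # best effort: the column is gone, leave the data alone
        return [{(new if k == old else k): v for k, v in r.items()} for r in rows]
    if step["type"] == "filter_range":
        col, lo, hi = step["column"], step["lo"], step["hi"]
        if not rows or col not in rows[0]:
            return rows  # best effort: nothing to filter on
        return [r for r in rows if lo <= r[col] <= hi]
    return rows  # unknown interaction types are ignored


def replay(rows, history):
    """Apply every recorded interaction in order."""
    for step in history:
        rows = apply_interaction(rows, step)
    return rows


# A recorded history: rename a column, then keep rows inside a value range.
history = [
    {"type": "rename_column", "old": "mpg", "new": "MPG"},
    {"type": "filter_range", "column": "MPG", "lo": 15, "hi": 20},
]

# Hand-written sample rows standing in for the old and new variable values.
old_data = [{"mpg": 18.0, "weight": 3504}, {"mpg": 30.0, "weight": 2000}]
new_data = [
    {"mpg": 16.0, "weight": 3433},
    {"mpg": 25.0, "weight": 2600},
    {"mpg": 19.0, "weight": 3100},
]

old_result = replay(old_data, history)
new_result = replay(new_data, history)
```

The same `history` produces a sensible result for both inputs because each step is expressed in terms of column names and value ranges rather than specific rows.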

You can import and enable the PersIst extension as follows:

```python
import persist_ext as PR
```

In [1]:
import persist_ext as PR

from vega_datasets import data
import altair as alt
import pandas as pd

## Supported Outputs

Currently, Persist supports two major output types: Vega-Altair plots and interactive data tables.

### **Vega-Altair**

You can create any interactive Vega-Altair plot and pass it to `PR.PersistChart` to create a persistent chart.

For example:
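```python
# df is a placeholder: any pandas DataFrame with quantitative columns "X" and "Y"

# Interval (brush) selection that drives the conditional color encoding
selector = alt.selection_interval(name="selector")

chart = alt.Chart(df).mark_point().encode(
    x="X:Q",
    y="Y:Q",
    color=alt.condition(selector, alt.value("steelblue"), alt.value("gray"))
).add_params(selector)

# Wrap the Altair chart so that interactions with it are tracked
PR.PersistChart(chart)
```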

The above snippet of code will create an interactive persist scatterplot with brush interactions.

Alternatively, Persist comes with a `plot` module that provides helper functions for creating simple plots such as scatterplots and bar charts.

In [10]:
df = data.cars()

selector = alt.selection_interval(name="selector")

chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(selector, "Origin:N", alt.value("gray"))
).add_params(selector)

PR.PersistChart(chart, data=df)

PersistWidget(data_values=[{'__id_column': '1', 'Name': 'chevrolet chevelle malibu', 'Miles_per_Gallon': 18.0,…


You can see that the output area looks different from the usual output for Vega-Altair. We have a toolbar at the top and a sidebar on the right. The toolbar gives us a host of interactions we can perform to manipulate our plots.

Persist keeps track of your interactions and stores them in a graph called the provenance graph. The first tab in the sidebar, titled `Trrack`, visualizes the provenance for all your interactions in the cell output.

In the second tab you can see the text summary of everything you have done to reach the current output state.
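
As a rough mental model of the provenance graph and the text summary, the sketch below stores interactions as a tree with a pointer to the current node. The summary is the list of labels on the path from the root to that node. The `Node` and `ProvenanceGraph` classes and the interaction labels are hypothetical and written only for illustration; they are not taken from Persist or Trrack.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One recorded interaction (hypothetical structure)."""
    label: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list = field(default_factory=list, repr=False)


class ProvenanceGraph:
    """A tree of interactions plus a pointer to the current state."""

    def __init__(self):
        self.root = Node("Root")
        self.current = self.root

    def record(self, label):
        # A new interaction becomes a child of the current node.
        node = Node(label, parent=self.current)
        self.current.children.append(node)
        self.current = node
        return node

    def go_to(self, node):
        # Comparable to clicking a node in the provenance visualization.
        self.current = node

    def summary(self):
        # Walk from the current node up to the root, then reverse the labels.
        labels = []
        node = self.current
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return labels[::-1]


graph = ProvenanceGraph()
brush = graph.record("Select points")
graph.record("Rename Miles_per_Gallon to MPG")

# Jump back to the selection and try something else: this creates a branch.
graph.go_to(brush)
graph.record("Drop column Cylinders")

current_summary = graph.summary()
```

Going back to an earlier node and recording a new interaction does not erase the earlier branch in this model; both children of `brush` remain reachable.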


In [4]:
PR.PersistTable(df)

PersistWidget(data_values=[{'__id_column': '1', 'Name': 'chevrolet chevelle malibu', 'Miles_per_Gallon': 18.0,…

In [11]:
persist_df_2.head()

| | car size | Name | Displacementttt | Miles_per_Gallon | Cylinders | Horsepower | Weight_in_lbs | Acceleration | Year | Origin |
|---|---|---|---|---|---|---|---|---|---|---|
| 124 | fiat | fiat 128 | 68.0 | 29.0 | 4 | 49 | 1867 | 19.5 | 94694400000 | Japan |
| 118 | No Assignment | maxda rx3 | 70.0 | 18.0 | 3 | 90 | 2124 | 13.5 | 94694400000 | Japan |
| 341 | No Assignment | mazda rx-7 gs | 70.0 | 23.7 | 3 | 100 | 2420 | 12.5 | 315532800000 | Japan |
| 78 | No Assignment | mazda rx2 coupe | 70.0 | 19.0 | 3 | 97 | 2330 | 13.5 | 63072000000 | Japan |
| 60 | No Assignment | toyota corolla 1200 | 71.0 | 31.0 | 4 | 65 | 1773 | 19.0 | 31536000000 | Japan |


In [13]:
asd

| | car size | Name | Displacementttt | Miles_per_Gallon | Cylinders | Horsepower | Weight_in_lbs | Acceleration | Year | Origin |
|---|---|---|---|---|---|---|---|---|---|---|
| 124 | fiat | fiat 128 | 68.0 | 29.0 | 4 | 49 | 1867 | 19.5 | 94694400000 | Japan |
| 118 | No Assignment | maxda rx3 | 70.0 | 18.0 | 3 | 90 | 2124 | 13.5 | 94694400000 | Japan |
| 341 | No Assignment | mazda rx-7 gs | 70.0 | 23.7 | 3 | 100 | 2420 | 12.5 | 315532800000 | Japan |
| 78 | No Assignment | mazda rx2 coupe | 70.0 | 19.0 | 3 | 97 | 2330 | 13.5 | 63072000000 | Japan |
| 60 | No Assignment | toyota corolla 1200 | 71.0 | 31.0 | 4 | 65 | 1773 | 19.0 | 31536000000 | Japan |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 101 | No Assignment | chrysler new yorker brougham | 440.0 | 13.0 | 8 | 215 | 4735 | 11.0 | 94694400000 | USA |
| 6 | No Assignment | chevrolet impala | 454.0 | 14.0 | 8 | 220 | 4354 | 9.0 | 0 | USA |
| 102 | No Assignment | buick electra 225 custom | 455.0 | 12.0 | 8 | 225 | 4951 | 11.0 | 94694400000 | USA |
| 8 | No Assignment | pontiac catalina | 455.0 | 14.0 | 8 | 225 | 4425 | 10.0 | 0 | USA |


In [9]:
persist_df_2.head()

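| | Miles_per_Gallon | Cylinders | Displacement | Horsepower | Weight_in_lbs | Acceleration | Year | Origin |
|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 0 | USA |
| 1 | | 82 | 350.0 | 165 | 3693 | 11.5 | 0 | USA |
| 2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 0 | USA |
| 3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 0 | USA |
| 4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 0 | USA |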


#### Interactions & Persistence
The next cell contains the same scatterplot. Try interactively selecting points; you can repeat the selections as many times as you like. Then try clicking on different nodes in the provenance visualization.

In [10]:
df = data.cars()

selector = alt.selection_interval(name="selector")

chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(selector, "Origin:N", alt.value("gray"))
).add_params(selector)

PR.PersistChart(chart, data=df)

PersistWidget(data_values=[{'__id_column': '1', 'Name': 'chevrolet chevelle malibu', 'Miles_per_Gallon': 18.0,…

Run the next cell and you will see the history already populated with selections. Persist saves your interactions and loads them the next time you run the cell. You can use the first two buttons in the header to `undo` or `redo` your interactions.

In [11]:
df = data.cars()

selector = alt.selection_interval(name="selector")

chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(selector, "Origin:N", alt.value("gray"))
).add_params(selector)

PR.PersistChart(chart, data=df)

PersistWidget(data_values=[{'__id_column': '1', 'Name': 'chevrolet chevelle malibu', 'Miles_per_Gallon': 18.0,…
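
The save, reload, undo, and redo behavior described above can be pictured with a small sketch. This is a deliberately simplified, linear model with hypothetical names (`History`, `dumps`, `loads`), and it assumes interactions can be stored as JSON. It does not show how Persist stores its state, and unlike a provenance graph it discards undone steps when a new interaction is recorded.

```python
import json


class History:
    """Linear interaction history with an undo/redo position (hypothetical)."""

    def __init__(self, steps=None, position=None):
        self.steps = list(steps or [])
        self.position = len(self.steps) if position is None else position

    def record(self, step):
        # Simplification: recording after an undo drops the undone steps.
        del self.steps[self.position:]
        self.steps.append(step)
        self.position += 1

    def undo(self):
        self.position = max(0, self.position - 1)

    def redo(self):
        self.position = min(len(self.steps), self.position + 1)

    def active(self):
        # Steps that currently contribute to the output state.
        return self.steps[:self.position]

    def dumps(self):
        return json.dumps({"steps": self.steps, "position": self.position})

    @classmethod
    def loads(cls, text):
        state = json.loads(text)
        return cls(state["steps"], state["position"])


history = History()
history.record("select points")
history.record("rename column")
history.undo()
after_undo = history.active()

history.redo()
after_redo = history.active()

# Serialize, then restore as if the cell had been run again later.
saved = history.dumps()
restored = History.loads(saved)
```

After the round trip, `restored.active()` matches `history.active()`, which is the property that lets a rerun cell show the earlier interactions.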

#### Rename columns

You can use the `Rename Column` button to select and rename a column. Try renaming `Miles_per_Gallon` to `MPG` in the next cell.

In [12]:
df = data.cars()

selector = alt.selection_interval(name="selector")

chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(selector, "Origin:N", alt.value("gray"))
).add_params(selector)

PR.PersistChart(chart, data=df)

PersistWidget(data_values=[{'__id_column': '1', 'Name': 'chevrolet chevelle malibu', 'Miles_per_Gallon': 18.0,…
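
For comparison, this is how the same rename looks when done directly in pandas. The small `cars` frame is a hand-written stand-in for the cars dataset, and the snippet only illustrates the effect of the operation; it makes no claim about what Persist runs internally.

```python
import pandas as pd

# Hand-written sample rows standing in for the cars dataset.
cars = pd.DataFrame({
    "Name": ["car a", "car b"],
    "Miles_per_Gallon": [18.0, 15.0],
    "Weight_in_lbs": [3504, 3693],
})

# rename returns a new DataFrame; the original keeps its column names.
renamed = cars.rename(columns={"Miles_per_Gallon": "MPG"})
renamed_columns = list(renamed.columns)
```

Note that any chart encoding that referred to `Miles_per_Gallon` would need to refer to `MPG` after such a rename.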

#### Delete columns

You can use the `Drop columns` button to select and remove one or more columns. Try deleting the `Cylinders` column first, then try removing the `Weight_in_lbs` column.

In [13]:
df = data.cars()

selector = alt.selection_interval(name="selector")

chart = alt.Chart(df).mark_point().encode(
    x="Miles_per_Gallon:Q",
    y="Weight_in_lbs:Q",
    color=alt.condition(selector, "Origin:N", alt.value("gray"))
).add_params(selector)

PR.PersistChart(chart, data=df)

PersistWidget(data_values=[{'__id_column': '1', 'Name': 'chevrolet chevelle malibu', 'Miles_per_Gallon': 18.0,…
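
The two deletions suggested above correspond to the following pandas operations. As before, `cars` is a hand-written stand-in frame, and this is only an illustration of the resulting data, not a description of Persist's internals.

```python
import pandas as pd

# Hand-written sample rows standing in for the cars dataset.
cars = pd.DataFrame({
    "Name": ["car a", "car b"],
    "Miles_per_Gallon": [18.0, 15.0],
    "Cylinders": [8, 8],
    "Weight_in_lbs": [3504, 3693],
})

# Drop one column, then another, mirroring the two steps in the text.
without_cylinders = cars.drop(columns=["Cylinders"])
without_both = without_cylinders.drop(columns=["Weight_in_lbs"])

# errors="ignore" skips names that are absent instead of raising KeyError,
# which is useful when re-applying a drop to data that has changed.
dropped_again = without_both.drop(columns=["Cylinders"], errors="ignore")
remaining_columns = list(dropped_again.columns)
```

Dropping `Weight_in_lbs` removes the field used for the y axis of the scatterplot in this tutorial, which is why it makes a more interesting second step than `Cylinders`.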

#### Categories

**Adding new category:**

**Assigning categories:**