# Using git with Jupyter Notebooks

When storing notebooks in git, it is preferable *not* to commit the output cells, because they make diffs unwieldy. Many git web UIs offer notebook previews, but there are better ways to publish full notebooks; see the `publish` directory in this repository for details.

## Committing Clean Notebooks Manually

The easiest way to get a clean notebook that fits nicely into git is to select the `Cell » All Output » Clear` menu item, save, and then commit the notebook. Afterwards, restore the output cells using `Cell » Run All`.

To make sure there are no hidden state problems (cell execution order, variable definitions that have since disappeared, …), use `Kernel » Restart & Clear Output` followed by `Cell » Run All`. Finally, before the commit, select `Cell » All Output » Clear`.
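
The clearing step can also be scripted instead of clicked. The following is a minimal sketch, assuming the notebook is stored in the nbformat 4 JSON layout (a top-level `cells` list in which code cells carry `outputs` and `execution_count`). The function names `clear_outputs` and `clear_file` are hypothetical helpers, not part of Jupyter.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Remove outputs from all code cells, in place, and return the same dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop the stored results
            cell["execution_count"] = None  # reset the In [n] counter
    return notebook


def clear_file(path: str) -> None:
    """Rewrite the notebook file at `path` without its outputs."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    cleaned = clear_outputs(notebook)
    nb_path.write_text(
        json.dumps(cleaned, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling `clear_file("analysis.ipynb")` (a hypothetical file name) before `git add` would leave only the inputs in the committed file. Markdown cells are not touched.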

## Pushing Text-Only Notebooks

If you want the notebook preview of GitHub and similar web interfaces
to *include* results, the main goal is to avoid the lengthy blobs of
HTML text in output cells that represent dataframe tables or images.

For textual output, simply `print(…)` the objects instead of 
having them as the last expression in a cell.
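
The reason `print(…)` helps can be sketched with a small stand-in class. `Table` is hypothetical and only mimics objects such as dataframes that offer a rich `_repr_html_` representation next to their plain-text one. The assumption illustrated here is that `print` only ever uses the plain-text form, while a trailing expression lets Jupyter store the HTML form as well.

```python
class Table:
    """Hypothetical stand-in for an object with a rich HTML representation."""

    def __init__(self, rows):
        self.rows = rows

    def __repr__(self):
        # Plain-text form: this is what print() writes to the output.
        return "\n".join(" ".join(str(v) for v in row) for row in self.rows)

    def _repr_html_(self):
        # Rich form: Jupyter picks this up when the object is the
        # last expression in a cell.
        body = "".join(
            "<tr>" + "".join(f"<td>{v}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table>{body}</table>"


table = Table([("I", 10.0, 8.04), ("II", 10.0, 9.14)])
print(table)  # only the plain-text form ends up in the output cell
# table       # as the last expression, the HTML form would be stored too
```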

For images, save the generated charts as PNG files and 
then put HTML code into the output that references them.

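```python
%matplotlib notebook

import os
import time
from IPython.display import HTML, clear_output
import seaborn as sns

df = sns.load_dataset("anscombe")
plot = sns.lmplot(x="x", y="y", col="dataset", hue="dataset", data=df,
                  col_wrap=2, ci=None, palette="muted", height=2,
                  scatter_kws={"s": 50, "alpha": 1})

# Save the chart to a file instead of embedding it in the notebook.
chart_img = 'img/seaborn-lmplot.png'
os.makedirs(os.path.dirname(chart_img), exist_ok=True)  # savefig needs the directory
plot.savefig(chart_img)
clear_output()  # drop the inline figure from the output cell

# Print the dataframe so that only plain text is stored.
print(df.drop_duplicates(subset='dataset'))
# The timestamp query string should keep the browser from showing a cached image.
HTML('<img src="{}?{}">'.format(chart_img, time.time()))
```

```
   dataset     x     y
0        I  10.0  8.04
11      II  10.0  9.14
22     III  10.0  7.46
33      IV   8.0  6.58
```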


## git Hook Configuration

**TODO**