# Save and Load Pandas Objects as Pickles

Sometimes it is necessary (or just easier) to save an object to disk rather than regenerate it later. 
The `pickle` module implements a protocol for saving a Python object as a series of bytes, and for reconstructing a Python object from such a series.  

__Pickling__ (aka serializing) converts a Python object into a byte stream. __Unpickling__ (aka deserializing) is the inverse operation.
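
As a minimal sketch of that round trip, the standard library's `pickle.dumps()` and `pickle.loads()` convert an object to bytes and back. The `todo` dictionary here is a made-up example, not something used elsewhere in this notebook.

```python
import pickle

# A small, hypothetical object used only for illustration.
todo = {"title": "groceries", "items": ["milk", "eggs"], "done": False}

# Pickling: Python object -> bytes
payload = pickle.dumps(todo)

# Unpickling: bytes -> a new object with the same contents
restored = pickle.loads(payload)

print(type(payload))     # the serialized form is a bytes object
print(restored == todo)  # same contents
print(restored is todo)  # but a distinct object in memory
```

The restored object compares equal to the original, but it is a separate copy rather than the same object.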

We might want to pickle a Python object when:
* We want to use, share, or copy an object in a different Python runtime
* We want the state of an object to *persist*: this could be as simple as an updated to-do list, or as involved as the weights of a trained neural network (although other file formats, such as HDF5, may be more appropriate)
* We want to transmit an object over a network (although JSON would probably be a better choice, at least for dict-like objects)
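
The persistence case can be sketched with `pickle.dump()` and `pickle.load()`, which write to and read from a file opened in binary mode. The `weights` dictionary and the temporary file location are assumptions made for illustration only.

```python
import os
import pickle
import tempfile

# Hypothetical state we want to keep between sessions.
weights = {"layer1": [0.1, -0.2, 0.3], "bias": 0.05}

# Write to a throwaway directory so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "weights.pkl")

# Pickle files are binary, so open with "wb" / "rb".
with open(path, "wb") as f:
    pickle.dump(weights, f)

# Later (possibly in a different Python process), load the state back.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded == weights)
```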

This notebook shows how to pickle and unpickle a Pandas object. 

## The Docs

* __The `pickle` module:__ https://docs.python.org/3/library/pickle.html  
* __Pandas `to_pickle()`:__ https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_pickle.html  
* __Pandas `read_pickle()`:__ http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.read_pickle.html

In [7]:
import numpy as np
import pandas as pd
import os

### Create a DataFrame


In [8]:
# 100,000 rows x 50 columns of standard-normal random values
arr = np.random.randn(100000, 50)
df = pd.DataFrame(arr)
# Preview the first five rows
df.head()

|  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.660088 | 1.38347 | 1.309787 | -0.625291 | 0.060541 | 1.506813 | 0.395334 | 0.674483 | -1.043515 | 0.565327 | ... | 0.876511 | -0.701888 | -1.530755 | 1.371536 | 0.546003 | -1.026056 | 0.319412 | -0.691389 | -0.602095 | 0.087288 |
| 1 | 0.730525 | -0.480288 | 1.155445 | 0.688505 | 0.893147 | -0.12971 | -1.458417 | -0.430269 | -0.091271 | 1.650079 | ... | 0.874648 | -0.376625 | -1.02141 | -1.842244 | 0.340244 | 2.63043 | -0.745768 | -0.257635 | -0.104907 | 0.230542 |
| 2 | -0.073626 | -1.160102 | 0.034969 | 0.307971 | 0.372423 | -0.593925 | 0.455628 | -0.906303 | -0.967471 | 0.042608 | ... | -1.90916 | 1.792644 | 0.309179 | 0.550838 | 0.238832 | -1.636512 | 0.212442 | -1.100379 | -1.863755 | 1.150953 |
| 3 | -0.093463 | -0.917941 | 0.348087 | -0.282903 | -0.942316 | -0.164607 | -1.033117 | 0.26805 | 0.708036 | 0.961799 | ... | 0.453279 | -0.824579 | 0.74316 | 1.754776 | 1.330467 | -0.323358 | 0.623249 | -1.833264 | -1.456771 | 0.976378 |
| 4 | 1.185077 | -1.069351 | -1.337688 | -0.910487 | 0.829983 | 1.537555 | 1.512678 | -1.757886 | -0.328062 | 0.632613 | ... | -0.70334 | -0.306416 | 0.942558 | -1.249406 | 0.197195 | -1.317949 | 0.052019 | -0.461903 | -0.861282 | 0.58758 |


### Pickle the DataFrame, then delete it from the current session

In [13]:
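# Build the output path (adjust to a directory that exists on your machine)
filepath = os.path.join('/', 'home', 'kris', 'workspace', 'df.pkl')
# Serialize the DataFrame to disk
df.to_pickle(filepath)
# Remove the in-memory copy so that it must be reloaded from the file
del df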
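
One way to confirm that `to_pickle()` actually wrote a file is to check for it with `os.path`. The sketch below is an illustration under stated assumptions: it uses a smaller, hypothetical DataFrame (`small_df`) and a temporary directory rather than the path used in this notebook.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# A smaller stand-in DataFrame, used only for this illustration.
small_df = pd.DataFrame(np.random.randn(1000, 5))

# Write to a throwaway directory so the sketch runs on any machine.
tmp_path = os.path.join(tempfile.mkdtemp(), "small_df.pkl")
small_df.to_pickle(tmp_path)

# Confirm the file exists and report its size on disk.
file_exists = os.path.exists(tmp_path)
size_bytes = os.path.getsize(tmp_path)
print(file_exists, size_bytes)
```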

### Unpickle and view the DataFrame

In [14]:
df = pd.read_pickle(filepath)
df.head()

|  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.660088 | 1.38347 | 1.309787 | -0.625291 | 0.060541 | 1.506813 | 0.395334 | 0.674483 | -1.043515 | 0.565327 | ... | 0.876511 | -0.701888 | -1.530755 | 1.371536 | 0.546003 | -1.026056 | 0.319412 | -0.691389 | -0.602095 | 0.087288 |
| 1 | 0.730525 | -0.480288 | 1.155445 | 0.688505 | 0.893147 | -0.12971 | -1.458417 | -0.430269 | -0.091271 | 1.650079 | ... | 0.874648 | -0.376625 | -1.02141 | -1.842244 | 0.340244 | 2.63043 | -0.745768 | -0.257635 | -0.104907 | 0.230542 |
| 2 | -0.073626 | -1.160102 | 0.034969 | 0.307971 | 0.372423 | -0.593925 | 0.455628 | -0.906303 | -0.967471 | 0.042608 | ... | -1.90916 | 1.792644 | 0.309179 | 0.550838 | 0.238832 | -1.636512 | 0.212442 | -1.100379 | -1.863755 | 1.150953 |
| 3 | -0.093463 | -0.917941 | 0.348087 | -0.282903 | -0.942316 | -0.164607 | -1.033117 | 0.26805 | 0.708036 | 0.961799 | ... | 0.453279 | -0.824579 | 0.74316 | 1.754776 | 1.330467 | -0.323358 | 0.623249 | -1.833264 | -1.456771 | 0.976378 |
| 4 | 1.185077 | -1.069351 | -1.337688 | -0.910487 | 0.829983 | 1.537555 | 1.512678 | -1.757886 | -0.328062 | 0.632613 | ... | -0.70334 | -0.306416 | 0.942558 | -1.249406 | 0.197195 | -1.317949 | 0.052019 | -0.461903 | -0.861282 | 0.58758 |
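
Comparing `head()` output by eye only covers the first five rows. A fuller check, sketched here with a small hypothetical DataFrame and a temporary file (both assumptions for illustration), compares the original and the unpickled copy directly.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for the DataFrame pickled above.
original = pd.DataFrame(np.random.randn(100, 4))
tmp_path = os.path.join(tempfile.mkdtemp(), "roundtrip.pkl")

# Round trip: DataFrame -> pickle file -> DataFrame
original.to_pickle(tmp_path)
restored = pd.read_pickle(tmp_path)

# Raises AssertionError if values, dtypes, index, or columns differ.
pd.testing.assert_frame_equal(original, restored)

# Boolean check of the same idea: True when shape and elements match.
same = original.equals(restored)
print(same)
```

`assert_frame_equal` is convenient in tests because it reports where two frames differ, while `equals()` simply returns a boolean.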
