
### Pickling allows us to serialize a dataset and save it to a file

In these examples, we will do the following.

Using Python **pickle**

1. Create a simple dictionary
2. Use pickle.dump to write it to a binary file
3. Use pickle.load to read it back into a variable
4. Print the restored dict

Using Pandas df.to_pickle

1. Read a .csv file into a df using pd.read_csv from a local drive
2. Pickle it to a file called 'pickled_df.pkl'

Using Pandas pd.read_pickle

1. Read back the pickled file into cucumber_df
2. Show the head of cucumber_df

Pickling saves a dataframe to a binary file so that you can later load it and reconstitute the dataframe as it was before pickling, including its index and column dtypes. Note that a pickle file is not compressed by default.

### The forms of pickle

1. Python pickle: uses the pickle module from the standard library
2. pandas df.to_pickle('picklefilename')
3. pandas pd.read_pickle('picklefilename')

### The Python Form

This form uses the pickle module from the Python standard library.

In [23]:
import pickle

In [24]:
my_cuke = {'fruit':['apples','oranges','pears','grapes'],'numbers':[1,2,3,4,5],'cars':['ford','hundai','chevy','tesla']}
my_cuke

{'fruit': ['apples', 'oranges', 'pears', 'grapes'],
 'numbers': [1, 2, 3, 4, 5],
 'cars': ['ford', 'hundai', 'chevy', 'tesla']}

In [25]:
with open('my_cuke.pickle','wb') as f:
    pickle.dump(my_cuke, f)

In [26]:
!ls -al my_cuke.pickle

-rw-r--r--  1 salvideoguy  staff  170 Dec  4 15:07 my_cuke.pickle


In [27]:
# Delete the dict 'my_cuke' and show that it is gone
del my_cuke
print(my_cuke)

NameError: name 'my_cuke' is not defined

In [6]:
with open('my_cuke.pickle','rb') as f:
    my_cuke = pickle.load(f)

In [7]:
# Report the MIME type and encoding of the file (file -I, as used on macOS)
!file -I my_cuke.pickle

my_cuke.pickle: application/octet-stream; charset=binary


In [8]:
print('my_cuke is', my_cuke)

my_cuke is {'fruit': ['apples', 'oranges', 'pears', 'grapes'], 'numbers': [1, 2, 3, 4, 5], 'cars': ['ford', 'hundai', 'chevy', 'tesla']}
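
Printing the restored dict confirms the contents by eye. The same check can be done programmatically. The following is a minimal sketch, assuming a throwaway temporary directory and a stand-in dict named `original`; both are illustrative and not part of the notebook above.

```python
import os
import pickle
import tempfile

# Stand-in data for illustration; any picklable object would do.
original = {'fruit': ['apples', 'pears'], 'numbers': [1, 2, 3]}

# Write to a temporary directory so no real file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.pickle')
    with open(path, 'wb') as f:
        pickle.dump(original, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored object has equal contents but is a separate object in memory.
same_contents = restored == original
same_object = restored is original
print(same_contents, same_object)
```

Comparing with `==` checks the values, while `is` shows that unpickling builds a new object rather than handing back the original.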


### The Pandas Form

This form uses two pandas calls:

1. pandas df.to_pickle('picklefilename') to write the dataframe
2. pandas pd.read_pickle('picklefilename') to read it back

In [9]:
import pandas as pd

In [10]:
# Read the titanic_data.csv file from my external drive
df = pd.read_csv('/Volumes/LaCie/datasets/kaggle/titanicdataset/titanic_data.csv')

In [11]:
df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [12]:
# pickle it and write to a local file 
df.to_pickle("./pickled_df.pkl")

In [14]:
# OK, it's there (pickled_df.pkl)
!ls -al pickled_df.pkl

-rw-r--r--  1 salvideoguy  staff  101862 Dec  4 13:26 pickled_df.pkl
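
If file size matters, pandas can compress the pickle as it writes it. The sketch below is an illustration only: it assumes a small, repetitive stand-in frame called `demo_df` (not the Titanic data) and hypothetical file names in a temporary directory, and it compares the plain and gzip-compressed sizes.

```python
import os
import tempfile

import pandas as pd

# Small, repetitive stand-in frame (illustrative, not the Titanic data).
demo_df = pd.DataFrame({'sex': ['male', 'female'] * 5000,
                        'fare': [7.25, 71.2833] * 5000})

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, 'demo.pkl')
    gzip_path = os.path.join(tmp_dir, 'demo.pkl.gz')

    demo_df.to_pickle(plain_path)                     # no compression
    demo_df.to_pickle(gzip_path, compression='gzip')  # gzip-compressed

    plain_size = os.path.getsize(plain_path)
    gzip_size = os.path.getsize(gzip_path)

    # read_pickle infers gzip from the .gz extension by default.
    restored = pd.read_pickle(gzip_path)

print(plain_size, gzip_size)
```

How much space compression saves depends on the data; repetitive columns like the ones in this stand-in frame compress well.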


In [15]:
# Since pickles start out as cucumbers, we will create a new dataframe called 'cucumber_df'
# Read it back in - de-pickling it
cucumber_df = pd.read_pickle("./pickled_df.pkl")

In [16]:
# back to a cuke
cucumber_df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [17]:
# the original dataframe
df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
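
Comparing the two head() outputs by eye only covers the first five rows. A minimal sketch of a full comparison follows, assuming a tiny stand-in frame named `original_df` and a temporary file (both hypothetical, used here so the example runs without the Titanic CSV).

```python
import os
import tempfile

import pandas as pd

# Tiny stand-in frame, including a missing value in the Age column.
original_df = pd.DataFrame({'Name': ['Braund', 'Cumings', 'Heikkinen'],
                            'Age': [22.0, 38.0, None],
                            'Survived': [0, 1, 1]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'roundtrip.pkl')
    original_df.to_pickle(path)
    restored_df = pd.read_pickle(path)

# equals() compares shape, values, and column dtypes;
# NaNs in the same position count as equal.
frames_match = restored_df.equals(original_df)
print(frames_match)
```

In the notebook above, the equivalent check would be cucumber_df.equals(df).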
