## pandas and ticdat

You can convert a `TicDat` object from its normal dict-of-dicts representation into `pandas.DataFrame` tables and back again.

This notebook gives a quick demo of both directions, with a bit more detail on the latter.

We assume `diet.py` and `diet_sample_data.sql` are in this directory. These files are easy to [find](https://github.com/opalytics/opalytics-ticdat/tree/master/examples/diet) if you want to copy them over and reproduce this notebook.

In [1]:
from diet import input_schema
dat = input_schema.sql.create_tic_dat_from_sql("diet_sample_data.sql")

In [2]:
dat.nutrition_quantities

{(u'chicken', u'calories'): _td:{'Quantity': 420},
 (u'chicken', u'fat'): _td:{'Quantity': 10},
 (u'chicken', u'protein'): _td:{'Quantity': 32},
 (u'chicken', u'sodium'): _td:{'Quantity': 1190},
 (u'fries', u'calories'): _td:{'Quantity': 380},
 (u'fries', u'fat'): _td:{'Quantity': 19},
 (u'fries', u'protein'): _td:{'Quantity': 4},
 (u'fries', u'sodium'): _td:{'Quantity': 270},
 (u'hamburger', u'calories'): _td:{'Quantity': 410},
 (u'hamburger', u'fat'): _td:{'Quantity': 26},
 (u'hamburger', u'protein'): _td:{'Quantity': 24},
 (u'hamburger', u'sodium'): _td:{'Quantity': 730},
 (u'hot dog', u'calories'): _td:{'Quantity': 560},
 (u'hot dog', u'fat'): _td:{'Quantity': 32},
 (u'hot dog', u'protein'): _td:{'Quantity': 20},
 (u'hot dog', u'sodium'): _td:{'Quantity': 1800},
 (u'ice cream', u'calories'): _td:{'Quantity': 330},
 (u'ice cream', u'fat'): _td:{'Quantity': 10},
 (u'ice cream', u'protein'): _td:{'Quantity': 8},
 (u'ice cream', u'sodium'): _td:{'Quantity': 180},
 (u'macaroni', u'calor

In [3]:
pan_dat = input_schema.copy_to_pandas(dat)

In [4]:
pan_dat.nutrition_quantities

Food,Category,Quantity
chicken,calories,420.0
chicken,fat,10.0
chicken,protein,32.0
chicken,sodium,1190.0
fries,calories,380.0
fries,fat,19.0
fries,protein,4.0
fries,sodium,270.0
hamburger,calories,410.0
hamburger,fat,26.0


In [5]:
{pan_dat.nutrition_quantities.__class__, pan_dat.categories.__class__, pan_dat.foods.__class__}

{pandas.core.frame.DataFrame}
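
To make the shape of these tables concrete, here is a minimal sketch in plain pandas of how a dict keyed by primary-key tuples can become a `DataFrame` with a matching `MultiIndex`. It is not `ticdat`'s implementation of `copy_to_pandas`; the small `nutrition_quantities` dict is a hypothetical stand-in for the real table, and the level names `Food` and `Category` are taken from the output above.

```python
import pandas as pd

# Hypothetical stand-in for a ticdat table:
# primary-key tuples map to dicts of data fields.
nutrition_quantities = {
    ("chicken", "calories"): {"Quantity": 420},
    ("chicken", "fat"): {"Quantity": 10},
    ("fries", "calories"): {"Quantity": 380},
}

# The primary-key tuples become a two-level index named after the key fields.
index = pd.MultiIndex.from_tuples(
    list(nutrition_quantities), names=["Food", "Category"]
)

# The data-field dicts become the rows; only "Quantity" ends up as a column.
frame = pd.DataFrame(list(nutrition_quantities.values()), index=index)

# Look up a single row by its full primary key.
fries_calories = frame.loc[("fries", "calories"), "Quantity"]
```

The point of the sketch is that the primary key lives in the index rather than in ordinary columns, which matters when converting back.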

Now going the other way, converting `DataFrame` objects back into a `TicDat` object. Since `pan_dat` is just a collection of `DataFrame` tables attached to one object, this is easy to show.

In [6]:
dat2 = input_schema.TicDat(foods=pan_dat.foods, categories=pan_dat.categories,
                           nutrition_quantities=pan_dat.nutrition_quantities)

In [7]:
dat2.nutrition_quantities

{(u'chicken', u'calories'): _td:{'Quantity': 420.0},
 (u'chicken', u'fat'): _td:{'Quantity': 10.0},
 (u'chicken', u'protein'): _td:{'Quantity': 32.0},
 (u'chicken', u'sodium'): _td:{'Quantity': 1190.0},
 (u'fries', u'calories'): _td:{'Quantity': 380.0},
 (u'fries', u'fat'): _td:{'Quantity': 19.0},
 (u'fries', u'protein'): _td:{'Quantity': 4.0},
 (u'fries', u'sodium'): _td:{'Quantity': 270.0},
 (u'hamburger', u'calories'): _td:{'Quantity': 410.0},
 (u'hamburger', u'fat'): _td:{'Quantity': 26.0},
 (u'hamburger', u'protein'): _td:{'Quantity': 24.0},
 (u'hamburger', u'sodium'): _td:{'Quantity': 730.0},
 (u'hot dog', u'calories'): _td:{'Quantity': 560.0},
 (u'hot dog', u'fat'): _td:{'Quantity': 32.0},
 (u'hot dog', u'protein'): _td:{'Quantity': 20.0},
 (u'hot dog', u'sodium'): _td:{'Quantity': 1800.0},
 (u'ice cream', u'calories'): _td:{'Quantity': 330.0},
 (u'ice cream', u'fat'): _td:{'Quantity': 10.0},
 (u'ice cream', u'protein'): _td:{'Quantity': 8.0},
 (u'ice cream', u'sodium'): _td:{'Q

In [8]:
input_schema._same_data(dat, dat2)

True
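
Note that the quantities came back as floats (`420.0`) rather than the original integers (`420`), yet the comparison still succeeds. The internals of `_same_data` are not shown here, so the following is only a sketch, in plain pandas and Python, of why a numeric round trip like this need not break an equality check. The two-row `original` dict is a hypothetical stand-in for the real table.

```python
import pandas as pd

# Hypothetical stand-in for the original table, with integer quantities.
original = {
    ("chicken", "calories"): {"Quantity": 420},
    ("chicken", "fat"): {"Quantity": 10},
}

# The same data as a DataFrame whose Quantity column holds floats.
index = pd.MultiIndex.from_tuples(list(original), names=["Food", "Category"])
frame = pd.DataFrame({"Quantity": [420.0, 10.0]}, index=index)

# Walk the rows back into a dict keyed by the index tuples.
round_trip = {
    key: {"Quantity": row["Quantity"]} for key, row in frame.iterrows()
}

# Python compares numbers by value, so 420 == 420.0 and the dicts match.
same = round_trip == original
```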

That said, if you drop the indices then the conversion fails, so be sure to set each `DataFrame` index to match the table's primary key.

In [9]:
df = pan_dat.nutrition_quantities.reset_index(drop=False)
df

,Food,Category,Quantity
0,chicken,calories,420.0
1,chicken,fat,10.0
2,chicken,protein,32.0
3,chicken,sodium,1190.0
4,fries,calories,380.0
5,fries,fat,19.0
6,fries,protein,4.0
7,fries,sodium,270.0
8,hamburger,calories,410.0
9,hamburger,fat,26.0


So, while this might appear to be ok, the current `ticdat` implementation does need the index.

In [10]:
input_schema.TicDat(foods=pan_dat.foods, categories=pan_dat.categories,
                    nutrition_quantities=df)

TicDatError: nutrition_quantities cannot be treated as a ticDat table : Could not find a pandas index matching the primary key for nutrition_quantities
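
A likely remedy, sketched here in plain pandas, is to move the primary-key columns back into the index with `set_index` before handing the frame to `TicDat`. The three-row frame below is a hypothetical stand-in for the flattened `df`, and the assumption that an index named `Food`, `Category` satisfies `ticdat` is inferred from the error message and the earlier successful conversion, not confirmed against the library.

```python
import pandas as pd

# Hypothetical flat frame, shaped like `df` after reset_index:
# the primary-key fields are ordinary columns.
df = pd.DataFrame(
    {
        "Food": ["chicken", "chicken", "fries"],
        "Category": ["calories", "fat", "calories"],
        "Quantity": [420.0, 10.0, 380.0],
    }
)

# Move the primary-key fields back into the index, in primary-key order.
indexed = df.set_index(["Food", "Category"])

# Assumption: a frame indexed this way is what TicDat looks for, e.g.
# input_schema.TicDat(foods=pan_dat.foods, categories=pan_dat.categories,
#                     nutrition_quantities=indexed)
```

After `set_index`, only `Quantity` remains as a column, which mirrors the layout of `pan_dat.nutrition_quantities` shown earlier.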