# Combining and reshaping data

In this notebook, we introduce the `pandas` `DataFrame` and show how to combine multiple data sets into a single `DataFrame`. We also show how to change the *layout* of a `DataFrame` for more convenient analysis without changing its *content*. Modifying the content will be covered in the next session.

In [1]:
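import numpy as np
import pandas as pd
np.random.seed(123)  # fix the seed so the random examples below are reproducible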

## Series

### Numeric data

In [2]:
age = pd.Series([23, 17, 22, 37, 42], name='years')
age

0    23
1    17
2    22
3    37
4    42
Name: years, dtype: int64

In [3]:
type(age)

pandas.core.series.Series

In [4]:
age.index

RangeIndex(start=0, stop=5, step=1)

In [5]:
age.sort_values()

1    17
2    22
0    23
3    37
4    42
Name: years, dtype: int64

In [6]:
age.nsmallest(3)

1    17
2    22
0    23
Name: years, dtype: int64

In [7]:
age.values

array([23, 17, 22, 37, 42])

### String data

In [8]:
species = pd.Series(['mouse', 'mouse', 'human', 'human', 'mouse', 'mouse'])

In [9]:
species.sort_values()

2    human
3    human
0    mouse
1    mouse
4    mouse
5    mouse
dtype: object

In [10]:
species.unique()

array(['mouse', 'human'], dtype=object)

In [11]:
species.str.title()

0    Mouse
1    Mouse
2    Human
3    Human
4    Mouse
5    Mouse
dtype: object

In [12]:
species.str[2:4]

0    us
1    us
2    ma
3    ma
4    us
5    us
dtype: object

In [13]:
species.replace({'mouse': 'mus musculus', 'human': 'homo sapiens'})

0    mus musculus
1    mus musculus
2    homo sapiens
3    homo sapiens
4    mus musculus
5    mus musculus
dtype: object

### Categorical data

In [14]:
species = species.astype('category')

In [15]:
species.cat.codes

0    1
1    1
2    0
3    0
4    1
5    1
dtype: int8

In [16]:
species.cat.categories

Index(['human', 'mouse'], dtype='object')
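
The codes above follow the default, alphabetical category order. As a minimal sketch (the data are the same toy values, and `cat_type` and `species_cat` are names made up for this illustration), an explicit order can be requested with `pd.CategoricalDtype`:

```python
import pandas as pd

# Toy data for illustration only
species = pd.Series(['mouse', 'mouse', 'human', 'human', 'mouse', 'mouse'])

# Ask for an explicit category order instead of the default sorted order
cat_type = pd.CategoricalDtype(categories=['mouse', 'human'], ordered=True)
species_cat = species.astype(cat_type)

print(species_cat.cat.codes.tolist())     # one integer code per row
print(list(species_cat.cat.categories))   # lookup table for the codes
```

With this order `mouse` is coded 0 and `human` is coded 1, the reverse of the default.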

## DataFrame

### Read CSV

In [17]:
iris_1 = pd.read_csv('data/iris.csv')

In [18]:
iris_1.head()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


### Write CSV

In [19]:
iris_1.to_csv('data/iris_1.csv', index=False)
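
To see what `index=False` changes, here is a small sketch that round-trips a made-up DataFrame through an in-memory buffer instead of a file (the column names `x` and `label` are placeholders, not part of the iris data):

```python
import io
import pandas as pd

# Hypothetical two-row DataFrame, used only for this illustration
df = pd.DataFrame({'x': [1.5, 2.5], 'label': ['a', 'b']})

# Default: the index is written as an extra, unnamed first column
with_index = pd.read_csv(io.StringIO(df.to_csv()))

# index=False: only the data columns are written
without_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(with_index.columns.tolist())
print(without_index.columns.tolist())
```

Reading back a file that was written with the index gives an extra `Unnamed: 0` column, which is usually not wanted.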

### Read Excel

In [20]:
iris_2 = pd.read_excel('data/iris.xlsx')

In [21]:
iris_2.head()

Unnamed: 0,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


### Write Excel

In [22]:
iris_2.to_excel('data/iris_2.xlsx', index=False)

### Check files using Unix shell commands

In [23]:
! ls

Data Manipulation.ipynb      data
Introduction to Pandas.ipynb schhedule.md


In [24]:
! head -n 5 data/iris_1.csv


## Combining data sets

### Combining rows

In [25]:
df_versicolor = pd.read_csv('data/versicolor.csv')
df_virginica = pd.read_csv('data/virginica.csv')
df_setosa = pd.read_csv('data/setosa.csv')
dfs = [df_versicolor, df_virginica, df_setosa]

In [26]:
[df.shape for df in dfs]

[(50, 5), (50, 5), (50, 5)]

#### Each DataFrame only contains data about one species

In [27]:
for df in dfs:
    print(df.Species.unique())

['versicolor']
['virginica']
['setosa']


#### Combine with `concat`

We stack the three DataFrames row-wise using `concat`. Each row keeps the index label it had in its original DataFrame.

In [28]:
df = pd.concat(dfs)
df.shape

(150, 5)

#### Combined DataFrame contains all 3 species

In [29]:
df.Species.unique()

array(['versicolor', 'virginica', 'setosa'], dtype=object)
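
One thing to watch when stacking rows is the index. A minimal sketch with two tiny stand-in DataFrames (the values and the names `a`, `b`, `stacked` and `renumbered` are made up for illustration) shows the difference `ignore_index=True` makes:

```python
import pandas as pd

# Two tiny stand-in DataFrames with made-up values
a = pd.DataFrame({'Species': ['setosa', 'setosa'], 'Petal.Length': [1.4, 1.3]})
b = pd.DataFrame({'Species': ['virginica', 'virginica'], 'Petal.Length': [6.0, 5.1]})

stacked = pd.concat([a, b])                         # index labels repeat
renumbered = pd.concat([a, b], ignore_index=True)   # fresh 0..n-1 index

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

Repeated index labels are harmless for many analyses, but they make label-based selection return several rows at once.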

### Combining columns

In [30]:
df_sepal = pd.read_csv('data/sepal.csv')
df_petal = pd.read_csv('data/petal.csv')

In [31]:
df_sepal.head(3)

Unnamed: 0,Species,Sepal.Length,Sepal.Width
0,setosa,5.1,3.5
1,setosa,4.9,3.0
2,setosa,4.7,3.2


In [32]:
df_petal.head(3)

Unnamed: 0,Species,Petal.Length,Petal.Width
0,setosa,1.4,0.2
1,setosa,1.4,0.2
2,setosa,1.3,0.2


In [33]:
df_sepal.shape, df_petal.shape

((150, 3), (150, 3))

In [34]:
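# Align rows on the index (assumes both files list the plants in the same order); recent pandas rejects `on` combined with left_index/right_index
df_cols = pd.merge(df_sepal, df_petal.drop('Species', axis=1), left_index=True, right_index=True)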
df_cols.shape

(150, 5)

In [35]:
df_cols.head(3)

Unnamed: 0,Species,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width
0,setosa,5.1,3.5,1.4,0.2
1,setosa,4.9,3.0,1.4,0.2
2,setosa,4.7,3.2,1.3,0.2


#### Joining on a single unique column

Combining values for the same subject across different measurements.

In [36]:
pid1 = np.random.choice(100, 6, replace=False)
pid1

array([ 8, 70, 82, 28, 63,  0])

In [37]:
val1 = np.random.normal(10, 1, 6)
val1

array([ 10.46843912,   9.16884502,  11.16220405,   8.90279695,
         7.87689965,  11.03972709])

In [38]:
df1 = pd.DataFrame({'pid': pid1, 'val': val1})
df1

Unnamed: 0,pid,val
0,8,10.468439
1,70,9.168845
2,82,11.162204
3,28,8.902797
4,63,7.8769
5,0,11.039727


In [39]:
pid2 = np.random.permutation(pid1)
pid2

array([28, 82,  8, 70, 63,  0])

In [40]:
val2 = np.random.normal(15, 1, 6)
val2

array([ 14.16248328,  13.39403724,  16.25523737,  14.31113102,
        16.66095249,  15.80730819])

In [41]:
df2 = pd.DataFrame({'pid': pid2, 'val': val2})
df2

Unnamed: 0,pid,val
0,28,14.162483
1,82,13.394037
2,8,16.255237
3,70,14.311131
4,63,16.660952
5,0,15.807308


In [42]:
pd.merge(df1, df2, on='pid', suffixes=['_visit_1', '_visit_2'])

Unnamed: 0,pid,val_visit_1,val_visit_2
0,8,10.468439,16.255237
1,70,9.168845,14.311131
2,82,11.162204,13.394037
3,28,8.902797,14.162483
4,63,7.8769,16.660952
5,0,11.039727,15.807308
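
Note that `merge` matches rows by the value of `pid`, not by row position. As a hedged sketch with made-up numbers (the frames `visit_1`, `visit_2` and `dup` are hypothetical), the `validate` argument can also check that `pid` really is unique on both sides:

```python
import pandas as pd

# Made-up values; the pids are deliberately in a different order in each frame
visit_1 = pd.DataFrame({'pid': [8, 70, 82], 'val': [10.5, 9.2, 11.2]})
visit_2 = pd.DataFrame({'pid': [82, 8, 70], 'val': [13.4, 16.3, 14.3]})

merged = pd.merge(visit_1, visit_2, on='pid',
                  suffixes=('_visit_1', '_visit_2'),
                  validate='one_to_one')
print(merged)

# A duplicated pid on the right-hand side violates the one-to-one assumption
dup = pd.concat([visit_2, visit_2.iloc[[0]]], ignore_index=True)
try:
    pd.merge(visit_1, dup, on='pid', validate='one_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)
```

Without `validate`, a duplicated key would silently produce extra rows in the result.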


#### Joining on two unique columns

In [43]:
df1['stim'] = np.random.choice(['cmv', 'flu'], 6, replace=True)
df1 = df1[['pid', 'stim', 'val']]
df1

Unnamed: 0,pid,stim,val
0,8,cmv,10.468439
1,70,flu,9.168845
2,82,cmv,11.162204
3,28,cmv,8.902797
4,63,cmv,7.8769
5,0,cmv,11.039727


In [44]:
df2['stim'] = np.random.choice(['cmv', 'flu'], 6, replace=True)
df2 = df2[['pid', 'stim', 'val']]
df2

Unnamed: 0,pid,stim,val
0,28,flu,14.162483
1,82,cmv,13.394037
2,8,flu,16.255237
3,70,flu,14.311131
4,63,cmv,16.660952
5,0,cmv,15.807308


In [45]:
pd.merge(df1, df2, on = ['pid', 'stim'], suffixes = ['_visit_1', '_visit_2'])

Unnamed: 0,pid,stim,val_visit_1,val_visit_2
0,70,flu,9.168845,14.311131
1,82,cmv,11.162204,13.394037
2,63,cmv,7.8769,16.660952
3,0,cmv,11.039727,15.807308


In [46]:
pd.merge(df1, df2, on = ['pid', 'stim'], how = 'left', suffixes = ['_visit_1', '_visit_2'])

Unnamed: 0,pid,stim,val_visit_1,val_visit_2
0,8,cmv,10.468439,
1,70,flu,9.168845,14.311131
2,82,cmv,11.162204,13.394037
3,28,cmv,8.902797,
4,63,cmv,7.8769,16.660952
5,0,cmv,11.039727,15.807308


In [47]:
pd.merge(df1, df2, on = ['pid', 'stim'], how = 'right', suffixes = ['_visit_1', '_visit_2'])

Unnamed: 0,pid,stim,val_visit_1,val_visit_2
0,70.0,flu,9.168845,14.311131
1,82.0,cmv,11.162204,13.394037
2,63.0,cmv,7.8769,16.660952
3,0.0,cmv,11.039727,15.807308
4,28.0,flu,,14.162483
5,8.0,flu,,16.255237


In [48]:
pd.merge(df1, df2, on = ['pid', 'stim'], how = 'outer', suffixes = ['_visit_1', '_visit_2'])

Unnamed: 0,pid,stim,val_visit_1,val_visit_2
0,8.0,cmv,10.468439,
1,70.0,flu,9.168845,14.311131
2,82.0,cmv,11.162204,13.394037
3,28.0,cmv,8.902797,
4,63.0,cmv,7.8769,16.660952
5,0.0,cmv,11.039727,15.807308
6,28.0,flu,,14.162483
7,8.0,flu,,16.255237
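
The four `how` options differ only in which unmatched keys they keep. A small sketch on made-up data (`left`, `right`, `row_counts` and `both` are illustrative names) counts the rows for each option and uses `indicator=True` to label where each row came from:

```python
import pandas as pd

# Made-up data: two (pid, stim) keys match, one key is unmatched on each side
left = pd.DataFrame({'pid': [8, 70, 82],
                     'stim': ['cmv', 'flu', 'cmv'],
                     'val': [10.5, 9.2, 11.2]})
right = pd.DataFrame({'pid': [8, 70, 82],
                      'stim': ['flu', 'flu', 'cmv'],
                      'val': [16.3, 14.3, 13.4]})

# Number of rows produced by each join type
row_counts = {how: len(pd.merge(left, right, on=['pid', 'stim'], how=how))
              for how in ['inner', 'left', 'right', 'outer']}
print(row_counts)

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'
both = pd.merge(left, right, on=['pid', 'stim'], how='outer',
                suffixes=('_visit_1', '_visit_2'), indicator=True)
counts = both['_merge'].value_counts()
print(counts)
```

The `_merge` column is a quick way to find out which subjects are missing a measurement at one of the visits.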


## Separate multiple values in a single column

In [49]:
from collections import OrderedDict

In [50]:
d = OrderedDict()
d['pid-visit-stim'] = ['1-1-cmv', '1-1-hiv', '1-2-cmv', '1-2-hiv', '1-3-cmv', '1-3-hiv', '2-1-cmv', '2-1-hiv', '2-2-cmv', '2-2-hiv']
d['tnf'] = [1.0, 2.0, 1.1, 2.1, 1.2, 2.2, 3, 4, 3.1, 4.1]
d['ifn'] = [11.0, 12.0, 11.1, 12.1, 11.2, 12.2, 13, 14, 13.1, 14.1]
d['il2'] = [0.0, 0.0, 0.1, 0.1, 0.2, 0.2, 0.1, 0.3, 0.1, 0.1]
df = pd.DataFrame(d)

In [51]:
df

Unnamed: 0,pid-visit-stim,tnf,ifn,il2
0,1-1-cmv,1.0,11.0,0.0
1,1-1-hiv,2.0,12.0,0.0
2,1-2-cmv,1.1,11.1,0.1
3,1-2-hiv,2.1,12.1,0.1
4,1-3-cmv,1.2,11.2,0.2
5,1-3-hiv,2.2,12.2,0.2
6,2-1-cmv,3.0,13.0,0.1
7,2-1-hiv,4.0,14.0,0.3
8,2-2-cmv,3.1,13.1,0.1
9,2-2-hiv,4.1,14.1,0.1


In [52]:
df1 = pd.DataFrame(df['pid-visit-stim'].str.split('-').tolist(), 
                   columns = ['pid', 'visit', 'stim'])

In [53]:
df1 = pd.concat([df1, df], axis=1)

In [54]:
df1.drop('pid-visit-stim', axis=1)

Unnamed: 0,pid,visit,stim,tnf,ifn,il2
0,1,1,cmv,1.0,11.0,0.0
1,1,1,hiv,2.0,12.0,0.0
2,1,2,cmv,1.1,11.1,0.1
3,1,2,hiv,2.1,12.1,0.1
4,1,3,cmv,1.2,11.2,0.2
5,1,3,hiv,2.2,12.2,0.2
6,2,1,cmv,3.0,13.0,0.1
7,2,1,hiv,4.0,14.0,0.3
8,2,2,cmv,3.1,13.1,0.1
9,2,2,hiv,4.1,14.1,0.1


#### Wrap into a convenient function

In [55]:
def separate(df, column, sep):
    df1 = pd.DataFrame(df[column].str.split(sep).tolist(), columns = column.split(sep))
    df1 = pd.concat([df1, df], axis=1)
    return df1.drop(column, axis = 1)

In [56]:
separate(df, 'pid-visit-stim', '-')

Unnamed: 0,pid,visit,stim,tnf,ifn,il2
0,1,1,cmv,1.0,11.0,0.0
1,1,1,hiv,2.0,12.0,0.0
2,1,2,cmv,1.1,11.1,0.1
3,1,2,hiv,2.1,12.1,0.1
4,1,3,cmv,1.2,11.2,0.2
5,1,3,hiv,2.2,12.2,0.2
6,2,1,cmv,3.0,13.0,0.1
7,2,1,hiv,4.0,14.0,0.3
8,2,2,cmv,3.1,13.1,0.1
9,2,2,hiv,4.1,14.1,0.1
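
A possible variant, shown here only as a sketch: `str.split(..., expand=True)` returns the pieces as a DataFrame that keeps the original index, so the function also works when the index is not the default 0, 1, 2, ... The name `separate_expand` and the data below are made up for illustration.

```python
import pandas as pd

# Made-up data with a non-default index to show that rows stay aligned
df = pd.DataFrame({'pid-visit-stim': ['1-1-cmv', '1-1-hiv', '2-1-cmv'],
                   'tnf': [1.0, 2.0, 3.0]},
                  index=[10, 11, 12])

def separate_expand(df, column, sep):
    """Split `column` on `sep` into new columns named after the pieces of `column`."""
    parts = df[column].str.split(sep, expand=True)   # keeps df's index
    parts.columns = column.split(sep)
    return pd.concat([parts, df.drop(column, axis=1)], axis=1)

out = separate_expand(df, 'pid-visit-stim', '-')
print(out)
```

As in the version above, the new `pid` and `visit` columns contain strings, not numbers.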


## Reshaping `DataFrame`

In [57]:
d = OrderedDict()
d['pid'] = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2']
d['visit'] = ['1', '1', '2', '2', '3', '3', '1', '1', '2', '2']
d['stim'] = ['cmv', 'hiv', 'cmv', 'hiv', 'cmv', 'hiv', 'cmv', 'hiv', 'cmv', 'hiv']
d['tnf'] = [1.0, 2.0, 1.1, 2.1, 1.2, 2.2, 3, 4, 3.1, 4.1]
d['ifn'] = [11.0, 12.0, 11.1, 12.1, 11.2, 12.2, 13, 14, 13.1, 14.1]
d['il2'] = [0.0, 0.0, 0.1, 0.1, 0.2, 0.2, 0.1, 0.3, 0.1, 0.1]
df = pd.DataFrame(d)

In [58]:
df

Unnamed: 0,pid,visit,stim,tnf,ifn,il2
0,1,1,cmv,1.0,11.0,0.0
1,1,1,hiv,2.0,12.0,0.0
2,1,2,cmv,1.1,11.1,0.1
3,1,2,hiv,2.1,12.1,0.1
4,1,3,cmv,1.2,11.2,0.2
5,1,3,hiv,2.2,12.2,0.2
6,2,1,cmv,3.0,13.0,0.1
7,2,1,hiv,4.0,14.0,0.3
8,2,2,cmv,3.1,13.1,0.1
9,2,2,hiv,4.1,14.1,0.1


### Wide to Long

In [59]:
long1 = pd.melt(df, id_vars =['pid', 'stim', 'visit'])
long1.head(10)

Unnamed: 0,pid,stim,visit,variable,value
0,1,cmv,1,tnf,1.0
1,1,hiv,1,tnf,2.0
2,1,cmv,2,tnf,1.1
3,1,hiv,2,tnf,2.1
4,1,cmv,3,tnf,1.2
5,1,hiv,3,tnf,2.2
6,2,cmv,1,tnf,3.0
7,2,hiv,1,tnf,4.0
8,2,cmv,2,tnf,3.1
9,2,hiv,2,tnf,4.1


In [60]:
long2 = pd.melt(df, id_vars = ['pid', 'stim', 'visit'], 
               value_vars = ['tnf', 'ifn', 'il2'])
long2.sample(6)

Unnamed: 0,pid,stim,visit,variable,value
14,1,cmv,3,ifn,11.2
2,1,cmv,2,tnf,1.1
23,1,hiv,2,il2,0.1
4,1,cmv,3,tnf,1.2
25,1,hiv,3,il2,0.2
26,2,cmv,1,il2,0.1


In [61]:
long3 = pd.melt(df, id_vars = ['pid', 'stim', 'visit'], 
                value_vars = ['tnf', 'il2'])
long3.sample(6)

Unnamed: 0,pid,stim,visit,variable,value
13,1,hiv,2,il2,0.1
4,1,cmv,3,tnf,1.2
14,1,cmv,3,il2,0.2
18,2,cmv,2,il2,0.1
15,1,hiv,3,il2,0.2
0,1,cmv,1,tnf,1.0
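
The default column names `variable` and `value` can be replaced with more descriptive ones. A minimal sketch on a cut-down, made-up table (the names `cytokine` and `level` are my own choice, not something used elsewhere in this notebook):

```python
import pandas as pd

# Cut-down, made-up version of the table above
df = pd.DataFrame({'pid': ['1', '1', '2'],
                   'stim': ['cmv', 'hiv', 'cmv'],
                   'tnf': [1.0, 2.0, 3.0],
                   'ifn': [11.0, 12.0, 13.0]})

# var_name / value_name rename the two columns that melt creates
long_named = pd.melt(df, id_vars=['pid', 'stim'],
                     value_vars=['tnf', 'ifn'],
                     var_name='cytokine', value_name='level')
print(long_named)
```

Each of the 3 original rows contributes one row per melted column, giving 6 rows here.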


### Long to Wide

Undoing `melt` takes several steps (set an index, `unstack`, then flatten the resulting two-level column labels), and the details are tricky, so I have written a small function to do this.

In [62]:
def long_to_wide(df, index):
    df = df.set_index(index).unstack().reset_index()
    cols = [t[1] if t[1] else t[0] for t in df.columns]
    df.columns = cols
    return df

In [63]:
wide1 = long_to_wide(long1, ['pid', 'stim', 'visit', 'variable'])
wide1.head(6)

Unnamed: 0,pid,stim,visit,ifn,il2,tnf
0,1,cmv,1,11.0,0.0,1.0
1,1,cmv,2,11.1,0.1,1.1
2,1,cmv,3,11.2,0.2,1.2
3,1,hiv,1,12.0,0.0,2.0
4,1,hiv,2,12.1,0.1,2.1
5,1,hiv,3,12.2,0.2,2.2


In [64]:
wide2 = long_to_wide(long2, ['pid', 'stim', 'visit', 'variable'])
wide2.head(6)

Unnamed: 0,pid,stim,visit,ifn,il2,tnf
0,1,cmv,1,11.0,0.0,1.0
1,1,cmv,2,11.1,0.1,1.1
2,1,cmv,3,11.2,0.2,1.2
3,1,hiv,1,12.0,0.0,2.0
4,1,hiv,2,12.1,0.1,2.1
5,1,hiv,3,12.2,0.2,2.2


In [65]:
wide3 = long_to_wide(long3, ['pid', 'stim', 'visit', 'variable'])
wide3.head(6)

Unnamed: 0,pid,stim,visit,il2,tnf
0,1,cmv,1,0.0,1.0
1,1,cmv,2,0.1,1.1
2,1,cmv,3,0.2,1.2
3,1,hiv,1,0.0,2.0
4,1,hiv,2,0.1,2.1
5,1,hiv,3,0.2,2.2
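
For comparison, here is a sketch of one alternative that uses the built-in `pivot_table`. The data are made up, and `aggfunc='first'` is my assumption that each key combination occurs only once; with duplicate keys it would silently keep the first value.

```python
import pandas as pd

# Made-up long-format data with one value per (pid, stim, variable)
long_df = pd.DataFrame({
    'pid': ['1', '1', '1', '1'],
    'stim': ['cmv', 'cmv', 'hiv', 'hiv'],
    'variable': ['tnf', 'il2', 'tnf', 'il2'],
    'value': [1.0, 0.0, 2.0, 0.0],
})

wide = (long_df
        .pivot_table(index=['pid', 'stim'], columns='variable',
                     values='value', aggfunc='first')
        .reset_index())
wide.columns.name = None   # drop the leftover 'variable' label on the columns

print(wide)
```

As with `long_to_wide`, the value columns come back in alphabetical order, not in their original order.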


## Hierarchical Indexes

In [66]:
df.head()

Unnamed: 0,pid,visit,stim,tnf,ifn,il2
0,1,1,cmv,1.0,11.0,0.0
1,1,1,hiv,2.0,12.0,0.0
2,1,2,cmv,1.1,11.1,0.1
3,1,2,hiv,2.1,12.1,0.1
4,1,3,cmv,1.2,11.2,0.2


#### Add a multi-index consisting of three levels: pid, stim and visit

In [67]:
df1 = df.set_index(['pid', 'stim', 'visit'])
df1

pid,stim,visit,tnf,ifn,il2
1,cmv,1,1.0,11.0,0.0
1,hiv,1,2.0,12.0,0.0
1,cmv,2,1.1,11.1,0.1
1,hiv,2,2.1,12.1,0.1
1,cmv,3,1.2,11.2,0.2
1,hiv,3,2.2,12.2,0.2
2,cmv,1,3.0,13.0,0.1
2,hiv,1,4.0,14.0,0.3
2,cmv,2,3.1,13.1,0.1
2,hiv,2,4.1,14.1,0.1


### Indexing with a multi-index

With the multi-index, each row is identified by a (pid, stim, visit) combination, so selecting on the outer levels returns a block of rows.

#### Find TNF values

In [68]:
df1[['tnf']]

pid,stim,visit,tnf
1,cmv,1,1.0
1,hiv,1,2.0
1,cmv,2,1.1
1,hiv,2,2.1
1,cmv,3,1.2
1,hiv,3,2.2
2,cmv,1,3.0
2,hiv,1,4.0
2,cmv,2,3.1
2,hiv,2,4.1


#### Find all values for Subject 2

In [69]:
df1.loc['2']

stim,visit,tnf,ifn,il2
cmv,1,3.0,13.0,0.1
hiv,1,4.0,14.0,0.3
cmv,2,3.1,13.1,0.1
hiv,2,4.1,14.1,0.1


#### Find TNF values for subject 2

In [70]:
df1.loc['2', ['tnf']]

stim,visit,tnf
cmv,1,3.0
hiv,1,4.0
cmv,2,3.1
hiv,2,4.1
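
Selecting on the outermost level needs only a plain label. For an inner level such as `stim`, one option is `xs`. This is a minimal sketch on made-up data (the names `indexed`, `subject_2`, `cmv_only` and `one_value` are illustrative), and the index is sorted first, which I assume here to keep lookups straightforward:

```python
import pandas as pd

# Made-up data with the same three index levels as above
df = pd.DataFrame({'pid': ['1', '1', '2', '2'],
                   'stim': ['cmv', 'hiv', 'cmv', 'hiv'],
                   'visit': ['1', '1', '1', '1'],
                   'tnf': [1.0, 2.0, 3.0, 4.0]})
indexed = df.set_index(['pid', 'stim', 'visit']).sort_index()

subject_2 = indexed.loc['2']                        # outermost level: plain label
cmv_only = indexed.xs('cmv', level='stim')          # inner level: cross-section
one_value = indexed.loc[('2', 'hiv', '1'), 'tnf']   # full key: a single value

print(subject_2)
print(cmv_only)
print(one_value)
```

In both selections the level that was used for the lookup is dropped from the result's index.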


#### Undo

In [71]:
df1.reset_index()

Unnamed: 0,pid,stim,visit,tnf,ifn,il2
0,1,cmv,1,1.0,11.0,0.0
1,1,hiv,1,2.0,12.0,0.0
2,1,cmv,2,1.1,11.1,0.1
3,1,hiv,2,2.1,12.1,0.1
4,1,cmv,3,1.2,11.2,0.2
5,1,hiv,3,2.2,12.2,0.2
6,2,cmv,1,3.0,13.0,0.1
7,2,hiv,1,4.0,14.0,0.3
8,2,cmv,2,3.1,13.1,0.1
9,2,hiv,2,4.1,14.1,0.1


#### Move pid from the row index to the columns

In [72]:
df1.unstack('pid')

,,tnf,tnf,ifn,ifn,il2,il2
,pid,1,2,1,2,1,2
stim,visit,,,,,,
cmv,1,1.0,3.0,11.0,13.0,0.0,0.1
cmv,2,1.1,3.1,11.1,13.1,0.1,0.1
cmv,3,1.2,,11.2,,0.2,
hiv,1,2.0,4.0,12.0,14.0,0.0,0.3
hiv,2,2.1,4.1,12.1,14.1,0.1,0.1
hiv,3,2.2,,12.2,,0.2,


In [73]:
df1.unstack(0)

,,tnf,tnf,ifn,ifn,il2,il2
,pid,1,2,1,2,1,2
stim,visit,,,,,,
cmv,1,1.0,3.0,11.0,13.0,0.0,0.1
cmv,2,1.1,3.1,11.1,13.1,0.1,0.1
cmv,3,1.2,,11.2,,0.2,
hiv,1,2.0,4.0,12.0,14.0,0.0,0.3
hiv,2,2.1,4.1,12.1,14.1,0.1,0.1
hiv,3,2.2,,12.2,,0.2,


#### Move stim from the row index to the columns

In [74]:
df1.unstack('stim')

,,tnf,tnf,ifn,ifn,il2,il2
,stim,cmv,hiv,cmv,hiv,cmv,hiv
pid,visit,,,,,,
1,1,1.0,2.0,11.0,12.0,0.0,0.0
1,2,1.1,2.1,11.1,12.1,0.1,0.1
1,3,1.2,2.2,11.2,12.2,0.2,0.2
2,1,3.0,4.0,13.0,14.0,0.1,0.3
2,2,3.1,4.1,13.1,14.1,0.1,0.1


In [75]:
df1.unstack(1)

,,tnf,tnf,ifn,ifn,il2,il2
,stim,cmv,hiv,cmv,hiv,cmv,hiv
pid,visit,,,,,,
1,1,1.0,2.0,11.0,12.0,0.0,0.0
1,2,1.1,2.1,11.1,12.1,0.1,0.1
1,3,1.2,2.2,11.2,12.2,0.2,0.2
2,1,3.0,4.0,13.0,14.0,0.1,0.3
2,2,3.1,4.1,13.1,14.1,0.1,0.1


#### Move pid and stim from the row index to the columns

In [76]:
df1.unstack(['pid', 'stim'])

,tnf,tnf,tnf,tnf,ifn,ifn,ifn,ifn,il2,il2,il2,il2
pid,1,1,2,2,1,1,2,2,1,1,2,2
stim,cmv,hiv,cmv,hiv,cmv,hiv,cmv,hiv,cmv,hiv,cmv,hiv
visit,,,,,,,,,,,,
1,1.0,2.0,3.0,4.0,11.0,12.0,13.0,14.0,0.0,0.0,0.1,0.3
2,1.1,2.1,3.1,4.1,11.1,12.1,13.1,14.1,0.1,0.1,0.1,0.1
3,1.2,2.2,,,11.2,12.2,,,0.2,0.2,,
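
The empty cells in the tables above are missing values: subject 2 has no visit 3, so `unstack` has nothing to put there. A small sketch with made-up data (the names `indexed` and `wide` are illustrative) reproduces this:

```python
import pandas as pd

# Made-up data: subject 1 has three visits, subject 2 only two
df = pd.DataFrame({'pid': ['1', '1', '1', '2', '2'],
                   'visit': ['1', '2', '3', '1', '2'],
                   'tnf': [1.0, 1.1, 1.2, 3.0, 3.1]})
indexed = df.set_index(['pid', 'visit'])

# pid moves from the row index into the columns
wide = indexed.unstack('pid')
print(wide)

# The (visit 3, pid 2) combination does not exist, so that cell is NaN
n_missing = int(wide.isna().sum().sum())
print(n_missing)
```

The result has one row per visit and one `tnf` column per subject, with a single `NaN` for the missing combination.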


## Exercises

## Version information

In [79]:
%load_ext version_information
%version_information

Software,Version
Python,3.5.2 64bit [GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
IPython,5.0.0
OS,Darwin 15.6.0 x86_64 i386 64bit
Mon Aug 15 10:38:21 2016 EDT
