In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

Construct a DataFrame from a dict whose values are equal-length lists.

In [None]:
data = {
    'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
    'year': [2000, 2001, 2002, 2001, 2002, 2003],
    'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]
}
frame = pd.DataFrame(data)

In [None]:
frame

In [None]:
frame.head()

Specify both the data and a sequence of columns when constructing the DataFrame to control the column order.

In [None]:
pd.DataFrame(data, columns=['year', 'state', 'pop'])

Passing a column name that is not among the dictionary keys produces a column of `NaN` values.

In [None]:
frame2 = pd.DataFrame(data, columns=['year', 'state', 'pop', 'debt'],
                      index=['one', 'two', 'three', 'four', 'five', 'six'])

In [None]:
frame2

In [None]:
frame2.columns

A column in a DataFrame can be retrieved as a Series either by dict-like notation or by attribute.

In [None]:
frame2['state']

In [None]:
frame2.year
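
Attribute access is a convenience and has a caveat worth illustrating with this data: it only works when the column name is a valid Python identifier that does not collide with an existing DataFrame attribute or method. The sketch below uses a small stand-in frame (not `frame2` itself) to show that the `pop` column is shadowed by the `DataFrame.pop` method.

```python
import pandas as pd

# Small stand-in frame with the same column names as above
frame = pd.DataFrame({
    'state': ['Ohio', 'Nevada'],
    'year': [2001, 2001],
    'pop': [1.7, 2.4],
})

# 'year' is a valid identifier and not a DataFrame method,
# so both access styles return the same Series
same = frame['year'].equals(frame.year)

# 'pop' collides with the DataFrame.pop method, so attribute access
# returns the bound method rather than the column
pop_attr = frame.pop
pop_col = frame['pop']

print(same, callable(pop_attr), type(pop_col).__name__)
```

Dict-like notation (`frame['pop']`) works for any column name, so it is the safer default.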

Rows can be retrieved by label with the special `loc` attribute, or by integer position with `iloc` (much more on this later).

In [None]:
frame2.loc['three']
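
As a minimal sketch of the difference between the two accessors, the frame below (a smaller stand-in for `frame2`) is indexed once by label with `loc` and once by integer position with `iloc`; both return the same row.

```python
import pandas as pd

# Stand-in frame with string row labels
frame = pd.DataFrame(
    {'year': [2000, 2001, 2002], 'pop': [1.5, 1.7, 3.6]},
    index=['one', 'two', 'three'],
)

by_label = frame.loc['three']    # select the row labelled 'three'
by_position = frame.iloc[2]      # select the third row (zero-based position 2)

print(by_label.equals(by_position))
```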

Columns can be modified by assignment.

In [None]:
frame2['debt'] = 16.5

In [None]:
frame2

In [None]:
frame2['debt'] = np.arange(6.)

In [None]:
frame2

In [None]:
val = pd.Series([-1.2, -1.5, -1.7], index=['two', 'four', 'five'])

In [None]:
frame2['debt'] = val

In [None]:
frame2
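
The three assignments above behave differently: a scalar is broadcast to every row, a list or array must match the length of the DataFrame, and a Series is aligned on its index labels. The sketch below, on a small stand-in frame, shows the alignment: labels missing from the Series become `NaN`, and Series labels missing from the frame are dropped.

```python
import pandas as pd

frame = pd.DataFrame({'year': [2000, 2001, 2002]},
                     index=['one', 'two', 'three'])

# 'four' does not exist in frame's index; 'one' and 'three' are not in val
val = pd.Series([-1.2, -1.5], index=['two', 'four'])
frame['debt'] = val  # aligned by label, not by position

# A plain list is matched by position, so its length must equal len(frame)
try:
    frame['bad'] = [1, 2]
    length_error = False
except ValueError:
    length_error = True

print(frame)
```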

Assigning a column that doesn't exist will create a new column. 
The `del` keyword will delete columns as with a `dict`.

In [None]:
# Add a column of boolean values where the state column equals 'Ohio'
frame2['eastern'] = frame2.state == 'Ohio'

In [None]:
frame2

In [None]:
# The `del` statement removes this column in place
del frame2['eastern']

In [None]:
frame2.columns
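
`del` modifies the DataFrame in place. When the original should be kept intact, `DataFrame.drop` is an alternative that returns a new object. A short sketch on a stand-in frame:

```python
import pandas as pd

frame = pd.DataFrame({'state': ['Ohio', 'Nevada'], 'pop': [1.5, 2.4]})
frame['eastern'] = frame['state'] == 'Ohio'

# drop returns a new DataFrame and leaves `frame` unchanged
trimmed = frame.drop(columns=['eastern'])
still_there = 'eastern' in frame.columns

# del removes the column from `frame` itself
del frame['eastern']

print(list(trimmed.columns), still_there, list(frame.columns))
```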

Another common form of data is a nested `dict` of `dicts`.

In [None]:
pop = {
    'Nevada': {2001: 2.4, 2002: 2.9,},
    'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6},
}

If a `dict` of `dicts` is passed to the `DataFrame` constructor, 
`pandas` will interpret the outer `dict` keys as the columns
and the inner keys as the row indices.

In [None]:
frame3 = pd.DataFrame(pop)

In [None]:
frame3

You can transpose the DataFrame (swap rows and columns) 
with similar syntax to a NumPy array

In [None]:
frame3.T

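The keys of the inner `dict`s are combined to form the index
of the resulting DataFrame. Older `pandas` versions also sort them,
while recent versions may keep the order in which the keys first appear.
However, if an index is supplied explicitly, `pandas`
**does not** perform this combining and uses the given labels as they are.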

In [None]:
pd.DataFrame(pop, index=[2001, 2002, 2003])       
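
To make the effect of an explicit index concrete, the sketch below rebuilds the same nested `dict` and checks the result: labels that appear in the data but not in the requested index (2000) are left out, and requested labels with no data (2003) yield rows of `NaN`.

```python
import pandas as pd

pop = {
    'Nevada': {2001: 2.4, 2002: 2.9},
    'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6},
}

explicit = pd.DataFrame(pop, index=[2001, 2002, 2003])

has_2000 = 2000 in explicit.index              # year 2000 was not requested
all_nan_2003 = explicit.loc[2003].isna().all() # no data exists for 2003

print(explicit)
```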

Dicts of pandas Series are treated much the same way.

In [None]:
pdata = {
    'Ohio': frame3['Ohio'][:-1],
    'Nevada': frame3['Nevada'][:2],
}

In [None]:
pd.DataFrame(pdata)

In [None]:
frame3.index.name = 'year'
frame3.columns.name = 'state'

In [None]:
frame3

In [None]:
frame3.values

In [None]:
frame2.values
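
The `values` attribute returns the data as a two-dimensional NumPy array. When the columns have different dtypes, the array's dtype is chosen to accommodate all of them, which typically means `object` for a mix of strings and numbers. The sketch below uses two stand-in frames to show this; `to_numpy()` is the method the pandas documentation recommends over `values` for the same purpose.

```python
import pandas as pd

# All columns are floats, so the array keeps a floating-point dtype
numeric = pd.DataFrame({'Nevada': [2.4, 2.9], 'Ohio': [1.7, 3.6]})

# Strings and floats together force a general-purpose object dtype
mixed = pd.DataFrame({'state': ['Ohio', 'Nevada'], 'pop': [1.5, 2.4]})

numeric_array = numeric.values
mixed_array = mixed.to_numpy()

print(numeric_array.dtype, mixed_array.dtype)
```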