### Creating Data Frames
documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html

A DataFrame is a two-dimensional labeled data structure whose columns can hold different types. You can think of it
as a spreadsheet, a SQL table, or a dict of Series objects.

You can create a data frame using:
- Dict of 1D ndarrays, lists, dicts, or Series
- 2-D numpy.ndarray
- Structured or record ndarray
- A Series
- Another DataFrame
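
Several of the input types listed above are not demonstrated later in this notebook, so here is a minimal sketch of them. The variable names and sample values are illustrative assumptions, not taken from the pandas documentation.

```python
import numpy as np
import pandas as pd

# 2-D ndarray: rows get default integer labels; column names are passed in
array_df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=['x', 'y'])

# Structured ndarray: the field names become the column names
records = np.array([(1, 2.5), (2, 3.5)], dtype=[('id', 'i4'), ('score', 'f8')])
records_df = pd.DataFrame(records)

# A named Series becomes a single-column DataFrame
series_as_df = pd.DataFrame(pd.Series([10, 20, 30], name='value'))

# Another DataFrame: copy=True requests an independent copy of the data
copied_df = pd.DataFrame(array_df, copy=True)

print(array_df.shape, list(records_df.columns), list(series_as_df.columns))
```

The dictionary-based inputs are covered in the cells further down.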


### Data Frame attributes
| Attribute | Description                                                                                                        |
|-----------|--------------------------------------------------------------------------------------------------------------------|
| T         | Transpose index and columns.                                                                                       |
| at        | Fast label-based scalar accessor.                                                                                  |
| axes      | Return a list with the row axis labels and column axis labels as the only members.                                 |
| blocks    | Internal property, property synonym for as_blocks(). Removed in later pandas releases.                             |
| dtypes    | Return the dtypes in this object.                                                                                  |
| empty     | True if NDFrame is entirely empty [no items], meaning any of the axes are of length 0.                             |
| ftypes    | Return the ftypes (indication of sparse/dense and dtype) in this object. Removed in later pandas releases.         |
| iat       | Fast integer location scalar accessor.                                                                             |
| iloc      | Purely integer-location based indexing for selection by position.                                                  |
| is_copy   | (No description given.) Removed in later pandas releases.                                                          |
| ix        | A primarily label-location based indexer, with integer position fallback. Removed in later pandas releases.        |
| loc       | Purely label-location based indexer for selection by label.                                                        |
| ndim      | Number of axes / array dimensions.                                                                                 |
| shape     | Return a tuple representing the dimensionality of the DataFrame.                                                   |
| size      | Number of elements in the NDFrame.                                                                                 |
| style     | Property returning a Styler object containing methods for building a styled HTML representation of the DataFrame. |
| values    | NumPy representation of NDFrame.                                                                                   |
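
A short sketch of a few of these attributes in use may help. The frame `demo_df`, its labels, and its values are made up for this example.

```python
import pandas as pd

# Illustrative frame; the labels and values are made up for this example
demo_df = pd.DataFrame({'x': [1, 2, 3], 'y': [0.5, 1.5, 2.5]},
                       index=['r1', 'r2', 'r3'])

print(demo_df.shape)   # (rows, columns) as a tuple
print(demo_df.ndim)    # number of axes
print(demo_df.size)    # total number of elements
print(demo_df.dtypes)  # dtype of each column
print(demo_df.empty)   # False, because both axes have nonzero length

# Label-based versus position-based access to the same cell
by_label = demo_df.at['r2', 'y']
by_position = demo_df.iat[1, 1]

# T swaps the index and the columns
transposed = demo_df.T
```

`at` and `iat` return a single scalar, while `loc` and `iloc` also accept slices and lists and can return whole rows or sub-frames.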

In [3]:
import pandas as pd
import numpy as np

### Creating data frames from various data types
documentation: http://pandas.pydata.org/pandas-docs/stable/dsintro.html

cookbook: http://pandas.pydata.org/pandas-docs/stable/cookbook.html

##### create data frame from Python dictionary

In [4]:
my_dictionary = {'a' : 45., 'b' : -19.5, 'c' : 4444}
print(my_dictionary.keys())
print(my_dictionary.values())


dict_keys(['a', 'b', 'c'])
dict_values([45.0, -19.5, 4444])


In [5]:
my_dictionary_df = pd.DataFrame(my_dictionary, index=['first', 'again'])
my_dictionary_df

|       | a    | b     | c    |
|-------|------|-------|------|
| first | 45.0 | -19.5 | 4444 |
| again | 45.0 | -19.5 | 4444 |
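
Because every value in `my_dictionary` is a scalar, the `index` argument is what tells pandas how many rows to build, and each scalar is repeated for every label. The sketch below shows what happens without it, along with one possible alternative; the names `scalars` and `one_row_df` are illustrative.

```python
import pandas as pd

scalars = {'a': 45., 'b': -19.5, 'c': 4444}

# With only scalar values and no index, pandas cannot infer the row count
try:
    pd.DataFrame(scalars)
    raised = False
except ValueError:
    raised = True

# Wrapping the dict in a list yields a single row with a default index
one_row_df = pd.DataFrame([scalars])

print(raised)
print(one_row_df)
```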


##### constructor without explicit index

In [6]:
cookbook_df = pd.DataFrame({'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]})
cookbook_df

|   | AAA | BBB | CCC |
|---|-----|-----|-----|
| 0 | 4   | 10  | 100 |
| 1 | 5   | 20  | 50  |
| 2 | 6   | 30  | -30 |
| 3 | 7   | 40  | -50 |


##### dictionary with Series as values

In [7]:
series_dict = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
               'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
series_df = pd.DataFrame(series_dict)
series_df

|   | one | two |
|---|-----|-----|
| a | 1.0 | 1.0 |
| b | 2.0 | 2.0 |
| c | 3.0 | 3.0 |
| d | NaN | 4.0 |
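
In the result above, the row labels are the union of the two Series indexes, and any label missing from a Series is filled with `NaN`. A small sketch of that alignment, reusing the same values (the name `subset_df` and the checks are illustrative):

```python
import pandas as pd

series_dict = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
               'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
series_df = pd.DataFrame(series_dict)

# Row labels are the union of both indexes
print(list(series_df.index))

# 'one' has no entry for label 'd', so that cell is NaN
print(series_df['one'].isna())

# Passing index= selects and reorders rows instead of using the union
subset_df = pd.DataFrame(series_dict, index=['d', 'a'])
print(subset_df)
```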


##### dictionary of lists

In [8]:
produce_dict = {'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
                'fruits': ['apples', 'bananas', 'pineapple', 'berries']}
produce_dict

{'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
 'fruits': ['apples', 'bananas', 'pineapple', 'berries']}

In [9]:
pd.DataFrame(produce_dict)

|   | veggies  | fruits    |
|---|----------|-----------|
| 0 | potatoes | apples    |
| 1 | onions   | bananas   |
| 2 | peppers  | pineapple |
| 3 | carrots  | berries   |


##### list of dictionaries

In [10]:
data2 = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
pd.DataFrame(data2)

|   | a | b  | c    |
|---|---|----|------|
| 0 | 1 | 2  | NaN  |
| 1 | 5 | 10 | 20.0 |
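
Each dictionary in the list becomes one row, and the columns are the union of all keys. Column `c` is displayed as `20.0` because the missing value in the first row is stored as `NaN`, which makes the column floating point. A sketch of two optional arguments, with illustrative row labels:

```python
import pandas as pd

data2 = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]

# index= supplies row labels; it must match the number of dictionaries
labeled_df = pd.DataFrame(data2, index=['first', 'second'])

# columns= selects and orders the columns to keep
selected_df = pd.DataFrame(data2, columns=['a', 'b'])

print(labeled_df.dtypes)
print(selected_df)
```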


##### dictionary of tuples, with a MultiIndex

In [11]:
pd.DataFrame({('a', 'b'): {('A', 'B'): 1, ('A', 'C'): 2},
              ('a', 'a'): {('A', 'C'): 3, ('A', 'B'): 4},
              ('a', 'c'): {('A', 'B'): 5, ('A', 'C'): 6},
              ('b', 'a'): {('A', 'C'): 7, ('A', 'B'): 8},
              ('b', 'b'): {('A', 'D'): 9, ('A', 'B'): 10}})

|   |   | (a, b) | (a, a) | (a, c) | (b, a) | (b, b) |
|---|---|--------|--------|--------|--------|--------|
| A | B | 1.0    | 4.0    | 5.0    | 8.0    | 10.0   |
| A | C | 2.0    | 3.0    | 6.0    | 7.0    | NaN    |
| A | D | NaN    | NaN    | NaN    | NaN    | 9.0    |
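
When the outer keys and the inner keys are tuples, pandas builds a `MultiIndex` for both the columns and the rows. The sketch below uses fewer columns than the cell above, chosen for illustration, and shows how to select from the result; the names `multi_df`, `sub_df`, and `cell` are illustrative.

```python
import pandas as pd

multi_df = pd.DataFrame({('a', 'b'): {('A', 'B'): 1, ('A', 'C'): 2},
                         ('a', 'a'): {('A', 'C'): 3, ('A', 'B'): 4},
                         ('b', 'a'): {('A', 'C'): 7, ('A', 'B'): 8}})

# Both axes are MultiIndex objects with two levels
print(multi_df.columns.nlevels, multi_df.index.nlevels)

# Selecting the outer column label returns the sub-frame below it
sub_df = multi_df['a']

# A full (row tuple, column tuple) pair addresses a single cell
cell = multi_df.loc[('A', 'C'), ('a', 'a')]
print(cell)
```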
