# **PANDAS**

In [0]:
import numpy as np
import pandas as pd

*pandas* is built on top of *numpy*: the values of a Series or DataFrame column are typically stored in a *numpy* `ndarray`, a typed array, rather than in a Python **list**.
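As a rough illustration of how closely the two libraries are tied, the sketch below pulls the NumPy array out of a small Series and applies a NumPy function directly to it. The Series `s` is a hypothetical example, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical Series used only for this illustration.
s = pd.Series([1.5, 2.5, 3.5])

arr = s.to_numpy()      # the values as a NumPy ndarray
print(type(arr))
print(arr.dtype)

# NumPy universal functions accept a Series and return a Series.
roots = np.sqrt(s)
print(type(roots))
```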

## Pandas Series

A *pandas Series* is a one-dimensional labeled array that can hold values of any data type.

In [5]:
ser = pd.Series(data=[100, 'foo', 300, 'bar', 500], index=['tom', 'bob', 'nancy', 'dan', 'eric'])
ser

tom      100
bob      foo
nancy    300
dan      bar
eric     500
dtype: object

In [6]:
ser.index

Index(['tom', 'bob', 'nancy', 'dan', 'eric'], dtype='object')

In [7]:
ser.shape

(5,)

In [8]:
ser.loc[['nancy', 'bob']]

nancy    300
bob      foo
dtype: object

In [9]:
ser.iloc[[4, 3, 1]]  # positional selection; plain ser[[4, 3, 1]] is deprecated for a label index

eric    500
dan     bar
bob     foo
dtype: object

In [11]:
ser.iloc[[2,3]]

nancy    300
dan      bar
dtype: object

In [12]:
print(ser['nancy'])  # select by label
print(ser.iloc[2])   # select by position

300
300
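The cells above mix label-based selection (`.loc`) and position-based selection (`.iloc`). A minimal sketch of why the distinction matters, using a hypothetical Series `s` whose labels are themselves integers, so that label and position disagree:

```python
import pandas as pd

# Hypothetical Series with integer labels (not from the original notebook).
s = pd.Series(['a', 'b', 'c'], index=[10, 20, 30])

by_label = s.loc[20]      # the element whose label is 20
by_position = s.iloc[2]   # the third element, regardless of its label

print(by_label, by_position)
```

Spelling out `.loc` or `.iloc` keeps the intent unambiguous, which is why the examples here prefer them over bare `[]` indexing with integers.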


In [13]:
ser*2

tom         200
bob      foofoo
nancy       600
dan      barbar
eric       1000
dtype: object

In [14]:
ser[['nancy', 'eric']] ** 2

nancy     90000
eric     250000
dtype: object

In [0]:
ser ** 2  # raises TypeError: the strings 'foo' and 'bar' cannot be squared
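Squaring the whole Series fails because of its string entries. One possible workaround, sketched here as an assumption rather than as the notebook's own approach, is to convert the values with `pd.to_numeric(..., errors='coerce')`, which turns non-numeric entries into `NaN` before the arithmetic:

```python
import pandas as pd

ser = pd.Series(data=[100, 'foo', 300, 'bar', 500],
                index=['tom', 'bob', 'nancy', 'dan', 'eric'])

# Non-numeric entries ('foo', 'bar') become NaN; the result is float64.
numeric = pd.to_numeric(ser, errors='coerce')
squared = numeric ** 2
print(squared)
```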

## Pandas DataFrame

A *pandas DataFrame* is a two-dimensional labeled data structure whose columns can hold different data types.

### Create DataFrame from Dictionary of pandas Series

In [0]:
d = {'one': pd.Series([100., 200., 300.], index=['apple', 'ball', 'clock']),
     'two': pd.Series([111., 222., 333., 444.], index=['apple', 'ball', 'cerill', 'dancy'])}

In [24]:
df = pd.DataFrame(d)
df

          one    two
apple   100.0  111.0
ball    200.0  222.0
cerill    NaN  333.0
clock   300.0    NaN
dancy     NaN  444.0


In [20]:
pd.isnull(df)

          one    two
apple   False  False
ball    False  False
cerill   True  False
clock   False   True
dancy    True  False
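Once missing values have been located with `pd.isnull`, they usually need to be counted, dropped, or filled. The following is a minimal sketch of those three options on the same `df`; the fill value `0.0` is an arbitrary choice for illustration.

```python
import pandas as pd

d = {'one': pd.Series([100., 200., 300.], index=['apple', 'ball', 'clock']),
     'two': pd.Series([111., 222., 333., 444.], index=['apple', 'ball', 'cerill', 'dancy'])}
df = pd.DataFrame(d)

missing_per_column = df.isna().sum()  # number of NaN values in each column
complete_rows = df.dropna()           # keep only rows without any NaN
filled = df.fillna(0.0)               # replace every NaN with 0.0 (arbitrary choice)

print(missing_per_column)
print(complete_rows)
print(filled)
```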


In [21]:
df.index

Index(['apple', 'ball', 'cerill', 'clock', 'dancy'], dtype='object')

In [22]:
df.columns

Index(['one', 'two'], dtype='object')

In [23]:
pd.DataFrame(d, index=['dancy', 'ball', 'apple'])

         one    two
dancy    NaN  444.0
ball   200.0  222.0
apple  100.0  111.0


In [25]:
pd.DataFrame(d, index=['dancy', 'ball', 'apple'], columns=['two', 'five'])

         two  five
dancy  444.0   NaN
ball   222.0   NaN
apple  111.0   NaN
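Passing `index` and `columns` to the constructor selects and reorders labels, and any label that has no data (such as `'five'`) is filled with `NaN`. A comparable selection can be made on an existing DataFrame with `reindex`, sketched below under the assumption that `df` was built from the same dictionary `d`:

```python
import pandas as pd

d = {'one': pd.Series([100., 200., 300.], index=['apple', 'ball', 'clock']),
     'two': pd.Series([111., 222., 333., 444.], index=['apple', 'ball', 'cerill', 'dancy'])}
df = pd.DataFrame(d)

# Select and reorder rows and columns; unknown labels yield NaN.
sub = df.reindex(index=['dancy', 'ball', 'apple'], columns=['two', 'five'])
print(sub)
```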


### Create DataFrame from list of Python Dictionaries

In [0]:
data = [{'alex':1, 'joe':2}, {'ema':5, 'dora':10, 'alice':20}]

In [27]:
pd.DataFrame(data)

   alex  joe  ema  dora  alice
0   1.0  2.0  NaN   NaN    NaN
1   NaN  NaN  5.0  10.0   20.0


In [28]:
pd.DataFrame(data, index=['orange', 'red'])

        alex  joe  ema  dora  alice
orange   1.0  2.0  NaN   NaN    NaN
red      NaN  NaN  5.0  10.0   20.0


In [29]:
pd.DataFrame(data, columns=['joe', 'dora', 'alice'])

   joe  dora  alice
0  2.0   NaN    NaN
1  NaN  10.0   20.0


### Basic DataFrame Operations

In [30]:
df

          one    two
apple   100.0  111.0
ball    200.0  222.0
cerill    NaN  333.0
clock   300.0    NaN
dancy     NaN  444.0


In [31]:
df['one']

apple     100.0
ball      200.0
cerill      NaN
clock     300.0
dancy       NaN
Name: one, dtype: float64
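Selecting a single column with `df['one']` returns a Series. As a small sketch of related operations, the block below contrasts this with double-bracket selection, which returns a DataFrame, and adds a derived column. The column name `'three'` is a hypothetical choice for illustration.

```python
import pandas as pd

d = {'one': pd.Series([100., 200., 300.], index=['apple', 'ball', 'clock']),
     'two': pd.Series([111., 222., 333., 444.], index=['apple', 'ball', 'cerill', 'dancy'])}
df = pd.DataFrame(d)

col = df['one']     # single brackets: a Series
sub = df[['one']]   # double brackets: a one-column DataFrame

# Column arithmetic aligns on the index; NaN in either operand gives NaN.
df['three'] = df['one'] + df['two']
print(df)
```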