## Pandas: Fundamental DataFrame Operations

### I. Select an index or column from a DataFrame

In [1]:
import pandas as pd
import numpy as np

In [17]:
# Initialise a dict of lists (one list per column)
# and build a DataFrame from it.
data = {'A':[1, 4, 7],
        'B':[2, 5, 6],
        'C':[3, 6, 9]}
df = pd.DataFrame(data)
print(df)

   A  B  C
0  1  2  3
1  4  5  6
2  7  6  9


#### Access the value at index 0 in column ‘A’

In [18]:
# Using `iloc[]` (row position, column position)
print(df.iloc[0, 0])

# Using `loc[]` (row label, column label)
print(df.loc[0, 'A'])

# Using `at[]`
print(df.at[0,'A'])

# Using `iat[]`
print(df.iat[0,0])

1
1
1
1


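It is important to remember the difference between .loc[] and .iloc[]: .loc[] selects by label, .iloc[] selects by integer position. With the default index 0, 1, 2 the two happen to return the same result.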
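The difference only becomes visible when the index labels are not the default 0, 1, 2. A minimal sketch, assuming a hypothetical DataFrame `df_lbl` whose index labels are 10, 20, 30 (it is not part of the example above):

```python
import pandas as pd

# Hypothetical frame with non-default integer labels, for illustration only
df_lbl = pd.DataFrame({'A': [1, 4, 7], 'B': [2, 5, 6]}, index=[10, 20, 30])

# Label-based: the row whose index label is 10, the column labelled 'A'
by_label = df_lbl.loc[10, 'A']

# Position-based: the first row, the first column
by_position = df_lbl.iloc[0, 0]

print(by_label, by_position)

# Position 0 exists, but no row is *labelled* 0, so .loc[] raises a KeyError
try:
    df_lbl.loc[0, 'A']
except KeyError:
    print("no row labelled 0")
```

Both lookups reach the same cell here, but only because label 10 happens to sit at position 0.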

#### Select a row and a column

In [26]:
print(df)

# Use `iloc[]` to select row `0`
print(df.iloc[0])

# Use `loc[]` to select column `'A'`
print(df.loc[:,'A'])

   A  B  C
0  1  2  3
1  4  5  6
2  7  6  9
A    1
B    2
C    3
Name: 0, dtype: int64
0    1
1    4
2    7
Name: A, dtype: int64
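The same two indexers also accept slices, which is where they behave differently. The sketch below rebuilds the DataFrame from this section and selects the same 2x2 block both ways; the variable names `sub_loc` and `sub_iloc` are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 7], 'B': [2, 5, 6], 'C': [3, 6, 9]})

# Label slices with .loc[] include both endpoints:
# rows labelled 0 through 1, columns 'A' through 'B'
sub_loc = df.loc[0:1, 'A':'B']

# Position slices with .iloc[] exclude the stop value:
# row positions 0 and 1, column positions 0 and 1
sub_iloc = df.iloc[0:2, 0:2]

print(sub_loc)
print(sub_loc.equals(sub_iloc))
```

Note that `0:1` with .loc[] and `0:2` with .iloc[] pick the same rows, because the label slice is inclusive and the position slice is not.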


### II. How to add an Index, Row or Column to a DataFrame

#### How to add an index to a DataFrame

In [28]:
print(df)

   A  B  C
0  1  2  3
1  4  5  6
2  7  6  9


In [29]:
df.set_index('C')

   A  B
C
3  1  2
6  4  5
9  7  6
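Note that `set_index()` returns a new DataFrame by default and leaves `df` untouched, so the result has to be assigned to a name to be kept. A small sketch, where the name `df_by_c` is chosen only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 7], 'B': [2, 5, 6], 'C': [3, 6, 9]})

# set_index() returns a new DataFrame; `df` itself is unchanged
df_by_c = df.set_index('C')

print(list(df.columns))        # 'C' is still a column of the original
print(list(df_by_c.columns))   # 'C' has moved into the index
print(df_by_c.loc[6, 'A'])     # rows can now be looked up by their 'C' value
```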


#### How to add rows to a DataFrame

.loc[] vs .iloc[] vs .ix[]

- .loc[] works on the labels of your index. This means that if you pass loc[2], you look for the values of your DataFrame that have an index labeled 2.

- .iloc[] works on the positions in your index. This means that if you pass iloc[2], you look for the values of your DataFrame that are at position 2 (the third row).

- .ix[] is a more complex case: when the index is integer-based, the value you pass to .ix[] is treated as a label, so ix[2] looks for values that have an index labeled 2, just like .loc[]. However, if your index is not solely integer-based, ix falls back to positions, just like .iloc[]. Because of this ambiguity, .ix[] was deprecated in pandas 0.20 and removed in pandas 1.0.
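Because .loc[] works on labels, assigning to a label that does not exist yet appends a new row. A minimal sketch, assuming a frame with mixed index labels; the name `df_rows`, the new label 5 and its values are made up for illustration:

```python
import numpy as np
import pandas as pd

# A frame with mixed index labels (integers and a string)
df_rows = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                       index=[2, 'A', 4], columns=[48, 49, 50])

# Label 5 does not exist yet, so .loc[] adds it as a new row
df_rows.loc[5] = [10, 11, 12]

print(df_rows)
```

.iloc[] cannot be used this way: it only addresses positions that already exist.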

In [32]:
df = pd.DataFrame(data=np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), index= [2, 'A', 4], columns=[48, 49, 50])
print(df)

   48  49  50
2   1   2   3
A   4   5   6
4   7   8   9


In [38]:
print(df.loc[2])

48    1
49    2
50    3
Name: 2, dtype: int64


In [40]:
print(df.iloc[2])

48    7
49    8
50    9
Name: 4, dtype: int64


In [41]:
print(df.ix[2])

48    7
49    8
50    9
Name: 4, dtype: int64


.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated


#### Adding a Column to Your DataFrame

In [42]:
df = pd.DataFrame(data=np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), columns=['A', 'B', 'C'])
print(df)

   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9


In [47]:
# Use `.index` to copy the index into a new column 'D'
df['D'] = df.index

In [48]:
print(df)

   A  B  C  D
0  1  2  3  0
1  4  5  6  1
2  7  8  9  2
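Direct assignment changes `df` in place. When the original should stay as it is, `assign()` returns a copy with the extra column instead. A short sketch of both; the derived column 'E' and the name `df_e` are hypothetical and only serve the illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['A', 'B', 'C'])

# Direct assignment modifies `df` in place
df['D'] = df.index

# assign() leaves `df` alone and returns a copy with the new column
df_e = df.assign(E=df['A'] + df['B'])

print(list(df.columns))
print(list(df_e.columns))
```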


In [50]:
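# Study the DataFrame `df`
print(df)

# Try to append a column labelled 4 to `df`.
# The Series holds only 2 values while `df.index` has 3 labels,
# so this line raises the ValueError shown below.
df.loc[:, 4] = pd.Series(['5', '6'], index=df.index)

# Not reached because of the error above
print(df)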

   A  B  C  D
0  1  2  3  0
1  4  5  6  1
2  7  8  9  2


ValueError: Length of passed values is 2, index implies 3
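The error goes away once the Series supplies one value per row. A sketch of the working version on a fresh copy of the frame; the third value '7' is an assumption made for this example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['A', 'B', 'C'])

# Three values for three index labels, so the lengths match
df.loc[:, 4] = pd.Series(['5', '6', '7'], index=df.index)

print(df)
```

The new column is labelled with the integer 4 and holds strings, because the values were passed as strings.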

#### Resetting the Index of Your DataFrame

In [51]:
# Inspect the index of your DataFrame (here it is already the default 0, 1, 2)
print(df)

# Use `reset_index()` to go back to the default integer index;
# `drop=True` discards the old index instead of keeping it as a column
df_reset = df.reset_index(level=0, drop=True)

# Print `df_reset`
print(df_reset)

   A  B  C  D
0  1  2  3  0
1  4  5  6  1
2  7  8  9  2
   A  B  C  D
0  1  2  3  0
1  4  5  6  1
2  7  8  9  2
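Since `df` already had the default index, the two printouts are identical. The effect of `reset_index()` is easier to see on a frame with unusual labels. A sketch, assuming a hypothetical frame `df_odd` indexed by 2, 'A' and 4:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a mixed, non-default index
df_odd = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                      index=[2, 'A', 4], columns=['A', 'B', 'C'])

# drop=True discards the old labels and renumbers the rows 0, 1, 2
dropped = df_odd.reset_index(drop=True)

# drop=False (the default) keeps the old labels in a new column named 'index'
kept = df_odd.reset_index()

print(dropped)
print(kept)
```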
