# Week 4a: Data Frame Review

## Imported packages

In [0]:
import pandas as pd
import numpy as np

## I. Data Frames
Let's practice with data frames a bit. We'll work with the following simple data frame, which is created using code adapted (with slight modifications) from https://pbpython.com/pandas-list-dict.html.

In [0]:
# one way to manually create a data frame
sales = [['account', ['Jones LLC', 'Alpha Co', 'Blue Inc']],
         ['unneeded information',['fjkdlsa','hgiewa','hgidsla']],
         ['Jan', [150, 200, 50]],
         ['Feb', [200, 210, 90]],
         ['Mar', [140, 215, 95]],
         ]
df = pd.DataFrame.from_dict(dict(sales))
print(df)

     account unneeded information  Jan  Feb  Mar
0  Jones LLC              fjkdlsa  150  200  140
1   Alpha Co               hgiewa  200  210  215
2   Blue Inc              hgidsla   50   90   95
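
As a point of comparison, here is a minimal sketch of another common way to build the same data frame: passing a dictionary of column name to list of values straight to `pd.DataFrame`. The names `sales_dict` and `df_from_dict` are made up for this example.

```python
import pandas as pd

# Same data as above, written as a dict: column name -> list of values
sales_dict = {
    'account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
    'unneeded information': ['fjkdlsa', 'hgiewa', 'hgidsla'],
    'Jan': [150, 200, 50],
    'Feb': [200, 210, 90],
    'Mar': [140, 215, 95],
}

# Dict keys become column names, in insertion order
df_from_dict = pd.DataFrame(sales_dict)
print(df_from_dict)
```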


### Part A: Redefine the index
This data frame was created with a default integer index, which isn't very informative. Let's make the account name the index instead.

In [0]:
df_newind = df.set_index('account')
print(df_newind)

          unneeded information  Jan  Feb  Mar
account                                      
Jones LLC              fjkdlsa  150  200  140
Alpha Co                hgiewa  200  210  215
Blue Inc               hgidsla   50   90   95
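
If you change your mind, `reset_index()` moves the index back into an ordinary column. The sketch below rebuilds a smaller version of the frame (without the text column) so it runs on its own; `df_small` and `df_restored` are names made up for this example.

```python
import pandas as pd

# A smaller stand-in for df, rebuilt here so the cell is self-contained
df_small = pd.DataFrame({
    'account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
    'Jan': [150, 200, 50],
    'Feb': [200, 210, 90],
})

# set_index returns a new frame; df_small itself is unchanged
df_small_newind = df_small.set_index('account')

# reset_index turns the index back into a regular column
df_restored = df_small_newind.reset_index()
print(df_restored)
```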


### Part B: Transpose a data frame
Swap indices with column names in a data frame.

In [0]:
df_swap = df_newind.transpose()
print(df_swap)

account              Jones LLC Alpha Co Blue Inc
unneeded information   fjkdlsa   hgiewa  hgidsla
Jan                        150      200       50
Feb                        200      210       90
Mar                        140      215       95
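
A quick sketch of the shorthand: `.T` is an attribute that does the same thing as `.transpose()`. The frame is rebuilt here with only the numeric columns so the cell runs on its own; `df_nums` and `df_t` are names made up for this example.

```python
import pandas as pd

# Numeric-only stand-in for df_newind, with the account names as the index
df_nums = pd.DataFrame(
    {'Jan': [150, 200, 50], 'Feb': [200, 210, 90], 'Mar': [140, 215, 95]},
    index=pd.Index(['Jones LLC', 'Alpha Co', 'Blue Inc'], name='account'),
)

df_t = df_nums.T                           # shorthand for df_nums.transpose()
print(df_t.equals(df_nums.transpose()))    # True
print(df_t.T.equals(df_nums))              # True: transposing twice undoes it
```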


### Part C: Delete a row or column
The row with the index 'unneeded information' is just that. Let's delete it.

In [0]:
# drop a row using df.drop(index label, axis=0)
# axis=0 tells pandas you are dropping a row
# (pandas 2.0 and later require axis to be passed as a keyword)
df_swapdrop = df_swap.drop('unneeded information', axis=0)
print(df_swapdrop)

account Jones LLC Alpha Co Blue Inc
Jan           150      200       50
Feb           200      210       90
Mar           140      215       95


In [0]:
# note that we could have done the same thing before transposing, dropping the column instead
# this is done by passing the column name and axis=1, which tells pandas you are dropping a column
df_drop = df_newind.drop('unneeded information', axis=1)
print(df_drop)

           Jan  Feb  Mar
account                 
Jones LLC  150  200  140
Alpha Co   200  210  215
Blue Inc    50   90   95
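
The same two drops can also be written with the `index=` and `columns=` keywords of `drop`, which say directly whether a row or a column is being removed. This is a sketch that rebuilds the frame so it runs on its own; the names ending in `_kw` and `df_same` are made up for this example.

```python
import pandas as pd

# Rebuild df_newind: account names as the index, one text column, three months
df_newind = pd.DataFrame(
    {'unneeded information': ['fjkdlsa', 'hgiewa', 'hgidsla'],
     'Jan': [150, 200, 50], 'Feb': [200, 210, 90], 'Mar': [140, 215, 95]},
    index=pd.Index(['Jones LLC', 'Alpha Co', 'Blue Inc'], name='account'),
)

# columns= drops a column, no axis argument needed
df_drop_kw = df_newind.drop(columns='unneeded information')

# index= drops a row (here, from the transposed frame)
df_swapdrop_kw = df_newind.transpose().drop(index='unneeded information')

# errors='ignore' skips labels that are not present instead of raising KeyError
df_same = df_drop_kw.drop(columns='unneeded information', errors='ignore')

print(df_drop_kw)
print(df_swapdrop_kw)
```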


### Part D: Accessing data from a Data Frame
You can access columns, rows, a specific value, or a range of values. Working with `df_swap`, defined in Part B, try the following:

In [0]:
# access all column names
df_swap.columns

Index(['Jones LLC', 'Alpha Co', 'Blue Inc'], dtype='object', name='account')

In [0]:
# access all row names (indices)
df_swap.index

Index(['unneeded information', 'Jan', 'Feb', 'Mar'], dtype='object')

In [0]:
# access Alpha Co earnings in February
df_swap['Alpha Co']['Feb']

# note the syntax is df[column][row label], so the following commented-out line raises a KeyError
# ('Feb' is a row label, not a column name)
#df_swap['Feb']['Alpha Co']

210
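
Chained brackets work fine for reading a value, but pandas also offers `.loc`, which takes the row label first and the column label second in a single lookup. A minimal sketch follows. The frame is rebuilt without the text row to keep the cell short, and `df_months` and `feb_alpha` are names made up for this example.

```python
import pandas as pd

# Stand-in for df_swap without the 'unneeded information' row
df_months = pd.DataFrame(
    {'Jones LLC': [150, 200, 140],
     'Alpha Co': [200, 210, 215],
     'Blue Inc': [50, 90, 95]},
    index=['Jan', 'Feb', 'Mar'],
)

# .loc[row label, column label]: note the order is the reverse of df[column][row]
feb_alpha = df_months.loc['Feb', 'Alpha Co']
print(feb_alpha)  # 210

# .at does the same for a single value
print(df_months.at['Feb', 'Alpha Co'])  # 210
```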

In [0]:
# access all earnings from January through March for Blue Inc
# (label-based slices include both endpoints)
df_swap['Blue Inc']['Jan':'Mar']

Jan    50
Feb    90
Mar    95
Name: Blue Inc, dtype: object
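
Notice the `dtype: object` in the output above. Because `df_swap` came from transposing a frame that mixed text and numbers, each of its columns holds generic Python objects rather than integers. The sketch below shows one way to get integers back, assuming the selected values are all whole numbers; `blue` and `blue_int` are names made up for this example.

```python
import pandas as pd

# Rebuild df_swap exactly as in Parts A and B
df = pd.DataFrame({
    'account': ['Jones LLC', 'Alpha Co', 'Blue Inc'],
    'unneeded information': ['fjkdlsa', 'hgiewa', 'hgidsla'],
    'Jan': [150, 200, 50],
    'Feb': [200, 210, 90],
    'Mar': [140, 215, 95],
})
df_swap = df.set_index('account').transpose()

# Same selection as above, written with .loc (rows first, then column)
blue = df_swap.loc['Jan':'Mar', 'Blue Inc']
print(blue.dtype)

# Convert to integers once the text row is out of the selection
blue_int = blue.astype(int)
print(blue_int.sum())  # 235
```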