# Slicing and Indexing DataFrames

1) Explicit indexes

2) Slicing and subsetting with .loc and .iloc

3) Working with pivot tables

In [3]:
import pandas as pd
homelessness = pd.read_csv('./Data/homelessness.csv', index_col = 0)
print(homelessness.head())

               region       state  individuals  family_members  state_pop
0  East South Central     Alabama       2570.0           864.0    4887681
1             Pacific      Alaska       1434.0           582.0     735139
2            Mountain     Arizona       7259.0          2606.0    7158024
3  West South Central    Arkansas       2280.0           432.0    3009733
4             Pacific  California     109008.0         20964.0   39461588


In [4]:
homelessness.columns

Index(['region', 'state', 'individuals', 'family_members', 'state_pop'], dtype='object')

In [9]:
homelessness.index

Int64Index([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
            17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
            34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
            50],
           dtype='int64')

In [6]:
print(homelessness.head())

               region       state  individuals  family_members  state_pop
0  East South Central     Alabama       2570.0           864.0    4887681
1             Pacific      Alaska       1434.0           582.0     735139
2            Mountain     Arizona       7259.0          2606.0    7158024
3  West South Central    Arkansas       2280.0           432.0    3009733
4             Pacific  California     109008.0         20964.0   39461588


### Setting a column as the index

In [7]:
# set_index() moves a column from the body of the DataFrame into the index.
homelessness_ind = homelessness.set_index("region")
print(homelessness_ind.head())

                         state  individuals  family_members  state_pop
region                                                                
East South Central     Alabama       2570.0           864.0    4887681
Pacific                 Alaska       1434.0           582.0     735139
Mountain               Arizona       7259.0          2606.0    7158024
West South Central    Arkansas       2280.0           432.0    3009733
Pacific             California     109008.0         20964.0   39461588


### Removing an index

In [8]:
# reset_index() moves the current index back into the body of the DataFrame
# as a regular column (named "index" here) and restores the default 0..n-1 index.

homelessness_ind = homelessness.reset_index()
print(homelessness_ind.head())

   index              region       state  individuals  family_members  \
0      0  East South Central     Alabama       2570.0           864.0   
1      1             Pacific      Alaska       1434.0           582.0   
2      2            Mountain     Arizona       7259.0          2606.0   
3      3  West South Central    Arkansas       2280.0           432.0   
4      4             Pacific  California     109008.0         20964.0   

   state_pop  
0    4887681  
1     735139  
2    7158024  
3    3009733  
4   39461588  


### Dropping an index

In [49]:
# drop=True discards the old index instead of keeping it as a column.
homelessness_ind = homelessness.reset_index(drop=True)
print(homelessness_ind.head())

               region       state  individuals  family_members  state_pop
0  East South Central     Alabama       2570.0           864.0    4887681
1             Pacific      Alaska       1434.0           582.0     735139
2            Mountain     Arizona       7259.0          2606.0    7158024
3  West South Central    Arkansas       2280.0           432.0    3009733
4             Pacific  California     109008.0         20964.0   39461588


In [53]:
homelessness_ind.reset_index().head()

   index              region       state  individuals  family_members  state_pop
0      0  East South Central     Alabama       2570.0           864.0    4887681
1      1             Pacific      Alaska       1434.0           582.0     735139
2      2            Mountain     Arizona       7259.0          2606.0    7158024
3      3  West South Central    Arkansas       2280.0           432.0    3009733
4      4             Pacific  California     109008.0         20964.0   39461588
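The difference between the two forms of `reset_index()` is easier to see when the index holds real data. The sketch below is only an illustration: it uses a small hand-copied subset of the rows printed above rather than the CSV file, and the names `df`, `by_state`, `restored`, and `dropped` are made up for this example.

```python
import pandas as pd

# Hand-copied subset of the homelessness rows shown above (not the full file).
df = pd.DataFrame(
    {
        "region": ["East South Central", "Pacific", "Mountain"],
        "state": ["Alabama", "Alaska", "Arizona"],
        "individuals": [2570.0, 1434.0, 7259.0],
    }
)

# "state" leaves the columns and becomes the index.
by_state = df.set_index("state")

# reset_index() puts "state" back as the first regular column.
restored = by_state.reset_index()

# reset_index(drop=True) throws the "state" values away entirely.
dropped = by_state.reset_index(drop=True)

print(list(by_state.columns))
print(list(restored.columns))
print(list(dropped.columns))
```

With a default integer index, as in the cells above, `drop=True` loses nothing useful. With a meaningful index such as `state`, it deletes data, so it is worth checking which case applies before using it.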


### Indexes make subsetting simpler

![Index%20vs%20Loc.PNG](attachment:Index%20vs%20Loc.PNG)

![Index-1.PNG](attachment:Index-1.PNG)

![Slicing%20Issue.PNG](attachment:Slicing%20Issue.PNG)
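As a rough sketch of why an index can make subsetting simpler, the example below selects the same rows twice: once with a boolean mask on a column, and once with `.loc` on an index. The data is a hand-copied subset of the rows printed earlier, and the variable names are illustrative only.

```python
import pandas as pd

# Hand-copied subset of the homelessness rows shown earlier.
df = pd.DataFrame(
    {
        "region": ["East South Central", "Pacific", "Mountain",
                   "West South Central", "Pacific"],
        "state": ["Alabama", "Alaska", "Arizona", "Arkansas", "California"],
        "individuals": [2570.0, 1434.0, 7259.0, 2280.0, 109008.0],
    }
)

# Without an index: build a boolean mask on the column.
via_mask = df[df["state"].isin(["Alaska", "Arizona"])]

# With "state" as the index: pass the labels straight to .loc.
df_ind = df.set_index("state")
via_loc = df_ind.loc[["Alaska", "Arizona"]]

# Index values do not have to be unique: .loc returns every matching row.
pacific = df.set_index("region").loc["Pacific"]

print(via_mask)
print(via_loc)
print(pacific)
```

Both approaches return the Alaska and Arizona rows; the `.loc` version is shorter but moves `state` out of the columns.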

### Drawbacks of using indexes

![Problems%20with%20indexes.PNG](attachment:Problems%20with%20indexes.PNG)

## Working with pivot tables

In [75]:
import pandas as pd

sales = pd.read_csv('./Data/sales.csv', index_col = 0)
print(sales)

            date  weekly_sales  temperature_c  fuel_price_usd_per_l
0       2/5/2010      24924.50       5.727778              0.679451
1       3/5/2010      21827.90       8.055556              0.693452
2       4/2/2010      57258.43      16.816667              0.718284
3       5/7/2010      17413.94      22.527778              0.748928
4       6/4/2010      17558.09      27.050000              0.714586
...          ...           ...            ...                   ...
10769  12/9/2011        895.00       9.644444              0.834256
10770   2/3/2012        350.00      15.938889              0.887619
10771   6/8/2012        450.00      27.288889              0.911922
10772  7/13/2012          0.06      25.644444              0.860145
10773  10/5/2012        915.00      22.250000              0.955511

[10774 rows x 4 columns]
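A minimal pivot-table sketch on this kind of data might look like the following. It is an illustration rather than part of the original notebook: it rebuilds a few of the rows printed above by hand, assumes the `date` strings are month/day/year, and adds a `year` column that does not exist in the file.

```python
import pandas as pd

# A few rows copied by hand from the output above (not the full file).
sales = pd.DataFrame(
    {
        "date": ["2/5/2010", "3/5/2010", "4/2/2010",
                 "12/9/2011", "2/3/2012", "6/8/2012"],
        "weekly_sales": [24924.50, 21827.90, 57258.43, 895.00, 350.00, 450.00],
    }
)

# Assumption: dates are month/day/year. "year" is a helper column added here.
sales["year"] = pd.to_datetime(sales["date"], format="%m/%d/%Y").dt.year

# pivot_table() groups by "year" and aggregates with the mean by default.
mean_by_year = sales.pivot_table(values="weekly_sales", index="year")

print(mean_by_year)
```

The result is itself a DataFrame indexed by `year`, so the `.loc` and slicing techniques from the earlier sections apply to it directly.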


![.loc%20+%20slicing%20-%20Powerful%20combo!.PNG](attachment:.loc%20+%20slicing%20-%20Powerful%20combo!.PNG)
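The combination of `.loc` and slicing can be sketched as below. The example again uses a hand-copied subset of the homelessness rows, with illustrative variable names. The index is sorted before slicing, since label slices on an unsorted index can raise errors or give unexpected results.

```python
import pandas as pd

# Hand-copied subset of the homelessness rows shown earlier.
df = pd.DataFrame(
    {
        "region": ["East South Central", "Pacific", "Mountain",
                   "West South Central", "Pacific"],
        "state": ["Alabama", "Alaska", "Arizona", "Arkansas", "California"],
        "individuals": [2570.0, 1434.0, 7259.0, 2280.0, 109008.0],
    }
)

# Sort the index first so that label slices are well defined.
df_srt = df.set_index("state").sort_index()

# Label slice with .loc: both endpoints are included.
by_label = df_srt.loc["Alaska":"Arkansas"]

# Position slice with .iloc: the end position is excluded.
by_pos = df_srt.iloc[1:3]

# .loc can slice rows and columns in one call.
both = df_srt.loc["Alaska":"Arkansas", "region":"individuals"]

print(by_label)
print(by_pos)
print(both)
```

The main thing to remember is the endpoint rule: `.loc["Alaska":"Arkansas"]` returns three rows, while `.iloc[1:3]` returns two.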