### Pew Research Center Dataset

This dataset explores the relationship between income and religion.

Problem:

- The column headers are income brackets, that is, values of a variable rather than variable names.

In [1]:
import pandas as pd

#### Read in the data

- Read the data from the CSV file `pew-raw.csv` in the `./data/` directory.
- Print out the table.

In [2]:
# Read in data from CSV file
df = pd.read_csv("./data/pew-raw.csv")

# and display the original table
df

Unnamed: 0,religion,<$10k,$10-20k,$20-30k,$30-40k,$40-50k,$50-75k
0,Agnostic,27,34,60,81,76,137
1,Atheist,12,27,37,52,35,70
2,Buddhist,27,21,30,34,33,58
3,Catholic,418,617,732,670,638,1116
4,Dont know/refused,15,14,15,11,10,35
5,Evangelical Prot,575,869,1064,982,881,1486
6,Hindu,1,9,7,9,11,34
7,Historically Black Prot,228,244,236,238,197,223
8,Jehovahs Witness,20,27,24,24,21,30
9,Jewish,19,19,25,25,30,95
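To make the problem concrete without needing the CSV file, here is a minimal sketch that rebuilds a small stand-in for the table. It uses the first three rows and first three income brackets from the output above; the names `sample` and `income_columns` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Stand-in for pew-raw.csv: first three rows and three income brackets
# copied from the table printed above.
sample = pd.DataFrame({
    "religion": ["Agnostic", "Atheist", "Buddhist"],
    "<$10k": [27, 12, 27],
    "$10-20k": [34, 27, 21],
    "$20-30k": [60, 37, 30],
})

# Every column except "religion" is an income bracket: a value, not a variable name.
income_columns = [col for col in sample.columns if col != "religion"]

print(sample.shape)     # (3, 4)
print(income_columns)   # ['<$10k', '$10-20k', '$20-30k']
```

Each income bracket occupies its own column, so a question such as "which bracket has the highest count?" means comparing across columns rather than filtering rows.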


#### Pivot/melt

- Melt (unpivot) every column except `religion`.
- Create a new column called `"income"` from the old column headers.
- Create a new column called `"freq"` from the frequency values under those headers.

In [3]:
# Keep "religion" as the identifier; unpivot all other columns into income/freq pairs
formatted_df = pd.melt(df, id_vars=["religion"], var_name="income", value_name="freq")
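As a small worked example of what `pd.melt` does, the sketch below applies the same call to a two-row, two-bracket frame built from values in the table above. The names `wide` and `tidy` are illustrative only.

```python
import pandas as pd

# Two religions and two income brackets, taken from the original table.
wide = pd.DataFrame({
    "religion": ["Agnostic", "Atheist"],
    "<$10k": [27, 12],
    "$10-20k": [34, 27],
})

# "religion" stays as an identifier column; every other column header becomes
# a value in "income", and the cell under it becomes the matching "freq".
tidy = pd.melt(wide, id_vars=["religion"], var_name="income", value_name="freq")

print(tidy.shape)              # (4, 3): 2 religions x 2 brackets
print(list(tidy.columns))      # ['religion', 'income', 'freq']
print(tidy["income"].tolist()) # ['<$10k', '<$10k', '$10-20k', '$10-20k']
print(tidy["freq"].tolist())   # [27, 12, 34, 27]
```

The melted rows are stacked one original column at a time, which is why the full table's row index jumps in steps of 10 (one block of 10 religions per income bracket).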

#### Sort temporarily

- Display the first 15 rows, sorted by `religion`.
- *Don't permanently sort the DataFrame yet!*

In [4]:
# sort_values returns a sorted copy; formatted_df itself is left unchanged
formatted_df.sort_values(by=["religion"]).head(15)

Unnamed: 0,religion,income,freq
0,Agnostic,<$10k,27
30,Agnostic,$30-40k,81
40,Agnostic,$40-50k,76
50,Agnostic,$50-75k,137
10,Agnostic,$10-20k,34
20,Agnostic,$20-30k,60
41,Atheist,$40-50k,35
21,Atheist,$20-30k,37
11,Atheist,$10-20k,27
31,Atheist,$30-40k,52
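In the output above, the rows within each religion are no longer in their original index order (0, 30, 40, 50, 10, 20). That is consistent with the default `kind="quicksort"`, which does not guarantee that rows with equal keys keep their relative order. If that order matters, one option is a stable algorithm. This is a sketch on a tiny hypothetical frame (`tidy` and `preview` are illustrative names), not a step from the original notebook.

```python
import pandas as pd

# Small melted frame with values from the original table.
tidy = pd.DataFrame({
    "religion": ["Agnostic", "Atheist", "Agnostic", "Atheist"],
    "income": ["<$10k", "<$10k", "$10-20k", "$10-20k"],
    "freq": [27, 12, 34, 27],
})

# kind="mergesort" is stable: rows with the same religion keep their
# original relative order.
preview = tidy.sort_values(by=["religion"], kind="mergesort")

print(preview.index.tolist())  # [0, 2, 1, 3]
print(tidy.index.tolist())     # [0, 1, 2, 3] (the original is untouched)
```

Sorting on two keys, for example `by=["religion", "income"]`, also gives a deterministic order, but it sorts the income labels alphabetically rather than by bracket size.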


#### Sort and store the results

- Now sort again, this time storing the result.
- Sort **descending** by `"freq"` so the highest frequency values are at the top.
- *`sort_values` does NOT sort in place by default, so reassign the result to keep it.*
- Display the results.


In [5]:
# Reassign so the descending sort is kept
formatted_df = formatted_df.sort_values(by=["freq"], ascending=False)

# display the results
formatted_df.head(15)

Unnamed: 0,religion,income,freq
55,Evangelical Prot,$50-75k,1486
53,Catholic,$50-75k,1116
25,Evangelical Prot,$20-30k,1064
35,Evangelical Prot,$30-40k,982
45,Evangelical Prot,$40-50k,881
15,Evangelical Prot,$10-20k,869
23,Catholic,$20-30k,732
33,Catholic,$30-40k,670
43,Catholic,$40-50k,638
13,Catholic,$10-20k,617
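The difference between discarding and keeping the sorted result can be checked on a small frame. The sketch below uses three `<$10k` values from the original table; `tidy` and `renumbered` are illustrative names. The `ignore_index=True` option shown at the end is an optional extra that is not used in the notebook and assumes pandas 1.0 or later.

```python
import pandas as pd

tidy = pd.DataFrame({
    "religion": ["Agnostic", "Catholic", "Hindu"],
    "income": ["<$10k", "<$10k", "<$10k"],
    "freq": [27, 418, 1],
})

# Without reassignment the sorted copy is discarded.
tidy.sort_values(by=["freq"], ascending=False)
print(tidy["freq"].tolist())       # [27, 418, 1]

# Reassigning keeps the sorted order; the old index labels travel with the rows.
tidy = tidy.sort_values(by=["freq"], ascending=False)
print(tidy["freq"].tolist())       # [418, 27, 1]
print(tidy.index.tolist())         # [1, 0, 2]

# Optional: renumber the index 0..n-1 after sorting.
renumbered = tidy.sort_values(by=["freq"], ascending=False, ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2]
```

Keeping the original index, as the notebook does, makes it easy to trace each row back to its position in the melted table.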
