# Final Project: Places Display

- **Vintage**:  2020
- **Geography Level**: Place
- **Variables**:
    - **DP02_0116E**: Estimate of the population (5 years and over) that speaks Spanish at home
    - **DP02_0116PE**: Percent of the population (5 years and over) that speaks Spanish at home
- **Variables List**:  https://api.census.gov/data/2020/acs/acs5/profile/variables.html 
- **Supported Geographies**: https://api.census.gov/data/2020/acs/acs5/profile/geography.html

### ***Questions***:  
3. Bar charts:
    - 3.1. Top 10 California Places with the most people speaking Spanish at home (DP02_0116E)
    - 3.2. Top 10 California Places with the highest percentage of people speaking Spanish at home (DP02_0116PE)


## 1. Import necessary packages

In [31]:
import pandas as pd
import plotly.express as px
import json  

## 2. Read the CSV file

In [32]:
df = pd.read_csv('Data/Places_Data.csv', dtype={'FIPS_Place': str, 
                                                'FIPS_State': str})

print("Number of rows:", df.shape[0])
print("Number of columns:", df.shape[1])
df.head()

Number of rows: 1611
Number of columns: 6


| | Place_Name | State_Name | Language spoken at home (Spanish) (DP02_0116E) | Language spoken at home (Spanish) - Percent (DP02_0116PE) | FIPS_Place | FIPS_State |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Home Garden CDP | California | 913 | 64.3 | 34281 | 6 |
| 1 | Home Gardens CDP | California | 6587 | 58.5 | 34302 | 6 |
| 2 | Homeland CDP | California | 3226 | 45.3 | 34316 | 6 |
| 3 | Homestead Valley CDP | California | 182 | 7.1 | 34392 | 6 |
| 4 | Homewood Canyon CDP | California | 31 | 12.9 | 34405 | 6 |


In [33]:
# Print data types
print("Data types: ")
df.dtypes

Data types: 


Place_Name                                                    object
State_Name                                                    object
Language spoken at home (Spanish) (DP02_0116E)                 int64
Language spoken at home (Spanish) - Percent (DP02_0116PE)    float64
FIPS_Place                                                    object
FIPS_State                                                    object
dtype: object
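
In the preview, `FIPS_State` appears as `6` rather than `06`, which suggests the leading zeros were already missing from the CSV. Reading the columns as `str` keeps whatever zeros are in the file but cannot restore lost ones. If the codes are later needed for joins or for building Census GEOIDs, they could be left-padded to the standard widths (2 digits for states, 5 for places). A minimal sketch on a toy frame, assuming the padded form is wanted; the name `toy` is hypothetical and the values are taken from the notebook output:

```python
import pandas as pd

# Toy frame mimicking the two FIPS columns (values as they appear in the output)
toy = pd.DataFrame({
    "FIPS_Place": ["34281", "2000", "8744"],
    "FIPS_State": ["6", "6", "6"],
})

# Census place codes have 5 digits and state codes have 2 digits,
# so left-pad each string with zeros to the expected width.
toy["FIPS_Place"] = toy["FIPS_Place"].str.zfill(5)
toy["FIPS_State"] = toy["FIPS_State"].str.zfill(2)

print(toy)
```

The same two `str.zfill` calls could be applied to `df` right after `pd.read_csv` if padded codes are required.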

## 3. Bar charts

### 3.1. Top 10 California Places with the most people speaking Spanish at home (DP02_0116E)

- Sort values

In [34]:
df_estimate = df.sort_values(by="Language spoken at home (Spanish) (DP02_0116E)", ascending=False)

print("Number of rows:", df_estimate.shape[0])
print("Number of columns:", df_estimate.shape[1])
df_estimate.head()

Number of rows: 1611
Number of columns: 6


| | Place_Name | State_Name | Language spoken at home (Spanish) (DP02_0116E) | Language spoken at home (Spanish) - Percent (DP02_0116PE) | FIPS_Place | FIPS_State |
| --- | --- | --- | --- | --- | --- | --- |
| 289 | Los Angeles city | California | 1554567 | 41.5 | 44000 | 6 |
| 1121 | San Diego city | California | 293883 | 22.0 | 66000 | 6 |
| 1131 | San Jose city | California | 214611 | 22.1 | 68000 | 6 |
| 1149 | Santa Ana city | California | 209183 | 67.4 | 69000 | 6 |
| 282 | Long Beach city | California | 147021 | 33.9 | 43000 | 6 |


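- Get the top 10

In [35]:
# Copy the slice so that later modifications act on an independent DataFrame
# and do not trigger pandas' SettingWithCopyWarning
df_estimate = df_estimate.iloc[:10].copy()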

print("Number of rows:", df_estimate.shape[0])
print("Number of columns:", df_estimate.shape[1])
df_estimate

Number of rows: 10
Number of columns: 6


| | Place_Name | State_Name | Language spoken at home (Spanish) (DP02_0116E) | Language spoken at home (Spanish) - Percent (DP02_0116PE) | FIPS_Place | FIPS_State |
| --- | --- | --- | --- | --- | --- | --- |
| 289 | Los Angeles city | California | 1554567 | 41.5 | 44000 | 6 |
| 1121 | San Diego city | California | 293883 | 22.0 | 66000 | 6 |
| 1131 | San Jose city | California | 214611 | 22.1 | 68000 | 6 |
| 1149 | Santa Ana city | California | 209183 | 67.4 | 69000 | 6 |
| 282 | Long Beach city | California | 147021 | 33.9 | 43000 | 6 |
| 1398 | Fresno city | California | 144732 | 29.9 | 27000 | 6 |
| 401 | Anaheim city | California | 139531 | 42.1 | 2000 | 6 |
| 859 | Chula Vista city | California | 118347 | 47.0 | 13392 | 6 |
| 495 | Bakersfield city | California | 117251 | 33.6 | 3526 | 6 |
| 698 | Oxnard city | California | 115885 | 59.8 | 54652 | 6 |
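
The sort-then-slice steps can also be written as a single call to `DataFrame.nlargest`. This is an alternative sketch, not the approach the notebook uses; the toy frame and the names `toy`, `col`, and `top3` are hypothetical, with values taken from the outputs above:

```python
import pandas as pd

col = "Language spoken at home (Spanish) (DP02_0116E)"

# Toy data: five places taken from the notebook output
toy = pd.DataFrame({
    "Place_Name": ["Homeland CDP", "Los Angeles city", "Home Garden CDP",
                   "San Diego city", "San Jose city"],
    col: [3226, 1554567, 913, 293883, 214611],
})

# nlargest(n, column) returns the n rows with the largest values,
# already sorted in descending order
top3 = toy.nlargest(3, col)

print(top3["Place_Name"].tolist())
```

`nlargest` returns a new DataFrame, and its `keep` parameter (`'first'`, `'last'`, or `'all'`) controls how ties at the cutoff are handled.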


- Sort again in ascending order, so that the largest value is drawn at the top of the horizontal bar chart

In [36]:
df_estimate.sort_values(by="Language spoken at home (Spanish) (DP02_0116E)", ascending=True, inplace=True)

- Plot

In [37]:
fig = px.bar(df_estimate,              
             x='Language spoken at home (Spanish) (DP02_0116E)', 
             y='Place_Name',
             text='Language spoken at home (Spanish) (DP02_0116E)',
             orientation='h',   
             template='seaborn',
             title='Top 10 California Places with the most people speaking Spanish at home (DP02_0116E)')

# Format bar labels with a thousands separator and an SI prefix at two significant digits (1554567 is shown as 1.6M)
fig.update_traces(textposition='auto', 
                  texttemplate='%{text:,.2s}'
                 )

fig.show()

### 3.2. Top 10 California Places with the highest percentage of people speaking Spanish at home (DP02_0116PE)

- Sort values

In [38]:
df_percent = df.sort_values(by="Language spoken at home (Spanish) - Percent (DP02_0116PE)", ascending=False)

print("Number of rows:", df_percent.shape[0])
print("Number of columns:", df_percent.shape[1])
df_percent.head()

Number of rows: 1611
Number of columns: 6


| | Place_Name | State_Name | Language spoken at home (Spanish) (DP02_0116E) | Language spoken at home (Spanish) - Percent (DP02_0116PE) | FIPS_Place | FIPS_State |
| --- | --- | --- | --- | --- | --- | --- |
| 77 | Westley CDP | California | 657 | 100.0 | 84480 | 6 |
| 669 | Oakville CDP | California | 74 | 100.0 | 53196 | 6 |
| 1021 | Rodriguez Camp CDP | California | 61 | 100.0 | 62496 | 6 |
| 680 | Old River CDP | California | 66 | 100.0 | 53574 | 6 |
| 901 | Potrero CDP | California | 230 | 100.0 | 58478 | 6 |


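- Get the top 10

In [39]:
# Copy the slice so that the column assignment below acts on an independent
# DataFrame and does not trigger pandas' SettingWithCopyWarning
df_percent = df_percent.iloc[:10].copy()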

print("Number of rows:", df_percent.shape[0])
print("Number of columns:", df_percent.shape[1])
df_percent

Number of rows: 10
Number of columns: 6


| | Place_Name | State_Name | Language spoken at home (Spanish) (DP02_0116E) | Language spoken at home (Spanish) - Percent (DP02_0116PE) | FIPS_Place | FIPS_State |
| --- | --- | --- | --- | --- | --- | --- |
| 77 | Westley CDP | California | 657 | 100.0 | 84480 | 6 |
| 669 | Oakville CDP | California | 74 | 100.0 | 53196 | 6 |
| 1021 | Rodriguez Camp CDP | California | 61 | 100.0 | 62496 | 6 |
| 680 | Old River CDP | California | 66 | 100.0 | 53574 | 6 |
| 901 | Potrero CDP | California | 230 | 100.0 | 58478 | 6 |
| 1472 | Three Rocks CDP | California | 97 | 100.0 | 78652 | 6 |
| 644 | Bucks Lake CDP | California | 21 | 100.0 | 8744 | 6 |
| 257 | Linnell Camp CDP | California | 601 | 100.0 | 41740 | 6 |
| 130 | Kettleman City CDP | California | 1190 | 100.0 | 38394 | 6 |
| 177 | Woodville Farm Labor Camp CDP | California | 569 | 98.8 | 86489 | 6 |


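- Convert the percentage from the 0-100 scale to a fraction between 0 and 1, because the `%` label format used in the plot multiplies values by 100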

In [40]:
df_percent['Language spoken at home (Spanish) - Percent (DP02_0116PE)'] = df_percent['Language spoken at home (Spanish) - Percent (DP02_0116PE)'] / 100
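
The reason for dividing by 100 can be checked with a small example. The sketch below uses Python's built-in `format` as a stand-in for the d3-style `%` format that Plotly applies in `texttemplate`; both multiply the value by 100 before appending the percent sign. The variable names are hypothetical and the value 98.8 is taken from the output above:

```python
# Percent as stored in the DataFrame (0-100 scale)
percent = 98.8

# The "%" format type multiplies by 100 before appending the percent sign
wrong = format(percent, ".1%")        # formats the 0-100 value directly
right = format(percent / 100, ".1%")  # formats the 0-1 fraction

print(wrong)  # 9880.0%
print(right)  # 98.8%
```

Without the division, the bar labels would be inflated by a factor of 100.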

- Sort again in ascending order, so that the largest value is drawn at the top of the horizontal bar chart

In [41]:
df_percent.sort_values(by="Language spoken at home (Spanish) - Percent (DP02_0116PE)", ascending=True, inplace=True)

- Plot

In [42]:
fig = px.bar(df_percent,              
             x='Language spoken at home (Spanish) - Percent (DP02_0116PE)', 
             y='Place_Name',
             text='Language spoken at home (Spanish) - Percent (DP02_0116PE)',
             orientation='h',   
             template='seaborn',
             title='Top 10 California Places with the highest percentage of people speaking Spanish at home (DP02_0116PE)')

# Format bar labels as percentages with one decimal place (the values were divided by 100 above)
fig.update_traces(textposition='auto', 
                  texttemplate='%{text:.1%}'
                 )

fig.show()