# Choropleth Maps Individual Project 

This is a small individual project that uses U.S. population estimates.
The dataset was obtained from the U.S. Census Bureau website.

To make the data easier to work with, only selected columns were extracted into a separate CSV file.
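
The extraction step itself is not shown in this notebook. A minimal sketch of how it might look with pandas follows. The raw column names (`NAME`, `POPESTIMATE2010`, `OTHER_COLUMN`) and the `abbreviations` lookup are assumptions made for illustration; the population values are the Alabama and Alaska figures shown later in this notebook.

```python
import pandas as pd

# Stand-in for the raw Census table. The column names are assumptions,
# not taken from this notebook.
raw = pd.DataFrame({
    "NAME": ["Alabama", "Alaska"],
    "OTHER_COLUMN": [0, 0],  # placeholder for columns that get dropped
    "POPESTIMATE2010": [4785448, 713906],
    "POPESTIMATE2018": [4887871, 737438],
})

# Hypothetical lookup for the postal abbreviations used by the map
abbreviations = {"Alabama": "AL", "Alaska": "AK"}

# Map the raw estimate columns to plain year labels
year_columns = {f"POPESTIMATE{year}": str(year) for year in (2010, 2018)}

# Keep only the name and year columns, then rename them
subset = raw[["NAME", *year_columns]].rename(
    columns={"NAME": "State", **year_columns}
)
subset.insert(1, "State Abv", subset["State"].map(abbreviations))

print(subset)
```

Calling `subset.to_csv('US_Population_Estimates.csv', index=False)` afterwards would write the reduced table to the file that is loaded below.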

The following link leads to the source page, which has more information.
[US Population Estimate Data](https://www.census.gov/data/datasets/time-series/demo/popest/2010s-national-total.html)

## Import necessary libraries

In [1]:
import seaborn as sns
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot

# Enable offline Plotly rendering inside the notebook
init_notebook_mode(connected=True)

Next, import the data using pandas.

In [2]:
import pandas as pd
df = pd.read_csv('US_Population_Estimates.csv')

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51 entries, 0 to 50
Data columns (total 11 columns):
State        51 non-null object
State Abv    51 non-null object
2010         51 non-null int64
2011         51 non-null int64
2012         51 non-null int64
2013         51 non-null int64
2014         51 non-null int64
2015         51 non-null int64
2016         51 non-null int64
2017         51 non-null int64
2018         51 non-null int64
dtypes: int64(9), object(2)
memory usage: 4.5+ KB


In [4]:
df.head()

| | State | State Abv | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alabama | AL | 4785448 | 4798834 | 4815564 | 4830460 | 4842481 | 4853160 | 4864745 | 4875120 | 4887871 |
| 1 | Alaska | AK | 713906 | 722038 | 730399 | 737045 | 736307 | 737547 | 741504 | 739786 | 737438 |
| 2 | Arizona | AZ | 6407774 | 6473497 | 6556629 | 6634999 | 6733840 | 6833596 | 6945452 | 7048876 | 7171646 |
| 3 | Arkansas | AR | 2921978 | 2940407 | 2952109 | 2959549 | 2967726 | 2978407 | 2990410 | 3002997 | 3013825 |
| 4 | California | CA | 37320903 | 37641823 | 37960782 | 38280824 | 38625139 | 38953142 | 39209127 | 39399349 | 39557045 |


Now, let's draw the most recent population data on the US map.

In [5]:
data = dict(type='choropleth',
            colorscale = 'Viridis',
            reversescale = True,
            locations = df['State Abv'],
            z = df['2018'],
            locationmode = 'USA-states',
            text = df['State'],
            colorbar = {'title':"US Population Estimates"}
            ) 

In [6]:
layout = dict(title = 'US Population Estimates - Year 2018',
              geo = dict(scope='usa',
                         showlakes = True,
                         lakecolor = 'rgb(85,173,240)')
             )

In [7]:
choromap = go.Figure(data = [data],layout = layout)
iplot(choromap,validate=False)

Let's examine the data more closely to look for trends over time. To do this, create two new columns that store the absolute and the percentage change in population between 2013 and 2018 (the most recent five years in the dataset).

In [8]:
df['population change (2013-2018)'] = df['2018'] - df['2013']
df['population change by percent (2013-2018)'] = (df['2018'] - df['2013'])/df['2013'] * 100
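
As a quick sanity check of the percentage formula, the same arithmetic can be done by hand for a single state. The sketch below uses the 2013 and 2018 figures for West Virginia that appear in this notebook's output; the variable names are illustrative only.

```python
# West Virginia populations, as shown in this notebook's output
pop_2013 = 1853873
pop_2018 = 1805832

# Absolute change: later year minus earlier year
change = pop_2018 - pop_2013

# Percentage change, relative to the earlier year
percent_change = change / pop_2013 * 100

print(change)                    # -48041
print(round(percent_change, 2))  # -2.59
```

A negative result means the population fell over the period. Dividing by the 2013 value expresses the change relative to the starting population.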

Notice that some states show a declining population over these five years.

In [9]:
df[df['population change (2013-2018)'] < 0][['State','State Abv','2013','2018','population change (2013-2018)','population change by percent (2013-2018)']]

| | State | State Abv | 2013 | 2018 | population change (2013-2018) | population change by percent (2013-2018) |
|---|---|---|---|---|---|---|
| 6 | Connecticut | CT | 3594915 | 3572665 | -22250 | -0.61893 |
| 13 | Illinois | IL | 12898269 | 12741080 | -157189 | -1.218683 |
| 24 | Mississippi | MS | 2988797 | 2986530 | -2267 | -0.07585 |
| 32 | New York | NY | 19628043 | 19542209 | -85834 | -0.437303 |
| 48 | West Virginia | WV | 1853873 | 1805832 | -48041 | -2.591386 |
| 50 | Wyoming | WY | 582123 | 577737 | -4386 | -0.753449 |


Sort the resulting table by percentage change in ascending order, so that the largest decline comes first.

In [10]:
df[df['population change (2013-2018)'] < 0][['State','State Abv','2013','2018','population change (2013-2018)','population change by percent (2013-2018)']].sort_values(by='population change by percent (2013-2018)')

| | State | State Abv | 2013 | 2018 | population change (2013-2018) | population change by percent (2013-2018) |
|---|---|---|---|---|---|---|
| 48 | West Virginia | WV | 1853873 | 1805832 | -48041 | -2.591386 |
| 13 | Illinois | IL | 12898269 | 12741080 | -157189 | -1.218683 |
| 50 | Wyoming | WY | 582123 | 577737 | -4386 | -0.753449 |
| 6 | Connecticut | CT | 3594915 | 3572665 | -22250 | -0.61893 |
| 32 | New York | NY | 19628043 | 19542209 | -85834 | -0.437303 |
| 24 | Mississippi | MS | 2988797 | 2986530 | -2267 | -0.07585 |
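
The two cells above repeat the same long filter expression. One possible way to shorten them, sketched here on a four-row sample copied from this notebook's outputs, is to store the filtered frame once and reuse it. The names `change_col`, `pct_col`, `declining`, and `declining_sorted` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Four sample rows copied from this notebook's outputs;
# the real DataFrame has 51 rows.
df = pd.DataFrame({
    "State": ["Alabama", "Connecticut", "Illinois", "West Virginia"],
    "2013": [4830460, 3594915, 12898269, 1853873],
    "2018": [4887871, 3572665, 12741080, 1805832],
})

change_col = "population change (2013-2018)"
pct_col = "population change by percent (2013-2018)"
df[change_col] = df["2018"] - df["2013"]
df[pct_col] = df[change_col] / df["2013"] * 100

# Filter once, then reuse the result for display and for sorting
declining = df[df[change_col] < 0]

# ascending=True is the default; it is spelled out here so the
# sort direction is explicit (most negative value first)
declining_sorted = declining.sort_values(by=pct_col, ascending=True)

print(declining_sorted[["State", pct_col]])
```

Writing `ascending=True` explicitly makes the sort direction visible at a glance without changing the result.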


Let's draw the population change on the U.S. map.

In [11]:
data = dict(type='choropleth',
            colorscale = 'Portland',
            reversescale = True,
            locations = df['State Abv'],
            z = df['population change by percent (2013-2018)'],
            locationmode = 'USA-states',
            text = df['State'],
            colorbar = {'title':"US Population Change"}
            ) 

In [12]:
layout = dict(title = 'Population change by percent (2013-2018)',
              geo = dict(scope='usa',
                         showlakes = True,
                         lakecolor = 'rgb(85,173,240)')
             )

In [13]:
choromap = go.Figure(data = [data],layout = layout)
iplot(choromap,validate=False)

The map makes it easy to see that West Virginia experienced the largest percentage decrease in population among the 51 rows in the dataset (the 50 states plus the District of Columbia).