<a href="https://colab.research.google.com/github/tmarissa/marissa_DATA606/blob/main/ipynb/602_Choloropeth_State.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DATA 606 Capstone 
## Marissa Tan
### Impact of COVID-19 on the US Housing Market
__Real Estate and Density Dataset__<br>
_Choropleth Map for 2019 and 2021_
- State (mean)
  - Average Listing Price (for the data, see '601 Analysis State.ipynb', Section 2.2)
  - Average Listing Price Change (%) (for the data, see '601 Analysis State.ipynb', Section 2.3)<br>

Reference:<br>
https://ecyy.medium.com/mapping-by-geopandas-in-colab-fe4b63b9ac00<br>
https://plotly.com/python/county-choropleth/

In [1]:
!apt install gdal-bin python-gdal python3-gdal
!apt install python3-rtree 
!pip install geopandas
!pip install pyshp==1.2.10
!pip install shapely==1.6.3
!pip install chart_studio
!pip install plotly==5.6.0
!pip install plotly-geo==1.0.0

In [2]:
# Import tools: pandas for data manipulation and analysis, NumPy for numerical
# computing, GeoPandas and Shapely for geospatial data, Matplotlib for plotting,
# and Plotly (with Chart Studio) for interactive choropleth maps.
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Point
import matplotlib
import matplotlib.pyplot as plt
import chart_studio.plotly as py
import plotly.figure_factory as ff
import plotly.graph_objects as go

# 1. Cleansed Master Dataset



## 1.1 Read CSV

### 1.1a For the Year 2019

In [3]:
df_2019 = pd.read_csv('df_2019.csv', index_col=False)
df_2019.sample(5)

Unnamed: 0,FIPS,year,state,county,density,average_listing_price,average_listing_price_mm,average_listing_price_yy,median_listing_price,median_listing_price_mm,median_listing_price_yy,rural_%,rural_cat,total_listing_count
8101,18075,2019,IN,Jay County,53,104050.0,-0.0484,-0.111,69900.0,0.0,-0.1124,55.65332,1,53.0
12483,27169,2019,MN,Winona County,79,245986.0,0.0833,0.2011,219950.0,0.1626,0.2976,34.53489,2,176.0
3047,8125,2019,CO,Yuma County,4,241222.0,0.1722,0.3035,229750.0,0.1168,0.2589,64.910883,1,38.0
6599,16065,2019,ID,Madison County,112,297287.0,-0.0506,-0.1814,279900.0,-0.0667,0.0525,28.463342,2,45.0
14338,28143,2019,MS,Tunica County,21,129215.0,0.0027,-0.3105,132000.0,-0.0383,-0.2084,66.023381,1,15.0


In [4]:
# Group rows by state and take the mean of the average listing price
df_2019_mean = df_2019.groupby(['state'])['average_listing_price'].mean().reset_index()

In [5]:
# Group rows by state and take the mean of the year-over-year change in average listing price
df_2019_yy = df_2019.groupby(['state'])['average_listing_price_yy'].mean().reset_index()
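
The `groupby(...).mean()` calls above give every row the same weight, so a county with 15 listings counts as much as one with 376. A minimal sketch of the difference, assuming one wanted to weight by `total_listing_count` instead; the toy values and the names `toy`, `price_x_count`, and `weighted_listing_price` are made up for illustration and are not part of the original analysis:

```python
import pandas as pd

# Toy county-level rows (made-up values, for illustration only).
toy = pd.DataFrame({
    "state": ["AA", "AA", "BB"],
    "average_listing_price": [100_000.0, 300_000.0, 250_000.0],
    "total_listing_count": [10.0, 30.0, 5.0],
})

# Unweighted mean: every row counts equally, as in the cells above.
unweighted = toy.groupby("state")["average_listing_price"].mean().reset_index()

# Listing-weighted mean: rows with more listings contribute more.
toy["price_x_count"] = toy["average_listing_price"] * toy["total_listing_count"]
sums = toy.groupby("state")[["price_x_count", "total_listing_count"]].sum()
weighted = (
    (sums["price_x_count"] / sums["total_listing_count"])
    .reset_index(name="weighted_listing_price")
)

print(unweighted)
print(weighted)
```

Which of the two is appropriate depends on the question being asked; the maps in this notebook use the unweighted mean.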

### 1.1b For the Year 2021

In [6]:
df_2021 = pd.read_csv('df_2021.csv', index_col=False)
df_2021.sample(5)

Unnamed: 0,FIPS,year,state,county,density,average_listing_price,average_listing_price_mm,average_listing_price_yy,median_listing_price,median_listing_price_mm,median_listing_price_yy,rural_%,rural_cat,total_listing_count
19575,41023,2021,OR,Grant County,1,433459.0,0.0697,0.5196,419900.0,0.1828,0.7146,100.0,1,44.0
23409,48375,2021,TX,Potter County,130,372340.0,-0.0171,0.0302,219500.0,0.0342,-0.0738,8.996225,2,237.0
3239,12023,2021,FL,Columbia County,87,356445.0,0.0552,0.3229,299900.0,0.0749,0.2548,62.057425,1,217.0
15997,37139,2021,NC,Pasquotank County,178,290562.0,-0.0208,0.1425,289450.0,-0.0336,0.2864,41.319692,2,175.0
19665,41041,2021,OR,Lincoln County,51,603278.0,-0.0055,0.2393,477450.0,0.0402,0.2337,37.589608,2,376.0


In [7]:
# Group rows by state and take the mean of the average listing price
df_2021_mean = df_2021.groupby(['state'])['average_listing_price'].mean().reset_index()


In [8]:
# Group rows by state and take the mean of the year-over-year change in average listing price
df_2021_yy = df_2021.groupby(['state'])['average_listing_price_yy'].mean().reset_index()

# 2. State

## 2.1 Average Listing Price

### 2.1a For the Year 2019

In [9]:
fig = go.Figure(data=go.Choropleth(
    locations=df_2019_mean['state'],
    z=df_2019_mean['average_listing_price'].astype(float),
    locationmode='USA-states',
    colorscale='Viridis_r',
    autocolorscale=False,
    text=df_2019_mean['state'], # hover text
    marker_line_color='white', # line markers between states
    colorbar_title="US $"
))

fig.update_layout(
    title_text='2019 US Average Listing Price per State',
    font=dict(
        family="Courier New, monospace",
        size=25,
        color="RebeccaPurple"
    ), 
    geo = dict(
        scope='usa',
        projection=go.layout.geo.Projection(type = 'albers usa'),
        showlakes=True, # lakes
        lakecolor='rgb(255, 255, 255)')
)

fig.show()

### 2.1b For the Year 2021

In [10]:
fig = go.Figure(data=go.Choropleth(
    locations=df_2021_mean['state'],
    z=df_2021_mean['average_listing_price'].astype(float),
    locationmode='USA-states',
    colorscale='Viridis_r',
    autocolorscale=False,
    text=df_2021_mean['state'], # hover text
    marker_line_color='white', # line markers between states
    colorbar_title="US $"
))

fig.update_layout(
    title_text='2021 US Average Listing Price per State',
    font=dict(
        family="Courier New, monospace",
        size=25,
        color="RebeccaPurple"
    ),
    geo = dict(
        scope='usa',
        projection=go.layout.geo.Projection(type = 'albers usa'),
        showlakes=True, # lakes
        lakecolor='rgb(255, 255, 255)'),
)

fig.show()
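
By default each figure derives its color range from its own `z` values, so the same shade can stand for different prices in the 2019 and 2021 maps. A minimal sketch of one way to make the two maps directly comparable, assuming a shared range is wanted: compute a common minimum and maximum and pass them as the `zmin` and `zmax` arguments of `go.Choropleth`. The frames `mean_2019` and `mean_2021` are made-up stand-ins for `df_2019_mean` and `df_2021_mean`.

```python
import pandas as pd

# Stand-ins for df_2019_mean and df_2021_mean (made-up values).
mean_2019 = pd.DataFrame({
    "state": ["AA", "BB"],
    "average_listing_price": [200_000.0, 350_000.0],
})
mean_2021 = pd.DataFrame({
    "state": ["AA", "BB"],
    "average_listing_price": [260_000.0, 480_000.0],
})

# Pool both years and take the overall minimum and maximum.
both = pd.concat([
    mean_2019["average_listing_price"],
    mean_2021["average_listing_price"],
])
shared_range = {"zmin": float(both.min()), "zmax": float(both.max())}

# The same range would then go to both figures, for example:
#   go.Choropleth(..., autocolorscale=False, **shared_range)
print(shared_range)
```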

Comparing the 2019 and 2021 choropleth maps, the colors in 2021 are much deeper, which means that prices have increased. Hawaii is consistently the darkest shade, meaning it is the most expensive state in both years. In 2019, Nevada had an average listing price similar to those of its neighbors Arizona, Utah, and Idaho. By 2021, Nevada was surrounded by states with higher average listing prices.
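
The visual comparison can be backed with numbers by joining the two state-level tables and computing the change between years. This is a sketch only: `mean_2019` and `mean_2021` are made-up stand-ins for `df_2019_mean` and `df_2021_mean`, and the `pct_change` column name is an assumption for illustration.

```python
import pandas as pd

# Stand-ins for df_2019_mean and df_2021_mean (made-up values).
mean_2019 = pd.DataFrame({
    "state": ["AA", "BB"],
    "average_listing_price": [200_000.0, 400_000.0],
})
mean_2021 = pd.DataFrame({
    "state": ["AA", "BB"],
    "average_listing_price": [250_000.0, 440_000.0],
})

# Join on state; the suffixes keep the two price columns apart.
change = mean_2019.merge(mean_2021, on="state", suffixes=("_2019", "_2021"))

# Percentage change in the state mean from 2019 to 2021.
change["pct_change"] = (
    change["average_listing_price_2021"] / change["average_listing_price_2019"] - 1
) * 100

# Largest increases first.
change = change.sort_values("pct_change", ascending=False).reset_index(drop=True)
print(change[["state", "pct_change"]])
```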

## 2.2 Average Listing Price Change


### 2.2a For the Year 2019

In [11]:
fig = go.Figure(data=go.Choropleth(
    locations=df_2019_yy['state'],
    z=df_2019_yy['average_listing_price_yy'].astype(float) * 100, # fraction -> percent, to match the colorbar title
    locationmode='USA-states',
    colorscale='Reds',
    autocolorscale=False,
    text=df_2019_yy['state'], # hover text
    marker_line_color='white', # line markers between states
    colorbar_title="percentage"
))

fig.update_layout(
    title_text='2019 US Average Listing Price Change per State (%)',
    font=dict(
        family="Courier New, monospace",
        size=25,
        color="black"
    ),
    geo = dict(
        scope='usa',
        projection=go.layout.geo.Projection(type = 'albers usa'),
        showlakes=True, # lakes
        lakecolor='rgb(255, 255, 255)'),
)

fig.show()

### 2.2b For the Year 2021

In [12]:
fig = go.Figure(data=go.Choropleth(
    locations=df_2021_yy['state'],
    z=df_2021_yy['average_listing_price_yy'].astype(float) * 100, # fraction -> percent, to match the colorbar title
    locationmode='USA-states',
    colorscale='Reds',
    autocolorscale=False,
    text=df_2021_yy['state'], # hover text
    marker_line_color='white', # line markers between states
    colorbar_title="percentage"
))

fig.update_layout(
    title_text='2021 US Average Listing Price Change per State (%)',
    font=dict(
        family="Courier New, monospace",
        size=25,
        color="black"
    ),
    geo = dict(
        scope='usa',
        projection=go.layout.geo.Projection(type = 'albers usa'),
        showlakes=True, # lakes
        lakecolor='rgb(255, 255, 255)'),
)

fig.show()

In 2019, the largest year-over-year change in average listing price appears in Alaska, with smaller changes in the West (Nevada) and the Midwest (Nebraska and South Dakota). In 2021, however, the largest changes appear mostly in the Southwest (Utah and Arizona) and the Northwest (Idaho and Montana).
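
A reading of the maps like the one above can be cross-checked by ranking the states directly. A minimal sketch, assuming a frame shaped like `df_2019_yy` or `df_2021_yy`; the frame `yy` and its values are made up for illustration:

```python
import pandas as pd

# Stand-in for df_2019_yy or df_2021_yy (made-up values, stored as fractions).
yy = pd.DataFrame({
    "state": ["AA", "BB", "CC", "DD"],
    "average_listing_price_yy": [0.02, 0.15, -0.01, 0.08],
})

# States with the largest mean year-over-year change, highest first.
top = yy.nlargest(2, "average_listing_price_yy")
top_states = top["state"].tolist()
print(top_states)
```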