## Zillow Top Tier Housing Data


In this project we will look at top-tier home values across all 50 states.

We will be using the Zillow Home Value Index (ZHVI), a measure of typical home value and market changes across a given region and housing type. The standard (mid-tier) ZHVI reflects the typical value for homes in the 35th to 65th percentile range.

Zillow also publishes top-tier ZHVI ($, typical value for homes within the 65th to 95th percentile range for a given region), which is the series used in this project, and bottom-tier ZHVI ($, typical value for homes within the 5th to 35th percentile range).

A user guide for this data can be found at: [Zillow](https://www.zillow.com/research/zhvi-user-guide/). 
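To make the percentile bands concrete, here is a minimal sketch of what "35th to 65th" and "65th to 95th" percentile ranges look like on a set of home values. The values are synthetic and the helper `tier_band` is a hypothetical name; this only illustrates the bands themselves and is not Zillow's index methodology, which is described in the user guide.

```python
import numpy as np

# Synthetic home values for one imaginary region (illustration only, not Zillow data).
rng = np.random.default_rng(0)
values = rng.lognormal(mean=12.5, sigma=0.5, size=1_000)

def tier_band(values, lower_pct, upper_pct):
    """Return the values that fall between two percentiles (inclusive)."""
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return values[(values >= lo) & (values <= hi)]

mid_tier = tier_band(values, 35, 65)   # band described for the standard ZHVI
top_tier = tier_band(values, 65, 95)   # band described for the top-tier ZHVI

print(f"mid-tier band: {mid_tier.min():,.0f} to {mid_tier.max():,.0f}")
print(f"top-tier band: {top_tier.min():,.0f} to {top_tier.max():,.0f}")
```

Each band covers roughly 30% of the homes, and the top-tier band starts where the mid-tier band ends.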

In [1]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import sqlite3
import csv 
import numpy as np 
import warnings 
from pandasql import sqldf
# Ignore all warnings 
warnings.filterwarnings('ignore')


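# Load the Zillow CSV export into a DataFrame.
# The path below is specific to the author's machine; point it at your own copy of the file.

csv_file = r"C:\Users\Wolfrank\Desktop\Zillow.csv"
data = pd.read_csv(csv_file)
df = data  # df and data are two names for the same DataFrame, not separate copies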
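Because the cleaning step relies on specific identifier columns being present, it can help to check for them right after loading. The following is a minimal sketch under stated assumptions: `load_zhvi` and `EXPECTED_COLUMNS` are hypothetical names, the column set is taken from the columns this notebook drops or renames, and the one-row sample (including its `RegionID` value) is made up so the example runs without the real file.

```python
import io
import pandas as pd

# Identifier columns this notebook expects in the raw export.
EXPECTED_COLUMNS = {"RegionID", "SizeRank", "RegionName", "RegionType", "StateName"}

def load_zhvi(source):
    """Read a ZHVI CSV from a path or file-like object and check its identifier columns."""
    frame = pd.read_csv(source)
    missing = EXPECTED_COLUMNS - set(frame.columns)
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    return frame

# Tiny made-up sample standing in for the real file.
sample = io.StringIO(
    "RegionID,SizeRank,RegionName,RegionType,StateName,7/31/2019\n"
    "9,0,California,state,,1010157.03\n"
)
raw = load_zhvi(sample)
print(raw.shape)
```

With the real data, `load_zhvi(csv_file)` would be called with the file path instead of the in-memory sample.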


### Cleaning the Data 


In [2]:

# Next we clean up the DataFrame we just created.

# Data cleaning: drop identifier columns that are not needed for the analysis
columns_to_remove = ['RegionID', 'RegionType', 'StateName']
df.drop(columns=columns_to_remove, inplace=True)

# Rename the RegionName column to State
df.rename(columns={'RegionName': 'State'}, inplace=True)

# Shift the index so that rows are numbered from 1 instead of 0
df.index = df.index + 1

# Shift SizeRank so that the ranking also starts at 1
df['SizeRank'] = df['SizeRank'] + 1

# Show the number of columns
num_columns = len(df.columns)
print("Number of columns:", num_columns)

display(df)

Number of columns: 51


| | SizeRank | State | 7/31/2019 | 8/31/2019 | 9/30/2019 | 10/31/2019 | 11/30/2019 | 12/31/2019 | 1/31/2020 | 2/29/2020 | ... | 10/31/2022 | 11/30/2022 | 12/31/2022 | 1/31/2023 | 2/28/2023 | 3/31/2023 | 4/30/2023 | 5/31/2023 | 6/30/2023 | 7/31/2023 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | California | 1010157.03 | 1012997.2 | 1017687.95 | 1022485.24 | 1026952.62 | 1031142.67 | 1034553.47 | 1038088.6 | ... | 1388930.11 | 1376351.17 | 1365640.69 | 1349536.48 | 1331812.63 | 1318644.67 | 1313776.9 | 1316454.54 | 1325324.18 | 1341108.82 |
| 2 | 2 | Texas | 357269.45 | 357782.32 | 358412.61 | 359412.86 | 360688.93 | 362151.96 | 363891.37 | 365822.27 | ... | 532105.51 | 528545.81 | 524829.49 | 521187.45 | 518476.9 | 517509.82 | 517123.74 | 517491.48 | 518693.11 | 520137.03 |
| 3 | 3 | Florida | 418248.13 | 418802.55 | 419552.05 | 420841.5 | 422743.27 | 425119.6 | 427888.45 | 430657.11 | ... | 671682.72 | 670050.89 | 667964.77 | 664787.53 | 661775.13 | 661015.61 | 661671.22 | 663656.64 | 666775.03 | 670282.22 |
| 4 | 4 | New York | 690210.31 | 693015.33 | 693910.26 | 693100.09 | 693893.69 | 695226.93 | 698205.02 | 699981.7 | ... | 844798.5 | 840894.0 | 835562.38 | 830563.4 | 826754.7 | 826619.79 | 828228.14 | 831973.72 | 837724.8 | 844885.84 |
| 5 | 5 | Pennsylvania | 335935.47 | 336900.44 | 337641.14 | 338545.74 | 339749.56 | 341004.42 | 342010.48 | 343080.18 | ... | 442150.02 | 442246.2 | 442397.14 | 441695.92 | 441335.83 | 442410.63 | 444593.91 | 447405.56 | 450479.79 | 454024.12 |
| 6 | 6 | Illinois | 357412.36 | 356471.43 | 355391.39 | 354901.9 | 354797.42 | 355092.67 | 355916.39 | 357078.08 | ... | 431662.33 | 430503.82 | 429329.67 | 428739.96 | 428391.45 | 429022.68 | 429375.42 | 430677.89 | 433281.28 | 436567.53 |
| 7 | 7 | Ohio | 272835.08 | 273670.82 | 274235.67 | 275015.43 | 276117.2 | 277291.76 | 278632.96 | 279996.3 | ... | 371393.22 | 371248.72 | 371095.11 | 371301.9 | 371832.66 | 373447.41 | 375182.41 | 377375.3 | 379453.14 | 381272.81 |
| 8 | 8 | Georgia | 357878.82 | 358461.71 | 359081.64 | 360062.2 | 361342.18 | 363000.74 | 364967.97 | 367159.45 | ... | 524631.21 | 522735.07 | 520415.83 | 518330.42 | 516799.91 | 517033.03 | 517620.21 | 519478.18 | 522161.75 | 525356.7 |
| 9 | 9 | North Carolina | 355739.44 | 356297.63 | 356922.47 | 358036.28 | 359549.58 | 361386.43 | 363186.53 | 365085.51 | ... | 540551.45 | 538276.84 | 535798.16 | 533270.51 | 531354.74 | 531537.4 | 532334.4 | 534778.51 | 538365.26 | 542623.91 |
| 10 | 10 | Michigan | 299772.62 | 299910.94 | 299882.68 | 300244.12 | 301048.21 | 301914.79 | 303047.67 | 304480.98 | ... | 397356.16 | 395913.24 | 394584.94 | 394051.71 | 394175.0 | 395806.96 | 397219.51 | 399235.77 | 401152.93 | 402754.27 |
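The cleaning cell above modifies the DataFrame in place, so running it twice fails on the already-dropped columns and shifts `SizeRank` again. As one possible alternative, the same steps can be wrapped in a function that returns a cleaned copy. This is a sketch, not the notebook's own method: `clean_zhvi` is a hypothetical name, and the two-row input (including its `RegionID` values) is made up for illustration.

```python
import pandas as pd

def clean_zhvi(raw):
    """Return a cleaned copy of a raw state-level ZHVI frame; the input is left untouched."""
    cleaned = (
        raw.drop(columns=["RegionID", "RegionType", "StateName"])
           .rename(columns={"RegionName": "State"})
           .reset_index(drop=True)
    )
    cleaned.index = cleaned.index + 1              # number rows from 1
    cleaned["SizeRank"] = cleaned["SizeRank"] + 1  # start the ranking at 1
    return cleaned

# Made-up two-row stand-in for the raw file.
raw = pd.DataFrame({
    "RegionID": [9, 54],
    "SizeRank": [0, 1],
    "RegionName": ["California", "Texas"],
    "RegionType": ["state", "state"],
    "StateName": [None, None],
    "7/31/2019": [1010157.03, 357269.45],
})
cleaned = clean_zhvi(raw)
print(list(cleaned.columns))
```

Since `raw` is never modified, the function can be called repeatedly with the same result.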


In [3]:
# Get the column names and join them with a comma
column_names = ','.join(df.columns)

print(column_names)


SizeRank,State,7/31/2019,8/31/2019,9/30/2019,10/31/2019,11/30/2019,12/31/2019,1/31/2020,2/29/2020,3/31/2020,4/30/2020,5/31/2020,6/30/2020,7/31/2020,8/31/2020,9/30/2020,10/31/2020,11/30/2020,12/31/2020,1/31/2021,2/28/2021,3/31/2021,4/30/2021,5/31/2021,6/30/2021,7/31/2021,8/31/2021,9/30/2021,10/31/2021,11/30/2021,12/31/2021,1/31/2022,2/28/2022,3/31/2022,4/30/2022,5/31/2022,6/30/2022,7/31/2022,8/31/2022,9/30/2022,10/31/2022,11/30/2022,12/31/2022,1/31/2023,2/28/2023,3/31/2023,4/30/2023,5/31/2023,6/30/2023,7/31/2023
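The column listing shows that each month is stored as its own column, with the date as a string label. For time-series work it is often easier to reshape this wide layout into one row per state and month. The sketch below assumes that goal; the names `wide`, `long_df`, `Date`, and `ZHVI` are illustrative choices, and the small frame reuses the first two months for California and Texas from the output above.

```python
import pandas as pd

# Small wide-format frame with the same layout as the cleaned data.
wide = pd.DataFrame({
    "SizeRank": [1, 2],
    "State": ["California", "Texas"],
    "7/31/2019": [1010157.03, 357269.45],
    "8/31/2019": [1012997.20, 357782.32],
})

# Turn every date column into a (Date, ZHVI) pair per state.
long_df = wide.melt(id_vars=["SizeRank", "State"], var_name="Date", value_name="ZHVI")

# Parse the month-end labels (month/day/year) into real timestamps.
long_df["Date"] = pd.to_datetime(long_df["Date"], format="%m/%d/%Y")

long_df = long_df.sort_values(["State", "Date"]).reset_index(drop=True)
print(long_df)
```

The long layout has one value column, which suits plotting libraries such as seaborn that expect a column for the x-axis and a column for grouping.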
