# Relationship between Zillow Rent Index of CA Cities and Nearby Venues

## 1. Introduction/Business Problem

### Tenants look for an affordable city or area in which to live and work, and nearby venues are an important factor in that choice.
### Investors look for good places to buy houses and rent them out. Nearby venues may help indicate potential rental income and capital return.
### Therefore, this project explores the relationship between the Zillow Rent Index of CA cities and nearby venues from Foursquare.

### The results are intended to help two groups of stakeholders, tenants and landlords, choose where to rent or invest.


## 2. Data Source

### 2.1 Zillow Rent Index data from Zillow
### Zillow Rent Index (ZRI): A smoothed measure of the typical estimated market rate rent across a given region and housing type. ZRI, which is a dollar-denominated alternative to repeat-rent indices, is the mean of rent estimates that fall into the 40th to 60th percentile range for all homes and apartments in a given region, including those not currently listed for rent.
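
To make the "mean of the 40th to 60th percentile range" part of the definition concrete, here is a minimal sketch on made-up numbers. The rent values are hypothetical, and the sketch ignores the smoothing and the rent-estimate model that Zillow applies, so it only illustrates the percentile-band averaging step.

```python
import numpy as np

# Hypothetical monthly rent estimates (USD) for one region; not Zillow data.
rent_estimates = np.array([1200, 1500, 1650, 1700, 1750,
                           1800, 1850, 1900, 2400, 5000])

# Lower and upper bounds of the 40th-60th percentile band.
low, high = np.percentile(rent_estimates, [40, 60])

# Keep only the estimates inside the band, then average them.
in_band = rent_estimates[(rent_estimates >= low) & (rent_estimates <= high)]
zri_like = in_band.mean()

print(low, high, zri_like, rent_estimates.mean())
```

On this sample the band mean is 1775, while the plain mean is 2075 because the single 5000 outlier pulls it up. That robustness to extreme values is the reason for averaging only the middle band.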

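### Housing types: multifamily, single-family residence (SFR), condo, and co-op
### Geography: city level, California only
### Columns: city size rank (SizeRank; lower means larger, e.g. New York is 0), rent index in dollars (Zri), and the number of records behind each city's index (ZriRecordCnt)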

In [1]:
import pandas as pd # library for data analysis

In [2]:
df=pd.read_csv('http://files.zillowstatic.com/research/public/City/City_Zri_AllHomesPlusMultifamily_Summary.csv')
df.head()

Unnamed: 0,Date,RegionName,State,Metro,County,SizeRank,Zri,MoM,QoQ,YoY,ZriRecordCnt
0,2019-12-31,New York,NY,New York-Newark-Jersey City,Queens County,0,2400,0.0028,0.0196,0.0371,2099299
1,2019-12-31,Los Angeles,CA,Los Angeles-Long Beach-Anaheim,Los Angeles County,1,2840,0.0026,0.0142,0.032,824116
2,2019-12-31,Houston,TX,Houston-The Woodlands-Sugar Land,Harris County,2,1410,0.0123,0.0204,0.0056,898628
3,2019-12-31,Chicago,IL,Chicago-Naperville-Elgin,Cook County,3,1710,0.0003,-0.0121,0.0103,807202
4,2019-12-31,San Antonio,TX,San Antonio-New Braunfels,Bexar County,4,1200,0.0068,0.0016,-0.0007,518784


In [3]:
# Keep the California rows and only the columns used later; RegionName is renamed to City below
df_CA=df[df['State']=='CA'][['RegionName','SizeRank','Zri','ZriRecordCnt']].reset_index(drop=True)
df_CA.columns=['City','SizeRank','Zri','ZriRecordCnt']
df_CA.head()

Unnamed: 0,City,SizeRank,Zri,ZriRecordCnt
0,Los Angeles,1,2840,824116
1,San Diego,8,2610,406428
2,San Jose,11,3130,257852
3,San Francisco,14,4220,182102
4,Sacramento,29,1700,185705


### 2.2 Latitude and Longitude of California Cities
### Maps of World (California Latitude and Longitude Map, city level)

In [4]:
import requests # library to handle requests

In [5]:
url='https://www.mapsofworld.com/usa/states/california/lat-long.html'
web_text=requests.get(url).text

In [6]:
from bs4 import BeautifulSoup
soup=BeautifulSoup(web_text,'lxml')
#print(soup.prettify())

In [7]:
result_table=soup.find_all('table',{'class':'tableizer-table'})
#result_table

In [8]:
ths = result_table[1].find_all('th')
headings = [th.text.strip() for th in ths]
headings

['Location', 'Latitude', 'Longitude']

In [9]:
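data_list=[]
# Collect the data rows from tables 1 and 2 of the page
for i in [1,2]:
    for tr in result_table[i].find_all('tr'):
        tds = tr.find_all('td')
        if tds:  # skip header rows, which contain only <th> cells
            data_list.append([td.text.strip() for td in tds])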

In [10]:
df_coords=pd.DataFrame(data_list,columns=headings)
df_coords.columns=['City','Latitude','Longitude']
df_coords.head()

Unnamed: 0,City,Latitude,Longitude
0,Acalanes Ridge,37.9,-122.08
1,Acampo,38.17,-121.28
2,Acton,34.5,-118.19
3,Adelanto city,34.59,-117.44
4,Adin,41.2,-120.95
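
The scraped values are all strings, and some names carry a census-style suffix (for example `Adelanto city`), which would not match the Zillow name `Adelanto`. The sketch below shows one possible cleanup. The helper `clean_coords` and the `sample` frame are hypothetical, and the ` town` suffix is an assumption, since only ` city` appears in the output above.

```python
import pandas as pd

# Hypothetical rows shaped like df_coords: every column is a string after scraping.
sample = pd.DataFrame({
    'City': ['Acton', 'Adelanto city', 'Adin'],
    'Latitude': ['34.5', '34.59', '41.2'],
    'Longitude': ['-118.19', '-117.44', '-120.95'],
})

def clean_coords(df):
    """Return a copy with numeric coordinates and lowercase suffixes removed."""
    out = df.copy()
    # Drop a trailing lowercase " city" or " town" (the " town" case is assumed).
    # The match is case-sensitive, so a name like "Culver City" is left alone.
    out['City'] = (out['City']
                   .str.replace(r'\s+(city|town)$', '', regex=True)
                   .str.strip())
    # Convert coordinates to floats; unparseable values become NaN.
    for col in ['Latitude', 'Longitude']:
        out[col] = pd.to_numeric(out[col], errors='coerce')
    return out

cleaned = clean_coords(sample)
print(cleaned)
print(cleaned.dtypes)
```

Applying the same function to `df_coords` would give float coordinates that can be passed to a location query directly.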


### 2.3 Foursquare Location Data to Get Nearby Venues
### Merge the two tables above on city name, then use each city's latitude and longitude to get its nearby venues from Foursquare.
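
A minimal sketch of the merge step follows, using small stand-in frames rather than the real `df_CA` and `df_coords`. The rent values are copied from the output in section 2.1, the coordinates are approximate illustrative values, and the sketch assumes city names already match exactly between the two sources. The `indicator=True` option is used only to list cities that found no coordinates.

```python
import pandas as pd

# Stand-in for df_CA (values taken from the section 2.1 output).
rents = pd.DataFrame({
    'City': ['Los Angeles', 'San Diego', 'Sacramento'],
    'SizeRank': [1, 8, 29],
    'Zri': [2840, 2610, 1700],
    'ZriRecordCnt': [824116, 406428, 185705],
})

# Stand-in for df_coords (approximate, illustrative coordinates).
coords = pd.DataFrame({
    'City': ['Los Angeles', 'San Diego', 'Acton'],
    'Latitude': [34.05, 32.72, 34.5],
    'Longitude': [-118.24, -117.16, -118.19],
})

# Left merge keeps every rent row; _merge records whether coordinates were found.
merged = rents.merge(coords, on='City', how='left', indicator=True)

# Cities with rent data but no coordinates; worth inspecting before dropping.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'City'].tolist()

# Keep only fully matched rows for the venue lookup.
df_merged = (merged[merged['_merge'] == 'both']
             .drop(columns='_merge')
             .reset_index(drop=True))

print(unmatched)
print(df_merged)
```

Checking `unmatched` first helps separate genuine gaps from naming mismatches between the two sources.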

### The Nearby Venues table is produced in Final_project_part1 (after data wrangling).
### Then, a new table will display the top 10 venues for each city.
### Next, KMeans will cluster the cities by their top 10 venues.
### Finally, the conclusion will compare the cluster labels with SizeRank, Zri, and ZriRecordCnt.
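
The remaining steps can be sketched on toy data as follows. Everything here is hypothetical: the city names, venue categories, and rents are invented, `top_n` is 2 instead of 10 to keep the output small, and clustering on the per-city category frequency matrix is one common approach, not necessarily the one used in Final_project_part1.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical (city, venue category) rows; real rows would come from Foursquare.
venues = pd.DataFrame({
    'City': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
    'Category': ['Coffee Shop', 'Coffee Shop', 'Park',
                 'Coffee Shop', 'Park', 'Park',
                 'Bar', 'Bar', 'Gym',
                 'Bar', 'Gym', 'Gym'],
})

# One row per city, one column per category; values are shares of that city's venues.
freq = pd.crosstab(venues['City'], venues['Category'], normalize='index')

# Most frequent categories per city (the project uses the top 10).
top_n = 2
top_venues = {city: row.nlargest(top_n).index.tolist()
              for city, row in freq.iterrows()}

# Cluster cities on their category frequencies.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(freq)
labels = pd.Series(kmeans.labels_, index=freq.index, name='Cluster')

# Compare cluster labels with a rent column (hypothetical Zri values).
rents = pd.DataFrame({'City': ['A', 'B', 'C', 'D'],
                      'Zri': [2000, 2100, 1500, 1400]}).set_index('City')
summary = rents.join(labels).groupby('Cluster')['Zri'].mean()

print(top_venues)
print(labels)
print(summary)
```

The cluster numbers themselves are arbitrary, so the comparison should focus on which cities share a label and how Zri, SizeRank, and ZriRecordCnt differ between the groups.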