# Zillow ML Application
Project function: this application identifies and analyzes attractive real estate deals based on geolocation.
# Visuals 
- Mapbox
- Histogram 
## What will we be analyzing?
- Fair Market Price by neighborhood cluster
- Top deals 


In [None]:
import requests
import pandas as pd 

In [26]:
#Search String
pd.set_option('display.max_columns', None)
city = 'Boston'
state = 'MA'
search_str = city + ', ' +state
print('Search string:', search_str)


Search string: Boston, MA


In [27]:
# API request for For Sale listings
url = ""
querystring = {"location":search_str}

headers = {
    'x-rapidapi-host': "",
    'x-rapidapi-key': ""
    }

In [29]:
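response = requests.get(url, headers=headers, params=querystring)
response.raise_for_status()  # stop early on an HTTP error instead of parsing an error body
data = response.json()  # named 'data' to avoid shadowing the standard json module
# Flatten the list of listings under the 'props' key into one row per property
df = pd.json_normalize(data=data['props'])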

## Sort Options Available Through API

For status_type = RecentlySold are available:

- Homes_for_You
- Price_High_Low
- Price_Low_High
- Newest
- Bedrooms
- Bathrooms
- Square_Feet
- Lot_Size

Default: Homes_for_You

For status_type = ForRent are available:

- Verified_Source
- Payment_High_Low
- Payment_Low_High
- Newest
- Bedrooms
- Bathrooms
- Square_Feet
- Lot_Size

Default: Verified_Source
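
The option lists above can be encoded as a small lookup so that an invalid combination is caught before a request is sent. This is a minimal sketch only: `status_type` and the option names come from the lists above, while the `sort` query-parameter name and the `build_query` helper are assumptions made for illustration and are not confirmed API details.

```python
# Sort options per status_type, copied from the lists above.
SORT_OPTIONS = {
    "RecentlySold": ["Homes_for_You", "Price_High_Low", "Price_Low_High", "Newest",
                     "Bedrooms", "Bathrooms", "Square_Feet", "Lot_Size"],
    "ForRent": ["Verified_Source", "Payment_High_Low", "Payment_Low_High", "Newest",
                "Bedrooms", "Bathrooms", "Square_Feet", "Lot_Size"],
}


def build_query(location, status_type, sort=None):
    """Return a querystring dict (hypothetical helper; the 'sort' key name is assumed)."""
    options = SORT_OPTIONS[status_type]  # raises KeyError for an unknown status_type
    if sort is None:
        sort = options[0]  # the first entry of each list is the stated default
    if sort not in options:
        raise ValueError(f"{sort!r} is not valid for status_type={status_type!r}")
    return {"location": location, "status_type": status_type, "sort": sort}


print(build_query("Boston, MA", "ForRent"))
```

The returned dict could be passed as `params=` in place of `querystring`, provided the real parameter names match.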

In [31]:
df.head()

Unnamed: 0,dateSold,propertyType,lotAreaValue,address,priceChange,zestimate,imgSrc,price,bedrooms,contingentListingType,longitude,latitude,listingStatus,zpid,rentZestimate,daysOnZillow,bathrooms,livingArea,country,currency,lotAreaUnit,hasImage,variableData.text,variableData.type,listingSubType.is_FSBA,listingSubType.is_openHouse,unit,variableData
0,,SINGLE_FAMILY,9000.0,"95 Loring St, Hyde Park, MA 02136",,525512.0,https://photos.zillowstatic.com/fp/49eac7ae6c3...,499900,3,,-71.12415,42.245888,FOR_SALE,59128352,3200.0,-1,2,1433,USA,USD,sqft,True,Open: Sat. 12-2pm,OPEN_HOUSE,True,True,,
1,,CONDO,,"1 Franklin St UNIT 4204, Boston, MA 02110",,3681318.0,https://photos.zillowstatic.com/fp/f6eab9d54be...,3900000,3,,-71.059555,42.356316,FOR_SALE,246887237,14357.0,-1,3,2096,USA,USD,,True,Open: Sat. 11:30am-12:30pm,OPEN_HOUSE,True,True,Unit 4204,
2,,SINGLE_FAMILY,1980.0,"8 Claremont Park, Boston, MA 02118",,,https://photos.zillowstatic.com/fp/3cfd53392ba...,5495000,5,,-71.08123,42.34185,FOR_SALE,113389934,5441.0,-1,7,4440,USA,USD,sqft,True,Open: Sun. 12-1:30pm,OPEN_HOUSE,True,True,,
3,,SINGLE_FAMILY,1785.0,"504 E Broadway, Boston, MA 02127",,,https://photos.zillowstatic.com/fp/c9eaea44d06...,1599900,4,,,,FOR_SALE,2066646258,,-1,3,2868,USA,USD,sqft,True,Open: Thu. 4:30-6pm,OPEN_HOUSE,True,True,,
4,,SINGLE_FAMILY,4800.0,"5 Lewiston St, Hyde Park, MA 02136",,553797.0,https://photos.zillowstatic.com/fp/696de1d5607...,525000,3,,-71.112656,42.266773,FOR_SALE,59125651,3499.0,-1,2,1270,USA,USD,sqft,True,Open: Sat. 11am-12:30pm,OPEN_HOUSE,True,True,,


In [37]:
df.columns

Index(['dateSold', 'propertyType', 'lotAreaValue', 'address', 'priceChange',
       'zestimate', 'imgSrc', 'price', 'bedrooms', 'contingentListingType',
       'longitude', 'latitude', 'listingStatus', 'zpid', 'rentZestimate',
       'daysOnZillow', 'bathrooms', 'livingArea', 'country', 'currency',
       'lotAreaUnit', 'hasImage', 'variableData.text', 'variableData.type',
       'listingSubType.is_FSBA', 'listingSubType.is_openHouse', 'unit',
       'variableData'],
      dtype='object')

# Variable Uses

### KNN
- zpid: possibly usable as a unique geospatial identifier (to be confirmed)
- Lat/Long (latitude / longitude)
- Property type (propertyType)
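
One way the KNN idea might look in practice is to estimate a price from the k nearest listings of the same property type. The sketch below uses made-up toy listings, and `knn_price_estimate` is a hypothetical helper. Treating raw latitude and longitude as Euclidean coordinates is a simplification that is only reasonable within a single city.

```python
import numpy as np
import pandas as pd

# Toy listings (made-up values) using column names from the API response.
toy = pd.DataFrame({
    "propertyType": ["CONDO", "CONDO", "CONDO", "SINGLE_FAMILY", "SINGLE_FAMILY"],
    "latitude": [42.350, 42.351, 42.360, 42.250, 42.260],
    "longitude": [-71.060, -71.061, -71.070, -71.120, -71.110],
    "price": [900_000, 1_000_000, 2_000_000, 500_000, 520_000],
})


def knn_price_estimate(listings, lat, lon, property_type, k=2):
    """Mean price of the k nearest listings of the same type (hypothetical helper)."""
    same_type = listings[listings["propertyType"] == property_type]
    # Euclidean distance in degrees: a simplification, see the note above.
    dist = np.hypot(same_type["latitude"] - lat, same_type["longitude"] - lon)
    nearest = dist.nsmallest(k).index
    return same_type.loc[nearest, "price"].mean()


estimate = knn_price_estimate(toy, 42.3505, -71.0605, "CONDO", k=2)
print(estimate)
```

Comparing a listing's asking price with such a neighbor-based estimate is one possible signal for a deal. The choice of k and the distance measure would need tuning on real data.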

### K-means
Clustering has to be run separately for each property type.
- Lat/Long (latitude / longitude)
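
Running K-means once per property type could be sketched as follows. The coordinates are made-up toy data, and `n_clusters=2` is an arbitrary choice for the example, not a recommendation. The sketch assumes scikit-learn is installed.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy listings (made-up coordinates) with the API's column names.
toy = pd.DataFrame({
    "propertyType": ["CONDO"] * 4 + ["SINGLE_FAMILY"] * 4,
    "latitude": [42.350, 42.351, 42.300, 42.301, 42.250, 42.251, 42.280, 42.281],
    "longitude": [-71.060, -71.061, -71.100, -71.101, -71.120, -71.121, -71.150, -71.151],
})

toy["cluster"] = -1  # placeholder until each group has been clustered
for ptype, group in toy.groupby("propertyType"):
    # Fit a separate model per property type on location only.
    km = KMeans(n_clusters=2, n_init=10, random_state=0)
    toy.loc[group.index, "cluster"] = km.fit_predict(group[["latitude", "longitude"]])

print(toy)
```

Cluster labels are only meaningful within a property type here: cluster 0 for condos and cluster 0 for single-family homes are unrelated.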

### Regression
Listings have to be filtered by home type before fitting.
- Price (price)
- Square footage (livingArea)
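
A minimal sketch of the regression step, assuming `livingArea` is the square-footage column and using made-up toy values. The single-family prices are constructed to follow 300 * sqft + 50,000 exactly so the fit is easy to check by eye. `np.polyfit` stands in for whichever regression library is eventually chosen.

```python
import numpy as np
import pandas as pd

# Toy listings (made-up values); single-family prices follow 300 * sqft + 50,000 exactly.
toy = pd.DataFrame({
    "propertyType": ["SINGLE_FAMILY"] * 4 + ["CONDO"] * 2,
    "livingArea": [1000, 1500, 2000, 2500, 800, 2000],
    "price": [350_000, 500_000, 650_000, 800_000, 600_000, 3_000_000],
})

# Filter on home type first so condos do not distort the single-family fit.
homes = toy[toy["propertyType"] == "SINGLE_FAMILY"]

# Least-squares fit of price = slope * livingArea + intercept.
slope, intercept = np.polyfit(homes["livingArea"], homes["price"], deg=1)

# A negative residual means a listing is priced below the fitted value for its size.
residual = homes["price"] - (slope * homes["livingArea"] + intercept)
print(round(slope, 2), round(intercept, 2))
```

On real data the residuals would not be zero, and the most negative ones could feed a "top deals" ranking.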



# Junk Code

import numpy as np
import plotly.express as px
from sklearn.cluster import KMeans

# Fill missing numeric values with the column mean (numeric_only skips text columns)
df.fillna(df.mean(numeric_only=True), inplace=True)

# Cluster on location plus one-hot encoded property type
X = df[['propertyType', 'latitude', 'longitude']]
X = pd.get_dummies(X, columns=['propertyType'])
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Generate dot sizes in the 5-15 range based on home price
dot_sizes = np.interp(df['price'], (df['price'].min(), df['price'].max()), (5, 15))

# Single map: color by property type, size by price, price shown on hover.
# Layout annotations are positioned in axis/paper coordinates, not lat/lon,
# so the hover label replaces the per-row price annotations.
fig = px.scatter_mapbox(df, lat='latitude', lon='longitude', color='propertyType',
                        size=dot_sizes, size_max=15, hover_data=['price'], zoom=10,
                        mapbox_style='open-street-map', width=800, height=600)

# Update the layout to adjust the font size of the title and legend
fig.update_layout(title='Property Clusters', title_font_size=20, legend_font_size=16)

