# Analysis of rising housing costs for UCSD students
### By
- Andrew T. Li, A12818225
- Andrew Yoo, A11346949
- Chelin Huang, A53053719
- Alex LaBranche, A14266131
- Matthias Baker, A13788705

### Introduction and Background:
As students, we have to worry about graduating with good grades and without too much debt. Facing these concerns with so many unknowns can make a student's life more difficult than it has to be, so we wanted to explore one of those unknowns: off-campus housing. Many of us live off campus because UCSD no longer offers a four-year on-campus housing guarantee, and many of us rely on financial aid. We therefore wanted to explore the relationship between off-campus housing costs and rising tuition, and whether financial aid is keeping up with both.

Our group would like to determine whether housing prices and tuition costs are rising while financial aid fails to adjust accordingly. If that is the case, we argue that financial aid packages should be adjusted.


### DataSet: Zillow API
- Link to the dataset: https://www.zillow.com/howto/api/APIOverview.htm

From the Zillow API we are using the GetDeepSearchResults call. It returns information such as rent price, when the home entered the market, location, region name, and much more. The specific fields our group is interested in are location, the time the property entered the market, and rent per month. These fields will allow us to analyze the financial impact of housing on students in the UCSD area.

The Zillow API allows up to 1,000 requests per day, and for the purposes of our project we will use roughly 10,000 observations. These observations range from 2010 to 2017.
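
The request limit and the target sample size together imply a minimum collection time. A minimal sketch of that arithmetic, assuming (this is our assumption, not something the API guarantees) that each request yields at most one usable observation and that a single API key is used:

```python
import math

DAILY_REQUEST_LIMIT = 1000     # requests per day, from the text above
TARGET_OBSERVATIONS = 10_000   # observations wanted, from the text above

# Assumption: one request gives at most one observation, so this is a lower
# bound; requests that return no usable data still count against the limit.
days_needed = math.ceil(TARGET_OBSERVATIONS / DAILY_REQUEST_LIMIT)
print(days_needed)
```

Under these assumptions the collection takes at least ten days, and likely longer if many lookups fail.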

### DataSet: UCSD Financial Aid
- Link to data:


### Data Cleaning

In order to analyze housing prices near UCSD, the data pulled from Zillow needed to be cleaned. Observations needed to be filtered so that the homes and apartments were in University City and were posted between 2010 and 2017.
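
The filtering rule described above could look roughly like the following. This is only a sketch: the records, the field names (`neighborhood`, `listed`, `rent`) and the helper `in_scope` are hypothetical and do not reflect the actual structure of the Zillow response.

```python
from datetime import date

# Hypothetical records; field names and values are illustrative only.
listings = [
    {"zpid": 1, "neighborhood": "University City", "listed": date(2012, 5, 1), "rent": 1800},
    {"zpid": 2, "neighborhood": "Pacific Beach", "listed": date(2013, 7, 15), "rent": 2100},
    {"zpid": 3, "neighborhood": "University City", "listed": date(2008, 3, 9), "rent": 1500},
    {"zpid": 4, "neighborhood": "University City", "listed": date(2017, 12, 31), "rent": 2400},
]

def in_scope(listing, neighborhood="University City", first_year=2010, last_year=2017):
    """Keep a listing only if it is in the target area and time period."""
    return (listing["neighborhood"] == neighborhood
            and first_year <= listing["listed"].year <= last_year)

cleaned = [item for item in listings if in_scope(item)]
kept_zpids = [item["zpid"] for item in cleaned]
print(kept_zpids)
```

Keeping the rule in one small function makes it easy to change the neighborhood or the year range later without touching the rest of the pipeline.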

In [1]:
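# urllib.request must be imported explicitly; "import urllib" alone does not guarantee it is loaded
import urllib.request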
import json
import requests
import pprint

# For XML -> json
import xml.etree.ElementTree as ET
from xmljson import badgerfish as bf



In [3]:
# Put in your zws_id here as a string
zws_id=""

In [None]:
# Request the zpid-only URL and follow the redirect; the final URL contains the address
response = requests.get("https://www.zillow.com/homedetails/16842323_zpid/")
if response.history:
    print("Request was redirected")
    for resp in response.history:
        print(resp.status_code, resp.url)
    print("Final destination:")
    print(response.status_code, response.url)
else:
    print("Request was not redirected")

In [None]:
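def API_URL(zws_id, zpid):
    # Follow the redirect from the zpid-only URL; the third-from-last path
    # segment of the final URL is a hyphen-separated address slug
    response = requests.get("https://www.zillow.com/homedetails/" + str(zpid) + "_zpid/")
    parts = response.url.split('/')[-3].split('-')
    # Everything before the last four tokens is treated as the street address
    street = parts[:-4]
    if not street:
        # No address in the final URL; the caller treats IndexError as "skip this zpid"
        raise IndexError("no street address found for zpid " + str(zpid))
    url = "http://www.zillow.com/webservice/GetDeepSearchResults.htm?zws-id=" + zws_id
    url += "&address=" + "+".join(street)
    # Assumes a two-word city name (tokens -4 and -3); the state is hard-coded as CA
    url += "&citystatezip=" + parts[-4] + "+" + parts[-3]
    url += "%2C+CA&rentzestimate=true"
    return url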


def rentData_URL(zpid):
    return "https://www.zillow.com/ajax/homedetail/HomeValueChartData.htm?mt=9&zpid=" + str(zpid) + "&format=json"

result_str = '{http://www.zillow.com/static/xsd/SearchResults.xsd}searchresults'
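
The slug-splitting rule used in `API_URL` can be tried offline, without any network request. The sketch below isolates that rule in a hypothetical helper `split_slug`; both example slugs are made up for illustration and are not real listings.

```python
def split_slug(slug):
    """Split an address slug the same way API_URL does.

    Returns (street, city) as '+'-joined strings. The last four tokens are
    assumed to be a two-word city followed by two more tokens.
    """
    parts = slug.split("-")
    street = "+".join(parts[:-4])
    city = parts[-4] + "+" + parts[-3]
    return street, city

# Hypothetical slug with a two-word city: the rule works as intended.
two_word = split_slug("123-Example-St-San-Diego-CA-92122")
print(two_word)

# Hypothetical slug with a one-word city: the street suffix leaks into the city.
one_word = split_slug("456-Sample-Ave-Irvine-CA-92612")
print(one_word)
```

The second call shows the limitation of the fixed offsets: they only hold for two-word city names such as San Diego, which is the area this project targets.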

In [None]:
# Open in 'a' mode so newly scraped data is appended to earlier runs
start_zpid = 16837859
num_zpids = 0  # number of consecutive zpids to try; with 0 the loop makes no requests

with open('data.json', 'a') as outfile:
    for zpid in range(start_zpid, start_zpid + num_zpids):
        try:
            # XML response -> badgerfish dict -> plain dict via a JSON round trip
            xml_root = ET.fromstring(urllib.request.urlopen(API_URL(zws_id, zpid)).read())
            json_property = json.loads(json.dumps(bf.data(xml_root)))
        except IndexError:
            # API_URL could not extract an address for this zpid
            continue
        code = json_property[result_str]['message']['code']['$']
        if code == 7:
            print("This account has reached its maximum number of calls for today")
            print("The last index:")
            print(zpid)
            break
        if code != 0:
            # Any other non-zero code: skip this zpid
            continue
        json_propertyResponse = json_property[result_str]['response']
        json_rentHistory = json.loads(urllib.request.urlopen(rentData_URL(zpid)).read())[0]
        json_propertyResponse["HomeValueChartData"] = json.dumps(json_rentHistory)
        # Write one JSON object per line
        merged_json = {'zpid': zpid, 'data': json_propertyResponse}
        json.dump(merged_json, outfile)
        outfile.write('\n')
print("Done")
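
The scraping loop indexes the parsed response with `result_str`, whose value has the form `{namespace}tag`. That is how `xml.etree.ElementTree` names a namespaced element. A minimal sketch of this behavior, using a tiny hand-written XML string (an assumption for illustration, not a real Zillow response):

```python
import xml.etree.ElementTree as ET

NS = "http://www.zillow.com/static/xsd/SearchResults.xsd"

# Hand-written sample; only the namespaced root and the message code matter here.
sample_xml = (
    '<SearchResults:searchresults xmlns:SearchResults="' + NS + '">'
    "<message><text>placeholder</text><code>0</code></message>"
    "</SearchResults:searchresults>"
)

root = ET.fromstring(sample_xml)

# The prefixed root tag is reported as "{namespace}localname".
print(root.tag)

# The child elements carry no prefix, so they are addressed by plain names.
code = int(root.find("message/code").text)
print(code)
```

Because only the root element is namespaced in this sample, the nested `message` and `code` lookups use plain names, which matches how the loop reads the status code.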

In [None]:
# Load the resulting JSON file into dest_json, keyed by zpid
dest_json = dict()
with open('data.json', 'r') as infile:
    for line in infile:
        temp = json.loads(line)
        dest_json[temp['zpid']] = temp['data']

In [None]:
pprint.pprint(dest_json[16842300])