## Introduction: Rental Housing in San Francisco

<div class="alert alert-block alert-info" style="margin-top: 20px">
    
1. [Problem and Discussion](#0)<br>
2. [Description of Data and How it will be Used](#1)<br>
</div>
<hr>

### 1. Problem and Discussion<a id="0"></a>

Relocation is often a time-consuming process. Finding affordable housing in a safe neighborhood with preferred venues nearby is a big challenge.  Data science can shorten the search by providing interactive visual tools in a Jupyter notebook.  The goal of this project is to provide such a tool for San Francisco, using the **Folium** library to segment and cluster data visually on a map.  This notebook lets users tweak a few parameters and shows crime rates by neighborhood, rental price ranges, and venues on interactive maps.<br><br>
This project can help tenants who are considering a move to San Francisco and landlords who are deciding on a reasonable rent, since interactive visual aids let users take in the information quickly and intuitively. Foursquare data and mapping techniques, combined with data analysis, will show clustered venues alongside rents and crime rates in a single map. Lastly, this project is a good practical case for developing data science skills.<br>

### 2. Description of Data and How it will be Used<a id="1"></a>
<br>

In this Jupyter notebook, the focus area is San Francisco.  The notebook uses GeoJSON data from DataSF (https://data.sfgov.org/api/geospatial/pty2-tcw4?method=export&format=GeoJSON) for geographical information, the police department incident reports from DataSF (https://data.sfgov.org/api/views/wg3w-h783/rows.csv?accessType=DOWNLOAD) for crime statistics, and python-craigslist to retrieve a set of the most recent rental posts to show on an interactive map.  The raw Craigslist data is scraped programmatically.  The data is used to generate statistics and interactive visual aids for users.<br><br>
Foursquare and geopy data are used to map the top 10 venues for every San Francisco neighborhood, clustered into groups (as per the course lab). Foursquare and geopy data are also used to map the locations of available rental housing and crime rates, both separately and on top of the clustered map, in order to identify the venues.  The rental housing markers display the rent and the URL of the post in their popups. In addition, a box plot and a choropleth map show rent statistics and average rents, respectively, to give a general price trend across the neighborhoods.
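
The per-neighborhood rent statistics described above could be prepared along the following lines. This is only a minimal sketch using the standard library: the `listings` records, their field names, the `rent_summary` helper, and the rent values are all made-up placeholders for illustration, not data retrieved from Craigslist.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical listings: field names and rents are illustrative placeholders,
# not real Craigslist results.
listings = [
    {"neighborhood": "Mission", "rent": 3200},
    {"neighborhood": "Mission", "rent": 2800},
    {"neighborhood": "Lakeshore", "rent": 2500},
    {"neighborhood": "Pacific Heights", "rent": 4100},
    {"neighborhood": "Pacific Heights", "rent": 3900},
    {"neighborhood": "Pacific Heights", "rent": None},  # post without a price
]


def rent_summary(rows):
    """Group rents by neighborhood and return count, mean, and median."""
    by_hood = defaultdict(list)
    for row in rows:
        if row["rent"] is not None:  # skip posts that list no price
            by_hood[row["neighborhood"]].append(row["rent"])
    return {
        hood: {"count": len(rents), "mean": mean(rents), "median": median(rents)}
        for hood, rents in by_hood.items()
    }


summary = rent_summary(listings)
for hood, stats in sorted(summary.items()):
    print(hood, stats)
```

In a setup like this, the mean per neighborhood is the kind of value a choropleth would color by, while the ungrouped rent lists are what a box plot would draw from.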

In [1]:
!pip install wget 

Collecting wget
  Downloading https://files.pythonhosted.org/packages/47/6a/62e288da7bcda82b935ff0c6cfe542970f04e29c756b0e147251b2fb251f/wget-3.2.zip
Building wheels for collected packages: wget
  Building wheel for wget (setup.py) ... done
  Stored in directory: /home/dsxuser/.cache/pip/wheels/40/15/30/7d8f7cea2902b4db79e3fea550d7d7b85ecb27ef992b618f3f
Successfully built wget
Installing collected packages: wget
Successfully installed wget-3.2


In [5]:
# Quote the URLs so the shell does not treat '&' as the background operator
!wget -q -O 'SF Find Neighborhoods.geojson' 'https://data.sfgov.org/api/geospatial/pty2-tcw4?method=export&format=GeoJSON'
!wget -q -O 'Police_Department_Incident_Reports__2018_to_Present.csv' 'https://data.sfgov.org/api/views/wg3w-h783/rows.csv?accessType=DOWNLOAD'


In [6]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
df_incidents = pd.read_csv('Police_Department_Incident_Reports__2018_to_Present.csv')
print('Dataset downloaded and read into a pandas dataframe!')
df_incidents.head()

Dataset downloaded and read into a pandas dataframe!


Unnamed: 0,Incident Datetime,Incident Date,Incident Time,Incident Year,Incident Day of Week,Report Datetime,Row ID,Incident ID,Incident Number,CAD Number,Report Type Code,Report Type Description,Filed Online,Incident Code,Incident Category,Incident Subcategory,Incident Description,Resolution,Intersection,CNN,Police District,Analysis Neighborhood,Supervisor District,Latitude,Longitude,point,SF Find Neighborhoods,Current Police Districts,Current Supervisor Districts,Analysis Neighborhoods,HSOC Zones as of 2018-06-05,OWED Public Spaces,Central Market/Tenderloin Boundary Polygon - Updated,Parks Alliance CPSI (27+TL sites),ESNCAG - Boundary File,"Areas of Vulnerability, 2016"
0,2020/02/03 02:45:00 PM,2020/02/03,14:45,2020,Monday,2020/02/03 05:50:00 PM,89881675000,898816,200085557,200342870.0,II,Initial,,75000,Missing Person,Missing Person,Found Person,Open or Active,20TH AVE \ WINSTON DR,33719000.0,Taraval,Lakeshore,7.0,37.72695,-122.476039,POINT (-122.47603947349434 37.72694991292525),41.0,10.0,8.0,16.0,,,,,,2.0
1,2020/02/03 03:45:00 AM,2020/02/03,03:45,2020,Monday,2020/02/03 03:45:00 AM,89860711012,898607,200083749,200340316.0,II,Initial,,11012,Stolen Property,Stolen Property,"Stolen Property, Possession with Knowledge, Re...",Cite or Arrest Adult,24TH ST \ SHOTWELL ST,24064000.0,Mission,Mission,9.0,37.75244,-122.415172,POINT (-122.41517229045435 37.752439644389675),53.0,3.0,2.0,20.0,3.0,,,,,2.0
2,2020/02/03 10:00:00 AM,2020/02/03,10:00,2020,Monday,2020/02/03 10:06:00 AM,89867264015,898672,200084060,200340808.0,II,Initial,,64015,Non-Criminal,Other,"Aided Case, Injured or Sick Person",Open or Active,MARKET ST \ POWELL ST,34016000.0,Tenderloin,Financial District/South Beach,3.0,37.78456,-122.407337,POINT (-122.40733704162238 37.784560141211806),19.0,5.0,3.0,8.0,,35.0,,,,2.0
3,2020/01/19 05:12:00 PM,2020/01/19,17:12,2020,Sunday,2020/02/01 01:01:00 PM,89863571000,898635,206024187,,II,Coplogic Initial,True,71000,Lost Property,Lost Property,Lost Property,Open or Active,,,Taraval,,,,,,,,,,,,,,,
4,2020/01/05 12:00:00 AM,2020/01/05,00:00,2020,Sunday,2020/02/03 04:09:00 PM,89877368020,898773,200085193,200342341.0,II,Initial,,68020,Miscellaneous Investigation,Miscellaneous Investigation,Miscellaneous Investigation,Open or Active,PINE ST \ DIVISADERO ST,26643000.0,Richmond,Pacific Heights,2.0,37.787112,-122.44025,POINT (-122.44024995765258 37.78711245591735),103.0,4.0,6.0,30.0,,,,,,1.0
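
One way to turn these incident rows into a per-neighborhood figure is to count rows by the `Analysis Neighborhood` column. The sketch below uses only the standard library and a five-row sample cut down from the output above to two columns. The helper name `incidents_per_neighborhood` is hypothetical, and raw counts are assumed here as a stand-in for a crime rate (a true rate would also need population or area).

```python
import csv
import io
from collections import Counter

# Five rows reduced from the output above to the two columns used here.
sample_csv = """Incident Category,Analysis Neighborhood
Missing Person,Lakeshore
Stolen Property,Mission
Non-Criminal,Financial District/South Beach
Lost Property,
Miscellaneous Investigation,Pacific Heights
"""


def incidents_per_neighborhood(rows):
    """Count incidents per neighborhood, skipping rows with no neighborhood."""
    counts = Counter()
    for row in rows:
        hood = (row.get("Analysis Neighborhood") or "").strip()
        if hood:  # online reports often have no location attached
            counts[hood] += 1
    return counts


counts = incidents_per_neighborhood(csv.DictReader(io.StringIO(sample_csv)))
for hood, n in counts.most_common():
    print(f"{hood}: {n}")
```

On the full dataframe, `df_incidents['Analysis Neighborhood'].value_counts()` should give a comparable tally, since `value_counts` drops missing values by default. Whether to exclude categories such as `Non-Criminal` before counting is a design choice left open here.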


<hr>

Copyright &copy; 2020. This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).