# Introduction

The data used in this project was obtained from two sources: 

- [NYC Open Data's 311 Service Requests from 2010 to Present](https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9)
  - This dataset contains information about the time, location, complaint type, and status of more than 24 million 311 service requests made in New York City within the past decade. This project uses a subset of the data from 2020 that was accessed with the [Socrata Open Data (SODA) API](https://dev.socrata.com/consumers/getting-started.html). 
- [NYC Department of City Planning’s Community District Profiles](https://communityprofiles.planning.nyc.gov/)
  - The Indicators Data can be downloaded under "Download the Data" on any profile page of the Community District Profiles website. This dataset contains development and population information for each community district in New York City. Community board names, which correspond to community districts, also appear in the 311 dataset.
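
Because community board names are what link the two datasets, a join key has to be derived from them at some point. The sketch below is one hedged way to do that, not the method used in this project. It assumes the 311 field holds values shaped like `"01 MANHATTAN"` and that the indicators file identifies districts by a three-digit code (borough digit followed by a two-digit district number). Both assumptions should be checked against the actual files; `BOROUGH_CODES` and `board_to_district_code` are hypothetical names introduced for illustration.

```python
from typing import Optional

# Standard NYC borough numbering (assumed to match the indicators file).
BOROUGH_CODES = {
    "MANHATTAN": 1,
    "BRONX": 2,
    "BROOKLYN": 3,
    "QUEENS": 4,
    "STATEN ISLAND": 5,
}


def board_to_district_code(community_board: str) -> Optional[int]:
    """Turn a value such as '01 MANHATTAN' into a code such as 101.

    Returns None when the value has no usable district number or borough,
    for example 'Unspecified BROOKLYN'.
    """
    parts = community_board.strip().split(maxsplit=1)
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    number = int(parts[0])
    borough_code = BOROUGH_CODES.get(parts[1].upper())
    if borough_code is None or number == 0:
        return None
    return borough_code * 100 + number
```

Values that cannot be parsed map to `None`, so they can be dropped or inspected separately before joining.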



# Loading Dependencies

In [None]:
pip install sodapy

Collecting sodapy
  Downloading https://files.pythonhosted.org/packages/9e/74/95fb7d45bbe7f1de43caac45d7dd4807ef1e15881564a00eef489a3bb5c6/sodapy-2.1.0-py2.py3-none-any.whl
Installing collected packages: sodapy
Successfully installed sodapy-2.1.0


In [None]:
from sodapy import Socrata
import pandas as pd
from google.colab import drive





# Obtaining and Exporting the 311 Data

The [API documentation](https://dev.socrata.com/foundry/data.cityofnewyork.us/erm2-nwe9) for this dataset explains how to obtain filtered versions of the data. Below, we simply request 1,450,000 service requests without any filter. Because no `order` is specified, the API returns them in its default order. The line ``client.timeout = 1000`` raises the client's timeout from its default of 10 seconds to 1000 seconds so that this large request does not time out.

In [None]:
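# None means no app token is supplied; sodapy warns that such requests are throttled more strictly
client = Socrata("data.cityofnewyork.us", None)

# Raise the timeout (in seconds) so the large request can complete
client.timeout = 1000
results = client.get("erm2-nwe9", limit=1450000)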
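
The single large request above could also be written as a filtered, paged query. The following is a minimal sketch under stated assumptions, not the code this project ran. It assumes `created_date` is the column holding the request timestamp and `unique_key` is a per-request identifier; `fetch_in_pages` and `QUERY_2020` are hypothetical names. It relies only on the `where`, `order`, `limit`, and `offset` keyword arguments of `client.get`.

```python
from typing import Any, Dict, List


def fetch_in_pages(client: Any, dataset_id: str, page_size: int,
                   max_rows: int, **soql: Any) -> List[Dict[str, Any]]:
    """Collect up to max_rows records by calling client.get one page at a time."""
    rows: List[Dict[str, Any]] = []
    while len(rows) < max_rows:
        limit = min(page_size, max_rows - len(rows))
        page = client.get(dataset_id, limit=limit, offset=len(rows), **soql)
        if not page:
            break  # no more data on the server
        rows.extend(page)
        if len(page) < limit:
            break  # a short page means this was the last one
    return rows


# Assumed filter: requests created during 2020, newest first.
# The secondary sort on unique_key keeps the order stable across pages.
QUERY_2020 = {
    "where": "created_date between '2020-01-01T00:00:00' and '2020-12-31T23:59:59'",
    "order": "created_date DESC, unique_key",
}

# Usage against the live API (not executed here):
# from sodapy import Socrata
# client = Socrata("data.cityofnewyork.us", None)
# client.timeout = 1000
# results = fetch_in_pages(client, "erm2-nwe9", page_size=50000,
#                          max_rows=1450000, **QUERY_2020)
```

Paging keeps each response small, which makes a timeout less likely than with one very large request, and the explicit `order` makes the meaning of "most recent" unambiguous.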

The results are stored in a pandas DataFrame and exported as a CSV file to Google Drive:

In [None]:
df = pd.DataFrame.from_records(results)
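
`client.get` returns a list of dictionaries, and timestamps in such JSON results arrive as text, so date columns usually need an explicit conversion before any time-based filtering. The sketch below illustrates the idea on two made-up records that stand in for real API output. The field names `unique_key`, `created_date`, and `complaint_type` are assumptions about the dataset's columns, and `sample` and `df_sample` are hypothetical names.

```python
import pandas as pd

# Two illustrative records shaped like API results (not real data).
sample = [
    {"unique_key": "1", "created_date": "2020-11-01T08:15:00.000",
     "complaint_type": "Noise - Residential"},
    {"unique_key": "2", "created_date": "2020-11-02T21:40:00.000",
     "complaint_type": "Illegal Parking"},
]

# Each dictionary becomes a row; the keys become column names.
df_sample = pd.DataFrame.from_records(sample)

# Convert the text timestamps to datetimes; unparseable values become NaT.
df_sample["created_date"] = pd.to_datetime(df_sample["created_date"], errors="coerce")
```

Applying the same conversion to `df` before exporting is optional, since the CSV file stores the values as text either way.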

In [None]:
drive.mount("/content/gdrive")

Mounted at /content/gdrive


In [None]:
df.to_csv('/content/gdrive/My Drive/311.csv', header=True)

# Google Drive Links

The datasets can also be accessed via public Google Drive documents at the links below.

- [311 Service Requests](https://drive.google.com/file/d/1-2N5LdtbgESvjH4fONjY7ee24FLOb0Np/view?usp=sharing)

- [Community District Indicators](https://drive.google.com/file/d/1CBg0A-Y1IocnqQTtBQ_35kMZNwa5wxR4/view?usp=sharing)