# Making Requests to the NYC 311 API

A few of my students are doing a project using data from https://www.buzzfeednews.com/article/lamvo/gentrification-complaints-311-new-york. Over the course of this project, they needed to make requests to the NYC 311 database: https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9.

This is a quick tutorial for them. Let us start with something relatively simple.

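Queries are made from URLs of the form `endpoint` + `?` + `query`. For example, the URL for finding incidents created on April 1st, 2019 (strictly speaking, this simple filter matches incidents whose `created_date` is exactly the timestamp given) would be: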

In [1]:
# April Fools Incidents Request
endpoint <- 'https://data.cityofnewyork.us/resource/fhrw-4uyv.csv'
query <- 'created_date=2019-04-01T00:00:00.000'
url <- paste(endpoint, "?", query, sep="")
print(url)

[1] "https://data.cityofnewyork.us/resource/fhrw-4uyv.csv?created_date=2019-04-01T00:00:00.000"
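Since we will be gluing endpoints and queries together a lot, here is a small sketch of a helper that does it for us. The name `build_url` is made up for this tutorial (it is not part of R or of the Socrata API), and it assumes every query piece is already in `name=value` form.

```r
# Hypothetical helper (not part of the original notebook): glue an endpoint
# and one or more query pieces into a single request URL.
build_url <- function(endpoint, ...) {
    # Multiple query pieces are separated by "&", then attached after "?"
    query <- paste(..., sep = "&")
    paste0(endpoint, "?", query)
}

endpoint <- 'https://data.cityofnewyork.us/resource/fhrw-4uyv.csv'
url <- build_url(endpoint, 'created_date=2019-04-01T00:00:00.000')
print(url)

# Two filters at once: same timestamp, Brooklyn only
print(build_url(endpoint, 'created_date=2019-04-01T00:00:00.000', 'borough=BROOKLYN'))
```

The first `print` gives the same URL as the `paste` call above; the second just shows how extra filters get chained with `&`.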


We can use this to request the incidents from that date simply by passing this URL to `read.csv`:

In [2]:
#April Fools Incidents dataframe
april.fools.2019 <- read.csv(url)
head(april.fools.2019, 2)

address_type,agency,agency_name,bbl,borough,bridge_highway_direction,bridge_highway_name,bridge_highway_segment,city,closed_date,⋯,resolution_description,road_ramp,status,street_name,taxi_company_borough,taxi_pick_up_location,unique_key,vehicle_type,x_coordinate_state_plane,y_coordinate_state_plane
ADDRESS,DOHMH,Department of Health and Mental Hygiene,3042340024,BROOKLYN,,,,BROOKLYN,,⋯,The Department of Health and Mental Hygiene will review your complaint to determine appropriate action. Complaints of this type usually result in an inspection. Please call 311 in 30 days from the date of your complaint for status,,Assigned,CRESCENT STREET,,,42112599,,1020319,185300
ADDRESS,DOHMH,Department of Health and Mental Hygiene,4098230009,QUEENS,,,,Jamaica,2019-04-05T13:46:23.000,⋯,The Department of Health and Mental Hygiene will review your complaint to determine appropriate action. Complaints of this type usually result in an inspection. Please call 311 in 30 days from the date of your complaint for status,,Closed,88 AVENUE,,,42111461,,1041795,198036


Now let's get a little fancier. Say we instead wanted to get all the incidents created *on or after* April 1st. We would have to modify our syntax a little bit. This syntax is specific to Socrata (you can see a tutorial here: https://dev.socrata.com/docs/queries/), but it is designed to resemble SQL, so if you are familiar with that it helps a lot. If you find it confusing, please ask me. I do NOT want to accidentally DDoS the server.



In [3]:
# Post April Fools Request
endpoint <- 'https://data.cityofnewyork.us/resource/fhrw-4uyv.csv'
query <- "$where=created_date>='2019-04-01T00:00:00.000'"
url2 <- paste(endpoint, "?", query, sep="")
print(url2)
april.fools.plus <- read.csv(url2)

[1] "https://data.cityofnewyork.us/resource/fhrw-4uyv.csv?$where=created_date>='2019-04-01T00:00:00.000'"
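The same `$where` clause can bound the date on both sides, which is one way to ask for a single day. The following is a sketch rather than something I ran against the server: the names `start`, `end`, `where`, and `url3` are mine, and I am assuming the clause can be joined with `AND` the way it would be in SQL. Because the clause now contains spaces, it is passed through `URLencode` before it goes into the URL.

```r
# Sketch: build a $where clause that bounds created_date on both sides.
endpoint <- 'https://data.cityofnewyork.us/resource/fhrw-4uyv.csv'
start <- '2019-04-01T00:00:00.000'
end   <- '2019-04-02T00:00:00.000'

# On or after the start of April 1st, and strictly before the start of April 2nd
where <- paste0("created_date>='", start, "' AND created_date<'", end, "'")

# The clause contains spaces and angle brackets, so percent-encode it
url3 <- paste0(endpoint, "?$where=", URLencode(where))
print(url3)

# Uncomment to actually make the request:
# april.first.only <- read.csv(url3)
```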


In [4]:
"Number of Rows:" 
nrow(april.fools.plus)


You will notice a few things... 

1. The floating timestamp had to be wrapped in single quotes `''`.

2. It only output 1000 rows. This is because the API returns at most 1000 rows per request by default, so there have been more than 1000 incidents since April 1st, and we are only seeing a portion.
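One way around the 1000-row cap is to ask for the results a page at a time using Socrata's `$limit` and `$offset` parameters, with an `$order` so the pages come back in a stable order. Below is a rough sketch. The function `read_all_pages` and its arguments are made up for this tutorial, it assumes `base_url` already contains a `?` and a query, and it stops after `max_pages` pages so that a mistake cannot hammer the server.

```r
# Sketch: page through results with $limit and $offset.
# `fetch` is whatever reads one page (read.csv by default); it is a
# parameter only so the function can be tried without touching the network.
read_all_pages <- function(base_url, page_size = 1000, max_pages = 10,
                           pause = 1, fetch = read.csv) {
    pages <- list()
    for (i in seq_len(max_pages)) {
        offset <- (i - 1) * page_size
        # format(..., scientific = FALSE) avoids offsets like "1e+05" in the URL
        page_url <- paste0(base_url,
                           "&$order=:id",
                           "&$limit=", format(page_size, scientific = FALSE),
                           "&$offset=", format(offset, scientific = FALSE))
        page <- fetch(page_url)
        pages[[i]] <- page
        # A short page means we have reached the end of the results
        if (nrow(page) < page_size) break
        Sys.sleep(pause)  # be polite to the server between requests
    }
    # Stack all the pages into one data frame
    do.call(rbind, pages)
}

# Uncomment to actually make the requests (up to 10 pages of 1000 rows):
# april.fools.all <- read_all_pages(url2)
```

If the result has exactly `page_size * max_pages` rows, there is probably more data left, so raise `max_pages` carefully or narrow the query.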

## Getting the incidents from the neighborhood in the article

The following is a list of addresses which I took from the BuzzFeed article's GitHub page. These are the street addresses of the neighborhood which was studied in the article.

In [5]:
addresses<-c("600 WEST 136 STREET",
        "601 WEST 136 STREET",
        "602 WEST 136 STREET",
        "603 WEST 136 STREET",
        "604 WEST 136 STREET",
        "605 WEST 136 STREET",
        "606 WEST 136 STREET",
        "607 WEST 136 STREET",
        "608 WEST 136 STREET",
        "609 WEST 136 STREET",
        "610 WEST 136 STREET",
        "611 WEST 136 STREET",
        "612 WEST 136 STREET",
        "613 WEST 136 STREET",
        "614 WEST 136 STREET",
        "615 WEST 136 STREET",
        "616 WEST 136 STREET",
        "617 WEST 136 STREET",
        "618 WEST 136 STREET",
        "619 WEST 136 STREET",
        "620 WEST 136 STREET",
        "621 WEST 136 STREET",
        "622 WEST 136 STREET",
        "623 WEST 136 STREET",
        "624 WEST 136 STREET",
        "625 WEST 136 STREET",
        "626 WEST 136 STREET",
        "627 WEST 136 STREET",
        "628 WEST 136 STREET",
        "629 WEST 136 STREET",
        "630 WEST 136 STREET",
        "631 WEST 136 STREET",
        "632 WEST 136 STREET",
        "633 WEST 136 STREET",
        "634 WEST 136 STREET",
        "635 WEST 136 STREET",
        "636 WEST 136 STREET",
        '615 WEST 136 STREET',
        '618 WEST 136 STREET',
        '622 WEST 136 STREET',
        '619 WEST 136 STREET',
        '623 WEST 136 STREET',
        '611 WEST 136 STREET',
        '636 WEST 136 STREET',
        '621 WEST 136 STREET',
        '612 WEST 136 STREET',
        '607 WEST 136 STREET',
        '616 WEST 136 STREET',
        '624 WEST 136 STREET',
        '625 WEST 136 STREET',
        '626 WEST 136 STREET',
        '614 WEST 136 STREET',
        '600 WEST 136 STREET',
        '613 WEST 136 STREET',
        '620 WEST 136 STREET',
        '630 WEST 136 STREET',
        '623 WEST  136 STREET',
        '636 WEST  136 STREET',
        '622 WEST  136 STREET',
        '627 WEST 136 STREET',
        '634 WEST 136 STREET') 
addresses <- unique(addresses)

If we want to create a request for a single address, then it would look like this:

In [6]:
endpoint <- "https://data.cityofnewyork.us/resource/fhrw-4uyv.csv"
query <- 'incident_address=634%20WEST%20136%20STREET'
request <- paste(endpoint, "?", query, sep="")
data<-read.csv(request)

In [7]:
data

address_type,agency,agency_name,bbl,borough,bridge_highway_direction,bridge_highway_name,bridge_highway_segment,city,closed_date,⋯,resolution_description,road_ramp,status,street_name,taxi_company_borough,taxi_pick_up_location,unique_key,vehicle_type,x_coordinate_state_plane,y_coordinate_state_plane
ADDRESS,NYPD,New York City Police Department,1020020097,MANHATTAN,,,,NEW YORK,2016-02-08T05:12:48.000,⋯,The Police Department responded to the complaint and with the information available observed no evidence of the violation at that time.,,Closed,WEST 136 STREET,,,32630281,,996567,238446


You'll notice that for our `query` we needed to replace all the spaces with `%20`. This is because URLs cannot contain literal spaces, so they are percent-encoded instead. (On a related note, if you like podcasts, this is the cause of a bug in some car stereos when they go to play "99% Invisible": https://www.gimletmedia.com/shows/reply-all/brh8jm/140-the-roman-mars-mazda-virus)
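If you would rather not do the replacement by hand, R's built-in `URLencode` (from the `utils` package, which is loaded by default) does this encoding for us. A quick check on the address from the request above:

```r
# URLencode() percent-encodes characters that are not allowed in a URL
address <- "634 WEST 136 STREET"
encoded <- URLencode(address)
print(encoded)
# [1] "634%20WEST%20136%20STREET"

# For an address that only needs its spaces fixed, this matches a manual gsub()
identical(encoded, gsub(" ", "%20", address))
# [1] TRUE
```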

So when we put together our for-loop, we are going to replace all the spaces, make a request for each address, then glue the resulting dataframe onto our old dataframe.

In [17]:
data <- data.frame()
endpoint <- "https://data.cityofnewyork.us/resource/fhrw-4uyv.csv"
query <- 'incident_address='
overflow.addresses <- c()

for(address in addresses){
    # fix roman mars virus....
    mars_friendly_address <- gsub(" ", "%20", address)
    
    # make the request
    request <- paste(endpoint, "?", query, mars_friendly_address, sep="")
    temp_df <- read.csv(request)
    
    # exactly 1000 rows means we hit the row limit, so set this address aside
    if(nrow(temp_df)==1000){
        print(paste("Too many incidents at ", address))
        overflow.addresses <- c(overflow.addresses, mars_friendly_address)
    } else {
        # otherwise we add it to the previous dataframe
        data <- rbind(data, temp_df)
    }
}

[1] "Too many incidents at  615 WEST 136 STREET"
[1] "Too many incidents at  622 WEST 136 STREET"


Oh no! There are a lot of 311 complaints at those two addresses, so we have to do something different with them. Hopefully we can combine the date part of our earlier call with the address part of the call, and request each address one year at a time.

In [47]:
years <- 2010:2019

# Request each overflow address one year at a time to keep every response smaller
for(address in overflow.addresses){
    for(year in years){
        url <- paste(endpoint, '?', "$where=date_extract_y(created_date)=", year, '&', 'incident_address=', address, sep="")
        test.data <- read.csv(url)
        data <- rbind(data, test.data)
    }
}

In [50]:
nrow(data)
ncol(data)
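Because `data` was stitched together from many separate requests, it is worth checking that no complaint shows up twice. Here is a small sketch on a toy dataframe (the keys and statuses are made up); I am assuming the `unique_key` column identifies each complaint, which its name suggests. The same two lines should work with `data` in place of `toy`.

```r
# Toy dataframe standing in for `data`; these keys are invented
toy <- data.frame(unique_key = c(101, 102, 102, 103),
                  status = c("Closed", "Open", "Open", "Closed"))

# How many rows repeat a key that already appeared earlier?
sum(duplicated(toy$unique_key))

# Keep only the first row for each key
deduped <- toy[!duplicated(toy$unique_key), ]
nrow(deduped)
```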

And there we have it, the data from the neighborhood that the article was written about.