Data Sources

crvenkapa edited this page Mar 25, 2017 · 49 revisions

Here is a list of the data sets we'll be using for the hack day. The bullet points under each data source list the associated SQL table names you can use for querying:

  • SPARCS 2013 and 2014 [Source, Documentation]

    • `smart_cities_data.sparcs_2013`
    • `smart_cities_data.sparcs_2014`
  • Zillow median rent reports by neighborhood and zip code [Source, Documentation]

    • `smart_cities_data.median_rents_neighborhood_1br`
    • `smart_cities_data.median_rents_neighborhood_2br`
    • `smart_cities_data.median_rents_neighborhood_studio`
    • `smart_cities_data.median_rents_zipcode_1br`
    • `smart_cities_data.median_rents_zipcode_2br`
    • `smart_cities_data.median_rents_zipcode_studio`
  • MIT street safety score data [Source, Documentation]

    • `smart_cities_data.mit_streetscore`
  • Active Liquor and Tobacco license data [Sources: Liquor, Tobacco, Documentation]

    • `smart_cities_data.liquor_licenses`
    • `smart_cities_data.tobacco_vendors`
  • 311 Complaint Data [Source, Documentation]

    • `bigquery-public-data.new_york.311_service_requests`
  • Citibike Data [Source, Documentation]

    • `bigquery-public-data.new_york.citibike_stations`
    • `bigquery-public-data.new_york.citibike_trips`
  • TLC Trip Data [Source, Documentation]

    • `bigquery-public-data.new_york.tlc_fhv_trips_2015`
    • `bigquery-public-data.new_york.tlc_fhv_trips_2016`
    • `bigquery-public-data.new_york.tlc_green_trips_2013`
    • `bigquery-public-data.new_york.tlc_green_trips_2014`
    • `bigquery-public-data.new_york.tlc_green_trips_2015`
    • `bigquery-public-data.new_york.tlc_green_trips_2016`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2009`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2010`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2011`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2012`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2013`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2014`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2015`
    • `bigquery-public-data.new_york.tlc_yellow_trips_2016`
  • Tree Census [Sources: 1995, 2005, 2015 and Tree Species, Documentation]

    • `bigquery-public-data.new_york.tree_census_1995`
    • `bigquery-public-data.new_york.tree_census_2005`
    • `bigquery-public-data.new_york.tree_census_2015`
    • `bigquery-public-data.new_york.tree_species`
  • Highway length by zip code [Source, Documentation]

    • `smart_cities_data.RoadLenghtByZIP`
  • Census data (2000 and 2010) [Source, Documentation]

    • `smart_cities_data.census00NYC`
    • `smart_cities_data.census10NYC`
    • `smart_cities_data.IncomeCensus00NYC`
    • `smart_cities_data.IncomeCensus10NYC`
    • `smart_cities_data.census00NYC_metadata`
    • `smart_cities_data.census10NYC_metadata`
    • `smart_cities_data.IncomeCensus00NYC_metadata`
    • `smart_cities_data.IncomeCensus10NYC_metadata`
  • NYPD Crime Statistics [Source, Documentation]

    • `smart_cities_data.NYPD_crime_stats_2000_2015`
  • Park Inspections [Source, Documentation]

    • `smart_cities_data.park_inspections`
  • MTA stations [Source, Documentation]

    • `smart_cities_data.NYC_MTA_stations`
  • MTA Turnstiles data will be used for the cleaning track and is not available via SQL. It can be downloaded directly from the source.

  • Uber data is available through this FiveThirtyEight repo.

  • Shapefiles - some shapefiles and GeoJSON files are collected in this repository. Here is a quick tutorial on how to use them:
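
As a minimal, library-free sketch of what using such a file can look like, the Python below parses a GeoJSON FeatureCollection and finds which polygon contains a given longitude/latitude point. The inline sample data, the made-up zip codes, the `zipcode` property name, and the helper names `point_in_ring` and `find_zipcode` are all assumptions for illustration; the actual files in the repository may use different property names and geometry types.

```python
import json

# Tiny inline stand-in for a GeoJSON file (assumption: the real files are
# FeatureCollections of polygons; the "zipcode" property name is hypothetical).
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"zipcode": "00001"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]]}},
    {"type": "Feature",
     "properties": {"zipcode": "00002"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[2, 0], [4, 0], [4, 2], [2, 2], [2, 0]]]}}
  ]
}
"""


def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside a closed ring of [lon, lat] pairs?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the point's latitude can be crossed by a
        # horizontal ray cast to the right of the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def find_zipcode(lon, lat, collection):
    """Return the 'zipcode' of the first Polygon feature containing the point, else None."""
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # MultiPolygon and other types are skipped in this sketch
        outer, *holes = geometry["coordinates"]
        in_hole = any(point_in_ring(lon, lat, hole) for hole in holes)
        if point_in_ring(lon, lat, outer) and not in_hole:
            return feature["properties"]["zipcode"]
    return None


collection = json.loads(SAMPLE_GEOJSON)
print(find_zipcode(1.0, 1.0, collection))  # 00001
print(find_zipcode(3.0, 0.5, collection))  # 00002
print(find_zipcode(9.0, 9.0, collection))  # None
```

GeoJSON stores coordinates as longitude first, then latitude, which is why the helpers take `lon` before `lat`. Points that fall exactly on a polygon boundary are not handled specially here, and for real analysis a dedicated geospatial library is likely a better fit than this hand-rolled test.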