# Week 5 Quiz

This notebook contains the SQL Quiz for Week 5. Section 1 uses the New York City data we know and love. Section 2 uses building footprint data from the Google Open Buildings dataset.

INSTRUCTIONS:

Run this notebook in Google Colab. The answer to each question will be a number or a string. Input these into the corresponding question on Moodle. You have 90 minutes to attempt the quiz, so if you get stuck on a question, move on.

Make sure you run all of the code cells in order, especially the ones that already have code in them! If you run into serious problems, open the "Runtime" menu above and select "Restart session and run all".

In [None]:
%pip install duckdb duckdb-engine jupysql leafmap

In [None]:
import duckdb
import leafmap
%load_ext sql
%config SqlMagic.autopandas = True
%config SqlMagic.feedback = False
%config SqlMagic.displaycon = False

In [None]:
%%sql
duckdb:///:memory:
INSTALL spatial;
LOAD spatial;


# Section 1



The code below downloads the data on New York City that we've been working with so far.

In [None]:
url = 'https://s3.amazonaws.com/s3.cleverelephant.ca/postgis-workshop-2020.zip'
leafmap.download_file(url, unzip=True)

The following line of code creates a table called `nyc_neighborhoods` from a shapefile called `nyc_neighborhoods.shp`, located in the `postgis-workshop/data/` folder.

In [None]:
%%sql

CREATE TABLE nyc_neighborhoods AS SELECT * FROM "postgis-workshop/data/nyc_neighborhoods.shp";

The `%%sql` at the top of the cell allows you to run SQL code in the rest of the cell:

In [None]:
%%sql

SELECT * FROM nyc_neighborhoods;

## Question 1

Create the following tables using the corresponding shapefiles.
- nyc_census_blocks
- nyc_homicides
- nyc_streets
- nyc_subway_stations

In [None]:
%%sql


## Question 2

How many rows are there in the nyc_homicides table?



In [None]:
%%sql


## Question 3

How many homicides were there in Brooklyn in 2008?

In [None]:
%%sql


## Question 4

Which neighborhood of New York had the most homicides in 2010?

In [None]:
%%sql



## Question 5

Calculate the per-capita homicide rate for each borough of New York.

In [None]:
%%sql


# Section 2

This section uses building footprint data generated by Google. Each row is a polygon that specifies the outline of a building detected using AI and satellite imagery.

![](https://sites.research.google/open-buildings/static/img/buildings-header-light-cropped.png)

We're going to be working with all of the building footprints identified in Kigali, the capital city of Rwanda.

The code below creates a table called `kigali_buildings` by reading a Parquet file hosted at the URL in the final line.

In [None]:
%%sql

CREATE TABLE kigali_buildings AS
SELECT full_plus_code AS building_id, ST_GeomFromWKB(geometry) AS geom
FROM read_parquet('https://data.source.coop/cholmes/google-open-buildings/v2/geoparquet-admin1/country=RWA/City_of_Kigali.parquet');

Our table has two columns: `building_id`, which is a unique identifier for each building footprint, and `geom`, which contains the geometry information.

In [None]:
%%sql
SELECT * FROM kigali_buildings;

## Question 6
Complete the code below to create a column called `area`. Calculate the area of each building from the `geom` column, then multiply the result by 12356260000 to get the area in square meters.
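
To see what the multiplication does, here is a minimal Python sketch of the unit conversion only. It assumes the geometries are stored in degrees of longitude and latitude, so that a raw area comes out in square degrees. The 0.0001-degree square is a hypothetical shape chosen for illustration and is not taken from the data.

```python
# Factor given in this question for converting square degrees to square meters.
SQ_DEG_TO_SQ_M = 12356260000

# Hypothetical footprint: a square with sides of 0.0001 degrees.
side_deg = 0.0001
area_sq_deg = side_deg * side_deg

# Apply the conversion factor to get an approximate area in square meters.
area_sq_m = area_sq_deg * SQ_DEG_TO_SQ_M
print(f"{area_sq_m:.2f} square meters")
```

A square of that size comes out at a little over 120 square meters, which is a plausible size for a single building.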

In [None]:
%%sql

ALTER TABLE kigali_buildings ADD COLUMN area DOUBLE;
UPDATE kigali_buildings SET area = ...

## Question 7

How many buildings in Kigali are larger than 100 square meters?

In [None]:
%%sql


## Question 8

How many square meters is the largest building in Kigali?

In [None]:
%%sql


## Question 9

How many buildings are within a distance of 0.001 (in the units of the `geom` coordinates) of the largest building in Kigali? (Be careful!)
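
For a rough sense of scale, the sketch below converts a distance of 0.001 into meters. It assumes the coordinates are in degrees and reuses the area factor from Question 6, whose square root gives the implied length of one degree. This is an approximation for intuition only and is not needed to answer the question.

```python
import math

# Area conversion factor from Question 6 (square degrees to square meters).
SQ_DEG_TO_SQ_M = 12356260000

# The square root of an area factor is the matching length factor.
meters_per_degree = math.sqrt(SQ_DEG_TO_SQ_M)

# Convert the distance threshold used in this question.
distance_deg = 0.001
distance_m = distance_deg * meters_per_degree
print(f"0.001 degrees is roughly {distance_m:.0f} meters")
```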

In [None]:
%%sql


## Question 10

A common characteristic of informal settlements (sometimes referred to as "slums") is the presence of a large number of buildings in close proximity to each other. In this question, we are going to create a measure of building density.

First, let's create a new table called `sample` which contains the first 1000 rows of data in our dataset:

In [None]:
%%sql

CREATE TABLE sample AS SELECT * FROM kigali_buildings LIMIT 1000;

Spatially join the `sample` table to itself to identify buildings within a distance of 0.001 of each other. (Hint: `COUNT` and `GROUP BY` will be useful.)

What is the `building_id` of the building with the greatest number of buildings within a distance of 0.001?
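
As a reminder of how counting and grouping combine, here is a small sketch on made-up, non-spatial data. It uses Python's built-in `sqlite3` module rather than DuckDB so that it runs anywhere. The table `toy` and its `label` column are hypothetical and unrelated to the quiz tables.

```python
import sqlite3

# Hypothetical toy data: one row per observation, labelled by group.
rows = [("a",), ("b",), ("a",), ("c",), ("a",), ("b",)]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE toy (label TEXT)")
con.executemany("INSERT INTO toy VALUES (?)", rows)

# COUNT(*) with GROUP BY returns one row per label with its row count.
# ORDER BY n DESC puts the most frequent label first; label breaks ties.
counts = con.execute(
    "SELECT label, COUNT(*) AS n FROM toy "
    "GROUP BY label ORDER BY n DESC, label"
).fetchall()
print(counts)
```

The same pattern applies whenever you need one count per group, whatever the rows being counted represent.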

In [None]:
%%sql


