# Introduction to PostGIS

**Setting up the conda env:**

```
conda create -n sql python
conda activate sql
conda install ipython-sql sqlalchemy psycopg2 notebook pandas -c conda-forge
```

**Sample dataset:**
- [nyc_data.zip](https://github.com/giswqs/postgis/raw/master/data/nyc_data.zip) (watch this [video](https://youtu.be/fROzLrjNDrs) to see how to load the data into PostGIS)

**References:**
- [Introduction to PostGIS](https://postgis.net/workshops/postgis-intro)
- [Using SQL with Geodatabases](https://desktop.arcgis.com/en/arcmap/latest/manage-data/using-sql-with-gdbs/sql-and-enterprise-geodatabases.htm)

## Connecting to the database

In [1]:
%load_ext sql

In [2]:
import os

In [3]:
host = "localhost"
database = "nyc"
user = os.getenv('SQL_USER')
password = os.getenv('SQL_PASSWORD')

In [4]:
connection_string = f"postgresql://{user}:{password}@{host}/{database}"

In [5]:
%sql $connection_string

'Connected: postgres@nyc'
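
The connection string above interpolates the credentials directly, so an unset environment variable becomes the literal text `None`, and a password containing characters such as `@` or `/` can break the URL. The following is a minimal sketch of a more defensive way to build the same string. The helper name `build_connection_string` and the sample password are hypothetical and used only for illustration.

```python
from urllib.parse import quote


def build_connection_string(user, password, host="localhost", database="nyc"):
    """Build a PostgreSQL URL, percent-encoding the credentials.

    Hypothetical helper for illustration; not part of ipython-sql or SQLAlchemy.
    """
    # os.getenv returns None when a variable is unset, so fail early instead
    # of silently producing "postgresql://None:None@...".
    if not user or not password:
        raise ValueError("SQL_USER and SQL_PASSWORD must both be set")
    # safe="" makes quote() encode every reserved character, including "/".
    return f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}/{database}"


# Made-up password containing characters that are reserved in URLs.
example = build_connection_string("postgres", "p@ss/word")
print(example)
```

In the notebook, the helper would be called with `os.getenv('SQL_USER')` and `os.getenv('SQL_PASSWORD')` in place of the sample values.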

In [6]:
%%sql 

SELECT * FROM nyc_neighborhoods WHERE FALSE

 * postgresql://postgres:***@localhost/nyc
0 rows affected.


id,geom,boroname,name


In [7]:
%%sql 

SELECT id, boroname, name FROM nyc_neighborhoods LIMIT 10

 * postgresql://postgres:***@localhost/nyc
10 rows affected.


id,boroname,name
1,Brooklyn,Bensonhurst
2,Manhattan,East Village
3,Manhattan,West Village
4,The Bronx,Throggs Neck
5,The Bronx,Wakefield-Williamsbridge
6,Queens,Auburndale
7,Manhattan,Battery Park
8,Manhattan,Carnegie Hill
9,Staten Island,Mariners Harbor
10,Staten Island,Rossville


## Simple SQL

In [8]:
%%sql

SELECT postgis_full_version()

 * postgresql://postgres:***@localhost/nyc
1 rows affected.


postgis_full_version
"POSTGIS=""3.4.1 3.4.1"" [EXTENSION] PGSQL=""160"" GEOS=""3.12.1-CAPI-1.18.1"" PROJ=""8.2.1 NETWORK_ENABLED=OFF URL_ENDPOINT=https://cdn.proj.org USER_WRITABLE_DIRECTORY=C:\WINDOWS\ServiceProfiles\NetworkService\AppData\Local/proj DATABASE_PATH=C:\Program Files\PostgreSQL\16\share\contrib\postgis-3.4\proj\proj.db"" LIBXML=""2.9.14"" LIBJSON=""0.12"" LIBPROTOBUF=""1.2.1"" WAGYU=""0.5.0 (Internal)"""


### NYC Neighborhoods

![](https://i.imgur.com/eycL547.png)

What are the names of all the neighborhoods in New York City?

In [9]:
%%sql

SELECT name FROM nyc_neighborhoods

 * postgresql://postgres:***@localhost/nyc
129 rows affected.


name
Bensonhurst
East Village
West Village
Throggs Neck
Wakefield-Williamsbridge
Auburndale
Battery Park
Carnegie Hill
Mariners Harbor
Rossville


What are the names of all the neighborhoods in Brooklyn?

In [10]:
%%sql

SELECT name
FROM nyc_neighborhoods
WHERE boroname = 'Brooklyn'

 * postgresql://postgres:***@localhost/nyc
23 rows affected.


name
Bensonhurst
Bay Ridge
Boerum Hill
Cobble Hill
Downtown
Sunset Park
Borough Park
East Brooklyn
Flatbush
Park Slope


How many characters (including spaces and punctuation) are in the name of each neighborhood in Brooklyn?

In [11]:
%%sql

SELECT char_length(name)
FROM nyc_neighborhoods
WHERE boroname = 'Brooklyn'

 * postgresql://postgres:***@localhost/nyc
23 rows affected.


char_length
11
9
11
11
8
11
12
13
8
10
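
`char_length` counts every character in the string, so spaces are included. As a rough cross-check, the sketch below applies Python's `len` to the ten Brooklyn names displayed earlier. For plain ASCII names like these, the counts should match the ten values shown above.

```python
# The ten Brooklyn neighborhood names displayed in the earlier query output.
names = [
    "Bensonhurst", "Bay Ridge", "Boerum Hill", "Cobble Hill", "Downtown",
    "Sunset Park", "Borough Park", "East Brooklyn", "Flatbush", "Park Slope",
]

# len() counts characters, including the space in names such as "Bay Ridge".
lengths = [len(name) for name in names]
print(lengths)
```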


What are the average and standard deviation of the name lengths (in characters) for the neighborhoods in Brooklyn?

In [12]:
%%sql

SELECT avg(char_length(name)), stddev(char_length(name))
FROM nyc_neighborhoods
WHERE boroname = 'Brooklyn'

 * postgresql://postgres:***@localhost/nyc
1 rows affected.


avg,stddev
11.739130434782608,3.91056135594074
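
In PostgreSQL, `stddev` is an alias for `stddev_samp`, the sample standard deviation (dividing by n - 1). The sketch below shows the same two statistics in Python for the ten name lengths displayed earlier. Because it uses only those ten rows rather than all 23, its results differ from the query output above.

```python
import statistics

# Character counts for the ten Brooklyn names displayed earlier (not all 23 rows).
lengths = [11, 9, 11, 11, 8, 11, 12, 13, 8, 10]

mean_length = statistics.mean(lengths)
# statistics.stdev divides by n - 1, matching PostgreSQL's stddev / stddev_samp.
sample_sd = statistics.stdev(lengths)
# statistics.pstdev divides by n, matching PostgreSQL's stddev_pop.
population_sd = statistics.pstdev(lengths)

print(mean_length, sample_sd, population_sd)
```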


What are the average and standard deviation of the name lengths (in characters) for all the neighborhoods in New York City, reported by borough?

In [13]:
%%sql

SELECT boroname, avg(char_length(name)), stddev(char_length(name))
FROM nyc_neighborhoods
GROUP BY boroname

 * postgresql://postgres:***@localhost/nyc
5 rows affected.


boroname,avg,stddev
Queens,11.666666666666666,5.005743827281598
Brooklyn,11.739130434782608,3.91056135594074
Staten Island,12.291666666666666,5.204339048095947
The Bronx,12.041666666666666,3.6651017740975154
Manhattan,11.821428571428571,4.312372994832526
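
`GROUP BY` collects rows that share a `boroname` and applies each aggregate once per group. The sketch below mimics that in plain Python, using only the ten rows displayed by the earlier `LIMIT 10` query, so the averages differ from the full-table results above.

```python
from collections import defaultdict

# (boroname, name) pairs from the ten rows shown by the LIMIT 10 query.
rows = [
    ("Brooklyn", "Bensonhurst"),
    ("Manhattan", "East Village"),
    ("Manhattan", "West Village"),
    ("The Bronx", "Throggs Neck"),
    ("The Bronx", "Wakefield-Williamsbridge"),
    ("Queens", "Auburndale"),
    ("Manhattan", "Battery Park"),
    ("Manhattan", "Carnegie Hill"),
    ("Staten Island", "Mariners Harbor"),
    ("Staten Island", "Rossville"),
]

# Step 1: bucket the name lengths by borough (the GROUP BY step).
groups = defaultdict(list)
for boroname, name in rows:
    groups[boroname].append(len(name))

# Step 2: compute one aggregate value per bucket (the avg() step).
averages = {boro: sum(vals) / len(vals) for boro, vals in groups.items()}
print(averages)
```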


### NYC Census Blocks

![](https://i.imgur.com/tHyMJMm.png)

In [14]:
%%sql

SELECT * FROM nyc_census_blocks WHERE FALSE

 * postgresql://postgres:***@localhost/nyc
0 rows affected.


id,geom,blkid,popn_total,popn_white,popn_black,popn_nativ,popn_asian,popn_other,boroname


What is the population of the City of New York?

In [15]:
%%sql 

SELECT SUM(popn_total) AS population
FROM nyc_census_blocks

 * postgresql://postgres:***@localhost/nyc
1 rows affected.


population
8175032


What is the population of the Bronx?

In [16]:
%%sql 

SELECT SUM(popn_total) AS population
FROM nyc_census_blocks
WHERE boroname = 'The Bronx'

 * postgresql://postgres:***@localhost/nyc
1 rows affected.


population
1385108


For each borough, what percentage of the population is white?

In [17]:
%%sql

SELECT
  boroname,
  100 * SUM(popn_white) / SUM(popn_total) AS white_pct
FROM nyc_census_blocks
GROUP BY boroname

 * postgresql://postgres:***@localhost/nyc
5 rows affected.


boroname,white_pct
Queens,39.72207739459101
Brooklyn,42.80117379326865
The Bronx,27.903744689944755
Manhattan,57.44930394804628
Staten Island,72.89420348601539
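
The fractional results above suggest that the population columns are stored as a floating-point type. If they were integers, PostgreSQL's `/` would perform integer division and drop the fractional part. The sketch below illustrates the difference with the same `100 * part / whole` pattern, using the Bronx and citywide totals from the earlier queries. Python's `//` stands in for integer division here, which is equivalent for positive values.

```python
# Totals taken from the earlier query outputs.
bronx_population = 1385108
city_population = 8175032

# Floating-point division keeps the fractional part of the percentage.
exact_pct = 100 * bronx_population / city_population

# Integer division discards it, as SQL does when both operands are integers.
truncated_pct = 100 * bronx_population // city_population

print(exact_pct, truncated_pct)
```

When the columns are integers, a common workaround is to multiply by `100.0` instead of `100` so that the division is carried out on non-integer values.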
