## Setting up the PostGIS database

Log in to PostgreSQL (from the command line or pgAdmin):

```
psql -d postgres -U username
```

Then create a new database for the project and enable the PostGIS extension:

```sql
CREATE DATABASE urban_form_toronto;
\c urban_form_toronto;
CREATE EXTENSION postgis;
```
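
To confirm the extension is active before loading any data, one option is to query the PostGIS version from the shell. This is a minimal sketch rather than part of the original setup: `check_postgis` is a hypothetical helper name, and `username` is the same placeholder used in the login command above.

```sh
# Hypothetical helper: prints the PostGIS build details for the project database.
# PostGIS_Full_Version() is provided by the postgis extension, so the query
# fails if CREATE EXTENSION has not been run in this database.
check_postgis() {
    psql -d urban_form_toronto -U username -c "SELECT PostGIS_Full_Version();"
}
```

Calling `check_postgis` prints a single row describing the installed PostGIS build, or an error if the function does not exist.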

More info on PostGIS: https://postgis.net/

## Load in the spatial data

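Use `shp2pgsql` to load a hex grid (created in QGIS) and boundaries for census Dissemination Areas (DA), census blocks (DB), and Traffic Analysis Zones (TAZ), which we will use for joining data. In the commands below, `-I` builds a spatial index on the geometry column, `-s 32617` sets the SRID (WGS 84 / UTM zone 17N), and `-W "latin1"` declares the character encoding of the shapefile attributes.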

```sh
# an entire hex grid - 1.3 million or so records
shp2pgsql -I -s 32617 -W "latin1" input_data/thex_grid/hex_grid_200m.shp hex_grid_200m | psql -U ja -d urban_form_toronto

# a hex grid for testing - a subset of 260 hexagons near downtown Toronto
shp2pgsql -I -s 32617 -W "latin1" input_data/thex_grid/hex_grid_200m_test_subset.shp hex_grid_200m_subset | psql -U ja -d urban_form_toronto

# TAZ
shp2pgsql -I -s 32617 -W "latin1" input_data/tspatial_boundaries/TAZ/TAZ_utm17n.shp zones_TAZ | psql -U ja -d urban_form_toronto

# DA 2016
shp2pgsql -I -s 32617 -W "latin1" input_data/tspatial_boundaries/CensusDisseminationAreas/2016/DA_GGH_utm17n.shp zones_DA16 | psql -U ja -d urban_form_toronto

# DA 2011
shp2pgsql -I -s 32617 -W "latin1" input_data/tspatial_boundaries/CensusDisseminationAreas/2011/DA_GGH_utm17n_2011.shp zones_DA11 | psql -U ja -d urban_form_toronto

# DB
shp2pgsql -I -s 32617 -W "latin1" input_data/tspatial_boundaries/CensusBlocks/2016/DB_GGH_utm17n.shp zones_DB16 | psql -U ja -d urban_form_toronto
```
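
The commands above differ only in the shapefile path and the target table, so they could be driven from one small helper. The following is a minimal sketch, not part of the original workflow: `load_shp`, `DB_NAME`, `DB_USER`, and `DRY_RUN` are hypothetical names, and the flags simply mirror the commands above. With `DRY_RUN=1` the helper only prints the pipeline it would run.

```sh
# Hypothetical defaults, taken from the commands above.
DB_NAME="${DB_NAME:-urban_form_toronto}"
DB_USER="${DB_USER:-ja}"

# Hypothetical helper: wraps the shp2pgsql | psql pipeline.
# $1 = path to the .shp file, $2 = target table name
load_shp() {
    shp_path="$1"
    table="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the pipeline instead of running it
        echo "shp2pgsql -I -s 32617 -W latin1 $shp_path $table | psql -U $DB_USER -d $DB_NAME"
    else
        shp2pgsql -I -s 32617 -W "latin1" "$shp_path" "$table" \
            | psql -U "$DB_USER" -d "$DB_NAME"
    fi
}
```

For example, setting `DRY_RUN=1` and calling `load_shp input_data/thex_grid/hex_grid_200m_test_subset.shp hex_grid_200m_subset` shows the command for the test subset without touching the database.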

## Load in the tabular data

Block-level (DB) population data for 2016:

```sql
-- create the table
DROP TABLE IF EXISTS table_DB_pop_2016;
CREATE TABLE table_DB_pop_2016
(dbuid character varying, dauid character varying, pop2016 character varying);

-- add in the csv
COPY table_DB_pop_2016 FROM 'input_data/tabular_data/DB_block_2016_population.csv' WITH (FORMAT csv);

-- convert the population column to an integer, treating NULLs and empty strings as 0
ALTER TABLE table_DB_pop_2016 ADD COLUMN pop2016int integer;
UPDATE table_DB_pop_2016 SET pop2016int = CAST(coalesce(nullif(pop2016, ''), '0') AS integer);
ALTER TABLE table_DB_pop_2016 DROP COLUMN pop2016;
```

Business count data for 2016 (linked to 2011 geographies):

https://dataverse.scholarsportal.info/dataset.xhtml?persistentId=doi:10.5683/SP/FLLHOV&version=2.0

```sql
-- create the table
DROP TABLE IF EXISTS table_DA_business_2016_in_11da;
CREATE TABLE table_DA_business_2016_in_11da
(dauid character varying, business2016 integer);

-- add in the csv
COPY table_DA_business_2016_in_11da FROM 'input_data/tabular_data/DA_2016_business_store_subset.csv' DELIMITER ',' CSV HEADER;
```


Employment data for 2016. Data are suppressed for several DAs, namely any DA with employment below 40:

http://odesi2.scholarsportal.info/documentation/CENSUS/2016/cen16labour.html

```sql
-- create the table
DROP TABLE IF EXISTS table_DA_emp_2016;
CREATE TABLE table_DA_emp_2016
(dauid character varying, emp2016 integer);

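-- add in the csv
-- NOTE: this is the same file path as the business counts above; check that it points to the employment CSV
COPY table_DA_emp_2016 FROM 'input_data/tabular_data/DA_2016_business_store_subset.csv' DELIMITER ',' CSV HEADER;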

```
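
The `COPY ... FROM 'file'` statements above are executed by the database server, so the path is resolved on the server side and the server process must be able to read the file. If the CSVs live on the client machine, or a relative path like the ones above is not found, psql's `\copy` meta-command reads the file on the client and streams it to the server instead. A minimal sketch, assuming the tables have already been created as above; `copy_csv`, `DB_USER`, and `DB_NAME` are hypothetical names, with `ja` taken from the `shp2pgsql` commands earlier:

```sh
# Hypothetical helper: client-side CSV load through psql's \copy.
# $1 = target table
# $2 = CSV path (relative to the directory psql is started from)
# $3 = COPY options, defaulting to a CSV without a header row
copy_csv() {
    table="$1"
    csv_path="$2"
    options="${3:-FORMAT csv}"
    psql -U "${DB_USER:-ja}" -d "${DB_NAME:-urban_form_toronto}" \
        -c "\\copy $table FROM '$csv_path' WITH ($options)"
}
```

For instance, `copy_csv table_DB_pop_2016 input_data/tabular_data/DB_block_2016_population.csv` mirrors the first load, and passing `"FORMAT csv, HEADER"` as the third argument matches the files that have a header row.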




## Backing up the database

Back up the database and pipe the dump to a compressed file. Run this every so often :)

```sh
pg_dump urban_form_toronto | gzip > db_urban_form_toronto_backup.gz
```
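
Because the command above always writes to the same file, each run overwrites the previous backup. One possible variation, sketched here with a hypothetical `backup_db` helper and a date-stamped file name (neither is part of the original workflow), keeps one file per day:

```sh
# Hypothetical helper: dumps a database to a gzip file named after today's date.
# $1 = database name (defaults to the project database)
backup_db() {
    db="${1:-urban_form_toronto}"
    backup_file="db_${db}_backup_$(date +%Y%m%d).gz"
    pg_dump "$db" | gzip > "$backup_file"
    # Print the file name so callers can log or copy it
    echo "$backup_file"
}
```

Running `backup_db` with no arguments backs up `urban_form_toronto` into the current directory and prints the name of the file it wrote.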

Restore from the backup with the following, if need be:

```sh
gunzip -c db_urban_form_toronto_backup.gz | psql urban_form_toronto
```
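
A plain `pg_dump` of a single database, as used above, does not include a `CREATE DATABASE` statement, so when restoring onto a fresh server the target database has to exist first. A minimal sketch with a hypothetical `restore_db` helper, using PostgreSQL's `createdb` command-line tool:

```sh
# Hypothetical helper: creates an empty database, then replays a gzipped dump into it.
# $1 = path to the .gz backup, $2 = database name (defaults to the project database)
restore_db() {
    backup="$1"
    db="${2:-urban_form_toronto}"
    # Only replay the dump if the database was created successfully
    createdb "$db" && gunzip -c "$backup" | psql "$db"
}
```

If the database already exists, `createdb` fails and nothing is restored, which avoids replaying the dump on top of existing tables.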