# Step 1 - Explore the road and district data your coworker sent you

Your coworker downloaded two publicly available datasets from the city's website as shapefiles:
- [Road network](https://data.stadt-zuerich.ch/dataset/geo_fuss__und_velowegnetz) with additional information for bikes (called _velo_ in Switzerland) and pedestrians, saved as _20220405_veloFusswegnetzZurich_ in the data folder.
- [Statistical zones dataset](https://data.stadt-zuerich.ch/dataset/geo_statistische_zonen), which divides the city into 216 districts for statistical purposes, saved as _20220405_statistischeQuartiereZurich_ in the data folder.

In this section we will use the GDAL command line utility [ogrinfo](https://gdal.org/programs/ogrinfo.html) to explore the datasets your coworker sent you and make sure they match our requirements. All of the following commands are run on the command line. This JupyterLab setup provides you with a Linux Bash shell in which the necessary commands are already configured.

**To prepare for all the following steps, open a terminal and, if necessary, navigate to the folder of this story (the same folder as this Jupyter notebook).** You can use the commands `pwd` (shows where you currently are), `ls` (lists folder and file names) and `cd` (changes directory). Good to know: when using `cd`, you can start typing a name and press the Tab key for autocompletion.

![open terminal](./story_images/open_terminal.gif)
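
A quick way to confirm that you are in the right place is to check that the `data` folder is visible from the current directory. The following is a minimal sketch. It only assumes that the story folder contains the `data` folder mentioned above, and the variable name `location_ok` is purely illustrative:

```shell
# Show the current working directory
pwd

# The story folder should contain a "data" subfolder with the two datasets
if [ -d "./data" ]; then
    location_ok="yes"
    ls ./data
else
    location_ok="no"
    echo "No ./data folder here: use cd to move into the story folder first."
fi
```

If the message about the missing folder appears, use `cd` to change into the story folder and run the check again.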

***
## Explore the road network dataset

Let's now use `ogrinfo` to explore the road network shapefile, opening it in read-only mode (`-ro`) and printing only summary information (`-so`). Run the following command in the terminal:

```shell
ogrinfo -ro -so "./data/20220405_veloFusswegnetzZurich/taz_mm.tbl_routennetz.shp"
```

The output lists all layers in the data source. Not surprisingly, the shapefile contains only a single layer:
```
INFO: Open of `./data/20220405_veloFusswegnetzZurich/taz_mm.tbl_routennetz.shp'
     using driver `ESRI Shapefile' successful.
1: taz_mm.tbl_routennetz (Line String)
```
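
For shapefiles, the layer name is the file name without the `.shp` extension, so it can also be derived in the shell rather than copied from the output. A small sketch, where the variable names `SHP` and `LAYER` are illustrative choices:

```shell
# Path to the road network shapefile used above
SHP="./data/20220405_veloFusswegnetzZurich/taz_mm.tbl_routennetz.shp"

# Strip the directory and the .shp suffix to obtain the layer name
LAYER="$(basename "$SHP" .shp)"
echo "$LAYER"
```

This prints `taz_mm.tbl_routennetz`, which matches the layer listed by `ogrinfo`.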

***

When a layer is specified, ogrinfo reports details about that specific layer. Pass the layer name as a second argument to get information about this data layer:
```shell
ogrinfo -ro -so "./data/20220405_veloFusswegnetzZurich/taz_mm.tbl_routennetz.shp" "taz_mm.tbl_routennetz"
```

The terminal should show the following output:
```
INFO: Open of `./data/20220405_veloFusswegnetzZurich/taz_mm.tbl_routennetz.shp'
      using driver `ESRI Shapefile' successful.

Layer name: taz_mm.tbl_routennetz
Metadata:
  DBF_DATE_LAST_UPDATE=2022-04-05
Geometry: Line String
Feature Count: 40065
Extent: (2676247.120400, 1241239.066500) - (2689662.340100, 1254306.994900)
Layer SRS WKT:
PROJCRS["CH1903+ / LV95",
    BASEGEOGCRS["CH1903+",
        DATUM["CH1903+",
            ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4150]],
    CONVERSION["Swiss Oblique Mercator 1995",
        METHOD["Hotine Oblique Mercator (variant B)",
            ID["EPSG",9815]],
        PARAMETER["Latitude of projection centre",46.9524055555556,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8811]],
        PARAMETER["Longitude of projection centre",7.43958333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8812]],
        PARAMETER["Azimuth of initial line",90,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8813]],
        PARAMETER["Angle from Rectified to Skew Grid",90,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8814]],
        PARAMETER["Scale factor on initial line",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8815]],
        PARAMETER["Easting at projection centre",2600000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8816]],
        PARAMETER["Northing at projection centre",1200000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8817]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping (large and medium scale)."],
        AREA["Liechtenstein; Switzerland."],
        BBOX[45.82,5.96,47.81,10.49]],
    ID["EPSG",2056]]
Data axis to CRS axis mapping: 1,2
id1: Real (20.0)
velo: Integer (6.0)
velostreif: String (5.0)
veloweg: Integer (6.0)
einbahn: String (5.0)
fuss: Integer (6.0)
name: String (150.0)
map_velo: Integer (6.0)
map_fuss: Integer (6.0)
se_anno_ca: String (254.0)
objectid: Real (38.0)
```

**How cool is that?** With this simple command we get a summary of the number of features (around 40k), the coordinate reference system (CH1903+ / LV95) and the attribute data (the columns of the attribute table).
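
Because the summary is long, it can help to filter it down to the headline facts. The sketch below pipes the `ogrinfo` output through `grep`; the helper name `headline` is hypothetical, and the line prefixes it matches are taken from the output shown above:

```shell
# Keep only the headline facts from an ogrinfo summary read on standard input
headline() {
    grep -E '^(Geometry|Feature Count|Extent):'
}

SHP="./data/20220405_veloFusswegnetzZurich/taz_mm.tbl_routennetz.shp"

# Only run ogrinfo if it is installed and the shapefile is present
if command -v ogrinfo >/dev/null 2>&1 && [ -f "$SHP" ]; then
    ogrinfo -ro -so "$SHP" "taz_mm.tbl_routennetz" | headline
fi
```

The same filter can be reused for any other layer, since it only looks at the labels at the start of each line.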

**Your turn:**
- The dataset also comes with a metadata document (metadaten.pdf), which contains valuable additional information on how to interpret the attributes. Using this document, which attribute do you think is suitable for our bike indicator, that is, for distinguishing which roads can be used by bikes (velos)?


***
## Explore the district data
Let's now explore the district data with the same two-step approach. First use `ogrinfo` to find the name of the data layer, then pass that layer name to `ogrinfo` to get information about it.

**Your turn:**
- What is the geometry type of the features?
- How many features are there?
- What is the coordinate reference system?
- What columns does the attribute table have?
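
If you are unsure which file to pass to `ogrinfo`, you can first list the shapefiles in the district folder. This is a minimal sketch; the helper name `list_shapefiles` is hypothetical, and the folder name is the one given in the dataset list at the top of this step:

```shell
# Folder that holds the district dataset
DISTRICT_DIR="./data/20220405_statistischeQuartiereZurich"

# List every .shp file below a folder, one path per line
list_shapefiles() {
    find "$1" -type f -name '*.shp'
}

# Only search if the folder is reachable from the current directory
if [ -d "$DISTRICT_DIR" ]; then
    list_shapefiles "$DISTRICT_DIR"
fi
```

Each path that is printed can then be used in place of the road network path in the two `ogrinfo` commands from the previous section.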

***
## Conclusion
While exploring, you saw that both datasets use the new Swiss coordinate reference system (CH1903+ / LV95), which is suitable for our use case at the city level. You also found that the attribute `velo` of the road network data seems to be a good indicator of whether a road is suitable for bikes (1) or not (0). The data looks good and you feel ready to load it into PostGIS.