# Geographic Analysis with OpenStreetMap Data

**Test 02** <br/>
Authors: Niko Kolaxidis & Tobias Romes <br/>
Topic/Key to look at: **"Ellesmere Port"**

## Completeness

### *Query the ohsome API to derive the number of buildings and overall building area in OSM for each 1km x 1km grid cell. Consider in your queries buildings with the tags yes, house, residential and garage. Save your results into your geojson file.*

In [4]:
import sys
import fiona
import warnings
import geopandas as gpd
from ohsome import OhsomeClient

warnings.filterwarnings("ignore")


# helper function so that both requests (count and area) are processed the same way
def join_and_write(response, topic):
    """Join a response from the ohsome API to the GeoJSON grid file."""
    df = response.as_dataframe()

    # assign primary key to join correctly:
    # the boundary id becomes an integer index that matches the row index of the grid
    df.reset_index(inplace=True)
    df["idx"] = df["boundary"].astype(int)
    df.set_index("idx", inplace=True)

    # read infile again and join the response to it
    # (lsuffix is applied to overlapping column names of the grid file)
    join_df = gpd.read_file(infile).join(df, lsuffix="_2")

    # rename key "value" to prevent duplicates due to repeated request
    join_df[f"{topic}_osm"] = join_df.pop("value")
    
    # calculate differences and ratio 
    join_df[f"{topic}_difference"] = join_df[f"{topic}_osm"] - join_df[f"{topic}"]
    join_df[f"{topic}_ratio"] = join_df[f"{topic}_osm"] / join_df[f"{topic}"]

    # export your results as a geojson file
    with fiona.Env(OSR_WKT_FORMAT="WKT2_2018"):
        join_df.to_file(infile, driver="GeoJSON")

    return join_df


client = OhsomeClient()
infile = r".\data\t2_buildings_sum_count.geojson"
bpolys = gpd.read_file(infile)

# Define which OSM features should be considered.
filter_buildings = "building in (yes, house, residential, garage)"

try:
    response_count = client.elements.count.groupByBoundary.post(
        bpolys=bpolys, 
        filter=filter_buildings
    )
    response_area_sum = client.elements.area.groupByBoundary.post(
        bpolys=bpolys, 
        filter=filter_buildings
    )
    
except Exception as err:
    print(f"Could not send request to ohsome API: {err}")
    sys.exit()

try:
    join_and_write(response_count, "count")
    # read geojson file again and join to new dataset
    join_df = join_and_write(response_area_sum, "area_sum")
    
except Exception as err:
    print(f"Could not complete operations: {err}")
    sys.exit()


2022-12-21 18:47:23,500  collection  ERROR:  .\data\t2_buildings_sum_count.geojson: No such file or directory
2022-12-21 18:47:23,534  collection  ERROR:  .\data\t2_buildings_sum_count.geojson: No such file or directory
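
The ratio computed in `join_and_write` divides by the reference value, so grid cells without any reference buildings need care (in pandas such a division yields `inf` or `NaN` instead of raising an error). The following is a minimal sketch of one possible guard in plain Python. The helper name `safe_ratio` and the sample numbers are hypothetical and not part of the original notebook.

```python
from typing import Optional


def safe_ratio(osm_value: float, reference_value: float) -> Optional[float]:
    """Return osm_value / reference_value, or None if the reference is zero."""
    if reference_value == 0:
        return None
    return osm_value / reference_value


# Hypothetical grid cells as (OSM count, reference count) pairs.
cells = [(20, 10), (5, 0), (0, 8)]
ratios = [safe_ratio(osm, ref) for osm, ref in cells]
print(ratios)
```

Returning `None` keeps cells without reference data distinguishable from cells where OSM is simply empty, which matters when the ratios are later used for map symbology.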


### *Report on the overall (sum) building count and building area for your entire area of interest. How complete are OSM buildings for your area of interest in regard to count based completeness and area based completeness measures?*

In [2]:
# convert the summed areas from m² to km²
area_osdh_km2 = join_df["area_sum"].sum() / 1_000_000
area_osm_km2 = join_df["area_sum_osm"].sum() / 1_000_000
print(f"Total building area: {area_osdh_km2:.2f} km² in OSDH | {area_osm_km2:.2f} km² in OSM")
print(f"Total buildings: {join_df['count'].sum():.0f} in OSDH | {join_df['count_osm'].sum():.0f} in OSM")

Total building area: 3.04 km² in OSDH | 2.17 km² in OSM
Total buildings: 14213 in OSDH | 28704 in OSM


The totals show that the two completeness measures point in opposite directions. By area, OpenStreetMap (OSM) reaches about 71% of the building area in the Ordnance Survey Data Hub (OSDH) reference dataset (2.17 km² of 3.04 km²), which suggests that OSM is incomplete. By count, OSM contains roughly twice as many buildings as the reference (28704 versus 14213, about 202%). The higher count suggests that OSM maps buildings in more detail, but confirming this would require additional data and an evaluation of how each dataset is pre- and postprocessed.
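
Both completeness measures follow directly from the reported totals as the ratio of the OSM value to the reference value. A minimal sketch of that arithmetic, using the rounded totals from the output above (the variable names are chosen for illustration only):

```python
# Totals as printed in the cell output above.
area_reference_km2 = 3.04  # OSDH
area_osm_km2 = 2.17        # OSM
count_reference = 14213    # OSDH
count_osm = 28704          # OSM

# Completeness = OSM value relative to the reference value.
area_completeness = area_osm_km2 / area_reference_km2
count_completeness = count_osm / count_reference

print(f"Area-based completeness:  {area_completeness:.1%}")
print(f"Count-based completeness: {count_completeness:.1%}")
```

Because the area totals are rounded to two decimals, the area-based value is only approximate.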

### *How are the measures different for your area of interest? What could be a reason for this difference?*

According to the technical specification by Ordnance Survey Limited, the OSDH dataset is generalized by "reducing the scale and complexity of map detail while maintaining the important elements and characteristics of the geometry" (Ordnance Survey Limited 2017, p. 12). For buildings, the specification states that a single building entity can in reality consist of multiple structures (Ordnance Survey Limited 2017, p. 13). This is probably why the OSM dataset has a higher building count: OSM is more detailed, and many buildings that form one entity in the reference dataset are likely split into several entities in OSM.

*Source: Ordnance Survey Limited (2017): OS OpenMap - Local Technical Specification contents [v.1.1.1], https://www.ordnancesurvey.co.uk/documents/os-open-map-local-product-guide.pdf [21.02.2022].*
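
The effect of generalization on the two measures can be illustrated with a toy example. The sketch below assumes a hypothetical row of five terraced houses that the reference dataset stores as one building and OSM stores as five. The numbers are invented for illustration and do not come from the Ellesmere Port data.

```python
# Hypothetical row of five terraced houses, 100 m² each.
# Reference dataset: generalized into a single building polygon.
reference_areas_m2 = [500.0]
# OSM: each house is mapped as its own building.
osm_areas_m2 = [100.0] * 5

count_ratio = len(osm_areas_m2) / len(reference_areas_m2)
area_ratio = sum(osm_areas_m2) / sum(reference_areas_m2)

print(f"count ratio: {count_ratio:.1f}, area ratio: {area_ratio:.1f}")
```

In this constructed case the count ratio rises to 5 while the area ratio stays at 1, which is why a count ratio above 1 does not by itself indicate that OSM contains more built-up area than the reference.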

### *Visualize your results for the 1km x 1km grid on two maps. The symbology should be based on (A) the ratio between OSM / reference building area and (B) the ratio between OSM / reference building count.*

![](./images/map_count.png)
![](./images/map_area.png)

## Currentness

### *Derive the currentness of OSM data for all shops in your area of interest. Use a snapshot based approach for this analysis. Download all OSM elements which use the key shop and are represented as a point. Make sure that this download includes the timestamp of each OSM element's last edit.*

Query in Overpass Turbo:

```
[out:json][timeout:25];
// fetch area “Ellesmere Port” to search in
{{geocodeArea:Ellesmere Port}}->.searchArea;
// gather results
(
  node["shop"](area.searchArea);
);
// print results
out meta; // use meta instead of body to include timestamps
>;
out skel qt;
```

Result: [Overpass Turbo Query](https://overpass-turbo.eu/s/1pka) <br/>
Also available as a file: [GeoJSON export from Overpass Turbo](./data/shops_EP.geojson)
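
With `out meta`, each element in the raw Overpass JSON response carries a `timestamp` field with the time of its last edit. The sketch below shows one way to read these timestamps with the Python standard library. It assumes the raw Overpass JSON layout (an `elements` list), not the GeoJSON export linked above, whose structure may differ. The sample response and the function name `last_edit_times` are hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical excerpt of an Overpass JSON response produced with `out meta`.
sample_response = """
{
  "elements": [
    {"type": "node", "id": 1, "lat": 53.28, "lon": -2.90,
     "timestamp": "2017-06-30T09:15:00Z", "tags": {"shop": "bakery"}},
    {"type": "node", "id": 2, "lat": 53.27, "lon": -2.89,
     "timestamp": "2021-02-11T17:42:10Z", "tags": {"shop": "supermarket"}}
  ]
}
"""


def last_edit_times(response_text: str) -> dict:
    """Map each node id to the datetime (UTC) of its last edit."""
    data = json.loads(response_text)
    result = {}
    for element in data["elements"]:
        # Skip anything that is not a node or lacks meta information.
        if element["type"] != "node" or "timestamp" not in element:
            continue
        # Overpass timestamps look like 2021-02-11T17:42:10Z.
        parsed = datetime.strptime(element["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        result[element["id"]] = parsed.replace(tzinfo=timezone.utc)
    return result


edits = last_edit_times(sample_response)
for node_id, edited in edits.items():
    print(node_id, edited.isoformat())
```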

### *Categorize OSM elements into three groups based on their last edit timestamp. Use 2018-01-01 and 2020-01-01 as the thresholds. How many elements have been edited for each category?*

| Category | Count |
|:--------:|:-----:|
| Pre <br/> (before 2018-01-01) | 10 |
| Mid <br/> (between 2018-01-01 and 2020-01-01) | 3 |
| Post <br/> (after 2020-01-01) | 12 |
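
The grouping itself is a comparison of each last-edit timestamp against the two thresholds. A minimal sketch in plain Python follows. It is not the code used to produce the table above, and the sample timestamps are hypothetical. How edits falling exactly on a threshold date are treated is an assumption here: they are assigned to the later category.

```python
from collections import Counter
from datetime import datetime, timezone

THRESHOLD_MID = datetime(2018, 1, 1, tzinfo=timezone.utc)
THRESHOLD_POST = datetime(2020, 1, 1, tzinfo=timezone.utc)


def categorize(timestamp: str) -> str:
    """Assign a UTC timestamp such as 2019-05-03T12:00:00Z to Pre, Mid or Post."""
    edited = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    edited = edited.replace(tzinfo=timezone.utc)
    if edited < THRESHOLD_MID:
        return "Pre"
    if edited < THRESHOLD_POST:
        return "Mid"
    # Assumption: an edit exactly on 2020-01-01 counts as Post.
    return "Post"


# Hypothetical timestamps, not taken from the Ellesmere Port download.
timestamps = [
    "2016-03-14T08:00:00Z",
    "2018-01-01T00:00:00Z",
    "2019-12-31T23:59:59Z",
    "2022-07-05T10:30:00Z",
]
counts = Counter(categorize(ts) for ts in timestamps)
print(dict(counts))
```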

### *Visualize the results on a map.*

![](./images/map_shops.png)