diff --git a/assets/images/prow-completeness/missing-prows-heatmap.png b/assets/images/prow-completeness/missing-prows-heatmap.png
new file mode 100644
index 0000000..dcddf86
Binary files /dev/null and b/assets/images/prow-completeness/missing-prows-heatmap.png differ
diff --git a/assets/images/prow-completeness/missing-prows.png b/assets/images/prow-completeness/missing-prows.png
new file mode 100644
index 0000000..f448c1f
Binary files /dev/null and b/assets/images/prow-completeness/missing-prows.png differ
diff --git a/content/blog/prow-completeness.md b/content/blog/prow-completeness.md
index 2faec00..dd56296 100644
--- a/content/blog/prow-completeness.md
+++ b/content/blog/prow-completeness.md
@@ -22,26 +22,26 @@ In order to help me determine which PRoWs were missing I used QGIS to analyse th
 * Use the [points along geometry](https://docs.qgis.org/3.16/en/docs/user_manual/processing_algs/qgis/vectorgeometry.html#points-along-geometry) tool to place points at a regular interval (I used 25m) on both the OSM and PRoW datasets. This is to ensure that straight sections of line still contain vertices, which is important for the next step of analysis.
 * Employ the [hub to hub distance](https://docs.qgis.org/3.16/en/docs/user_manual/processing_algs/qgis/vectoranalysis.html#distance-to-nearest-hub-points) tool to find the nearest node in OSM from each node in the PRoW, which is done by setting the PRoWs as the source, and OSM as the destination. We can't use the [join attributes by nearest](https://docs.qgis.org/3.16/en/docs/user_manual/processing_algs/qgis/vectorgeneral.html#qgisjoinbynearest) tool for this as it works based on line centroids, which may be completely different in the datasets. For example, if a public footpath begins halfway up the length of a residential road and then they end at the same point, their centroids will be quite far apart, despite their significant overlap, and hence will not conflate.
 * Add a virtual layer; I called mine `missing_prow_list`. This will allow you to write arbitrary SQL to create a list of the missing PRoWs' route codes. I ended up using something like the below -- this will select any rights of way which are on average more than 40m away from a `highway=*` in OSM:
-
-```SQL
-SELECT route_code, AVG(hub_to_hub_dist) as avg_hub_dist,
-FROM prow_dataset_distances
-GROUP BY route_code
-HAVING avg_hub_dist > 40;
-```
-
+  ```SQL
+  SELECT route_code, AVG(hub_to_hub_dist) as avg_hub_dist
+  FROM prow_dataset_distances
+  GROUP BY route_code
+  HAVING avg_hub_dist > 40;
+  ```
 * Use another virtual layer to get the actual geometries out of the original PRoW lines, using a join:
-
-```SQL
-SELECT route_code, designation, geometry
-FROM prow_dataset
-INNER JOIN missing_prow_list
-ON missing_prow_list.route_code = prow_dataset.route_code
-```
-
+  ```SQL
+  SELECT prow_dataset.route_code, designation, geometry
+  FROM prow_dataset
+  INNER JOIN missing_prow_list
+  ON missing_prow_list.route_code = prow_dataset.route_code
+  ```
 * Select your second virtual layer in QGIS and export it as a shapefile. This can then be imported into JOSM, or any editor of your choice, then manually merged into OpenStreetMap after being tagged up appropriately; hint, this is by far the hardest step.
 * Job done :)
 
+{{< container-image path="images/prow-completeness/missing-prows.png" method="Resize" options="1200x png Lanczos" margin="10px" alt="Missing rights of way in Dorset" >}}
+
+{{< container-image path="images/prow-completeness/missing-prows-heatmap.png" method="Resize" options="1200x png Lanczos" margin="10px" alt="Heatmap of missing PRoWs" >}}
+
 This method is not perfect as it will miss smaller footpaths (such as those that join two roads only a short distance apart) and hence it is more useful for rural areas. On the other hand, it seems to rarely produce false positives.
 
 Ideally, when OSM has further matured, this will become simpler, as we could just compare the list of PRoW route codes to the `prow_ref=*` tags. Unfortunately it seems a significant number of PRoWs haven't been given this tag yet, meaning this approach is not yet viable due to the number of false positives.
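
Returning to the two virtual-layer queries in the steps above, they can be tried outside QGIS against an in-memory SQLite database, which is a quick way to check the SQL before pasting it into a virtual layer. This is only a rough sketch under stated assumptions: the sample route codes, distances and WKT geometries are made up, the function name `find_missing_prows` is hypothetical, and the threshold is passed in as a parameter rather than hard-coded. The table and column names follow the queries in the post.

```python
import sqlite3

# Hypothetical sample of the hub-distance output: one row per 25 m point,
# holding the distance in metres to the nearest OSM highway point.
SAMPLE_DISTANCES = [
    ("E1/1", 2.0), ("E1/1", 5.0), ("E1/1", 3.0),       # already mapped
    ("E1/2", 120.0), ("E1/2", 95.0), ("E1/2", 60.0),   # not in OSM at all
    ("E1/3", 10.0), ("E1/3", 90.0),                    # only partly mapped
]

# Hypothetical stand-in for the original PRoW line layer (geometry as WKT).
SAMPLE_PROWS = [
    ("E1/1", "Footpath", "LINESTRING(0 0, 50 0)"),
    ("E1/2", "Bridleway", "LINESTRING(0 100, 50 100)"),
    ("E1/3", "Footpath", "LINESTRING(0 200, 25 200)"),
]


def find_missing_prows(distances, prows, threshold_m=40.0):
    """Return (route_code, designation, geometry) for rights of way whose
    points are on average more than threshold_m from the nearest highway."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE prow_dataset_distances "
            "(route_code TEXT, hub_to_hub_dist REAL)"
        )
        conn.execute(
            "CREATE TABLE prow_dataset "
            "(route_code TEXT, designation TEXT, geometry TEXT)"
        )
        conn.executemany(
            "INSERT INTO prow_dataset_distances VALUES (?, ?)", distances
        )
        conn.executemany("INSERT INTO prow_dataset VALUES (?, ?, ?)", prows)

        # The CTE plays the role of the first virtual layer; the outer
        # SELECT is the join that recovers the original geometries.
        query = """
            WITH missing_prow_list AS (
                SELECT route_code, AVG(hub_to_hub_dist) AS avg_hub_dist
                FROM prow_dataset_distances
                GROUP BY route_code
                HAVING AVG(hub_to_hub_dist) > ?
            )
            SELECT prow_dataset.route_code, designation, geometry
            FROM prow_dataset
            INNER JOIN missing_prow_list
                ON missing_prow_list.route_code = prow_dataset.route_code
            ORDER BY prow_dataset.route_code
        """
        return conn.execute(query, (threshold_m,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for route_code, designation, geometry in find_missing_prows(
        SAMPLE_DISTANCES, SAMPLE_PROWS
    ):
        print(route_code, designation, geometry)
```

Raising `threshold_m` trades recall for precision: a partly mapped path such as the made-up `E1/3` drops out of the result once the threshold exceeds its average distance.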