## Testing out parenx skeletonization and Voronoi approaches

Resources:
* https://github.com/nptscot/networkmerge
* https://github.com/anisotropi4/parenx/tree/main

In [8]:
import glob
import os
import re

import geopandas as gpd

In [2]:
# parquet is not a recognized input format, so convert to gpkg first
os.makedirs("../temp-parenx/", exist_ok=True)
# keep only the subfolders of ../data/; loose files such as sample.parquet are skipped
folders = [f for f in os.listdir("../data/") if os.path.isdir(f"../data/{f}")]
for folder in folders:
    os.makedirs(f"../temp-parenx/{folder}/", exist_ok=True)
    roads = gpd.read_parquet(f"../data/{folder}/roads_osm.parquet").reset_index(
        drop=True
    )
    roads.to_file(
        f"../temp-parenx/{folder}/roads_osm.gpkg", layer="roads", engine="pyogrio"
    )

**Now, run the bash script `parenx-run.sh` from the command line**

`bash code/parenx-run.sh`

This adds two files to each subfolder of `temp-parenx`: `voronoi.gpkg` and `skeletonize.gpkg`. They are gitignored for now because the outputs are too large.
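
Before moving on, it can help to confirm that the script produced both files in every subfolder. A minimal sketch using only the standard library; the helper name `missing_parenx_outputs` is hypothetical, and the file names are the two mentioned above:

```python
from pathlib import Path

# File names as described in the note above; adjust if the script changes.
EXPECTED_OUTPUTS = ("skeletonize.gpkg", "voronoi.gpkg")


def missing_parenx_outputs(root):
    """Map each subfolder of `root` to the expected output files it lacks."""
    missing = {}
    for subfolder in sorted(Path(root).iterdir()):
        if not subfolder.is_dir():
            continue  # ignore loose files next to the subfolders
        absent = [
            name for name in EXPECTED_OUTPUTS if not (subfolder / name).exists()
        ]
        if absent:
            missing[subfolder.name] = absent
    return missing
```

Calling `missing_parenx_outputs("../temp-parenx/")` should return an empty dict once every subfolder has both outputs.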

**Reduce output file size by removing duplicated data**, and copy to the corresponding `data/{fua_id}/parenx/` folders (in parquet format)

In [53]:
for subfolder in glob.glob("../temp-parenx/*"):
    # the digits in the subfolder name are the FUA id
    fua = int(re.findall(r"\d+", os.path.basename(subfolder))[0])

    os.makedirs(f"../data/{fua}/parenx/", exist_ok=True)

    # read only the "line" layer of each output and store it as parquet
    # (engine="fiona" requires fiona to be installed)
    ske = gpd.read_file(
        os.path.join(subfolder, "skeletonize.gpkg"), engine="fiona", layer="line"
    )
    ske.to_parquet(f"../data/{fua}/parenx/skeletonize.parquet")

    vor = gpd.read_file(
        os.path.join(subfolder, "voronoi.gpkg"), engine="fiona", layer="line"
    )
    vor.to_parquet(f"../data/{fua}/parenx/voronoi.parquet")
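
To see how much the conversion actually saves, the sizes on disk can be compared. A small sketch, assuming both files exist; `size_ratio` is a hypothetical helper and not part of parenx or geopandas:

```python
import os


def size_ratio(original_path, converted_path):
    """Return converted size / original size (below 1 means a smaller file)."""
    original = os.path.getsize(original_path)
    if original == 0:
        raise ValueError(f"{original_path} is empty")
    return os.path.getsize(converted_path) / original
```

For example, `size_ratio("../temp-parenx/<fua>/voronoi.gpkg", "../data/<fua>/parenx/voronoi.parquet")` (with `<fua>` replaced by a real folder name) gives the fraction of the original size that remains.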

### Initial observations & thoughts:
* Computation time: skeletonization took around 10 min for all 5 use cases; Voronoi took between 1 h and 14 h (the latter for Salt Lake City, maybe because it has the largest area, or maybe because my laptop went to sleep...)
* It works well in some places (especially at intersections, even the more complicated ones)
* Major issue 1: network topology is sometimes not preserved (linestrings that don't connect are merged)
* Major issue 2: it creates wobbly lines
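
Both major issues can be checked roughly without extra dependencies. The sketch below is an assumption about how one might quantify them, not something parenx provides: `sinuosity` (path length over straight-line endpoint distance, 1.0 for a straight line) serves as a proxy for wobbliness, and `count_components` groups lines by shared endpoints so the count can be compared before and after simplification. Both work on plain coordinate lists (e.g. `list(geom.coords)` from a shapely LineString) and assume endpoints match exactly; real data may need snapping first.

```python
import math


def sinuosity(coords):
    """Path length divided by the straight-line distance between endpoints."""
    length = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    chord = math.dist(coords[0], coords[-1])
    return math.inf if chord == 0 else length / chord


def count_components(lines):
    """Count groups of lines connected through identical endpoints."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for coords in lines:
        start = find(tuple(coords[0]))
        end = find(tuple(coords[-1]))
        parent[start] = end  # union the two endpoint groups
    return len({find(node) for node in parent})
```

A component count that drops between the input roads and the simplified output would hint that unconnected linestrings were merged. Note that only endpoints are treated as nodes, so lines that merely cross mid-segment are not counted as connected.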