# 02. Generate loop censuses, Denmark-wide
## Project: Bicycle node network loop analysis

This notebook generates a loop census for each node of the preprocessed and snapped Denmark-wide network from notebook 01, and calculates and plots basic descriptive statistics for the whole of Denmark.  
Set `study_area` to `denmark` in `config.yml`.

Contact: Michael Szell (michael.szell@gmail.com)

Created: 2024-01-24  
Last modified: 2025-08-07

## To do

- [ ] Double-check loop/link lengths, for example the 3-loop east of Faxe
- [ ] Double-check edge_ids during simplifications
- [ ] Semilogy scale for loop lengths

## Parameters

In [None]:
%run -i setup_parameters.py
debug = True  # Set to True for extra plots and verbosity

## Functions

In [None]:
%run -i functions.py

## Load data

In [None]:
Gnx = nx.empty_graph()
for subarea in STUDY_AREA_COMBINED[STUDY_AREA]:
    with lzma.open(PATH[subarea]["data_out"] + "network_preprocessed.xz", "rb") as f:
        G_new = pickle.load(f)
        Gnx = nx.disjoint_union(Gnx, G_new.to_networkx())
if debug:
    print(f"N: {Gnx.number_of_nodes()}, L: {Gnx.number_of_edges()}")
    for k, v in list(Gnx.nodes(data=True))[:10]:
        print(k, v)
    for u, v in Gnx.edges(list(range(3))):
        print(u, v)

In [None]:
# Placeholder lists that are still saved with the loop census because the save code was copied from older code. To do: fix and remove.
nodes_id = list()
nodes_coords = list()

## Loop generation

### Get face loops

The minimum cycle basis is generally not the cycle basis of face loops, see: https://en.wikipedia.org/wiki/Cycle_basis#In_planar_graphs  
Therefore, we can't use https://python.igraph.org/en/latest/api/igraph.GraphBase.html#minimum_cycle_basis here. Instead, we solve the problem geometrically via shapely.
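
As a rough sanity check on the number of faces, one can use Euler's formula: a planar graph drawn without link crossings, with V nodes, E links, and C connected components, has E - V + C bounded faces, which is also the size of any cycle basis. The sketch below illustrates this on a toy graph and is not part of the pipeline; `expected_face_count` is a hypothetical helper name, and the check only holds if links cross nowhere except at nodes.

```python
def expected_face_count(num_nodes, edges):
    """Bounded faces of a planar embedding: E - V + C (Euler's formula)."""
    parent = list(range(num_nodes))  # union-find forest to count components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_nodes
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # the link merges two components
            parent[ru] = rv
            components -= 1
    return len(edges) - num_nodes + components


# Toy graph: a square 0-1-2-3 with one diagonal 0-2, plus a dangling link 2-4
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (2, 4)]
print(expected_face_count(5, toy_edges))
```

For the real network, the same number could be obtained as `Gnx.number_of_edges() - Gnx.number_of_nodes() + nx.number_connected_components(Gnx)`. A mismatch with the polygonized face count may indicate links that cross without a shared node.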

#### Polygonize

In [None]:
edgegeoms = list(nx.get_edge_attributes(Gnx, "geometry").values())
facepolygons, _, _, _ = shapely.polygonize_full(edgegeoms)
if debug:
    print(edgegeoms[:10])
    p = gpd.GeoSeries(facepolygons)
    p.plot()
    plt.axis("off")

#### Intersect polygons with graph to get face loops

In [None]:
# Code from: https://github.com/anastassiavybornova/bike-node-planner/blob/main/scripts/script06.py
ns, es = momepy.nx_to_gdf(net=Gnx, points=True, lines=True)

# our geopandas.GeoSeries of linestrings representing the network
linestrings = es.geometry.copy()
collection = shapely.GeometryCollection(linestrings.array)  # combine to a single object
noded = shapely.node(collection)  # add missing nodes
# polygonize based on an array of noded parts
polygonized = shapely.polygonize(noded.geoms)
polygons = gpd.GeoSeries(polygonized.geoms)  # create a GeoSeries from parts

# create a GeoDataFrame of face loops
faceloops = gpd.GeoDataFrame(geometry=polygons, crs=es.crs)
if debug:
    print(faceloops.head(5))

Code to fix below

In [None]:
faceloopnodes = []
for fl in faceloops.itertuples():  # loop through each face loop fl
    # intersect all nodes with the fl's geometry
    nsinters = ns.intersection(fl.geometry)
    nsinters = nsinters[~nsinters.is_empty]  # remove empties
    faceloopnodes.append(len(nsinters))  # save number of nodes

In [None]:
plt.hist(faceloopnodes, bins=list(range(11)))
# Something is wrong: there are 320 faces without nodes, 949 with just 1 node,
# and 1123 with just 2 nodes. But each face needs to have at least 3 nodes!

Code to fix above

~~Getting all simple loops has not yet been implemented in igraph, see:  
* https://github.com/igraph/igraph/issues/379  
* https://github.com/igraph/igraph/issues/1398  
Some potential progress here, but only for C, not Python:
* https://github.com/igraph/igraph/pull/2181

But they can be XORed through the loop base.~~

Update 2025-06-19: It *has* been implemented now! 🎉  
https://github.com/igraph/python-igraph/releases/tag/0.11.9  
https://python.igraph.org/en/0.11.9/api/igraph.GraphBase.html#simple_cycles  

networkX also implements this: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cycles.simple_cycles.html#networkx.algorithms.cycles.simple_cycles

For now, we go ahead with networkX rather than igraph.

### Get all loops via igraph

to do

### Get all loops via nx

In [None]:
# Get all loops and store each one under every node it passes through,
# meaning a loop ABCA is counted also as BCAB and CABC
allloops = {}
for nid in list(Gnx.nodes(data=False)):  # Initialize allloops
    allloops[nid] = {
        "loops": [],
        "lengths": [],
        "numnodes": [],
        "max_slopes": [],
        "water_profile": [],
        "poi_diversity": [],
    }

numloops = 0
# length_bound refers to the number of nodes in a loop
allloops_generator = nx.simple_cycles(Gnx, length_bound=LOOP_NUMNODE_BOUND)
for c in tqdm(allloops_generator, desc="Generate all loops"):
    c_length = get_loop_length(c)
    # LOOP_LENGTH_BOUND is 0 for no limit, or a number (meters)
    if not LOOP_LENGTH_BOUND or c_length * MPERUNIT <= LOOP_LENGTH_BOUND:
        c_max_slope = get_loop_max_slope(c)
        c_water = get_loop_water_profile(c)
        c_poi_diversity = get_loop_poi_diversity(c)
        for sourcenode in c:
            numloops += 1
            allloops[sourcenode]["loops"].append(c)
            allloops[sourcenode]["lengths"].append(c_length)
            allloops[sourcenode]["numnodes"].append(len(c))
            allloops[sourcenode]["max_slopes"].append(c_max_slope)
            allloops[sourcenode]["water_profile"].append(c_water)
            allloops[sourcenode]["poi_diversity"].append(c_poi_diversity)
if LOOP_LENGTH_BOUND:
    llb_string = " and length bound " + str(LOOP_LENGTH_BOUND) + "m"
else:
    llb_string = ""
print(
    "Found "
    + str(numloops)
    + " loops for number of nodes bound "
    + str(LOOP_NUMNODE_BOUND)
    + llb_string
)
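
The cell above stores every loop once per node on it, so `numloops` counts (loop, node) pairs rather than distinct loops. The following sketch shows this bookkeeping on a hypothetical toy input; the names `toy_loops`, `census`, and `num_entries` are made up for illustration and are not part of the notebook.

```python
# Hypothetical toy input: loops as node lists, as a cycle enumerator might yield them
toy_loops = [["A", "B", "C"], ["A", "C", "D"], ["A", "B", "C", "D"]]

census = {}
num_entries = 0
for loop in toy_loops:
    for node in loop:
        # The same list object is shared by all nodes of the loop (no copies)
        census.setdefault(node, []).append(loop)
        num_entries += 1

# num_entries equals the sum of the loop sizes, not the number of distinct loops
print(num_entries, len(toy_loops))
print({node: len(loops) for node, loops in sorted(census.items())})
```

Dividing a per-node total by the loop size, or collecting loops only once, would recover counts of distinct loops if those are needed later.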

In [None]:
alllooplengths = np.zeros(numloops, dtype=np.float32)
allloopnumnodes = np.zeros(numloops, dtype=np.uint8)
allloopmaxslopes = np.zeros(numloops, dtype=np.uint16)
i = 0
for j in tqdm(allloops, desc="Extract global loop properties"):
    n = len(allloops[j]["lengths"])  # number of loops stored for node j
    alllooplengths[i : i + n] = allloops[j]["lengths"]
    allloopnumnodes[i : i + n] = allloops[j]["numnodes"]
    # max_slopes are multiplied by 100 for storage as uint16
    allloopmaxslopes[i : i + n] = (np.array(allloops[j]["max_slopes"]) * 100).astype(
        np.uint16
    )
    i += n

In [None]:
# Turn lists into numpy arrays with compact dtypes to reduce storage
for sourcenode in tqdm(allloops, desc="Turn loop data into numpy arrays"):
    for k, v in allloops[sourcenode].items():
        if k == "lengths":
            allloops[sourcenode][k] = np.array(
                allloops[sourcenode][k], dtype=np.float32
            )
        elif k == "numnodes":
            allloops[sourcenode][k] = np.array(allloops[sourcenode][k], dtype=np.uint8)
        elif k == "max_slopes":
            intslopes = [
                i * 100 for i in allloops[sourcenode][k]
            ]  # max_slopes are multiplied by 100 for storage as uint16
            allloops[sourcenode][k] = np.array(intslopes, dtype=np.uint16)
        elif k == "poi_diversity":
            allloops[sourcenode][k] = np.array(allloops[sourcenode][k], dtype=np.uint8)
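
One detail of the slope storage is worth keeping in mind: a NumPy cast from float to an unsigned integer truncates toward zero, so a product such as `0.29 * 100`, which is slightly below 29 in floating point, is stored as 28. The plain-Python sketch below mimics this with `int()`, which truncates the same way; the slope values are hypothetical and only chosen to show the effect.

```python
# Hypothetical slope values whose products with 100 are not exact in floating point
toy_slopes = [0.07, 0.29, 0.57, 1.5]

truncated = [int(s * 100) for s in toy_slopes]  # truncation, as in a float-to-integer cast
rounded = [round(s * 100) for s in toy_slopes]  # nearest integer

for s, t, r in zip(toy_slopes, truncated, rounded):
    print(s, t, r)
```

If the stored hundredths are meant to be exact, rounding before the cast (for example with `np.rint`) would be one option. Whether the off-by-one matters depends on how the slopes are used downstream.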

## Save loop census

In [None]:
if LOOP_LENGTH_BOUND:
    llb_string = "_maxlength" + str(LOOP_LENGTH_BOUND)
else:
    llb_string = ""

with open(
    PATH["data_out"] + "loopcensus_" + str(LOOP_NUMNODE_BOUND) + llb_string + ".pkl",
    "wb",
) as f:
    pickle.dump(allloops, f)
    pickle.dump(alllooplengths, f)
    pickle.dump(allloopnumnodes, f)
    pickle.dump(allloopmaxslopes, f)
    pickle.dump(Gnx, f)
    pickle.dump(LOOP_NUMNODE_BOUND, f)
    pickle.dump(nodes_id, f)
    pickle.dump(nodes_coords, f)
    pickle.dump(numloops, f)
    pickle.dump(faceloops, f)
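
Because several objects are dumped into one file, they have to be read back with one `pickle.load` call per object, in the same order. A minimal sketch of this pattern, using an in-memory buffer and hypothetical stand-in objects instead of the real census:

```python
import io
import pickle

# Hypothetical stand-ins for the census objects, written in a fixed order
buffer = io.BytesIO()
pickle.dump({"n0": {"lengths": [1.0, 2.5]}}, buffer)  # stands in for allloops
pickle.dump([1.0, 2.5], buffer)  # stands in for alllooplengths
pickle.dump(2, buffer)  # stands in for numloops

# Reading back: each load() returns the next object, in the order it was dumped
buffer.seek(0)
census = pickle.load(buffer)
lengths = pickle.load(buffer)
count = pickle.load(buffer)
print(census, lengths, count)
```

For the census file written above, the load order would be `allloops`, `alllooplengths`, `allloopnumnodes`, `allloopmaxslopes`, `Gnx`, `LOOP_NUMNODE_BOUND`, `nodes_id`, `nodes_coords`, `numloops`, `faceloops`.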