### Loading and Preparing Graph Data from GraphML Files

The `load_graphml_files` function loads a series of bicycle traffic network graphs stored in GraphML format and prepares them for training with PyTorch Geometric (PyG). The objective is to convert each monthly graph into a format compatible with Graph Neural Networks (GNNs), ensuring that both node and edge features are retained.

Each NetworkX graph is converted into a PyG `Data` object by the custom helper function `transform_networkx_into_pyg`, which preserves the following node and edge attributes during conversion:

- **Node attributes**:
  - `lon` (longitude)
  - `lat` (latitude)

- **Edge attributes**:
  - `speed_rel` (relative speed)
  - `month` (the month of the traffic data, used for cyclical encoding)
  - `year` (the year of the traffic data)
  - `id` (an identifier for a track from A to B)
  - `tracks` (the number of bicycles traveling from the start point to the end point)
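The cyclical encoding mentioned for `month` can be sketched as follows. This is a minimal illustration, not the encoding used by `transform_networkx_into_pyg` (which is not shown here); the function name `encode_month_cyclical` and the `first_month` parameter are assumptions made for the example, the latter because the source does not state whether months are stored as 1-12 or 0-11.

```python
import math

def encode_month_cyclical(month, first_month=1):
    """Map a month number onto the unit circle as a (sin, cos) pair.

    Hypothetical helper: `first_month` is the value that represents
    January (1 for 1-12 numbering, 0 for 0-11 numbering).
    """
    # Twelve months span one full turn; the first month sits at angle 0.
    angle = 2.0 * math.pi * ((month - first_month) % 12) / 12.0
    return math.sin(angle), math.cos(angle)

jan = encode_month_cyclical(1)
feb = encode_month_cyclical(2)
dec = encode_month_cyclical(12)

# On the circle, December and February are equally close to January,
# which a raw integer month (1 vs. 12) would not express.
print(math.dist(jan, dec), math.dist(jan, feb))
```

The point of the two-component encoding is that the distance between consecutive months stays constant across the year boundary.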

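PyG represents each graph as a `Data` object holding a node feature matrix `x`, an edge index `edge_index` of shape `[2, num_edges]`, and, when edge features are used, an edge feature matrix `edge_attr`. This layout matters for models such as GATv2 that can consume both node and edge attributes.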
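To make this structure concrete, the sketch below builds the three pieces (node features, edge index, edge features) with plain Python lists. The toy graph and all of its values are invented for illustration, and which attributes the actual helper places into the feature matrices is an assumption.

```python
# Hypothetical toy graph: two nodes, two directed edges (all values invented).
nodes = {
    "a": {"lon": 13.40, "lat": 52.52},
    "b": {"lon": 13.45, "lat": 52.50},
}
edges = [
    ("a", "b", {"speed_rel": 0.9, "tracks": 12}),
    ("b", "a", {"speed_rel": 1.1, "tracks": 7}),
]

# Edges refer to nodes by consecutive integer index, so map names to indices.
node_index = {name: i for i, name in enumerate(nodes)}

# Node features: one row per node, shape [num_nodes, num_node_features].
x = [[attrs["lon"], attrs["lat"]] for attrs in nodes.values()]

# Edge index in COO layout: row 0 holds source indices, row 1 target indices.
edge_index = [
    [node_index[src] for src, _, _ in edges],
    [node_index[dst] for _, dst, _ in edges],
]

# Edge features: one row per edge, in the same order as edge_index columns.
edge_attr = [[attrs["speed_rel"], attrs["tracks"]] for _, _, attrs in edges]

print(x, edge_index, edge_attr)
```

With PyTorch available, these lists would typically be wrapped as tensors (`torch.float` for the features, `torch.long` for `edge_index`) and passed to `Data(x=..., edge_index=..., edge_attr=...)`.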

The returned `data_list` contains one `torch_geometric.data.Data` object per monthly graph file that was found on disk.
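Because missing files are skipped with a warning, `data_list` can hold fewer than 12 graphs per year. A quick pre-check along the following lines can list the gaps before loading. The helper name `find_missing_graphml` and its parameters are hypothetical; only the file naming pattern and the base directory are taken from the loader.

```python
import os

def find_missing_graphml(years, base_dir, months=range(12)):
    """Return the expected monthly GraphML paths that do not exist on disk.

    Hypothetical helper; the default month indices 0-11 mirror the loader.
    """
    missing = []
    for year in years:
        for i in months:
            path = os.path.join(
                base_dir, str(year), f"bike_network_{year}_{i}.graphml"
            )
            if not os.path.exists(path):
                missing.append(path)
    return missing

years = [2021, 2022, 2023]
missing = find_missing_graphml(years, "../../../data/graphml")
print(f"{len(missing)} of {len(years) * 12} expected files are missing")
```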


In [None]:
import os
import sys
import networkx as nx
import import_ipynb  # allows importing .ipynb notebooks as Python modules

src_path = os.path.abspath(os.path.join(os.getcwd(), "..", ".."))
if src_path not in sys.path:
    sys.path.append(src_path)

from utils.wrapper.transform_networkx_into_pyg import transform_networkx_into_pyg

def load_graphml_files(years=(2021, 2022, 2023)):
    """
    Loads multiple directed graph files in GraphML format and converts them 
    into PyTorch Geometric (PyG) Data objects.

    Parameters:
    -----------
    years : iterable of int (default=(2021, 2022, 2023))
        Years for which graph files should be loaded.
        Up to 12 monthly files are expected per year; missing files are
        skipped with a warning.

    Returns:
    --------
    data_list : list of torch_geometric.data.Data
        List of PyG data objects created from the loaded NetworkX graphs.
    """

    data_list = []

    for year in years:
        # Month index runs 0-11 here; use range(1, 13) if the files are numbered 1-12.
        for i in range(12):
            path = f"../../../data/graphml/{year}/bike_network_{year}_{i}.graphml"
            if not os.path.exists(path):
                print(f"[WARN] File not found: {path}")
                continue

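            # Read the GraphML file and make sure the result is a directed graph.
            G_nx = nx.read_graphml(path)
            G_nx = nx.DiGraph(G_nx)

            # Convert to a PyG Data object, keeping node and edge attributes.
            data = transform_networkx_into_pyg(G_nx)
            data_list.append(data)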

    print(f"Number of loaded graphs: {len(data_list)}")
    return data_list