🚌 Projet MDM - Mobilité Durable en Montagne ⛰️

*Author: Nicolas Grosjean*

*Date: 13/09/2025*

**Description:**

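This Jupyter notebook explores the OSM (OpenStreetMap) bus stop and bus line data returned by `OSMBusStopsProcessor` and `OSMBusLinesProcessor`, focusing on the columns and dtypes of each table.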

In [1]:
import geopandas as gpd
import pandas as pd

In [2]:
import os

os.chdir("../..")
print("Working directory set to the root of the project")

Working directory set to the root of the project


In [3]:
from src.processors.osm import OSMBusLinesProcessor, OSMBusStopsProcessor

In [4]:
def get_markdown_dtype(df: pd.DataFrame) -> str:
    """Return a Markdown table listing each column of df with its dtype."""
    markdown_table = "| Column | Dtype |\n|--------|-------|\n"
    for col in df.columns:
        dtype = df[col].dtype
        markdown_table += f"| {col} | {dtype} |\n"
    return markdown_table
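
To illustrate the kind of table such a helper produces, here is a self-contained sketch of a variant that also reports the non-null count per column. The name `markdown_schema` and the toy DataFrame are assumptions made for illustration; they are not part of the project.

```python
import pandas as pd


def markdown_schema(df: pd.DataFrame) -> str:
    """Return a Markdown table of column name, non-null count and dtype."""
    rows = ["| Column | Non-null | Dtype |", "|--------|----------|-------|"]
    for col in df.columns:
        rows.append(f"| {col} | {df[col].count()} | {df[col].dtype} |")
    return "\n".join(rows) + "\n"


# Toy data standing in for a real table (values are made up).
toy = pd.DataFrame(
    {
        "osm_id": pd.Series([1, 2, 3], dtype="int64"),
        "score": [0.5, None, 1.5],
    }
)
table = markdown_schema(toy)
print(table)
```

Building the rows in a list and joining them once avoids repeated string concatenation, which matters little here but keeps the function easy to extend with further columns.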

In [5]:
stops_gdf = OSMBusStopsProcessor.fetch(reload_pipeline=False)
stops_gdf.head()

Unnamed: 0,gtfs_id,navitia_id,osm_id,name,description,line_gtfs_ids,line_osm_ids,network,network_gtfs_id,geometry,other
0,,,135296,Université - IUT-STAPS,,[],[3922428],M réso,,POINT (5.77622 45.19738),"{'alt_name': None, 'amenity': None, 'bench': N..."
1,,,135930,Hôpital Couple Enfant,,[],"[3333927, 3922430]",M réso,,POINT (5.74231 45.20065),"{'alt_name': None, 'amenity': None, 'bench': N..."
2,,,136570,Cap des H',"Arrêt de régulation, non commercial.",[],[],M réso,,POINT (5.68159 45.21695),"{'alt_name': None, 'amenity': None, 'bench': N..."
3,,,136597,Place de la Libération,,[],"[3031947, 3031948, 3044780, 3044781, 3923550, ...",M réso,,POINT (5.66272 45.20707),"{'alt_name': None, 'amenity': None, 'bench': N..."
4,,,137073,Centr'Alp 2,,[],"[8671286, 8671287]",M réso,,POINT (5.6043 45.3198),"{'alt_name': None, 'amenity': None, 'bench': N..."


In [6]:
stops_gdf.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 6793 entries, 0 to 6792
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype   
---  ------           --------------  -----   
 0   gtfs_id          595 non-null    object  
 1   navitia_id       0 non-null      object  
 2   osm_id           6793 non-null   int64   
 3   name             6793 non-null   object  
 4   description      282 non-null    object  
 5   line_gtfs_ids    6793 non-null   object  
 6   line_osm_ids     6793 non-null   object  
 7   network          4830 non-null   object  
 8   network_gtfs_id  0 non-null      object  
 9   geometry         6793 non-null   geometry
 10  other            6793 non-null   object  
dtypes: geometry(1), int64(1), object(9)
memory usage: 583.9+ KB


In [7]:
# Expand the dict stored in "other" into one column per key
expanded = stops_gdf["other"].apply(pd.Series)
expanded_stops_gdf = pd.concat([stops_gdf.drop(columns=["other"]), expanded], axis=1)
# Preview the first 10 columns only
expanded_stops_gdf[expanded_stops_gdf.columns[:10]].head()

Unnamed: 0,gtfs_id,navitia_id,osm_id,name,description,line_gtfs_ids,line_osm_ids,network,network_gtfs_id,geometry
0,,,135296,Université - IUT-STAPS,,[],[3922428],M réso,,POINT (5.77622 45.19738)
1,,,135930,Hôpital Couple Enfant,,[],"[3333927, 3922430]",M réso,,POINT (5.74231 45.20065)
2,,,136570,Cap des H',"Arrêt de régulation, non commercial.",[],[],M réso,,POINT (5.68159 45.21695)
3,,,136597,Place de la Libération,,[],"[3031947, 3031948, 3044780, 3044781, 3923550, ...",M réso,,POINT (5.66272 45.20707)
4,,,137073,Centr'Alp 2,,[],"[8671286, 8671287]",M réso,,POINT (5.6043 45.3198)
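
Expanding a column of dicts with `.apply(pd.Series)` builds one Series per row, which can become slow on a few thousand rows. A possible alternative, sketched below on a toy frame with made-up values, is to pass the list of dicts to the `pd.DataFrame` constructor. It assumes every row of `other` holds a dict; the names `toy`, `tags` and `expanded_toy` are hypothetical.

```python
import pandas as pd

# Toy frame: "other" holds one dict per row (made-up values).
toy = pd.DataFrame(
    {
        "osm_id": [1, 2, 3],
        "other": [
            {"bench": "yes", "shelter": None},
            {"bench": None, "shelter": "no"},
            {"bench": "no", "shelter": "yes"},
        ],
    }
)

# Build all key columns in one pass, keeping the original index for alignment.
tags = pd.DataFrame(toy["other"].tolist(), index=toy.index)
expanded_toy = pd.concat([toy.drop(columns=["other"]), tags], axis=1)
print(expanded_toy.columns.tolist())
```

Passing `index=toy.index` keeps the expanded columns aligned with the original rows even when the index is not a plain `RangeIndex`.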


In [8]:
expanded_stops_gdf.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 6793 entries, 0 to 6792
Data columns (total 78 columns):
 #   Column                         Non-Null Count  Dtype   
---  ------                         --------------  -----   
 0   gtfs_id                        595 non-null    object  
 1   navitia_id                     0 non-null      object  
 2   osm_id                         6793 non-null   int64   
 3   name                           6793 non-null   object  
 4   description                    282 non-null    object  
 5   line_gtfs_ids                  6793 non-null   object  
 6   line_osm_ids                   6793 non-null   object  
 7   network                        4830 non-null   object  
 8   network_gtfs_id                0 non-null      object  
 9   geometry                       6793 non-null   geometry
 10  alt_name                       32 non-null     object  
 11  amenity                        3 non-null      object  
 12  bench                     

In [9]:
print(
    get_markdown_dtype(expanded_stops_gdf[expanded_stops_gdf.columns[:10]]).replace(
        "object", "string"
    )
)

| Column | Dtype |
|--------|-------|
| gtfs_id | string |
| navitia_id | string |
| osm_id | int64 |
| name | string |
| description | string |
| line_gtfs_ids | string |
| line_osm_ids | string |
| network | string |
| network_gtfs_id | string |
| geometry | geometry |
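
Replacing `object` with `string` also labels columns such as `line_gtfs_ids` and `line_osm_ids` as strings, although the `head()` output suggests they hold lists. One way to check what an `object` column really contains is to look at the Python type of its first non-null value. The sketch below does this on a toy frame; the function name `describe_object_columns` and the data are assumptions for illustration.

```python
import pandas as pd


def describe_object_columns(df: pd.DataFrame) -> dict:
    """Map each column to its dtype, or to the Python type found in object columns."""
    kinds = {}
    for col in df.columns:
        if not pd.api.types.is_object_dtype(df[col]):
            kinds[col] = str(df[col].dtype)
            continue
        non_null = df[col].dropna()
        if non_null.empty:
            kinds[col] = "unknown (all null)"
        else:
            # Only the first value is inspected; mixed columns need a fuller check.
            kinds[col] = type(non_null.iloc[0]).__name__
    return kinds


# Toy data (made up): a list column and an entirely empty column.
toy = pd.DataFrame(
    {
        "line_osm_ids": [[10, 20], []],
        "navitia_id": [None, None],
    }
)
kinds = describe_object_columns(toy)
print(kinds)
```

If the lists turn out to be stored as text (for instance after a round trip through CSV), the same check would report `str` instead of `list`.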



In [10]:
lines_df = OSMBusLinesProcessor.fetch(reload_pipeline=False)
lines_df.head()

Unnamed: 0,gtfs_id,osm_id,name,from_location,to,network,network_gtfs_id,network_wikidata,operator,colour,text_colour,stop_gtfs_ids,stops_osm_ids,school,geometry,other
0,,2067887,Ligne A : Gare de Saint-Clair-Les-Roches ⇒ Ron...,Gare de Saint-Clair-Les-Roches,Rond-point Chanas,TPR,,,Courriers Rhodaniens / Fayard,e53b1a,,[],"[1659415935, 8874916309, 11146173165, 11146173...",False,,"{'charge': None, 'check_date': None, 'comment'..."
1,,2569190,Ouibus 70 : Grenoble Gare Routière -> Aéroport...,Grenoble - Gare Routière,Aéroport Lyon Saint-Exupéry - Terminal 1,BlaBlaBus,,Q1653380,Faure Vercors,#ee0064,,[],"[2617010911, 474827289, 6074566590]",False,,"{'charge': None, 'check_date': None, 'comment'..."
2,,2569239,Ouibus 70 : Aéroport Lyon Saint-Exupéry -> Pla...,Aéroport Lyon Saint-Exupéry - Terminal 1,Grenoble - Gare routière,BlaBlaBus,,Q1653380,Faure Vercors,#ee0064,,[],"[6074566590, 457759141, 2617010911]",False,,"{'charge': None, 'check_date': None, 'comment'..."
3,,2920548,15 : Bois Français => Grenoble (via Chenevières),Saint Ismier - Bois Français,Grenoble - Verdun-Préfecture,M réso,,Q131689044,VFD,#1f72b9,,[],"[2299463674, 513946287, 513946283, 513946279, ...",False,,"{'charge': None, 'check_date': None, 'comment'..."
4,,2920549,15 : Grenoble => Bois Français (via Chenevières),Grenoble - Verdun-Préfecture,Saint Ismier - Bois Français,M réso,,Q131689044,VFD,#1f72b9,,[],"[372746162, 451116247, 1829688368, 1829874475,...",False,,"{'charge': None, 'check_date': None, 'comment'..."
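
The stops table carries a `line_osm_ids` list per stop and the lines table an `osm_id` per line, so the two can in principle be joined. The sketch below shows one way to do so with `explode` and `merge` on toy data; all ids and names are made up, and it assumes `line_osm_ids` holds real Python lists.

```python
import pandas as pd

# Toy stand-ins for the stops and lines tables (made-up ids and names).
stops = pd.DataFrame(
    {
        "osm_id": [1, 2, 3],
        "name": ["Stop A", "Stop B", "Stop C"],
        "line_osm_ids": [[10], [20, 30], []],
    }
)
lines = pd.DataFrame({"osm_id": [10, 20], "name": ["Line X", "Line Y"]})

# One row per (stop, line) pair; stops without lines yield NaN and are dropped.
pairs = (
    stops.explode("line_osm_ids")
    .dropna(subset=["line_osm_ids"])
    .astype({"line_osm_ids": "int64"})
)

# Inner join: pairs whose line id is absent from the lines table disappear.
joined = pairs.merge(
    lines, left_on="line_osm_ids", right_on="osm_id", suffixes=("_stop", "_line")
)
print(joined[["name_stop", "name_line"]])
```

Casting the exploded ids back to `int64` keeps the join key dtype identical on both sides, which avoids dtype mismatch problems in `merge`.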


In [11]:
lines_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1866 entries, 0 to 1865
Data columns (total 16 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   gtfs_id           335 non-null    object
 1   osm_id            1866 non-null   int64 
 2   name              1866 non-null   object
 3   from_location     1862 non-null   object
 4   to                1862 non-null   object
 5   network           1860 non-null   object
 6   network_gtfs_id   0 non-null      object
 7   network_wikidata  1389 non-null   object
 8   operator          1594 non-null   object
 9   colour            1440 non-null   object
 10  text_colour       34 non-null     object
 11  stop_gtfs_ids     1866 non-null   object
 12  stops_osm_ids     1866 non-null   object
 13  school            1866 non-null   bool  
 14  geometry          0 non-null      object
 15  other             1866 non-null   object
dtypes: bool(1), int64(1), object(14)
memory usage: 220.6+ KB
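
The `info()` output above shows columns with no values at all, such as `network_gtfs_id` and `geometry` (0 non-null out of 1866). A compact way to rank columns by fill rate and list the empty ones is sketched below on a toy frame with made-up values; `fill_rate` and `empty_columns` are hypothetical names.

```python
import pandas as pd

# Toy data (made up) with one fully empty column.
toy = pd.DataFrame(
    {
        "osm_id": [1, 2, 3, 4],
        "operator": ["VFD", None, "VFD", None],
        "geometry": [None, None, None, None],
    }
)

# Share of non-null values per column, from emptiest to fullest.
fill_rate = toy.notna().mean().sort_values()
empty_columns = fill_rate[fill_rate == 0].index.tolist()
print(fill_rate)
print(empty_columns)
```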


In [12]:
# Expand the dict stored in "other" into one column per key
expanded = lines_df["other"].apply(pd.Series)
expanded_lines_df = pd.concat([lines_df.drop(columns=["other"]), expanded], axis=1)
expanded_lines_df.head()

Unnamed: 0,gtfs_id,osm_id,name,from_location,to,network,network_gtfs_id,network_wikidata,operator,colour,...,roundtrip,route,seasonal,segment,source,type,url,via,website,wheelchair
0,,2067887,Ligne A : Gare de Saint-Clair-Les-Roches ⇒ Ron...,Gare de Saint-Clair-Les-Roches,Rond-point Chanas,TPR,,,Courriers Rhodaniens / Fayard,e53b1a,...,,bus,,,,route,,Le Péage de Roussillon - Gare SNCF,,
1,,2569190,Ouibus 70 : Grenoble Gare Routière -> Aéroport...,Grenoble - Gare Routière,Aéroport Lyon Saint-Exupéry - Terminal 1,BlaBlaBus,,Q1653380,Faure Vercors,#ee0064,...,,bus,,,,route,,Place de la résistance,,
2,,2569239,Ouibus 70 : Aéroport Lyon Saint-Exupéry -> Pla...,Aéroport Lyon Saint-Exupéry - Terminal 1,Grenoble - Gare routière,BlaBlaBus,,Q1653380,Faure Vercors,#ee0064,...,,bus,,,,route,,Place de la Résistance,,
3,,2920548,15 : Bois Français => Grenoble (via Chenevières),Saint Ismier - Bois Français,Grenoble - Verdun-Préfecture,M réso,,Q131689044,VFD,#1f72b9,...,,bus,,,,route,,Domène - Chenevières,,yes
4,,2920549,15 : Grenoble => Bois Français (via Chenevières),Grenoble - Verdun-Préfecture,Saint Ismier - Bois Français,M réso,,Q131689044,VFD,#1f72b9,...,,bus,,,,route,,Domène - Chenevières,,yes


In [13]:
expanded_lines_df.loc[4, expanded_lines_df.columns[:25]]

gtfs_id                                                          None
osm_id                                                        2920549
name                 15 : Grenoble => Bois Français (via Chenevières)
from_location                            Grenoble - Verdun-Préfecture
to                                       Saint Ismier - Bois Français
network                                                        M réso
network_gtfs_id                                                  None
network_wikidata                                           Q131689044
operator                                                          VFD
colour                                                        #1f72b9
text_colour                                                      None
stop_gtfs_ids                                                      []
stops_osm_ids       [372746162, 451116247, 1829688368, 1829874475,...
school                                                          False
geometry            

In [14]:
expanded_lines_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1866 entries, 0 to 1865
Data columns (total 72 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   gtfs_id                   335 non-null    object
 1   osm_id                    1866 non-null   int64 
 2   name                      1866 non-null   object
 3   from_location             1862 non-null   object
 4   to                        1862 non-null   object
 5   network                   1860 non-null   object
 6   network_gtfs_id           0 non-null      object
 7   network_wikidata          1389 non-null   object
 8   operator                  1594 non-null   object
 9   colour                    1440 non-null   object
 10  text_colour               34 non-null     object
 11  stop_gtfs_ids             1866 non-null   object
 12  stops_osm_ids             1866 non-null   object
 13  school                    1866 non-null   bool  
 14  geometry                

In [15]:
print(
    get_markdown_dtype(expanded_lines_df[expanded_lines_df.columns[:25]]).replace(
        "object", "string"
    )
)

| Column | Dtype |
|--------|-------|
| gtfs_id | string |
| osm_id | int64 |
| name | string |
| from_location | string |
| to | string |
| network | string |
| network_gtfs_id | string |
| network_wikidata | string |
| operator | string |
| colour | string |
| text_colour | string |
| stop_gtfs_ids | string |
| stops_osm_ids | string |
| school | bool |
| geometry | string |
| charge | string |
| check_date | string |
| comment | string |
| description | string |
| description:ca | string |
| description:de | string |
| description:es | string |
| description:fr | string |
| duration | string |
| fee | string |

