🚌 Projet MDM - Mobilité Durable en Montagne ⛰️

*Author: Nicolas Grosjean*

*Date: 13/09/2025*

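**Description:**

This Jupyter notebook analyses the OpenStreetMap (OSM) bus stop and bus line data returned by `OSMBusStopsProcessor` and `OSMBusLinesProcessor`.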

In [1]:
import geopandas as gpd
import pandas as pd

In [2]:
%cd ../..
print("Working directory set to the root of the project")

D:\Documents\GitHub\mobilite_durable
Working directory set to the root of the project


In [3]:
from src.processors.osm import OSMBusLinesProcessor, OSMBusStopsProcessor

In [4]:
def get_markdown_dtype(df: pd.DataFrame) -> str:
    """Return a Markdown table listing each column of `df` with its dtype."""
    markdown_table = "| Column | Dtype |\n|--------|-------|\n"
    for col in df.columns:
        dtype = df[col].dtype
        markdown_table += f"| {col} | {dtype} |\n"
    return markdown_table
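
When the table should also show how complete each column is, a variant of this helper can add the non-null count. The sketch below is illustrative only: `get_markdown_dtype_with_counts` and the toy frame are hypothetical and not part of the project.

```python
import pandas as pd


def get_markdown_dtype_with_counts(df: pd.DataFrame) -> str:
    """Hypothetical variant: one Markdown row per column with non-null count and dtype."""
    lines = ["| Column | Non-null | Dtype |", "|--------|----------|-------|"]
    for col in df.columns:
        lines.append(f"| {col} | {df[col].count()} | {df[col].dtype} |")
    return "\n".join(lines) + "\n"


# Toy data for illustration, not taken from the OSM extract
toy_df = pd.DataFrame({"osm_id": [1, 2, 3], "name": ["a", None, "c"]})
toy_table = get_markdown_dtype_with_counts(toy_df)
print(toy_table)
```

Collecting the rows in a list and joining them once avoids repeated string concatenation, although for a few dozen columns either approach works.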

In [5]:
stops_gdf = OSMBusStopsProcessor.fetch(reload_pipeline=False)
stops_gdf.head()

Unnamed: 0,gtfs_id,navitia_id,osm_id,name,description,line_gtfs_ids,line_osm_ids,geometry,other
0,,,135296,Université - IUT-STAPS,,[],[],POINT (5.77622 45.19738),"{'alt_name': None, 'amenity': None, 'bench': N..."
1,,,135930,Hôpital Couple Enfant,,[],[],POINT (5.74231 45.20065),"{'alt_name': None, 'amenity': None, 'bench': N..."
2,,,136570,Cap des H',"Arrêt de régulation, non commercial.",[],[],POINT (5.68159 45.21695),"{'alt_name': None, 'amenity': None, 'bench': N..."
3,,,136597,Place de la Libération,,[],[],POINT (5.66272 45.20707),"{'alt_name': None, 'amenity': None, 'bench': N..."
4,,,137073,Centr'Alp 2,,[],[],POINT (5.6043 45.3198),"{'alt_name': None, 'amenity': None, 'bench': N..."


In [6]:
stops_gdf.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 6792 entries, 0 to 6791
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype   
---  ------         --------------  -----   
 0   gtfs_id        595 non-null    object  
 1   navitia_id     0 non-null      object  
 2   osm_id         6792 non-null   int64   
 3   name           6792 non-null   object  
 4   description    284 non-null    object  
 5   line_gtfs_ids  6792 non-null   object  
 6   line_osm_ids   6792 non-null   object  
 7   geometry       6792 non-null   geometry
 8   other          6792 non-null   object  
dtypes: geometry(1), int64(1), object(7)
memory usage: 477.7+ KB


In [8]:
# Expand the `other` dict column into one column per OSM tag
expanded = stops_gdf["other"].apply(pd.Series)
expanded_stops_gdf = pd.concat([stops_gdf.drop(columns=["other"]), expanded], axis=1)
# Preview only the first 10 columns
expanded_stops_gdf[expanded_stops_gdf.columns[:10]].head()

Unnamed: 0,gtfs_id,navitia_id,osm_id,name,description,line_gtfs_ids,line_osm_ids,geometry,alt_name,amenity
0,,,135296,Université - IUT-STAPS,,[],[],POINT (5.77622 45.19738),,
1,,,135930,Hôpital Couple Enfant,,[],[],POINT (5.74231 45.20065),,
2,,,136570,Cap des H',"Arrêt de régulation, non commercial.",[],[],POINT (5.68159 45.21695),,
3,,,136597,Place de la Libération,,[],[],POINT (5.66272 45.20707),,
4,,,137073,Centr'Alp 2,,[],[],POINT (5.6043 45.3198),,
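
`apply(pd.Series)` builds one Series per row, which tends to be slow on frames with thousands of rows. A possible alternative, sketched here on toy data, is to build the tag columns in a single pass from the list of dicts. The frame `toy_stops` and its values are made up for the example, and the sketch assumes every entry of `other` is a dict.

```python
import pandas as pd

# Toy frame standing in for stops_gdf; values are illustrative, not real OSM data
toy_stops = pd.DataFrame(
    {
        "osm_id": [1, 2],
        "other": [
            {"bench": "yes", "shelter": None},
            {"bench": None, "shelter": "no"},
        ],
    }
)

# Build all tag columns at once from the list of dicts, keeping the original index
tags = pd.DataFrame(toy_stops["other"].tolist(), index=toy_stops.index)
expanded_toy = pd.concat([toy_stops.drop(columns=["other"]), tags], axis=1)

# A tag key could collide with an existing column name, so check for duplicates
has_duplicates = bool(expanded_toy.columns.duplicated().any())
print(expanded_toy.columns.tolist(), has_duplicates)
```

Passing `index=toy_stops.index` keeps the rows aligned in `pd.concat` even when the source frame does not use a default `RangeIndex`.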


In [7]:
expanded_stops_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6791 entries, 0 to 6790
Data columns (total 75 columns):
 #   Column                         Non-Null Count  Dtype 
---  ------                         --------------  ----- 
 0   type                           6791 non-null   object
 1   geometry                       6791 non-null   object
 2   id                             6791 non-null   int64 
 3   bus                            5856 non-null   object
 4   highway                        6791 non-null   object
 5   name                           6378 non-null   object
 6   network                        4829 non-null   object
 7   public_transport               6758 non-null   object
 8   wheelchair                     1256 non-null   object
 9   description                    284 non-null    object
 10  network:wikidata               413 non-null    object
 11  fixme                          68 non-null     object
 12  source                         165 non-null    object
 13  she

In [10]:
print(
    get_markdown_dtype(expanded_stops_gdf[expanded_stops_gdf.columns[:10]]).replace(
        "object", "string"
    )
)

| Column | Dtype |
|--------|-------|
| gtfs_id | string |
| navitia_id | string |
| osm_id | int64 |
| name | string |
| description | string |
| line_gtfs_ids | string |
| line_osm_ids | string |
| geometry | geometry |
| alt_name | string |
| amenity | string |
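
Calling `.replace("object", "string")` on the finished table rewrites every occurrence of the word, including any column name that happens to contain "object". A more targeted option, sketched below under that assumption, maps the dtype to a label before the row is formatted. `dtype_label` and the toy frame are hypothetical. Note that `object` dtype also covers list-valued columns such as `line_gtfs_ids`, so "string" remains an approximation.

```python
import pandas as pd


def dtype_label(dtype) -> str:
    """Hypothetical helper: show 'string' for object columns, the dtype name otherwise."""
    return "string" if pd.api.types.is_object_dtype(dtype) else str(dtype)


# Toy frame with a column name that contains the word "object"
toy_df = pd.DataFrame(
    {
        "object_ref": pd.Series(["a", "b"], dtype=object),
        "osm_id": [1, 2],
    }
)

rows = [f"| {col} | {dtype_label(toy_df[col].dtype)} |" for col in toy_df.columns]
toy_table = "\n".join(["| Column | Dtype |", "|--------|-------|", *rows])
print(toy_table)
```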



In [9]:
lines_df = OSMBusLinesProcessor.fetch(reload_pipeline=False)
lines_df.head()

2025-09-21 11:26:07,549 - INFO - src.processors.osm|query_overpass:60 - Getting overpass query results in 3s
2025-09-21 11:26:12,155 - DEBUG - src.processors.osm|pre_process:153 - Skipping disused or abandoned bus line with id 2073673


Unnamed: 0,gtfs_id,osm_id,name,from_location,to,network,network_gtfs_id,network_osm_id,network_wikidata,operator,colour,text_colour,stop_gtfs_ids,stops_osm_ids,school,geometry,other
0,,2067887,Ligne A : Gare de Saint-Clair-Les-Roches ⇒ Ron...,Gare de Saint-Clair-Les-Roches,Rond-point Chanas,TPR,,,,Courriers Rhodaniens / Fayard,e53b1a,,[],"[1659415935, 8874916309, 11146173165, 11146173...",False,,"{'public_transport:version': '2', 'ref': 'A', ..."
1,,2569190,Ouibus 70 : Grenoble Gare Routière -> Aéroport...,Grenoble - Gare Routière,Aéroport Lyon Saint-Exupéry - Terminal 1,BlaBlaBus,,,Q1653380,Faure Vercors,#ee0064,,[],"[2617010911, 474827289, 6074566590]",False,,"{'network:wikipedia': 'fr:BlaBlaCar Bus', 'old..."
2,,2569239,Ouibus 70 : Aéroport Lyon Saint-Exupéry -> Pla...,Aéroport Lyon Saint-Exupéry - Terminal 1,Grenoble - Gare routière,BlaBlaBus,,,Q1653380,Faure Vercors,#ee0064,,[],"[6074566590, 457759141, 2617010911]",False,,"{'old_name': 'Satobus', 'opening_hours': 'Mo-S..."
3,,2920548,15 : Bois Français => Grenoble (via Chenevières),Saint Ismier - Bois Français,Grenoble - Verdun-Préfecture,M réso,,,Q131689044,VFD,#1f72b9,,[],"[2299463674, 513946287, 513946283, 513946279, ...",False,,"{'description': 'Circule l'été', 'opening_hour..."
4,,2920549,15 : Grenoble => Bois Français (via Chenevières),Grenoble - Verdun-Préfecture,Saint Ismier - Bois Français,M réso,,,Q131689044,VFD,#1f72b9,,[],"[372746162, 451116247, 1829688368, 1829874475,...",False,,"{'description': 'Circule l'été', 'opening_hour..."


In [10]:
lines_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1866 entries, 0 to 1865
Data columns (total 17 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   gtfs_id           335 non-null    object
 1   osm_id            1866 non-null   int64 
 2   name              1866 non-null   object
 3   from_location     1862 non-null   object
 4   to                1862 non-null   object
 5   network           1860 non-null   object
 6   network_gtfs_id   0 non-null      object
 7   network_osm_id    0 non-null      object
 8   network_wikidata  1389 non-null   object
 9   operator          1594 non-null   object
 10  colour            1440 non-null   object
 11  text_colour       34 non-null     object
 12  stop_gtfs_ids     1866 non-null   object
 13  stops_osm_ids     1866 non-null   object
 14  school            1866 non-null   bool  
 15  geometry          0 non-null      object
 16  other             1866 non-null   object
dtypes: bool(1), in
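
The `info()` output above lists several columns with 0 non-null values (`network_gtfs_id`, `network_osm_id`, `geometry`). A short sketch of how such columns could be found programmatically, together with a fill rate per column, is given below. The frame `toy_lines` and its values are invented for the example.

```python
import pandas as pd

# Toy frame standing in for lines_df; values are illustrative only
toy_lines = pd.DataFrame(
    {
        "osm_id": [10, 11, 12],
        "network_osm_id": [None, None, None],
        "colour": ["#1f72b9", None, "e53b1a"],
    }
)

# Columns in which every value is missing
empty_columns = toy_lines.columns[toy_lines.isna().all()].tolist()

# Share of non-null values per column, between 0 and 1
fill_rate = toy_lines.notna().mean().round(2)

print(empty_columns)
print(fill_rate)
```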

In [11]:
# Expand the `other` dict column into one column per OSM tag
expanded = lines_df["other"].apply(pd.Series)
expanded_lines_df = pd.concat([lines_df.drop(columns=["other"]), expanded], axis=1)
expanded_lines_df.head()

Unnamed: 0,gtfs_id,osm_id,name,from_location,to,network,network_gtfs_id,network_osm_id,network_wikidata,operator,...,reservation,charge,url,comment,fixme,name:pt,check_date,name:eu,duration,public_transport
0,,2067887,Ligne A : Gare de Saint-Clair-Les-Roches ⇒ Ron...,Gare de Saint-Clair-Les-Roches,Rond-point Chanas,TPR,,,,Courriers Rhodaniens / Fayard,...,,,,,,,,,,
1,,2569190,Ouibus 70 : Grenoble Gare Routière -> Aéroport...,Grenoble - Gare Routière,Aéroport Lyon Saint-Exupéry - Terminal 1,BlaBlaBus,,,Q1653380,Faure Vercors,...,,,,,,,,,,
2,,2569239,Ouibus 70 : Aéroport Lyon Saint-Exupéry -> Pla...,Aéroport Lyon Saint-Exupéry - Terminal 1,Grenoble - Gare routière,BlaBlaBus,,,Q1653380,Faure Vercors,...,,,,,,,,,,
3,,2920548,15 : Bois Français => Grenoble (via Chenevières),Saint Ismier - Bois Français,Grenoble - Verdun-Préfecture,M réso,,,Q131689044,VFD,...,,,,,,,,,,
4,,2920549,15 : Grenoble => Bois Français (via Chenevières),Grenoble - Verdun-Préfecture,Saint Ismier - Bois Français,M réso,,,Q131689044,VFD,...,,,,,,,,,,


In [12]:
expanded_lines_df.loc[4, expanded_lines_df.columns[:25]]

gtfs_id                                                                  None
osm_id                                                                2920549
name                         15 : Grenoble => Bois Français (via Chenevières)
from_location                                    Grenoble - Verdun-Préfecture
to                                               Saint Ismier - Bois Français
network                                                                M réso
network_gtfs_id                                                          None
network_osm_id                                                           None
network_wikidata                                                   Q131689044
operator                                                                  VFD
colour                                                                #1f72b9
text_colour                                                              None
stop_gtfs_ids                                                   

In [13]:
expanded_lines_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1866 entries, 0 to 1865
Data columns (total 73 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   gtfs_id                   335 non-null    object
 1   osm_id                    1866 non-null   int64 
 2   name                      1866 non-null   object
 3   from_location             1862 non-null   object
 4   to                        1862 non-null   object
 5   network                   1860 non-null   object
 6   network_gtfs_id           0 non-null      object
 7   network_osm_id            0 non-null      object
 8   network_wikidata          1389 non-null   object
 9   operator                  1594 non-null   object
 10  colour                    1440 non-null   object
 11  text_colour               34 non-null     object
 12  stop_gtfs_ids             1866 non-null   object
 13  stops_osm_ids             1866 non-null   object
 14  school                  

In [14]:
print(
    get_markdown_dtype(expanded_lines_df[expanded_lines_df.columns[:25]]).replace(
        "object", "string"
    )
)

| Column | Dtype |
|--------|-------|
| gtfs_id | string |
| osm_id | int64 |
| name | string |
| from_location | string |
| to | string |
| network | string |
| network_gtfs_id | string |
| network_osm_id | string |
| network_wikidata | string |
| operator | string |
| colour | string |
| text_colour | string |
| stop_gtfs_ids | string |
| stops_osm_ids | string |
| school | bool |
| geometry | string |
| public_transport:version | string |
| ref | string |
| route | string |
| type | string |
| via | string |
| network:wikipedia | string |
| old_name | string |
| opening_hours | string |
| description | string |

