**Computes large population movements**

**Input**: CSV file with the following columns:
* `transportation_mode`: mode of transport used for the trip
* `starting_longitude` and `starting_latitude`: starting point of the trip
* `ending_longitude` and `ending_latitude`: ending point of the trip
* `user_id`: ID of the traveling user, used to make sure results include more than 3 users per geographic division
* `journey_id`: ID of the journey, which gathers multiple trips
* `start_time`: start date of the trip, can be used to compute trip duration
* `end_time`: end date of the trip, can be used to compute trip duration
* `distance_km`: distance of the trip, used to find the dominant transportation mode of a journey

**Input**: GeoJSON file with the perimeter of the Ile-de-France region

**Output**: GeoJSON file (path set by `EXODE_OUTPUT_FILE`, e.g. "../static/data/exode.geojson") containing h3 cell shapes with the following metadata:
* `geometry`: h3 shape of the destinations
* `MostCommonTransport`: most common `transportation_mode_tr` among the journeys ending in the cell
* `color`: a color representation of `MostCommonTransport`
* `Count`: number of distinct users with a journey from Ile-de-France ending in the cell

**Output**: GeoJSON file (path set by `EXODE_LINES_OUTPUT_FILE`, e.g. "../static/data/exode_lines.geojson") containing line shapes with the following metadata:
* `geometry`: LineString from the Ile-de-France center point to the centroid of the h3 cell
* `MostCommonTransport`: most common `transportation_mode_tr` among the journeys ending in the cell
* `color`: a color representation of `MostCommonTransport`
* `Count`: number of distinct users with a journey from Ile-de-France ending in the cell

**Output**: GeoJSON file (path set by `INXODE_OUTPUT_FILE`, e.g. "../static/data/inxode.geojson") containing h3 cell shapes with the following metadata:
* `geometry`: h3 shape of the origins
* `MostCommonTransport`: most common `transportation_mode_tr` among the journeys starting in the cell
* `color`: a color representation of `MostCommonTransport`
* `Count`: number of distinct users with a journey starting in the cell and ending in Ile-de-France

**Output**: GeoJSON file (path set by `INXODE_LINES_OUTPUT_FILE`, e.g. "../static/data/inxode_lines.geojson") containing line shapes with the following metadata:
* `geometry`: LineString from the Ile-de-France center point to the centroid of the h3 cell
* `MostCommonTransport`: most common `transportation_mode_tr` among the journeys starting in the cell
* `color`: a color representation of `MostCommonTransport`
* `Count`: number of distinct users with a journey starting in the cell and ending in Ile-de-France


`MostCommonTransport` is computed as follows:
* We aggregate trips into journeys:
    * the journey start point is the start point of the first trip
    * the journey end point is the end point of the last trip
    * the journey transportation mode is the mode covering the largest distance within the journey
* We then aggregate journeys by h3 hexagon cell:
    * the most common transport is the journey transportation mode that appears most often in the cell
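
The two aggregation levels above can be sketched in plain Python, independently of the pandas pipeline used in this notebook. The trip tuples, the `cellA`/`cellB` labels, and the helper name `dominant_mode` are illustrative assumptions; in the notebook, cells come from h3 indexing.

```python
from collections import Counter, defaultdict

# Hypothetical trips: (journey_id, mode, distance_km, destination_cell)
trips = [
    ("j1", "FOOT", 0.5, "cellA"),
    ("j1", "RAIL_TRIP", 80.0, "cellA"),
    ("j2", "PASSENGER_CAR", 120.0, "cellA"),
    ("j3", "RAIL_TRIP", 60.0, "cellA"),
    ("j4", "PASSENGER_CAR", 30.0, "cellB"),
]


def dominant_mode(journey_trips):
    """Return the mode with the largest summed distance within one journey."""
    distance_by_mode = defaultdict(float)
    for _, mode, distance_km, _ in journey_trips:
        distance_by_mode[mode] += distance_km
    return max(distance_by_mode, key=distance_by_mode.get)


# Level 1: group trips into journeys (input order is assumed chronological)
journeys = defaultdict(list)
for trip in trips:
    journeys[trip[0]].append(trip)

# Level 2: group journeys by the destination cell of their last trip
modes_by_cell = defaultdict(list)
for journey_trips in journeys.values():
    cell = journey_trips[-1][3]
    modes_by_cell[cell].append(dominant_mode(journey_trips))

# Most frequent journey mode per cell
most_common = {
    cell: Counter(modes).most_common(1)[0][0]
    for cell, modes in modes_by_cell.items()
}
print(most_common)
```

Tie-breaking differs between this sketch and the notebook: `Counter.most_common` keeps the first mode encountered, while `series.mode().iloc[0]` returns the first value in sorted order.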



In [None]:
# Configuration
INPUT_CSV_FILE = "sources/data_france_ceremonie_jo_sans_gps.csv" # "sources/data.csv"
ILE_DE_FRANCE_GEOJSON_PERIMETER_FILE = "sources/region-ile-de-france.geojson"
EXODE_OUTPUT_FILE = "../static/data/26_exode.geojson"
EXODE_LINES_OUTPUT_FILE = "../static/data/26_exode_lines.geojson"
INXODE_OUTPUT_FILE = "../static/data/26_inxode.geojson"
INXODE_LINES_OUTPUT_FILE = "../static/data/26_inxode_lines.geojson"

In [2]:
import pandas as pd
from mappymatch.constructs.geofence import Geofence
from shapely.geometry import Point, LineString
from shapely.vectorized import contains
import h3pandas
import geopandas as gpd
import folium
import json
import numpy as np


### Load sources

In [3]:
geofence_idf = Geofence.from_geojson(ILE_DE_FRANCE_GEOJSON_PERIMETER_FILE)

In [4]:
df = pd.read_csv(INPUT_CSV_FILE)

### Add a human-readable transportation mode

In [5]:
tr = {
-10 : "NOT_DEFINED",
0 : "UNKNOWN",
1 : "PASSENGER_CAR",
2 : "MOTORCYCLE",
3 : "HEAVY_DUTY_VEHICLE",
4 : "BUS",
5 : "COACH",
6 : "RAIL_TRIP",
7 : "BOAT_TRIP",
8 : "BIKE_TRIP",
9 : "PLANE",
10 : "SKI",
11 : "FOOT",
12 : "IDLE",
13 : "OTHER",
101 : "SCOOTER",
102 : "HIGH_SPEED_TRAIN"
}
df['transportation_mode_tr'] = df['transportation_mode'].apply(lambda x: tr[x])
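
The lookup `tr[x]` raises a `KeyError` as soon as the CSV contains a code missing from the table. A minimal sketch of a more tolerant lookup follows; the helper `label_for`, the choice of `"UNKNOWN"` as fallback, and the code `999` are illustrative assumptions, not part of the notebook.

```python
# Subset of the notebook's code-to-label table
tr = {
    -10: "NOT_DEFINED",
    0: "UNKNOWN",
    1: "PASSENGER_CAR",
    6: "RAIL_TRIP",
    102: "HIGH_SPEED_TRAIN",
}


def label_for(code, table=tr, fallback="UNKNOWN"):
    """Return the readable label, or the fallback for codes not in the table."""
    return table.get(code, fallback)


codes = [1, 6, 102, 999]  # 999 is a made-up code absent from the table
labels = [label_for(code) for code in codes]
print(labels)
```

Whether an unlisted code should be mapped to a fallback or should stop the run is a data-quality decision; failing loudly, as the notebook does, is also defensible.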

### Consider journeys instead of trips (aggregation)

In [6]:
# Aggregate the main features
agg_main = df.groupby('journey_id').agg(
    starting_longitude=('starting_longitude', 'first'),
    starting_latitude=('starting_latitude', 'first'),
    start_time=('start_time', 'first'),
    ending_longitude=('ending_longitude', 'last'),
    ending_latitude=('ending_latitude', 'last'),
    end_time=('end_time', 'last'),
    user_id=('user_id', 'first')
).reset_index()

# Calculate the sum of distances for each transportation mode within each journey
agg_distance = df.groupby(['journey_id', 'transportation_mode_tr']).agg(
    total_distance=('distance_km', 'sum')
).reset_index()

# Sort the distance aggregation and find the top two transportation modes for each journey
agg_distance_sorted = agg_distance.sort_values(by=['journey_id', 'total_distance'], ascending=[True, False])

# Get the top two transportation modes for each journey
agg_distance_top2 = agg_distance_sorted.groupby('journey_id').head(2).reset_index(drop=True)

# Split the top two transportation modes into separate columns
agg_distance_top2['rank'] = agg_distance_top2.groupby('journey_id').cumcount() + 1
agg_distance_pivot = agg_distance_top2.pivot(index='journey_id', columns='rank', values=['transportation_mode_tr', 'total_distance']).reset_index()

# Rename columns for clarity
agg_distance_pivot.columns = ['journey_id', 
                              'top_transportation_mode_tr', 'second_top_transportation_mode_tr', 
                              'top_transportation_mode_distance', 'second_top_transportation_mode_distance']

# Merge the results
result = pd.merge(agg_main, agg_distance_pivot, on='journey_id', how='left')

In [7]:
result["merge_transportation_mode_tr"] = result["top_transportation_mode_tr"] + result["second_top_transportation_mode_tr"].fillna('')
result

Unnamed: 0,journey_id,starting_longitude,starting_latitude,start_time,ending_longitude,ending_latitude,end_time,user_id,top_transportation_mode_tr,second_top_transportation_mode_tr,top_transportation_mode_distance,second_top_transportation_mode_distance,merge_transportation_mode_tr
0,742817667022,6.464718,48.186218,2024-07-26 11:00:31.264000+00:00,6.174910,48.689155,2024-07-26 14:51:44.347000+00:00,352988,PASSENGER_CAR,,59.890017,,PASSENGER_CAR
1,742823639999,55.298969,-21.217768,2024-07-26 21:01:07.801000+00:00,55.237720,-21.095352,2024-07-27 02:56:51.148000+00:00,2551880,FOOT,,15.197029,,FOOT
2,742988401230,-1.177554,48.068819,2024-07-26 16:05:27.190000+00:00,2.281656,48.599228,2024-07-26 18:44:09.396000+00:00,2770212,PASSENGER_CAR,,269.232315,,PASSENGER_CAR
3,743012284368,-2.572195,47.524249,2024-07-26 12:04:53.627000+00:00,-1.659237,48.098194,2024-07-26 13:58:06.206000+00:00,626198,PASSENGER_CAR,,95.319885,,PASSENGER_CAR
4,743046442648,0.714236,47.404153,2024-07-26 14:32:23.974000+00:00,2.690119,50.625516,2024-07-26 22:09:24.949000+00:00,1485009,PASSENGER_CAR,,97.887792,,PASSENGER_CAR
...,...,...,...,...,...,...,...,...,...,...,...,...,...
363343,1722045575153,-61.523715,16.251679,2024-07-27 01:59:35.163000+00:00,-61.522721,16.252330,2024-07-27 02:28:40.741000+00:00,1644093,FOOT,,1.646118,,FOOT
363344,1722045587659,5.008377,46.844728,2024-07-27 01:59:47.669000+00:00,4.990153,46.833452,2024-07-27 02:01:47.837000+00:00,2930047,RAIL_TRIP,,1.972177,,RAIL_TRIP
363345,1722045592887,5.744103,49.291760,2024-07-27 01:59:52.889000+00:00,5.742220,49.291600,2024-07-27 03:00:03.368000+00:00,1174446,BIKE_TRIP,,40.272155,,BIKE_TRIP
363346,1722045594779,-1.748568,48.026827,2024-07-27 01:59:54.783000+00:00,-1.741409,48.022066,2024-07-27 02:07:12.661000+00:00,74972,RAIL_TRIP,,1.482595,,RAIL_TRIP
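
The ranking of modes inside a single journey, and the concatenated label stored in `merge_transportation_mode_tr`, can be illustrated in plain Python. The trips below are made up and the variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical trips of one journey: (mode, distance_km)
journey_trips = [
    ("FOOT", 0.2),
    ("PASSENGER_CAR", 100.0),
    ("FOOT", 0.3),
    ("RAIL_TRIP", 40.0),
]

# Sum the distance travelled with each mode
totals = defaultdict(float)
for mode, distance_km in journey_trips:
    totals[mode] += distance_km

# Rank modes by total distance, largest first, and keep the top two
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
top_two = ranked[:2]

top_mode = top_two[0][0]
second_mode = top_two[1][0] if len(top_two) > 1 else ""

# Same convention as the notebook: labels are joined without a separator
merged = top_mode + second_mode
print(merged)
```

Joining without a separator yields labels such as `PASSENGER_CARRAIL_TRIP`, which is the form of the combined keys in the color table further down.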


In [8]:
# ignore southern part of the world, data is messed up anyway
# df = result[(result["ending_latitude"] > 35) & (result["starting_latitude"] > 35)]

# Or consider whole data
df = result

### Generate Exode data

These are journeys leaving Ile-de-France (IDF).

In [9]:
# Convert the DataFrame to a GeoDataFrame
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.starting_longitude, df.starting_latitude), crs="EPSG:4326")


In [10]:
geofence_idf_geometry = geofence_idf.geometry

# Keep only journeys starting in IDF (copy so the column assignment below does not trigger SettingWithCopyWarning)
gdf_from_idf = gdf[gdf.geometry.within(geofence_idf_geometry)].copy()

# Create end_geometry for the end points
gdf_from_idf['end_geometry'] = gpd.points_from_xy(gdf_from_idf.ending_longitude, gdf_from_idf.ending_latitude)

# Remove journeys ending in IDF
gdf_exit_idf = gdf_from_idf[~gdf_from_idf['end_geometry'].within(geofence_idf_geometry)]

# Drop both geometry columns; the h3 step below works from the latitude/longitude columns
gdf_exit_idf = gdf_exit_idf.drop(columns=['geometry', 'end_geometry'])
gdf_exit_idf


Unnamed: 0,journey_id,starting_longitude,starting_latitude,start_time,ending_longitude,ending_latitude,end_time,user_id,top_transportation_mode_tr,second_top_transportation_mode_tr,top_transportation_mode_distance,second_top_transportation_mode_distance,merge_transportation_mode_tr
176,743661342011,2.249933,49.103483,2024-07-26 10:03:52.032000+00:00,2.113450,49.459648,2024-07-26 10:37:09+00:00,38302,PASSENGER_CAR,,48.198729,,PASSENGER_CAR
181,743663401000,2.139036,48.788045,2024-07-26 11:05:57.825000+00:00,1.358679,43.605231,2024-07-27 00:22:32.429000+00:00,2944220,BIKE_TRIP,PASSENGER_CAR,555.195777,28.493114,BIKE_TRIPPASSENGER_CAR
184,743663621000,2.304301,48.888637,2024-07-26 22:18:18.655000+00:00,1.332669,49.182481,2024-07-26 23:20:29.999000+00:00,2910719,PASSENGER_CAR,,93.001282,,PASSENGER_CAR
195,743666371999,2.540277,48.670898,2024-07-26 15:16:23.995000+00:00,2.742658,47.975821,2024-07-26 16:49:59.172000+00:00,623542,RAIL_TRIP,,78.748795,,RAIL_TRIP
375,743675514923,2.598341,48.989128,2024-07-26 14:16:40.608000+00:00,57.678856,-20.433268,2024-07-27 02:36:24.557000+00:00,100210,PLANE,,9449.246107,,PLANE
...,...,...,...,...,...,...,...,...,...,...,...,...,...
362462,1722041570491,2.533101,48.793737,2024-07-27 00:53:50.585000+00:00,1.855665,48.274499,2024-07-27 02:42:00.604000+00:00,50227,PASSENGER_CAR,,97.832331,,PASSENGER_CAR
362493,1722041719112,2.309339,48.721391,2024-07-27 00:55:19.116000+00:00,0.500223,48.086437,2024-07-27 02:48:29.314000+00:00,147800,PASSENGER_CAR,RAIL_TRIP,152.354055,3.831696,PASSENGER_CARRAIL_TRIP
363026,1722044112403,2.372216,49.023937,2024-07-27 01:47:13.280000+00:00,0.366331,46.700672,2024-07-27 06:09:32.139000+00:00,138860,PASSENGER_CAR,,322.554216,,PASSENGER_CAR
363137,1722044589650,2.785610,49.072341,2024-07-27 01:43:09.714000+00:00,0.707739,47.435195,2024-07-27 04:33:47.292000+00:00,1856951,PASSENGER_CAR,,278.943928,,PASSENGER_CAR
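
The start-inside / end-outside logic used for Exode (and its mirror image used for Inxode) can be sketched without geopandas. This is only an illustration: a pure-Python ray-casting test stands in for `within`, and a crude rectangle stands in for the real Ile-de-France perimeter. The function `point_in_polygon`, the `region` rectangle, and the three journeys are assumptions.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going east from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Crude rectangle standing in for the region perimeter (assumption)
region = [(1.4, 48.1), (3.6, 48.1), (3.6, 49.3), (1.4, 49.3)]

# Made-up journeys with (lon, lat) start and end points
journeys = [
    {"id": "a", "start": (2.35, 48.85), "end": (1.36, 43.60)},   # leaves the region
    {"id": "b", "start": (2.35, 48.85), "end": (2.50, 48.70)},   # stays inside
    {"id": "c", "start": (-1.18, 48.07), "end": (2.28, 48.60)},  # enters the region
]

exode = [
    j["id"] for j in journeys
    if point_in_polygon(*j["start"], region) and not point_in_polygon(*j["end"], region)
]
inxode = [
    j["id"] for j in journeys
    if not point_in_polygon(*j["start"], region) and point_in_polygon(*j["end"], region)
]
print(exode, inxode)
```

Journeys that start and end inside the region, like `b`, appear in neither output.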


In [11]:
# Optional, map colored by count, not an output

# dfh3 = gdf_exit_idf.h3.geo_to_h3(4, lat_col="ending_latitude", lng_col="ending_longitude", set_index=False)
# df_unique_user = dfh3.drop_duplicates(subset=['h3_04', 'user_id'])
# drawgeoframe = df_unique_user[['h3_04']].groupby(['h3_04']).agg(Count=('h3_04', np.size))
# drawgeoframe=drawgeoframe.reset_index().set_index('h3_04')
# drawgeoframe = drawgeoframe[drawgeoframe['Count'] > 3]
# drawgeoframe = drawgeoframe.h3.h3_to_geo()
# drawgeoframe["center_geom"] = drawgeoframe["geometry"]
# drawgeoframe = drawgeoframe.h3.h3_to_geo_boundary()

# fixed_point = Point(2.333333, 48.866667)

# # Function to create a line from the fixed point to each point
# def create_line(point):
#     return LineString([fixed_point, point])

# # Apply the function to each geometry in the GeoDataFrame
# drawgeoframe['geom'] = drawgeoframe['center_geom'].apply(create_line)


# import branca.colormap as cm
# colormap = cm.LinearColormap(["green", "yellow", "red"], vmin=0, vmax=50)
# drawgeoframe["color"] = drawgeoframe["Count"].apply(lambda x: colormap(x)[:-2])

# start_lat = 48.8915079
# start_long = 2.3495425
# m = folium.Map(location=[start_lat, start_long], zoom_start=13)
# folium.GeoJson(drawgeoframe[["geometry", "color"]], style_function=lambda f: {"color": f['properties']['color']}).add_to(m)
# folium.GeoJson(drawgeoframe[["geom", "color"]].rename(columns={"geom": "geometry"}), style_function=lambda f: {"color": f['properties']['color']}).add_to(m)
# m

In [12]:
# Group by h3 cell (level 4), count journeys ending in cell, and find common transportation mode for each cell
def most_common_value(series):
    return series.mode().iloc[0]
dfh3 = gdf_exit_idf.h3.geo_to_h3(4, lat_col="ending_latitude", lng_col="ending_longitude", set_index=False)

# Avoid counting the same user twice per cell
df_unique_user = dfh3.drop_duplicates(subset=['h3_04', 'user_id'])

# Do the grouping
drawgeoframe = df_unique_user[['h3_04', 'top_transportation_mode_tr']].groupby(['h3_04']).agg(Count=('h3_04', np.size), MostCommonTransport=('top_transportation_mode_tr', most_common_value))
drawgeoframe = drawgeoframe.reset_index().set_index('h3_04')

# Drop cells with fewer than 4 distinct users, to help with anonymity and data quality
drawgeoframe = drawgeoframe[drawgeoframe['Count'] > 3]

# Find center of h3 cell and store the point in center_geom
drawgeoframe = drawgeoframe.h3.h3_to_geo()
drawgeoframe["center_geom"] = drawgeoframe["geometry"]

# Store the hexagon shape in "geometry" (default)
drawgeoframe = drawgeoframe.h3.h3_to_geo_boundary()

drawgeoframe

h3_04,Count,MostCommonTransport,geometry,center_geom
8418443ffffffff,5,PASSENGER_CAR,"POLYGON ((-2.80275 47.77414, -3.11584 47.70003...",POINT (-2.85043 47.54344)
8418459ffffffff,10,PASSENGER_CAR,"POLYGON ((-1.65479 47.60246, -1.96538 47.53150...",POINT (-1.70602 47.37305)
841845bffffffff,12,PASSENGER_CAR,"POLYGON ((-1.44852 47.21421, -1.75690 47.14318...",POINT (-1.50009 46.98444)
841845dffffffff,10,PASSENGER_CAR,"POLYGON ((-2.22661 47.68955, -2.53848 47.61703...",POINT (-2.27609 47.45949)
8418495ffffffff,7,PASSENGER_CAR,"POLYGON ((-1.64883 43.64373, -1.94131 43.56599...",POINT (-1.69710 43.40859)
...,...,...,...,...
84396c3ffffffff,7,PASSENGER_CAR,"POLYGON ((4.77957 44.11133, 4.71094 43.88102, ...",POINT (5.01640 43.96692)
84396c5ffffffff,16,PASSENGER_CAR,"POLYGON ((3.86803 43.84979, 3.80265 43.61928, ...",POINT (4.10406 43.70719)
84a2509ffffffff,18,PASSENGER_CAR,"POLYGON ((55.17462 -21.02656, 55.10968 -21.267...",POINT (55.35782 -21.19699)
84a2543ffffffff,7,PASSENGER_CAR,"POLYGON ((55.05712 -20.61584, 54.99223 -20.856...",POINT (55.23936 -20.78579)
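
Because duplicates of (`h3_04`, `user_id`) are dropped before grouping, `Count` is a number of distinct users rather than a number of journeys. A plain-Python sketch of that counting and of the anonymity threshold follows; the `arrivals` data and the constant name `MIN_USERS` are assumptions for illustration.

```python
from collections import defaultdict

MIN_USERS = 4  # mirrors the notebook's `Count > 3` filter

# Hypothetical (cell, user_id) pairs, one pair per journey
arrivals = (
    [("cellA", user) for user in (1, 1, 2, 3, 4)]
    + [("cellB", user) for user in (5, 5, 5, 6)]
)

# A set per cell ignores repeat journeys by the same user
users_by_cell = defaultdict(set)
for cell, user_id in arrivals:
    users_by_cell[cell].add(user_id)

counts = {cell: len(users) for cell, users in users_by_cell.items()}

# Only cells with enough distinct users are kept for publication
published = {cell: n for cell, n in counts.items() if n >= MIN_USERS}
print(counts, published)
```

Here `cellB` has four journeys but only two distinct users, so it is filtered out.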


In [13]:
fixed_point = Point(2.333333, 48.866667)

# Function to create a line from the fixed point to each point
def create_line(point):
    return LineString([fixed_point, point])

# Store in geom a line between the center of IDF and the center of the cell, this will be used for the lines viz
drawgeoframe['geom'] = drawgeoframe['center_geom'].apply(create_line)
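
For reference, one line of the lines output corresponds to a GeoJSON feature of roughly the following shape. This is a hand-built sketch using only the `json` module, not what geopandas writes byte for byte; the helper `line_feature` is hypothetical, and the sample values are taken from the cell `8418459ffffffff` shown in the table above.

```python
import json

# Same fixed point as in the notebook, as (lon, lat)
IDF_CENTER = (2.333333, 48.866667)


def line_feature(cell_center, count, mode, color):
    """Build a GeoJSON Feature for one line; coordinates are [lon, lat]."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            "coordinates": [list(IDF_CENTER), list(cell_center)],
        },
        "properties": {
            "Count": count,
            "MostCommonTransport": mode,
            "color": color,
        },
    }


feature = line_feature((-1.70602, 47.37305), 10, "PASSENGER_CAR", "orange")
collection = {"type": "FeatureCollection", "features": [feature]}
text = json.dumps(collection)
print(text)
```

GeoJSON stores positions as longitude first, then latitude, which matches the argument order of `Point(2.333333, 48.866667)`.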

In [14]:
# Add color param, depending on the mode
colormap = {
    "PLANE": "red",
    "PASSENGER_CAR": "orange",
    "PASSENGER_CARFOOT": "darkorange",
    "RAIL_TRIP": "green",
    "HIGH_SPEED_TRAIN": "green",
    "HIGH_SPEED_TRAINRAIL_TRIP": "darkgreen",
    "PASSENGER_CARRAIL_TRIP": "yellow",
    "FOOT": "black" # unexpected as the most common mode of a cell
}
drawgeoframe["color"] = drawgeoframe["MostCommonTransport"].apply(lambda x: colormap.get(x, "gray"))

# Save the exode and exode_lines GeoJSON files
drawgeoframe[["geometry", "color", "Count", "MostCommonTransport"]].to_file(EXODE_OUTPUT_FILE, driver="GeoJSON")
drawgeoframe[["geom", "color", "Count", "MostCommonTransport"]].rename(columns={"geom": "geometry"}).to_file(EXODE_LINES_OUTPUT_FILE, driver="GeoJSON")

# Locally display results (optional)
start_lat = 48.8915079
start_long = 2.3495425
m = folium.Map(location=[start_lat, start_long], zoom_start=13)
folium.GeoJson(drawgeoframe[["geometry", "color"]], style_function=lambda f: {"color": f['properties']['color']}).add_to(m)
folium.GeoJson(drawgeoframe[["geom", "color"]].rename(columns={"geom": "geometry"}), style_function=lambda f: {"weight": 0.5, "color": f['properties']['color']}).add_to(m)
m

### Repeat the process for journeys coming into IDF (called "Inxode")

In [15]:
geofence_idf_geometry = geofence_idf.geometry

# Convert the DataFrame to a GeoDataFrame
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.starting_longitude, df.starting_latitude), crs="EPSG:4326")

# Remove journeys starting in IDF (copy so the column assignment below does not trigger SettingWithCopyWarning)
gdf_outside_idf = gdf[~gdf.geometry.within(geofence_idf_geometry)].copy()

# Create end_geometry for the end points
gdf_outside_idf['end_geometry'] = gpd.points_from_xy(gdf_outside_idf.ending_longitude, gdf_outside_idf.ending_latitude)

# Keep only journeys ending in IDF
gdf_enter_idf = gdf_outside_idf[gdf_outside_idf['end_geometry'].within(geofence_idf_geometry)]

# Drop the temporary 'end_geometry' column
gdf_enter_idf = gdf_enter_idf.drop(columns='end_geometry')
gdf_enter_idf


Unnamed: 0,journey_id,starting_longitude,starting_latitude,start_time,ending_longitude,ending_latitude,end_time,user_id,top_transportation_mode_tr,second_top_transportation_mode_tr,top_transportation_mode_distance,second_top_transportation_mode_distance,merge_transportation_mode_tr,geometry
2,742988401230,-1.177554,48.068819,2024-07-26 16:05:27.190000+00:00,2.281656,48.599228,2024-07-26 18:44:09.396000+00:00,2770212,PASSENGER_CAR,,269.232315,,PASSENGER_CAR,POINT (-1.17755 48.06882)
180,743663220051,1.853730,48.267776,2024-07-26 13:24:12.967000+00:00,2.388605,48.693623,2024-07-26 14:14:36.873000+00:00,2865881,PASSENGER_CAR,,78.703723,,PASSENGER_CAR,POINT (1.85373 48.26778)
323,743674442058,1.112734,46.293521,2024-07-26 10:08:15.218000+00:00,2.360907,48.725417,2024-07-26 10:43:00.999000+00:00,2804834,PLANE,UNKNOWN,343.673373,33.788693,PLANEUNKNOWN,POINT (1.11273 46.29352)
328,743674516000,4.726871,49.741582,2024-07-26 10:25:33.574000+00:00,2.423133,48.894987,2024-07-26 13:34:00+00:00,2244158,PASSENGER_CAR,MOTORCYCLE,229.977847,4.957302,PASSENGER_CARMOTORCYCLE,POINT (4.72687 49.74158)
334,743674650014,4.284088,47.321571,2024-07-26 10:08:41.539000+00:00,2.326334,48.876579,2024-07-26 11:47:15.434000+00:00,854366,HIGH_SPEED_TRAIN,PASSENGER_CAR,110.064559,78.395898,HIGH_SPEED_TRAINPASSENGER_CAR,POINT (4.28409 47.32157)
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
362755,1722042886401,0.962303,47.534745,2024-07-27 01:14:46.402000+00:00,2.214424,48.591083,2024-07-27 03:12:44.303000+00:00,2894815,PASSENGER_CAR,,181.233126,,PASSENGER_CAR,POINT (0.96230 47.53474)
362792,1722043052999,0.962284,47.534771,2024-07-27 01:17:33+00:00,1.990391,48.574507,2024-07-27 02:40:42.487000+00:00,2760830,PASSENGER_CAR,,140.496049,,PASSENGER_CAR,POINT (0.96228 47.53477)
363157,1722044682990,3.299817,49.846217,2024-07-27 01:44:42.995000+00:00,2.641531,48.975506,2024-07-27 03:38:52.931000+00:00,873950,PASSENGER_CAR,FOOT,137.192706,0.222663,PASSENGER_CARFOOT,POINT (3.29982 49.84622)
363174,1722044803964,3.923475,49.244399,2024-07-27 01:46:43.968000+00:00,2.146309,48.636293,2024-07-27 03:45:57.769000+00:00,464006,PASSENGER_CAR,,160.964676,,PASSENGER_CAR,POINT (3.92348 49.24440)


In [16]:
# Group by h3 cell (level 4), count journeys starting in cell, and find common transportation mode for each cell
def most_common_value(series):
    return series.mode().iloc[0]
dfh3 = gdf_enter_idf.h3.geo_to_h3(4, lat_col="starting_latitude", lng_col="starting_longitude", set_index=False)

# Avoid counting the same user twice per cell
df_unique_user = dfh3.drop_duplicates(subset=['h3_04', 'user_id'])

# Do the grouping
drawgeoframe = df_unique_user[['h3_04', 'top_transportation_mode_tr']].groupby(['h3_04']).agg(Count=('h3_04', np.size), MostCommonTransport=('top_transportation_mode_tr', most_common_value))
drawgeoframe = drawgeoframe.reset_index().set_index('h3_04')

# Drop cells with fewer than 4 distinct users, to help with anonymity and data quality
drawgeoframe = drawgeoframe[drawgeoframe['Count'] > 3]

# Find center of h3 cell and store the point in center_geom
drawgeoframe = drawgeoframe.h3.h3_to_geo()
drawgeoframe["center_geom"] = drawgeoframe["geometry"]

# Store the hexagon shape in "geometry" (default)
drawgeoframe = drawgeoframe.h3.h3_to_geo_boundary()

drawgeoframe

h3_04,Count,MostCommonTransport,geometry,center_geom
8418443ffffffff,4,PASSENGER_CAR,"POLYGON ((-2.80275 47.77414, -3.11584 47.70003...",POINT (-2.85043 47.54344)
8418453ffffffff,6,PASSENGER_CAR,"POLYGON ((-1.80744 46.91285, -2.11492 46.84021...",POINT (-1.85764 46.68206)
8418459ffffffff,15,PASSENGER_CAR,"POLYGON ((-1.65479 47.60246, -1.96538 47.53150...",POINT (-1.70602 47.37305)
841845bffffffff,5,PASSENGER_CAR,"POLYGON ((-1.44852 47.21421, -1.75690 47.14318...",POINT (-1.50009 46.98444)
841845dffffffff,5,PASSENGER_CAR,"POLYGON ((-2.22661 47.68955, -2.53848 47.61703...",POINT (-2.27609 47.45949)
...,...,...,...,...
84396e9ffffffff,14,PASSENGER_CAR,"POLYGON ((3.50261 43.53072, 3.43871 43.29979, ...",POINT (3.73768 43.38839)
845e463ffffffff,5,PASSENGER_CAR,"POLYGON ((-61.10330 14.41641, -60.92662 14.570...",POINT (-61.13544 14.62772)
84a2509ffffffff,27,PASSENGER_CAR,"POLYGON ((55.17462 -21.02656, 55.10968 -21.267...",POINT (55.35782 -21.19699)
84a2543ffffffff,11,PASSENGER_CAR,"POLYGON ((55.05712 -20.61584, 54.99223 -20.856...",POINT (55.23936 -20.78579)


In [17]:
fixed_point = Point(2.333333, 48.866667)

# Function to create a line from the fixed point to each point
def create_line(point):
    return LineString([fixed_point, point])

# Store in geom a line between the center of IDF and the center of the cell, this will be used for the lines viz
drawgeoframe['geom'] = drawgeoframe['center_geom'].apply(create_line)

In [18]:
# Add color param, depending on the mode
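colormap = {
    "PLANE": "red",
    "PASSENGER_CAR": "orange",
    "RAIL_TRIP": "green",
    "HIGH_SPEED_TRAIN": "green",
    "FOOT": "black" # unexpected as the most common mode of a cell
}
# Fall back to gray for modes missing from the colormap, as in the Exode section
drawgeoframe["color"] = drawgeoframe["MostCommonTransport"].apply(lambda x: colormap.get(x, "gray"))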

# Save inxode and inxode_lines geojsons
drawgeoframe[["geometry", "color", "Count", "MostCommonTransport"]].to_file(INXODE_OUTPUT_FILE, driver="GeoJSON")
drawgeoframe[["geom", "color", "Count", "MostCommonTransport"]].rename(columns={"geom": "geometry"}).to_file(INXODE_LINES_OUTPUT_FILE, driver="GeoJSON")

# Locally display results (optional)
start_lat = 48.8915079
start_long = 2.3495425
m = folium.Map(location=[start_lat, start_long], zoom_start=13)
folium.GeoJson(drawgeoframe[["geometry", "color"]], style_function=lambda f: {"color": f['properties']['color']}).add_to(m)
folium.GeoJson(drawgeoframe[["geom", "color", "Count"]].rename(columns={"geom": "geometry"}), style_function=lambda f: {"weight": int(f['properties']['Count'])/100, "color": f['properties']['color']}).add_to(m)
m
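
In the last cell, the line weight is `Count / 100`, which gives very thin strokes for the counts shown above (0.27 for a count of 27, for example). A hedged sketch of a clamped scaling follows; the function `line_weight` and its default bounds are assumptions chosen for illustration, not values taken from the notebook.

```python
def line_weight(count, scale=100.0, min_weight=0.5, max_weight=5.0):
    """Scale a count to a stroke weight, clamped to a visible range."""
    return max(min_weight, min(max_weight, count / scale))


# Small counts are raised to the minimum, large counts are capped
weights = [line_weight(count) for count in (5, 250, 1000)]
print(weights)
```

Such a function could be called from the `style_function` in place of the raw division, so that every line stays visible while busier cells still stand out.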