# Extracting mile marker values and Route IDs from a GeoDataFrame with point geometries

Suppose we have a GeoDataFrame of point geometries, but with no associated route ID or mile marker information. 
Suppose further that we also have the linework of the reference routes. 

We can use `linref` to extract the route ID and mile marker information for each row of the input GeoDataFrame.

The example below shows the step-by-step process.

In [1]:
# Import dependencies
from shapely.geometry import Point, LineString
import geopandas as gpd
import linref as lr

First, let's create the sample GeoDataFrames we'll be using throughout this example.

In [2]:
# `point_gdf` is the GeoDataFrame containing the point geometries without route IDs or mile marker info.
point_gdf = gpd.GeoDataFrame({
    'point_id':[1,2,3,4],
    'geometry':[
        Point((1,0)),
        Point((2,0)),
        Point((3,2)),
        Point((4,2))]
})


# `ref_gdf` is the GeoDataFrame that contains the reference linework of the roadway network
ref_gdf = gpd.GeoDataFrame({
    'route_id': ['A','B'],
    'beg': [0,0],
    'end': [5,5],
    'geometry': [LineString(((0,0),(5,0))), LineString(((0,2),(5,2)))]
})

The first processing step is to create an `EventsCollection` object for the reference linework:

In [3]:
# Because the reference data is link-based, we specify the `beg` and `end` parameters.
# They name the columns that hold the start and end mile markers of each link.
ref_ec = lr.EventsCollection(
    ref_gdf,
    keys=['route_id'],
    beg='beg',
    end='end',
    geom='geometry'
)

Next, we use the `project` method to match each point in our input `point_gdf` GeoDataFrame to the `ref_gdf` linework.

In [4]:
# First, we need to project the point data onto the reference EventsCollection object.
# This operation generates another EventsCollection object containing
# all the relevant information needed.
proj_ec = ref_ec.project(point_gdf)

# We can then dig into the `.df` attribute of the resulting EventsCollection and pull
# out the required columns. Note that the mile marker gets stored by default in a column
# called `"LOC"`.
point_gdf[['route_id','mile_marker']] = proj_ec.df[['route_id','LOC']]

print(point_gdf)

   point_id                 geometry route_id  mile_marker
0         1  POINT (1.00000 0.00000)        A          1.0
1         2  POINT (2.00000 0.00000)        A          2.0
2         3  POINT (3.00000 2.00000)        B          3.0
3         4  POINT (4.00000 2.00000)        B          4.0


You can see that each row from the input `point_gdf` now has the relevant route ID and mile marker information.
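
For intuition about what the projection step computes, here is a minimal, dependency-free sketch of the underlying idea: find the closest route to each point, measure how far along that route the closest location lies, and scale that distance into the route's `beg` to `end` range. This is not how `linref` is implemented, and the helper names `locate_on_line` and `nearest_route` are hypothetical, introduced only for this illustration. It also assumes every route has nonzero length.

```python
import math

def locate_on_line(point, coords, beg, end):
    """Project `point` onto the polyline `coords`.

    Returns (mile_marker, offset), where `offset` is the distance from the
    point to the line and `mile_marker` is interpolated between `beg` and `end`.
    Hypothetical helper; assumes the polyline has nonzero length.
    """
    px, py = point
    segments = list(zip(coords, coords[1:]))
    total = sum(math.dist(a, b) for a, b in segments)
    best = None      # (offset, distance along the line) of the closest location
    walked = 0.0     # length of the segments already traversed
    for (ax, ay), (bx, by) in segments:
        seg_len = math.dist((ax, ay), (bx, by))
        if seg_len == 0:
            continue
        # Fraction along this segment of the closest location, clamped to [0, 1]
        t = ((px - ax) * (bx - ax) + (py - ay) * (by - ay)) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * (bx - ax), ay + t * (by - ay)
        offset = math.dist((px, py), (cx, cy))
        if best is None or offset < best[0]:
            best = (offset, walked + t * seg_len)
        walked += seg_len
    offset, along = best
    # Scale the distance along the line into the beg..end mile marker range
    return beg + (end - beg) * along / total, offset

def nearest_route(point, routes):
    """Return (route_id, mile_marker) for the route closest to `point`."""
    results = {rid: locate_on_line(point, *spec) for rid, spec in routes.items()}
    rid = min(results, key=lambda r: results[r][1])
    return rid, results[rid][0]

# Same sample data as above, written as plain tuples: (coords, beg, end)
routes = {
    'A': ([(0, 0), (5, 0)], 0, 5),
    'B': ([(0, 2), (5, 2)], 0, 5),
}
points = {1: (1, 0), 2: (2, 0), 3: (3, 2), 4: (4, 2)}

located = {pid: nearest_route(pt, routes) for pid, pt in points.items()}
for pid, (route_id, mile_marker) in located.items():
    print(pid, route_id, mile_marker)
```

For the sample data, this sketch assigns points 1 and 2 to route `A` and points 3 and 4 to route `B`, in line with the table above. On real networks, prefer the library's `project` method, which works directly on GeoDataFrames.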