# Extracting mile marker values and Route IDs from a GeoDataFrame with line geometries

Suppose we have a GeoDataFrame of line geometries, but with no associated route ID or mile marker information. 
Suppose further that we also have the linework of the reference routes. 

We can use `linref` to extract the route ID and mile marker information for each row of the input GeoDataFrame.

The example below shows the step-by-step process.

In [1]:
# Import dependencies
from shapely.geometry import LineString
import geopandas as gpd
import linref as lr

First, let's create the sample GeoDataFrames we'll be using throughout this example.

In [2]:
# `line_gdf` is the GeoDataFrame containing the line geometries without route IDs or mile marker info.
line_gdf = gpd.GeoDataFrame({
    'line_id':[1,2,3,4],
    'geometry':[
        LineString(((1,0),(2,0))),
        LineString(((2,0),(3,0))),
        LineString(((2,2),(3,2))),
        LineString(((3,2),(4,2)))]
})


# `ref_gdf` is the GeoDataFrame that contains the reference linework of the roadway network
ref_gdf = gpd.GeoDataFrame({
    'route_id': ['A','B'],
    'beg': [0,0],
    'end': [5,5],
    'geometry': [LineString(((0,0),(5,0))), LineString(((0,2),(5,2)))]
})

The first processing step is to create an `EventsCollection` object for the reference linework:

In [3]:
# Because the reference data is link-based, we pass the `beg` and `end` parameters.
# They name the columns that hold the start and end mile markers of each link.
ref_ec = lr.EventsCollection(
    ref_gdf,
    keys=['route_id'],
    beg='beg',
    end='end',
    geom='geometry'
)
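
As a quick sanity check on the sample data (a sketch that is not part of the original walkthrough), the snippet below compares each route's geometric length with its mile marker span (`end - beg`). It rebuilds `ref_gdf` so that it runs on its own, and `units_per_marker` is a name introduced here purely for illustration. For this sample both ratios come out as 1.0, meaning one coordinate unit corresponds to one mile marker unit, which makes the projected values easy to verify by eye.

```python
from shapely.geometry import LineString
import geopandas as gpd

# Same reference linework as defined earlier in this example
ref_gdf = gpd.GeoDataFrame({
    'route_id': ['A', 'B'],
    'beg': [0, 0],
    'end': [5, 5],
    'geometry': [LineString(((0, 0), (5, 0))), LineString(((0, 2), (5, 2)))]
})

# Geometric length of each route divided by its mile marker span
# (`units_per_marker` is an illustrative name, not a linref concept)
units_per_marker = ref_gdf.geometry.length / (ref_gdf['end'] - ref_gdf['beg'])
print(units_per_marker.tolist())
```

With real linework the ratio will usually differ from 1.0, since coordinates are often in feet or meters while mile markers are in miles.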

Then, we need to use the `project_parallel` method to find exactly how our input `line_gdf` GeoDataFrame matches up with the `ref_gdf` linework.

In [4]:
# First, we need to project the line data onto the reference EventsCollection object.
# This operation generates another EventsCollection object containing
# all the relevant information needed.
proj_ec = ref_ec.project_parallel(line_gdf)

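# We can then dig into the `.df` attribute of the resulting EventsCollection and pull
# out the required columns. Note that pandas aligns this assignment on the index,
# so it assumes `proj_ec.df` shares the index of `line_gdf`.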
line_gdf[['route_id','beg','end']] = proj_ec.df[['route_id','beg','end']]

print(line_gdf)

   line_id                                       geometry route_id  beg  end
0        1  LINESTRING (1.00000 0.00000, 2.00000 0.00000)        A  1.0  2.0
1        2  LINESTRING (2.00000 0.00000, 3.00000 0.00000)        A  2.0  3.0
2        3  LINESTRING (2.00000 2.00000, 3.00000 2.00000)        B  2.0  3.0
3        4  LINESTRING (3.00000 2.00000, 4.00000 2.00000)        B  3.0  4.0


You can see that each row from the input `line_gdf` now has the relevant route ID and the beginning and ending mile marker information.
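
To build intuition for what the projection yields, here is a minimal sketch of the underlying linear referencing idea using plain `shapely`. It is not how `linref` implements `project_parallel`; the helper `locate_on_route` is hypothetical and makes simplifying assumptions: the matching route is already known, the input line lies along that route, and mile markers scale linearly with distance along the route geometry.

```python
from shapely.geometry import LineString, Point

def locate_on_route(line, route_geom, route_beg, route_end):
    """Return (beg, end) mile markers of `line` measured along `route_geom`.

    Hypothetical helper for illustration only; assumes mile markers vary
    linearly with distance along the route.
    """
    # Mile marker units per unit of geometric length
    scale = (route_end - route_beg) / route_geom.length
    # Project both endpoints of the input line onto the route
    start = Point(line.coords[0])
    stop = Point(line.coords[-1])
    m0 = route_beg + route_geom.project(start) * scale
    m1 = route_beg + route_geom.project(stop) * scale
    # Order the result so that beg <= end regardless of digitizing direction
    return min(m0, m1), max(m0, m1)

# Route A and the first input line from the example above
route_a = LineString(((0, 0), (5, 0)))
line_1 = LineString(((1, 0), (2, 0)))
print(locate_on_route(line_1, route_a, 0, 5))
```

For the first input line this gives mile markers 1.0 and 2.0, matching the `beg` and `end` values in the first row of the output above. The sketch deliberately leaves out the harder part of the problem, which is deciding which reference route each input line belongs to.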