# Demo: backing out HQ transit corridors as lines

The Conservation Biology Institute is working with the Governor's Office of Planning and Research to develop a CEQA Site Check.

**Main hurdle**: Screening parcels using two different buffer ranges (1/2-mile and 1/4-mile) specific to HQ corridors, rather than the single buffer range in the CA HQ Transit Areas dataset (polygons). The two ranges are needed because parcels are filtered into different qualification zones depending on the particular CEQA exemption for streamlining housing development.
<br>**What they have**: an existing modeling workflow to buffer and build out rest of the separate layers.
<br>**What they want**: linestrings they can buffer themselves.
<br>**Solution**: demo how to get linestrings for HQ areas (polygons) using only open data portal products.
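
As a rough illustration of the downstream step described above, here is a minimal sketch of buffering a linestring at both ranges. The corridor geometry is made up, and the choice of EPSG:3310 (California Albers, in meters) as the projected CRS is an assumption; the partner's workflow may use a different projection.

```python
import geopandas as gpd
from shapely.geometry import LineString

# Hypothetical corridor line in WGS84 (lon/lat); not real route data
corridor = gpd.GeoDataFrame(
    {"route_id": ["demo_route"]},
    geometry=[LineString([(-118.25, 34.05), (-118.20, 34.05)])],
    crs="EPSG:4326",
)

# Assumption: project to EPSG:3310 so buffer distances are in meters
METERS_PER_MILE = 1609.344
corridor_m = corridor.to_crs("EPSG:3310")

# Buffer the same line at the two ranges used for screening
half_mile = corridor_m.assign(
    geometry=corridor_m.geometry.buffer(0.5 * METERS_PER_MILE)
)
quarter_mile = corridor_m.assign(
    geometry=corridor_m.geometry.buffer(0.25 * METERS_PER_MILE)
)
```

Buffering in a geographic CRS (degrees) would give distorted distances, which is why the sketch projects first.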


In [1]:
import geopandas as gpd
import pandas as pd




Use open data portal products: `ca_hq_transit_areas` and `ca_transit_routes`.

In [2]:
HQTA_URL = ("https://gis.data.ca.gov/datasets/"
            "863e61eacbf3463ab239beb3cee4a2c3_0.geojson")
ROUTES_URL = ("https://gis.data.ca.gov/datasets/"
              "dd7cb74665a14859a59b8c31d3bc5a3e_0.geojson")

hq_areas = gpd.read_file(HQTA_URL)

routes = gpd.read_file(ROUTES_URL)

While the file is called `CA Transit Routes`, it's important to note that transit routes have different variations, the most basic variation being that a route usually travels in 2 directions. But, depending on the service the operator provides, there can be more variations (the same `route_id` has different `shape_id` values).
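
To see how many variations each route has, one option is to count distinct `shape_id` values per `route_id`. The sketch below uses a small made-up table rather than the real `routes` data; only the column names `route_id` and `shape_id` come from the dataset.

```python
import pandas as pd

# Made-up example: route "10" has three shape variations, route "22" has two
routes_demo = pd.DataFrame({
    "route_id": ["10", "10", "10", "22", "22"],
    "shape_id": ["10_north", "10_south", "10_short_turn",
                 "22_east", "22_west"],
})

# Number of distinct shapes for each route
shapes_per_route = routes_demo.groupby("route_id")["shape_id"].nunique()
print(shapes_per_route)
```

On the real data, the grouping would probably also need the operator identifier, since `route_id` values are not guaranteed to be unique across operators.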

We can clip `routes` to the HQTA polygons to get a much smaller set of routes.

On this much smaller `routes` file, we should dissolve to combine all the variations (`shape_id`) for a given route.

Note: the clip and the dissolve can be done in either order, but clipping first throws away the portions outside the HQTAs, which makes the dissolve much quicker.
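
A small sketch of why the order does not matter for the result, using two made-up parallel lines for one route and a rectangular mask standing in for an HQTA polygon. All geometries and values here are illustrative.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Two hypothetical shape variations of the same route
lines = gpd.GeoDataFrame(
    {"route_id": ["A", "A"], "shape_id": ["A_out", "A_back"]},
    geometry=[
        LineString([(0, 0), (10, 0)]),
        LineString([(0, 1), (10, 1)]),
    ],
)

# Stand-in for an HQTA polygon
area = box(2, -1, 6, 2)

# Order 1: clip each variation, then dissolve to one row per route
clip_first = gpd.clip(lines, area).dissolve(by="route_id")

# Order 2: dissolve to one row per route, then clip
dissolve_first = gpd.clip(lines.dissolve(by="route_id"), area)

# Both orders keep the same length of line inside the mask
print(clip_first.geometry.length.iloc[0])
print(dissolve_first.geometry.length.iloc[0])
```

The difference is only in cost: with the clip first, the dissolve has far less geometry to union.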

In [3]:
routes2 = routes.clip(hq_areas)

# Now that it's clipped, get rid of shape variations
# and dissolve to routes
routes2 = routes2.drop(
    # we don't need these columns later in the analysis
    columns=["shape_id", "n_trips", "uri"]
).dissolve(
    # dissolve by a set of identifiers that uniquely identifies routes
    by=["org_id", "agency", "route_id",
        "route_type", "route_name",
        "base64_url"]
).reset_index()

One more check: are more routes present in `CA Transit Routes` than are included in the HQTAs? Yes, even after clipping.

Let's get rid of those.

In [4]:
routes_in_transit_routes = routes2[["org_id", "route_id"]].drop_duplicates()
routes_in_hqta = hq_areas[["org_id_primary", "route_id"]].drop_duplicates()

print(f"# routes in transit routes: {len(routes_in_transit_routes)}")
print(f"# routes in hqtas: {len(routes_in_hqta)}")

# routes in transit routes: 2209
# routes in hqtas: 2139
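
To list which routes account for a difference like this, one possible approach is a left merge with `indicator=True`, keeping the rows found only on the left. The tables below are made up; only the column names mirror the real data.

```python
import pandas as pd

# Made-up stand-ins for the two de-duplicated route lists
transit_routes = pd.DataFrame({
    "org_id": ["org1", "org1", "org2"],
    "route_id": ["1", "2", "7"],
})
hqta_routes = pd.DataFrame({
    "org_id_primary": ["org1", "org2"],
    "route_id": ["1", "7"],
})

# Left merge with an indicator column recording where each row was found
compared = pd.merge(
    transit_routes,
    hqta_routes.rename(columns={"org_id_primary": "org_id"}),
    on=["org_id", "route_id"],
    how="left",
    indicator=True,
)

# Rows present in the transit routes but absent from the HQTAs
only_in_transit_routes = compared[compared["_merge"] == "left_only"]
print(only_in_transit_routes)
```

An inner merge on the same keys drops these rows without needing to list them.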


Merge the clipped/dissolved routes (with linestrings) with the HQ areas.

Putting the routes on the left makes the linestrings the primary geometry,
not the polygons (in fact, we drop the polygon geometry from the HQ areas).

In [5]:
routes_in_hq_areas = pd.merge(
    routes2[["org_id", "route_id", "geometry"]],
    hq_areas.rename(columns = {"org_id_primary": "org_id"}
                   ).drop(columns = ["Shape_Length", "Shape_Area", "geometry"]),
    on = ["org_id", "route_id"],
    how = "inner",
)
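
The sketch below illustrates two properties of this kind of merge on made-up data: with the GeoDataFrame on the left, the result stays a GeoDataFrame carrying the line geometry, and a route that matches several HQ area rows is repeated once per match. The geometries and attribute values are illustrative only.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, Point

# One hypothetical route with a line geometry
lines = gpd.GeoDataFrame(
    {"org_id": ["org1"], "route_id": ["1"]},
    geometry=[LineString([(0, 0), (1, 1)])],
)

# Two hypothetical HQ area polygons tied to the same route
areas = gpd.GeoDataFrame(
    {
        "org_id": ["org1", "org1"],
        "route_id": ["1", "1"],
        "hqta_type": ["hq_corridor_bus", "major_stop_bus"],
    },
    geometry=[Point(0, 0).buffer(1), Point(1, 1).buffer(1)],
)

# GeoDataFrame on the left; polygon geometry dropped from the right
merged = pd.merge(
    lines,
    areas.drop(columns="geometry"),
    on=["org_id", "route_id"],
    how="inner",
)

print(type(merged).__name__)
print(merged[["route_id", "hqta_type"]])
```

The repeated rows are expected: each one pairs the same linestring with a different HQ area record.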

In [6]:
routes_in_hq_areas.head()

| | org_id | route_id | geometry | OBJECTID | agency_primary | agency_secondary | hqta_type | hqta_details | base64_url_primary | base64_url_secondary | org_id_secondary |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | rec0FfOvKIMZu1Qjs | RouteA-Red | MULTILINESTRING ((-118.19941 33.92756, -118.19... | 492 | City of Lynwood | | hq_corridor_bus | stop_along_hq_bus_corridor_single_operator | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | | |
| 1 | rec0FfOvKIMZu1Qjs | RouteA-Red | MULTILINESTRING ((-118.19941 33.92756, -118.19... | 7965 | City of Lynwood | City of Lynwood | major_stop_bus | intersection_2_bus_routes_same_operator | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | rec0FfOvKIMZu1Qjs |
| 2 | rec0FfOvKIMZu1Qjs | RouteA-Red | MULTILINESTRING ((-118.19941 33.92756, -118.19... | 7966 | City of Lynwood | City of Lynwood | major_stop_bus | intersection_2_bus_routes_same_operator | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | rec0FfOvKIMZu1Qjs |
| 3 | rec0FfOvKIMZu1Qjs | RouteA-Red | MULTILINESTRING ((-118.19941 33.92756, -118.19... | 7967 | City of Lynwood | City of Lynwood | major_stop_bus | intersection_2_bus_routes_same_operator | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | rec0FfOvKIMZu1Qjs |
| 4 | rec0FfOvKIMZu1Qjs | RouteA-Red | MULTILINESTRING ((-118.19941 33.92756, -118.19... | 7968 | City of Lynwood | City of Lynwood | major_stop_bus | intersection_2_bus_routes_same_operator | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | aHR0cHM6Ly9naXRodWIuY29tL0xBQ01UQS9sb3MtYW5nZW... | rec0FfOvKIMZu1Qjs |
