# Script to fetch concept data

* Current data source: `../data/raw/folkersma_concept`
* Future data source: **GeoFA** (rewrite code into .py script for QGIS)
* The user-defined study area (currently set in the `01_define_area` notebook) is used to extract the edges that fall within it from the combined set of all concept networks

In [6]:
import os
os.environ['USE_PYGEOS'] = '0' # must be set before importing geopandas; avoids the pygeos/shapely 2.0/osmnx conflict
import geopandas as gpd
import matplotlib.pyplot as plt
import contextily as cx

**Read in and preprocess raw Folkersma data of concept networks**

In [7]:
# read in files
nodes = gpd.read_file("../data/raw/folkersma_concept/node.shp")
edges = gpd.read_file("../data/raw/folkersma_concept/stretch.shp")

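# the edge file carries no crs of its own: assign the crs of the nodes (no reprojection)
edges = edges.set_crs(nodes.crs, allow_override=True)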

# convert both to projected crs
nodes = nodes.to_crs("EPSG:25832")
edges = edges.to_crs("EPSG:25832")

# remove empty geometries
edges = edges[~edges.geometry.isna()].reset_index(drop=True)

# split multi-part geometries so that each row holds exactly one LineString,
# then assert that this is the case and that all geometries are valid
edges = edges.explode(index_parts=False).reset_index(drop=True)
assert all(edges.geom_type == "LineString")
assert all(edges.geometry.is_valid)

# rectify attributes (ratings): fill missing ratings with 0 and cast to integer
edges["myattribute"] = edges["rating"].fillna(0).astype(int)

# classify manually: edges without a rating (0) are assigned class 1
edges.loc[edges["myattribute"] == 0, "myattribute"] = 1
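The preprocessing steps above can be tried on a small in-memory example. This is a minimal sketch on made-up toy data (the geometries and ratings below are hypothetical, not taken from the Folkersma files); it only illustrates what dropping empty geometries, exploding multi-part geometries, and filling missing ratings does to the rows.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString

# Toy data (hypothetical): one MultiLineString, one LineString without a
# rating, and one row without any geometry
toy = gpd.GeoDataFrame(
    {"rating": [2.0, None, 3.0]},
    geometry=[
        MultiLineString([[(0, 0), (1, 0)], [(1, 0), (1, 1)]]),
        LineString([(0, 0), (0, 1)]),
        None,
    ],
    crs="EPSG:25832",
)

# remove empty geometries
toy = toy[~toy.geometry.isna()].reset_index(drop=True)

# split multi-part geometries: every part becomes its own row and
# inherits the attributes of the original row
toy = toy.explode(index_parts=False).reset_index(drop=True)

# missing ratings become 0, which is then mapped to class 1
toy["myattribute"] = toy["rating"].fillna(0).astype(int)
toy.loc[toy["myattribute"] == 0, "myattribute"] = 1

print(toy.geom_type.tolist())
print(toy["myattribute"].tolist())
```

In this toy case the MultiLineString is split into two rows that both keep the rating 2, while the unrated LineString ends up in class 1.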

**Cut to study area extent**

In [8]:
# Read in study area (assumes the file holds a single polygon in its first row)
study_area = gpd.read_file("../data/raw/user_input/study_area.gpkg")
study_area_polygon = study_area.loc[0, "geometry"]
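The spatial selection that follows only gives meaningful results if the study area polygon is in the same CRS as the edges (EPSG:25832). The notebook does not state which CRS the study area file uses, so the following is a hedged sketch of a defensive check, using a hypothetical study area in lon/lat coordinates in place of the real file.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical study area in geographic coordinates (assumption: the real
# file may or may not already be in EPSG:25832)
study_area = gpd.GeoDataFrame(
    geometry=[box(12.0, 55.0, 12.5, 55.5)], crs="EPSG:4326"
)
target_crs = "EPSG:25832"  # projected crs used for nodes and edges

# reproject only if needed, so both layers share one coordinate system
if study_area.crs != target_crs:
    study_area = study_area.to_crs(target_crs)

# the notebook reads the polygon from the first row, so make sure
# there is exactly one row to read from
assert len(study_area) == 1
study_area_polygon = study_area.geometry.iloc[0]

print(study_area.crs.to_epsg())
```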

In [17]:
# Find all edges that intersect the study area polygon
# (edges are kept whole; they are not clipped at the polygon boundary)
edges_in_study_area = edges[
    edges.intersects(study_area_polygon)].copy().reset_index(drop=True)
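Selecting with `intersects` keeps every edge that touches the study area in full, including the parts that lie outside it. If edges should instead be cut at the boundary, `gpd.clip` is one option. The sketch below compares the two on hypothetical toy lines and a square toy study area; it is an illustration, not part of the notebook's workflow.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical square study area and three toy lines
area = box(0, 0, 10, 10)
lines = gpd.GeoDataFrame(
    {"name": ["inside", "crossing", "outside"]},
    geometry=[
        LineString([(1, 1), (9, 1)]),      # fully inside
        LineString([(5, 5), (15, 5)]),     # crosses the boundary
        LineString([(20, 20), (30, 20)]),  # fully outside
    ],
    crs="EPSG:25832",
)

# selection as in the notebook: whole edges that intersect the area
selected = lines[lines.intersects(area)].copy().reset_index(drop=True)

# alternative: cut edges at the boundary of the area
clipped = gpd.clip(lines, area)

# compare lengths per line name
selected_len = selected.set_index("name").length
clipped_len = clipped.set_index("name").length
print(selected_len)
print(clipped_len)
```

Both approaches drop the line outside the area. The crossing line keeps its full length of 10 units with `intersects`, but is shortened to the 5 units inside the square with `gpd.clip`.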

**Save as a separate file to use as input for QGIS processing**

In [18]:
edges_in_study_area.to_file(
    "../data/processed/user_output/qgis_input_concept.gpkg", index=False)