## Vector data in Python

Setup: https://carpentries-incubator.github.io/geospatial-python/index.html

Instruction: https://carpentries-incubator.github.io/geospatial-python/07-vector-data-in-python.html

Download the following files and place them in a 'data' folder:
* [brpgewaspercelen_definitief_2020_small.gpkg](https://figshare.com/ndownloader/files/37729413)
* [brogmwvolledigeset.zip](https://figshare.com/ndownloader/files/37729416)
* [status_vaarweg.zip](https://figshare.com/ndownloader/files/37729419)

Objectives:
* Load spatial objects.
* Select the spatial objects within a bounding box.
* Perform a CRS conversion of spatial objects.
* Select features of spatial objects.
* Match objects in two datasets based on their spatial relationships.

Before executing the code cells, replace each "_____" placeholder with the appropriate value.

In [None]:
# first import necessary libraries
import geopandas as gpd

In [None]:
#  use the geopandas package to load the crop field vector data we downloaded
fields = gpd.read_file("data/brpgewaspercelen_definitief_2020_small.gpkg")
print(fields.crs)
fields

In [None]:
# Define the bounding box in the CRS of the data
# note: the '_' in the numbers is only a digit separator for readability
xmin, xmax = (110_000, 140_000)
ymin, ymax = (470_000, 510_000)
bbox = (xmin, ymin, xmax, ymax)
print(bbox)
# Other option:
# use the “Draw Rectangular Polygon” tool on https://geojson.io/. It works in EPSG:4326 (WGS 84),
# so the box and the data must first be brought into the same CRS.
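
If the box comes from a tool that works in WGS 84, one option is to reproject the box rather than the data. The sketch below assumes the data CRS is EPSG:28992 (the code used for the wells later in this notebook); the longitude and latitude values are hypothetical, not taken from the lesson.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical rectangle in lon/lat degrees (EPSG:4326), e.g. drawn on geojson.io
lon_min, lat_min, lon_max, lat_max = 4.7, 52.2, 5.1, 52.5

# Wrap the rectangle in a GeoSeries so it carries a CRS, then reproject it
# to EPSG:28992 (assumed to be the CRS of the fields data)
rect_wgs84 = gpd.GeoSeries([box(lon_min, lat_min, lon_max, lat_max)], crs="EPSG:4326")
rect_rd = rect_wgs84.to_crs(epsg=28992)

# total_bounds is ordered (xmin, ymin, xmax, ymax), the same order as bbox
bbox_rd = tuple(rect_rd.total_bounds)
print(bbox_rd)
```

The resulting tuple is in metres and can be compared directly with the hand-written `bbox` above.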

In [None]:
# Partially load data within the bounding box
fields = gpd.read_file("data/brpgewaspercelen_definitief_2020_small.gpkg", bbox= "_____")

In [None]:
# Plot the overview
fields."_____"()

In [None]:
# show the geometry types
fields."_____"

In [None]:
# show the coordinate reference system (crs)
fields."_____"

In [None]:
# show the bounds (total_bounds)
fields."_____"

In [None]:
# Use a smaller bounding box to crop the data without reloading it
xmin, xmax = (120_000, 135_000)
ymin, ymax = (485_000, 500_000)

# The coordinate indexer (cx) slices by x and y ranges.
# Docs: https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.cx.html
fields_cx = fields.cx["_____":"_____", "_____":"_____"]
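
To see how `cx` behaves without the lesson data, here is a minimal sketch on four made-up points. The names and coordinates are hypothetical. Note that `cx` selects whole features that intersect the rectangle; it does not cut geometries at the rectangle's edge.

```python
import geopandas as gpd
from shapely.geometry import Point

# Four toy points (hypothetical data, no CRS needed for this illustration)
pts = gpd.GeoDataFrame(
    {"name": ["a", "b", "c", "d"]},
    geometry=[Point(0, 0), Point(5, 5), Point(10, 10), Point(5, 20)],
)

# .cx[xmin:xmax, ymin:ymax] keeps the features that intersect the rectangle
subset = pts.cx[4:11, 4:11]
print(subset["name"].tolist())  # ['b', 'c']
```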


In [None]:
# Export data to file (extension .shp)
fields_cx.to_file("_____")

## Selecting spatial features

In [None]:
# load our exported data
fields = gpd.read_file("_____")

# load the groundwater monitoring wells
wells = gpd.read_file("data/brogmwvolledigeset.zip")

In [None]:
# plot the wells with marker size 0.1
wells.plot(markersize="_____")

In [None]:
# change projection to match fields
wells = wells.to_crs(epsg="_____")

In [None]:
# clip the wells to the cropped fields: only wells that fall inside a field are kept
# note: this will take some time
wells_clip = wells.clip(fields)
wells_clip
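
As a small illustration of what `clip` keeps, the sketch below clips three made-up points with a mask of two rectangles. All values are hypothetical; the CRS is set only so that both layers agree.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Toy point layer (hypothetical stand-in for the wells)
points = gpd.GeoDataFrame(
    {"id": [1, 2, 3]},
    geometry=[Point(1, 1), Point(3, 3), Point(9, 9)],
    crs="EPSG:28992",
)

# Toy polygon layer used as the clipping mask (stand-in for the fields)
mask = gpd.GeoDataFrame(
    geometry=[box(0, 0, 2, 2), box(2.5, 2.5, 4, 4)],
    crs="EPSG:28992",
)

# Only points that fall inside at least one mask polygon survive
clipped = points.clip(mask)
print(sorted(clipped["id"].tolist()))  # [1, 2]
```

The result is sorted here because the row order of the clipped output should not be relied on.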

In [None]:
# buffer the fields by 50 m so that wells near the fields, not only inside them, are matched
buffer = fields.buffer("_____")

# to keep the other columns, assign it to the GeoDataFrame as a geometry column
fields_buffer = fields.copy()
fields_buffer['geometry'] = buffer 

#plot the result
fields_buffer."_____"()

In [None]:
# use the dissolve function to dissolve the buffers into one shape
fields_buffer_dissolve = fields_buffer.dissolve()
fields_buffer_dissolve
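
The buffer-then-dissolve pattern can be checked on toy data. In this sketch two made-up squares 5 m apart are buffered by 5 m, so the buffers overlap and dissolve into a single polygon. The column name and coordinates are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Two toy squares with a 5 m gap between them (hypothetical data)
squares = gpd.GeoDataFrame(
    {"crop": ["wheat", "maize"]},
    geometry=[box(0, 0, 10, 10), box(15, 0, 25, 10)],
    crs="EPSG:28992",
)

# buffer() returns geometries only, so copy the frame to keep the other columns
buffered = squares.copy()
buffered["geometry"] = squares.buffer(5)

# dissolve() without arguments merges all rows into one
merged = buffered.dissolve()
print(len(buffered), len(merged))         # 2 1
print(merged.geometry.iloc[0].geom_type)  # Polygon
```

If the buffers did not touch, the single dissolved row would hold a MultiPolygon instead.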

In [None]:
# try the clip again, it'll be much faster
wells_clip_buffer = wells.clip(fields_buffer_dissolve)
#print(wells_clip_buffer)
# plot the result
wells_clip_buffer."_____"()

In [None]:
# Exercise: clip the fields to the area within 500 m of the wells
# and visualize the result.
fields = gpd.read_file("fields_cropped.shp")
wells = gpd.read_file("data/brogmwvolledigeset.zip")

# Crop wells with bounding box of fields plus buffer
# note: wells data might be too big to buffer, so best to crop it first 
xmin, ymin, xmax, ymax = fields.total_bounds
wells = wells.to_crs(28992)
wells_cx = wells.cx[xmin-500:xmax+500, ymin-500:ymax+500]

# Create wells buffer
wells_cx_500mbuffer = wells_cx.copy()
wells_cx_500mbuffer['geometry'] = wells_cx.buffer(500)

# Clip fields by the wells
fields_clip_buffer = fields.clip(wells_cx_500mbuffer)

#plot the result
fields_clip_buffer.plot()

## Spatially join the features

In [None]:
# Join fields and wells_cx_500mbuffer
# and display the shape
fields_wells_buffer = fields.sjoin("_____")
print(fields_wells_buffer.shape)
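
A spatial join can return more rows than the left table has. The sketch below uses made-up fields and well buffers to show why: one toy field intersects two buffers, so it appears twice in the result. All names and values are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Three toy fields (hypothetical)
fields_demo = gpd.GeoDataFrame(
    {"crop": ["wheat", "maize", "grass"]},
    geometry=[box(0, 0, 10, 10), box(20, 0, 30, 10), box(100, 100, 110, 110)],
    crs="EPSG:28992",
)

# Two toy wells, both inside the first field, buffered by 3 m
wells_demo = gpd.GeoDataFrame(
    {"well_id": ["w1", "w2"]},
    geometry=[Point(2, 5), Point(8, 5)],
    crs="EPSG:28992",
)
zones = wells_demo.copy()
zones["geometry"] = wells_demo.buffer(3)

# Default sjoin: inner join on the "intersects" predicate
joined = fields_demo.sjoin(zones)
print(joined.shape[0])        # 2
print(joined.index.tolist())  # [0, 0]
```

The repeated index value is the duplication that the next cell removes with `index.unique()`.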

In [None]:
# A polygon can fall into several buffers, which duplicates its index in the join result.
# Get the unique index values and use the iloc indexer to select those fields.
idx = fields_wells_buffer.index.unique()
fields_in_buffer = fields.iloc["_____"]

#plot the result
fields_in_buffer."_____"()
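
Selecting with `iloc` works in the cell above because a freshly loaded table has the default index 0, 1, 2, ..., so labels and positions coincide. If the index were anything else, the label-based `loc` would be the safer choice. A minimal pandas sketch with a hypothetical table:

```python
import pandas as pd

# Hypothetical table whose index labels do not start at 0
df = pd.DataFrame({"crop": ["wheat", "maize", "grass"]}, index=[10, 11, 12])

# De-duplicate the labels, as index.unique() does for a join result
labels = pd.Index([11, 11, 12]).unique()

# loc selects by label, so it finds the rows labelled 11 and 12
by_label = df.loc[labels]
print(by_label["crop"].tolist())  # ['maize', 'grass']

# df.iloc[labels] would raise IndexError here: positions 11 and 12 do not exist
```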

## Modify the geometry of a GeoDataFrame

In [None]:
# load and visualize the Dutch waterway lines file status_vaarweg.zip
waterways_nl = gpd.read_file("_____")
waterways_nl."_____"()

In [None]:
# Take a look at what makes up the geometry column of waterways_nl
waterways_nl['geometry']

In [None]:
# Print the 'geometry' of the third row and its 'type'
print(waterways_nl['geometry']["_____"])
print(type(waterways_nl['geometry']["_____"]))

In [None]:
# use shapely to flip the geometry
# import the submodule explicitly so that shapely.ops is guaranteed to be available
import shapely.ops

# Define a function flipping the x and y coordinate values
def flip(geometry):
    return shapely.ops.transform(lambda x, y: (y, x), geometry)
# Docs for shapely.ops.transform: https://shapely.readthedocs.io/en/stable/manual.html#shapely.ops.transform
# More on lambda functions: https://realpython.com/python-lambda/

# Apply this function to every geometry in the column
geom_corrected = waterways_nl['geometry'].apply(flip)
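
The flip can be verified on a single made-up line before trusting it on the whole dataset. This sketch assumes 2D coordinates, since the two-argument lambda would not accept a z value. The coordinate values are hypothetical.

```python
from shapely.geometry import LineString
from shapely.ops import transform

# Hypothetical line whose coordinates were stored in (y, x) order by mistake
line = LineString([(52.0, 4.0), (52.5, 4.5)])

# Swap the two coordinate values of every vertex
flipped = transform(lambda x, y: (y, x), line)
print(list(flipped.coords))  # [(4.0, 52.0), (4.5, 52.5)]
```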

In [None]:
# Update geometry
waterways_nl['geometry'] = geom_corrected

# Visualization
waterways_nl.plot()

In [None]:
# Export updated geometry
waterways_nl.to_file('waterways_nl_corrected.shp')