# Notebook - Fractopo – KB7 Trace Data Validation

In [None]:
import warnings

warnings.filterwarnings("ignore", message="The Shapely GEOS")
warnings.filterwarnings("ignore", message="In a future version, ")
warnings.filterwarnings("ignore", message="No data for colormapping provided via")
warnings.filterwarnings(
    "ignore", message="Shapely 2.0 is installed, but because PyGEOS is also installed"
)

In [None]:
from pathlib import Path

import geopandas as gpd
from fractopo import Validation
import matplotlib.pyplot as plt
from shapely.geometry import box
import numpy as np

## Data (KB7)

In [None]:
traces_path = Path("../../tests/sample_data/KB7/KB7_traces.geojson")
area_path = Path("../../tests/sample_data/KB7/KB7_area.geojson")

traces = gpd.read_file(traces_path)
area = gpd.read_file(area_path)

# Name the dataset
name = "KB7"

## Validation (KB7)

In [None]:
# Create the validation object with fixing (i.e. modification of data) allowed.
# AREA_EDGE_SNAP_MULTIPLIER is set explicitly so that the
# TRACE UNDERLAPS TARGET AREA error is still caught if the default value changes.
kb7_validation = Validation(
    traces, area, name=name, allow_fix=True, AREA_EDGE_SNAP_MULTIPLIER=2.5
)

In [None]:
# Run the validation and capture the validated trace GeoDataFrame
kb7_validated = kb7_validation.run_validation()

## Validation results (KB7)

In [None]:
# Normal DataFrame methods are available for data inspection
kb7_validated.columns

In [None]:
# Convert column data to string to allow hashing and return all unique
# validation errors.
kb7_validated["VALIDATION_ERRORS"].astype(str).unique()
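
The unique combinations above do not show how often each error occurs. The following is a minimal sketch of counting individual error types, assuming each entry of the error column is a list of error strings (which is why the string conversion is needed for hashing). The `validation_errors` list is a hypothetical stand-in for the real column, not KB7 data.

```python
from collections import Counter

# Hypothetical stand-in for kb7_validated["VALIDATION_ERRORS"]:
# one list of error strings per trace (an empty list means no errors).
validation_errors = [
    ["MULTI JUNCTION"],
    [],
    ["V NODE", "MULTI JUNCTION"],
    ["TRACE UNDERLAPS TARGET AREA"],
    [],
]

# Flatten the per-trace lists and count occurrences of each error type.
error_counts = Counter(err for errors in validation_errors for err in errors)

# Print the error types from most to least frequent.
for error, count in error_counts.most_common():
    print(f"{error}: {count}")
```

The same `Counter` expression should work when iterating directly over the real column, provided the list-valued assumption holds.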

In [None]:
# A more detailed summary function is available in fractopo.cli
from fractopo.cli import describe_results

describe_results(kb7_validated, kb7_validation.ERROR_COLUMN)

The KB7 dataset contains the errors listed above. Of these, `MULTI JUNCTION`, `V NODE`, `STACKED TRACES` and `TRACE UNDERLAPS TARGET AREA` disrupt further analysis.

See documentation: https://nialov.github.io/fractopo/validation/errors.html

## Visualization of `MULTI JUNCTION` and `TRACE UNDERLAPS TARGET AREA` errors in notebook

Although the errors can be visualized here, GIS software (e.g. QGIS, ArcGIS) is far more interactive and is recommended for further error inspection and for the actual fixing.

### MULTI JUNCTION

In [None]:
are_in_box = kb7_validated.intersects(box(466020, 6692090, 466030, 6692105))
are_multi_junction = [
    "MULTI JUNCTION" in err for err in kb7_validated[kb7_validation.ERROR_COLUMN]
]
kb7_multijunctions = kb7_validated.loc[np.logical_and(are_in_box, are_multi_junction)]
kb7_multijunctions

In [None]:
kb7_multijunctions.plot()

In [None]:
kb7_multijunctions.plot(colors=["red", "black", "blue", "orange", "green"])

The plot shows that the green and blue traces abut at their endpoints,
which is not a valid topology for traces.
The fix is to merge the green and blue traces into a single trace.

Additionally, the orange trace has a dangling end instead of being accurately snapped to the black trace.
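
Merging two traces that abut at their endpoints can be sketched with `shapely.ops.linemerge`. This is only an illustration on hypothetical coordinates, not the KB7 traces and not necessarily how the fix was made in practice; as noted above, GIS software is the recommended tool for the actual edit.

```python
from shapely.geometry import LineString
from shapely.ops import linemerge

# Hypothetical coordinates: two traces that abut at the shared endpoint (1, 1).
green = LineString([(0, 0), (1, 1)])
blue = LineString([(1, 1), (2, 0)])

# linemerge joins lines that meet end to end into one continuous line.
merged = linemerge([green, blue])

print(merged.geom_type)
print(len(merged.coords))
```

The merged geometry keeps all vertices of both inputs, with the shared endpoint appearing only once.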

### TRACE UNDERLAPS TARGET AREA

In [None]:
# Find TRACE UNDERLAPS TARGET AREA erroneous traces in GeoDataFrame
kb7_underlaps = kb7_validated.loc[
    [
        "TRACE UNDERLAPS TARGET AREA" in err
        for err in kb7_validated[kb7_validation.ERROR_COLUMN]
    ]
]
kb7_underlaps

In [None]:
# Create the figure and axes
fig, ax = plt.subplots()

# Plot the underlapping trace along with the trace area boundary
kb7_underlaps.plot(ax=ax, color="red")
area.boundary.plot(ax=ax, color="black")

# Get the bounds of the underlapping trace
minx, miny, maxx, maxy = kb7_underlaps.total_bounds

# Zoom to the trace with a margin of 0.5 coordinate units on each side
ax.set_xlim(minx - 0.5, maxx + 0.5)
ax.set_ylim(miny - 0.5, maxy + 0.5)

The plot shows that the trace underlaps the target area at least at its northern end and possibly at its southern end. The fix is to extend the trace to meet the target area boundary.
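
Whether an endpoint falls short of the boundary can also be checked numerically rather than by eye. The sketch below measures the gap between each endpoint of a trace and the target area boundary using shapely. The geometries and the `tolerance` value are hypothetical and chosen only for illustration; they are not the KB7 data, and the check is not claimed to be the one fractopo performs internally.

```python
from shapely.geometry import LineString, Point, box

# Hypothetical target area and trace; not taken from the KB7 data.
target_area = box(0, 0, 10, 10)
trace = LineString([(5, 0.3), (5, 9.9)])

# Distance from each trace endpoint to the target area boundary.
start = Point(trace.coords[0])
end = Point(trace.coords[-1])
start_gap = start.distance(target_area.boundary)
end_gap = end.distance(target_area.boundary)

# Hypothetical tolerance: endpoints closer than this to the boundary,
# but not on it, are flagged as possible underlaps.
tolerance = 0.5
for label, gap in (("start", start_gap), ("end", end_gap)):
    status = "possible underlap" if 0 < gap <= tolerance else "ok"
    print(f"{label}: gap={gap:.2f} -> {status}")
```

A small nonzero gap at an endpoint suggests the trace was meant to reach the boundary and should be extended to it.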