# How to use the process area

Follow this step-by-step guide to learn how to use the [ProcessArea](https://geospaitial-lab.github.io/aviary/api_reference/process_area) class.

To avoid any issues, run the cells in order and don't skip any cells.<br />
If something seems off, just restart the runtime and run the cells again.

# Install aviary

Install aviary in the current runtime using pip.

In [None]:
! pip install -q geospaitial-lab-aviary

# Import aviary and verify the installation

In [None]:
import aviary

print(aviary.__version__)

# Create a process area

A process area specifies the area of interest as a set of coordinates, one per tile, where each coordinate is the bottom left corner of that tile.

By default, a new instance of the `ProcessArea` class has no coordinates.<br />
You can access the coordinates of the process area with the `coordinates` attribute,
which is a NumPy array of shape `(n, 2)` and data type `int32`.

In [None]:
process_area = aviary.ProcessArea()

print(process_area.coordinates)
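
To make the idea of "bottom left corner of each tile" concrete, here is a minimal sketch in plain Python. The helper `tile_bounds` is hypothetical and not part of aviary, and the tile size of 128 is the value used in the later cells of this guide.

```python
# Hypothetical helper, not part of aviary: derive the extent of one tile
# from its bottom left corner and the tile size (in the units of the CRS).
def tile_bounds(coordinate, tile_size):
    x_min, y_min = coordinate
    return x_min, y_min, x_min + tile_size, y_min + tile_size


coordinates = [(363084, 5715326), (363212, 5715326)]

# Each coordinate describes one square tile of tile_size x tile_size
for coordinate in coordinates:
    print(tile_bounds(coordinate, tile_size=128))
```

Because only the bottom left corners are stored, the tile size has to be supplied whenever the tiles themselves are needed.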

If you already have the coordinates, you can pass them to the initializer of the `ProcessArea` class.

In [None]:
import numpy as np

coordinates = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area = aviary.ProcessArea(coordinates=coordinates)

print(process_area.coordinates)

We can visualize the process area with [folium](https://python-visualization.github.io/folium/latest) for a better understanding.

Install folium in the current runtime using pip.

In [None]:
! pip install -q folium

We define a function `visualize_process_area` so that we can reuse it in the next steps.

In [None]:
import folium
import geopandas as gpd


def visualize_process_area(
    process_area: aviary.ProcessArea,
) -> folium.Map:
    # Convert the process area to a geodataframe
    gdf = process_area.to_gdf(
        epsg_code=25832,
        tile_size=128,
    )

    # Compute the centroid of the process area
    centroid = gpd.GeoDataFrame(
        geometry=[gdf.union_all().centroid],
        crs=gdf.crs,
    )

    # Convert the centroid to EPSG:4326 (folium requires EPSG:4326)
    centroid_epsg_4326 = centroid.to_crs(epsg=4326)

    # Compute the location of the folium map
    location_epsg_4326 = [
        centroid_epsg_4326.geometry.y.mean(),
        centroid_epsg_4326.geometry.x.mean(),
    ]

    # Convert the process area to EPSG:4326 (folium requires EPSG:4326)
    gdf_epsg_4326 = gdf.to_crs(epsg=4326)

    # Create a folium map
    folium_map = folium.Map(
        location=location_epsg_4326,
        zoom_start=16,
        tiles='OpenStreetMap',
    )

    # Define the style of the process area
    style_function = lambda x: {
        'fillOpacity': .2,
        'color': 'black',
        'weight': 2,
    }

    # Add the process area to the folium map
    folium.GeoJson(gdf_epsg_4326, style_function=style_function).add_to(folium_map)

    return folium_map

Now we can visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

folium_map

You can set the coordinates of an existing process area with the `coordinates` attribute.

In [None]:
coordinates = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
    ],
    dtype=np.int32,
)
process_area.coordinates = coordinates

print(process_area.coordinates)

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

folium_map

## Create a process area from a bounding box

You can create a process area from a bounding box with the `from_bounding_box` class method.

In [None]:
bounding_box = aviary.BoundingBox(
    x_min=363084,
    y_min=5715326,
    x_max=363340,
    y_max=5715582,
)
process_area = aviary.ProcessArea.from_bounding_box(
    bounding_box=bounding_box,
    tile_size=128,
    quantize=False,
)

print(process_area.coordinates)
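
As a rough illustration of what tiling a bounding box means, the following sketch lists the bottom left corner of every tile inside the bounding box used above. The helper `tile_coordinates` is hypothetical, and neither the logic nor the ordering is claimed to match aviary's implementation.

```python
# Hypothetical helper, not aviary's implementation: list the bottom left
# corner of every tile needed to cover a bounding box, row by row.
def tile_coordinates(x_min, y_min, x_max, y_max, tile_size):
    return [
        (x, y)
        for y in range(y_min, y_max, tile_size)
        for x in range(x_min, x_max, tile_size)
    ]


coordinates = tile_coordinates(
    x_min=363084,
    y_min=5715326,
    x_max=363340,
    y_max=5715582,
    tile_size=128,
)

print(coordinates)
```

The bounding box is 256 units wide and 256 units high, so a tile size of 128 yields a grid of 2 x 2 tiles.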

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

folium_map

## Create a process area from a geodataframe

You can create a process area from a geodataframe with the `from_gdf` class method.

In [None]:
from shapely.geometry import box

gdf = gpd.GeoDataFrame(
    geometry=[box(363084, 5715326, 363340, 5715582)],
    crs='EPSG:25832',
)
process_area = aviary.ProcessArea.from_gdf(
    gdf=gdf,
    tile_size=128,
    quantize=False,
)

print(process_area.coordinates)

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

folium_map

The geodataframe may contain multiple polygons, e.g. the administrative areas of Gelsenkirchen and Recklinghausen.

In [None]:
url = 'TODO'
gdf = gpd.read_file(url)
process_area = aviary.ProcessArea.from_gdf(
    gdf=gdf,
    tile_size=128,
    quantize=True,
)

print(process_area.coordinates)
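
This guide does not explain the `quantize` parameter. One plausible reading, stated here as an assumption and not as aviary's documented behavior, is that the bounds are snapped outwards to multiples of the tile size so that the tiles align to a regular grid. The helper `quantize_bounds` below is hypothetical and only sketches that idea.

```python
# Assumption: quantizing snaps the bounds outwards to the nearest
# multiples of the tile size. Hypothetical helper, not aviary's code.
def quantize_bounds(x_min, y_min, x_max, y_max, tile_size):
    return (
        x_min // tile_size * tile_size,  # round down
        y_min // tile_size * tile_size,  # round down
        -(-x_max // tile_size) * tile_size,  # round up
        -(-y_max // tile_size) * tile_size,  # round up
    )


quantized_bounds = quantize_bounds(
    x_min=363084,
    y_min=5715326,
    x_max=363340,
    y_max=5715582,
    tile_size=128,
)

print(quantized_bounds)
```

Check the API reference for the exact semantics before relying on this behavior.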

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

# Convert the administrative areas to EPSG:4326 (folium requires EPSG:4326)
gdf_epsg_4326 = gdf.to_crs(epsg=4326)

# Define the style of the administrative areas (red)
style_function = lambda x: {
    'fillOpacity': 0,
    'color': '#FF595E',
    'weight': 2,
}

# Add the administrative areas to the folium map
folium.GeoJson(gdf_epsg_4326, style_function=style_function).add_to(folium_map)

folium_map

## Create a process area from a JSON string

You can create a process area from a JSON string with the `from_json` class method.

In [None]:
json_string = (
    '[[363084, 5715326], '
    '[363212, 5715326], '
    '[363084, 5715454], '
    '[363212, 5715454]]'
)
process_area = aviary.ProcessArea.from_json(json_string=json_string)

print(process_area.coordinates)
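
The JSON string above is a plain array of `[x, y]` pairs. The following sketch parses and validates such a string with the standard library alone. It only illustrates the format and makes no claim about how `from_json` works internally.

```python
import json

json_string = '[[363084, 5715326], [363212, 5715326]]'

# Parse the JSON array of [x, y] pairs into a list of tuples
coordinates = [tuple(coordinate) for coordinate in json.loads(json_string)]

# Check that every entry is a pair of integers
is_valid = all(
    len(coordinate) == 2
    and all(isinstance(value, int) for value in coordinate)
    for coordinate in coordinates
)

print(coordinates)
print(is_valid)
```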

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

folium_map

# Add or subtract process areas

You can add two process areas with the `+` operator.<br />
If the process areas overlap, the shared coordinates appear only once, since the result is the union of the two process areas.

In [None]:
coordinates_1 = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area_1 = aviary.ProcessArea(coordinates=coordinates_1)

coordinates_2 = np.array(
    [
        [363212, 5715454],
        [363340, 5715454],
        [363212, 5715582],
        [363340, 5715582],
    ],
    dtype=np.int32,
)
process_area_2 = aviary.ProcessArea(coordinates=coordinates_2)

print(process_area_1.coordinates)
print(process_area_2.coordinates)

In [None]:
process_area = process_area_1 + process_area_2

print(process_area.coordinates)
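
The union can be pictured with plain Python lists. This is a minimal sketch using a hypothetical helper `union`. It is not aviary's implementation, and the order of the resulting coordinates is an assumption.

```python
# Hypothetical helper: union of two coordinate lists without duplicates,
# keeping the order in which the coordinates first appear.
def union(coordinates_1, coordinates_2):
    return list(dict.fromkeys([*coordinates_1, *coordinates_2]))


coordinates_1 = [
    (363084, 5715326),
    (363212, 5715326),
    (363084, 5715454),
    (363212, 5715454),
]
coordinates_2 = [
    (363212, 5715454),
    (363340, 5715454),
    (363212, 5715582),
    (363340, 5715582),
]

coordinates = union(coordinates_1, coordinates_2)

print(len(coordinates))
```

The two inputs share the tile at `(363212, 5715454)`, so the union has 7 coordinates instead of 8.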

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

# Convert the first process area to a geodataframe
gdf_1 = process_area_1.to_gdf(
    epsg_code=25832,
    tile_size=128,
)

# Convert the first process area to EPSG:4326 (folium requires EPSG:4326)
gdf_1_epsg_4326 = gdf_1.to_crs(epsg=4326)

# Define the style of the first process area (red)
style_function_1 = lambda x: {
    'fillOpacity': 0,
    'color': '#FF595E',
    'weight': 2,
}

# Add the first process area to the folium map
folium.GeoJson(gdf_1_epsg_4326, style_function=style_function_1).add_to(folium_map)

# Convert the second process area to a geodataframe
gdf_2 = process_area_2.to_gdf(
    epsg_code=25832,
    tile_size=128,
)

# Convert the second process area to EPSG:4326 (folium requires EPSG:4326)
gdf_2_epsg_4326 = gdf_2.to_crs(epsg=4326)

# Define the style of the second process area (blue)
style_function_2 = lambda x: {
    'fillOpacity': 0,
    'color': '#1982C4',
    'weight': 2,
}

# Add the second process area to the folium map
folium.GeoJson(gdf_2_epsg_4326, style_function=style_function_2).add_to(folium_map)

folium_map

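You can subtract one process area from another with the `-` operator.<br />
The resulting process area contains the coordinates of the first process area that are not part of the second one.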

In [None]:
coordinates_1 = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area_1 = aviary.ProcessArea(coordinates=coordinates_1)

coordinates_2 = np.array(
    [
        [363212, 5715454],
        [363340, 5715454],
        [363212, 5715582],
        [363340, 5715582],
    ],
    dtype=np.int32,
)
process_area_2 = aviary.ProcessArea(coordinates=coordinates_2)

print(process_area_1.coordinates)
print(process_area_2.coordinates)

In [None]:
process_area = process_area_1 - process_area_2

print(process_area.coordinates)
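
The subtraction can be sketched as a set difference on plain Python lists. The helper `difference` is hypothetical and not aviary's implementation.

```python
# Hypothetical helper: keep the coordinates of the first list that do not
# occur in the second list, preserving their order.
def difference(coordinates_1, coordinates_2):
    excluded = set(coordinates_2)
    return [
        coordinate for coordinate in coordinates_1 if coordinate not in excluded
    ]


coordinates_1 = [
    (363084, 5715326),
    (363212, 5715326),
    (363084, 5715454),
    (363212, 5715454),
]
coordinates_2 = [
    (363212, 5715454),
    (363340, 5715454),
    (363212, 5715582),
    (363340, 5715582),
]

coordinates = difference(coordinates_1, coordinates_2)

print(coordinates)
```

Only the shared tile at `(363212, 5715454)` is removed. The remaining tiles of the second list do not affect the result.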

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

# Convert the first process area to a geodataframe
gdf_1 = process_area_1.to_gdf(
    epsg_code=25832,
    tile_size=128,
)

# Convert the first process area to EPSG:4326 (folium requires EPSG:4326)
gdf_1_epsg_4326 = gdf_1.to_crs(epsg=4326)

# Define the style of the first process area (red)
style_function_1 = lambda x: {
    'fillOpacity': 0,
    'color': '#FF595E',
    'weight': 2,
}

# Add the first process area to the folium map
folium.GeoJson(gdf_1_epsg_4326, style_function=style_function_1).add_to(folium_map)

# Convert the second process area to a geodataframe
gdf_2 = process_area_2.to_gdf(
    epsg_code=25832,
    tile_size=128,
)

# Convert the second process area to EPSG:4326 (folium requires EPSG:4326)
gdf_2_epsg_4326 = gdf_2.to_crs(epsg=4326)

# Define the style of the second process area (blue)
style_function_2 = lambda x: {
    'fillOpacity': 0,
    'color': '#1982C4',
    'weight': 2,
}

# Add the second process area to the folium map
folium.GeoJson(gdf_2_epsg_4326, style_function=style_function_2).add_to(folium_map)

folium_map

# Append coordinates to the process area

You can append coordinates to the process area with the `append` method.

In [None]:
coordinates = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area = aviary.ProcessArea(coordinates=coordinates)

print(process_area.coordinates)

In [None]:
process_area = process_area.append((363340, 5715582))

print(process_area.coordinates)

Visualize the process area.

In [None]:
folium_map = visualize_process_area(process_area)

folium_map

If you append coordinates that already exist, the process area will not change.

In [None]:
process_area = process_area.append((363340, 5715582))

print(process_area.coordinates)

# Chunk the process area

You can chunk the process area into multiple process areas with the `chunk` method.<br />
This might be useful when you want to run multiple pipelines in distributed environments.

In [None]:
coordinates = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area = aviary.ProcessArea(coordinates=coordinates)

print(process_area.coordinates)

In [None]:
process_areas = process_area.chunk(num_chunks=2)

for process_area in process_areas:
    print(process_area.coordinates)
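
One way to picture chunking is to split the list of coordinates into contiguous parts of nearly equal size. The sketch below does this in plain Python with a hypothetical helper `chunk_coordinates`. How aviary distributes the coordinates across chunks is an assumption here.

```python
# Hypothetical helper: split a list of coordinates into num_chunks
# contiguous parts whose sizes differ by at most one.
def chunk_coordinates(coordinates, num_chunks):
    size, remainder = divmod(len(coordinates), num_chunks)
    chunks = []
    start = 0
    for index in range(num_chunks):
        # The first `remainder` chunks receive one extra coordinate
        stop = start + size + (1 if index < remainder else 0)
        chunks.append(coordinates[start:stop])
        start = stop
    return chunks


coordinates = [
    (363084, 5715326),
    (363212, 5715326),
    (363084, 5715454),
    (363212, 5715454),
]

chunks = chunk_coordinates(coordinates, num_chunks=2)

for chunk in chunks:
    print(chunk)
```

Each chunk could then be handed to a separate worker, since the chunks do not share any coordinates.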

# Filter the process area

TODO

# Convert the process area to a geodataframe

You can convert the process area to a geodataframe with the `to_gdf` method.

In [None]:
coordinates = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area = aviary.ProcessArea(coordinates=coordinates)

print(process_area.coordinates)

In [None]:
gdf = process_area.to_gdf(
    epsg_code=25832,
    tile_size=128,
)

print(gdf)
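
The geodataframe holds one polygon per tile. As a rough sketch of that geometry, the hypothetical helper `tile_polygon` below builds the closed ring of corner points for a single tile in plain Python. It is not aviary's implementation, and the counterclockwise vertex order is an assumption.

```python
# Hypothetical helper: corner points of a square tile as a closed ring,
# starting at the bottom left corner and going counterclockwise.
def tile_polygon(coordinate, tile_size):
    x_min, y_min = coordinate
    x_max, y_max = x_min + tile_size, y_min + tile_size
    return [
        (x_min, y_min),
        (x_max, y_min),
        (x_max, y_max),
        (x_min, y_max),
        (x_min, y_min),
    ]


polygon = tile_polygon((363084, 5715326), tile_size=128)

print(polygon)
```

This is why `to_gdf` needs both the `tile_size` and the `epsg_code`: the coordinates alone define neither the extent of a tile nor its coordinate reference system.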

# Convert the process area to a JSON string

You can convert the process area to a JSON string with the `to_json` method.

In [None]:
coordinates = np.array(
    [
        [363084, 5715326],
        [363212, 5715326],
        [363084, 5715454],
        [363212, 5715454],
    ],
    dtype=np.int32,
)
process_area = aviary.ProcessArea(coordinates=coordinates)

print(process_area.coordinates)

In [None]:
json_string = process_area.to_json()

print(json_string)
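
Assuming the JSON string has the same array-of-pairs layout that `from_json` accepts earlier in this guide, a round trip can be sketched with the standard library. The exact whitespace of the string returned by `to_json` may differ from what `json.dumps` produces here.

```python
import json

coordinates = [(363084, 5715326), (363212, 5715326)]

# Serialize the coordinates as a JSON array of [x, y] pairs
json_string = json.dumps([list(coordinate) for coordinate in coordinates])

# Parse the string again to confirm that nothing is lost
restored = [tuple(coordinate) for coordinate in json.loads(json_string)]

print(json_string)
print(restored == coordinates)
```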