# Part 5: Parallelization examples

This part will briefly cover how to run GRASS computations in parallel. First, create a new mapset:

In [None]:
%%bash
grass -c -e ~/grassdata/nc_spm_08_grass7/parallelization

In [None]:
import subprocess
import sys

# Ask GRASS GIS where its Python packages are.
sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

# Import the GRASS GIS packages we need.
import grass.script as gs
import grass.jupyter as gj

# Start GRASS Session
gj.init("~/grassdata", "nc_spm_08_grass7", "parallelization")

## Tool-level parallelization
Several GRASS tools are [internally parallelized](https://grass.osgeo.org/grass-stable/manuals/keywords.html#parallel), using either OpenMP or the Python multiprocessing library. Use the `nprocs` option to set the number of cores used for processing.
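
When picking a value for `nprocs`, it can help to clamp the request to what the machine actually has. The following is a minimal sketch, not part of GRASS: `choose_nprocs` and its `reserve` parameter are hypothetical names introduced here for illustration.

```python
import os


def choose_nprocs(requested=None, reserve=1):
    """Return a core count to pass as nprocs (hypothetical helper, not a GRASS API)."""
    # os.cpu_count() can return None when the count cannot be determined.
    available = os.cpu_count() or 1
    # Leave `reserve` cores for the notebook and the OS, but always use at least one.
    usable = max(1, available - reserve)
    if requested is None:
        return usable
    # Clamp an explicit request to the range [1, usable].
    return max(1, min(int(requested), usable))
```

For example, `nprocs=choose_nprocs(2)` could replace a literal value in a `gs.run_command` call, falling back to a single core on a small machine.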


Set the computational region to match the _elevation_ raster.

In [None]:
gs.run_command("g.region", raster="elevation")

Compute a moving window analysis and measure the time, first with one core, then with two:

In [None]:
%%timeit
gs.run_command("r.neighbors", input="elevation", output="elevation_smoothed", method="average", size=25, nprocs=1)

In [None]:
%%timeit
gs.run_command("r.neighbors", input="elevation", output="elevation_smoothed", method="average", size=25, nprocs=2)
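
To compare the two timings, speedup and parallel efficiency are the usual measures. A minimal sketch, assuming you copy the wall-clock times reported by `%%timeit` by hand; `speedup_and_efficiency` is a hypothetical helper name:

```python
def speedup_and_efficiency(t_serial, t_parallel, nprocs):
    """Return (speedup, efficiency) for two wall-clock times given in seconds."""
    if t_serial <= 0 or t_parallel <= 0 or nprocs < 1:
        raise ValueError("times must be positive and nprocs at least 1")
    # Speedup: how many times faster the parallel run is than the serial one.
    speedup = t_serial / t_parallel
    # Efficiency: speedup per core; 1.0 would mean perfect scaling.
    efficiency = speedup / nprocs
    return speedup, efficiency
```

An efficiency well below 1.0 is common for small rasters, where the overhead of splitting the work outweighs the gain.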

Visualize the original and smoothed rasters (turn layers on and off):

In [None]:
neighbors_map = gj.InteractiveMap()
neighbors_map.add_raster("elevation")
neighbors_map.add_raster("elevation_smoothed")
neighbors_map.add_layer_control(position="bottomright")
neighbors_map.show()

## GridModule for tiling
Some compute-intensive tasks can benefit from spatially splitting the task into tiles, and then running the task in parallel. [GridModule](https://grass.osgeo.org/grass-stable/manuals/libpython/pygrass.modules.grid.html) can automate this splitting-computing-merging procedure and execute the computation in parallel.
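
To make the splitting step concrete, here is a minimal pure-Python sketch of dividing a raster grid into tiles with an overlap margin. It illustrates the general idea only and is not GridModule's actual implementation; `split_into_tiles` and the dictionary keys are names assumed for this example.

```python
def split_into_tiles(rows, cols, tile_rows, tile_cols, overlap=0):
    """Split a rows x cols grid into tiles.

    Each tile has a "core" window (the cells it is responsible for) and a
    "padded" window (the core grown by `overlap` cells, clipped to the grid).
    Windows are half-open index ranges: (row_start, row_end, col_start, col_end).
    """
    if min(rows, cols, tile_rows, tile_cols) < 1 or overlap < 0:
        raise ValueError("sizes must be positive and overlap non-negative")
    tiles = []
    for r0 in range(0, rows, tile_rows):
        for c0 in range(0, cols, tile_cols):
            # Tiles on the last row/column may be smaller than the others.
            r1 = min(r0 + tile_rows, rows)
            c1 = min(c0 + tile_cols, cols)
            padded = (
                max(r0 - overlap, 0),
                min(r1 + overlap, rows),
                max(c0 - overlap, 0),
                min(c1 + overlap, cols),
            )
            tiles.append({"core": (r0, r1, c0, c1), "padded": padded})
    return tiles
```

The computation would run on each padded window, and only the core cells would be kept when merging, so that values near tile seams see the same neighborhood as in an untiled run.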

In this example, we will interpolate an elevation surface from vector points using IDW interpolation. First, set the computational region to match the extent of the vector points and set the resolution to 1 meter:

In [None]:
gs.run_command("g.region", vector="elev_lid792_bepts", res=1, flags="a")

Measure the time without using GridModule:

In [None]:
%%timeit
gs.run_command("v.surf.idw", input="elev_lid792_bepts", output="elev_lid792_interp")

And now with GridModule:

In [None]:
%%writefile interpolation.py
from grass.pygrass.modules.grid import GridModule


grid = GridModule(
    "v.surf.idw",
    input="elev_lid792_bepts",
    output="elev_lid792_interp",
    processes=3,
    overlap=12,
    quiet=True,
)
grid.run()

In [None]:
%%timeit
!python interpolation.py

In [None]:
gs.run_command("r.colors", map="elev_lid792_interp", color="elevation")
# Render the interpolated surface with the input points on top.
interp_map = gj.Map()
interp_map.d_rast(map="elev_lid792_interp")
interp_map.d_vect(map="elev_lid792_bepts", size=1, color="black")
interp_map.show()

## Running multiple independent computations
In this example, our goal is to compute multiple viewsheds and export them as images. Since these are independent computations, we can run them in parallel.
The first part implements this task in Python using the _multiprocessing_ library (also explained in Part 2).
The second part runs each computation through the `grass --exec` interface in a separate mapset, which allows us to potentially distribute the computation across multiple nodes on an HPC.

First compute a shaded relief raster for visualization:

In [None]:
gs.run_command("g.region", raster="elevation")
gs.run_command("r.relief", input="elevation", output="relief")

We will compute viewsheds from the points in the _firestations_ vector map:

In [None]:
viewpoints = gs.read_command('v.out.ascii', input='firestations', separator='comma', flags="r").strip().splitlines()
viewpoints = [p.split(",") for p in viewpoints]
viewpoints

We will extend the script from Part 2 of this workshop to also render each viewshed to a file. Set `nprocs` to more than 1 when possible.

In [None]:
import os
from grass.exceptions import CalledModuleError
from multiprocessing import Pool, cpu_count


def viewshed(point):
    x, y, cat = point
    x, y = float(x), float(y)
    max_distance = 2000
    # copy current environment
    env = os.environ.copy()
    # set GRASS_REGION variable using region_env function
    env["GRASS_REGION"] = gs.region_env(align="elevation",
                                        e=x + max_distance,
                                        w=x - max_distance,
                                        n=y + max_distance,
                                        s=y - max_distance)
    name = f"viewshed_{cat}"
    try:
        gs.run_command("r.viewshed", input="elevation", output=name, flags="b",
                       coordinates=(x, y), max_distance=max_distance, env=env)
        # create visualization
        viewshed_map = gj.Map(use_region=True, env=env)
        viewshed_map.d_rast(map="relief")
        viewshed_map.d_rast(map=f"viewshed_{cat}", values=1)
        viewshed_map.d_vect(map="firestations", cat=cat, size=15, icon="basic/pin")
        viewshed_map.save(f"viewshed_{cat}.png")
        return f"viewshed_{cat}"
    except CalledModuleError:
        return None

# to run with the number of CPUs available, use:
# nprocs = cpu_count()
nprocs = 1
with Pool(processes=nprocs) as pool:
    maps = pool.map_async(viewshed, viewpoints).get()
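
Because `viewshed` returns `None` when the tool fails, it is worth checking the results once the pool has finished. A minimal sketch, relying on the fact that the pool returns results in the same order as the input points; `summarize_results` is a hypothetical helper name:

```python
def summarize_results(points, results):
    """Split pool results into computed map names and categories that failed.

    `points` are (x, y, cat) items and `results` are the values returned
    for them, in the same order.
    """
    computed, failed = [], []
    for (_x, _y, cat), name in zip(points, results):
        if name is None:
            # The worker caught an error and returned None for this point.
            failed.append(cat)
        else:
            computed.append(name)
    return computed, failed
```

With the variables from the cell above, this would be called as `computed, failed = summarize_results(viewpoints, maps)`.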

Let's look at one of the computed and rendered viewsheds:

In [None]:
from IPython.display import Image

Image("viewshed_22.png")

Note that this way we can't distribute the computation across multiple nodes (hundreds of cores).
Next, we will do the same thing differently, using the `grass --exec` [interface](https://grass.osgeo.org/grass-stable/manuals/grass.html) and running each task in a separate mapset. This way, the tasks could be distributed across multiple nodes.

The `--exec` interface allows GRASS tools and user scripts to be executed in a non-interactive GRASS GIS session. For example, here is a simple call that lists all available vectors in the PERMANENT mapset:

In [None]:
%%bash
grass ~/grassdata/nc_spm_08_grass7/PERMANENT --exec g.list type=vector mapset=PERMANENT -t

Now we will create a Python script `myscript.py` that computes and renders viewsheds as in the previous example. The script requires 3 parameters (x and y coordinates, and category). Note that we can set the computational region in the standard way, because each script will run in a separate mapset, so the different regions won't interfere with each other.

In [None]:
%%writefile myscript.py
import sys
import grass.script as gs
import grass.jupyter as gj


def main(x, y, cat):
    max_distance = 2000
    x, y = float(x), float(y)
    name = f"viewshed_{cat}"
    gs.run_command("g.region", align="elevation", e=x + max_distance,
                   w=x - max_distance, n=y + max_distance, s=y - max_distance)
    gs.run_command("r.viewshed", input="elevation", output=name, coordinates=(x, y),
                   observer_elevation=3, max_distance=max_distance, flags="b")
    # create visualization
    viewshed_map = gj.Map(use_region=True)
    viewshed_map.d_rast(map="relief@parallelization")
    viewshed_map.d_rast(map=f"viewshed_{cat}", values=1)
    viewshed_map.d_vect(map="firestations", cat=cat, size=15, icon="basic/pin")
    viewshed_map.save(f"viewshed_{cat}.png")

if __name__ == "__main__":
    args = sys.argv[1:]
    main(*args)
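
The script above passes `sys.argv` straight to `main`, so a missing or non-numeric argument only surfaces as a traceback. As an alternative, here is a minimal sketch of the same three parameters parsed with the standard `argparse` module; `parse_args` and the help texts are assumptions made for this example:

```python
import argparse


def parse_args(argv):
    """Parse the x, y and cat parameters expected by the viewshed script."""
    parser = argparse.ArgumentParser(description="Compute and render one viewshed.")
    # Coordinates are converted to float; invalid values produce a usage error.
    parser.add_argument("x", type=float, help="x coordinate of the viewpoint")
    parser.add_argument("y", type=float, help="y coordinate of the viewpoint")
    # The category stays a string because it is only used in names and filters.
    parser.add_argument("cat", help="category of the viewpoint")
    return parser.parse_args(argv)
```

In the script, `args = parse_args(sys.argv[1:])` followed by `main(args.x, args.y, args.cat)` would replace the last two lines.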

We will generate a file `jobs.sh` with one command per line. We will run each task in a temporary mapset so all computed data will be deleted afterwards. That is fine for our example where we need only the final PNG files.

In [None]:
with open("jobs.sh", "w") as f:
    for viewpoint in viewpoints:
        f.write(f"grass --tmp-mapset ~/grassdata/nc_spm_08_grass7 --exec python myscript.py {viewpoint[0]} {viewpoint[1]} {viewpoint[2]}\n")

This is the content of the file:

In [None]:
!cat jobs.sh
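
If the project path or the arguments could contain spaces or other shell-special characters, each part of the command should be quoted before it is written to `jobs.sh`. A minimal sketch using the standard `shlex` module; `job_line` is a hypothetical helper. It expands `~` first, because a quoted `~` is not expanded by the shell.

```python
import os
import shlex


def job_line(project, script, args):
    """Build one shell-safe `grass --exec` command line (hypothetical helper)."""
    parts = [
        "grass",
        "--tmp-mapset",
        # Expand "~" here: shlex.quote would quote it and block shell expansion.
        os.path.expanduser(project),
        "--exec",
        "python",
        script,
        *(str(arg) for arg in args),
    ]
    # Quote every part so spaces and special characters survive the shell.
    return " ".join(shlex.quote(part) for part in parts)
```

Each line of `jobs.sh` could then be written as `job_line("~/grassdata/nc_spm_08_grass7", "myscript.py", viewpoint) + "\n"`.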

To execute these commands in parallel, we can use e.g. [GNU Parallel](https://www.gnu.org/software/parallel/):

In [None]:
%%bash

parallel -j 2 < jobs.sh

Check one of the resulting PNG files:

In [None]:
Image("viewshed_22.png")