In [None]:
%%bash
# install dependencies
apt install -y autoconf2.13 autotools-dev bison flex g++ gettext git imagemagick libblas-dev libbz2-dev libcairo2-dev libfftw3-dev libfreetype6-dev libgdal-dev libgeos-dev libglu1-mesa-dev libjpeg-dev liblapack-dev liblas-c-dev libncurses5-dev libnetcdf-dev libpng-dev libpq-dev libproj-dev libreadline-dev libsqlite3-dev libtiff-dev libxmu-dev libzstd-dev make netcdf-bin p7zip proj-bin sqlite3 unixodbc-dev xvfb zlib1g-dev libgomp1 subversion parallel

pip install Pillow
pip install ply
pip install PyVirtualDisplay

git clone --depth 1 https://github.com/OSGeo/grass.git grass-src

# enter the directory with source code
cd grass-src

# compile
./configure \
    --enable-largefile=yes \
    --with-nls \
    --with-cxx \
    --with-readline \
    --with-bzlib \
    --with-pthread \
    --with-proj-share=/usr/share/proj \
    --with-geos=/usr/bin/geos-config \
    --with-cairo \
    --with-opengl-libs=/usr/include/GL \
    --with-freetype=yes --with-freetype-includes="/usr/include/freetype2/" \
    --with-sqlite=yes \
    --with-openmp
make -j2
make -j2 install

# leave the directory with source code
cd ~

# download sample data
mkdir -p grassdata
curl -SL https://grass.osgeo.org/sampledata/north_carolina/nc_basic_spm_grass7.zip > nc_basic_spm_grass7.zip
unzip -qq nc_basic_spm_grass7.zip
mv nc_basic_spm_grass7 grassdata
rm nc_basic_spm_grass7.zip

In [None]:
import os
os.chdir(os.path.expanduser("~"))

# Developing Custom GRASS Tools - FOSS4G 2022 Workshop

Learn how to develop custom tools (aka addons or modules) for GRASS GIS in Python and, if you like, in C.

Python scripting is powerful, but what is even more powerful is turning a script into a GRASS tool with just a few tricks and tweaks we will cover in this workshop. When you develop a GRASS tool (aka module), you get a graphical user interface (GUI), command line interface, and convenience you and your users will appreciate. Such tools can be published in a community-maintained addon repository which helps not only to distribute the tool, but also to maintain the code in the long term.

We will focus on Python, but we will cover tools written in C, too, because even compiled tools in C and C++ can be in this community-maintained repository and distributed to users.

## Authors

### Vaclav Petras

Vaclav (Vashek) Petras is a research software engineer, open source developer, and open science advocate. He received his masters in Geoinformatics from the Czech Technical University and PhD in Geospatial Analytics from the North Carolina State University. Vaclav is a member of the GRASS GIS Development Team and Project Steering Committee.

### Anna Petrasova

Anna is a geospatial research software engineer with PhD in Geospatial Analytics. She develops spatio-temporal models of urbanization and pest spread across landscape. As a member of the OSGeo Foundation and the GRASS GIS Project Steering Committee, Anna advocates the use of open source software in research and education.

Thanks for providing feedback goes to: Bernardo Santos

## Related talks

* _Take-Home Messages from Adding Code Quality Measures to GRASS GIS_
* _Tips for parallelization in GRASS GIS in the context of land change modeling_
* _Using GRASS GIS in Jupyter Notebooks: An Introduction to grass.jupyter_
* _State of GRASS GIS_

## Outline

- This notebook
  * Python script structure
  * Running Python scripts
  * Running GRASS GIS
- Tools for GRASS GIS in Python
- Best practices for writing GRASS tools
- Tools for GRASS GIS in C

## Workshop Software Setup

The workshop material assumes that it runs in the prepared Binder environment, which runs Ubuntu and has GRASS GIS already installed.

The Binder is set up with a development version of GRASS GIS 8.3, but the notebooks will work with 8.2 as well.

## Getting GRASS GIS Ready

If you are running the notebook in the prepared Binder, there is nothing to do. If you are compiling GRASS GIS yourself, start JupyterLab so that your compiled GRASS GIS is the first _grass_ command on the path, e.g.:

```bash
PATH=~/grass/code/bin.x86_64-pc-linux-gnu/:$PATH jupyter lab
```

For other cases, please refer to the [GRASS GIS Jupyter notebooks wiki page](https://grasswiki.osgeo.org/wiki/GRASS_GIS_Jupyter_notebooks#Running_a_Jupyter_notebook_locally).

Check that GRASS GIS is running and that you get the expected version:

In [None]:
!grass --version

## Getting Data Ready

The Binder setup already includes the [basic North Carolina sample dataset](https://grass.osgeo.org/sampledata/north_carolina/nc_basic_spm_grass7.zip).

To do our test runs in an isolated environment, we will create a new mapset (aka subproject) called _foss4g_:

In [None]:
!grass -e -c ~/grassdata/nc_basic_spm_grass7/foss4g

To start over later on, you can use a different mapset or delete this one using:

```bash
rm -r ~/grassdata/nc_basic_spm_grass7/foss4g
```

## Python Script

Let's start with a basic Python script which uses GRASS GIS. GRASS Python packages are usually not on the Python path, so we will use the GRASS command line interface to get the path to these packages before we import them.

The script starts a GRASS session and uses the mapset we created above. Then, it prints the current mapset name. 

In [None]:
%%python
# Import standard Python packages we need.
import subprocess
import sys

# Ask GRASS GIS where its Python packages are.
sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

# Import the GRASS GIS packages we need.
import grass.script as gs
import grass.script.setup  # Needed only in 8.2 and older.

# Use GRASS session as a context manager.
with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
    print(gs.read_command("g.mapset", flags="p"))

## Running Python Scripts from Command Line

For testing a script and for integrating it into GRASS GIS, it is useful to see how a script is executed from the command line or, more generally, as a subprocess.

Before, we used IPython kernel cell magic `%%python` to run a cell as a separate Python script. Now, we will use `%%writefile` cell magic to create a Python file which we will execute in the following cell.

In [None]:
%%writefile mapset_print_script.py
import subprocess
import sys

sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

import grass.script as gs
import grass.script.setup

with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
    print(gs.read_command("g.mapset", flags="p"))

Use _python_ to execute the script. Its name (or path) is provided as a parameter:

In [None]:
!python mapset_print_script.py

## Full Python Script Structure

The best practice for Python scripts is to use a _main_ function which is called from the so-called "if name equals main" block. The name of the _main_ function is not important, although it usually is _main_, while the syntax of the if-name-equals-main block is. Generally, all code should be in the _main_ function or called from it. This is the structure we will use from now on:

```python
def main():
    pass

if __name__ == "__main__":
    main()
```
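To see what the block does, here is a small self-contained sketch (not part of the workshop scripts) which uses the standard-library _runpy_ module to run the same source twice: once under an ordinary module name, as when the file is imported, and once under the name `__main__`, as when it is executed with _python_. The file name _example_script.py_ and the `calls` list are made up for this illustration.

```python
import runpy
import tempfile
from pathlib import Path

# Source of a tiny script which records whether its main function ran.
SOURCE = '''
calls = []


def main():
    calls.append("main was called")


if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "example_script.py"  # Hypothetical file name.
    path.write_text(SOURCE)
    # Run under a regular module name, similar to what happens on import.
    imported = runpy.run_path(str(path), run_name="example_script")
    # Run under the name "__main__", similar to `python example_script.py`.
    executed = runpy.run_path(str(path), run_name="__main__")

print(imported["calls"])
print(executed["calls"])
```

Only the second run calls _main_, which is why code placed in _main_ does not run when the file is merely imported.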

Our script, combined with the new structure, now looks like this:

In [None]:
%%writefile mapset_print_main.py
import subprocess
import sys

sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

import grass.script as gs
import grass.script.setup


def main():
    with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
        print(gs.read_command("g.mapset", flags="p"))


if __name__ == "__main__":
    main()

In [None]:
!python mapset_print_main.py

## Executable Scripts and Shebang

On Unix-like systems (Linux, macOS, ...), specifying the Python interpreter can be avoided when the script has execute permissions and its first line, called a shebang, specifies which interpreter to use for the script. A minimal script then looks like this:

```python
#!/usr/bin/env python

def main():
    pass

if __name__ == '__main__':
    main()
```

The first line now carries a very special meaning, but for Python it is just a comment, although some helper tools may recognize it.

Let's add a shebang to our script:

In [None]:
%%writefile mapset_print_executable.py
#!/usr/bin/env python
import subprocess
import sys

sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

import grass.script as gs
import grass.script.setup


def main():
    with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
        print(gs.read_command("g.mapset", flags="p"))


if __name__ == "__main__":
    main()

Permissions are managed using _chmod_. `chmod u+x` makes a file executable for the user:

In [None]:
!chmod u+x mapset_print_executable.py

The script can then run without specifying the Python interpreter:

In [None]:
!./mapset_print_executable.py

Note `./` which says the script is in the current directory. The path is always mandatory in this case even if it is the current directory. Installed GRASS tools are _on path_, i.e., are where the operating system looks for executables, so for installed tools, no path needs to be specified.
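The two ideas above, execute permissions and being _on path_, can also be explored from Python. The following is a minimal sketch for Unix-like systems using only the standard library; the helper `make_user_executable` and the script name are made up for this illustration and are not part of GRASS GIS.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_user_executable(path):
    """Add the execute permission for the file owner, like `chmod u+x`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)


# Hypothetical script name, chosen to be unlikely to exist on the search path.
SCRIPT_NAME = "example_tool_for_path_sketch.py"

with tempfile.TemporaryDirectory() as directory:
    script = Path(directory) / SCRIPT_NAME
    script.write_text("#!/usr/bin/env python\nprint('hello')\n")
    make_user_executable(script)
    is_executable = os.access(script, os.X_OK)
    # A bare name is looked up only in the directories listed in PATH.
    found_by_name = shutil.which(SCRIPT_NAME)
    # With the script's directory in the search path, the script is found.
    found_in_directory = shutil.which(SCRIPT_NAME, path=directory)

print(is_executable, found_by_name, found_in_directory)
```

The lookup by bare name finds nothing because the temporary directory is not on the search path, which mirrors why `./` is needed for scripts in the current directory.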

The executable mechanism on Windows is different, and GRASS GIS takes a number of steps to ensure that the scripts can be executed and that the right Python is used.

## Running in GRASS GIS

GRASS tools differ from Python scripts which use GRASS GIS in that they do not set up their own GRASS session. The tools are already running in a session which was previously set up by the user in some interactive or automated way, e.g., using the GUI in a desktop environment or from a Python script.

The following script assumes that it runs in a GRASS session. Because we separated out the concern about the GRASS session, the script is simpler: There is no need to set up the path to GRASS packages or to initialize a GRASS session with a mapset.

In [None]:
%%writefile mapset_print_tool.py
#!/usr/bin/env python

import grass.script as gs


def main():
    print(gs.read_command("g.mapset", flags="p"))


if __name__ == "__main__":
    main()

The script can then run in an interactive GRASS session (from GUI or shell) or it can be executed using the `--exec` interface:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec python ./mapset_print_tool.py

Let's make the script executable:

In [None]:
!chmod u+x mapset_print_tool.py

For the executable script, we can leave out `python`:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./mapset_print_tool.py

## Command Line Parameters

Scripts, like all other programs, can take command line parameters. This is a crucial feature for developing general scripts and GRASS tools.
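Before a script sees its parameters, the shell splits the command line into separate words and removes the quotes. As a rough illustration, the standard-library function `shlex.split` applies POSIX shell quoting rules to a string; the script name _./my_script.py_ below is hypothetical, and a real shell does more than this (e.g., variable expansion).

```python
import shlex

# A command line as it could be typed in a POSIX shell.
command_line = """./my_script.py abc 1 "dd ee ff" '44 55 66'"""

# Quoted text stays together as one parameter and the quotes are removed.
arguments = shlex.split(command_line)

for index, argument in enumerate(arguments):
    print(index, repr(argument))
```

The resulting list corresponds to what the script would find in `sys.argv`, with the script itself as the first item.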

Here is a simple script which prints parameters received on the command line:

In [None]:
%%writefile command_line_print.py
#!/usr/bin/env python

import sys


def main():
    print(f"Parameters are: {sys.argv}")


if __name__ == "__main__":
    main()

Make the script executable:

In [None]:
!chmod u+x command_line_print.py

Try different combinations of parameters:

In [None]:
!./command_line_print.py abc xyz 1 2 3 "dd ee ff" '44 55 66'

The script works just the same with `grass ... --exec` where parameters go after the script:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./command_line_print.py abc "dd ee ff"

# GRASS Tool with Parameters

## Interface Definition for Scripts

Since GRASS tools are executable scripts (or generally programs), the interface of a GRASS tool is the command line interface of a Python script. A dedicated function _grass.script.parser_ takes care of processing the command line arguments based on the interface description specified in a Python comment with a key-value syntax defined by GRASS GIS.

The following is an example of a script which takes two parameters: name of a vector map and name of a raster map:

In [None]:
%%writefile vector_to_raster.py
#!/usr/bin/env python

# %module
# % description: Converts vector data to raster data
# %end
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_R_OUTPUT
# %end

import grass.script as gs


def main():
    gs.parser()


if __name__ == "__main__":
    main()

As before, we will make the script executable:

In [None]:
!chmod u+x vector_to_raster.py

Running the script with `--help` gives its interface described for command line use:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py --help

Running the script with `--interface-description` gives its interface described using XML which is useful for building other interfaces, e.g., GUI:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py --interface-description

Running the script with `--html-description` gives the command line interface described in HTML which later becomes a part of the tool's HTML documentation:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py --html-description > test.html

In [None]:
from IPython.display import IFrame

IFrame("test.html", width=700, height=600)

Open the generated HTML file called _test.html_ from the File Browser (on the left in JupyterLab).

On desktop, a graphical user interface for the tool would be available, too, accessible, e.g., through `--ui`:

```bash
grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py --ui
```

The GUI window may look like this:

![GUI with vector and raster parameters](img/vector_raster_gui.png)

## Using the Parameters

The values parsed from the command line are stored in a dictionary returned by the _parser_ function. They can be accessed using `dictionary["name"]` syntax where _name_ is the name of the parameter. We are using the predefined standard options for vector input and raster output which are named _input_ and _output_. Here, we add an additional parameter named _layer_, also referred to as _field_, which can be used to specify a layer or a subset in the input vector dataset (more on that later).

We will store the three values in Python variables and pass them to _v.to.rast_ which will do the actual processing for us.

In [None]:
%%writefile vector_to_raster.py
#!/usr/bin/env python

# %module
# % description: Converts vector data to raster data
# %end
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_V_FIELD
# %end
# %option G_OPT_R_OUTPUT
# %end

import grass.script as gs


def main():
    options, flags = gs.parser()
    vector_input = options["input"]
    vector_layer = options["layer"]
    raster_output = options["output"]

    gs.run_command(
        "v.to.rast",
        input=vector_input,
        layer=vector_layer,
        output=raster_output,
        use="val",
    )


if __name__ == "__main__":
    main()

We can now execute the tool. There is no need to make it executable again because we are using the same name as before and the execute permissions are preserved even when the file contents change.

The command line parameters in GRASS GIS are key-value pairs which are using syntax `key=value`. In the CLI world, this is sometimes called _named arguments_ and it is similar to Python keyword arguments.
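To make the similarity with Python keyword arguments concrete, here is a toy sketch which splits `key=value` parameters into a dictionary. It is only an illustration under simplifying assumptions: the function `split_named_arguments` is made up for this example, and the actual GRASS parser does much more (validation, defaults, flags, standard options).

```python
def split_named_arguments(arguments):
    """Split key=value arguments into a dictionary; keep the rest in a list."""
    options = {}
    others = []
    for argument in arguments:
        key, separator, value = argument.partition("=")
        if separator:
            options[key] = value
        else:
            # Anything without "=", e.g., --overwrite, is left unprocessed here.
            others.append(argument)
    return options, others


options, others = split_named_arguments(
    ["input=firestations", "output=stations", "value=5", "--overwrite"]
)
print(options)
print(others)
# The values are strings, so numbers need an explicit conversion.
print(float(options["value"]))
```

Like in this sketch, the values are plain strings at first, so numeric values need to be converted before they are used in computations.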

The dataset we are using here has vector points called _firestations_ and we will create a new raster called _stations_. Before running the processing, we will set the computational region to the extent of _firestations_ and will use a resolution of 100 meters (this is set and preserved for the mapset between the individual runs of `grass ... --exec`).

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec g.region vector=firestations res=100
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py input=firestations output=stations

Try running the above again. The raster named _stations_ now exists, so GRASS GIS will automatically detect that and ask you to use `--overwrite` if you want to replace the existing data.

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py input=firestations output=stations

With added `--overwrite`:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py input=firestations output=stations --overwrite

Let's view the data range of the newly created raster:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.info map=stations -r

To view the data in the notebook, we will render the raster using _grass.jupyter.Map_. Usually, we would just create a GRASS session or already have one. However, to keep our development environment as is, we will avoid creating a session in the notebook process and instead use a subprocess to render the data into a PNG image. The following uses the `%%python` magic to execute Python code in a subprocess:

In [None]:
%%python
import subprocess
import sys

sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

import grass.script as gs
import grass.jupyter as gj
import grass.script.setup  # Needed only in 8.2 and older.

with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
    ortho_map = gj.Map()
    ortho_map.d_rast(map="stations")
    # Save the image (in a standard notebook, we would just display the image now).
    ortho_map.save("stations.png")

Now, use _Image_ from _IPython.display_ to display the PNG:

In [None]:
from IPython.display import Image

Image("stations.png")

## General Parameter Definition

To add general parameters such as text and numbers, we can use the following key-value syntax enclosed in `%option` and `%end`:

```python
# %option
# % key1: value1
# % key2: value2
# % key3: value3
# %end
```
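To show how regular this syntax is, here is a toy reader which collects such comment blocks into dictionaries. It is a sketch for illustration only, assuming lines of exactly the form `# %name` and `# % key: value`; the function `read_interface_blocks` is made up for this example and the actual processing is done by the GRASS parser.

```python
def read_interface_blocks(source):
    """Collect blocks such as %module and %option from script comments."""
    blocks = []
    current = None
    for line in source.splitlines():
        if not line.startswith("# %"):
            continue
        text = line[3:].strip()
        if text == "end":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is None:
            # Start of a block, e.g., "option" or "option G_OPT_V_INPUT".
            kind, _, standard = text.partition(" ")
            current = {"block": kind, "standard": standard or None, "items": {}}
        else:
            # A "key: value" line inside of a block.
            key, _, value = text.partition(":")
            current["items"][key.strip()] = value.strip()
    return blocks


SAMPLE = """\
# %module
# % description: Converts vector data to raster data
# %end
# %option G_OPT_V_INPUT
# %end
# %option
# % key: value
# % type: double
# % required: yes
# %end
"""

blocks = read_interface_blocks(SAMPLE)
for block in blocks:
    print(block)
```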

Let's say we want to allow users of our tool to specify the raster value which is used where vector features are present. We will name it _value_ (`key: value`) and make it required (`required: yes`). The data type we will use is _double_ (`type: double`) which we can use as _float_ in Python. The following puts all these together:

```python
# %option
# % key: value
# % type: double
# % required: yes
# % description: Raster cell value where features are
# %end
```

Here is the full script. The values come as strings, so for raster value, we convert the string to float (`float(options["value"])`), although in this case we don't have to because we just pass it to the _v.to.rast_ subprocess and a string would work in that context too.

In [None]:
%%writefile vector_to_raster.py
#!/usr/bin/env python

# %module
# % description: Converts vector data to raster data
# %end
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_V_FIELD
# %end
# %option G_OPT_R_OUTPUT
# %end
# %option
# % key: value
# % type: double
# % required: yes
# % description: Raster cell value where features are
# %end

import grass.script as gs


def main():
    options, flags = gs.parser()
    vector_input = options["input"]
    vector_layer = options["layer"]
    raster_output = options["output"]
    value = float(options["value"])

    gs.run_command(
        "v.to.rast",
        input=vector_input,
        layer=vector_layer,
        output=raster_output,
        use="val",
        value=value,
    )


if __name__ == "__main__":
    main()

Now, let's add the new parameter `value=5`:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster.py input=firestations output=stations_value value=5

The data range of the raster is now 5-5, i.e., all values are 5:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.info map=stations_value -r

## Using the New Tool from Python

The tool can be used from Python just like the other GRASS tools.

Here is a Python script which creates a GRASS session and calls our new tool:

In [None]:
%%python
import subprocess
import sys

sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

import grass.script as gs
import grass.script.setup


def main():
    with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
        gs.run_command(
            "./vector_to_raster.py",
            input="firestations",
            output="stations_python",
            value=5,
        )


if __name__ == "__main__":
    main()

## Using Existing Interfaces for Generating Wrappers and Boilerplates

Often, a new tool wraps or extends an existing tool or is similar to one. To quickly generate boilerplate code in such cases, we can run any GRASS tool with `--script`. Unfortunately, `--script` does not currently output standard options, so the generated definitions are unnecessarily complicated.

Given that the same structure is needed every time, it is a good idea to use `--script` or copy-paste code from existing tools or examples. In the GRASS GIS source code, the Python scripts are under _[scripts](https://github.com/OSGeo/grass/tree/releasebranch_8_2/scripts)_ and _[temporal](https://github.com/OSGeo/grass/tree/releasebranch_8_2/temporal)_. Tools in the grass-addons repository are not organized by language, but many of the tools are in Python.

Here is how to get a Python script boilerplate from _v.to.rast_ (which itself is written in C):

In [None]:
!grass --tmp-location XY --exec v.to.rast --script

# Best Practices

These practices are for writing GRASS tools or scripts which can be used in the same way. For general, standalone scripts, most of these simply don't apply. However, the concepts and techniques might still be useful.

The assumption is that the tool is written in Python. However, the advice applies to other scripting languages as well, although, when the Python API is not available, the underlying mechanisms such as environment variables may need to be used directly. The same best practices apply to tools written in C in terms of behavior, but the way to achieve the desired behavior might be different.

## Use Standard Options in Interface

As mentioned before, a GRASS tool must use the GRASS parser to handle its command line parameters, so that it works with the other components such as the GUI.

To make writing parameters simpler and the interfaces more unified, use standard options. See [Parser standard options](https://grass.osgeo.org/grass82/manuals/parser_standard_options.html). For example, use this:

```python
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_R_OUTPUT
# %end
```

If needed, override values which need to be different:

```python
# %option G_OPT_V_INPUT
# % key: point_input
# % label: Name of input vector map with points
# % description: Points which are used for sampling the raster input
# %end
# %option G_OPT_R_INPUT
# % key: raster_input
# % label: Name of sampled raster map
# % description: Raster map which will be sampled by the points
# %end
```

Don't repeat the values when a standard option defines them. For example, don't use this if possible:

```python
# Example of where standard option would work better:
# %option
# % key: input
# % type: string
# % required: yes
# % multiple: no
# % key_desc: name
# % label: Name of input vector map
# % description: Or data source for direct OGR access
# % gisprompt: old,vector,vector
# %end
```

## Consider both Flags and Options to Modify Behavior

If the tool's behavior can be modified by users in some way, e.g., when its handling of nulls (no data) can be modified, a flag can be used to ask for that alternative behavior. Flags are like boolean options which default to false and whose names are only one character long. They are defined using:

```python
# %flag
# % key: n
# % description: Consider zeros to be null values
# %end
```

If the tool is used in Python, the flag would be used in the _flags_ parameter of _run_command_:

```python
gs.run_command(..., flags="n", ...)
```

In the command line, the flag would be used with a dash as `-n`.

However, options are often better because they improve readability, clarify the default behavior, and allow for extension of the interface.

Consider a tool which by default produces human-readable plain text output. Then you add JSON output which is enabled by a flag `j`. Later, you decide to add YAML output. This now needs to be flag `y` which needs to be exclusive with flag `j`. Soon, you have several related flags each exclusive with all the others. Using an option instead of a flag from the beginning allows the interface to accommodate more formats. In this example, an option named `format` can have the default value `plain` and the value `json` for JSON output. When you later add YAML, you simply add `yaml` to the possible values without a need for additional options or flags. The interface definition for the example would look like:

```python
# %option
# % key: format
# % type: string
# % required: yes
# % options: plain,json,yaml
# % label: Output format
# % descriptions: plain;Plain text output;json;JSON output;yaml;YAML output
# % answer: plain
# %end
```

Other typical cases where this applies include handling of computational region (e.g., `i` versus `extent=input`) or the aforementioned null handling (`n` for NULLs being zeros versus `nulls=zeros`, `nulls=nulls`, and `nulls=9999`).
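For the `format` example, the body of the tool might dispatch on the option value as in the following sketch. It assumes that the value arrives as a string, as it would in `options["format"]`; the function `format_result` and the sample data are made up for this illustration, and YAML is left out because it is not in the Python standard library.

```python
import json


def format_result(result, output_format="plain"):
    """Return the result dictionary as text in the requested format."""
    if output_format == "json":
        return json.dumps(result, indent=4)
    if output_format == "plain":
        return "\n".join(f"{key}: {value}" for key, value in result.items())
    # A new format means a new branch here and a new value in the interface.
    raise ValueError(f"Unsupported format: {output_format}")


result = {"min": 1, "max": 5}  # Made-up values for the example.
plain_text = format_result(result)
json_text = format_result(result, output_format="json")
print(plain_text)
print(json_text)
```

Adding another format later changes only this function and the list of possible values in the interface definition, not the set of flags.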

## Input and Output Geospatial Data Format

A tool should read and write geospatial data as GRASS raster or vector maps. Tools should generally use input geospatial data which is in the current GRASS location. Importing data from other formats should generally be left to dedicated import tools, e.g., _v.import_. The same applies to outputs and export of data. The obvious exceptions are the tools for import and export of data themselves, e.g., _r.in.xyz_.

The processing and analytical tools can then use simple names referring to the data in GRASS location instead of file paths. Here is an example of using existing _boundaries_ vector data and outputting new _boundaries_ raster data:

```
v.to.rast input=boundaries output=boundaries use=val
```

This follows _separation of concerns_: Format conversion and CRS transformations are separate from analysis.

## Geospatial Inputs Versus Outputs

Generally, inputs should not be modified and the results of computation should go into newly created outputs. This rule usually holds for raster data processing and more often than not for vector data processing, too.

If the tool adds to existing data or modifies a specific, especially auxiliary, part of existing data, it may modify the existing input data instead of creating a copy with modifications. Good examples are adding an attribute column or modifying a color table.

## Overwriting Existing Data

A tool should not overwrite existing data unless specified by the user using the `--overwrite` flag. For most cases, this is managed automatically by the GRASS command line parser. For raster and vector maps and files in general, the parser automatically checks their existence and ends the tool execution with a proper error message in case the output already exists. If the flag is set by the user (`--overwrite` in command line, `overwrite=True` in Python), the parser enables the overwriting for the whole tool.

The `--overwrite` flag can be globally enabled in GUI (_Settings > Preferences > Tools > Allow output files to overwrite existing files_), in Python scripts by `os.environ["GRASS_OVERWRITE"] = "1"`, or in general by setting environment variable `GRASS_OVERWRITE` to `1`. Notably, the GRASS session from _grass.jupyter_ sets `GRASS_OVERWRITE` to `1` to enable re-running of the cells and notebooks.

## Mapsets

Output data should always be written to the current mapset. This is ensured by built-in GRASS mechanisms, so there is nothing which needs to be done in the tool.

If a tool modifies inputs, rules for outputs apply, i.e., the input must be in the current mapset.

If the tool is not modifying the inputs, the tool should accept inputs from any mapset in the current location. Tools should rely on existing GRASS mechanisms to determine which mapset the data is in. If the user-provided names are simply passed to other GRASS tools, there is nothing to do in the tool itself.

The user-provided name may or may not include mapset name. If the name is used to create, e.g., column names, the mapset needs to be resolved explicitly and possibly separated from the rest of the name. In Python, this can be done using the following code:

```python
file_info = gs.find_file(user_provided_name, element="cell")  # Cell means raster here.
full_name = file_info["fullname"]
name = file_info["name"]
mapset = file_info["mapset"]
```
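When only the text of the name matters, e.g., for deriving a column name, a name qualified as `name@mapset` can also be split with plain string operations. The following sketch is an illustration only: the function `split_map_name` is made up for this example and, unlike `gs.find_file`, it does not check that the data exists or determine the mapset of an unqualified name.

```python
def split_map_name(name, default_mapset=None):
    """Split a name in the form name@mapset into its two parts."""
    base, separator, mapset = name.partition("@")
    if not separator:
        # No mapset was given by the user.
        return base, default_mapset
    return base, mapset


base, mapset = split_map_name("elevation@PERMANENT")
# Use only the base name to build, e.g., a column name.
column = f"{base}_mean"
print(base, mapset, column)
print(split_map_name("elevation"))
```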

Good reasons to use mapsets explicitly in a tool include parallel processing (individual processes running separately in temporary mapsets), bulk processing, and, obviously, managing mapsets.

## Computational Region

Tools should not change the computational region. This is done by specific tools, especially, by _g.region_.

Raster processing tools should respect the current computational region. Vector processing tools may use the current computational region, e.g., to select a subset of the input data.

Users should be able to re-run a command or workflow with different computational regions to, e.g., test processing in a small area and then move to a larger one.

One exception to the rule of respecting the computational region is import, where respecting the region is optional. The usual expectation is that the data is imported as is, i.e., its native extent and resolution are respected, and respecting the region may be implemented as an optional feature. This is to avoid, e.g., importing data at a finer resolution than the native resolution of the data. Tools should behave appropriately for the input data; for example, importing only the extent given by the current region may be appropriate when importing a global dataset.

Another exception is raster processing where alignment of the cells plays a crucial role and there is a clear answer to how the alignment should be done. In that case, the tool may change the resolution.

Some tools, such as _r.mapcalc_, opt for providing additional computation region handling policies. However, these require custom implementation and the general practice is to simply leave computation region handling out of the tool.

Finally, some operations, e.g., creating metadata, are meant to use all the data. These operations should not use the current computational region.

If a tool needs to change the computational region for part of the computation, the temporary region in the Python API is the simplest way to do that:

```python
gs.use_temp_region()  # From now on, use a separate region in the script.
# Set the computational region with g.region as needed.
```

This makes any changes done in the tool local for the tool without influencing other tools running in the same session.

The ultimate tool to change the computational region in a Python tool is the `GRASS_REGION` environment variable which is passed to subprocesses. (This generally works for any script, but not for tools which are using C libraries.) Python API has functions which help with the setup:

```python
os.environ["GRASS_REGION"] = gs.region_env(raster=input_raster)
```

If different subprocesses need different regions, use different environments:

```python
env = os.environ.copy()
env["GRASS_REGION"] = gs.region_env(raster=input_raster)
gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope, env=env)
```

This approach makes the computational region completely safe for parallel processes as no region-related files are modified.
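The isolation provided by separate environments can be checked without GRASS GIS. In the following sketch, each subprocess is a plain Python interpreter which only reports the value of `GRASS_REGION` it received. The helper `report_region` and the placeholder strings are made up for this illustration; in a real tool, the value would come from `gs.region_env` and the subprocess would be a GRASS tool.

```python
import os
import subprocess
import sys


def report_region(region_value):
    """Run a subprocess with its own GRASS_REGION and return what it sees."""
    # Copy the environment so that the current process is not modified.
    env = os.environ.copy()
    env["GRASS_REGION"] = region_value
    code = "import os; print(os.environ['GRASS_REGION'])"
    return subprocess.check_output(
        [sys.executable, "-c", code], env=env, text=True
    ).strip()


# Placeholder strings stand in for actual region definitions.
first = report_region("placeholder for region A")
second = report_region("placeholder for region B")
print(first)
print(second)
```

Each subprocess sees only its own value, and the environment of the calling process stays untouched.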

## Mask

GRASS GIS has a global mask managed by the _r.mask_ tool and represented by a raster called `MASK`. Tools should not set or remove the global mask. Raster tools called as subprocesses will automatically respect the globally set mask when reading the data. For outputs, respecting the mask is optional.

If the tool can't avoid setting the mask internally, it should check for presence of the mask and fail if the mask is present. The tools should not remove and later restore the original mask because that creates confusing behavior for interactive use and breaks parallel processing.

In addition to the global mask, tools may implement additional mask inputs, e.g., to limit interpolation of points. Interaction of the additional masking with the global mask should be documented. 

Generally, any mask behavior should be documented unless it is the standard case where masked cells don't participate in the computation and are represented as NULL cells (no data) in the output.

The future versions of GRASS GIS will include improved global mask handling for use in tools and parallel processing and a tool to determine mask status (see PRs [2390](https://github.com/OSGeo/grass/pull/2390) and [2392](https://github.com/OSGeo/grass/pull/2392)).

## Additional Parameters for Vector Data

A tool with a GRASS interface which takes a vector map as an input should have at least the input and layer parameters:

```python
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_V_FIELD
# %end
```

In GRASS GIS, a vector map is specified by its name, and a layer number selects a subset of that map. Most cases are covered by the default, which is `1`, but multiple layers can be present in one vector map, which allows for the creation of complex data structures. A layer can also have an associated database link which connects it to an attribute table. Where only geometry is used and attributes are not present or are ignored, `-1` is used to denote all layers.

Additionally, the presence of the layer parameter covers cases where OGR-readable data in a matching CRS are used directly through the OGR pseudo-mapset:

```bash
grass8 ~/grassdata/nc_basic_spm_grass7/foss4g --exec \
    v.to.rast input="~/data/project_shapefiles@OGR" layer=all_sites output=sites use=val
```

When there is more than one input, the additional parameters should be included for each input where applicable, and the parameter names should be changed to avoid duplication.

Sometimes, layer is needed for output, but usually it is not.

Additionally, if it is possible, e.g., the underlying tools support it, the input vector should ideally also have:
- type specifying geometry types (`G_OPT_V_TYPE`)
- cats specifying category numbers (identifiers) of features to select (`G_OPT_V_CATS`)
- where specifying SQL WHERE clause expression (`G_OPT_DB_WHERE`)
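A possible way to forward these parameters to an underlying tool is sketched below. The helper `selection_parameters` is hypothetical; it relies on the parser returning an empty string for options which were not set and uses a plain dictionary in place of the result of `gs.parser()`.

```python
# %option G_OPT_V_TYPE
# %end
# %option G_OPT_V_CATS
# %end
# %option G_OPT_DB_WHERE
# %end


def selection_parameters(options):
    """Pick feature selection parameters which the user actually set.

    Options without a value are empty strings, so they are skipped
    instead of being passed to the underlying tool.
    """
    keys = ("layer", "type", "cats", "where")
    return {key: options[key] for key in keys if options.get(key)}


# A dictionary shaped like the options returned by gs.parser().
options = {
    "input": "firestations",
    "layer": "1",
    "type": "point",
    "cats": "",
    "where": "cat < 10",
}
selection = selection_parameters(options)
```

The result can be unpacked into a call of the underlying tool, e.g., as `**selection` in `gs.run_command`, assuming that tool supports the same parameters.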

## Temporary Maps

Using temporary maps is preferred over using temporary mapsets. This follows the rule that writing should be done only to the current mapset. Some users may have write permissions only for their mapsets, but not for creating other mapsets.

The following script creates a temporary name using _append_node_pid_, which uses the node (computer) name and the process identifier to create a unique but identifiable name. The temporary maps are removed when the script ends thanks to adding the removal function to the exit procedures using _atexit.register_.

In [None]:
%%writefile buffered_vector_to_raster.py
#!/usr/bin/env python

# %module
# % description: Converts vector data to raster data
# %end
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_V_FIELD
# %end
# %option G_OPT_R_OUTPUT
# %end
# %option
# % key: buffer
# % type: double
# % required: yes
# % description: Buffer around vector features
# %end

import atexit
import subprocess
import sys

import grass.script as gs
import grass.script.setup


def remove(name):
    gs.run_command("g.remove", type="vector", name=name, flags="f", quiet=True, errors="ignore")


def main():
    options, flags = gs.parser()
    vector_input = options["input"]
    vector_layer = options["layer"]
    raster_output = options["output"]
    buffer = options["buffer"]

    temporary = gs.append_node_pid("tmp_buffer")
    atexit.register(remove, temporary)

    gs.run_command("v.buffer", input=vector_input, layer=vector_layer, output=temporary, distance=buffer)
    gs.run_command("v.to.rast", input=temporary, layer=vector_layer, output=raster_output, use="val")

if __name__ == "__main__":
    main()

## Data Processing History

Tools should record processing history to the output data.

For vectors:

```python
gs.vector_history(output)
```

For rasters:

```python
gs.raster_history(output, overwrite=True)
```

In [None]:
%%writefile vector_to_raster_with_history.py
#!/usr/bin/env python

# %module
# % description: Converts vector data to raster data
# %end
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_V_FIELD
# %end
# %option G_OPT_R_OUTPUT
# %end

import subprocess
import sys

import grass.script as gs
import grass.script.setup

def main():
    options, flags = gs.parser()
    vector_input = options["input"]
    vector_layer = options["layer"]
    raster_output = options["output"]
    
    gs.run_command("v.to.rast", input=vector_input, layer=vector_layer, output=raster_output, use="val")
    gs.raster_history(raster_output, overwrite=True)

if __name__ == "__main__":
    main()

Set execute permissions:

In [None]:
!chmod u+x ./vector_to_raster_with_history.py

Run the tool:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec ./vector_to_raster_with_history.py input=firestations output=stations_with_history --o

Show the history metadata information:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.info -h stations_with_history

Compare with the history of a raster we created earlier without managing the history:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.info -h stations

## Memory and Cores

If the underlying tools support `memory` or `nprocs` parameters, the tool should expose those in the interface and pass the values to the underlying tools.

If possible, a whole raster should not be loaded into memory so that the data size is not limited by machine memory. If the size of the processed chunks can vary, the memory consumption should be driven by a `memory` parameter. For Python tools, the proper behavior is usually taken care of by the underlying tools.

The standard option `G_OPT_M_NPROCS` should be used to specify the maximum number of cores (processes, threads) the tool will use. By default, only one core should be used, or more precisely, the standard option `G_OPT_M_NPROCS` should be used with its default value. Clearly limiting the cores avoids taking more resources than the user expects.
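Passing these values along could look like the following sketch. The helper `resource_parameters` is hypothetical, and the use of the standard option `G_OPT_MEMORYMB` for the `memory` parameter is an assumption; the keys `nprocs` and `memory` need to match what the underlying tool accepts.

```python
# %option G_OPT_M_NPROCS
# %end
# %option G_OPT_MEMORYMB
# %end


def resource_parameters(options, supported=("nprocs", "memory")):
    """Collect resource-related values to pass to an underlying tool.

    Only the parameters listed in supported are included, so a tool
    which has nprocs but no memory parameter can be handled too.
    """
    return {key: int(options[key]) for key in supported if options.get(key)}


# A dictionary shaped like the options returned by gs.parser().
options = {"nprocs": "4", "memory": "300"}
both = resource_parameters(options)
only_cores = resource_parameters(options, supported=("nprocs",))
```

The resulting dictionary can be unpacked into the call of the underlying tool, so the tool never uses more cores or memory than the user requested.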

# Document, Compile, Install, and Distribute

There are a few more steps to make the tool better for users and easy to distribute. Some of these steps are required for inclusion into the grass-addons repository.

## Addons Repository

- Community-maintained tools (addons aka extensions aka plugins)
- Separate from the main repository, but only one repository
- A repository with the source code, not just a registry
- Best of both worlds:
  * Broader community of contributors (including one-time contributors)
  * Single repository maintained by the core community similarly to the main repository

## Name

Tool name should follow the existing conventions for naming GRASS tools. Tools are organized into categories (families) based on their function. The categories are distinguished by prefixes:

| Prefix | Function | Example |
| --- | --- | --- |
| r | raster processing | r.mapcalc: map algebra |
| v | vector processing | v.clean: topological cleaning |
| i | imagery processing | i.segment: object recognition |
| db | database management | db.select: select values from table |
| r3 | 3D raster processing | r3.stats: 3D raster statistics |
| t | temporal data processing | t.rast.aggregate: temporal aggregation |
| g | general data management | g.rename: renames map |
| d | display | d.rast: display raster map |
| ps | PostScript rendering | ps.map: create map compositions |
| m | miscellaneous | m.proj: convert coordinates |

The name of a tool helps to understand its function. For example, _v.in.lidar_ starts with `v`, so it deals with vector maps, continues with `in`, which indicates that the tool imports data into the GRASS GIS spatial database, and ends with `lidar`, which indicates that it deals with lidar point clouds.

Generally, the idea is to include only one or two dots. All core tools comply with this rule, but some addons break it. Sometimes, the name suggests further grouping. For example, tools starting with `v.net` deal with vector network analysis and tools starting with `g.gui` open GUI applications.

Tools with non-compliant names will generally work, but may not make full use of some tools such as the GUI which uses the naming scheme to recognize GRASS tools in some contexts.
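The naming scheme can be illustrated with a short sketch which reads the family from the prefix and counts the dots. The function `describe_tool_name` is hypothetical and only encodes the table above; it is not part of the GRASS GIS API.

```python
FAMILIES = {
    "r": "raster processing",
    "v": "vector processing",
    "i": "imagery processing",
    "db": "database management",
    "r3": "3D raster processing",
    "t": "temporal data processing",
    "g": "general data management",
    "d": "display",
    "ps": "PostScript rendering",
    "m": "miscellaneous",
}


def describe_tool_name(name):
    """Return the family and the number of dots for a tool name."""
    prefix, dot, rest = name.partition(".")
    if not dot or not rest or prefix not in FAMILIES:
        raise ValueError(f"{name} does not start with a known family prefix")
    return FAMILIES[prefix], name.count(".")


family, dots = describe_tool_name("v.in.lidar")
```

A name such as `vector_to_raster` raises an error here, which is a hint that the script needs to be renamed before it becomes a tool.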

## Directory Structure

The tool is in a directory with the same name as the tool. The Python file has the name of the tool and the `.py` extension.

So far we have only a Python file, but all other files are in the directory as well. There can be subdirectories for special purposes; for example, tests are in a subdirectory.

Now, we create the directory and rename our script from the previous notebook to comply with the rules:

In [None]:
!mkdir -p v.buffered.raster
!cp vector_to_raster.py v.buffered.raster/v.buffered.raster.py

## Keywords

Keywords are part of the basic description of the tool and its interface. The first two keywords are special. The first keyword is the tool family, so, e.g., for vector tools, which have names starting with `v.`, the keyword is `vector`. The second keyword is a topic, which is highlighted in the documentation more than other keywords. If possible, tools should use one of the [existing topics](https://grass.osgeo.org/grass82/manuals/topics.html). A tool should have at least one other keyword. These can include other data types the tool works with, the name of the specific process it implements, or synonyms for the terms used in its name and description. Keywords can contain more than one word and can be understood as general labels or tags as long as they help to identify the tool in searches. Reuse of [existing keywords](https://grass.osgeo.org/grass82/manuals/keywords.html) is encouraged. Keywords in Python are specified as follows:

```python
# %module
# % description: Converts vector data to raster data
# % keyword: vector
# % keyword: conversion
# % keyword: raster
# % keyword: rasterization
# %end
```
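The rules for keywords can be checked with a few lines of Python. This is only a sketch: the functions `read_keywords` and `keyword_problems` are hypothetical helpers which read the interface comments as plain text, and the expected family keyword is passed in by the caller.

```python
def read_keywords(source):
    """Return keywords from the interface definition comments of a script."""
    keywords = []
    for line in source.splitlines():
        text = line.lstrip("#").strip()
        if text.startswith("% keyword:"):
            keywords.append(text.split(":", 1)[1].strip())
    return keywords


def keyword_problems(keywords, family_keyword):
    """Return a list of problems found in the keywords (empty if none)."""
    problems = []
    if len(keywords) < 3:
        problems.append("family, topic, and at least one other keyword expected")
    if not keywords or keywords[0] != family_keyword:
        problems.append(f"first keyword should be {family_keyword}")
    return problems


header = """\
# %module
# % description: Converts vector data to raster data
# % keyword: vector
# % keyword: conversion
# % keyword: raster
# % keyword: rasterization
# %end
"""
keywords = read_keywords(header)
problems = keyword_problems(keywords, family_keyword="vector")
```

For the header above, the first keyword is the family, the second one is the topic, and the list of problems is empty.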

## Documentation

A file with documentation which uses simple HTML syntax must be provided. This documentation is then distributed with the addon and it is also automatically available online ([GRASS GIS Addons Manual pages](https://grass.osgeo.org/grass82/manuals/addons/)). A template with the main sections follows (the syntax highlighting does not work in a notebook in JupyterLab, only in a separate editor tab).

In [None]:
%%writefile v.buffered.raster/v.buffered.raster.html
<h2>DESCRIPTION</h2>

A long description with details about the method, implementation, usage or whatever is appropriate.

<h2>NOTES</h2>

Random notes, tricks, and quirks which don't fit above.

<h2>EXAMPLES</h2>

Examples of how the tool can be used alone or in combination with other tools.
Possibly using the GRASS North Carolina State Plane Metric sample Location.
At least one screenshot (PNG format) of the result should be provided to show the user what to expect.

<h2>REFERENCES</h2>

Reference or references to papers related to the tool or to the algorithm the tool is based on.

<h2>SEE ALSO</h2>

List of related or similar GRASS tools or tools used together with this tool as well as any related websites, or
related pages at the GRASS GIS User wiki.

<h2>AUTHORS</h2>

List of author(s), their organizations and funding sources.

The name of the HTML file should be the name of the tool with `.html` extension.

It is a good idea to include one or more images to enhance the documentation. These should be PNGs, GIFs, or JPEGs, but if there are original files used to generate the images, these should be included as well (SVGs, scripts, notebooks). PNGs are preferred. GIFs are for animations. JPEGs are for photographs.

The image files should be named uniquely. The best practice is to use the name of the tool with underscores instead of dots, followed by an optional suffix. A PNG named like this without any additional suffix may be used as an image representing the tool (this is currently done for tools in the main repository). All extensions should be lowercase (e.g., `.png`). The extension recognized for JPEG is `.jpg`.
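The naming practice for images can be written down as a small function. The name `image_file_name` and its parameters are made up for this sketch.

```python
def image_file_name(tool_name, suffix=None, extension="png"):
    """Create an image file name from a tool name.

    Dots are replaced by underscores, the optional suffix is appended,
    and the extension is always lowercase.
    """
    base = tool_name.replace(".", "_")
    if suffix:
        base = f"{base}_{suffix}"
    return f"{base}.{extension.lower().lstrip('.')}"


main_image = image_file_name("v.buffered.raster")
extra_image = image_file_name("v.buffered.raster", suffix="example", extension=".JPG")
```

The first name has no suffix, so it is the one which may be picked up as the image representing the tool.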

Optionally, a _README.md_ file can be included if some files or other aspects of the tool need more explanation which does not fit into any of the other files, e.g., when extra instructions are needed for re-creating the images or for maintenance of the code.

## Formal Requirements

Tools included in the grass-addons repository must be under the GNU GPL license, version 2 or later (SPDX: GPL-2.0-or-later). The first comment in the tool's main file should follow a specified form. Here is a template for the first lines of a file:

```python
#!/usr/bin/env python

##############################################################################
# MODULE:    vector_to_raster
#
# AUTHORS:   Alice Doe <email AT some domain>
#            Bob Doe <email AT some domain>
#
# PURPOSE:   Describe your script here from maintainer perspective
#
# COPYRIGHT: (C) 2022 Alice Doe and the GRASS Development Team
#
#            This program is free software under the GNU General Public
#            License (>=v2). Read the file COPYING that comes with GRASS
#            for details.
##############################################################################

"""Describe your script here from Python user perspective"""
```

## Compilation

Although Python is not a compiled language like C, we also need to compile Python tools in order to include them in our GRASS installation, make them executable without the file extension, and create the HTML documentation. For this, a `Makefile` needs to be written, which follows a standard template as well. The included `Script.make` takes care of processing everything, given that the Python script, the HTML documentation, and any optional screenshots in PNG format are present in the same directory. Installed tools will show up in the GUI.

In [None]:
%%writefile v.buffered.raster/Makefile
MODULE_TOPDIR = ../..

PGM = v.buffered.raster

include $(MODULE_TOPDIR)/include/Make/Script.make

default: script

To compile, the low-level `make` command can be used, but it is easier to make use of the installation mechanism of _g.extension_, which compiles and installs on Linux and macOS (and all Unix-like systems).

On Linux and macOS:

In [None]:
!grass --tmp-location XY --exec g.extension v.buffered.raster url=v.buffered.raster

In [None]:
!grass --tmp-location XY --exec which v.buffered.raster

The best way to get the tools to Windows users is to include them in the grass-addons repository. (Experimentally, it is also possible to setup a private institution-specific repository like the grass-addons repository.)

Code can be hosted on GitHub or another platform. _g.extension_ supports installation from many sources, but it needs compilation tools which are not available on Windows, so this works only for Linux and macOS.

## Submitting to the GRASS GIS Addons Repository

Create a pull request to the [grass-addons repository](https://github.com/OSGeo/grass-addons/) (instructions are there). Wait for someone to review it or convince someone to do that. When issues from the review are addressed, the reviewer will merge it.

Finally, check the [Submitting Guidelines](https://trac.osgeo.org/grass/wiki/Submitting) (both for updating your files and updating the guidelines themselves).

PR reviews are time consuming, so make it easier for the reviewer by checking the best practices yourself. And yes, you can become a reviewer and get access to the grass-addons repository too. That's actually much simpler than getting write access to the main repository.

## Testing Your Code

One way to speed up the review process is to include tests of your tool. This not only demonstrates that the tool works, but it also makes it easier to maintain the code in the future.

### grass.gunittest tests

- Based on highly customized extension of the standard Python _unittest_ package.
- Code runs in an existing GRASS session.
- Can assume the sample NC SPM location is the current location.
- Many specialized functions for GRASS GIS, especially specialized asserts.

The readily available real-world test data and assert functions specialized for GRASS GIS make _grass.gunittest_ a great tool for tests of data processing tools.

### pytest tests

- Use _pytest_ as is.
- There are no specialized functions for GRASS GIS yet.
- Fixtures and comparisons need to be written using basic functions.
- No GRASS session or data.

The lack of any setup may work well for tests of tools which are not doing standard processing. An increasing number of projects and people are migrating to _pytest_, so you may simply prefer that.
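Since there is no GRASS session, _pytest_ fits best for code which does not need one. The following sketch tests a hypothetical helper function `parse_buffer` (not part of the tools above) using plain `assert` statements. The test functions follow the `test_` naming which _pytest_ discovers automatically; with _pytest_ installed, `pytest.raises` would be the more common way to write the second test.

```python
def parse_buffer(value):
    """Convert the buffer option value to a non-negative float."""
    distance = float(value)
    if distance < 0:
        raise ValueError("buffer must not be negative")
    return distance


def test_parse_buffer_converts_text():
    """Option values are strings, so the text needs to be converted."""
    assert parse_buffer("12.5") == 12.5


def test_parse_buffer_rejects_negative():
    """Negative values should be reported as an error."""
    try:
        parse_buffer("-1")
    except ValueError:
        return
    raise AssertionError("negative buffer was accepted")
```

When the code is saved in a file with a name starting with `test_`, running `pytest` in that directory executes both tests.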

## Grouping Related Tools

### Naming

Just use names which start with the same prefix or contain the same words, but keep things separate.

### Common Directory

If it makes sense to use each tool individually, the tools can be in separate directories.

One directory with multiple tools in subdirectories works well even for a mix of Python and C tools. There should be an additional HTML documentation file in the top directory. The name of the directory should be the common prefix in the name of the tools. If there is no common prefix, the tools should likely be separate.

However, going a step further and including common libraries in this structure is more complicated and the functionality is not as stable as it should be. Creating a common library is not worth the trouble for a couple of short functions shared by a couple of tools, but it may be better for maintenance of a large library of broadly used functions. Note that the tools can also call each other, possibly avoiding the need for Python imports.

### Experimental Toolboxes

#### Addon Toolbox

Multiple tools can be listed together in an XML file and _g.extension_ can show and install this toolbox.

#### GUI Toolbox

Multiple tools can be listed together in an XML file which is stored in user configuration directory. The GUI adds these toolboxes to the tree in the _Tools_ tab.

# GRASS Tool in C

For some tools, C (or C++) is a good choice of language because an algorithm or model requires fine work with the data. For example, a loop in Python over all vector points or raster cells may be prohibitively slow. When this cannot be compensated by available Python libraries or use of GRASS C libraries through the ctypes API or grass.pygrass wrappers, writing a tool directly in C is the most common approach.

GRASS tools written in C (and C++) can be distributed in exactly the same way as tools in Python. For Windows, they are compiled on project servers. For other systems, they are compiled on the user machine. Python, C, and C++ tools can be and are being published in the grass-addons repository with users noticing the difference only in case of bugs.

## Structure of a C Tool

For small tools, there is no difference compared to Python tools except that the file with the source code is called `main.c`. There are still three files: source code, documentation, and a Makefile.

Explore the directory with an example tool called _r.example.twice_ which multiplies values in a raster map by two:

In [None]:
!ls r.example.twice/

Open _r.example.twice/main.c_ (using File Browser in JupyterLab).

The file with the _main_ function should be called `main.c`. There can be multiple `.c` and `.h` files.
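The expected directory content for both Python and C tools can be summarized by a short check. This is a sketch with hypothetical function names which only covers the three basic files, not additional source files, images, or tests.

```python
from pathlib import Path


def expected_files(tool_name, language="python"):
    """List the basic files of a tool: source code, documentation, Makefile."""
    source = f"{tool_name}.py" if language == "python" else "main.c"
    return [source, f"{tool_name}.html", "Makefile"]


def missing_files(directory, language="python"):
    """Return the basic files which are not present in the tool directory.

    The directory name is taken as the name of the tool.
    """
    directory = Path(directory)
    return [
        name
        for name in expected_files(directory.name, language)
        if not (directory / name).is_file()
    ]


c_tool_files = expected_files("r.example.twice", language="c")
python_tool_files = expected_files("v.buffered.raster")
```

Calling `missing_files("r.example.twice", language="c")` in the directory of this notebook then reports which of the files still need to be created.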

## Compile and Install

Here, we will compile and install the tool in the same way as a Python tool using _g.extension_: 

In [None]:
!grass --tmp-location XY --exec g.extension r.example.twice url=r.example.twice

In the case of C, compilation is necessary to execute the program, so typically, the low-level command _make_ is used to do the compilation. The specific use may differ based on the development environment, but often you may want to compile the whole GRASS GIS yourself and simply add the compiled tool to that build instead of installing it as an addon (which is what happens with _g.extension_). In that case, _make_ is used with the `MODULE_TOPDIR` variable set to where the GRASS source code is, e.g.:

```bash
make MODULE_TOPDIR=~/Projects/grass/
```

If you are calling `make install` to install GRASS GIS after compilation, you need to do the same with your code:

```bash
make install MODULE_TOPDIR=~/Projects/grass/
```

This assumes you are developing a tool for grass-addons. If you are developing a tool for the main repository, `MODULE_TOPDIR=...` is not needed.

Although this is most useful for developing tools in C, the same applies for Python tools as well.

Using _g.extension_ with the `-d` flag gives a suggestion of a setup for complex cases where _g.extension_ or the usage of _make_ above are not enough.

## Test of the Interface

The resulting interface is the same as for Python tools, for example `--help` works:

In [None]:
!grass --tmp-location XY --exec r.example.twice --help

## Test of the Computation

Let's test the tool with raster _elevation_. First, check its metadata and then set the computational region to it:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.info map=elevation
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec g.region raster=elevation

Now run the _r.example.twice_ tool (we use `--o` in the example as shorthand for `--overwrite`, so we can re-run the example multiple times):

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.example.twice input=elevation output=test_twice --o

Minimum and maximum in metadata of the new raster should be double the original value:

In [None]:
!grass ~/grassdata/nc_basic_spm_grass7/foss4g --exec r.info map=test_twice

We will use a subprocess to render the result as a PNG image (without modifying our current environment):

In [None]:
%%python
import subprocess
import sys

sys.path.append(
    subprocess.check_output(["grass", "--config", "python_path"], text=True).strip()
)

import grass.script as gs
import grass.jupyter as gj
import grass.script.setup  # Needed only in 8.2 and older.

with grass.script.setup.init("~/grassdata/nc_basic_spm_grass7/foss4g") as session:
    ortho_map = gj.Map()
    ortho_map.d_rast(map="test_twice")
    # Save the image (in a standard notebook, we would just display the image now).
    ortho_map.save("test_twice.png")

Display the image in the notebook:

In [None]:
from IPython.display import Image

Image("test_twice.png")