# PyQGIS: Expanding QGIS's functionality with Python
# Day 2 – Processing and plugins

Yesterday, you learned the basics of PyQGIS: how to run code from the embedded Python console, how to use the central classes (such as QgsVectorLayer) and how to run operations on layer geometries and attributes. However, the strength of QGIS is its vast library of geospatial algorithms (both native and those added by plugins, such as GRASS) coupled with general data crunching tools.

In these practicals, we will learn how to run processing algorithms via PyQGIS, including chaining multiple operations together. We will create automated processes using the graphical Model Builder and link them with our Python code. The point of these exercises is to learn how to use Python in QGIS to create reproducible and easily shareable tools for spatial analysis needs.

In the final section we will look at plugin development in QGIS: this section will teach you the basics of creating plugins with graphical user interfaces.

### Preparations
Open QGIS and load all the layers from _practical_data.gpkg_. Do this via the GUI or, if you want to do the same programmatically, use the script below. The script is adapted [from the PyQGIS Cookbook](https://docs.qgis.org/latest/en/docs/pyqgis_developer_cookbook/cheat_sheet.html#layers).

In [None]:
# EXAMPLE PATH: define the actual path on your system
gpkg_path = "C:/Users/tatu/pyqgis_practical/data/practical_data.gpkg" # windows
gpkg_layer = QgsVectorLayer(gpkg_path, "whole_gpkg", "ogr")

# returns a list of strings describing the sublayers
sub_strings = gpkg_layer.dataProvider().subLayers()
# EXAMPLE: 1!!::!!Paavo!!::!!3027!!::!!MultiPolygon!!::!!geom!!::!!
# !!::!! separates the values

for sub_string in sub_strings:
    layer_name = sub_string.split(gpkg_layer.dataProvider().sublayerSeparator())[1]
    uri = "{0}|layername={1}".format(gpkg_path, layer_name)
    # Create layer
    sub_vlayer = QgsVectorLayer(uri, layer_name, 'ogr')
    # Add layer to map
    if sub_vlayer.isValid():
        QgsProject.instance().addMapLayer(sub_vlayer)
    else:
        print("Can't add layer", layer_name)
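The string handling in the loop above can also be tried outside QGIS. The sketch below is plain Python and assumes the separator is the literal `!!::!!` shown in the example comment; `sublayer_uri` is a hypothetical helper name for illustration, not part of PyQGIS.

```python
# Plain-Python sketch: no QGIS needed. The separator is assumed to be the
# literal "!!::!!" from the example comment; inside QGIS, prefer
# gpkg_layer.dataProvider().sublayerSeparator().
SEPARATOR = "!!::!!"


def sublayer_uri(gpkg_path, sub_string, separator=SEPARATOR):
    """Build an OGR layer URI from one sublayer description string."""
    parts = sub_string.split(separator)
    layer_name = parts[1]  # the second value is the layer name
    return "{0}|layername={1}".format(gpkg_path, layer_name)


example = "1!!::!!Paavo!!::!!3027!!::!!MultiPolygon!!::!!geom!!::!!"
uri = sublayer_uri("practical_data.gpkg", example)
print(uri)
```

The resulting string has the same `path|layername=name` form that the loop passes to `QgsVectorLayer`.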

Next, make sure you have the processing toolbox open in the GUI. If not, open it from the drop down menu _Processing > Toolbox_. If the toolbox is unavailable for some reason, [follow these instructions](https://docs.qgis.org/3.22/en/docs/training_manual/processing/set_up.html) to remedy.

### Model builder and processing algorithms
1. Processing algorithms with PyQGIS
2. Writing a custom processing script
3. Model builder
4. Linking the script with model builder

### A look at plugin development
1. Processing plugins
2. GUI plugins

## Basics of processing algorithms
The processing framework is a collection of helpful functions for topics ranging from network analysis to raster terrain analysis. Both native and custom algorithms can be called programmatically, but doing so requires us to know a few things, namely the algorithm name and its parameters.

The simplest way to access these is by letting QGIS create the initial code for us.

### TASK
- Using the GUI, run the _Simplify_ algorithm, found under _Vector geometry_. The easiest way to find it is to use the search bar.
    - Change _Tolerance_ to 5000, otherwise keep the default settings.
    - Run the algorithm. It should add a new memory layer with simplified geometry to the project.
    
![Simplify algorithm in gui](images/simplify_algorithm_settings.JPG)

After running the algorithm, click on the _Processing_ drop-down menu and select _History_. The opened window contains a list of previous processing runs (under ALGORITHM). 

Select the latest one. The algorithm call is displayed in the text box below the list and looks something like this (formatted for clarity):

In [None]:
processing.run("native:simplifygeometries", 
               {'INPUT':'C:/Users/tatu/pyqgis_practical/data/practical_data.gpkg|layername=NUTS2_FIN_pop',
                'METHOD':0,
                'TOLERANCE':5000,
                'OUTPUT':'TEMPORARY_OUTPUT'})

What we have here is a call to the processing framework. Specifically, we name the algorithm and pass a dictionary of algorithm parameters (if you need a refresher of Python dicts, [check this out](https://www.w3schools.com/python/python_dictionaries.asp)). 

Notice that the parameters are identical to the ones we defined graphically. You might wonder, though, how one could know that _Distance: Douglas-Peucker_ maps to method 0 without first running the algorithm graphically. The processing framework has a function for describing algorithms:

In [None]:
>>> processing.algorithmHelp("native:simplifygeometries")

In [None]:
Simplify (native:simplifygeometries)

This algorithm simplifies the geometries in a line or polygon layer. 
---
METHOD: Simplification method

	Parameter type:	QgsProcessingParameterEnum

	Available values:
		- 0: Distance (Douglas-Peucker)
		- 1: Snap to grid
		- 2: Area (Visvalingam)
---

There we have it. Use this function if you want to learn more about the algorithms and the parameters.
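To avoid sprinkling bare numbers like `0` through a script, the enum values from the help output can be kept in a small lookup. This is only a sketch: `SIMPLIFY_METHODS` and `simplify_params` are hypothetical names, and the values are copied from the `algorithmHelp` output above.

```python
# Enum values copied from the algorithmHelp output for native:simplifygeometries.
SIMPLIFY_METHODS = {
    "Distance (Douglas-Peucker)": 0,
    "Snap to grid": 1,
    "Area (Visvalingam)": 2,
}


def simplify_params(input_layer, method_name, tolerance):
    """Return a parameter dict for native:simplifygeometries."""
    if method_name not in SIMPLIFY_METHODS:
        raise ValueError("Unknown method: {0}".format(method_name))
    return {
        "INPUT": input_layer,
        "METHOD": SIMPLIFY_METHODS[method_name],
        "TOLERANCE": tolerance,
        "OUTPUT": "TEMPORARY_OUTPUT",
    }


params = simplify_params(
    "practical_data.gpkg|layername=NUTS2_FIN_pop", "Area (Visvalingam)", 5000
)
print(params["METHOD"])
```

In the QGIS console, the returned dictionary could then be passed as the second argument of `processing.run`.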

However, to use the helper function, you need the algorithm's id string, which is different from its screen name (like _native:simplifygeometries_ vs. _Simplify_). Print all available algorithms [like this](https://docs.qgis.org/latest/en/docs/user_manual/processing/console.html#calling-algorithms-from-the-python-console). Because the hundreds of algorithms can be overwhelming, a simple filter is applied below:

In [None]:
>>> for alg in QgsApplication.processingRegistry().algorithms():
...        search_str = "simplify"
...        if search_str in alg.displayName().lower():
...            print(alg.displayName(), "->", alg.id())

Simplify Network -> NetworkGT:Simplify Network
Simplify -> native:simplifygeometries

The part before the colon (_NetworkGT_ and _native_) refers to the algorithm provider – in this case a plugin and the native processing library.
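As a small illustration, the id string can be split into its two parts with plain Python. `split_algorithm_id` is a hypothetical helper, not a PyQGIS function; the ids are the ones printed above.

```python
def split_algorithm_id(alg_id):
    """Split 'provider:name' at the first colon into (provider, name)."""
    provider, _, name = alg_id.partition(":")
    return provider, name


for alg_id in ["NetworkGT:Simplify Network", "native:simplifygeometries"]:
    provider, name = split_algorithm_id(alg_id)
    print(provider, "|", name)
```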

## Running processing algorithms with PyQGIS
If you ran the processing.run call before, you might've noticed that it returned a dictionary rather than the layer itself. This is because an algorithm can have multiple outputs. Each output is accessed with its key in the dictionary.

In [None]:
>>> input_layer_path = 'C:/Users/tatu/pyqgis_practical/data/practical_data.gpkg|layername=NUTS2_FIN_pop'
>>> results = processing.run("native:simplifygeometries", 
...              {'INPUT':input_layer_path,
...                'METHOD':0,
...                'TOLERANCE':5000,
...                'OUTPUT':'TEMPORARY_OUTPUT'})
>>> print(results)
{'OUTPUT': <QgsVectorLayer: 'Simplified' (memory)>}
>>> simplified_layer = results['OUTPUT']

Or to skip the middle steps and load the layer directly to the current project, use _runAndLoadResults_.

In [None]:
>>> processing.runAndLoadResults("native:simplifygeometries", 
                {'INPUT': input_layer_path,
                'METHOD':0,
                'TOLERANCE':5000,
                'OUTPUT':'TEMPORARY_OUTPUT'})

### Batch processing
Running processing algorithms on multiple layers is straightforward to do with Python loops.

The script below runs simplification on all vector layers in a project:

In [None]:
proj_layers = QgsProject.instance().mapLayers()

for layer in proj_layers.values():
    # excluding other layer types
    if isinstance(layer, QgsVectorLayer):
        processing.runAndLoadResults("native:simplifygeometries", 
                {'INPUT': layer,
                'METHOD':0,
                'TOLERANCE':5000,
                'OUTPUT':'TEMPORARY_OUTPUT'})
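If you would rather collect the outputs than load them straight into the project, the loop can be wrapped in a function. The sketch below is one possible arrangement, not the only one: `batch_simplify` is a hypothetical helper, and the `run` argument stands for `processing.run`. A stand-in `fake_run` is used here so the example also runs outside QGIS.

```python
def batch_simplify(layers, run, tolerance=5000):
    """Run Simplify on each layer in a {name: layer} dict.

    Returns a {name: output} dict. `run` is expected to behave like
    processing.run: it takes an algorithm id and a parameter dict and
    returns a result dict with an 'OUTPUT' key.
    """
    outputs = {}
    for name, layer in layers.items():
        result = run("native:simplifygeometries", {
            "INPUT": layer,
            "METHOD": 0,
            "TOLERANCE": tolerance,
            "OUTPUT": "TEMPORARY_OUTPUT",
        })
        outputs[name] = result["OUTPUT"]
    return outputs


# Stand-in for processing.run (an assumption for demonstration only).
def fake_run(alg_id, params):
    return {"OUTPUT": "simplified({0})".format(params["INPUT"])}


outputs = batch_simplify({"regions": "layer_a", "roads": "layer_b"}, fake_run)
print(outputs)
```

In the QGIS console you would pass `processing.run` as `run`, together with a dictionary built from the vector layers in `QgsProject.instance().mapLayers()`.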

### Chaining algorithms
The real power of these algorithms gets unleashed when they're chained together to form an analysis pipeline. For example, you may remember how long and cumbersome the earlier script for simplifying the geometries and trimming the fields of our input layer was. With the processing framework, we can offload the heavy lifting to two processing algorithms (_Simplify_ and _Drop fields_). Since we're _dropping_ fields instead of _keeping_ them, we need to do a bit of Python magic first.

P.S. If we were using QGIS ≥ 3.18, we could use [_Retain fields_](https://qgis.org/en/site/forusers/visualchangelog318/#feature-add-retain-fields-algorithm) instead. It's not available in the current LTR.

In [None]:
# defining input parameters

# path to the NUTS2 layer
input_layer_path = 'C:/Users/tatu/pyqgis_practical/data/practical_data.gpkg|layername=NUTS2_FIN_pop'

input_layer = QgsVectorLayer(input_layer_path, "input_layer", "ogr")

tolerance = 5000
# list of field names to keep
fields_to_keep = ['name', 'pop']

# get all fields
all_fields = input_layer.fields().names()

# BASICALLY: create a list containing all fields except those that are in the "keep" list
drop_fields = [field for field in all_fields if field not in fields_to_keep]

simplified_layer = processing.run("native:simplifygeometries", 
              {'INPUT':input_layer,
               'METHOD':0,
               'TOLERANCE':tolerance,
               'OUTPUT':'TEMPORARY_OUTPUT'})['OUTPUT'] # NOTICE THAT THE LAYER IS IMMEDIATELY FETCHED FROM THE DICT

processing.runAndLoadResults("qgis:deletecolumn", 
               {'INPUT':simplified_layer,
                'COLUMN':drop_fields,
                'OUTPUT':'TEMPORARY_OUTPUT'})

Notice that with _runAndLoadResults_, the layer will be named automatically according to the algorithm definitions (like _Remaining fields_).
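The chaining pattern itself, where each algorithm's `OUTPUT` becomes the next one's `INPUT`, can be written as a small generic loop. This is a sketch under assumptions: `run_pipeline` is a hypothetical helper, `run` stands for `processing.run`, every step is assumed to use the `INPUT` and `OUTPUT` keys, and the field names `id` and `area` are made up for the example.

```python
def run_pipeline(input_layer, steps, run):
    """Run (alg_id, params) steps in order, feeding each OUTPUT to the next INPUT."""
    current = input_layer
    for alg_id, params in steps:
        # Copy the step's own parameters and fill in INPUT and OUTPUT.
        step_params = dict(params, INPUT=current, OUTPUT="TEMPORARY_OUTPUT")
        current = run(alg_id, step_params)["OUTPUT"]
    return current


calls = []


# Stand-in for processing.run that records what it was called with.
def fake_run(alg_id, params):
    calls.append((alg_id, params["INPUT"]))
    return {"OUTPUT": alg_id + "_result"}


steps = [
    ("native:simplifygeometries", {"METHOD": 0, "TOLERANCE": 5000}),
    ("qgis:deletecolumn", {"COLUMN": ["id", "area"]}),  # hypothetical field names
]
final = run_pipeline("input_layer", steps, fake_run)
print(final)
```

With `processing.run` passed as `run`, the returned value would be the output layer of the last step.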

### TASK
- Modify the script above by adding one more algorithm to the pipeline, namely _Add geometry attributes_
    - Find out the algorithm id of _Add geometry attributes_.
    - Create the parameter dictionary as needed for the algorithm (HINT: if you're lost, run the algorithm through the GUI first)
    - Add the algorithm call to the end of the script and modify the previous algorithm call accordingly.

## Custom processing scripts
Processing scripts like the one above are already quite neat, but they do have some weaknesses. Using them requires some programming understanding, which hinders their usability if you want to share your tools with others. It's much more user-friendly to select the input parameters graphically, as in the processing toolbox algorithms.

We will create such a processing tool. One approach would be to write it in code from scratch, like the existing tools are. However, in the interest of time, we will create the base of the script using the graphical _Model Designer_ in QGIS.

### Graphical processing models
Model designer enables defining inputs and chaining processing algorithms graphically. The picture below shows an example pipeline for creating a population heatmap clipped to sea boundaries. Subpicture (1) shows the pipeline running through centroid creation, KDE, and clipping with a mask layer. The yellow rectangles are inputs given by the user. These are shown as options when running the script, as seen in subpicture (2). The process outputs a raster (subpicture (3)).

If you want to play around with this model, find it in _scripts > heatmap_from_pop_grid.model3_  in the practical materials.

![Graphic modeler example](images/graphic_model.png)

A great thing about models is that they can be exported to Python code. We will use this feature later on.

## Recreating the script as a processing script
Now, let's once again recreate the layer simplification and trimming script, this time as an installable processing tool.

**Open the Model Designer window** from the left-most button under _Processing Toolbox > Create new Model_.

![Opening graphical model](images/opening_graphical_modeler.JPG)

A mostly empty window is opened. The empty area in the middle is where we'll start building our script. On the left, there's a selection of inputs (1) and algorithms (2), which can be dragged to the builder. Other important functions are naming the model (3), exporting to a Python script (4) and running the model (5).

![Model builder introduction](images/model_builder_intro.png)

Start by dragging and dropping inputs. Recall the ones we defined previously, namely a vector layer and a list of fields:

In [None]:
input_layer_path = 'C:/Users/tatu/pyqgis_practical/data/practical_data.gpkg|layername=NUTS2_FIN_pop'
fields_to_keep = ['name', 'pop']

Equivalently, first drag _Vector layer_, then _Vector field_. Below are the parameter definitions for both. Make sure to toggle _Accept multiple fields_ for the field input:
![Vector layer and field model inputs](images/model_builder_vector_input.png)

Next, click the algorithms tab to activate it. The whole algorithm toolbox and a search function are available. Search for _Drop fields_ and drag it to the model.

In the properties window that opens, change the input type to _Model input_ for both _Input layer_ and _Fields to drop_. Also write _Kept fields_ in the output box:
![Setting drop field algorithm properties](images/drop_field_parameters.png)

Now you should have something like this:

![Drop fields model](images/drop_fields_model.JPG)

This is a fully functional model that lets the user select a layer and some of its fields, and then drops the selected fields. Hmm, but we want to keep the fields. To implement that, we need to apply the custom Python logic created before and create a new tool.

Press the button with the Python logo (_Export as script algorithm_). It opens a Script editor window where the model is expressed as Python code. Let's first briefly explore the code to understand processing scripts.

### Understanding processing scripts

The first thing you might notice is the imports. The Python console imports qgis.core automatically, but the same is not true of processing scripts. Following good coding practices, only the necessary classes are imported.

In [None]:
from qgis.core import QgsProcessing
from qgis.core import QgsProcessingAlgorithm
from qgis.core import QgsProcessingMultiStepFeedback
from qgis.core import QgsProcessingParameterVectorLayer
from qgis.core import QgsProcessingParameterField
from qgis.core import QgsProcessingParameterFeatureSink
import processing

Next, the script defines a **class** that inherits from the general [QgsProcessingAlgorithm](https://qgis.org/pyqgis/3.22/core/QgsProcessingAlgorithm.html#qgis.core.QgsProcessingAlgorithm).

In [None]:
class Model(QgsProcessingAlgorithm):

Under this class, there are methods that both define metadata information, such as the algorithm identifier and display name, and run the actual processing. For the interested, [here's a thorough description of all the mandatory methods](https://docs.qgis.org/latest/en/docs/user_manual/processing/scripts.html#extending-qgsprocessingalgorithm).

The script is currently generically named "model". **Change the name to _Keep fields_** under _name_, _displayName_, _createInstance_ and the class definition, as shown below.
![Script name changes](images/keep_fields_script_names.png)

The workhorse methods are _initAlgorithm_ and _processAlgorithm_. In the initiation step, the input **and output** [parameters](https://qgis.org/pyqgis/latest/core/QgsProcessingParameters.html) are defined. Notice that the settings are the same ones we set graphically earlier.

BTW, the order the parameters are written here defines the order they're shown in the program. Therefore, if you want the user to first select the layer and then the fields, insert QgsProcessingParameterVectorLayer first.

In [None]:
def initAlgorithm(self, config=None):
    self.addParameter(QgsProcessingParameterVectorLayer('Inputlayer', 'Input layer', defaultValue=None))
    
    self.addParameter(QgsProcessingParameterField('Fieldstokeep', 'Fields to keep', 
                                                  type=QgsProcessingParameterField.Any, 
                                                  parentLayerParameterName='Inputlayer', 
                                                  allowMultiple=True, defaultValue=None))
    
    self.addParameter(QgsProcessingParameterFeatureSink('KeptFields', 'Kept fields', 
                                                        type=QgsProcessing.TypeVectorAnyGeometry, 
                                                        createByDefault=True, supportsAppend=True, 
                                                        defaultValue=None))

Currently, the processing step simply defines the parameters for the Drop fields algorithm, runs it and returns a result dictionary.

In [None]:
def processAlgorithm(self, parameters, context, model_feedback):
    # Use a multi-step feedback, so that individual child algorithm progress reports are adjusted for the
    # overall progress through the model
    feedback = QgsProcessingMultiStepFeedback(1, model_feedback)
    results = {}
    outputs = {}

    # Drop field(s)
    alg_params = {
        'COLUMN': parameters['Fieldstokeep'],
        'INPUT': parameters['Inputlayer'],
        'OUTPUT': parameters['KeptFields']
    }
    
    outputs['DropFields'] = processing.run('qgis:deletecolumn', alg_params, 
                                           context=context, feedback=feedback, 
                                           is_child_algorithm=True)
    results['KeptFields'] = outputs['DropFields']['OUTPUT']
    return results

### Creating a new processing script
Let's start inserting our own code. First, we need the vector layer and a list of fields to keep. The pre-made code reads them directly from the parameter dictionary, like this:

In [None]:
'INPUT': parameters['Inputlayer']

But this returns a _string_ by default, not the objects we need. QgsProcessingAlgorithm has [methods to return the actual objects](https://qgis.org/pyqgis/latest/core/QgsProcessingAlgorithm.html#qgis.core.QgsProcessingAlgorithm.parameterAsVectorLayer):

In [None]:
input_layer = self.parameterAsVectorLayer(parameters, "Inputlayer", context)
fields_to_keep = self.parameterAsFields(parameters, 'Fieldstokeep', context)

Next, insert the field selection code used previously:

In [None]:
all_fields = input_layer.fields().names()
        
drop_fields = [field for field in all_fields if field not in fields_to_keep]
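One thing the list comprehension does not do is notice a misspelled field name in the keep list: it would silently drop everything else. A possible guard is sketched below in plain Python. `compute_drop_fields` is a hypothetical helper, and the field names `fid` and `area` are invented for the example.

```python
def compute_drop_fields(all_fields, fields_to_keep):
    """Return the fields to drop, refusing keep-names the layer does not have."""
    missing = [f for f in fields_to_keep if f not in all_fields]
    if missing:
        raise ValueError("Fields not found in layer: {0}".format(", ".join(missing)))
    return [f for f in all_fields if f not in fields_to_keep]


# 'fid' and 'area' are made-up field names; 'name' and 'pop' come from the tutorial.
all_fields = ["fid", "name", "pop", "area"]
drop_fields = compute_drop_fields(all_fields, ["name", "pop"])
print(drop_fields)
```

Inside _processAlgorithm_, the same check could be applied to `input_layer.fields().names()` before building the parameter dictionary.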

Now that drop_fields contains the fields we want to delete, change the COLUMN parameter to drop_fields:

In [None]:
alg_params = {
    'COLUMN': drop_fields,
    'INPUT': parameters['Inputlayer'],
    'OUTPUT': parameters['KeptFields']
}

All in all, the script looks like this:

In [None]:
from qgis.core import QgsProcessing
from qgis.core import QgsProcessingAlgorithm
from qgis.core import QgsProcessingMultiStepFeedback
from qgis.core import QgsProcessingParameterVectorLayer
from qgis.core import QgsProcessingParameterField
from qgis.core import QgsProcessingParameterFeatureSink
import processing


class KeepFields(QgsProcessingAlgorithm):

    def initAlgorithm(self, config=None):
        self.addParameter(QgsProcessingParameterVectorLayer('Inputlayer', 'Input layer', defaultValue=None))
        self.addParameter(QgsProcessingParameterField('Fieldstokeep', 'Fields to keep', type=QgsProcessingParameterField.Any, parentLayerParameterName='Inputlayer', allowMultiple=True, defaultValue=None))
        self.addParameter(QgsProcessingParameterFeatureSink('KeptFields', 'Kept fields', type=QgsProcessing.TypeVectorAnyGeometry, createByDefault=True, supportsAppend=True, defaultValue=None))

    def processAlgorithm(self, parameters, context, model_feedback):
        # Use a multi-step feedback, so that individual child algorithm progress reports are adjusted for the
        # overall progress through the model
        feedback = QgsProcessingMultiStepFeedback(1, model_feedback)
        results = {}
        outputs = {}

        input_layer = self.parameterAsVectorLayer(parameters, "Inputlayer", context)
        fields_to_keep = self.parameterAsFields(parameters, 'Fieldstokeep', context)
        
        all_fields = input_layer.fields().names()
        
        drop_fields = [field for field in all_fields if field not in fields_to_keep]

        # Drop field(s)
        alg_params = {
            'COLUMN': drop_fields,
            'INPUT': parameters['Inputlayer'],
            'OUTPUT': parameters['KeptFields']
        }
        outputs['DropFields'] = processing.run('qgis:deletecolumn', alg_params, context=context, feedback=feedback, is_child_algorithm=True)
        results['KeptFields'] = outputs['DropFields']['OUTPUT']
        return results

    def name(self):
        return 'keepFields'

    def displayName(self):
        return 'Keep fields'

    def group(self):
        return ''

    def groupId(self):
        return ''

    def createInstance(self):
        return KeepFields()

**Save your script** locally, for example as keep_fields.py.

After saving, close the script window. From the Processing Toolbox's Python drop-down, **select _Add script to toolbox_ and add the script file**. This adds _Keep fields_ to the toolbox under _Scripts_. You may run the tool to test that it indeed works.

![Adding script to toolbox](images/add_script_to_toolbox.JPG)

### TASK: Finalizing the model
- Re-create the simplify-and-trim algorithm with the graphical modeler using _Simplify_ and _Keep fields_. It should look something like the pic below.
    - NOTE! Use _Algorithm output_ as the input layer for the second algorithm.
    - Simplification tolerance is simply a _Number_ input.

![Final model](images/simplify_and_trim_final.png)

## Wrapping up processing
The take-home messages of this session were:
- Processing algorithms can be run, chained and expanded upon with PyQGIS to create efficient automated GIS processes.
- New processing algorithms can be created by using the model builder – these models can be expanded with Python.

While this tutorial used the graphical modeler to create a basis for the scripts, new processing scripts can be created purely in code as well.

To do this, go to the top row of tools in the _Processing toolbox_, click on the Python drop-down menu and select _Create new script from template_. [See the user manual for a tutorial](https://docs.qgis.org/latest/en/docs/user_manual/processing/console.html#creating-scripts-and-running-them-from-the-toolbox).

![Opening script template](images/opening_script_template.JPG)

# A look at plugin development