Skip to content

Stands & Metric Layers

Mark edited this page Nov 29, 2023 · 5 revisions

Stands & Metric Layers

Planscape and Forsys utilizes the RRK datasets to run it's scenarios. These layers are available in raster form and need to be loaded in the database, for us to calculate the metrics used by Planscape.

Rationale

To keep our Forsys runs short, we decided to cache everything we could cache before-hand. All of the data is pre-calculated by a series of scripts, allowing our Forsys runs to focus just on fetching the pre-calculated data.

There are a few important tables that needs to be populates:

  1. stands_stand (stands/models.py::Stand)
  2. stands_standmetric (stands/models.py::StandMetric)
  3. conditions_basecondition (conditions/models.py::BaseCondition)
  4. conditions_condition (conditions/models.py::Condition)
  5. conditions_conditionraster (conditions/models.py::ConditionRaster)

The stands_stand stores information about our planning subdivisions, the stands. Each stand is a hexagon, created by ST_HexagronGrid function in PostGIS. The stands_standmetric table has a foreign key to stands_stand and represents a list of metrics for a given condition in that stand. Each row stores the min, avg, max, count and sum for that stand.

How to load the data

We have a list of Django management commands in order to load this data. They should be run in order:

  1. Create Stands (this is a simple SQL function - we need to create the management command);
  2. Load Conditions;
  3. Load Rasters;
  4. Load Metrics v2;

All of these commands and functions takes up several parameters, allowing you to tweak how you want to run it.

How long does it take?

To load stands the runtime is not that bad. Last measurement is less than one hour for all stand sizes. Loading raster's also takes quite a while (separate 1 day for this - as this process cannot be run in parallel).

To load all the metrics, we have the following run-times, per condition:

  1. LARGE: approximately 2 minutes, per condition
  2. MEDIUM: approximately 10 minutes, per condition
  3. SMALL: approximately 60-80 minutes, per condition

Create Stands

To create the stands, we use a SQL function that should be installed in your database by our migrations. The system is configured to run the creation of LARGE stands automatically, but not for the rest of the stand sizes (MEDIUM, SMALL), because they take a long time to be created.

This supports the following parameters:

  1. Extent (this extent represents the whole state of California)
  2. Length of the Hexagon. These magic numbers will give us extents with specific areas
  3. Size of the stand. Enumeration here
  4. Boolean flag to reset the stands of this size, before recreating them.

Large Stands

SELECT * FROM
create_stands(
    ST_MakeEnvelope(
        -2356881.4306262177415192,
         1242364.3072737671900541,
        -1646662.6679147812537849,
         2452712.7869634097442031,
         5070),
    877.38267558,
    'LARGE',
    TRUE
);

Medium Stands

SELECT * FROM
create_stands(
    ST_MakeEnvelope(
        -2356881.4306262177415192,
         1242364.3072737671900541,
        -1646662.6679147812537849,
         2452712.7869634097442031,
         5070),
    392.377463,
    'MEDIUM',
    TRUE
);

Small Stands

SELECT * FROM
create_stands(
    ST_MakeEnvelope(
        -2356881.4306262177415192,
         1242364.3072737671900541,
        -1646662.6679147812537849,
         2452712.7869634097442031,
         5070),
    124.0806483,
    'SMALL',
    TRUE
);

Load Conditions

Our system depends on many conditions. This is another term for metric. Conditions represent some sort of phenomenon, that can be represented by a raster layer.

Each condition has a few attributes and are accurately described in this file. The load conditions management command parses this JSON (also allows you to specify a different JSON input) and creates the corresponding data in conditions_basecondition and conditions_condition tables.

With these conditions in place, we can load the rasters.

Parameters:

  • --conditions-file, not required, defaults to [this file (https://github.com/OurPlanscape/Planscape/blob/main/src/planscape/config/conditions.json)
  • --dry-run, not required, defaults to false

This command will upsert all the conditions found in the JSON file. This is run every time the system is deployed.

Example:

python manage.py load_conditions

Load Rasters

Conditions contains metadata about our rasters. After we load our conditions, we can load the actual data for each of these. This command is not triggered automatically, because it takes a long time to load all of the data.

Parameters:

  • --conditions-ids, not required, defaults to all conditions. You can specify multiple condition ids with --condition-ids 1 2 3, where 1 2 3 are the primary key for the conditions_condition table.
  • --clear, not required, defaults to true. If you set this to true, the data for a specific condition will be removed before inserting the new calculate data.

Example:

python manage.py load_rasters --condition-ids 1 2 3

Load Metrics v2

Given that we have our conditions in place and all the rasters inside the database, we can start calculating stands_standmetric values. For each stand, we try to intersect and aggregate values from a condition in the database. In this process we extract basic information, such as min, avg, max, sum and count (of pixels).

This command processes each condition against all the intersecting stands, and inserts the corresponding records in the stands_standmetric table.

warning: there are two versions of this command. the first version is very slow and will be deprecated.

Parameters:

  • --condition-ids - list of integers, each id identifying a specific condition. Defaults to all conditions.
  • --size - one of SMALL, MEDIUM or LARGE, defaults to medium.
  • --clear, boolean determining if we should delete the data for a certain condition being processed and a stand size.
  • --max-workers, number of maximum workers used by the program to insert the data. defaults to 4. Increase it if you have more threads available in your system.

Example:

python manage.py load_conditions --size MEDIUM --condition-ids 1 2 3 --clear

the example above will clear the data before loading, only for the specified conditions and stand size.

Reviewing conditions that have been processed

If this process gets interrupted, you can review the conditions that have been processed with an SQL statement like the following statement. For instance, this will show the condition_ids for SMALL stands that relate to the Southern California region:

SELECT DISTINCT stands_standmetric.condition_id as condition_id
FROM stands_standmetric 
JOIN stands_stand ON stands_stand.id = stands_standmetric.stand_id
JOIN conditions_condition ON stands_standmetric.condition_id = conditions_condition.id 
JOIN conditions_basecondition ON conditions_condition.condition_dataset_id = conditions_basecondition.id 
WHERE conditions_basecondition.region_name = 'southern-california'
AND stands_stand.size = 'SMALL';