Skip to content
Pedro R. Andrade edited this page Sep 27, 2017 · 33 revisions

Creating and Filling Cellular Spaces in TerraME

Pedro R. Andrade, Rodrigo Avancini

This tutorial works with TerraME version 2.0.0-RC4 or greater.

Summary

Introduction

When working with geospatial data with modeling purposes, it is very common to handle data using different representations and formats. To face this challenge, the most commonly used approach is to standardize information from raster and vector data stored in different sources by aggregating them into a single cellular representation. A cellular representation uses a vector format to store raster-like objects called cells. Each cell is represented as a square polygon with several attributes. Converting data from different representations into a cellular representation simplifies simulations as well as statistical analysis and makes them faster, once geospatial operations such as overlays are necessary only while creating the attributes.

This tutorial describes how to create and fill cellular spaces from geospatial data in TerraME. As TerraME is a programming environment, it is necessary to write a script to describe how data will be created. There are two main advantages of using scripts instead of a common GIS procedure using a graphical interface:

  1. The whole process of creating and filling is recorded into a file. It is always possible to come back to the script to see which data and operation was used to create each attribute. This promotes transparency to the process and enforces reproductibility by other researchers, once both data and script are available.
  2. With a script, it is possible to change the resolution of the cellular space as needed. It is only necessary to update the script with the new resolution, run the script again, and for the computer to rebuild the cellular space.

Package gis has a set of functions to describe how to create and fill cellular spaces. Two types from this package are necessary to create cellular spaces in TerraME: Project and Layer.

Project

Project is a type to describe all the data to be used by a given model. Geospatial data can be stored in different sources, with different formats and different information to describe how to access data. For example, only a file name is needed to open a shapefile, while PostGIS tables require a host, port, user, password, and the name of the table. A project encapsulates how to access such data and organises them, storing all the information related to each data source internally.

TerraME allows the modeler to create a Project from scratch or to load one already created in another software of TerraLib family. The simplest way to create a Project is shown below. The code creates a variable named proj to access a project stored in file "myproject.tview", stored in the current directory. Note that it is necessary to import gis package in order to use Project.

import("gis")

proj = Project{
    file = "myproject.tview"
}

As default, TerraME never overwrites a Project file. It means that, if one runs the code above again, it will stop with an error as "myproject.tview" was already created. To allow the script to run again even if the file was already created by a previous execution of the script, add clean = true as argument to Project, such as in the code below.

import("gis")

proj = Project{
    file = "myproject.tview",
    clean = true
}

Layer

A Layer represents a geospatial dataset stored in a given data source, such as a database table, a shapefile, or a web service. Each Layer belongs to a Project. For example, the code below creates a Layer named "places" that points to a shapefile named "cities.shp", stored in the the same directory where this script is saved.

It is very important to remark that the shapefile is not copied to the Project. The Project only stores where the shapefile is stored in the computer. Every time it needs to access layer "cities", it will open file "cities.shp".

-- [the code below cannot run]

places = Layer{
	project = proj, -- the project created in the previous code
	file = "cities.shp",
	name = "cities"
}

When the data used by the Project is stored in files, such as the one above, it is possible to declare Layers directly while creating a Project. In the code below, a Project is created with three layers: localities, roads, and census. Each layer represents a shapefile that belongs gis package. To access such data it is necessary to use function filePath().

import("gis")

itaituba = Project{
    file = "itaituba.tview",
    clean = true,
    localities = filePath("itaituba-localities.shp", "gis"),
    roads = filePath("itaituba-roads.shp", "gis"),
    census = filePath("itaituba-census.shp", "gis")
}

If data is not stored into a file, it is necessary to create a Layer after the Project, once other arguments are required. For example, if data is stored in a PostGIS table we will need to set host (if data is stored in another machine), user, password, and table. If data is available in a Web Feature Service (WFS), it is necessary to select service and feature. The code below creates two Layers, named "roads" and "protected", from a PostGIS table and from a WFS, respectively.

-- [the code below cannot run]

Layer{
    project = itaituba,
    name = "soil",
    user = "root",
    password = "abc123",
    table = "soil"
}

Layer{
    project = itaituba,
    name = "biomes",
    service = "http://terrabrasilis.info/wfs/wfs_biomes",
    feature = "biomes",
}

Another optional argument to create a Layer is related to its geospatial projection. When the data to be addeed to the Project does not have enough information about its projection, it is necessary to set a epsg, which is an EPSG number. For example, in the two Layers below, it is necessary to indicate that their epsg is 29191 to allow combining them with other geospatial data. A list with the supported epsg numbers in the latest TerraME version is available here.

Layer{
    project = itaituba,
    name = "deforestation",
    file = filePath("itaituba-deforestation.tif", "gis"),
    epsg = 29191
}

Layer{
    project = itaituba,
    name = "elevation",
    file = filePath("itaituba-elevation.tif", "gis"),
    epsg = 29191
}

A common procedure when one wants to create a Project is to store all files in the same directory. The best way to create a Project with all the shapefiles of a given directory is by using argument directory. In this case, it searches for all shp and tif files within the given directory and adds them to the project. The name of the layer will be the file name without extension.

import("gis")

proj = Project{
	file = "myproject.tview",
	directory = "." -- all files of the directory where this file is saved
	clean = true
}

Cellular space

A cellular space is composed by a set of squared cells. It is internally represented as polygons and can be opened in any GIS. Every cells within a cellular space has the same resolution, such as of 1m x 1m, 500m x 500m, or 100km x 100km. The best resolution for a given cellular space depends on the resolution of the available data, on the computing resources, and on the scale of the process under study. Finding the best resolution can even be part of the modeling process.

A cellular space is created as a Layer. It requires another Layer as reference for its coverage area. If the selected reference layer has a polygonal representation, cells will be created in such a way to fill all the space of the polygons, considering only cells that have some overlay with some polygon of the reference Layer. This means that some cells might be partially outside the study area.

A cellular space will have the same geographic projection of its reference layer. If the reference layer uses cartesian coordinates, each created cell will have the same area, which makes a model that uses such data simpler as it is not necessary to take into account differences between areas. Because of that, it is usually recommended to use a projection that uses cartesian coordinates instead of geographic coordinates (latitude and longitude).

It is possible to create a cellular space using Layer constructor as shown in the code below. The cellular space has 5000m of resolution, as the unit of measurement of the input Layer is meters. Each created cell will have three attributes: "col", "row", and "id". Note the option clean = true to delete the shapefile if it already exists.

itaitubaCells = Layer{
    project = itaituba,
    name = "cells",
    clean = true,
    file = "itaituba.shp",
    input = "census",
    resolution = 5000
}

The code above creates the cells below. Layer "census" is also shown.

There is a second option to create cells. It is possible to set box = true to create cells based on the bounding box of the layer. In this case, cells will cover the whole extent of the reference layer, with a constant number of cells in each row and column. Additionally, if a raster data or other non-polygonal geometry is used as input, cells will automatically be created from its bounding box.

Filling cells

After creating a cellular layer, now it is possible to add attributes using the data available. Depending on the geometric representation and the semantics of the input data attributes, different operations can be applied. Some operations use geometry, some use only attributes of objects with some overlap, and some combine geometry and attributes. It is also possible to fill any geospatial data that uses a vector format, even if it is not composed by rectangular objects. This section describes some examples of filling some attributes.

Using raster data as reference layer, it is possible to compute operations based on pixels whose centroid belongs to the cell. For example, operation average compute the average value of such pixels for each cell. Argument attribute defines the name of the attribute to be created. See the code below. After executing this code, the cellular space will have an additional attribute "altim" with a number value.

itaitubaCells:fill{
    operation = "average",
    layer = "elevation",
    attribute = "elevation"
}

The main operation based on categorical raster data is coverage. It computes the percentage of the cell's area covered by each of the available categories, creating one attribute for each of them. It uses the attribute name plus _ followed by the attribute value as attribute names. For example, layer "deforestation" has a tif data with two values, 0 and 254. The code below then creates two attributes, defor_0 and defor_254, with values ranging from zero to one for each attribute. Note that the layer might have a dummy value that will be ignored by this operation.

itaitubaCells:fill{
    operation = "coverage",
    layer = "deforestation",
    attribute = "defor"
}

Using vector data without attributes, it is possible to compute the distance to the nearest object using operation distance. The code below shows an example that creates attribute "distr".

itaitubaCells:fill{
    operation = "distance",
    layer = "roads",
    attribute = "distroad"
}

The most important operation using polygonal data with attributes is sum using area = true. This operation distributes the values of a given select attribute from the polygons to the cells given their intersection areas. It is particularly useful for data such as population because it conserves the total sum of the input in the output. See the code below.

itaitubaCells:fill{
    operation = "sum",
    layer = "census",
    attribute = "population",
    select = "population",
    area = true
}

Visualising output

To visualize the attributes created by Layer:fill(), it is necessary to create a CellularSpace. As we now have a Project, we can use it directly with the name of the Layer to be read instead of describing where the data is stored. The code below reads layer "cells" from the project itaituba.tview. The attribute "population" will be drawn using log scale, therefore we need to create a Cell to be used as instance of the CellularSpace. It will have a function logpop that returns the log of the "population".

cell = Cell{
    logpop = function(self)
        return math.log(self.population)
    end
}       
    
cs = CellularSpace{
    project = itaituba,
    layer = "cells",
    instance = cell
}

Now it is only necessary to create some Map objects to visualize the attributes. For example, the code below creates three Maps. The first one draws attribute "altim" using ten slices of blue. The second one draws "distl" using reds. The last one draws with greens the attribute "defor_0", created from operation "coverage" using pixels with value "10".

Map{
    target = cs,
    select = "elevation",
    slices = 6,
    color = "Blues"
}   

Map{
    target = cs,
    select = "defor_87",
    slices = 6,
    invert = true,
    color = "Greens"
}

Map{
    target = cs,
    select = "logpop",
    slices = 10,
    color = "Purples"
}

Map{
    target = cs,
    select = "distroad",
    slices = 10,
    color = "Reds"
}

The outputs are shown below. It is also possible to reproduce these maps running itaituba project in package gis. The four figures clockwise from the top-left show the results of the Maps above, from top to bottom.

maps

Documentation

TerraME can use fill scripts to automatically document cellular layers. To accomplish that, it is necessary to create a TerraME package to store data as well as scripts to create projects and cellular spaces. A package encapsulate all the data in a single directory and allows documentation of data and their attributes. It also simplifies scripts to create cellular spaces as it is possible to use filePath() instead of describing where the files are stored in the computer manually. To create a data package it is necessary to execute the following steps:

  1. Create a directory with the package's name.
  2. Create description.lua and fill it with the description of the package.
  3. Create data.lua with the description of the package's data. Projects as well as cellular spaces created from scripts do not need to be manually documented. They are automatically documented by TerraME.
  4. Create data directory and put data as well as fill scripts.

For example, it is possible to copy all the files related to Itaituba from gis to create a package itaituba. The files are stored in data directory, together with itaituba.lua script. Package itaituba does not need to be installed in order to build documentation. If it is in the current directory then it is enough for TerraME. Using the command prompt in the directory where you unzipped itaituba zip file, run:

terrame -package itaituba -projects

It executes itaituba.lua to create the cellular space and fill its attributes. Then run:

terrame -package itaituba -doc

When you execute this command, TerraME automatically documents the cellular layer using the data about the selected operations. To see the created documentation, just run:

terrame -package itaituba -showdoc

In the webpage, when you click in Data, it is possible to see the documentation of the cellular space. The itaituba file created above is documented in the following way. The project:

tview

The cellular Layer is described below. Note that it has additional attributes: FID and id (unique identifiers) as well as col and row, related to the spatial location of each cell.

shp

Once a script to create and fill a cellular space belongs to a package, it is possible to create new versions of the cellular space with other resolutions without changing the script by using the button Create in the graphical interface of TerraME when the package selected. Additionally, it is possible to execute it from the command line. For example, the following command executes the script using resolution 2500:

terrame -package itaituba -project itaituba 2500

The output is shown below:

low-resolution

Final remarks

In the documentation of gis package, it is possible to find the complete documentation of Project, Layer, and fill. This package also has some examples that create and fill cellular spaces. It is possible to run them directly from TerraME's main graphical interface through the Project button. There are also links from the documentation of the package to the scripts to create cellular spaces.