Thomas Nipen edited this page Jun 18, 2020 · 44 revisions

The gridpp library contains functions for the core algorithms in gridpp. As the command-line version of gridpp was built first, we are gradually collecting all the functionality into a library. The library is written in C++, but automatic wrappers for Python are created using SWIG. Bindings for other languages (like R) will be added in the future. All examples using the library in this wiki use the Python interface.

The documentation for the library is found here. The header file with the available functions can be found in include/gridpp.h.

The library consists mostly of functions that take scalars (float, int) and vectors (1D, 2D, 3D std::vector) as inputs.

Parallelization

Many gridpp functions are configured to run on multiple processors. To enable this, either set the OMP_NUM_THREADS environment variable to the number of parallel threads to use, or call the following function before calling any other functions:

gridpp::set_omp_threads(4);

If OMP_NUM_THREADS is not set and gridpp::set_omp_threads is never called, a default of 1 thread will be used. This also applies to all of the language bindings.
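In Python, the two options might look as follows. This is a minimal sketch: it assumes the SWIG wrapper exposes set_omp_threads under the same name as the C++ function, and that the environment variable is read when the library is loaded. The value 4 and the try/except guard around the import are illustrative only.

```python
import os

NUM_THREADS = 4  # illustrative value; choose what suits your machine

# Option 1: set the environment variable before gridpp is first imported.
os.environ["OMP_NUM_THREADS"] = str(NUM_THREADS)

try:
    import gridpp
except ImportError:
    gridpp = None  # keeps this sketch runnable where gridpp is not installed

# Option 2: set the thread count explicitly, before any other gridpp call.
if gridpp is not None:
    gridpp.set_omp_threads(NUM_THREADS)
```

Only one of the two options is needed; both are shown here for comparison.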

Classes for describing locations

The gridpp library supports two classes for describing geographical positions of datasets. The Grid class encapsulates a two-dimensional gridded field of locations. The Points class encapsulates a one-dimensional vector of locations.

The library operates exclusively with latitude,longitude coordinates, instead of x,y coordinates on projected grids. This means grids do not need to be on a specific projection. The latitude/longitude coordinates are converted to 3D spherical coordinates and distances between points are calculated in 3D space. Thus distances do not follow the curvature of the earth but are straight-line distances through the earth. This has little impact for most applications.
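To get a feel for how small the difference is, the sketch below converts latitude/longitude to 3D coordinates and compares the straight-line (chord) distance with the distance along the surface. It is an illustration only, not gridpp's code: the function names are hypothetical, the earth is assumed to be a sphere, and the radius constant is an assumption that may differ from the one gridpp uses.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean earth radius; gridpp's constant may differ


def to_xyz(lat, lon):
    """Convert latitude/longitude in degrees to 3D Cartesian coordinates in metres."""
    lat_r, lon_r = math.radians(lat), math.radians(lon)
    return (
        EARTH_RADIUS_M * math.cos(lat_r) * math.cos(lon_r),
        EARTH_RADIUS_M * math.cos(lat_r) * math.sin(lon_r),
        EARTH_RADIUS_M * math.sin(lat_r),
    )


def chord_distance(lat1, lon1, lat2, lon2):
    """Straight-line distance through the earth, in metres."""
    return math.dist(to_xyz(lat1, lon1), to_xyz(lat2, lon2))


def great_circle_distance(lat1, lon1, lat2, lon2):
    """Distance along the surface, in metres, derived from the chord length."""
    ratio = min(1.0, chord_distance(lat1, lon1, lat2, lon2) / (2 * EARTH_RADIUS_M))
    return 2 * EARTH_RADIUS_M * math.asin(ratio)


# Two points roughly 300 km apart (approximate coordinates of Oslo and Bergen)
chord = chord_distance(59.91, 10.75, 60.39, 5.32)
arc = great_circle_distance(59.91, 10.75, 60.39, 5.32)
relative_difference = (arc - chord) / arc
print(f"chord: {chord / 1000:.1f} km, arc: {arc / 1000:.1f} km, "
      f"relative difference: {relative_difference:.1e}")
```

For points a few hundred kilometres apart the relative difference stays well below 0.1%. It grows with separation, so it matters mainly for search radii spanning a large part of the globe.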

Both classes have efficient methods for retrieving the nearest point, the nearest N points, and all points within a specified radius. Grid and Points use the helper class KDTree, which uses the rtree implementation in boost.
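The three kinds of query can be illustrated with a brute-force sketch. This is not how gridpp does it: the library uses a tree structure to avoid scanning every point, whereas the code below checks all points on each query. The station coordinates, function names, and earth radius are all hypothetical and chosen for illustration.

```python
import math

# Hypothetical station locations as (latitude, longitude) in degrees
STATIONS = [(59.9, 10.7), (60.4, 5.3), (63.4, 10.4), (69.6, 18.9)]
EARTH_RADIUS_M = 6.371e6  # assumed mean earth radius


def distance_m(a, b):
    """Straight-line (chord) distance in metres between two (lat, lon) pairs."""
    def xyz(lat, lon):
        lat_r, lon_r = math.radians(lat), math.radians(lon)
        return (math.cos(lat_r) * math.cos(lon_r),
                math.cos(lat_r) * math.sin(lon_r),
                math.sin(lat_r))
    return EARTH_RADIUS_M * math.dist(xyz(*a), xyz(*b))


def closest_indices(target, num):
    """Indices of the `num` stations nearest to `target`, nearest first."""
    order = sorted(range(len(STATIONS)),
                   key=lambda i: distance_m(STATIONS[i], target))
    return order[:num]


def indices_within(target, radius_m):
    """Indices of all stations within `radius_m` metres of `target`."""
    return [i for i, station in enumerate(STATIONS)
            if distance_m(station, target) <= radius_m]


target = (60.0, 10.0)
nearest = closest_indices(target, 1)[0]   # the single nearest point
two_nearest = closest_indices(target, 2)  # the nearest N points (N = 2)
nearby = indices_within(target, 100_000)  # all points within 100 km
```

The brute-force version costs one distance computation per point for every query, which is why a spatial index pays off for large grids and many lookups.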

Functions that operate on gridded fields (such as bilinear) require both a 2D vector of values and a Grid object that describes the coordinates of the 2D vector.

Exceptions

Functions will throw std::invalid_argument exceptions when input arguments are invalid (e.g. dimension mismatch between arguments) and std::runtime_error for other errors.

Python

After installing the package, gridpp can be loaded with import gridpp. There are no submodules in gridpp, so all functions are accessed with gridpp.<function>. Read the header file documentation to see available functions. All functions available in the C++ library are also made available in python, with the same function signature. For example, the C++ function:

vec2 neighbourhood (const vec2 &input, int radius, Statistic statistic)

can be used in python as follows:

>>> import gridpp
>>> import numpy as np
>>> gridpp.neighbourhood(np.array([[1,2,3],[4,5,6],[7,8,9]]), 1, gridpp.Mean)
array([[3. , 3.5, 4. ],
       [4.5, 5. , 5.5],
       [6. , 6.5, 7. ]], dtype=float32)
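To see where the numbers in this output come from, here is a plain-Python sketch of a neighbourhood mean. It is not gridpp's implementation (which may differ in algorithm and in how it treats missing values); the function name is hypothetical, and the sketch assumes the window is a square of half-width radius that is truncated at the grid edges, which is consistent with the output shown above.

```python
def neighbourhood_mean(values, radius):
    """Mean over a square window of half-width `radius`, truncated at the edges."""
    ny, nx = len(values), len(values[0])
    output = []
    for y in range(ny):
        row = []
        for x in range(nx):
            # Collect all values within `radius` grid points in each direction
            window = [
                values[j][i]
                for j in range(max(0, y - radius), min(ny, y + radius + 1))
                for i in range(max(0, x - radius), min(nx, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        output.append(row)
    return output


result = neighbourhood_mean([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 1)
print(result)
```

The corner value, for example, is the mean of the four values 1, 2, 4, and 5, since the rest of the 3x3 window falls outside the grid.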

Input data types

The SWIG interface does type checking and casting when required, and the following types can be used:

| C++ type | Python types |
| --- | --- |
| float | Any scalar |
| int | Any scalar |
| vec | 1D np.array, list, tuple |
| vec2 | 2D np.array, list of lists, tuple of tuples |
| vec3 | 3D np.array, list of list of lists, tuple of tuple of tuples |
| ivec | 1D np.array, list, tuple |
| Statistic | One of gridpp.Min, gridpp.Mean, gridpp.Median, gridpp.Max, gridpp.Std, gridpp.Variance, gridpp.Sum |

For best performance, use numpy arrays, since these already store their data sequentially in C. Using tuples or lists incurs a noticeable conversion penalty. Numpy arrays with dtypes of float32 and int32 are ideal, since then no conversion from (for example) double is needed.

All python vectors are automatically converted to C++ floats when passed to C++ functions that expect floats. Python vectors are automatically converted to C++ ints when passed to C++ functions that expect ints. This means you can pass numpy arrays of any numeric dtype, such as float32, float64, int32.
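If the same data is passed to several functions, it can be worth converting it once up front. The sketch below uses only numpy calls; whether this gives a measurable speed-up depends on array size and is not something this page quantifies. The variable names are illustrative.

```python
import numpy as np

nested_list = [[1, 2, 3], [4, 5, 6]]                     # accepted, but converted on each call
float64_array = np.array(nested_list, dtype=np.float64)  # accepted, but cast to float32

# Convert once: C-contiguous memory layout and float32 dtype
values = np.ascontiguousarray(float64_array, dtype=np.float32)

# With copy=False, astype returns the same array when no conversion is needed
same = values.astype(np.float32, copy=False)

print(values.dtype, values.flags["C_CONTIGUOUS"], same is values)
```

Integer inputs (the ivec type) can be prepared the same way with dtype=np.int32.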

Output data types

All C++ functions return numpy arrays of dtype float32 or int32, depending on the output type on the C++ side.

Required setup for examples

To get ready for the examples in the next sections, run the code below to set up necessary variables (you will also need the test datasets). This retrieves air temperature and precipitation from the observation, analysis, and forecast files as well as metadata about the grids.

import gridpp
import netCDF4
import numpy as np

# Input data
with netCDF4.Dataset('analysis.nc', 'r') as file:
    ilats = file.variables['latitude'][:]
    ilons = file.variables['longitude'][:]
    igrid = gridpp.Grid(ilats, ilons)
    temp_analysis = np.moveaxis(np.squeeze(file.variables['air_temperature_2m'][:]), 0, 2)
    precip_analysis = np.moveaxis(np.squeeze(file.variables['precipitation_amount'][:]), 0, 2)

with netCDF4.Dataset('forecast.nc', 'r') as file:
    temp_forecast = np.squeeze(file.variables['air_temperature_2m'][:])
    precip_forecast = np.squeeze(file.variables['precipitation_amount'][:])

# Output grid
with netCDF4.Dataset('output.nc', 'r') as file:
    olats = file.variables['latitude'][:]
    olons = file.variables['longitude'][:]
    ogrid = gridpp.Grid(olats, olons)

# Observations
with netCDF4.Dataset('obs.nc', 'r') as file:
    plats = file.variables['latitude'][:]
    plons = file.variables['longitude'][:]
    pgrid = gridpp.Points(plats, plons)
    temp_obs = file.variables['air_temperature_2m'][:]
    precip_obs = file.variables['precipitation_amount'][:]
