Skip to content

Lifting a Function in MobilityDB

Esteban Zimanyi edited this page Jan 2, 2024 · 21 revisions

Lifting is a powerful mechanism allowing to "temporally extend" a traditional "static" function. This document explains the steps to lift a function in MobilityDB.

Consider the arithmetic operators +, -, *, and /. These operators require two arguments, each one can be either an integer or a float, and return as result either an integer or a float depending on the type of the arguments. Lifting these operators means that we allow one or both arguments to be temporal numbers and the result would also be a temporal number. As another example, consider the traditional distance function between two points, which in PostGIS is implemented by the ST_Distance function. This function requires two geometries or two geographies as arguments and returns the distance between them. Lifting this function would allow one or the two arguments to be a temporal point and the result would be a temporal float.

MobilityDB provides a generic infrastructure for lifting functions and operators on temporal types. These functions are used for lifting arithmetic operators (+, -, *, /), Boolean operators (and, or, not), comparisons (<, <=, >, >=), distance (<->), spatial relationships (tcontains), etc.

The lifting of functions and operators must take into account the following characteristic of the function to be lifted.

  1. The number of arguments of the function:

    • unary functions, such as degrees for temporal floats or setPrecision for temporal points.
    • binary functions and operators, such as arithmetic operators and comparisons (e.g., + or <) or spatial relationships functions (e.g.,tintersects).
  2. The type of the arguments for binary functions:

    • a temporal type and a base type, such as tfloat + float. In this case, the non-lifted function is applied to each instant of the temporal type.
    • two temporal types, such as tfloat + tfloat. In this case the operands must be synchronized and the function is applied to each pair of synchronized instants.
  3. Whether the type of the arguments may vary. For example, temporal numbers can be of different base type (that is, integer or float). Therefore, the Oids (i.e., PostgreSQL object identifiers) of the arguments must be taken into account when computing binary operators (e.g., + or <) for temporal numbers.

  4. The number of optional parameters of the function:

    • no arguments, such as most spatial relationships functions (e.g., tintersects).
    • one argument, such as spatial relationships functions that need an additional parameter (e.g., tdwithin).
    • two arguments, e.g., when assembling a temporal point from two temporal floats, the SRID and and a boolean flag stating whether the resulting temporal point is geometric or geographic are needed.
  5. The result type of the function at every instant. For example, it would be either integer or float for arithmetic operators but would be boolean for comparisons.

  6. Whether the result is linear or stepwise. Obviously, this is required only for base types that are continuous, such as floats or points. For example, the result of arithmetic operators over temporal floats will be linear if at least one argument is linear.

  7. Whether the base function is symmetric. For example, considering arithmetic operators, addition and multiplication are symmetric, while subtraction and division are not.

  8. Whether the function has instantaneous discontinuities at the crossings. Examples of such functions are temporal comparisons for temporal floats or temporal spatial relationships since the value of the result may change immediately before, at, or immediately after a crossing.

  9. Whether the ever or always semantics is used. Examples of such functions are eDisjoint or aDisjoint.

  10. Whether intermediate points must be added to take into account the crossings or the turning points (i.e., local minimum/maximum) of the function. We need to consider two cases depending on the type of the arguments as follows

    • a temporal type and a base type: in this case the non-lifted function is applied to each instant of the temporal type.
    • two temporal types: in this case the operands must be synchronized and the function is applied to each pair of synchronized instants.

    Turning points may be added in each of the above cases. For the first case above, none of the arithmetic operators (+, -, *, /) require to add a turning point when computing tfloat <op> base but for when computing the temporal distance tpoint <-> point we need to add a turning point at the shortest distance between the temporal point and the point. For the second case above, tfloat + tfloat only needs to synchronize the arguments while tfloat * tfloat requires in addition to add the turning point, which is the timestamp in which the linear functions defined by the two segments are equal.

We will sketch the steps to lift a function using as example the temporal distance operator <-> which is implemented by the distance function. In this case, PostGIS already provides the function to be lifted with signature ST_Distance(geo, geo): float, where the arguments are of type geometry or geography, although in our case these arguments are restricted to a point. In other circumstances, for example if we want to add a bearing function computing the bearing between two points, we would also need to write the two functions with signature bearing(point, point): float for both geometric and geographic points.

After having the function to be lifted, we need to provide the functions required to add the turning points in the two cases as follows

  1. one of the arguments is temporal and another is a base type, i.e., functions distance(tpoint, point) and distance(point, tpoint), and
  2. both arguments are temporal, i.e., a function distance(tpoint, tpoint).

Consider the first case above, when one of the arguments is temporal and another is a base type. The following figure explains the idea of adding a turning point.

Adding a turning point for distance(tpoint, point)

When computing the temporal distance, we need to add the turning point at the timestamp t at which the distance between point p and the segment defined by the points p1 and p2 is minimal. Thus, the result of the temporal distance would be composed of three instants: [ST_Distance(p1, p)@t1, ST_Distance(p, q)@t, ST_Distance(p2, p)@t2]. The MobilityDB function that computes this turning point is called tpoint_geo_min_dist_at_timestamp.

Now, consider the second case above when both arguments are temporal points. The first thing to do is to synchronize the two temporal points as illustrated by the following figure

Synchronizing two temporal point

Synchronization is done internally by MobilityDB by (1) restricting the two arguments to their common time span, i.e., keeping only the parts between the leftmost and rightmost continuous vertical lines, and (2) adding to both arguments the intermediate points represented by the dashed vertical lines. Consider now the following figure that explains how to add the turning points to a pair of synchronized segments.

Adding a turning point for distance(tpoint, tpoint)

Here we have two objects o1 and o2. The first object moves from p1@t1 to p2@t2 while the second object moves from p3@t1 to p1@t2. The turning point is the defined by the location of the two objects at the time t in which the distance between the two objects is minimal. Therefore the result of the temporal function for this segment would be composed of three instants: [ST_Distance(p1, p3)@t1, ST_Distance(p, q)@t, ST_Distance(p1, p2)@t2]. The MobilityDB function that computes this turning point is called tpoint_min_dist_at_timestamp.

A struct named LiftedFunctionInfo is used to describe the above characteristics of the function to be lifted.

#define MAX_PARAMS 3

typedef struct
{
  Datum (*func)(Datum, ...); /**< Variadic function that is lifted */
  int numparam;              /**< Number of parameters of the function */
  Datum param[MAX_PARAMS];   /**< Datum array for the parameters of the function */
  bool argoids;              /**< True if the lifted function requires the Oid of the arguments */
  Oid argtypid[2];           /**< Type of the arguments */
  Oid restypid;              /**< Base type of the result of the function */
  bool reslinear;            /**< True if the result has linear interpolation */
  bool invert;               /**< True if the arguments of the function must be inverted */
  bool discont;              /**< True if the function has instantaneous discontinuities */
  bool ever;                 /**< True/false when computing the ever/always semantics */
  bool (*tpfunc_base)(const TInstant *, const TInstant *, Datum, Oid, 
    Datum *, TimestampTz *); /**< Turning point function for temporal and base types*/
  bool (*tpfunc)(const TInstant *, const TInstant *, const TInstant *,
     const TInstant *, Datum *,
     TimestampTz *);         /**< Turning point function for two temporal types */
} LiftedFunctionInfo;

In the functions that implement the temporal distance, we need to fill this structure with the right values and pass them to the built-in lifting infrastructure in MobilityDB.

Let us consider first the case of computing the temporal distance between a temporal point and a point (or vice versa). We will explain the overall process starting from the definition of the SQL functions as follows.

CREATE FUNCTION distance(geometry, tgeompoint)
  RETURNS tfloat
  AS 'MODULE_PATHNAME', 'Distance_geo_tpoint'
  LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
CREATE FUNCTION distance(tgeompoint, geometry)
  RETURNS tfloat
  AS 'MODULE_PATHNAME', 'Distance_tpoint_geo'
  LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
CREATE FUNCTION distance(tgeompoint, tgeompoint)
  RETURNS tfloat
  AS 'MODULE_PATHNAME', 'Distance_tpoint_tpoint'
  LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

These SQL functions call the external C functions defined as follows.

PG_FUNCTION_INFO_V1(Distance_geo_tpoint);
/**
 * Returns the temporal distance between the geometry/geography point
 * and the temporal point
 */
PGDLLEXPORT Datum
Distance_geo_tpoint(PG_FUNCTION_ARGS)
{
  GSERIALIZED *gs = PG_GETARG_GSERIALIZED_P(0);
  Temporal *temp = PG_GETARG_TEMPORAL(1);
  ensure_point_type(gs);
  ensure_same_srid(tpoint_srid(temp), gserialized_get_srid(gs));
  ensure_same_dimensionality_tpoint_gs(temp, gs);
  if (gserialized_is_empty(gs))
    PG_RETURN_NULL();
  /* Store fcinfo into a global variable */
  store_fcinfo(fcinfo);
  Temporal *result = distance_tpoint_geo(temp, PointerGetDatum(gs));
  PG_FREE_IF_COPY(gs, 0);
  PG_FREE_IF_COPY(temp, 1);
  PG_RETURN_POINTER(result);
}

PG_FUNCTION_INFO_V1(Distance_tpoint_geo);
/**
 * Returns the temporal distance between the temporal point and the
 * geometry/geography point
 */
PGDLLEXPORT Datum
Distance_tpoint_geo(PG_FUNCTION_ARGS)
{
  Temporal *temp = PG_GETARG_TEMPORAL(0);
  GSERIALIZED *gs = PG_GETARG_GSERIALIZED_P(1);
  ensure_point_type(gs);
  ensure_same_srid(tpoint_srid(temp), gserialized_get_srid(gs));
  ensure_same_dimensionality_tpoint_gs(temp, gs);
  if (gserialized_is_empty(gs))
    PG_RETURN_NULL();
  /* Store fcinfo into a global variable */
  store_fcinfo(fcinfo);
  Temporal *result = distance_tpoint_geo(temp, PointerGetDatum(gs));
  PG_FREE_IF_COPY(temp, 0);
  PG_FREE_IF_COPY(gs, 1);
  PG_RETURN_POINTER(result);
}

PG_FUNCTION_INFO_V1(Distance_tpoint_tpoint);
/**
 * Returns the temporal distance between the two temporal points
 */
PGDLLEXPORT Datum
Distance_tpoint_tpoint(PG_FUNCTION_ARGS)
{
  Temporal *temp1 = PG_GETARG_TEMPORAL(0);
  Temporal *temp2 = PG_GETARG_TEMPORAL(1);
  ensure_same_srid(tpoint_srid(temp1), tpoint_srid(temp2));
  ensure_same_dimensionality(temp1->flags, temp2->flags);
  /* Store fcinfo into a global variable */
  store_fcinfo(fcinfo);
  Temporal *result = distance_tpoint_tpoint(temp1, temp2);
  PG_FREE_IF_COPY(temp1, 0);
  PG_FREE_IF_COPY(temp2, 1);
  if (result == NULL)
    PG_RETURN_NULL();
  PG_RETURN_POINTER(result);
}

As can be seen, these functions just verify the validity of the given arguments and then call the corresponding internal function. Let us start with the case where the arguments are a temporal point and a point.

Temporal *
distance_tpoint_geo(const Temporal *temp, Datum geo)
{
  LiftedFunctionInfo lfinfo;
  memset(&lfinfo, 0, sizeof(LiftedFunctionInfo));
  lfinfo.func = (varfunc) get_pt_distance_fn(temp->flags);
  lfinfo.numparam = 0;
  lfinfo.argtypid[0] = lfinfo.argtypid[1] = temp->basetypid;
  lfinfo.restypid = FLOAT8OID;
  lfinfo.reslinear = MOBDB_FLAGS_GET_LINEAR(temp->flags);
  lfinfo.invert = INVERT_NO;
  lfinfo.discont = CONTINUOUS;
  lfinfo.tpfunc_base = lfinfo.reslinear ?
    &tpoint_geo_min_dist_at_timestamp : NULL;
  lfinfo.tpfunc = NULL;
  Temporal *result = tfunc_temporal_base(temp, geo, &lfinfo);
  return result;
}

After declaring the lifting structure lfinfo, we set lfinfo.func with the distance function to be used. This is done by a call to the function get_distance_fn shown below, which selects the function depending on whether the temporal points have planar or geodetic coordinates, and whether they have 2D or 3D coordinates.

/**
 * Select the appropriate distance function
 */
datum_func2
get_distance_fn(int16 flags)
{
  datum_func2 result;
  if (MOBDB_FLAGS_GET_GEODETIC(flags))
    result = &geog_distance;
  else
    result = MOBDB_FLAGS_GET_Z(flags) ?
      &geom_distance3d : &geom_distance2d;
  return result;
}

We continue filling the lifting structure as follows.

  • lfinfo.numparam = 0; There are no parameters to be passed to the distance function
  • lfinfo.argtypid[0] = lfinfo.argtypid[1] = temp->basetypid; The type of both arguments at each instant are the same as the one in the temporal point
  • lfinfo.restypid = FLOAT8OID; The result type at each instant is a long float whose Oid in PostgreSQL is denoted as stated.
  • lfinfo.reslinear = MOBDB_FLAGS_GET_LINEAR(temp->flags); The result is linear if the temporal point passed as argument is linear
  • lfinfo.invert = INVERT_NO; The distance function is symmetric and thus no inversion of parameters is needed. In functions that are not symmetric, for example subtraction for temporal numbers, an additional parameter would be passed to the internal C function stating whether the SQL function was called with parameters (tnumber, number) or (number, tnumber) and thus the built-in lifting infrastructure will call func(value, number) or func(number, value) for each instant value@t of the temporal number.
  • lfinfo.discont = CONTINUOUS; The distance function does not have instantaneous discontinuities.
  • lfinfo.tpfunc_base = lfinfo.reslinear ? &tpoint_geo_min_dist_at_timestamp : NULL; The function for adding the turning points if the result is linear
  • lfinfo.tpfunc = NULL; No function given in this case, it should be filled for the second case below.
  • Temporal *result = tfunc_temporal_base(temp, geo, lfinfo); Call the built-in lifting infrastructure. Notice that one parameter would be required, for example, for the tdwithin function: it would be the distance parameter given by the user in the query.

Similarly, when computing the temporal distance between two temporal points, the internal C function is as follows.

Temporal *
distance_tpoint_tpoint(const Temporal *temp1, const Temporal *temp2)
{
  LiftedFunctionInfo lfinfo;
  memset(&lfinfo, 0, sizeof(LiftedFunctionInfo));
  lfinfo.func = (varfunc) get_pt_distance_fn(temp1->flags);
  lfinfo.numparam = 0;
  lfinfo.restypid = FLOAT8OID;
  lfinfo.reslinear = MOBDB_FLAGS_GET_LINEAR(temp1->flags) ||
    MOBDB_FLAGS_GET_LINEAR(temp2->flags);
  lfinfo.invert = INVERT_NO;
  lfinfo.discont = CONTINUOUS;
  lfinfo.tpfunc_base = NULL;
  lfinfo.tpfunc = lfinfo.reslinear ? &tpoint_min_dist_at_timestamp : NULL;
  Temporal *result = tfunc_temporal_temporal(temp1, temp2, &lfinfo);
  return result;
}

The function fills the lifting structure similarly as above, excepted that now

  • the result is linear as soon as at least one of the temporal points given as arguments is linear
  • we need to provide the function for adding the turning points, tpointseq_min_dist_at_timestamp in this case.

Finally, the function calls the built-in function tfunc_temporal_temporal which computes in a single pass (this is crucial for achieving good perfomance) the synchronization and the computation of the lifted function, adding the turning points when necessary.

Clone this wiki locally