Preface:
This issue follows the conversation in PR #109 and is purposefully a test case working on the migration described in issue #106. #109 is still open so we can kick the tires on both approaches.
For discussion of the text of this proposal, copy and paste content into a comment below and use strikethrough (~~strikethrough~~) to indicate removed text and bold (**bold**) to indicate added text. E.g., ~~change this~~ **to this**. An alternative could be: ~~strikethrough~~ (new text in parens in new expected style).
Note that this is probably a very long submission in comparison to typical change requests that would be vetted using github issues. Comments here will likely get long, but it seems that should be OK as long as we remember that it is probably an outlier and is about as long as these would ever get.
Summary
This proposal has been vetted on the CF email list extensively and has gone through a number of iterations. The structure and semantics of the proposed addition below should be close to complete, but this is the first review of proposed text to be added to the CF 1.8 specification. This is entirely new text (section 7.5) to be added just after section 7.4. There is also text to be added as Example E1 in Appendix E. The text should more or less speak for itself, but much more information about the proposal can be seen in the readme here, on the wiki about the specification here, and in the poster here.
The proposed text follows first with a suggested section 7.5 then a suggested example to be added to appendix E.
Section 7.5 Spatial Geometries
For many geospatial applications, data values are associated with a spatial geometry (e.g., the average monthly rainfall in the UK). Although cells with an arbitrary number of vertices can be described using Section 7.1, "Cell Boundaries", spatial geometries contain an arbitrary number of nodes for each geometry and include line and multipart geometries (e.g., the different islands of the UK). The approach described here specifies how to encode such geometries following the pattern in 9.3.3 Contiguous ragged array representation and attach them to variables in a way that is consistent with the cell bounds approach.
A geometry is usually thought to be a spatial representation of a real-world feature. It can be disjoint, having multiple parts. Geometry types are limited to point, multipoint, line, multiline, polygon and multipolygon types. Other types exist and may be introduced in a later version of the specification. Similar to other geospatial data formats, geometries are encoded as ordered sets of geospatial nodes. The connection between nodes is assumed to be linear in the coordinate reference system the nodes are defined in. Parametric geometries or otherwise curved features may be supported in the future.
All geometries are made up of one or more nodes. The geometry type specifies the set of topological assumptions to be applied to relate the nodes. For example, multipoint and line geometries are nearly the same except that nodes are interpreted as being connected for lines. Lines and polygons are also nearly the same except that the first and last nodes must be identical for polygons. Polygons that have holes, such as waterbodies in a land unit, are encoded as a collection of polygon ring parts, each identified as an exterior or interior ring. Multipart geometries, such as multiple lines representing the same river or multiple islands representing the same jurisdiction, are encoded as collections of unconnected points, lines, or polygons that are logically grouped into a single geometry.
While this geometry encoding is applicable to any variable that shares a dimension with a set of geometries, the application it was originally designed for requires that the geometry be joined to the instance dimension of a Discrete Sampling Geometry timeSeries featureType. In this case, any data variable can be given a geometry attribute that is to be interpreted as the representative geometry for the quantity held in the variable. An example of this is areal average precipitation over a watershed. An example of line geometry with time series data is given in Appendix E: Cell Methods.
Geometry Variables and Attributes
A set of geometries can be added to a file by inserting all required data variables and a geometry container variable that acts as a container for attributes that describe a set of geometries. A geometry attribute containing the name of a geometry container variable can be added to any variable that shares a dimension with the geometries. The geometry container must hold geometry_type and node_coordinates attributes. Depending on the geometry_type, the geometry container may also need to contain a node_count, part_node_count, and interior_ring attribute. These attributes are described in detail below.
The geometry_type attribute must be carried by a geometry container variable and indicates the type of geometry present. Its allowable values are: point, multipoint, line, multiline, polygon, multipolygon. The node_coordinates attribute must be carried by a geometry container variable and contains the space-delimited names of the x and y (and z) variables that contain geometry node coordinates.
For all geometry types except point, the geometry container variable must have a node_count attribute that contains the name of a variable indicating the count of nodes per geometry. Note that the node count may span multiple geometry parts. For multiline, multipolygon, and polygons with holes, the geometry container variable must have a part_node_count attribute that contains the name of a variable indicating the count of nodes per geometry part. Note that because multipoint geometries always have a single node per part, the part_node_count is not required.
For polygon and multipolygon geometries with holes, the geometry container variable must have an interior_ring attribute that contains the name of a variable that indicates if the polygon parts are interior rings (i.e., holes) or not. The variable indicated by the interior_ring attribute should contain the value 0 to indicate an exterior ring polygon and 1 to indicate an interior ring polygon. Note that single part polygons can have interior rings; multipart polygons are distinct in that they have more than one exterior ring.
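The attribute requirements above can be summarized as a small lookup. The following is a minimal sketch only, not part of the proposed specification text: the function names and the `has_holes` flag are hypothetical, and the container is modeled as a plain dictionary of attribute names to values.

```python
# Sketch of the attribute rules described above. All names here are
# hypothetical helpers, not part of CF or of any netCDF library.
ALLOWED_TYPES = ("point", "multipoint", "line", "multiline", "polygon", "multipolygon")


def required_attributes(geometry_type, has_holes=False):
    """Return the attribute names a geometry container must carry.

    `has_holes` is an assumption of this sketch; it only matters for
    polygon and multipolygon geometries.
    """
    if geometry_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown geometry_type: {geometry_type!r}")
    polygonal = geometry_type in ("polygon", "multipolygon")
    required = ["geometry_type", "node_coordinates"]
    if geometry_type != "point":
        # Every type except point needs a count of nodes per geometry.
        required.append("node_count")
    if geometry_type in ("multiline", "multipolygon") or (polygonal and has_holes):
        # Multiline, multipolygon, and polygons with holes need per-part counts.
        required.append("part_node_count")
    if polygonal and has_holes:
        # Holes must be flagged so parts can be told apart.
        required.append("interior_ring")
    return required


def missing_attributes(container_attrs, has_holes=False):
    """List required attributes absent from a container's attribute dict."""
    needed = required_attributes(container_attrs.get("geometry_type"), has_holes)
    return [name for name in needed if name not in container_attrs]
```

For instance, a multiline container that carries only geometry_type, node_coordinates, and node_count would be reported as missing part_node_count.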
The variables that contain geometry node coordinate data, indicated by the node_coordinates attribute on the geometry container variable, are also identifiable through the use of a required cf_role attribute. Allowable values are geometry_x_node, geometry_y_node, and geometry_z_node.
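As an illustration of how a reader might cross-check node_coordinates against cf_role, here is a minimal sketch. It assumes variable attributes have already been read into plain dictionaries; the function name and the variable names `x` and `y` are hypothetical.

```python
# Sketch: map each cf_role to the node coordinate variable that carries it.
# The dict-based inputs are an assumption of this example, not a real API.
ALLOWED_ROLES = {"geometry_x_node", "geometry_y_node", "geometry_z_node"}


def node_variables_by_role(container_attrs, variable_attrs):
    """Return {cf_role: variable name} for the listed node coordinate variables."""
    roles = {}
    # node_coordinates holds space-delimited variable names.
    for name in container_attrs["node_coordinates"].split():
        role = variable_attrs[name].get("cf_role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"variable {name!r} has no valid cf_role: {role!r}")
        roles[role] = name
    return roles


container = {"geometry_type": "line", "node_coordinates": "x y"}
variables = {
    "x": {"cf_role": "geometry_x_node"},
    "y": {"cf_role": "geometry_y_node"},
}
roles = node_variables_by_role(container, variables)
```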
Encoding Geometries
Geometry encoding follows a similar pattern to the contiguous ragged array approach in 9.3.3 Contiguous ragged array representation with some modification to suit the spatial geometry use case rather than observational time series. All spatial data are encoded in the variables indicated by the node_coordinates and appropriate cf_role attribute. These node variables should be one-dimensional, with a length equal to the total number of nodes. Three one-dimensional variables are used to break up and interpret the node variables: node_count, part_node_count, and interior_ring.
For geometry types requiring a node_count attribute, the node count variable should have one element per geometry and indicate the number of nodes in that geometry. For geometry types requiring a part_node_count attribute, the part node count variable should have one element per geometry part and indicate the number of nodes in that part. For geometry types requiring an interior_ring attribute, the interior ring variable should have one element per geometry part and contain 0s and 1s to indicate exterior or interior.
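To make the ragged layout concrete, the sketch below decodes flat node arrays into geometries and parts. It is one possible reading of the rules above, not a reference implementation: the function name, the list-based inputs, and the coordinate values are all hypothetical.

```python
# Sketch: split flat node arrays into geometries, each a list of parts.
# Inputs are plain Python lists standing in for netCDF variables.
def decode_geometries(x, y, node_count, part_node_count=None, interior_ring=None):
    if part_node_count is None:
        # Without part counts, each geometry is treated as a single part.
        part_node_count = list(node_count)
    if interior_ring is None:
        interior_ring = [0] * len(part_node_count)

    geometries = []
    node_start = 0
    part_index = 0
    for total in node_count:
        parts = []
        consumed = 0
        # Take parts until this geometry's node count is used up.
        while consumed < total:
            if part_index >= len(part_node_count):
                raise ValueError("part_node_count is shorter than node_count implies")
            n = part_node_count[part_index]
            nodes = list(zip(x[node_start:node_start + n], y[node_start:node_start + n]))
            parts.append({"nodes": nodes, "interior": bool(interior_ring[part_index])})
            node_start += n
            consumed += n
            part_index += 1
        if consumed != total:
            raise ValueError("part_node_count does not add up to node_count")
        geometries.append(parts)
    return geometries


# Hypothetical data: a square with one square hole, plus a separate triangle.
x = [0, 10, 10, 0, 0, 2, 2, 4, 4, 2, 20, 30, 25, 20]
y = [0, 0, 10, 10, 0, 2, 4, 4, 2, 2, 0, 0, 10, 0]
geoms = decode_geometries(
    x, y,
    node_count=[10, 4],
    part_node_count=[5, 5, 4],
    interior_ring=[0, 1, 0],
)
```

Here the first geometry spans two parts (ten nodes in total), which is why node_count alone is not enough to recover the rings.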
The ecosystem of polygon specifications and software implementations of those specifications varies in how polygons are encoded. Nodes within a polygon's exterior and interior rings are typically encoded in opposite directions (clockwise or anticlockwise) around the polygon. This is important for operations such as calculating area. CF requires that outer rings be encoded in anticlockwise order and interior rings be encoded in clockwise order. CF also requires that the first and last node in a polygon be identical to ensure polygon rings are complete.
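A ring's direction can be checked with the sign of its planar (shoelace) area. The sketch below assumes conventional axes, with x increasing to the right and y increasing upward, and treats coordinates as planar, which is only an approximation for longitude and latitude. The function names are hypothetical.

```python
# Sketch: check ring closure and winding order with the shoelace formula.
def signed_area(ring):
    """Planar signed area: positive for anticlockwise rings, negative for clockwise."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_closed(ring):
    """A complete ring repeats its first node as its last node."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def follows_ring_rules(ring, interior):
    """True if the ring is closed and wound as described above."""
    if not is_closed(ring):
        return False
    area = signed_area(ring)
    # Interior rings (holes) are clockwise; exterior rings are anticlockwise.
    return area < 0 if interior else area > 0


exterior = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]  # anticlockwise
hole = [(2, 2), (2, 4), (4, 4), (4, 2), (2, 2)]          # clockwise
```

A writer could reverse any ring that fails this check before encoding it, since reversing the node order flips the sign of the area.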
A coordinate reference system (CRS) (referred to as a grid mapping elsewhere in the CF convention) is strictly required for geometries. The normal CF practice, of attaching a grid_mapping attribute--containing the name of a CRS container variable--to a data variable, can be used and the grid_mapping CRS should be assumed to apply to the geometry. However, the normal grid_mapping, which typically applies to auxiliary coordinate variables and remains optional for use with geometries, can be overridden by attaching a crs attribute that contains the name of a CRS container variable to the geometry container variable. If a grid_mapping is not present on a data variable linked to geometry, a crs attribute is required.
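The precedence between crs and grid_mapping can be sketched as follows. This is an illustration under stated assumptions: attributes are modeled as plain dictionaries, and the function name and the variable name `crs_wgs84` are hypothetical.

```python
# Sketch: find the name of the CRS container variable that applies to a geometry.
def resolve_crs_name(data_var_attrs, container_attrs):
    # A crs attribute on the geometry container overrides grid_mapping.
    if "crs" in container_attrs:
        return container_attrs["crs"]
    # Otherwise the data variable's grid_mapping is assumed to apply.
    if "grid_mapping" in data_var_attrs:
        return data_var_attrs["grid_mapping"]
    raise ValueError("geometry needs a crs attribute or a grid_mapping on the data variable")


name = resolve_crs_name(
    {"geometry": "geometry_container", "grid_mapping": "crs_wgs84"},
    {"geometry_type": "line", "node_coordinates": "x y"},
)
```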
Example 7.14. A multipolygon with holes
This example demonstrates the use of all potential attributes and variables for encoding geometries.
Example E.1. Timeseries with geometry.
The time series variable, someData, is associated with line geometries via the geometry attribute. The first line geometry is comprised of three nodes, while the second has two nodes. Client applications unaware of CF geometries can fall back to the lat and lon variables to locate feature instances in space. In this example, lat and lon coordinates are identical to the first node in each line geometry, though any representative point could be used.
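The relationship between the node arrays and the fallback lat and lon values can be sketched as below. The coordinate values are invented for illustration and are not taken from the proposal's example; x is assumed to hold longitudes and y latitudes.

```python
# Sketch: two line geometries (three nodes, then two) and their fallback points.
from itertools import accumulate

node_count = [3, 2]
x = [30.0, 10.0, 40.0, 50.0, 50.0]  # hypothetical node longitudes
y = [10.0, 30.0, 40.0, 60.0, 50.0]  # hypothetical node latitudes

# Start offset of each geometry within the flat node arrays.
starts = [0] + list(accumulate(node_count))[:-1]
lines = [list(zip(x[s:s + n], y[s:s + n])) for s, n in zip(starts, node_count)]

# One representative point per feature instance: here, the first node of each line.
lon = [line[0][0] for line in lines]
lat = [line[0][1] for line in lines]
```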