Add support for deformation models #1001

Closed
ccrook opened this issue May 18, 2018 · 38 comments · Fixed by #2206
Labels: enhancement, feature request, pinned

ccrook commented May 18, 2018

This is an enhancement proposal to provide a generic implementation of deformation models in PROJ.

This briefly describes the reasons for using deformation models and proposes a detailed format definition for encoding such a model. The format definition is intended as a straw man for debate before implementation. The intention in detailing it so precisely is to provide an unambiguous representation of the proposal. If it provokes debate, it can be updated to converge on an implementation proposal.

Background

Deformation models are used to relate coordinates or coordinate systems based
on physical objects to global reference frames such as ITRF/WGS84 used by GNSS
(global navigation satellite systems) such as GPS.

For stable continental tectonic plates, which move as rigid bodies
across the earth, this movement can be represented as a simple rotation
and implemented using a Helmert transformation with differential rotation
parameters. https://proj4.org/operations/transformations/helmert.html

In some areas deformation can be described by a velocity model. For example,
GIA (glacial isostatic adjustment) in Scandinavia is well modelled by
velocity grids.

However, in many areas the secular velocity model is overlaid with complex
deformation due to tectonic events such as earthquakes, which may include both
coseismic deformation (a step function) as well as an ongoing post-seismic
deformation signal.

The New Zealand geodetic datum NZGD2000 is an example of a datum that
incorporates a deformation model.

While there are formats which describe aspects of deformation
(NTv2 and ctable have been used), there is currently nothing which provides a
general, complete, self-contained model of deformation that can be implemented
as a time-dependent step in a transformation chain.

Functional model

A generic functional model which can represent most types of deformation consists of
an arbitrary number of components, each of which consists of a
spatial model multiplied by a piecewise linear time function. The spatial
model is defined by an ordered sequence of one or more grids. The displacement of the spatial model
can be defined in one of two ways: either by adding the values calculated for
each grid independently, or by using the value from the first grid in the sequence
which contains the calculation point. This allows defining a set of nested grids (such
as used in the NTv2 model). Each grid may define horizontal deformation,
vertical deformation, or 3d deformation.
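As an illustration of the two ways of combining grids described above, a minimal Python sketch (the grid_value helper and the grid attribute names are assumptions of the sketch, not part of the proposal):

```python
def spatial_displacement(grids, x, y, grids_additive, grid_value):
    """Evaluate the spatial model of one component at (x, y).

    grids: the ordered sequence of grids for the component
    grid_value(grid, x, y): returns the (dx, dy, dz) values interpolated from one grid
    grids_additive: if set, every grid containing the point contributes;
                    otherwise only the first grid containing the point is used.
    """
    total = [0.0, 0.0, 0.0]
    for grid in grids:
        if not (grid.xmin <= x <= grid.xmax and grid.ymin <= y <= grid.ymax):
            continue
        value = grid_value(grid, x, y)
        if not grids_additive:
            return list(value)          # nested grids: first matching grid wins
        total = [t + v for t, v in zip(total, value)]
    return total
```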

Multiple grids are provided to allow efficient representation of deformation, such as
that due to earthquakes, which has highly variable spatial complexity. The spatial model
can use a coarse grid representing the far-field deformation, and
progressively finer resolution grids approaching the epicentre.

Currently this proposal does not include defining uncertainties of the displacement
at each node (as for example NTv2 does).

A piecewise linear time model is proposed as a simple, minimal functional form
that can represent the approximate time history of a deformation event. While
more complex functional forms are often used in geophysics (for example, exponential or logarithmic functions for post-earthquake deformation), these can be
represented with arbitrary precision by a piecewise linear function, and
with realistic precision using relatively few points.

A model following this approach (with some variations in the time model)
has been adopted by New Zealand for publishing
their deformation model. The various deformation grids are defined in separate
csv format files, with the time functions and other metadata linking them
together also in csv files. The entire set of files is collected into a single
zip file. While this format is complete and very accessible it is intended specifically for
publication. It is much too inefficient for use in large scale transformation
computations.

In addition the New Zealand experience suggests it is useful to support
multiple versions of the deformation model. With each earthquake a new version of the datum
is published with a new version of the deformation model. The new version of
the deformation model is identical to the previous except that it has some
extra components (spatial/time models) representing the latest earthquake. It
therefore contains all the data for all previous versions of the deformation
model. So it makes sense that it can be used to calculate all versions of the
model, not just the latest.

This is supported in this proposal by specifying for each component the model
version in which it is added, and the model version in which it is removed.
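A minimal sketch of how these version tags might be used when evaluating a particular version of the model (the field names follow the component definition given later in this proposal):

```python
def components_for_version(components, version):
    """Select the components that apply to the requested model version.

    vadd is the version in which the component was added; vremove is the
    version in which it was revoked (0 if it has never been revoked).
    """
    return [
        c for c in components
        if c.vadd <= version and (c.vremove == 0 or version < c.vremove)
    ]
```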

Apart from the format itself and an API for calculating the deformation it defines
at any time and place, the format needs to be supported by a number of tools:
for constructing it from ASCII or other formats, for checking the validity
and consistency of the model (for example, continuity where nested grids
overlap), and for testing the calculated deformation.

Binary format proposal for generic deformation model.

Some criteria in defining a binary format are:

  • it should be as simple and minimal as practical
  • it should contain sufficient metadata to completely define the model
  • it should be identifiable and have well defined and testable encoding of numeric values
  • it should be structured to support future enhancement and extension

The following format is a straw man proposal for a format that can be used to
implement the deformation model, to serve as a basis for discussion. This will
be version 1 of the format.

The format combines a model definition and a series of grid definitions in a single file. Effectively each of these is an embedded file. As currently written, this proposal suggests a simple concatenation of components in which the model definition file defines the byte offset of each of the grids from the end of the file. However, the discussion below suggests that there may be advantages in using the tar format to combine the grid files. In this case the header file would identify grid files by name, and the grid file locations would be determined when they are first read.

Units used are:

  • x,y coordinates are in degrees for geographical systems. If non-geographical systems are defined the units will be metres.
  • dx,dy displacements are in degrees for geographical systems. If non-geographical systems are defined the units will be metres.
  • dz displacements are in metres
  • time values in the model are expressed as decimal years.
  • model version numbers are defined as unsigned*4 (the New Zealand convention represents versions as yyyymmdd)
  • text values are represented by an unsigned*2 count of bytes in the string, followed by the data. Text data will be assumed to be in UTF-8, though it is not directly used by the model.

East and north values for coordinates and displacements are positive. Upwards displacements are positive.

It is proposed that this format is called defmod.
Note: it turns out defmod is the name of a finite element crustal deformation modelling format which is sufficiently close to this application that it could be confusing. Format name to be determined.

File header

The header identifies the file type, binary encoding, and versioning information. It is
proposed that for PROJ the encoding will always be little-endian, though the header could be used
to check endian-ness.

name | type | description
-- | -- | --
signature | char*10 | Identifies this as a defmod file, always "PROJDEFMOD"
bytesig2 | unsigned*4 | Test for correct reading of endian-ness, always 12345678
bytesig3 | float*8 | Test for correct floating point numeric format, always 1234.5678
version | unsigned*4 | Version of defmod format used by this file, initially 1
flags | unsigned*4 | Flags defining component settings
ncomp | unsigned*4 | Number of components in the model
ocomp | unsigned*4 | Offset of the first component definition from the start of the file

The flags are followed by deformation model metadata. This is not used for the calculation
but may be used by the model creator to specify information about the model.
The metadata is defined by key/value pairs. Each key and value is a string value.
It may be useful to define some keys for common use, such as description, authority, and so on.

name | type | description
-- | -- | --
nmeta | unsigned*4 | Number of metadata items
key1 | string | First metadata key
value1 | string | First metadata value
key2 | string | Second metadata key
value2 | string | Second metadata value
... | ... | ...
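As a rough sketch of how a reader might consume this header and metadata (assumptions of the sketch: fields are packed without padding, everything is little-endian, the metadata block immediately follows the fixed header fields, and strings use the length-prefixed UTF-8 encoding described under units above):

```python
import struct

def read_string(buf, pos):
    """Read a length-prefixed UTF-8 string (unsigned*2 byte count, then data)."""
    (nbytes,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    text = buf[pos:pos + nbytes].decode("utf-8")
    return text, pos + nbytes

def read_header(buf):
    """Parse the proposed defmod file header and model metadata from bytes."""
    fmt = "<10sIdIIII"  # signature, bytesig2, bytesig3, version, flags, ncomp, ocomp
    signature, bytesig2, bytesig3, version, flags, ncomp, ocomp = \
        struct.unpack_from(fmt, buf, 0)
    assert signature == b"PROJDEFMOD"
    assert bytesig2 == 12345678 and abs(bytesig3 - 1234.5678) < 1e-9
    pos = struct.calcsize(fmt)
    (nmeta,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    metadata = {}
    for _ in range(nmeta):
        key, pos = read_string(buf, pos)
        value, pos = read_string(buf, pos)
        metadata[key] = value
    return {"version": version, "flags": flags, "ncomp": ncomp,
            "ocomp": ocomp, "metadata": metadata}
```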

Flags to implement initially are:

id | description
-- | --
is_geographic | Set if the coordinate system is geographic, so can wrap longitudes

Model components

Each component (spatial model/time model) is defined by a header as follows:

name | type | description
-- | -- | --
cbytelen | unsigned*4 | Offset from start of this component to start of next component
ntimes | unsigned*4 | Number of time/scale factor pairs defining the time model
ngrids | unsigned*4 | Number of grids defining the spatial model
cflags | unsigned*4 | Flags defining component settings, see below
trange | float*8 | tmin, tmax, the range of times for which the model is considered valid. Calculation of times outside this range will fail
crange | float*8 | Four values, xmin, xmax, ymin, ymax, defining the spatial extent for which the component is defined
vadd | integer*4 | The first version of the deformation model (not the format) for which this component is to be used
vremove | integer*4 | The version of the deformation model in which the component is revoked. Use 0 if it has not been revoked

This is followed by ntimes time points. Each contains a time value in decimal years and a scale
factor that applies at that time. Between the time points the scale value is interpolated
linearly. The scale factors before the first time and after the last time are defined by the
tlinear0 and tlinear1 flags.

As a special case, if there is just one time/scale point then the spatial model defines a velocity
model rather than a displacement model. The displacement is zero at the time point. The
spatial model is multiplied by the time point scale factor and treated as a rate of change per year.

Time values should be defined from the earliest to the latest time. Consecutive values can have the same time to represent a step function.

Each time value is represented by

name | type | description
-- | -- | --
timen | float*8 | The time at which the scale factor applies
scalen | float*8 | The scale factor applied to the spatial model at that time

This is followed by the sequence of grids implementing the spatial model.
Each grid can be used by more than one
component, though usually will not be. The format of the grid is defined below. It is represented
in the component definition by its offset from the start of the file. Also the range of each grid is defined in the component definition so that there is no need to seek grid definitions which do not include the point of interest. The range is duplicated in the grid definition below so that it forms a complete standalone embedded definition of the grid. The checking software will include testing that the two copies of the ranges are the same.

name | type | description
-- | -- | --
gridn | unsigned*8 | The offset of the start of the grid from the start of the file
xmin | float*8 | Minimum x value (longitude) covered by the grid
xmax | float*8 | Maximum x value (longitude) covered by the grid
ymin | float*8 | Minimum y value (latitude) covered by the grid
ymax | float*8 | Maximum y value (latitude) covered by the grid

The list of grid models may be followed by component metadata stored as a sequence of key, value
pairs. Each key and value is a string value.

name | type | description
-- | -- | --
nmeta | unsigned*4 | Number of metadata items
key1 | string | First metadata key
value1 | string | First metadata value
key2 | string | Second metadata key
value2 | string | Second metadata value
... | ... | ...

Flags to implement initially are:

id | description
-- | --
fail_outside | If set then the deformation is undefined outside the range of the component, so calculation of it should fail. If not set then it is assumed to be zero outside its extents
grids_additive | If set then each grid is calculated independently and summed to define the spatial model deformation. Otherwise just the first grid containing the calculation point is used to calculate the spatial model deformation
tlinear0 | If set then the first two time/scale points are extrapolated for times before the first. Otherwise the first scale value is used for all times before the first
tlinear1 | If set then the last two time/scale points are extrapolated for times after the last. Otherwise the last scale value is used for all times after the last
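Under those definitions, evaluating the time function might look roughly like this (a sketch only; the list arguments and boolean flags are assumptions of the sketch, and the single-point velocity special case follows the rule described above):

```python
def time_scale(times, scales, t, tlinear0=False, tlinear1=False):
    """Evaluate the piecewise linear time function at decimal year t.

    times/scales: the ntimes (time, scale factor) pairs, earliest first.
    tlinear0/tlinear1: component flags controlling extrapolation before the
    first and after the last time point.
    """
    if len(times) == 1:
        # Special case: a single point defines a velocity model; the spatial
        # model scaled by scales[0] is a rate of change per year, zero at times[0].
        return scales[0] * (t - times[0])

    def interpolate(i0, i1):
        dt = times[i1] - times[i0]
        if dt == 0.0:                       # equal consecutive times: step function
            return scales[i1]
        f = (t - times[i0]) / dt
        return scales[i0] + f * (scales[i1] - scales[i0])

    if t <= times[0]:
        return interpolate(0, 1) if tlinear0 else scales[0]
    if t >= times[-1]:
        return interpolate(len(times) - 2, len(times) - 1) if tlinear1 else scales[-1]
    for i in range(len(times) - 1):
        if times[i] <= t <= times[i + 1]:
            return interpolate(i, i + 1)
```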

Grid format

The grid format proposed can be used as a standalone grid file. The intention is that it should be useful as a generic grid format. The implementation in defmod will place constraints on the generic format in that it will require that the grid defines one, two, or three values which are dx, dy, dz. The format can easily be extended to include other information, for example horizontal and vertical uncertainty.

While a useful general grid format could also include a definition of nesting within the grid file this implementation is using the component definition to define the nesting of grids.

The existing formats such as ctable2 and GTX only partially meet the requirements for deformation models. In particular they don't support three data values at each node.

The defmod file may contain an arbitrary number of grids.

The data values (displacements) are stored as integer*4 values. For each data value the grid header defines an offset and scale factor to convert the integer to the displacement value. The value INT_MIN is used to identify missing data.

Grid nodes are assumed to be regularly spaced in rows and columns aligned with the x,y axes.
For geographic grids this means that the grid nodes are not equally spaced physically.

The data values for each node are stored sequentially. The nodes in the grid are stored sequentially for each latitude value ordered from east to west. The latitude rows are stored sequentially ordered from south to north.

Data values within grid cells are calculated using bilinear interpolation between the values at the nodes forming the corner of the cell. Data values outside the range of the grid are application dependent. A data value is undefined if its calculation requires using missing data.
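A sketch of that interpolation, including the integer scale/offset decoding and the INT_MIN missing-data convention described above (the node_values helper, which returns the raw integer values stored at a given row and column, and the grid attribute names are assumptions of the sketch):

```python
INT_MIN = -2147483648   # missing data marker

def interpolate_cell(grid, x, y, node_values):
    """Bilinear interpolation of the grid data values at a point (x, y) inside the grid.

    node_values(grid, row, col) returns the raw integer*4 values at a node;
    grid.scale[i] and grid.offset[i] convert them to real values.
    Returns None if any required node value is missing (INT_MIN).
    """
    dx = (grid.xmax - grid.xmin) / (grid.ncol - 1)   # node spacing in x
    dy = (grid.ymax - grid.ymin) / (grid.nrow - 1)   # node spacing in y
    col = min(int((x - grid.xmin) / dx), grid.ncol - 2)
    row = min(int((y - grid.ymin) / dy), grid.nrow - 2)
    fx = (x - (grid.xmin + col * dx)) / dx           # fractional position in the cell
    fy = (y - (grid.ymin + row * dy)) / dy

    corners = [node_values(grid, row, col),     node_values(grid, row, col + 1),
               node_values(grid, row + 1, col), node_values(grid, row + 1, col + 1)]
    weights = [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]

    result = []
    for i in range(grid.nval):
        raw = [c[i] for c in corners]
        if INT_MIN in raw:
            return None                              # undefined: requires missing data
        value = sum(w * r for w, r in zip(weights, raw))
        result.append(grid.offset[i] + grid.scale[i] * value)
    return result
```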

For use in defmod the grid data values will be either (dx, dy, dz), (dx, dy), or (dz). Values outside the range of the grid are assumed to be zero.

Note that this format is very simplistic. Some efficiency of retrieval could be gained by a tiled storage of blocks of the grid at a small cost in locating the values.

The grid is represented in the file as:

name | type | description
-- | -- | --
ftype | unsigned*4 | Version of the grid format used. Always 1 for this format
gflags | unsigned*4 | Grid flags
dataoffset | unsigned*4 | Offset from start of grid definition to grid data
ncol | unsigned*4 | Number of x values (columns) in the grid
nrow | unsigned*4 | Number of y values (rows) in the grid
nval | unsigned*4 | Number of values defined at each grid node
xmin | float*8 | Minimum x value
xmax | float*8 | Maximum x value
ymin | float*8 | Minimum y value
ymax | float*8 | Maximum y value

All the values in the grid are stored as integer*4 values, so for each of the data values at
each node the header defines a floating point scale and offset to apply to the integer value
to get the actual value at the node. There are nval scale and offset values.

name | type | description
-- | -- | --
scalei | float*8 | Scale factor for the i-th data value
offseti | float*8 | Offset for the i-th data value

The grid header may be followed by metadata stored as a sequence of key, value
pairs. Each key and value is a string value. This may include the names and descriptions
of each data value as value#_id, value#_description, where # is 1 for the first data value, etc.

name | type | description
-- | -- | --
nmeta | unsigned*4 | Number of metadata items
key1 | string | First metadata key
value1 | string | First metadata value
key2 | string | Second metadata key
value2 | string | Second metadata value
... | ... | ...

This is followed by the actual grid data stored as integer values.

name | type | description
-- | -- | --
value | integer*4 | nv × ncol × nrow values (nv is the number of values at each grid node)

Currently no grid flags are defined.

Supporting tools

A number of tools are proposed to support using the defmod format. The anticipated tools are listed below; some or all of them may be unnecessary if their function can be implemented in existing PROJ/GDAL tools.

defmod_convert

Converts between a simple ASCII format and the defmod binary format. The ASCII format will include
a single model definition file defining all the components, metadata, and time models, and a separate file for each of the grids. The grid files may alternatively be in the format used within the defmod file, particularly if that is supported by the conversion tool itself.

The format of the model definition file is not yet defined, but will likely be a simple UTF-8 file of code/value pairs, as sketched below. It could perhaps be fully implemented as a YAML file.
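A purely hypothetical illustration of what such a YAML model definition might look like (none of these key names are part of the proposal; they only echo the fields described above, and PyYAML is assumed for reading it):

```python
import yaml   # PyYAML; the model definition could be plain code/value pairs or YAML

# Hypothetical model definition; all key names are placeholders only.
MODEL_DEFINITION = """
description: Example deformation model
crs: EPSG:4167
is_geographic: true
components:
  - grids: [patch_coarse.grid, patch_fine.grid]
    grids_additive: false
    time_model:
      - {time: 2016.87, scale: 0.0}
      - {time: 2016.87, scale: 1.0}    # step function: coseismic displacement
    vadd: 20161201
    vremove: 0
"""

model = yaml.safe_load(MODEL_DEFINITION)
print(model["components"][0]["time_model"])
```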

defmod_convert could be extended or supplemented with additional programs to convert other formats.

defmod_calc

A program to calculate the deformation for a set of dates and locations. It is proposed that this
is a command line program that can take either lon/lat/date values on the command line, a date
and a filename for a set of input coordinates, or a file containing a set of longitude, latitude, date
values.

This will be built with the same code as that implementing defmod in proj so that it can be used
for development testing.

Also it can be used by users creating deformation models to check the model.

defmod_check

A tool to check the correctness of a defmod file. It checks that all data in the file is
valid (e.g. offsets, grid sizes, etc.) and where possible checks and warns about possible
discontinuities in the deformation model. For nested models this may be between the edge of a child grid and its parent. It may also warn if a grid overlaps more than one subsequent grid. For non-nested
grids it will check that the values at the edge are zero, unless the edge is also on the edge
of a component with the "fail_outside" flag.


kbevers commented May 18, 2018

These are indeed some interesting thoughts and a topic that I am currently very interested in. I assume you are familiar with the deformation, hgridshift, vgridshift and pipeline operators. While not fully achieving the generic implementation you envision, they get you almost all the way (I am by the way working on extending the gridshift operators so they can be used as step functions with regards to coseismic deformation).

I have so far tried to work around the creation of a new file format for deformation models but I agree that having it all in just one file does have some nice benefits. Especially in terms of distribution. Let's continue down this interesting path.

As I understand this proposal it comes with several individual "work packages":

  1. Determine the scope:
    Basically, what sort of deformation do we want to be able to model?

[screenshot: figure illustrating different types of deformation]

The generic implementation would be able to handle everything in the above figure. I think they can be narrowed down to three archetypes: Coseismic deformation (displacement step function), post-seismic deformation (e.g. logarithmic function) and secular deformation (velocity field - includes any predictable deformation, like GIA, plate motions, intra-plate deformation, etc.).
  2. File format definition:
    Based on the above a file format can be defined. You've already made a nice initial proposal.

  3. File format implementation:
    Once the format is defined this is "easy". If possible this should be implemented as a simple header file that can easily be included in PROJ and GDAL. Inclusion in GDAL is crucial so that all the regular tools can be used afterwards when creating models (possibly alleviating the need for the proposed defmod_calc tool). It also comes with the added bonus of not having to write defmod_convert and defmod_check as suggested since GDAL already provides the needed infrastructure.

  4. Implementation in PROJ:
    Perhaps the current deformation operator could be superseded by this? Apart from reading grids we already have most of the necessary code, it just needs to be refactored into this new operation.

  5. Create some deformation models!

So let's start by discussing point 1. I gather that your primary reason for this proposal is to be able to handle transformation to, from and within NZGD2000? That is, a datum of the semi-dynamic variety. For me it is important that fully dynamic datums are supported as well. I don't think those two are conflicting, but there may be some things to consider, such as the several central epochs of the semi-dynamic datum (corresponding to the reverse patches) versus a dynamic datum that in principle has no central epoch.

Perhaps it would be good to describe a few use cases with details of how the transformations work. E.g. NZGD2000, GIA in Scandinavia, dynamic datum in Iceland, the new American datums. I think that will make it easier to settle on a format that works in the generic case. Thoughts?

@busstoptaktik

@ccrook: I have two comments at wildly different levels of abstraction. Let's take the most concrete first:

  1. Please replace all instances of unsigned*2 by unsigned*4. The lesson learned from the LAS LiDAR point cloud data format is that "you will need larger blocks than you expect". So rather than retrofitting support for blocks larger than 64 KB, please waste a few tens of bytes from the start to support a block size of up to 4 GB.

  2. If I read the description correctly, the model parts accumulate forward in time, so the (probably) most common case will require a trip through all partial models. For efficiency, would it be possible to reverse this, so the format is based on an accumulated model, supplemented by "time reverted" partials, so that the (probably) common case of applying the accumulated model will be a one-step thing? I realize this will probably require separate handling of deformation and displacement.

The displacement parts will probably require use of the NTv2 style nested grids anyway, but stepping through nested grids while accumulating deformations forward in time seems like an unnecessarily inefficient way of doing things.

Note, however, that using a time reverted implementation will not reduce the complexity of the code, since we will still have to be able to peel off the surplus elements backward in time if transforming older data. If this is actually the common use case, this suggestion will not make any difference wrt. efficiency.

Also, your comments about the semantics of the grids_additive flag seem to indicate that the format is actually intended as time reversed. Could you elaborate a bit on that? And, for exposition, do you know of any accessible descriptions of how your current model is being used?

Also, @kbevers, your comment about "reverse patches" seems to indicate that your world view is the same as I suggest - I just cannot read this intention from Chris' original description.

Since humans live in a world obeying the second law of thermodynamics, forward time descriptions are more comprehensible, while time reverted implementations are probably more efficient in computational cases where we do not need to respect the "arrow of time".


kbevers commented May 18, 2018

> The displacement parts will probably require use of the NTv2 style nested grids anyway, but stepping through nested grids while accumulating deformations forward in time seems like an unnecessarily inefficient way of doing things.
>
> Note, however, that using a time reverted implementation will not reduce the complexity of the code, since we will still have to be able to peel off the surplus elements backward in time if transforming older data. If this is actually the common use case, this suggestion will not make any difference wrt. efficiency.

For the case in Iceland, I expect it to be a common thing to transform data from various epochs to a common epoch, making it necessary to potentially apply deformation from several grids, and importantly a differing number of grids, depending on the "beginning epoch" of each coordinate. This is different from what is needed in New Zealand, I believe.

Next week I will finish a document on how we plan on doing transformations in the Icelandic dynamic reference frame. I'll make sure to share it here as an input to this discussion.

> Also, @kbevers, your comment about "reverse patches" seems to indicate that your world view is the same as I suggest - I just cannot read this intention from Chris' original description.

Maybe you already know this, but if not, the reverse patch term is used by LINZ to describe a complete update of NZGD2000, basically a re-realization that updates coordinates of the fundamental stations in the frame. At least that is how I understand the concept - please excuse me if I have presented the concept incorrectly. Depending on the level of support for this kind of datum that is needed it will be necessary to keep track of these reverse patch updates, so that coordinates from a previous version can be used with the most recent version.


rouault commented May 18, 2018

> though the header could be used to check endian-ness

Let's make it as simple as possible and remove the bytesig1/2/3 fields. We just need to specify that floating-point numbers use IEEE-754 little-endian encoding, and everything is perfectly defined.

> signature | char*10 | Identifies this as a defmod file, always "PROJ_DEFMOD"

This fits on 11 bytes, not 10 :-)

> value1 | string | Second metadata key

Should read "First metadata value"

> is_geographic | Set if the coordinate system is geographic, so can wrap longitudes

Shouldn't we have (instead of, or in addition to, this) a more standard way of expressing the CRS to which the defmod applies: WKT2... ?

> ncol | unsigned*2 | Number of x values (columns) in the grid.

I second @busstoptaktik's sentiment there. Grids > 65536x65536 could potentially happen in a foreseeable future.
And consequently

> gridn | unsigned*4 | The offset of the start of the grid from the start of the file

should probably be an unsigned*8

Other points:

  • Does it make sense to consider grids that would have nodata values (as other grid formats have)? In which case we could decide as a convention that INT_MIN = -2147483648 is the nodata value. Behaviour to be defined when such a nodata value is encountered.

  • Regarding GDAL support to read and/or create a defmod file, this could be tricky as the GDAL raster model is rather simple compared to the nesting aspect contained in a defmod file. Might be possible by abusing it (GDAL has a record of having drivers that push it to/beyond its limits), but I'd rather see GDAL used internally by a dedicated defmod.py creation tool, for example to read the input data of a single grid.

  • For the defmod_convert utility, the metadata file you sketched looks pretty similar to a YAML encoding, which is Python friendly


ccrook commented May 18, 2018

Following @busstoptaktik I'll deal with the concrete first.

Thanks for corrections and suggestions for amending the format. All good, and I've updated it to reflect all of them (I think) except the removal of the check values in the header for integer and floating point format. The reason for leaving those in is that I can imagine the format being used other than in PROJ. For example, in NZ we have a deformation model used on a big-endian Solaris platform as part of a database extension. Similarly the suggestion of defmod_check and defmod_calc is to support use outside of PROJ/GDAL, for example in other software. The defmod_calc component may also be useful in unit testing.

@kbevers thanks for the pointers to deformation, pipeline, and gridshift operators. I have read this page, but I certainly wouldn't claim to be familiar. As you note though, I would like to have this in a single file rather than defined by multiple steps in proj. Again the driver is that it has uses outside proj. Another example for NZ is that we use the deformation model in our geodetic survey network adjustment software to adjust together geodetic observations at different epochs. At the moment this is using a binary format not dissimilar to this specification(!)

And @rouault on the allocating 10 bytes for an 11 byte header string, well I just ran out of fingers!

@rouault mentioned that the model needs to specify the coordinate system it relates to. Definitely! I suggest that this is a standard metadata item for the model. On the other hand the only thing that the algorithm using the file needs to know about the coordinate system is whether it is allowed to wrap longitudes. Certainly a checking program could include validating the wkt2 definition against the flag, but for implementation I think it is cleaner if the code doesn't need to interpret a WKT2 string (again particularly for implementations outside proj).

@kbevers very cool presentation in your link, which I have looked at. You said "create some deformation models". Well, as you probably know, we already have one in NZ that we have been maintaining for 20 years now (our first velocity grid was built and used in 1998).

On to the more abstract and lengthy discussion:

@busstoptaktik The NZ thinking is very much along the lines you have suggested, that the most common case involves converting coordinates at the current epoch. E.g. converting a precise GNSS observation made today in terms of ITRF to an NZGD2000 coordinate. How many components and grids are involved in the transformation depends on what has happened tectonically between the current time and the time represented by the reference coordinate system.

In NZ we use "reverse patches", as @kbevers has described which do achieve this. That is we update the reference coordinates to reflect the deformation, and so for converting pre-earthquake coordinates you need to apply the earthquake deformation, whereas for post-earthquake deformation you do not.

In terms of the deformation model the difference is very small - just whether the time model is a step function from 0 to 1, or a step function from -1 to 0. In terms of the user community however it is much greater, as it means that all their coordinate based information needs to be updated.

The driver for using reverse patches in NZ is not for computational efficiency however. It is to manage the level of distortion in the reference coordinate system.

The NZ system is approximately ground fixed. Which is to say that as far as it is reasonable the coordinates of "fixed" features do not change. This is a desirable feature for users of GIS databases, and for doing geospatial analysis combining data at different epochs.

However this comes at a price. As NZ is continuously deforming, the fixed coordinates no longer represent the actual physical relationship of points. If you naively calculate a distance from the NZGD2000 latitudes and longitudes of two points, it will differ from the distance you actually measure, due to the accumulated deformation. In practice the NZGD2000 coordinates define a geospatial reference system rather than a geodetic datum. The correct way to calculate a distance is to convert the coordinates to, say, ITRF, and then calculate using those coordinates.

However most users of NZGD2000 coordinates do not use the deformation, and assume that the coordinates do represent the relationship between points. So in managing the NZ datum we are balancing between two conflicting drivers - stable coordinates (ie not changing for "fixed" features), and relative accuracy of coordinates. And generally by relative accuracy we mean relative accuracy in terms of current location.

The reverse patch is used where the distortion caused by an earthquake means that the relative accuracy of the coordinates is compromised more than can be tolerated by users of the datum.

As @kbevers notes, in effect this is a re-realisation of the datum. On the other hand it is not a "complete re-realization". We call it a patch because it only affects the area where distortion is significant.

@kbevers raises an interesting point on support for dynamic datums as well as semi-dynamic datums.

I have a horrible feeling that the term "semi-dynamic" might have been coined by LINZ. It's one I'm trying to move away from. To me it makes much more sense to define the datum in terms of what it is fixed by. So ITRF and WGS84 are global datums. The soon to be replaced GDA94 is a plate fixed datum.

The NZGD2000 coordinate system is more like a ground fixed datum in terms of the coordinates it defines. This is very much like our previous "static" datum which was defined by fixed coordinates assigned to first order marks. The difference is that NZGD2000 is defined by the deformation model which relates it to ITRF96 (and hence to ITRF2008/2014 etc), rather than by fixed coordinates of survey marks. I guess this does make it a bit ambiguous in terms of what it is fixed to, but then I did say it was more a geospatial reference system than a geodetic datum!

However in terms of the deformation model it makes no practical difference whether the datum is dynamic or semi-dynamic. The deformation model is a simplified model of surface displacement, which is a physical thing. It describes how things which we perceive as "fixed" are actually moving, and allows us to compare observations of those things at different times.

The difference is only in how the deformation model is used. In the ground fixed datum it is used in coordinate conversion, to convert coordinates between the global and ground-fixed datums. In the case of a "dynamic" datum, which is a global datum in which the coordinates of fixed objects are changing all the time, it is not used in coordinate conversions. In this case all coordinates are time tagged, and the deformation model would be used to shift the coordinate through time.


ccrook commented May 19, 2018

Updated the specification to include the extents of the grid in the component definition. This means that for a nested grid model the grid to use can be selected without needing to extract the grid definition from elsewhere in the file.


ccrook commented May 20, 2018

@kbevers your comment above mentioned support in GDAL. It has got me thinking a bit more about the grid format component of the model.

As I have proposed at the moment each grid is a complete grid definition embedded within the overall defmod file. It makes sense for that format to stand alone as a grid format that can be used on its own or within the model definition. And of course if it can use an existing format supported by GDAL even better.

As far as I can see none of the existing grid formats meet the requirements of providing 1, 2, or 3 values at each data point (nor for that matter is it handled in the proj pg_gridinfo structure yet), so I think that this will require implementing a new grid format, even though there seem to already be so many. If that is the outcome, it may be worth ensuring that the grid format is as generic as reasonable, and doesn't just serve the purposes of the deformation model.

With that in mind I've modified the grid format to be more generic by allowing an arbitrary number of data values, and adding metadata to improve its utility as a generic grid format.

Question: Should this be taken a step further defining the nested grid as a separate format that is embedded, rather than having the nesting defined within each deformation component?


rouault commented May 20, 2018

Requiring understanding of WKT2 would indeed be overkill for most usages, so perhaps have this as a metadata item with a reserved key name.

Regarding a grid format that would support a variable number of values per point, there are a number of raw binary formats handled by GDAL, but all those meeting this requirement + int32 values have an ASCII-based header (like ESRI .hdr). Another possibility, but I'm writing this with some hesitation, would be to use an embedded TIFF file highly constrained on which TIFF fields are allowed and their values, to make it easily readable by PROJ. But for example the bounding box xmin, ymin, xmax, ymax is not standardized in TIFF, so this would require GeoTIFF, etc. Similarly offseti, scalei are not standardized, nor is adding a set of key/value pairs, etc. So this gets really complicated (we don't want PROJ to depend on libtiff and libgeotiff, especially as the latter depends on PROJ!), and having a dedicated grid format that meets all our requirements is probably simpler.

I'm reading currently a draft of OGC 18-005r1 / ISO 19111:2018 (see http://www.opengeospatial.org/standards/requests/166 ) where they introduce the concept of dynamic reference frame (dynamic datum), opposed to a static one. It seems to me that NZGD2000 would match this category. Do you confirm ?


ccrook commented May 20, 2018

@rouault Short answer - yes. The statement in B.3 that "A CRS that has an associated deformation model is dynamic, regardless of whether it is plate-fixed or earth-fixed" is pretty unambiguous - NZGD2000 is dynamic.

With a little loose interpretation the specification does encompass NZGD2000. It doesn't quite cover it in that it tends to refer to velocity grids/models rather than more general deformation models, but at least in some instances it does mention deformation models. Also it refers to a "frame reference epoch" at which the coordinates of a dynamic reference frame are defined, which is not exactly correct - the coordinates are defined at any epoch, especially "now". @ndonnelly has been involved in this specification from a New Zealand point of view and for practical purposes it feels close enough!


kbevers commented May 22, 2018

@ccrook a few comments:

> As you note though, I would like to have this in a single file rather than defined by multiple steps in proj

I understand that and it certainly makes life a lot easier from a distribution point of view.

> Well, as you probably know, we already have one in NZ that we have been maintaining for 20 years now (our first velocity grid was built and used in 1998).

I am aware, yes :-) What I meant was translating existing models to the file format, at least where models already exist. Some of us still have a bit of work left on that matter.

> I have a horrible feeling that the term "semi-dynamic" might have been coined by LINZ. It's one I'm trying to move away from. To me it makes much more sense to define the datum in terms of what it is fixed by. So ITRF and WGS84 are global datums. The soon to be replaced GDA94 is a plate fixed datum.

I believe that to be true - unfortunately the cat's out of the bag and that term isn't going away any time soon. Especially since it seems to have been picked up in the coming ISO 19111 standard. I also prefer to use different terminology and will insist on doing so in any official documents I co-author in the future.

> However in terms of the deformation model it makes no practical difference whether the datum is dynamic or semi-dynamic. The deformation model is a simplified model of surface displacement, which is a physical thing. It describes how things which we perceive as "fixed" are actually moving, and allows us to compare observations of those things at different times.

This is of course true. It was just an initial thought I had and wanted to make sure that any possible use cases are not excluded by creating a too narrow format.

> The difference is only in how the deformation model is used. In the ground fixed datum it is used in coordinate conversion, to convert coordinates between the global and ground-fixed datums. In the case of a "dynamic" datum, which is a global datum in which the coordinates of fixed objects are changing all the time, it is not used in coordinate conversions. In this case all coordinates are time tagged, and the deformation model would be used to shift the coordinate through time.

I agree with all of this, apart from the last sentence, although this is probably just semantics: shifting coordinates in time is also a coordinate conversion. Conversions that in practical use of a dynamic frame are important and very frequent (basically we want to transform coordinates to now all the time).


kbevers commented May 22, 2018

@rouault

> Other points:
>
> Does that make sense to consider grids that would have nodata values ? (as other grid formats have that). In which case we could decide as a convention that INT_MIN = -2147483648 is the nodata value. Behaviour to be defined when you encounter such nodata value.

I think having a NODATA option is a good idea. INT_MIN seems like a good choice.

> Regarding GDAL support to read and/or create a defmod file, this could be tricky as GDAL raster model is rather simple compared to the nesting aspect contained in a defmod file. Might be possible by abusing it (GDAL has a record of having drivers that push it to/beyond its limits), but I'd rather see GDAL as used internally by a dedicated defmod.py creation tool, for example to read the input data of a single grid.

I may have been too creative in my suggestion of incorporating this new format in GDAL. My main reason for that suggestion is that it has been very handy for me to be able to open the various deformation grids I work with directly in QGIS and visually evaluate them. This is in no way a hard requirement on my part and the same can likely be achieved in other ways.

Seems like a good option to be able to construct a defmod file from other grid formats using GDAL as a mediator. A similar tool to extract single grids from the defmod file would also be handy.


rouault commented May 22, 2018

> As you note though, I would like to have this in a single file rather than defined by multiple steps in proj

Just thinking that an alternate & more standard way of having multiple files stored in a single one would be to use the TAR format, which is simple enough to implement. Then it is easier to inspect the content of the file or build it from individual subfiles. GDAL has for example a /vsitar/ virtual file system that can be used to open transparently rasters in tar files: gdalinfo /vsitar/{my.defmod}/subgrid


kbevers commented May 22, 2018

> Just thinking that an alternate & more standard way of having multiple files stored in a single one would be to use the TAR format, which is simple enough to implement.

A similar thought has struck me, but I discarded it quickly because I thought that would be complicated to implement. I have no experience with the subject though so if you think it is a reasonable thing to implement I am on board.

It would probably make things easier since we can then mostly rely on formats PROJ and other software already understand. So far I have not found a problem that couldn't be solved with a bit of creative use of either CTable2 or GTX files. That's not saying there aren't smarter ways to do things though.


rouault commented May 22, 2018

> I thought that would be complicated to implement

A tar file is basically a 512 byte header giving the subfile name and size (https://en.wikipedia.org/wiki/Tar_(computing)#Header), followed by the file content, with nul byte padding up to the next 512 byte boundary, and that repeats for the other files.
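A minimal sketch of walking that structure (only the name and size fields of the classic header are read here; a real reader would also validate the checksum and handle ustar/pax extensions, and the "my.defmod" filename is purely illustrative):

```python
def list_tar_members(path):
    """Yield (name, data_offset, size) for each member of a classic tar file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            header = f.read(512)
            if len(header) < 512 or header.strip(b"\0") == b"":
                break                                   # end-of-archive padding
            name = header[0:100].rstrip(b"\0").decode("ascii")
            size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)   # octal size field
            yield name, offset + 512, size
            padded = (size + 511) // 512 * 512          # data is padded to 512 bytes
            f.seek(padded, 1)
            offset += 512 + padded

# e.g. locate the subfiles inside a hypothetical my.defmod tar archive:
# for name, data_offset, size in list_tar_members("my.defmod"):
#     print(name, data_offset, size)
```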


ccrook commented May 22, 2018

I agree that defmod in gdal doesn't make much sense. I was thinking that the embedded grid format would be useful to support in GDAL. (Even though it feels like that we are slightly corrupting the role of a raster format in treating these grids as raster data)

@kbevers @rouault

Interesting discussion on the use of tar file. Certainly it has some advantages in terms of building and taking apart the file (and as you say it can be interpreted as a virtual file system). The main downside I think is that it may require scanning the whole file (or at least all the file headers) to find a specific grid file, but with a bit of caching it would only need to do that once. As a constraint it would make sense to require that the model definition is the first file in the archive. Also using tar would allow adding other related files (eg metadata). So despite my initial cringe response I think it makes sense. I've updated the description with this possibility.

On ctable2/gtx I'm not so convinced (yet!). Can you point me to a definition of the ctable2 format? So far I've not been able to find one. My preference is to use a format that supports 3d deformation. I'm a bit wary of GTX - you just never know when you might want to use a value of -88.88880. Also, being a bit long in the tooth, I don't like wasting space! While this is less of a concern than it used to be, there are still performance benefits in keeping things small, and that is still a concern as the data sets we may be processing get much bigger. OTOH there is always a trade off if using a smaller object requires more work.

@rouault Agree on NODATA==INT_MIN, I've added that to the description. Also your suggestion for YAML as a format for the ASCII model definition file.


ccrook commented May 22, 2018

@kbevers

> > The difference is only in how the deformation model is used. In the ground fixed datum it is used in
> > coordinate conversion, to convert coordinates between the global and ground-fixed datums. In the
> > case of a "dynamic" datum, which is a global datum in which the coordinates of fixed objects are
> > changing all the time, it is not used in coordinate conversions. In this case all coordinates are time
> > tagged, and the deformation model would be used to shift the coordinate through time.
>
> I agree with all of this, apart from the last sentence, although this is probably just semantics: Shifting
> coordinates in time is also a coordinate conversion. Conversions that in practical use of a dynamic
> frame are important and very frequent (basically we want to transform coordinates to now all the time).

Fair comment! I guess I was thinking of a conversion between coordinate reference frames, rather than a coordinate transformation. You are absolutely right - in the end we are always just converting coordinates, and regardless of the type of datum we need the same coordinate transformation capability.


rouault commented May 22, 2018

@ccrook There's no formal definition of ctable2 as far as I know. The authoritative documentation is the code implementation in proj https://github.com/OSGeo/proj.4/blob/39581f346d5c4d86afb675d0f7faca6677ba586f/src/nad_init.c#L185 . GDAL has also its read/write driver for it : https://github.com/OSGeo/gdal/blob/master/gdal/frmts/raw/ctable2dataset.cpp

But CTable2 is constrained to a 2-band (i.e. 2 samples per node) float32 grid. And, as I said in a previous comment, my review of GDAL grid formats doesn't show anything that has both a binary header and a variable number of samples per node. Hmm, with the .tar approach, we could potentially have a file with an ASCII header and the grid data in another file.


ccrook commented Jun 11, 2018

Another aspect I'd like to consider in terms of implementation is efficiency.

Some of the largest data sets we might want to transform, e.g. LIDAR point clouds, have the smallest extents. If we transform point by point we might need, for each point, to select which grid to use, which components of the deformation model to use, etc.

This would be much more efficient if the proj API allowed for declaring a range of interest (coordinate extents, time extents). The deformation model code could use this to extract a subset of the model to use for the transformation, which might be a few grid cells in one or two components rather than the whole model. This would be much cheaper to calculate.
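For illustration, selecting the relevant pieces of a model for a declared area and time range of interest might look something like this (a sketch only; the attribute names loosely follow the component definition in the proposal above, and the time_factor helper is assumed):

```python
def subset_model(components, bbox, trange, time_factor):
    """Pick the components and grids relevant to an area/time range of interest.

    bbox: (xmin, ymin, xmax, ymax) of the data set to be transformed
    trange: (tmin, tmax) covering the epochs involved in the transformation
    time_factor(component, t): evaluates the component's time function at epoch t
    """
    def overlaps(amin, amax, bmin, bmax):
        return amin <= bmax and bmin <= amax

    subset = []
    for c in components:
        # Skip components that do not contribute over the time range, e.g. a
        # reverse patch when only present-day epochs are being transformed.
        # (Testing only the two endpoints is a simplification.)
        if all(time_factor(c, t) == 0.0 for t in trange):
            continue
        if not overlaps(c.xmin, c.xmax, bbox[0], bbox[2]):
            continue
        if not overlaps(c.ymin, c.ymax, bbox[1], bbox[3]):
            continue
        grids = [g for g in c.grids
                 if overlaps(g.xmin, g.xmax, bbox[0], bbox[2])
                 and overlaps(g.ymin, g.ymax, bbox[1], bbox[3])]
        subset.append((c, grids))
    return subset
```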

More generally the API could allow predeclaring requirements on accuracy and repeatability which could provide optimisation opportunities for algorithms such as projections

The only thing I've seen in the API (old API) that could relate to this is the area parameter in https://proj4.org/development/reference/functions.html#c.proj_create_crs_to_crs. This is described as for future implementations to select the appropriate transformations to apply.

Perhaps this could be extended to include at least a time range as well as a space range, and be available to transformation functions when they are initializing?


kbevers commented Jun 12, 2018

@ccrook Those are all good points that we should consider eventually. I think the first order of business is to have an implementation that works. Numerical efficiency can come later. I am not saying it is not worth considering how things are done smartest, but I reckon it is going to be hard enough to get this working without trying to optimize things. Keep in mind that this will be just another operation in PROJ - it needs to conform to the rest to be usable.

Much of what you desire can be controlled by +parameters just as you would in any other operation in PROJ. Specifying a time range or which deformation model components you want should be possible to configure in the setup string. There may be smarter ways to go about this but those will likely require larger structural changes to the internals of PROJ. I don't want to introduce special cases unless it is absolutely necessary.

> More generally the API could allow predeclaring requirements on accuracy and repeatability which could provide optimisation opportunities for algorithms such as projections
>
> The only thing I've seen in the API (old API) that could relate to this is the area parameter in proj4.org/development/reference/functions.html#c.proj_create_crs_to_crs. This is described as for future implementations to select the appropriate transformations to apply.
>
> Perhaps this could be extended to include at least a time range as well as a space range, and be available to transformation functions when they are initializing?

It is possible to reconfigure the PJ_AREA struct to include more than just the area without breaking the existing API. It was specifically introduced to be able to make transformation setup smarter when using the EPSG database. It may be wise to wait until that is done before we start working on that in relation to the deformation models. At least there's more than just one application to consider - this needs to be as generic as possible.

To sum up, my take home message is: keep it simple. There's nothing in the way of keeping the deformation model code contained within a single PROJ operation. Getting the file format right is a big task in itself, so keeping the application code simple will make life a lot easier. When the format is stable and as the PROJ internal infrastructure evolves we can do things smarter.


ccrook commented Jun 12, 2018

@kbevers True, for the most part I like to keep things simple also. I may be wrong but in this case I don't think that the implementation is difficult, even giving some consideration to optimisation. At the very least it is worth keeping it in mind. PROJ already has the framework in place so as far as it is concerned I think we are just adding another time dependent transformation function.

I do agree that getting the format right is the biggest and most important task. Certainly that is what is occupying most of my thoughts on this. And I think getting it right does involve some thinking about how it will be used as that provides some guidance on how it could be structured. While a lot of optimisation comes from implementation, it is nonetheless constrained to some extent by the format.

The most critical component of the format, I believe, is the grid format. For example, if the format is structured to more efficiently support extracting a subset for a limited extent transformation, then that provides good opportunities for optimisation.

Likewise, if we consider that the format could also be downloaded on the fly to support server-side transformations in a web application, then it would be good if it can be relatively compact. As an example, if the grid was defined in blocks, each of which could be expressed as 1, 2, or 4 byte integers, then that would provide opportunities.

Basically what I'm thinking is that if we do end up defining a new grid format or nested grid format, then it might as well be as useful and generic as it can (but without being unnecessarily complicated of course!). For example it could be useful for very large global geoid grids, as well as deformation model grids.

On PJ_AREA, great that it can be reconfigured - feels like it could be generalised to something more like PJ_TFM_CONTEXT. But as you say that is for the future.


kbevers commented Jun 12, 2018

> @kbevers True, for the most part I like to keep things simple also. I may be wrong but in this case I don't think that the implementation is difficult, even giving some consideration to optimisation. At the very least it is worth keeping it in mind. PROJ already has the framework in place so as far as it is concerned I think we are just adding another time dependent transformation function.

Since the proposed format can contain any number of grids with various forms of applications I am not sure you can get by using just what's in PROJ already. You will need some sort of mechanism to keep track of the grids (what sort of grid and when does it apply?). That could be made simple by managing the grids on the PJ object level but that could quickly prove problematic in multithreaded use (hogging memory). Today the grids are loaded into PROJs global memory allowing all PJ objects to access them at all times (unless there's a lock preventing it temporarily).

In a first implementation I wouldn't mind keeping all the grid memory local to the PJ object to simplify things. We are likely to learn some new things in the process that can be used to refactor the current handling of grids.

> Basically what I'm thinking is that if we do end up defining a new grid format or nested grid format, then it might as well be as useful and generic as it can (but without being unnecessarily complicated of course!). For example it could be useful for very large global geoid grids, as well as deformation model grids.

I agree with this fully. And I think that there's only smaller details left to discuss so perhaps it is time to get started properly. Agree?


rouault commented Jun 12, 2018

Regarding efficiency, I second @kbevers . This can be improved if experience shows this is needed. We could have some sort of mechanism similar to the GDAL block cache where regions of a grid frequently accessed are kept in memory. This wouldn't require a particular user hint.

@busstoptaktik

FWIW, in the trlib transformation system (https://bitbucket.org/KMS/trlib), we simplified some old code by essentially doing the opposite of what @ccrook suggests.

The old code was subsetting the transformation grids on the fly, based on the coordinate of the first point transformed, then re-subsetting once the data set being transformed drifted outside the initial part.

One version of the new code did direct disk reads for every new transformation point (i.e. 4 individual reads), another kept the entire grid in memory, then accessed stuff from memory as needed.

Neither of these solutions was prohibitively slow compared to the original auto-subsetting - basically I believe the auto-subsetting was a (historically well advised) attempt to second-guess the OS disk caching and/or memcaching mechanics, i.e. trying to do something the OS was able to do in a more efficient way, requiring less user space code.

So yes: premature optimization is not just the root of all evil - it may even turn out to be no optimization at all ("pessimization").

@ccrook (Contributor, Author) commented Jun 14, 2018

@busstoptaktik Agree that premature and/or over complex optimisation is not always a good idea.

The thought here, though, is that a common transformation scenario is transforming a data set (e.g. a GIS layer or a LIDAR point cloud) whose extents are already known, and in that case doing some preparation makes a lot of sense.

Even if we don't know the extents of the dataset we can achieve some benefit by not loading grids until they are required - that probably will be worth doing.

Also, in the case of the deformation model we can save some work if we know when the deformation is to be applied. For the NZ example the most common epoch of transformation is the current time, for which many grids do not need to be considered at all (reverse patches only affect coordinates before the epoch). Most data sets being transformed will have a single epoch, so again it makes sense to do some preparation (though in this case caching the decisions for an epoch would probably be a good approach).
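A sketch of that idea, assuming each component carries a piecewise linear (epoch, factor) time function as in the functional model described earlier (the helper names are hypothetical):

```python
def evaluate_time_function(nodes, epoch):
    """Piecewise linear interpolation of sorted (epoch, factor) nodes,
    held constant before the first node and after the last one."""
    if epoch <= nodes[0][0]:
        return nodes[0][1]
    if epoch >= nodes[-1][0]:
        return nodes[-1][1]
    for (t0, f0), (t1, f1) in zip(nodes, nodes[1:]):
        if t0 <= epoch <= t1:
            return f0 + (f1 - f0) * (epoch - t0) / (t1 - t0)

def active_components(components, epoch):
    """Keep only components whose time function is non-zero at the epoch,
    so the grids of the remaining components never need to be loaded."""
    active = []
    for component in components:
        factor = evaluate_time_function(component.time_function, epoch)
        if factor != 0.0:
            active.append((component, factor))
    return active
```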

The thought is not that these necessarily need to be implemented from day one, more that the implementation is done with this in mind as a likely future enhancement. This is more about setting the framework for the future than about getting a benefit immediately, as it depends on software using PROJ specifying the extents to get the benefit.

@kbevers (Member) commented Jun 14, 2018

> Even if we don't know the extents of the dataset we can achieve some benefit by not loading grids until they are required - that probably will be worth doing.

We already do this for other grids in PROJ today.

> Also, in the case of the deformation model we can save some work if we know when the deformation is to be applied. For the NZ example the most common epoch of transformation is the current time, for which many grids do not need to be considered at all (reverse patches only affect coordinates before the epoch). Most data sets being transformed will have a single epoch, so again it makes sense to do some preparation (though in this case caching the decisions for an epoch would probably be a good approach).

This is actually very close to what is done in the current gridshift operations. Similar mechanisms can be applied in this context.

> The thought is not that these necessarily need to be implemented from day one, more that the implementation is done with this in mind as a likely future enhancement. This is more about setting the framework for the future than about getting a benefit immediately, as it depends on software using PROJ specifying the extents to get the benefit.

It's always a good idea to think ahead.

How do we move on from here? Set up a repository where we can work on the file format specification and the supporting tools? I think we have covered enough of the details here to give it a first go now.

@ccrook (Contributor, Author) commented Jun 14, 2018

@kbevers We could certainly move to a separate repository if that makes sense; I guess for the tools etc. it makes sense to do that. Alternatively it could be in a fork, with the deformation tools in a directory in a similar way to NAD. We could also move the format documentation into the PROJ GitHub wiki for development as well as long-term reference. However I'm new to this project - I'll take guidance.

In terms of discussion, though, I'd just like to throw in one more thought from Kevin Kelly (ESRI geodesist). @kevinmkelly has been thinking about deformation model implementation for some time and has struck a similar issue in terms of the lack of satisfactory formats for grids and deformation models. He has just pointed me to the under-development ASDF format https://github.com/spacetelescope/asdf-standard.

This has come from the astronomy community. As with HDF5, a complete implementation would be heavy lifting for PROJ. On the other hand a limited implementation could serve the purpose for the deformation model, and would allow us to take advantage of any tools developed for it (which currently appear to be Python applications).

While the format is described as being formulated more for data exchange than for high-performance computing, it has a lot in common with this proposal - namely YAML metadata, binary array structures, and binary record formats. This subset of features would cover our requirements well.
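For illustration, assuming the Python asdf package, a deformation grid plus its metadata could be written and read back along these lines (the tree layout here is invented for the example, not a proposed schema):

```python
import asdf
import numpy as np

# Invented tree layout: east/north/up displacements on a coarse 1-degree grid.
displacements = np.zeros((181, 361, 3))
tree = {
    "model": {
        "name": "example deformation component",
        "crs": "EPSG:4326",
        "lon_min": -180.0, "lon_max": 180.0,
        "lat_min": -90.0, "lat_max": 90.0,
        "displacements": displacements,   # stored as a binary block, not YAML text
    }
}
asdf.AsdfFile(tree).write_to("deformation_example.asdf")

# Reading back: the YAML metadata is available as a plain dict-like tree.
with asdf.open("deformation_example.asdf") as af:
    model = af.tree["model"]
    print(model["name"], model["displacements"].shape)
```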

It also has a mechanism for describing sequences of transformations that at first sight offers some potential for deformation models. However I suspect that this would add to the weight of the code without much advantage for this specific application.

Thoughts?

@kbevers (Member) commented Jun 14, 2018

If we want this format to be the gold standard of deformation model file formats it should live its life outside of the PROJ ecosphere. The ASDF format you link to is a nice example of how that sort of thing should be done. The ASPRS LAS format specification is another good example. The tools for the format should be organized either in the same repository or in their own repository within the same organization. This will provide a sort of reference implementation of the format.

I am not particularly happy about including the file format definition and related tools in the PROJ repository itself. Or the wiki (which we don't use), for that matter. PROJ will only be a consumer of the files. The NAD stuff mostly exists for historical reasons and shouldn't serve as the example for new functionality in PROJ.

ASDF is an interesting alternative. It certainly looks flexible enough for the job and the files seem to be fairly easy to create with a bit of Python. That's good. I wonder how easy it is to parse the YAML formatting in the header? I haven't read the docs in detail, but if we decide to go that way presumably we would then need to define a custom schema for our purpose, yes?
It's cool that you can define transformations related to the data within the format. We do have the tools to use that within PROJ, but how exactly that would work, and whether it is even necessary, I am not sure about. On a related subject, ASDF also offers a nice way to distribute polynomial coefficients for Horner evaluation.

@rouault @busstoptaktik What do you think about the ASDF format?

@rouault (Member) commented Jun 14, 2018

I concur with @kbevers. I've looked quickly at the ASDF documentation and it seems it could be a plausible vector to store deformation models. The deformation model would be a custom schema for our purposes. There is a C implementation of YAML reading/writing - https://github.com/yaml/libyaml, used by the Python yaml module - that could be used. That said, we could probably skip that dependency if we restrict ourselves to just reading the schema of the deformation model with some hand-rolled parsing.

@kbevers (Member) commented Jul 9, 2018

Having investigated #158 a bit, I think we should make a decision on a normalized longitude range for the new grid format. 0..360 would probably be a good choice.

With the current implementation we don't have any bounds on the grids and they can easily extend further than the normal -180..180 degree longitude range. As described in the issue, that is problematic when using longitudes that wrap around at the -180 deg mark. It can be dealt with by adjusting the longitude if it is outside the grid boundaries, but that can be a bit risky in some situations. By always keeping grid longitudes in the 0..360 range we can better adjust input coordinates.

@rouault (Member) commented Jul 9, 2018

> I think we should make a decision on a normalized longitude range for the new grid format. 0..360 would probably be a good choice.

That will be convenient for grids around the antimeridian, but what about grids overlapping Greenwich?

@kbevers (Member) commented Jul 9, 2018

> That will be convenient for grids around the antimeridian, but what about grids overlapping Greenwich?

It's just an offset, no? The Greenwich meridian is 0; in "grid longitudes" it'll be 180. Or am I missing something obvious here?

@rouault (Member) commented Jul 9, 2018

> It's just an offset, no?

Hum, I'm not sure I understand your idea. I assumed that when you said [0,360], that meant 0 = Greenwich, 180 = antimeridian, 359 = one degree west of Greenwich, and not adding an offset of 180 to longitudes, in which case that would not fix the antimeridian case.
If we keep just a range of 360 degrees for longitude, any trick that fixes the antimeridian will necessarily create issues at Greenwich, and vice versa.

@kbevers (Member) commented Jul 9, 2018

No, I meant that 180 west would become 0, Greenwich becomes 180 and 180 east becomes 360 (or 359?). But yeah, I see your point - it's just moving the problem elsewhere. I think the core of my idea was to fix longitudes to a specific range spanning a maximum of 360 degrees. Today we allow longitudes in any range, really. We also ignore the problem of longitude wrapping, partially because of the way the grids are laid out. The Alaska grid spans longitudes from -194 to -128. Input longitude 173E is not translated to -187 in the existing code. We could of course just do that, although a bit of guesswork is required. I fear that those guesses will be wrong from time to time.

Perhaps we could add a grid metadata parameter specifying in which 360-degree range the grid coordinates exist. For example, it could be either 0..360, -180..180 or -360..0. That way it would be easier to map input coordinates to grid coordinates since we would have a better understanding of what they mean and where to look for potential problems. Does that make sense?

@rouault (Member) commented Jul 9, 2018

> Does that make sense?

Hum, I don't think we should impose too many constraints, some of them potentially unnatural (I find the suggestion of adding a 180 offset confusing), on the way the grids are registered. I'd say that we could allow a value < -180 or > 180 for one corner, provided that the other corner falls in [-180,180]. With that criterion, the Alaska grid registration is OK since its east longitude is -128. A New Zealand grid that starts at 175 and ends at 185 would be OK too. This is "just" an implementation issue to properly deal with the antimeridian. Constraints on the registration of the grid will not really help us.

@kbevers (Member) commented Jul 9, 2018

> This is "just" an implementation issue to properly deal with the antimeridian. Constraints on the registration of the grid will not really help us.

You definitely have more experience in this area than me so I'll shut up and listen to the expert :-) With this discussion in mind, I'll try to come up with a fix to #158. We should at least be able to handle grid shifts when the grid exceeds the normal longitude range of -180..180.

@ccrook (Contributor, Author) commented Jul 9, 2018

I second Even's approach. For locating on the grid it is easy to correct the longitude to the range of the grid, and the offset, once determined, is applicable to the longitude in either system. I don't think this is any different from handling the central meridian in a TM projection, for example.
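A minimal sketch of that correction, assuming a grid whose registered longitude range may extend beyond +/-180 as in the Alaska example:

```python
def to_grid_longitude(lon, grid_west, grid_east):
    """Shift a longitude by a multiple of 360 degrees so that, if possible,
    it falls inside the grid's registered range (which may pass +/-180)."""
    for candidate in (lon, lon - 360.0, lon + 360.0):
        if grid_west <= candidate <= grid_east:
            return candidate
    return lon  # point not covered by this grid; leave it to the caller

# Alaska-style grid registered from -194 to -128:
assert to_grid_longitude(173.0, -194.0, -128.0) == -187.0   # east of the antimeridian
assert to_grid_longitude(-150.0, -194.0, -128.0) == -150.0  # already in range
```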

The issues for a global grid that I think need more careful attention are at the poles (where a displacement expressed in terms of east/north becomes very hard to interpolate) and, to a lesser extent, at the longitude boundary of the grid. The latter is not too hard to handle if we are using bilinear interpolation, as it is enough to duplicate the edge set of grid points (i.e. include values at both -180 and +180, or whatever the longitude range of the grid is).
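And a sketch of duplicating the edge column of a global grid so that bilinear interpolation has a cell spanning the longitude boundary (assuming a regular grid held as a NumPy array):

```python
import numpy as np

def add_wrap_column(values, lon0, dlon):
    """For a global grid covering [lon0, lon0 + 360), append a copy of the first
    longitude column so that bilinear interpolation has a cell spanning the
    longitude boundary of the grid.

    values: array of shape (nlat, nlon). Returns (wrapped_values, lons), where
    the last longitude is exactly 360 degrees east of lon0."""
    nlat, nlon = values.shape
    wrapped = np.concatenate([values, values[:, :1]], axis=1)
    lons = lon0 + dlon * np.arange(nlon + 1)
    return wrapped, lons

# A 1-degree global grid from -180 to 179 gains a duplicate column at +180.
values = np.random.default_rng(1).random((181, 360))
wrapped, lons = add_wrap_column(values, -180.0, 1.0)
assert lons[-1] == 180.0
assert np.array_equal(wrapped[:, -1], wrapped[:, 0])
```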

@ccrook (Contributor, Author) commented Dec 5, 2019

The proposed format for a full deformation model described here has now been superseded by a much simpler proposal, developed around a multipurpose multi-grid format based on GeoTIFF. See https://lists.osgeo.org/pipermail/proj/2019-December/009097.html

stale bot commented Feb 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Feb 3, 2020
@kbevers kbevers added the pinned Prevent stale-bot from closing an issue label Feb 3, 2020
@stale stale bot removed the wontfix label Feb 3, 2020
rouault added a commit to rouault/PROJ that referenced this issue May 4, 2020
rouault added a commit to rouault/PROJ that referenced this issue May 4, 2020
rouault added a commit to rouault/PROJ that referenced this issue May 4, 2020
…formation models

Fixes OSGeo#1001

Co-authored-by: Chris Crook <ccrook@linz.govt.nz>
rouault added a commit to rouault/PROJ that referenced this issue May 16, 2020
…formation models

Fixes OSGeo#1001

Co-authored-by: Chris Crook <ccrook@linz.govt.nz>