Gplates paleocoordinates are sometimes blank #11

Closed
markuhen opened this issue Mar 8, 2017 · 16 comments

Comments

@markuhen
Collaborator

markuhen commented Mar 8, 2017

The Gplates data service sometimes fails to return paleocoordinates. This is likely because the service doesn't have paleocoordinates for those modern coordinates.

I think our data service should do the following instead of returning nothing:

  1. return the Scotese coordinates (appropriately labeled) instead of the Gplates coordinates
  2. if neither Gplates nor Scotese have any paleocoordinates, we should return something like NaN
@aazaff
Member

aazaff commented Mar 8, 2017

Hi @markuhen

@jczaplew and I did some test cases, and we confirmed that the data service is working correctly. The problem is that these points are not rotatable using GPlates.

As per your suggested fixes:

  1. I strongly advise against returning two different rotation models within the same field.

  2. The data service already returns an explicit NULL when a rotation fails. So if we want to flag those as NaN, NULL, or NA within the PBDB, then @mmcclenn should do that as part of his download script rather than us changing the data service.
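A minimal sketch of how a client or download script might do that flagging on its own side, assuming the null arrives as `None` (or an empty string) after parsing. The record layout and the helper name `to_float_or_nan` are illustrative assumptions, not part of the data service:

```python
import math

def to_float_or_nan(value):
    """Map a missing coordinate (None or empty string) to NaN; parse anything else as a float."""
    if value is None or value == "":
        return math.nan
    return float(value)

# Hypothetical records: one point that rotated, one that GPlates could not rotate.
records = [
    {"paleolat": "-12.5", "paleolng": "33.1"},
    {"paleolat": None, "paleolng": None},
]

# Convert every coordinate field, leaving the data service response itself untouched.
cleaned = [
    {field: to_float_or_nan(value) for field, value in record.items()}
    for record in records
]
```

This keeps the choice of NaN, NULL, or NA on the consumer side, so the service response does not have to change.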

@jpjenk

jpjenk commented Mar 9, 2017 via email

@aazaff
Member

aazaff commented Mar 9, 2017

@jpjenk

  1. There is no reason for @markuhen to have to laboriously convert to Scotese. We have always offered the option to download Scotese rotations instead of gplates coordinates through the API, and I believe also through the download form. If that option is not working, then that is a separate bug.

  2. Mixing models within a single field is bad scientific practice, regardless of how similar the model outputs might be, especially when that fact is going to be buried in obscure documentation that nobody ever reads. Most people still think we return Scotese by default, for example. Honestly, it would be better for us to roll back to Scotese entirely than mix models within the same field. I know of no field of science where mixing like that is permissible. But if users want to use both, they can always download the Scotese coordinates from PBDB and do a simple join.

  3. The issue between Scotese and GPlates is not so much accuracy/precision, though there are differences. The difference is that GPlates is a true model in the sense that it is algorithmically defined based on hypotheses about the fundamentals of plate movement. Scotese maps are manually curated products that are not attached to a reproducible model. This is why Shanan insisted on the initial shift over.
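For illustration, the "simple join" from point 2 might look roughly like this on the user's side. It is a sketch under assumptions: the key `collection_no` and the output field names such as `gplates_paleolat` are hypothetical, chosen only so that each model stays in its own field:

```python
def join_models(gplates_rows, scotese_rows, key="collection_no"):
    """Left-join Scotese rows onto GPlates rows, one set of fields per model."""
    scotese_by_id = {row[key]: row for row in scotese_rows}
    joined = []
    for row in gplates_rows:
        other = scotese_by_id.get(row[key], {})
        joined.append({
            key: row[key],
            "gplates_paleolat": row.get("paleolat"),
            "gplates_paleolng": row.get("paleolng"),
            "scotese_paleolat": other.get("paleolat"),
            "scotese_paleolng": other.get("paleolng"),
        })
    return joined

# Hypothetical downloads: collection 2 has no GPlates rotation.
gplates = [
    {"collection_no": 1, "paleolat": 10.0, "paleolng": 20.0},
    {"collection_no": 2, "paleolat": None, "paleolng": None},
]
scotese = [
    {"collection_no": 1, "paleolat": 11.0, "paleolng": 21.0},
    {"collection_no": 2, "paleolat": -5.0, "paleolng": 40.0},
]

merged = join_models(gplates, scotese)
```

Because the two models never share a column, a reader of the merged table can always tell which model produced which value.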

@markuhen
Collaborator Author

markuhen commented Mar 9, 2017

Yes, you can download Scotese instead of Gplates. But to have one data set, you have to mash them together to get the best of both worlds. Also, in a given download, you have to pick one or the other (using the downloader), so maybe we should just allow you to pick both, which would make mashing them together easier.

Let's plan to chat about this next week when I am in town, and figure out the best solution to this issue.

@dwbapst

dwbapst commented Mar 9, 2017 via email

@jczaplew
Contributor

jczaplew commented Mar 9, 2017

In my mind this is a bit of a philosophical argument over whether APIs are for data delivery or data delivery and analysis.

I know the PBDB API edges into the realm of analysis with routes that allow users to do things like generate data for diversity curves, but I believe APIs are primarily for fetching data. Analysis and processing are typically left to the user, or to an intermediate package that handles common functions (like velociraptr or the PBDB R package).

@aazaff
Member

aazaff commented Mar 9, 2017

@markuhen's idea of having a parameter option for returning multiple paleocoordinate models is a great one. We could actually split GPlates into the Seton and Wright models, and add more models as they come out. So there would be something like a wright_paleolat, seton_paleolat, and scotese_paleolat field for people interested in getting them all.

?show=paleoloc&allrotations=TRUE (or whatever)

@mmcclenn
Contributor

mmcclenn commented Mar 9, 2017 via email

@markuhen
Collaborator Author

markuhen commented Mar 9, 2017

I just noticed that the GPlates returns for Holocene collections are blank. We should decide what to do with this. Should we fill in with the modern coordinates, even if they don't come from GPlates?

@aazaff
Member

aazaff commented Mar 9, 2017

That actually sounds like a bug with our data service. The Holocene should return modern coordinates. @jczaplew and I will look into it.

@aazaff
Member

aazaff commented Mar 9, 2017

The Holocene rotation issue has been filed. UW-Macrostrat/gplates-reconstruct#5.

@aazaff
Member

aazaff commented Mar 27, 2017

Hey @mmcclenn, as we discussed during @markuhen's visit, can you please close this issue once you've implemented the new API parameter for returning multiple paleocoordinate systems? I want to update velociraptr to use the new path once it's up and running.

@mmcclenn
Contributor

@aazaff that parameter is already part of the API. You can specify "pgm=model1,model2,..." where the available models are "gp_early", "gp_mid", "gp_late", "scotese". Also, "gplates" is a synonym for "gp_mid". This is already available on paleobiodb.org.
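As a rough sketch of how a client could assemble such a request: the model names and the `pgm` and `show=paleoloc` parameters come from this thread, while the endpoint path in `BASE_URL`, the helper name `build_paleocoord_url`, and the extra filter in the usage line are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Model names as listed above; "gplates" is a synonym for "gp_mid".
KNOWN_MODELS = {"gp_early", "gp_mid", "gp_late", "scotese", "gplates"}

# Assumed endpoint, used here only to make the example concrete.
BASE_URL = "https://paleobiodb.org/data1.2/colls/list.json"

def build_paleocoord_url(models, base_url=BASE_URL, **extra_params):
    """Build a query URL that requests paleocoordinates from one or more models."""
    unknown = [model for model in models if model not in KNOWN_MODELS]
    if unknown:
        raise ValueError("unknown paleocoordinate model(s): " + ", ".join(unknown))
    query = {"show": "paleoloc", "pgm": ",".join(models)}
    query.update(extra_params)
    # safe="," keeps the comma-separated model list readable in the URL.
    return base_url + "?" + urlencode(query, safe=",")

# The extra filter parameter here is a hypothetical placeholder.
url = build_paleocoord_url(["gp_mid", "scotese"], interval="Eocene")
```

Validating the model names before sending the request gives a clear local error instead of an empty or rejected response.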

@mmcclenn
Contributor

In the new release due out next week, collections whose paleocoords are blank will have the following label in the "geoplate" field: "coordinates not computable using this model". The coordinate fields will still be blank because, as @aazaff noted, putting text into a field where numbers are expected is a bad idea. The "geoplate" field should be interpreted as unstructured text.

mmcclenn added a commit that referenced this issue Apr 21, 2017
 - Issue #11: when the 'paleoloc' output block is selected, records whose paleocoordinates
   are blank will have a message in the 'geoplate' field: "coordinates not computable using this model".
   The fields 'paleolat' and 'paleolng' will still be blank, because it is not a good idea
   to put text into a field that is supposed to contain numbers.
@aazaff
Member

aazaff commented Apr 22, 2017

I am no longer sure that changing the geoplate field is the optimal solution. It is technically a breaking change. I was also under the impression that we had agreed on an alternative solution when Mark last visited. I would like us to discuss this again at our next meeting.

@mmcclenn
Contributor

mmcclenn commented May 4, 2017

The behavior of the data service is now to report "cannot be computed under this model" in the "geoplate" field when the coordinates are blank. It is reported there instead of in the coordinate fields because those ought to be either empty or a valid coordinate, whereas "geoplate" is an unstructured text string.
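A consumer could tell the cases apart along these lines. This is only a sketch: the record layout and the function name `paleocoord_status` are assumptions, and because "geoplate" is unstructured text, the substring check is a loose heuristic rather than a contract.

```python
def paleocoord_status(record):
    """Classify a record as 'ok', 'not computable', or 'missing'."""
    lat = record.get("paleolat")
    lng = record.get("paleolng")
    if lat not in (None, "") and lng not in (None, ""):
        return "ok"
    # Coordinates are blank: look for the explanatory note in "geoplate".
    note = (record.get("geoplate") or "").lower()
    if "comput" in note:  # loose match; the exact wording may vary
        return "not computable"
    return "missing"

# Hypothetical records illustrating the three outcomes.
rotated = {"paleolat": "12.3", "paleolng": "-45.6", "geoplate": "101"}
blank = {"paleolat": "", "paleolng": "", "geoplate": "cannot be computed under this model"}
unexplained = {"paleolat": None, "paleolng": None, "geoplate": None}

statuses = [paleocoord_status(r) for r in (rotated, blank, unexplained)]
```

The coordinate fields are checked first, so a record with valid numbers is never reclassified based on whatever text happens to be in "geoplate".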

@mmcclenn mmcclenn closed this as completed May 4, 2017