Gplates paleocoordinates are sometimes blank #11
Hi @markuhen,
@jczaplew and I did some test cases, and we confirmed that the data service is working correctly. The problem is that these points are not rotatable using GPlates. As per your suggested fixes:
|
@aazaff, the problem here is that for large spatial analyses, the GPlates gaps will render otherwise valid data unusable. @markuhen showed me a large data table that he was assembling yesterday and, by his rough approximation, 5% of the paleo-coordinates could not be calculated with GPlates. He was in the process of laboriously filling these gaps with values approximated by Scotese. The only other alternative would be to eliminate these points, which is unacceptable because the actual data are good.
The question then becomes: is Scotese so wildly inaccurate that this would propagate errors, or is it good enough? To answer this properly, a comparison of points rotatable by both models should be made (I can do it at some point if it has not already been done). However, the sense is that Scotese is not wildly inaccurate. GPlates is a higher-resolution model, and if a particular plate fragment in paleo-time can't be identified to within a comfortable degree of accuracy for its authors, the model will return NULL. We believe, however, that in these cases Scotese will return an approximate location, if not the specific plate sliver, which is good enough. Aspects of datasets are filled and/or interpolated all the time for consistency. In this case the plate model used would be identified in the field accompanying the estimated paleo-coordinates, so if these points happen to stand out suspiciously in whatever general analysis one is doing, geolocation error is easy to explore as a possible culprit.
Since the methodology is documented and the source identified within returned datasets, I see no problem with a tiered, two-plate-model approach to make sure there is always a paleo-coordinate associated with each occurrence.
@jpjenk
… On Mar 8, 2017, at 4:19 PM, Andrew Zaffos ***@***.***> wrote:
Hi @markuhen
@jczaplew and I did some test cases, and we confirmed that the data service is working correctly. The problem is that these points are not rotatable using GPlates.
As per your suggested fixes:
• I strongly advise against returning two different rotation models within the same field.
• The data service already returns an explicit NULL when a rotation fails. So if we want to flag those as NaN, NULL, or NA within the PBDB, then @mmcclenn should do that as part of his download script rather than us changing the data service.
|
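For anyone wanting to try the comparison proposed above, a minimal sketch of measuring the offset between the two models, for points that both can rotate, might look like the following. The record keys (`gplates_lat`, `gplates_lng`, `scotese_lat`, `scotese_lng`) are hypothetical placeholders rather than field names from the data service, and the haversine formula on a spherical Earth of radius 6371 km is a simplifying assumption.

```python
import math


def great_circle_km(lat1, lng1, lat2, lng2, radius_km=6371.0):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


def model_offsets(records):
    """Return offsets in km for records that both models could rotate.

    Each record is a dict using hypothetical keys; None marks a failed rotation.
    """
    keys = ("gplates_lat", "gplates_lng", "scotese_lat", "scotese_lng")
    offsets = []
    for rec in records:
        values = [rec.get(k) for k in keys]
        if any(v is None for v in values):
            continue  # skip points that either model could not rotate
        offsets.append(great_circle_km(*values))
    return offsets
```

Summarising the returned offsets (median, maximum, or a breakdown by age) would give a rough sense of whether the Scotese values are close enough to stand in for missing GPlates values.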
|
Yes, you can download Scotese instead of GPlates. But to have one data set, you have to mash them together to get the best of both worlds. Also, in a given download, you have to pick one or the other (using the downloader), so maybe we should just allow you to pick both to make the mashing together easier. Let's plan to chat about this next week when I am in town, and figure out the best solution to this issue. |
Just a thought from the random peanut gallery following this repo, but I
concur with Andrew: it's bad database policy to mix data sources in output,
especially when filling in the gaps in one with the other could be done
with about 2 lines of R code for the 'mashing together'. When in doubt with
scientific software and databases, it is much better to force users to do
trivial tasks themselves than not, or else you run a serious risk of people
not understanding what they are doing. Allowing both to be obtained via the
download form is a good middle ground solution.
Cheers,
Dave B
--
David W. Bapst, PhD
Adjunct Asst. Professor, Geology and Geol. Eng.
South Dakota School of Mines and Technology
501 E. St. Joseph
Rapid City, SD 57701
http://webpages.sdsmt.edu/~dbapst/
http://cran.r-project.org/web/packages/paleotree/index.html
|
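The client-side "mashing together" mentioned above could be sketched roughly as follows, in Python rather than R. This assumes each record has already been downloaded with both models' coordinates and is held as a dict; the key pattern `<model>_paleolat` / `<model>_paleolng` is hypothetical and would need to be mapped to whatever the download actually returns.

```python
def fill_paleocoords(record, models=("gplates", "scotese")):
    """Return (lat, lng, model) from the first model with both coordinates.

    `models` is the preference order. Keys such as "gplates_paleolat" are
    hypothetical names used only for this illustration.
    """
    for model in models:
        lat = record.get(f"{model}_paleolat")
        lng = record.get(f"{model}_paleolng")
        # Treat None and empty strings as "not rotatable under this model".
        if lat not in (None, "") and lng not in (None, ""):
            return float(lat), float(lng), model
    return None, None, None


# Usage: GPlates is blank for this record, so the Scotese values are used
# and the model name is kept alongside the coordinates.
row = {
    "gplates_paleolat": "", "gplates_paleolng": "",
    "scotese_paleolat": "12.5", "scotese_paleolng": "-30.25",
}
lat, lng, source_model = fill_paleocoords(row)
```

Returning the model name with the coordinates keeps the provenance explicit, so filled points can be identified or excluded later in an analysis.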
In my mind this is a bit of a philosophical argument over whether APIs are for data delivery or for data delivery and analysis. I know the PBDB API edges into the realm of analysis with routes that allow users to do things like generate data for diversity curves, but I believe APIs are primarily for fetching data. Analysis and processing are typically left to the user, or to an intermediate package that handles common functions (like velociraptr or the PBDB R package). |
@markuhen's idea of having a parameter option for returning multiple paleocoordinate models is a great one. We could actually split GPlates into the Seton and Wright models, and add more models as they come out. So there would be something like a wright_paleolat, seton_paleolat, and scotese_paleolat field for people interested in getting them all: ?show=paleoloc&allrotations=TRUE (or whatever). |
Actually, the data service does allow you to download both GPlates and Scotese paleocoordinates at the same time. I just didn't put that capability into the download form because I was trying not to complicate it more than necessary. If you add the parameter "&pgm=gplates,scotese" to the download URL, you will get both sets of coordinates. Or, if the pgm parameter already appears in the URL, change its value to the one I indicated.
If it seems like people will want to do this, I can pretty easily add that option to the download form.
…-- Michael
|
I just noticed that the GPlates results for Holocene collections are blank. We should decide what to do about this. Should we fill in the modern coordinates, even if they don't come from GPlates? |
That actually sounds like a bug with our data service. The Holocene should return modern coordinates. @jczaplew and I will look into it. |
The Holocene rotation issue has been filed. UW-Macrostrat/gplates-reconstruct#5. |
@aazaff that parameter is already part of the API. You can specify "pgm=model1,model2,..." where the available models are "gp_early", "gp_mid", "gp_late", "scotese". Also, "gplates" is a synonym for "gp_mid". This is already available on paleobiodb.org. |
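As an illustration of the "pgm" parameter described above, here is a minimal sketch of building a download URL that requests several models at once. The base URL and the `base_name` parameter in the usage line are placeholders, not confirmed endpoints; only the "pgm" and "show=paleoloc" parameters and the model names come from this thread.

```python
from urllib.parse import urlencode

# Model names as listed in this thread; "gplates" is a synonym for "gp_mid".
KNOWN_MODELS = {"gp_early", "gp_mid", "gp_late", "scotese", "gplates"}


def download_url(base_url, models, **params):
    """Build a URL requesting paleocoordinates from one or more models."""
    unknown = [m for m in models if m not in KNOWN_MODELS]
    if unknown:
        raise ValueError(f"unknown paleocoordinate model(s): {unknown}")
    query = dict(params, show="paleoloc", pgm=",".join(models))
    # safe="," keeps the comma-separated model list readable in the URL.
    return f"{base_url}?{urlencode(query, safe=',')}"


# Usage with a placeholder base URL and a placeholder taxon filter.
url = download_url(
    "https://example.org/occs/list.csv",
    ["gplates", "scotese"],
    base_name="Cetacea",
)
```

Validating the model names before building the URL is a convenience of this sketch, not something the data service requires.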
In the new release due out next week, collections whose paleocoords are blank will have the following label in the "geoplate" field: "coordinates not computable using this model". The coordinate fields will still be blank because, as @aazaff noted, putting text into a field where numbers are expected is a bad idea. The "geoplate" field should be interpreted as unstructured text. |
- Issue #11: when the 'paleoloc' output block is selected, records whose paleocoordinates are blank will have a message in the 'geoplate' field: "coordinates not computable using this model". The fields 'paleolat' and 'paleolng' will still be blank, because it is not a good idea to put text into a field that is supposed to contain numbers.
I am no longer sure that changing the geoplate field is the optimal solution. It is technically a breaking change. I was also under the impression that we had agreed on an alternative solution when Mark last visited. I would like us to discuss this again at our next meeting. |
The behavior of the data service is now to report "cannot be computed under this model" in the "geoplate" field when the coordinates are blank. It is reported there instead of in the coordinate fields, since those fields ought to be either empty or a valid coordinate, whereas "geoplate" is an unstructured text string. |
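A client consuming this output might handle the behavior described above along the following lines. This is only a sketch: it assumes each record arrives as a dict with the 'paleolat', 'paleolng', and 'geoplate' fields mentioned in this thread, and it tests for blank coordinates rather than matching the message text, since "geoplate" is unstructured and its wording may change.

```python
def parse_paleoloc(record):
    """Split a record into numeric coordinates plus an optional note.

    Blank 'paleolat'/'paleolng' values mean the point could not be rotated;
    in that case the 'geoplate' text is passed through as a free-form note.
    """
    def to_float(value):
        return float(value) if value not in (None, "") else None

    lat = to_float(record.get("paleolat"))
    lng = to_float(record.get("paleolng"))
    rotated = lat is not None and lng is not None
    return {
        "paleolat": lat,
        "paleolng": lng,
        "rotated": rotated,
        "geoplate_note": None if rotated else record.get("geoplate"),
    }
```

Keeping the coordinate fields strictly numeric or empty means a plain numeric parse never fails on these records.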
The GPlates data service sometimes fails to return paleocoordinates. This is likely because the service doesn't have paleocoordinates for those modern coordinates.
I think our data service should do the following instead of returning nothing: