globi link #4998
Comments
So this is a GloBI issue, or something wrong on our end? If the former, can we let him know? |
if there's really something there then globi is giving an incorrect response. |
We need Jorrit in our Github as a collaborator... |
Created an issue at GloBI |
@dustymc @Jegelewicz I got the attached msb-para.zip via globalbioticinteractions/msb-para http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para cd513050869ee454c31f288433ddb2861c4b4acf3b7484ed126531922d26e2a8 2022-08-27T02:03:37.946Z and, the resource is pretty empty . . . Should I update the vertnet endpoint for msb-para ?
it appears the content changed somewhere before Aug 27 2022:
|
the previous version (< 2022-08-27) was quite a lot bigger:
Please advise. |
Where did your MSB Para data go? |
@dustymc maybe this IS part of the thing you and Dave are working on? |
Yep, sounds like it. Nothing's vaporized lately!
|
@dustymc now I am curious what you and Dave are working on. Super-compression? |
from http://ipt.vertnet.org:8080/ipt/?search=MSB I saw: |
Looks like the vertnet endpoint has changed since elton (GloBI bot) checked last. Now, the content looks different:
and more like before (14M vs a few kB) Looks like I just caught vertnet in a funny moment. Running manual update to confirm. |
Yep, perfect zero-byte compression, you just don't know how to use it. (Or the script that packages up DWC data sometimes confuses and then offs itself resulting in Dave automagically publishing empty files - one of those.....) |
@dustymc good to know - I'll sign up for that zero-byte compression course I always wanted to take 😉 . I can't imagine confused scripts . . . aren't machines perfect? |
Obviously not, perfect machines would imply imperfect programmers and we know that can't be the case! |
Looks like the update pulled in the non-compressed resource:
please allow for some time for GloBI to propagate the changes. |
I'm assuming this fix will propagate to all parasite collection records? I'm recording a demo for a lightning talk and really want the GloBI link to appear in a DMNS:Para record (specifically this one: https://arctos.database.museum/guid/DMNS:Para:49). |
@ebraker I was able to see GloBI link appear for http://arctos.database.museum/guid/MSB:Para:6170 . @dustymc is there any kind of caching that happens on the Arctos side? If so, please flush the 404s from GloBI if possible. |
@ebraker re: https://arctos.database.museum/guid/DMNS:Para:49 - I was able to see the records getting linked in a recent review of Arctos collections exposed by Vertnet: e.g., http://arctos.database.museum/guid/DMNS:Mamm:11096 -> http://arctos.database.museum/guid/DMNS:Para:49 I am still trying to figure out why the current Arctos page doesn't show the expected results.
|
re: DMNS:Para - @dustymc - I see the following para datasets on the vertnet rss feed:
where 17da4ef08733e4be6431053b4b0b90b77d6f7cc5fccc74e73b2b149df0aecbd9 is the rss feed retrieved on 2022-08-27T08:01:11.633Z from http://ipt.vertnet.org:8080/ipt/rss.do . By default, I configured GloBI to ignore the ggbn datasets. And, for DMNS, a ggbn endpoint exists, but not a "normal" one. Can you help explain what is going on? |
for some reason, http://ipt.vertnet.org:8080/ipt/resource?id=dmns_para_ggbn links to and http://ipt.vertnet.org:8080/ipt/resource?id=dmns_para (no ggbn) links to a landing page that also mentions GGBN and confusingly, the data as DwC "download" link for http://ipt.vertnet.org:8080/ipt/resource?id=dmns_para#anchor-downloads is @ebraker did you ever see the linking work for DMNS? Was there any recent work done on the digital management of DMNS? Thanks for being patient as I am trying to figure out what is going on. |
In tracking the provenance, I notice that DMNS Para collection is accessible via http://ipt.vertnet.org:8080/ipt/archive.do?r=dmn instead of expected http://ipt.vertnet.org:8080/ipt/archive.do?r=dmns_para . |
@ebraker I dug around a bunch to look for the linked DMNS specimens. They are showing up in the early GloBI processing stages, so most likely the links are somehow lost in transit in later GloBI processing stages. Here's an example of such a record, extracted via a 2022-09 https://en.wiktionary.org/wiki/support http://arctos.database.museum/guid/DMNS:Mamm:11922?seid=463326 DMNS:Mamm:11922 Mamm 45 DMNS Callospermophilus lateralis Animalia | Chordata | Mammalia | Rodentia | Sciuridae | Callospermophilus | Callospermophilus lateralis kingdom | phylum | class | order | family | genus | species male http://purl.obolibrary.org/obo/RO_0002445 hasParasite http://arctos.database.museum/guid/DMNS:Para:933?seid=3736551 DMNS:Para:933 Para 127 DMNSSiphonaptera Animalia | Arthropoda | Insecta | Siphonaptera kingdom | phylum | class | order PreservedSpecimen 2009-05-15T00:00:00Z 40.6708 -105.6025 Mummy Range; FR268 off FR139 (Crown Point Road) http://arctos.database.museum/guid/DMNS:Mamm:11922 http://arctos.database.museum/guid/DMNS:Mamm:11922 globalbioticinteractions/vertnet DMNS Mammal Collection (Arctos) - Version 34.66 http://ipt.vertnet.org:8080/ipt/archive.do?r=dmns_mamm 2022-09-03T15:37:41.720Z f324019ad23212691cdb91d958bf8cf266bb93dd97210f1a153bc070a996c0a0 0.12.4 which tells me that GloBI is able to successfully link http://arctos.database.museum/guid/DMNS:Mamm:11922 with its counterpart http://arctos.database.museum/guid/DMNS:Para:933 . So, as far as I can tell, the Arctos side is working perfectly! I'll have to dig into the linking issue a little more, and hope to report back. Thanks for being patient. . . I find linking stuff pretty tricky, especially when dealing with cross-platform, cross-institutional references . . . clear that more work is needed to simplify link mechanisms efficiently. 
PS @dustymc currently, I am discovering links between occurrenceIds/ collection codes (e.g., MSB:Para) and their "official" data feed (e.g., http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para) by either manually curating list (see https://github.com/globalbioticinteractions/msb-para/blob/72ff2d6e9a30df60a3a843858f9e7a099f21bf5d/rss.xml) or by re-using the vertnet rss feed. Did you ever consider explicitly declaring dependencies of MSB:Para on other integration endpoints (e.g., MSB:Mamm, GenBank)? If so, I might be able to re-use that list of dependencies as the equivalent of data import statements (e.g., "MSB:Para references records from MSB:Mamm, and MSB:Mamm records may be found at http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_mamm "). |
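To make the "data import statements" idea above concrete, here is a minimal sketch of how declared dependencies might be resolved into a list of feeds to fetch. The `FEEDS` and `REFERENCES` tables and the `feeds_to_import` helper are hypothetical names made up for illustration; neither Arctos nor GloBI is known to publish such a declaration. The two feed URLs are the ones quoted in this thread.

```python
# Hypothetical registry: collection code -> "official" data feed.
# The URLs are the ones mentioned in this thread; the table itself is an assumption.
FEEDS = {
    "MSB:Para": "http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_para",
    "MSB:Mamm": "http://ipt.vertnet.org:8080/ipt/archive.do?r=msb_mamm",
}

# Hypothetical declaration: "MSB:Para references records from MSB:Mamm".
REFERENCES = {
    "MSB:Para": ["MSB:Mamm"],
}


def feeds_to_import(collection, feeds=FEEDS, references=REFERENCES):
    """Return feed URLs for a collection and everything it (transitively) references.

    Collections without a known feed are skipped; cycles are handled by
    remembering which collection codes were already visited.
    """
    ordered = []
    seen = set()
    stack = [collection]
    while stack:
        code = stack.pop()
        if code in seen:
            continue
        seen.add(code)
        if code in feeds:
            ordered.append(feeds[code])
        # Push dependencies in reverse so they are visited in declared order.
        stack.extend(reversed(references.get(code, [])))
    return ordered
```

With a declaration like this, an indexer asking for `feeds_to_import("MSB:Para")` would get the MSB:Para feed followed by the MSB:Mamm feed, instead of relying on a manually curated list.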
@ebraker similarly, I found records linking UCM:Para to UCM:Mamm (see example http://arctos.database.museum/guid/UCM:Mamm:20797 hasParasite http://arctos.database.museum/guid/UCM:Para:10 below). Please do note however, that the UCM:Para collection is somehow not included on the vertnet rss feed, so GloBI was unable to associate the detailed taxonomic information about the parasite. Did you publish the UCM:Para collection publicly? https://en.wiktionary.org/wiki/support http://arctos.database.museum/guid/UCM:Mamm:20797?seid=2623787 UCM:Mamm:20797 Mamm 95 UCM Sorex cinereus Animalia | Chordata | Mammalia | Soricomorpha | Soricidae | Sorex | Sorex cinereus kingdom | phylum | class | order | family | genus | species unknown http://purl.obolibrary.org/obo/RO_0002445 hasParasite http://arctos.database.museum/guid/UCM:Para:10 http://arctos.database.museum/guid/UCM:Para:10 PreservedSpecimen 2012-09-10T00:00:00Z 37.84595333 -108.0318 San Juan Mountains, Lizardhead Wilderness, along north side of West Dolores River, just east of Navajo Lake ('Navajo Lake Site') http://arctos.database.museum/guid/UCM:Mamm:20797 http://arctos.database.museum/guid/UCM:Mamm:20797 globalbioticinteractions/vertnet UCM Mammal Collection (Arctos) - Version 17.70 http://ipt.vertnet.org:8080/ipt/archive.do?r=ucm_mammals 2022-09-03T15:37:41.720Z f324019ad23212691cdb91d958bf8cf266bb93dd97210f1a153bc070a996c0a0 0.12.4 |
@jhpoelen Thanks for looking into this! We set up the UCM:Para collection fairly recently, so I still need to coordinate with Dave to have it published to the VertNet IPT. On the to-do list! |
Hey y'all - After some anxious moments yesterday (where did all the links go?), I did some checking, and found that the recently (just today) updated GloBI index has the expected Arctos links re-established. See e.g., globalbioticinteractions/globalbioticinteractions#818 globalbioticinteractions/globalbioticinteractions#817 . And, the original issue, https://api.globalbioticinteractions.org/exists?accordingTo=http://arctos.database.museum/guid/MSB:Para:30008 returning 404, is no longer happening. Instead the query returns the expected 200 "OK" (see screenshot). So, as far as I can tell, the Arctos datasets exposed via Vertnet have been re-indexed by GloBI and no longer include the zero-byte compression experiments by @dustymc and @dbloom . That was fun! To me this exercise once again shows that: 1. data integration is a continuous activity, 2. humans are needed to take care of these data integration processes by reporting unexpected behaviors and/or improving data integration methods and 3. Arctos has an active community of contributors that care not only for their own data, but are interested to care for the health of the systems that re-use the Arctos data. So thanks! Suggest to close this issue, unless there are remaining concerns. |
Yep - I think we can close! |
@campmlc here's an example of the URL I check for globi links:
https://api.globalbioticinteractions.org/exists?accordingTo=http://arctos.database.museum/guid/MSB:Para:30008
And here's the response
I only create links if that responds with a 200 status code - that one is telling me that there's nothing to link to.
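For anyone wanting to reproduce the check, here is a rough sketch of the "only link on 200" rule described above. It is not the actual Arctos code: the helper names (`exists_url`, `should_link`, `fetch_status`) are made up for illustration, and the only things taken from this thread are the `exists?accordingTo=` URL shape and the 200 vs. 404 behavior.

```python
import urllib.error
import urllib.request
from urllib.parse import quote

# Endpoint as quoted in this issue.
GLOBI_EXISTS = "https://api.globalbioticinteractions.org/exists"


def exists_url(record_url):
    """Build the GloBI 'exists' query for a specimen record URL.

    ':' and '/' are left unescaped so the result matches the URL form
    used in this issue.
    """
    return GLOBI_EXISTS + "?accordingTo=" + quote(record_url, safe=":/")


def should_link(status_code):
    """Create a GloBI link only when the check answers 200."""
    return status_code == 200


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET request.

    urllib raises HTTPError for 4xx/5xx responses, so a 404 is read
    from the exception instead of being treated as a failure.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `should_link(fetch_status(exists_url("http://arctos.database.museum/guid/MSB:Para:30008")))` performs the same check as the URL above. Network errors other than HTTP status errors (timeouts, DNS failures) are deliberately not caught here, so a caller can tell "GloBI says no" apart from "GloBI could not be reached".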