add support for linking names to Plazi-enabled taxonomic literature. #23

Closed
jhpoelen opened this issue Sep 10, 2020 · 9 comments


@jhpoelen
Member

Plazi (https://plazi.org) makes taxonomic literature machine-readable and links taxonomic names to their taxonomic treatments. One of the ways Plazi shares these links is through GBIF (e.g., https://www.gbif.org/dataset/6384b520-7e9f-4874-a414-76c2e9b01d74 for Horseshoe bats).

I suggest adding support to Nomer for easily linking a taxonomic name to its Plazi treatment.

For instance, when running:

$ echo -e "\tRhinolophus denti Thomas 1904" | nomer append plazi-names

an expected result would be

provided id	provided name	resolved id	resolved name
	Rhinolophus denti Thomas 1904	http://treatment.plazi.org/id/885887A2FFC88A21F8B1FA48FB92DD65	Rhinolophus denti

Ideally, GlobalNames would enable this kind of linkage (see GlobalNamesArchitecture/dwca_hunter#30), but alternatives are also possible (e.g., indexing the DwC-A via the Plazi IPT or similar).

fyi @myrmoteras @dimus

jhpoelen pushed a commit to globalbioticinteractions/globalbioticinteractions that referenced this issue Sep 25, 2020
@jhpoelen
Member Author

jhpoelen commented Sep 25, 2020

Using Wikidata to search for Plazi treatments:

$ echo -e "WORMS:126436\tGadus morhua" | nomer append wikidata-taxon-id-web
using matcher [wikidata-taxon-id-web]
WORMS:126436	Gadus morhua	SAME_AS	WD:Q199788	Gadus morhua			Gadus morhua			https://www.wikidata.org/wiki/Q199788	
WORMS:126436	Gadus morhua	SAME_AS	NCBI:8049	Gadus morhua			Gadus morhua			https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=8049	
WORMS:126436	Gadus morhua	SAME_AS	ITIS:164712	Gadus morhua			Gadus morhua			http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=164712	
WORMS:126436	Gadus morhua	SAME_AS	GBIF:2415835	Gadus morhua			Gadus morhua			http://www.gbif.org/species/2415835	
WORMS:126436	Gadus morhua	SAME_AS	WORMS:126436	Gadus morhua			Gadus morhua			https://www.marinespecies.org/aphia.php?p=taxdetails&id=126436	
WORMS:126436	Gadus morhua	SAME_AS	FBC:FB:SpecCode:69	Gadus morhua			Gadus morhua			http://fishbase.org/summary/69	
WORMS:126436	Gadus morhua	SAME_AS	PLAZI:99915444-EC70-3196-7F2D-637F418F0730	Gadus morhua			Gadus morhua		http://treatment.plazi.org/id/99915444-EC70-3196-7F2D-637F418F0730	
WORMS:126436	Gadus morhua	SAME_AS	INAT_TAXON:63740	Gadus morhua			Gadus morhua			https://inaturalist.org/taxa/63740	
WORMS:126436	Gadus morhua	SAME_AS	NBN:NBNSYS0000175392	Gadus morhua			Gadus morhua			https://data.nbn.org.uk/Taxa/NBNSYS0000175392	
WORMS:126436	Gadus morhua	SAME_AS	IRMNG:11195684	Gadus morhua			Gadus morhua			http://www.marine.csiro.au/mirrorsearch/ir_search.list_species?sp_id=11195684	
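To pull only the Plazi link out of output like the above, something along these lines might work. This is a rough sketch, not part of Nomer: it assumes the resolved id sits in the fourth tab-separated column (as in the rows shown) and that a `PLAZI:` id maps to a `http://treatment.plazi.org/id/` URL (as in the PLAZI row shown). The sample rows are truncated to their first five columns for brevity; in practice the input would be the actual Nomer output.

```shell
# Two rows copied from the output above, truncated to five columns.
# printf reuses the format for each group of five arguments, giving two lines.
sample=$(printf '%s\t%s\t%s\t%s\t%s\n' \
  'WORMS:126436' 'Gadus morhua' 'SAME_AS' 'GBIF:2415835' 'Gadus morhua' \
  'WORMS:126436' 'Gadus morhua' 'SAME_AS' 'PLAZI:99915444-EC70-3196-7F2D-637F418F0730' 'Gadus morhua')

# Keep rows whose fourth column starts with "PLAZI:" (assumed to be the
# resolved id) and print the treatment URL derived from that id.
plazi_urls=$(printf '%s\n' "$sample" | awk -F '\t' '
  $4 ~ /^PLAZI:/ {
    sub(/^PLAZI:/, "", $4)
    print "http://treatment.plazi.org/id/" $4
  }')

printf '%s\n' "$plazi_urls"
```

The same awk filter could be appended to the nomer pipeline shown above instead of the hard-coded sample.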

Note, however, that for some reason the treatment for Gadus morhua does not seem to resolve on the Plazi server (see attached screenshot):

http://treatment.plazi.org/id/99915444-EC70-3196-7F2D-637F418F0730

Screenshot from 2020-09-25 16-12-13
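For checking resolution from the command line rather than a browser, a minimal sketch could look like the following. It assumes `curl` is installed; the helper names `http_status` and `describe_status` are made up for illustration, and the mapping from status code to label is a simplification.

```shell
# Print only the final HTTP status code for a URL, following redirects.
# curl prints 000 when no HTTP response was received at all.
http_status() {
  curl -s -o /dev/null -L --max-time 20 -w '%{http_code}' "$1"
}

# Turn a status code into a short human-readable label (simplified).
describe_status() {
  case "$1" in
    2??) echo "resolves" ;;
    000) echo "no response" ;;
    *)   echo "does not resolve (HTTP $1)" ;;
  esac
}
```

Usage would be something like `describe_status "$(http_status http://treatment.plazi.org/id/99915444-EC70-3196-7F2D-637F418F0730)"`; what it reports depends on the state of the Plazi server at the time.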

@jhpoelen
Member Author

@myrmoteras @mguidoti any idea why http://treatment.plazi.org/id/99915444-EC70-3196-7F2D-637F418F0730 , a treatment provided for Gadus morhua by Wikidata / Wikipedia, does not resolve on the Plazi servers?

@jhpoelen
Member Author

jhpoelen commented Sep 29, 2020

@myrmoteras suggested using Synospecies instead, via https://synospecies.plazi.org/advanced or https://synospecies.plazi.org/ . The Wikidata-linked information is said to be incomplete and out of date.

@jhpoelen
Member Author

Very neat to see that Plazi publishes the metadata of their >300k treatments as versioned, structured data via https://github.com/plazi/treatments-rdf/, enabling bulk data integration via tools like Nomer. Thank you Plazi!

@jhpoelen
Member Author

A plazi matcher was implemented. The following result was produced:

$ echo -e "\tRhinolophus sinicus" | nomer append plazi
	Rhinolophus sinicus	SAME_AS	http://treatment.plazi.org/id/885887A2FFE18A06F899F4EFF9BFD35B	Rhinolophus sinicus	species		Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	http://treatment.plazi.org/id/885887A2FFE18A06F899F4EFF9BFD35B

@jhpoelen
Member Author

jhpoelen commented Sep 30, 2020

After some more modifications, all related treatments are linked, including the DOIs of the original publications:

--	provided name	link type	linked id	linked taxon name	...	...	...	...	...	...
...	Rhinolophus sinicus	SAME_AS	http://treatment.plazi.org/id/885887A2FFE18A06F899F4EFF9BFD35B	Rhinolophus sinicus			Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	http://treatment.plazi.org/id/885887A2FFE18A06F899F4EFF9BFD35B
	Rhinolophus sinicus	SAME_AS	doi:10.5281/zenodo.3808964	Rhinolophus sinicus			Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	https://doi.org/10.5281/zenodo.3808964
	Rhinolophus sinicus	SAME_AS	http://treatment.plazi.org/id/03AF87D3C435B542FF728049FB55BB1B	Rhinolophus sinicus			Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	http://treatment.plazi.org/id/03AF87D3C435B542FF728049FB55BB1B
	Rhinolophus sinicus	SAME_AS	doi:10.3161/150811009X465703	Rhinolophus sinicus			Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	https://doi.org/10.3161/150811009X465703
	Rhinolophus sinicus	SAME_AS	http://treatment.plazi.org/id/03D7CF296C71FF8C2C88F6BDF85AF415	Rhinolophus sinicus			Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	http://treatment.plazi.org/id/03D7CF296C71FF8C2C88F6BDF85AF415
	Rhinolophus sinicus	SAME_AS	doi:10.5281/zenodo.3805455	Rhinolophus sinicus			Animalia | Chordata | Mammalia | Chiroptera | Rhinolophidae | Rhinolophus | Rhinolophus sinicus		kingdom | phylum | class | order | family | genus | species	https://doi.org/10.5281/zenodo.3805455

More features can be imagined (e.g., synonyms, more name link types), but a first pass has been implemented.
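Since treatment links and publication DOIs now share the same linked id column, it can be handy to tell them apart when post-processing. Here is a small sketch that counts each kind for the Rhinolophus sinicus result above. It assumes the linked id is the fourth tab-separated column and that ids starting with `doi:` are publications while those starting with `http://treatment.plazi.org/id/` are treatments; the rows are rebuilt from the ids listed above and shortened to four columns.

```shell
# Rebuild shortened rows (empty provided id, name, link type, linked id)
# from the six linked ids shown above; printf repeats the format per id.
rows=$(printf '\tRhinolophus sinicus\tSAME_AS\t%s\n' \
  'http://treatment.plazi.org/id/885887A2FFE18A06F899F4EFF9BFD35B' \
  'doi:10.5281/zenodo.3808964' \
  'http://treatment.plazi.org/id/03AF87D3C435B542FF728049FB55BB1B' \
  'doi:10.3161/150811009X465703' \
  'http://treatment.plazi.org/id/03D7CF296C71FF8C2C88F6BDF85AF415' \
  'doi:10.5281/zenodo.3805455')

# Count treatment links and DOI links by the prefix of the fourth column.
summary=$(printf '%s\n' "$rows" | awk -F '\t' '
  $4 ~ /^http:\/\/treatment\.plazi\.org\/id\// { treatments++ }
  $4 ~ /^doi:/ { dois++ }
  END { printf "%d treatments, %d DOIs\n", treatments, dois }')

printf '%s\n' "$summary"
```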

@myrmoteras

Super. I will link this with the T.rex project to make sure the data flow works.

jhpoelen pushed a commit that referenced this issue Oct 1, 2020
…n from name->treatment/publication associations; related to #23
jhpoelen pushed a commit that referenced this issue Oct 1, 2020
…nts to avoid including entire taxonomic context of a single treatment; #23
jhpoelen pushed a commit to globalbioticinteractions/globi-taxon-graph-patches that referenced this issue Oct 2, 2020
jhpoelen pushed a commit to globalbioticinteractions/globalbioticinteractions.github.io that referenced this issue Oct 2, 2020
jhpoelen pushed a commit that referenced this issue Oct 2, 2020
jhpoelen pushed a commit to globalbioticinteractions/globalbioticinteractions that referenced this issue Oct 2, 2020
@jhpoelen
Member Author

jhpoelen commented Oct 2, 2020

@jhpoelen
Member Author

jhpoelen commented Oct 8, 2020

This newly added feature was used to implement methods and produce results:

Methods

The automated taxon link procedure is described in Poelen, Jorrit H. (2020). Global Biotic Interactions: Taxon Graph Patches (Version 0.5) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4062711 .

Result

The automated patch procedure was applied to GloBI Taxon Graph v0.3.25 to produce: Poelen, Jorrit H. (2020). Global Biotic Interactions: Taxon Graph (Version 0.3.26) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4062765 . In this version, 41,894 Plazi treatments were linked, covering 27,721 Plazi taxon concepts and 9,346 original publications.

plazi-linkages
