TypeStatus - curation before uploading first vocabulary version #87
Comments
I'd like to work on this one, please |
Some intermediate notes:
|
This is at a stage where I would appreciate a review. Do we have a process for this, or whom should I include, @timrobertson100? |
Thanks @ahahn-gbif. Each vocabulary can present different quirks, so the process needs to be tailored to fit, but I find that in general
In this instance you need domain-specific knowledge, so perhaps @mdoering would be a good reviewer? |
@ahahn-gbif happy to review, just ping me |
Concerning the handling of verbatim data containing uncertainty markers ("type?", "possible type" etc.): Policy decision to
For the vocabulary this means to
(after consultation with @mdoering, @timrobertson100 and @marcos-lg) |
I would recommend renaming this to NOT_A_TYPE so it's clear. With all these cryptic type names, NOTA_TYPE sounds like a plausible name for some kind of type. |
If you have to have a single parent it should be the syntype, as iso- just indicates a duplicate, the syntype is the more distinguished type status I would say. |
Is it worth considering an OTHER status to populate in case the verbatim status cannot be parsed but is not null? |
Should ISOTYPE not have HOLOTYPE as the parent instead of just TYPE?
|
All isotypeXYZ should not have ISOTYPE as the parent. ISOTYPE is based on the holotype, all the others not.
|
The COL type status vocabulary sets up a hierarchy by specifying a "base" status (= parent). Maybe that's worth looking into: |
I would map the following rather to NULL, maybe flagging an uncertainty issue.
|
I would create a distinct new value for Neoparatype & Lectoparatype.
|
Paratopotype is mapped to PARATYPE and TYPE. Should be the former for all 3:
|
I am undecided here between the formal relationship between terms (there is a group of isotypes that gets further characterised as exisotype, plastoisotype etc), and relationships between objects (if there is an isotype, there also must be a holotype that the isotype relates to). I think the vocabulary models the former (grab anything that is an isotype out of the bucket), but not necessarily the latter (if I search for holotypes, would I expect to also get isotypes delivered?) - but I may be confused. Edit: I see that the CoL typestatus vocabulary (link above) does make holotype the parent of isotype. The purpose/use case may still be different there than our occurrence relevance (?) |
Also undecided on this one. I would introduce this only if we would expect other type-related terms that cannot be mapped to any of the existing ones, and where introducing a new concept would be wrong for some reason. Do we want an option for parsing type-unrelated content, and if so, for what purpose? Or would it be a temporary bucket for collecting content that is either mis-mapped or omitted from the vocabulary in error? |
I would agree about "TYPESTATUSUNKNOWN". For POSSIBLETYPE, the ORIGINALMATERIAL definition as "'type-suspicious' material" seems to fit rather well, so I am less sure here. I removed one, but kept the other for now. |
COL might do it wrongly. I think you are right. An isotype is a duplicate of a holotype, but not itself a holotype. |
Hm. But ISOEPITYPE, ISOLECTOTYPE or ISONEOTYPE are not subclasses of ISOTYPE - they are all duplicates, but not from the holotype |
Thanks, I missed that. Just as ISOTYPE is a sibling of HOLOTYPE and child of TYPE, this would also apply to the other SYN~ names. Following the same logic, they would all be subclasses of TYPE, rather. |
Actually, the same would apply to the Allo~ (opposite sex of the holotype) and Para~ (everything but the holotype, though used in the description) series (?) |
Indeed. Difficult to model. There are several unrelated properties/flags really that taxonomists have combined into a single word |
For handover due to intermediate absence: CLONOTYPE | (botany) Herbarium specimens made from plants vegetatively propagated from (thus clones of) the same plant from which a type specimen was made. Clonotypes are of some use in documenting a type collection but have no status under the International Code of Botanical Nomenclature. The term is sometimes also used to refer to the living plants themselves. Otherwise, I think this could be imported now and edited later, unless there are any remaining concerns? |
@ahahn-gbif do you think it would make sense to include the nomenclatural code, e.g. botanical, zoological etc., as a tag (http://api.catalogueoflife.org/vocab/typestatus)? It would allow us and editors to manage the vocabulary based on the code applied. I will follow up and add #87 (comment) before I upload to UAT. I realise we have the following decision for uncertain values: #87 (comment) - could we create a pipelines issue for this flag so we can discuss how best to implement it @marcos-lg? |
@ahahn-gbif If I understand correctly, all
|
I prefer not to add fuzzy matching, so as not to overcomplicate things. I'd leave them unmapped and flag them somehow; maybe we can flag them if we recognize some keywords in the verbatim value such as |
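To make the "leave unmapped, but flag on keywords" idea concrete, here is a minimal sketch in Python. It is not the pipelines implementation: the flag name `TYPE_STATUS_UNCERTAIN`, the function `interpret`, and the two patterns are all hypothetical. The patterns are only derived from the "type?" and "possible type" examples quoted earlier in this thread; the real keyword list is still to be compiled.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical marker patterns, based only on the "type?" and
# "possible type" examples mentioned in this thread.
UNCERTAINTY_PATTERNS = [
    re.compile(r"\?"),
    re.compile(r"\bpossibl[ey]\b", re.IGNORECASE),
]

# Hypothetical flag name; the actual name was still under discussion.
FLAG = "TYPE_STATUS_UNCERTAIN"


def interpret(
    verbatim: Optional[str], lookup: Dict[str, str]
) -> Tuple[Optional[str], List[str]]:
    """Return (concept, flags) for a verbatim type status.

    Exact (case-insensitive) matches are mapped. Anything else stays
    unmapped; it is flagged only if an uncertainty marker is present.
    No fuzzy matching is attempted.
    """
    if verbatim is None or not verbatim.strip():
        return None, []
    key = verbatim.strip().lower()
    if key in lookup:
        return lookup[key], []
    flagged = any(p.search(key) for p in UNCERTAINTY_PATTERNS)
    return None, [FLAG] if flagged else []


# Usage with a tiny, made-up lookup table:
lookup = {"holotype": "Holotype"}
print(interpret("Holotype", lookup))
print(interpret("possible type", lookup))
```

With this setup a value like "type?" is never coerced into a concept; it only carries the flag, so the uncertainty stays visible downstream.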
We had revised the parent-child relationships; remaining errors excepted, the parent should be what is listed in the parent column, meaning this is hierarchical, and a parent (like Type) can have multiple children. This does not mean that Type is the parent for all - the relationship from C:Alloneotype to P:Neotype is correct. Is this going to be a problem? |
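A small sketch of how such a single-parent hierarchy behaves, in case it helps the discussion. The `PARENT` table is a hypothetical excerpt, not the actual vocabulary file: Holotype and Isotype as children of Type and Alloneotype as a child of Neotype follow this thread, while Neotype as a child of Type is an assumption made for illustration.

```python
from typing import Dict, List, Optional, Set

# Hypothetical excerpt of the "parent" column (concept -> parent).
PARENT: Dict[str, Optional[str]] = {
    "Type": None,
    "Holotype": "Type",
    "Isotype": "Type",         # sibling of Holotype, not its child
    "Neotype": "Type",         # assumption, for illustration only
    "Alloneotype": "Neotype",  # C:Alloneotype -> P:Neotype
}


def ancestors(concept: str) -> List[str]:
    """Walk the parent column upwards, nearest parent first."""
    chain: List[str] = []
    parent = PARENT[concept]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def descendants(concept: str) -> Set[str]:
    """Collect all direct and indirect children of a concept."""
    found: Set[str] = set()
    for child, parent in PARENT.items():
        if parent == concept:
            found.add(child)
            found |= descendants(child)
    return found


print(ancestors("Alloneotype"))
print(sorted(descendants("Type")))
print(sorted(descendants("Holotype")))
```

In this model a search for Holotype does not return Isotype, whereas a search for Type returns everything below it, including Alloneotype via Neotype.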
|
Ok, we can leave it for now and add it later, if it makes sense. |
No, this should not be a problem. I will stick with this setup then. |
Yes, I will leave them unmapped and comment again once I have compiled all the values that should be flagged. |
@ahahn-gbif these are currently mapped to parent = |
@CecSve they should indeed rather be mapped as Markus suggested April 26, 2001 (your list above) - thanks! |
This seems to be a wrong designation of the label_En?
|
Technically speaking, probably yes (since it is not English). I was adding this as the technical term for the concept, but maybe just "type species" or "type species (typus generis)" would be closer to the label intention |
I have added the typus *** to alternativeLabels_en and put type *** in Label_en instead - it probably does not really matter, but just for consistency |
Changed concepts to UpperCamelCase and added alternativeLabels_en |
Added the suggested concepts. Should I add
Also suggested here. |
Sounds reasonable to add. Parent Paratype. |
|
The vocabulary is now uploaded to prod and UAT. |
@marcos-lg the following values are part of unmapped verbatim value strings and should be parsed and flagged during interpretation:
should the flag be |
Maybe |
Thanks, that makes sense. |
Here is a file to edit: https://drive.google.com/file/d/1WOgxAt3nIL2TVpXV06Qa8wNTY9Q7R5R7/view?usp=sharing
It contains:
Please check instructions here: #70