Synchronize with Index Herbariorum - Collections and institutions #167
Comments
To get an idea of how this would impact linking GBIF specimens to collections and institutions, I checked a few botanical collections on GBIF.
The tendency seems to be to use mostly the same code for institution and collection (or to skip one of them).
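A check like this could be scripted rather than done by eye. The snippet below is only a minimal sketch: it assumes occurrence records are available as dictionaries carrying the Darwin Core terms `institutionCode` and `collectionCode`, and the helper names and sample records are hypothetical (this is not real GBIF data).

```python
from collections import Counter


def classify_codes(record):
    """Classify how a record fills institutionCode and collectionCode."""
    inst = (record.get("institutionCode") or "").strip().upper()
    coll = (record.get("collectionCode") or "").strip().upper()
    if inst and coll:
        return "same" if inst == coll else "different"
    if inst:
        return "institution_only"
    if coll:
        return "collection_only"
    return "neither"


def tally_code_usage(records):
    """Count how many records fall into each code-usage category."""
    return Counter(classify_codes(r) for r in records)


# Hypothetical sample records, for illustration only.
sample = [
    {"institutionCode": "ABC", "collectionCode": "ABC"},
    {"institutionCode": "ABC", "collectionCode": None},
    {"institutionCode": "ABC", "collectionCode": "Botany"},
    {"collectionCode": "xyz"},
]
usage = tally_code_usage(sample)
```

A high share of `same`, `institution_only` and `collection_only` records would support the observation above.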
Just to iterate on option 2: assuming the codes match IH and are unique in GrSciColl, that would mean:
The important part is below, I guess:
Logic for creating entities
Details of fields to update
IH fields:
Identifiers
Keep the identifiers where they are. When will we decide to use DOIs?
The next fields will be added to the GrSciColl Collection entity in order to map some of the fields from IH:
@asturcon what are the types for these fields? Single-line strings, text blocks, markdown, numbers, UUIDs?
The same institution appears multiple times in IH and hence in GrSciColl. This seems to be a case of IH not splitting institutions and collections, and hence having to create 2 entities with the same information, simply to have 2 codes for the 2 collections from the institution. Now that we have decided (in agreement with IH) to always create an implicit collection, we can arguably delete one of the institutions.
In production and scheduled to run weekly.
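To find candidates for this kind of cleanup, one could group institutions by their descriptive information and list the groups that carry more than one code. This is a rough sketch only: the field names (`code`, `name`, `city`) and the matching rule (exact match on normalized name and city) are assumptions, not the actual GrSciColl model or sync logic.

```python
from collections import defaultdict


def find_duplicate_institutions(institutions):
    """Group institutions by normalized (name, city); return groups with 2+ codes."""
    groups = defaultdict(list)
    for inst in institutions:
        key = (
            inst["name"].strip().casefold(),
            (inst.get("city") or "").strip().casefold(),
        )
        groups[key].append(inst["code"])
    return {key: codes for key, codes in groups.items() if len(codes) > 1}


# Hypothetical entries, for illustration only.
institutions = [
    {"code": "AAA", "name": "Example University", "city": "Springfield"},
    {"code": "AAB", "name": "example university ", "city": "Springfield"},
    {"code": "CCC", "name": "Other Museum", "city": "Shelbyville"},
]
duplicates = find_duplicate_institutions(institutions)
```

Each group would still need a manual look before deleting anything, since two entries with the same name may be legitimately distinct.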
Before we start
These are my assumptions about the GrSciColl registry:
Option 1: Always Map IH to Institutions
Right now, entries in IH describe mostly institutions.
In the context of IH, it makes sense since we are talking about herbaria only. The problem is that GrSciColl is a broader context where the herbaria/botany part of an institution cannot always represent an institution.
Example of resulting issues
Let's take an example that illustrates the problem: UWO
Another example would be ANSP, which also has an arthropod collection but is described as a diatom herbarium in GrSciColl and in IH.
Another type of problem is conflicting information. We have some cases where the description of an institution on GrSciColl is more generic. For example, in the case of LUX, the information was rearranged on GrSciColl:
Possible solutions
With IH mapped to institutions in GrSciColl, we have two possible solutions:
Option 2: Map IH entries to collections
Conceptually, it would make more sense for herbaria to be collections in GrSciColl.
In a way, they are "botany collections".
By this, I mean that each IH entry should be a collection attached to an institution. More ideas on how this could work below.
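As a rough illustration of that idea (a sketch under assumptions, not the actual sync implementation): an IH entry becomes a collection, attached to the institution that shares its code, and the institution is created if it does not exist yet. The classes, the `code` and `organization` field names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    code: str
    name: str
    institution_code: str


@dataclass
class Institution:
    code: str
    name: str
    collections: list = field(default_factory=list)


def map_ih_entry(ih_entry, institutions):
    """Attach an IH entry as a collection to the institution with the same code.

    `institutions` maps institution code -> Institution and is updated in place.
    Assumes IH codes are unique in the registry.
    """
    code = ih_entry["code"]
    institution = institutions.get(code)
    if institution is None:
        # No matching institution yet: create one from the IH entry.
        institution = Institution(code=code, name=ih_entry["organization"])
        institutions[code] = institution
    collection = Collection(
        code=code,
        name=f"{ih_entry['organization']} herbarium",  # hypothetical naming rule
        institution_code=institution.code,
    )
    institution.collections.append(collection)
    return collection


# Hypothetical registry: one institution that already has a non-botany collection.
registry = {
    "XYZ": Institution(
        "XYZ", "Example Academy", [Collection("XYZ-ENT", "Entomology", "XYZ")]
    )
}
herbarium = map_ih_entry({"code": "XYZ", "organization": "Example Academy"}, registry)
new_one = map_ih_entry({"code": "QRS", "organization": "Another Garden"}, registry)
```

With this shape, the herbarium sits next to the institution's other collections instead of standing in for the whole institution.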
Advantages
Overall, I think it could make GrSciColl more coherent:
How this could work
These are just some ideas to be discussed. Here is what we could try to achieve:
I tried to illustrate this with the ANSP example:

Obviously, this would be far from perfect, but this makes more sense to me than mapping everything to institutions. Any thoughts on this? Did I forget anything?
Related issue: #159