GrSciColl Number of specimens perhaps should be optional, or derived from GBIF count? #389

Closed
rukayaj opened this issue Oct 6, 2021 · 9 comments
Assignees
Labels
GRSciColl Issues related to institutions, collections and staff

Comments

@rukayaj

rukayaj commented Oct 6, 2021

I would suggest that the Number of specimens field registered for each GRSciColl institution should either be optional or be generated automatically from the occurrence count on each page load. Otherwise, if you publish new data daily, it becomes tedious to go into GRSciColl and manually update the specimen count every day as well.

So for example on https://www.gbif.org/grscicoll/institution/41930fa9-e6da-4351-bc6e-59dfaca3be7d, the green count of occurrences on the top right should always be equal to the Number of specimens field. Or the number of specimens field should be optional (currently if you put an empty string in there it sets it to '0').
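The empty-string behaviour described above could be avoided by treating a blank field as "not provided" rather than as zero. The following is only a minimal sketch of that idea; `parse_specimen_count` is a hypothetical helper for illustration and is not taken from the GRSciColl API or UI code.

```python
from typing import Optional


def parse_specimen_count(raw: Optional[str]) -> Optional[int]:
    """Hypothetical helper: map a blank form value to None instead of 0."""
    if raw is None or raw.strip() == "":
        # Blank means "unknown", which is different from an explicit zero.
        return None
    value = int(raw.strip())
    if value < 0:
        raise ValueError("specimen count cannot be negative")
    return value
```

With this approach an institution that has entered nothing stays distinguishable from one that has explicitly reported zero specimens.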

@ManonGros ManonGros added the GRSciColl Issues related to institutions, collections and staff label Oct 6, 2021
@ManonGros
Contributor

I agree that this field should be optional. I don't know if this is an API or UI thing. @marcos-lg, does the API save 0 for specimen counts by default? Could it be changed?

But the specimen count doesn't have to be the number of specimens published on GBIF. In fact, I think it is very useful to be able to advertise undigitized specimens. And the difference between the estimated number of specimens and the actual number of records available on GBIF can be a good tool to prioritise data mobilisation efforts.

@marcos-lg
Contributor

> I agree that this field should be optional. I don't know if this is an API or UI thing, @marcos-lg does the API save 0 for specimen counts by default? Could it be changed?

I'll change it to be optional.

@nielsraes

I am not sure about this. For me this is an estimate of the total number of specimens held by an institution. The upper right number is the number of specimen records that have been mobilized to GBIF. The difference between the estimate and the realised number of digital specimen records at GBIF can drive future digitisation on demand.
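As a rough illustration of how that difference could be used, the sketch below computes the remaining backlog and the digitised share from the two numbers. The function name and the returned keys are hypothetical, and clamping is an assumption made here to cope with cases where the record count exceeds an outdated estimate.

```python
from typing import Optional


def digitisation_gap(estimated_specimens: Optional[int], gbif_records: int) -> Optional[dict]:
    """Hypothetical helper: compare an estimated holding with mobilised records."""
    if estimated_specimens is None or estimated_specimens <= 0:
        # No usable estimate (the field is optional), so no gap can be derived.
        return None
    # Clamp so that an outdated estimate never yields a negative backlog.
    remaining = max(estimated_specimens - gbif_records, 0)
    share = min(gbif_records / estimated_specimens, 1.0)
    return {"remaining": remaining, "share_digitised": share}
```

For example, an estimate of 1000 specimens with 250 records on GBIF gives 750 specimens remaining and a digitised share of 0.25.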

@rukayaj
Author

rukayaj commented Oct 6, 2021

Yes good point, but then perhaps it's ok that it will be optional?

@marcos-lg marcos-lg self-assigned this Oct 6, 2021
@debpaul

debpaul commented Oct 6, 2021

> But the specimen count doesn't have to be the number of specimens published on GBIF. In fact, I think it is very useful to be able to advertise undigitized specimens. And the difference between the estimated number of specimens and the actual number of records available on GBIF can be a good tool to prioritise data mobilisation efforts.

Greetings all, I would also consult with the TDWG CD convenors here (Matt Woodburn, Janeen Jones, Sharon Grant, Kate Webbink). What @ManonGros writes here is exactly what we are trying to support with these standards.

  1. We need "denominators." We need these numbers so that we can do data visualizations as well as programmatic calculations.
  2. We clearly need counts of physical objects (for a denominator) and counts of digital objects.
  3. With the digital objects, preferably there's a way to distinguish between individual objects and lots (such as for wet collections).
  4. With these data, in the appropriate standard buckets, the CD group has a data model that shows what calculations are possible, based on what is or is not provided by the data user.
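To make points 2 to 4 concrete, here is a small sketch of how optional counts could determine which calculations are possible. It is not the TDWG CD data model; the class, its field names and the calculation labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CollectionCounts:
    # All fields are optional because a data provider may supply only some of them.
    physical_objects: Optional[int] = None  # estimated holdings (the "denominator")
    digital_objects: Optional[int] = None   # records describing individual objects
    digital_lots: Optional[int] = None      # records that each describe a lot


def possible_calculations(c: CollectionCounts) -> List[str]:
    """Return labels of the calculations that the provided counts allow."""
    out = []
    if c.digital_objects is not None or c.digital_lots is not None:
        out.append("total digital records")
    # A share needs a non-zero denominator and a count of individual records.
    if c.physical_objects and c.digital_objects is not None:
        out.append("share of physical objects digitised individually")
    if c.digital_objects is not None and c.digital_lots is not None:
        out.append("ratio of object records to lot records")
    return out
```

A provider that supplies nothing gets an empty list back, while one that supplies a physical estimate and an object record count unlocks the first two calculations.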

@timrobertson100 has seen the above data models ... and may be able to offer more insights.

It is exciting to see this conversation taking place -- and noting that we are moving forward with our ability to better share these data so that we can better understand what we have, what makes us each unique, and what we need.

@ManonGros
Contributor

Hi @debpaul, integrating the TDWG CDs is part of our long-term plans for GRSciColl! Although we are not there yet, we are certainly keeping an eye on the evolution of the standard.

@rukayaj
Author

rukayaj commented Oct 14, 2021

Does it actually make more sense for the number of specimens to be per GRSciColl collection, rather than per institution?

@marcos-lg
Contributor

Just deployed to PROD the change to make the number of specimens optional.

@ManonGros
Contributor

@rukayaj it would make sense and I believe that this is something that is being taken into account in the TDWG CDs. I wouldn't want to start fiddling with the current model given that we will want to integrate the TDWG CDs anyway. I think that this will be something that will be incorporated at that point.


5 participants