Align citation with GBIF.org #1360
Comments
The corresponding GBIF registry issue is gbif/registry#4 |
Should user still have an opportunity to give a free text citation to IPT + a warning that this won't be displayed at GBIF.org, or should this opportunity be closed completely? |
I agree that both should be aligned! Regarding allowing the user to add free text: that's something to avoid if it's never going to end up on GBIF (with or without warning)! I think it's better to limit the user's freedom, but still allow them to provide some things to be included in an automatically generated citation. Here's what we do
Now, if we were able to provide the DOI of the dataset and the DOI of the data paper, and the citation were built from there, we would be happy. Other remarks
Example IPT vs GBIF citation. Differences in bold. Citation on IPT: http://data.inbo.be/ipt/resource?r=dagvlinders-inbo-occurrences
Citation on GBIF: https://www.gbif.org/dataset/7888f666-f59e-4534-8478-3a10a3bfee45
|
As discussed with, but not yet checked by @ahahn-gbif: In the IPT, next to the custom citation field, please add, somehow visibly, the following warning: |
I see this is a long standing issue, I would like to know if there are any plans to address it for the coming release (if there is one coming soon). |
@ahahn-gbif as we just discussed this again lately, I think we should add some closing remarks and close this, do you agree? |
@dschigel @ahahn-gbif that would be great as this disparity between IPT and GBIF creates a lot of noise between data publishers. |
@dschigel We will still need to pick this issue up in future IPT development work, but may want to move it to a different repository. There is no scheduled development work or release plan for the IPT at this point, the future IPT work being in early scoping stages. At this point, the IPT citation editing page contains the following warning against free-text citations: I assume we are only talking about the auto-generated citations from here on. I agree that their generation logic should be aligned between gbif.org and the IPT, so that an IPT data administrator has a reasonably clear idea of how the citation will later show on gbif.org. However, for several of the reasons @peterdesmet states above, both citations will never be fully identical, especially in the details pointing back at the source - citing an IPT endpoint is not referencing a dataset version fully identical to the one accessible through gbif.org (it may not even be published in a GBIF context), and both citations will continue to differ. Maybe it is important to communicate this latter point: accessing the dataset through the IPT endpoint will give access to the data as configured at source, potentially including unindexed extensions, different preferred taxon names, geodetic organization, etc. It will, on the other hand, not contain components added or annotated during the ingestion processes of GBIF et al.: alignment with a core taxonomic structure for unified search access, standardization of certain data fields, annotation of potential issues, data interpretation to mediate detected issues. |
We came to this in the Humboldt core topic, and I realize we can possibly solve it by focusing on DOI and by stressing that it's the target context that dictates citation format. I come from two assumptions:
If you agree with these statements, we actually don't need a recommended citation at all. For DOI based data citation to work, we don't need a publisher preferred citation either (but we can keep it in the IPT resource as it seems to be important, and yes we can stress that it is the IPT resource which is then cited, not the GBIF view). Instead of the current Citation footer, every GBIF page can have a section along the CitetheDOI lines with approximately the following text, see below. In the IPT view we need to have much softer wording, e.g. "should" -> "example". The sentence on the publisher recommended view would only be shown if the publisher recommended citation is not null. I think the suggestion below would capture key wishes expressed above. Styling and English will need to be fixed. How to cite: Karlsholt O, Pedersen J, Hansen (deceased) M, Schigel D, Braak K (2016). Insects from light trap (1992–2009), rooftop Zoological Museum, Copenhagen. Version 1.4. Natural History Museum of Denmark. Sampling event dataset https://doi.org/10.15468/xabmiz accessed via GBIF.org on 2021-03-17. |
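For illustration only, the "How to cite" example in the comment above can be reproduced by a small formatter. This is a minimal sketch, not the GBIF.org or IPT implementation: the function name `format_citation` and its parameters are hypothetical, and the field order is simply read off the example string.

```python
from datetime import date


def format_citation(authors, year, title, version, publisher,
                    dataset_type, doi, accessed):
    """Assemble a DOI-based citation in the layout of the example above.

    Hypothetical helper: field names and order are assumptions taken
    from the example string, not from any GBIF or IPT code.
    """
    author_part = ", ".join(authors)
    return (
        f"{author_part} ({year}). {title}. Version {version}. {publisher}. "
        f"{dataset_type} dataset https://doi.org/{doi} "
        f"accessed via GBIF.org on {accessed.isoformat()}."
    )


citation = format_citation(
    authors=["Karlsholt O", "Pedersen J", "Hansen (deceased) M",
             "Schigel D", "Braak K"],
    year=2016,
    title="Insects from light trap (1992–2009), rooftop Zoological Museum, Copenhagen",
    version="1.4",
    publisher="Natural History Museum of Denmark",
    dataset_type="Sampling event",
    doi="10.15468/xabmiz",
    accessed=date(2021, 3, 17),
)
print(citation)
```

Only the access date and the "accessed via" context change between views here, which is the point made above about the target context dictating the citation format.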
We sometimes tried to use test IPTs + test registry for checking how different sections of metadata would finally look once published (links and other html stuff, which authors will appear in citation and in which order, and things like that). Authors always prefer to see a test version of the final product. The problem -at least several times we tried- is the slowness of the test-portal in reflecting those changes after test IPT publications (even for metadata-only datasets). Thanks a lot in advance |
@abubelinha Thank you for the questions. This is a portal thing I think. |
Hello everyone. I found the discussion after realizing the issue of author names and order. My case is a set of checklist datasets that I am going to publish on behalf of a large group of authors as a series of chapters. I originated the idea and have now prepared a standard metadata description, which will be applied to all of the checklists (and modified in some cases).
If I delete myself from the metadata authorship it will also be wrong. I think IPT should provide a way to control it, using a simple checkbox near the metadata author part (ie "tick for inclusion to the author string" or so). What do you think? |
I agree. I think only the resource creators should be included as authors (in the order provided), as is done by the IPT. |
I am a bit reluctant about that direction. The original proposal was to have the IPT follow the logic of GBIF.org. The inclusion of metadata authors in the citation string had been discussed and decided in favor of, because metadata can often contribute substantially to the quality and usability of a dataset, and are not necessarily provided by the curators of the datasets themselves. Also, on a more procedural level, this would change citations for quite a number of datasets in GBIF.org without prior consultation or even information, which does not sound quite right. I would rather propose to
|
I am sorry - I overlooked that it is not possible in the IPT to not declare a metadata author, which makes this more tricky. So if I understand correctly, the situation we would like to reach is one where
At present
questions to check into:
|
Remember we are generating EML here, so we don't have complete flexibility. The metadata authorship becomes the metadataProvider element. See https://eml.ecoinformatics.org/schema/, specifically https://eml.ecoinformatics.org/schema/eml_xsd.html#eml_dataset and https://eml.ecoinformatics.org/schema/eml-resource_xsd.html#ResourceGroup_metadataProvider |
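To make the EML constraint concrete, here is a rough sketch of reading the `creator` and `metadataProvider` agents out of an EML `dataset` element with Python's standard library. The fragment is hypothetical and heavily trimmed (no namespaces, no required elements such as title), the people in it are invented, and the "surname plus initial" formatting is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed EML fragment; real EML has namespaces and more elements.
EML = """
<eml>
  <dataset>
    <creator>
      <individualName><givenName>Ada</givenName><surName>Example</surName></individualName>
    </creator>
    <metadataProvider>
      <individualName><givenName>Max</givenName><surName>Sample</surName></individualName>
    </metadataProvider>
  </dataset>
</eml>
"""


def agent_names(dataset, tag):
    """Return 'Surname Initial' for every agent element with the given tag."""
    names = []
    for agent in dataset.findall(tag):
        given = agent.findtext("individualName/givenName", default="").strip()
        sur = agent.findtext("individualName/surName", default="").strip()
        # An agent with a surname only yields just the surname.
        names.append(f"{sur} {given[:1]}".strip())
    return names


dataset = ET.fromstring(EML.strip()).find("dataset")
creators = agent_names(dataset, "creator")
providers = agent_names(dataset, "metadataProvider")
print(creators, providers)
```

Because the two roles are separate elements in the schema, any citation logic has to decide explicitly whether `metadataProvider` entries are appended to the `creator` list.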
I never understood the reasons for including the metadata author in the generated citation. I would strongly consider removing that. If that person wants or needs to be cited, I think they should also become a proper author/creator of the resource. Offering an option to include/exclude the metadata author would only add to the complexity, I think. It might also be worth mentioning that in ChecklistBank we have decided to follow yet another approach. Removing the metadata author from the citation would align better with CLB/COL. |
@mdoering Removing metadata authors would affect thousands of citations at GBIF.org. I think we should make metadata providers an optional section in the IPT basic metadata, since it's optional in EML. |
Yes. On the other hand it would mean that I am forced to not say who authored the metadata just to remove me from the citation. |
Yes. We do have to consider the situation today, however. Silently changing thousands of citations on GBIF.org can have major fallout, as nice as the change may be. This is not a quick fix to push through. |
I reviewed a few metadata records in the Environmental Data Initiative (EDI) repository and it does seem to me that the requirement for metadata provider that the IPT has enforced is unusual / not standard practice. In the few EML records I examined from EDI, none of them had metadata provider. |
@albenson-usgs I think this is actually quite revealing about attitudes towards metadata. When the citation-generating formula was rolled out (it now indeed affects thousands of datasets), the reasoning was that the GBIF.org view of a published dataset, which is the citable object in this case, is a product of data creation and reworks (hence front authorship) and of metadata creation and reworks (hence metadata authorship). We have enough trouble with poor metadata across so many infrastructures, so removing metadata authorship from the GBIF dataset equation would send us back to a metadata-careless stone age through metadata anonymity. I would be very protective of the second bullet here, and I understood @ahahn-gbif would be, too? Authorship is not only credit - it is responsibility, and this fully applies to metadata authorship. Please note that a dataset can be cited (at its endpoint location) differently from the GBIF.org displayed instance. |
@dschigel I fully understand the need to maintain and increase the quality of metadata in GBIF datasets, but the statement you mention ("Name(s) of the dataset’s metadata author(s) [to be included to the author string], if one is registered, but only if also an originating author is named") does not necessarily close the topic. We may imagine the following scenario:
So if you decide to add this option to the IPT, it will not affect the existing datasets, and no one should complain. But ones who care will have the option. And it may also be appreciated by many curators of the older datasets. |
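As a sketch of what such an option could look like, the rule quoted above (metadata authors are added to the author string only when an originating author is named) can be combined with an opt-out flag that defaults to the current behaviour. The function name, the flag `include_metadata_authors`, the empty result when no creator is named, and the de-duplication by exact name are all assumptions for illustration, not existing IPT or GBIF.org code.

```python
def citation_authors(creators, metadata_authors, include_metadata_authors=True):
    """Build the ordered author list for a generated citation.

    Hypothetical sketch: creators come first in the order given, then
    metadata authors not already listed. The flag defaults to True so
    existing datasets would keep their current author string.
    """
    if not creators:
        # Assumption: without an originating author, metadata authors
        # are not promoted to authors.
        return []
    authors = list(creators)
    if include_metadata_authors:
        for name in metadata_authors:
            if name not in authors:  # assumption: match on exact name string
                authors.append(name)
    return authors


print(citation_authors(["Example A", "Sample M"], ["Other K", "Sample M"]))
print(citation_authors(["Example A"], ["Other K"], include_metadata_authors=False))
```

With the default left at True, publishers who never touch the option see no change, which is the non-disruptive path argued for in this comment.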
Metadata editing is indeed important and should be acknowledged. So is data collection, managing, etc. This is why we (INBO) include all these people as creators of a dataset, so they are included in the (IPT) citation. The metadata editor field is superfluous for us, because we already include those people as creators. I’d rather have one list of people (contributors), that are all included as EML creators and who are all included as authors (cf. GBIF citation). That way names don’t have to be repeated too. The IPT could still offer to indicate roles for those contributors (e.g. contact). That can be expressed in EML by listing those people under a specific property for that role (cf. current implementation), but in addition to them being listed as creator. It also provides a way to migrate info in the IPT: make metadata editors, creators and contacts all contributors and remove duplicates. To acknowledge people (cf. acknowledgement in paper) that should not be included in the citation, use additional parties. |
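The migration described in this comment (make metadata editors, creators and contacts all contributors and remove duplicates) could be sketched as below. The function name and the role labels are hypothetical, the names are invented, and matching people by exact name string is a simplifying assumption; a real migration would need a more robust identity check (e.g. ORCID or email).

```python
def merge_contributors(creators, metadata_providers, contacts):
    """Merge three role lists into one contributor list without duplicates.

    Returns a dict mapping each person (first-seen order) to the list of
    roles they held. Hypothetical sketch; people are matched by name only.
    """
    merged = {}
    role_lists = (
        ("creator", creators),
        ("metadataProvider", metadata_providers),
        ("contact", contacts),
    )
    for role, people in role_lists:
        for name in people:
            merged.setdefault(name, []).append(role)
    return merged


contributors = merge_contributors(
    creators=["Ada Example", "Max Sample"],
    metadata_providers=["Max Sample"],
    contacts=["Ada Example", "Kim Other"],
)
for name, roles in contributors.items():
    print(name, roles)
```

Each person appears once, and their roles are kept as attributes rather than as repeated entries.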
@dschigel I believe there is an assumption here that requiring someone to identify themselves as the metadata provider makes them 1) create better metadata and 2) feel more responsibility for the dataset. I am dubious that either of those things is true. The issue is not whether or not it should be an option to include metadata provider, it's whether it should be required.
While I agree that having to repeat author information up to three times (contact, creator, metadata provider) is quite tedious (you can copy from resource contact but only for the first contact), I don't agree with having only one list and it's only the contributors. I know for some of the projects I help share data it is nice to know who processed the data to Darwin Core but those people don't want to be listed as authors (and it would make the data originators frustrated to see that person's name in the citation). It does seem to me that we need a more flexible way for IPT data managers to select and decide the authorship and order of authors in the citation for the IPT and on GBIF.org. |
... interestingly we have followed in ChecklistBank DataCite and CSL to list contributors with an optional note that can express how they contributed, but explicitly excluded them from being cited as authors. I really like the traditional way of separating authors, editors, the publisher (included in the citation string) and a flexible list of other contributors that are not part of the citation string. This way you can control who is part of the citation string, but still attribute others. I know some people prefer to cite each and everyone equally, but I don't think we should require such practices but instead leave this to the dataset publisher. |
Maybe it would be good then to have one list where all people are only listed once, but can be assigned multiple roles. Someone with |
- include metadata providers - allow agents with lastname only - punctuation
Further discussion here #1917 |
The IPT citation does not reflect what the GBIF.org system does. This is highly confusing, and GBIF's publishing tool needs to be fully consistent and intuitive.
I am not sure which is incorrect.