Missing datasetName in download #270

Closed
niconoe opened this issue Oct 28, 2021 · 6 comments

@niconoe

niconoe commented Oct 28, 2021

Hello,

I've recently encountered cases where a DwC-A download from the GBIF portal has an empty datasetName field.

Example: in the following download, the record with gbifId=297835694 has an empty datasetName in occurrences.txt, while the occurrence page shows a proper dataset name (Pl@ntNet automatically identified occurrences).

In that case, it looks like it might be caused by the special "@" character in the datasetName, but I've also encountered it with seemingly less exotic dataset names (for example, for this occurrence).

@niconoe
Author

niconoe commented Dec 17, 2021

Hello GBIF team, do you already know if this is an issue you plan to tackle in the next few weeks/months?

Otherwise I'll work on my (data consumer) side to avoid the issue, for example by making API calls to retrieve the missing dataset names based on the datasetKey.

Thanks a lot!
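
A minimal sketch of that consumer-side workaround, under a few assumptions: rows are already parsed into dicts with `datasetKey` and `datasetName` keys, and the dataset title is taken from the `title` field returned by the public GBIF API at `https://api.gbif.org/v1/dataset/{key}`. The function names are hypothetical, and this is not necessarily how GBIF Alert does it.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterable, List

# Assumption: the registry title is exposed as "title" on this endpoint.
GBIF_DATASET_URL = "https://api.gbif.org/v1/dataset/{key}"


def fetch_dataset_title(dataset_key: str) -> str:
    """Look up a dataset title through the GBIF API (performs a network call)."""
    with urllib.request.urlopen(GBIF_DATASET_URL.format(key=dataset_key)) as resp:
        return json.load(resp)["title"]


def fill_missing_dataset_names(
    rows: Iterable[Dict[str, str]],
    fetch_title: Callable[[str], str] = fetch_dataset_title,
) -> List[Dict[str, str]]:
    """Return copies of rows where a blank datasetName is replaced by a looked-up title."""
    cache: Dict[str, str] = {}  # one lookup per datasetKey, not per row
    filled = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left untouched
        if not (row.get("datasetName") or "").strip():
            key = row["datasetKey"]
            if key not in cache:
                cache[key] = fetch_title(key)
            row["datasetName"] = cache[key]
        filled.append(row)
    return filled
```

Passing the lookup as a parameter keeps the network call replaceable, which makes the function easy to try offline with a stub.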

@MattBlissett
Member

@fmendezh, there aren't any empty datasetName values in Hive. Could this be an ES issue? It's a small download.

@ManonGros

Could this issue be related: gbif/portal-feedback#3814?

@marcos-lg
Contributor

Fixed in PROD.

@niconoe
Author

niconoe commented Sep 4, 2023

Unfortunately, the issue doesn't seem fully solved (I recently removed my workaround in GBIF Alert and the issue came back).

This download, for example, has 27141 blank dataset names.

@MattBlissett MattBlissett reopened this Sep 4, 2023
@marcos-lg
Contributor

I took another look at this issue and realized that the datasetName field in the download is the Darwin Core term (https://dwc.tdwg.org/terms/#dwc:datasetName), not the title of the dataset in our registry, which is what the occurrence page displays.
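
If that reading is right, blank values would reflect datasets whose records do not populate dwc:datasetName rather than a bug in the download itself. One way to check a given download is to tally blank values per datasetKey. The sketch below assumes the occurrence file is tab-delimited with a header row containing `datasetKey` and `datasetName` columns; the function name is hypothetical.

```python
import csv
import io
from collections import Counter


def count_blank_dataset_names(tsv_text: str) -> Counter:
    """Count rows with a blank datasetName, grouped by datasetKey."""
    # QUOTE_NONE: treat quote characters in the data as ordinary text.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    blanks: Counter = Counter()
    for row in reader:
        if not (row.get("datasetName") or "").strip():
            blanks[row.get("datasetKey") or ""] += 1
    return blanks
```

A dataset key that shows up here with every one of its records blank would be consistent with the explanation above, while a dataset with only some blank records would suggest something else is going on.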
