oai_datacite renders invalid datacite (kernel-3) #35

Closed
cessda-bitbucket-importer opened this issue Nov 25, 2022 · 4 comments
Labels: bug, major

Comments

@cessda-bitbucket-importer
Contributor

Original report on BitBucket by Toni Sissala (GitHub: toni-sissala).


Use of xml:lang in multiple elements:

  • Attribute 'xml:lang' is not allowed to appear in element 'creator'.
  • Element 'creatorName' is a simple type, so it cannot have attributes, excepting those whose namespace name is identical to 'http://www.w3.org/2001/XMLSchema-instance' and whose [local name] is one of 'type', 'nil', 'schemaLocation' or 'noNamespaceSchemaLocation'. However, the attribute, 'xml:lang' was found.
  • Element 'publisher' is a simple type, so it cannot have attributes, excepting those whose namespace name is identical to 'http://www.w3.org/2001/XMLSchema-instance' and whose [local name] is one of 'type', 'nil', 'schemaLocation' or 'noNamespaceSchemaLocation'. However, the attribute, 'xml:lang' was found.
  • Element 'contributorName' is a simple type, so it cannot have attributes, excepting those whose namespace name is identical to 'http://www.w3.org/2001/XMLSchema-instance' and whose [local name] is one of 'type', 'nil', 'schemaLocation' or 'noNamespaceSchemaLocation'. However, the attribute, 'xml:lang' was found.
  • Attribute 'xml:lang' is not allowed to appear in element 'nameIdentifier'.
  • Attribute 'xml:lang' is not allowed to appear in element 'date'.
  • Attribute 'xml:lang' is not allowed to appear in element 'rights'.

Invalid element/structure:

@cessda-bitbucket-importer
Contributor Author

Original comment by Toni Sissala (GitHub: toni-sissala).


Using Datacite schema v4 would allow

  • creatorName to have xml:lang
  • publisher to have xml:lang
  • contributorName to have xml:lang
  • nameIdentifier to have xml:lang
  • rights to have xml:lang

However, the OpenAIRE guidelines state: “OpenAIRE has adopted the DataCite Metadata Schema v3.1 with some minor adjustments.” (https://guidelines.openaire.eu/en/latest/data/use_of_datacite.html). The minor adjustments are listed there, but they do not include allowing xml:lang attributes on the elements listed in the issue description.

To target OpenAIRE, we’re stuck with schema v3 and the xml:lang issues need to be fixed.
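One way to picture the xml:lang fix is a post-processing pass over a rendered record. The sketch below is not the Kuha2 template change itself; it is a minimal illustration using Python's standard library. The helper name `strip_disallowed_lang` and the sample record are hypothetical, and the element list is taken only from the validation errors reported in this issue.

```python
import xml.etree.ElementTree as ET

# DataCite kernel-3 namespace and the expanded name of the xml:lang attribute.
DATACITE_NS = "http://datacite.org/schema/kernel-3"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Elements that the validator in this issue reported as rejecting xml:lang.
NO_LANG_TAGS = {
    f"{{{DATACITE_NS}}}{name}"
    for name in (
        "creator", "creatorName", "publisher", "contributorName",
        "nameIdentifier", "date", "rights",
    )
}


def strip_disallowed_lang(root):
    """Remove xml:lang from the listed elements; return how many were removed."""
    removed = 0
    for element in root.iter():
        if element.tag in NO_LANG_TAGS and XML_LANG in element.attrib:
            del element.attrib[XML_LANG]
            removed += 1
    return removed


# Hypothetical sample record, for illustration only.
SAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-3">
  <creators>
    <creator xml:lang="en"><creatorName xml:lang="en">Doe, Jane</creatorName></creator>
  </creators>
  <titles><title xml:lang="en">Example study</title></titles>
  <publisher xml:lang="en">Example Archive</publisher>
</resource>"""

root = ET.fromstring(SAMPLE)
removed = strip_disallowed_lang(root)
title = root.find(f".//{{{DATACITE_NS}}}title")
```

In this sample the attribute is dropped from creator, creatorName and publisher, while title is left untouched because it is not among the elements flagged in the report.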

Also, the geoLocationPlace element needs to be fixed: it is invalid as rendered and must be wrapped inside a geoLocation element.
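The wrapping described above can be sketched as a small tree transformation. The issue does not show where the bare element was rendered, so this example assumes it sat directly under a geoLocations container; the helper name `wrap_geolocation_places` and the sample data are hypothetical, and the real fix was made in the Kuha2 template rather than in code like this.

```python
import xml.etree.ElementTree as ET

DATACITE_NS = "http://datacite.org/schema/kernel-3"


def q(name):
    """Return the namespace-qualified tag for a DataCite kernel-3 element."""
    return f"{{{DATACITE_NS}}}{name}"


def wrap_geolocation_places(root):
    """Wrap each geoLocationPlace found directly under geoLocations
    in its own geoLocation element; return the number of wrapped elements."""
    wrapped = 0
    # Materialise the iterator first because the tree is modified below.
    for container in list(root.iter(q("geoLocations"))):
        for index, child in enumerate(list(container)):
            if child.tag == q("geoLocationPlace"):
                wrapper = ET.Element(q("geoLocation"))
                container.remove(child)
                wrapper.append(child)
                container.insert(index, wrapper)
                wrapped += 1
    return wrapped


# Hypothetical sample: one bare place and one that is already wrapped.
SAMPLE = (
    '<resource xmlns="http://datacite.org/schema/kernel-3"><geoLocations>'
    "<geoLocationPlace>Finland</geoLocationPlace>"
    "<geoLocation><geoLocationPlace>Norway</geoLocationPlace></geoLocation>"
    "</geoLocations></resource>"
)

root = ET.fromstring(SAMPLE)
wrapped = wrap_geolocation_places(root)
places = root.findall(f".//{q('geoLocation')}/{q('geoLocationPlace')}")
```

Already wrapped places are left alone, so running the function twice does not nest geoLocation elements.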

@cessda-bitbucket-importer
Contributor Author

Original comment by Toni Sissala (GitHub: toni-sissala).


Add hard-coded resourceType to 'oai_datacite' metadata

Add hard-coded resourceType to 'oai_datacite' serialization, which
always has the value 'Dataset'.

This implementation requires the aggregator OAI-PMH Repo Handler to
overwrite the template for 'oai_datacite' in Kuha2. Therefore the test
coverage that has been built for Kuha2 must also be included and
adapted for this package.

Kuha2 oai_datacite template was previously fixed to not contain
invalid xml:lang attributes and to wrap geoLocationPlace inside a
geoLocation element. As this commit copies the original template from
Kuha2, the issues with invalid Datacite v3 are also fixed.

Bump version to 0.4.0.

Write unreleased changelog entry for 0.4.0.

Fixes #33 at Bitbucket.
Fixes #35 at Bitbucket.
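For illustration, the hard-coded resourceType described in this commit message could look like the following when expressed as a tree operation. This is a sketch only: the actual change lives in the overridden 'oai_datacite' template, which is not shown in this thread. The helper name `set_dataset_resource_type` is hypothetical, and setting the resourceTypeGeneral attribute to 'Dataset' as well is an assumption, since the message only says the value is always 'Dataset'.

```python
import xml.etree.ElementTree as ET

DATACITE_NS = "http://datacite.org/schema/kernel-3"
RESOURCE_TYPE = f"{{{DATACITE_NS}}}resourceType"


def set_dataset_resource_type(resource):
    """Ensure the record has a single resourceType with the fixed value 'Dataset'."""
    element = resource.find(RESOURCE_TYPE)
    if element is None:
        element = ET.SubElement(resource, RESOURCE_TYPE)
    # Assumption: both the attribute and the text carry the constant value.
    element.set("resourceTypeGeneral", "Dataset")
    element.text = "Dataset"
    return element


# Hypothetical minimal record without a resourceType.
record = ET.fromstring(
    '<resource xmlns="http://datacite.org/schema/kernel-3">'
    "<publisher>Example Archive</publisher></resource>"
)
resource_type = set_dataset_resource_type(record)
```

Because the helper reuses an existing resourceType element when one is present, calling it more than once does not create duplicates.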

@cessda-bitbucket-importer
Contributor Author

Original comment by Matthew Morris (GitHub: matthew-morris-cessda).


Merged in feature/33-include-resourcetype-in-oai-datacite (pull request #51)

Add hard-coded resourceType to 'oai_datacite' metadata

  • Add hard-coded resourceType to 'oai_datacite' metadata

Add hard-coded resourceType to 'oai_datacite' serialization, which
always has the value 'Dataset'.

This implementation requires the aggregator OAI-PMH Repo Handler to
overwrite the template for 'oai_datacite' in Kuha2. Therefore the test
coverage that has been built for Kuha2 must also be included and
adapted for this package.

Kuha2 oai_datacite template was previously fixed to not contain
invalid xml:lang attributes and to wrap geoLocationPlace inside a
geoLocation element. As this commit copies the original template from
Kuha2, the issues with invalid Datacite v3 are also fixed.

Bump version to 0.4.0.

Write unreleased changelog entry for 0.4.0.

Fixes #33 at Bitbucket.
Fixes #35 at Bitbucket.

  • Fix build issues caused by missing UID 1000 in the python build image

@cessda-bitbucket-importer
Contributor Author

Original comment by Toni Sissala (GitHub: toni-sissala).


Update version. This was not released as a patch release, but was bundled together with features that led to a minor release.
