Source data validation issues #283

Open
1 of 5 tasks
strogonoff opened this issue Aug 31, 2022 · 5 comments
Labels
bug Something isn't working

@strogonoff
Collaborator

strogonoff commented Aug 31, 2022

Describe the issue

Stefano outlined discrepancies between data sources and relaton-py expectations here.

The discrepancies arise because either (1) previously invalid YAML was updated to match the Relaton RNC/LutaML data specification, while relaton-py had come to rely on the invalid YAML, or (2) the data specification was updated and the YAML was updated accordingly, but relaton-py was not.

These discrepancies lead to issues like: #277, #250, #281.

Fronts on which they could be attacked:

  • Normalizing abnormal source data in BibXML service (fixes partly implemented in Normalize some source data variations #282)
  • Making it so that abnormal source data doesn’t cause issues (not always realistic, but exploring options)
  • Pinning Relaton gems by their minor version part in GHAs (and relying on gem maintainers to avoid introducing breaking data format changes within the same minor version; I was informed that’s how it already works)
  • Updating the Relaton specification to support versioning (medium term; we shouldn’t do it in a rush)
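
To illustrate the first option in the list above, here is a minimal sketch of normalizing source data before it reaches validation. It is not the code from #282 or from relaton-py. The input shapes are assumptions for illustration only: a string field arriving wrapped as `{"content": ...}`, and `keyword` arriving as a single value instead of a list. The helper names `unwrap`, `as_list` and `normalize_item` are hypothetical.

```python
from copy import deepcopy


def unwrap(value):
    """Return the inner string of a {'content': ...} wrapper, else the value unchanged."""
    if isinstance(value, dict) and "content" in value:
        return value["content"]
    return value


def as_list(value):
    """Pass lists through, wrap a single value in a list, map None to an empty list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def normalize_item(data):
    """Return a normalized copy of one item; the input dict is left untouched.

    The field shapes handled here are assumed, not taken from the Relaton spec.
    """
    item = deepcopy(data)
    if "keyword" in item:
        # Assumed variation: a single keyword, or keywords wrapped in dicts.
        item["keyword"] = [unwrap(k) for k in as_list(item["keyword"])]
    for contributor in as_list(item.get("contributor")):
        if not isinstance(contributor, dict):
            continue
        org = contributor.get("organization")
        if isinstance(org, dict) and "abbreviation" in org:
            # Assumed variation: abbreviation given as a dict instead of a string.
            org["abbreviation"] = unwrap(org["abbreviation"])
    return item
```

Working on a deep copy keeps the original source data available for error reports, at the cost of one extra copy per item.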

@strogonoff strogonoff added the bug Something isn't working label Aug 31, 2022
@strogonoff
Collaborator Author

strogonoff commented Aug 31, 2022

As of now, the discrepancies are:

contributor -> 0 -> organization -> abbreviation
  str type expected (type=type_error.str) # fixed
contributor -> 0 -> organization -> abbreviation
  str type expected (type=type_error.str) # fixed
keyword
  str type expected (type=type_error.str) # fixed
keyword -> 0
  str type expected (type=type_error.str) # fixed
copyright -> 0 -> owner -> 0 -> name -> 0
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 0 -> name
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 0 -> abbreviation
  str type expected (type=type_error.str) # fixed
copyright -> from
  field required (type=value_error.missing)
copyright -> owner
  value is not a valid list (type=type_error.list)
contributor -> 0 -> person -> name -> given -> forename -> 0
  Forename.__init__() missing 1 required positional argument: 'content' (type=type_error)
contributor -> 0 -> person -> name -> given -> forename -> content
  str type expected (type=type_error.str)
contributor -> 0 -> person -> name -> given -> forename -> 0
  Forename.__init__() missing 1 required positional argument: 'content' (type=type_error)
contributor -> 0 -> person -> name -> given -> forename -> content
  str type expected (type=type_error.str)
contributor -> 8 -> organization -> abbreviation
  str type expected (type=type_error.str)
contributor -> 0 -> person -> name -> given -> forename -> 0
  Forename.__init__() missing 1 required positional argument: 'content' (type=type_error)
contributor -> 0 -> person -> name -> given -> forename -> content
  str type expected (type=type_error.str)
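
Several entries in the list above differ only by list index. As a rough aid for seeing which fields are affected, here is a small sketch that drops the indices and counts the remaining paths. It assumes each location is available as a tuple of field names and integer indices (pydantic reports locations this way under `loc` in `ValidationError.errors()`). The sample tuples are copied by hand from a few of the entries above, and `field_path` is a hypothetical helper name.

```python
from collections import Counter


def field_path(loc):
    """Join the field names of an error location, skipping integer list indices."""
    return " -> ".join(part for part in loc if not isinstance(part, int))


# A hand-copied subset of the locations listed above.
locations = [
    ("contributor", 0, "organization", "abbreviation"),
    ("contributor", 8, "organization", "abbreviation"),
    ("copyright", 0, "owner", 0, "name", 0),
    ("copyright", 0, "owner", 0, "name"),
    ("copyright", "from"),
    ("copyright", "owner"),
]

# Count how many reported errors map to each index-free field path.
summary = Counter(field_path(loc) for loc in locations)
for path, count in sorted(summary.items()):
    print(f"{path}: {count}")
```

Note that `name -> 0` and `name` collapse into the same path here, which is intended: both point at the same field being given in an unexpected shape.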

@alicerusso

Adding some IEEE examples of this issue in practice (presumably this is #283, although I don't see "Could not export" mentioned above). Got the following errors upon clicking "bibxml" on:

Could not export this item, the error was: Source data for item IEEE 1003.1-2017 (IEEE) didn’t validate (err: 7 validation errors for BibliographicItem
copyright -> 0 -> owner -> 0 -> name -> 0
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 0 -> name
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 0 -> abbreviation
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 1 -> name -> 0
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 1 -> name
  str type expected (type=type_error.str)
copyright -> from
  value is not a valid integer (type=type_error.integer)
copyright -> owner
  field required (type=value_error.missing))

the last one:

Could not export this item, the error was: Source data for item IEEE 802.1Q-2014 (IEEE) didn’t validate (err: 5 validation errors for BibliographicItem
copyright -> 0 -> owner -> 0 -> name -> 0
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 0 -> name
  str type expected (type=type_error.str)
copyright -> 0 -> owner -> 0 -> abbreviation
  str type expected (type=type_error.str)
copyright -> from
  value is not a valid integer (type=type_error.integer)
copyright -> owner
  field required (type=value_error.missing))

@ronaldtse
Collaborator

@ajeanmahoney
Collaborator

Hi, @ronaldtse, is there an update for this issue? I see that the dependencies relaton/relaton#108 and relaton/relaton-ieee#32 are closed, and the reported discrepancies (#277, #250, #281) are also closed. Thanks!

@rjsparks
Member

rjsparks commented Feb 1, 2023

@ronaldtse is there anything left to do here?

6 participants