Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contrib: dojson drop unknown fields #51

Closed
greut opened this issue Nov 25, 2015 · 6 comments · Fixed by #58
Closed

contrib: dojson drop unknown fields #51

greut opened this issue Nov 25, 2015 · 6 comments · Fixed by #58
Assignees
Milestone

Comments

@greut
Copy link
Member

greut commented Nov 25, 2015

In #50, a test fails because a subfield is not part of the spec.

the faulty record

<record>
    <datafield tag="852" ind1="8" ind2=" ">
        <subfield code="i">M 314</subfield>
        <subfield code="h">339</subfield>
        <subfield code="b">Library Reading Room</subfield>
        <subfield code="9">10</subfield>
    </datafield>
</record>

the test

expected = create_record(xml)
result = to_marc21.do(marc21.do(expected))
assert expected == result

The spec says nothing about a $9 subfield. But should it remove it nonetheless?

@jirikuncar jirikuncar changed the title dojson drop unknown fields contrib: dojson drop unknown fields Nov 25, 2015
@jirikuncar jirikuncar added this to the 0.5.0 milestone Nov 25, 2015
@jirikuncar
Copy link
Member

@greut you probably want "very liberal MARC" see #26

@jirikuncar jirikuncar modified the milestones: someday, 0.5.0 Nov 25, 2015
@egabancho
Copy link
Member

@greut I don't think we should remove unknown fields, actually this means that the translation for your data model is incomplete, therefore your tests should fail.

I think the question here is whether or not the liberal MARC21 implementation should add to the output JSON all the unknown fields. IMHO it should.

Just a side note, AFAIK the subfield $9 is used as a custom subfield, like 9xx or 69x, so in theory the definition of this subfield should go in your DoJSON package. We have a couple of cases like this now on CDS, i.e. https://github.com/CERNDocumentServer/cds-dojson/blob/master/cds_dojson/marc21/fields/default/bd7xx.py#L85

@greut
Copy link
Member Author

greut commented Dec 8, 2015

Thanks @egabancho, lemme invoke the MARC lord @blixhavn. The silent removal looks like a way to be in trouble.

@tiborsimko
Copy link
Member

Similarly to @egabancho I think the test should fail in this case (and not drop the field silently), because the input record is not compatible with the desired schema.

There are basically two solutions: either (1) stick to standard MARC that does not seem to allow $9 in that field(?), or (2) amend wanted schema to allow the presence of $9, going towards as "liberal" MARC schema as the given library instance needs.

@aw-bib can advise about whether $9 has any specific convention meaning across multiple fields in the MARC standard?

@greut
Copy link
Member Author

greut commented Dec 17, 2015

@tiborsimko how easy could be to amend/edit the schema for a particular purposes. E.g. the 9xx fields that are used internally by invenio.

If you provide a liberal MARC, people will use this one because loosing data is bit of a no-go.

@tiborsimko
Copy link
Member

The best example for 9xx so far is @egabancho's customisations for CERN-specific 9xx fields. I fully agree with you that loosing data is not acceptable, hence we should return an error, not silently ignore those "extra" fields. The error will hopefully prompt the site admins to amend the local schema rules to fit local cataloguing practices... or else switch to a liberal schema iike #26 that is very permissive on the input side. (At the price of loosing some precise JSON-Schema-generated editing forms.) Many sites have been calling for more permissive schema, see long discussion in #23, so perhaps we should get to implementing it sooner rather than later?

jirikuncar added a commit to jirikuncar/dojson that referenced this issue Dec 17, 2015
* NEW Adds `ignore_missing` option to `Overdo.do` method to specify if
  method should raise when there is no matching rule for any key.
  (closes inveniosoftware#51)

Signed-off-by: Jiri Kuncar <jiri.kuncar@cern.ch>
jirikuncar added a commit to jirikuncar/dojson that referenced this issue Dec 18, 2015
* NEW Adds `ignore_missing` option to `Overdo.do` method to specify if
  method should raise `MissingRule` exception when there is no matching
  rule for any key.  (closes inveniosoftware#51)

Signed-off-by: Jiri Kuncar <jiri.kuncar@cern.ch>
jirikuncar added a commit to jirikuncar/dojson that referenced this issue Dec 18, 2015
* NEW Adds new keyword argument `ignore_missing` to `Overdo.do` method
  to specify if method should raise `MissingRule` exception when there
  is no matching rule for any key.  On can change the behavior of `do`
  command too by using `--strict` option that sets the `ignore_missing`
  argument to `False`.  (closes inveniosoftware#51)

Reviewed-by: Tibor Simko <tibor.simko@cern.ch>
Signed-off-by: Jiri Kuncar <jiri.kuncar@cern.ch>
jirikuncar added a commit to jirikuncar/dojson that referenced this issue Dec 18, 2015
* NEW Adds new keyword argument `ignore_missing` to `Overdo.do` method
  to specify if method should raise `MissingRule` exception when there
  is no matching rule for a key.  One can change the behavior of `do`
  command too by using `--strict` option that sets the `ignore_missing`
  argument to `False`.  (closes inveniosoftware#51)

Reviewed-by: Tibor Simko <tibor.simko@cern.ch>
Signed-off-by: Jiri Kuncar <jiri.kuncar@cern.ch>
jirikuncar added a commit to jirikuncar/dojson that referenced this issue Dec 18, 2015
* NEW Adds new keyword argument `ignore_missing` to `Overdo.do` method
  to specify if method should raise `MissingRule` exception when there
  is no matching rule for a key.

* NEW Adds new CLI option `--strict` to the `do` command that sets the
  `ignore_missing` argument to `False`.  (closes inveniosoftware#51)

Reviewed-by: Tibor Simko <tibor.simko@cern.ch>
Signed-off-by: Jiri Kuncar <jiri.kuncar@cern.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants