Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: update/add tests for inspire-schemas v4.0 #1903

Merged

Conversation

spirosdelviniotis
Copy link
Contributor

Signed-off-by: Spiros Delviniotis spyridon.delviniotis@cern.ch

result = hep.do(create_record(snippet))

assert validate(result['isbns'], subschema) is None
assert expected == result['isbns'][0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert expected == result['isbns']

and above

expected = [
    {
        ...
    },
]

@@ -44,9 +49,20 @@ def test_doi_but_should_be_hdl_from_0247_a():
}]
result = hep.do(create_record(snippet))

assert validate(result['persistent_identifiers'], subschema) is None
assert result.get('persistent_identifiers', []) == expected
assert result.get('dois', []) == []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert expected == result['persistent_identifiers']

assert expected == result['020']


def test_isbns_from_020():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_isbns_from_020__a_b_normalizes_print

'<datafield tag="020" ind1=" " ind2=" ">'
' <subfield code="a">9780198759713</subfield>'
'</datafield>'
) # record/1510325/export/xm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/export/xm is implied by the fact that this is MARCXML, so it can be dropped here.

assert expected == result['773']


def test_isbns_from_020_a_only():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_isbns_from_020__a

@jacquerie
Copy link
Contributor

Will be merged in #1899 when the comments are addressed.

@spirosdelviniotis spirosdelviniotis force-pushed the inspire_next_move_to_schemas_v4 branch 4 times, most recently from b3deef2 to ea91f8e Compare February 2, 2017 15:17
]
result = hep2marc.do(result)

assert expected == result['245']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaplun @jacquerie Do we need this case?
As I realized, we don't have dojson rule from json to MARCXML for 247 field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On production we only have 6(!) records with 247. @michamos I think this are actually outliers and they should have used 246, right? (In the TWiki it says deprecated but to me looks like the other way round).

{
'c': ['45'],
'p': 'IAU Symp.',
'0': 1408366,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be producing guys like these, because they come from the xme format, while here we are producing the xm format. This is a common problem in the MARCXML -> JSON rules as they are currently written.

CC: @kaplun

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacquerie : you are right. In general we should not export back these records IDs.

Copy link
Contributor

@michamos michamos Feb 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Result of the discussion with @kaplun: we should write in MARCXML all IDs that exist in the XM format (those not marked XME only in https://twiki.cern.ch/twiki/bin/view/Inspire/DevelopmentRecordMarkup )

'keyword': 'programming: Monte Carlo',
'classification_scheme': 'INSPIRE',
},
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Watch out! Because you are not wrapping the two <datafield> tags with a <record> tag create_record only outputs the first.

@jacquerie
Copy link
Contributor

These tests need to be split following what was done in #1899. @michamos, can you check them?

@spirosdelviniotis
Copy link
Contributor Author

Just re based !

@@ -136,3 +136,37 @@ def test_titles_from_245__a_b():
result = hep2marc.do(result)

assert expected == result['245']


def test_titles_from_245_and_247__a_9():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only had 6 records in HEP with 247 content, which I all changed to 247, so this test can be removed.

]
result = hep2marc.do(result)

assert expected == result['245']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add a test for the case where the two titles in 245 and 246 are different, e.g. http://inspirehep.net/record/36035. In that case, the 245 goes to the first element of titles, the 246s go next.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, forget about this, just wrap everything with more than one <datafield> into a <record> and all titles should be preserved.

@@ -170,3 +170,68 @@ def test_titles_from_245_and_247__a_9():
result = hep2marc.do(result)

assert expected == result['245']


def test_title_translations_from_242__a():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is wrong: if there is a 242, 245 goes to translated_titles with the right language, whereas 242 goes to titles.

assert expected == result['242']


def test_title_translations_from_242__a_b():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment.

'</datafield>'
) # record/26564

expected = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't discard report numbers! report_numbers is a list for a good reason: so that we can place several values inside 😉
You should put both in expected and make sure they round-trip.

'</datafield>'
) # record/363605

expected = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, you are discarding half of the keywords.

@spirosdelviniotis spirosdelviniotis force-pushed the inspire_next_move_to_schemas_v4 branch 3 times, most recently from 9bc6fc2 to 18feab5 Compare February 6, 2017 13:48
@spirosdelviniotis spirosdelviniotis force-pushed the inspire_next_move_to_schemas_v4 branch 2 times, most recently from 0111ac7 to b2c0a91 Compare February 6, 2017 15:39
Signed-off-by: Spiros Delviniotis <spyridon.delviniotis@cern.ch>
Signed-off-by: Spiros Delviniotis <spyridon.delviniotis@cern.ch>
Signed-off-by: Spiros Delviniotis <spyridon.delviniotis@cern.ch>
Signed-off-by: Spiros Delviniotis <spyridon.delviniotis@cern.ch>
@michamos
Copy link
Contributor

michamos commented Feb 6, 2017

@jacquerie @kaplun LGTM

@kaplun kaplun merged commit ca54db1 into inspirehep:master Feb 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants