Add classification for degree_type field #1894
Signed-off-by: Samuele Kaplun <samuele.kaplun@cern.ch>
Signed-off-by: Spiros Delviniotis <spyridon.delviniotis@cern.ch>
I would like to know how we should match the data from the legacy system with the new data defined in the schema.
This is the list of values and their number of occurrences on the legacy system:
and this is the list of the new values in the schema for
and this is the actual mapping that we are doing in
this is the mapping that I would like to implement:
almost, thesis -> other, not everybody is Italian :)
You can actually simplify your mapping by simply performing the lower-case transformation first (Invenio is case-insensitive).
Thanks! Do you think it is necessary to keep this field: https://github.com/inspirehep/inspire-next/pull/1894/files#diff-338f51def51c1adf539431d8baf989f6L466
IMHO nope, because with the above mapping you exhaust the above-mentioned list of occurrences on legacy.
@@ -653,6 +661,8 @@ def test_advisors_from_701__a_g_i():
    ]
    result = hepnames.do(create_record(snippet))

    assert jsonschema_validate(result['advisors'], subschema,
                               resolver=LocalRefResolver('', {})) is None
This is not the correct way to validate the schema. We have already discussed this (https://github.com/inspirehep/inspire-next/pull/1896/files/10db08b2d25be331fe27b2ef7d18e63651cb1207#diff-8d2cf8d412e39817c6d484261eed7b89).
During the merge phase it has to be fixed in the "right way".
I was trying to understand how we convert 701 from schema to MARC and I found this: https://github.com/inspirehep/inspire-next/blob/master/inspirehep/dojson/hepnames/fields/bd1xx.py#L483
Indeed you should look at
    result = hepnames2marc.do(snippet)

    assert expected == result
I changed the test in this way for 2 reasons:
- It is more readable
- In order to test all the 701 fields
Let me know if there was a specific reason for writing the test with the previous approach. If there was not, I think this way is better.
Ok, but you're based on an old branch: that test was already deleted in master.
💩
Signed-off-by: Riccardo Candido <riccardo.candido@gmail.com>
* Changes tests according to the new degree_types
* Adds schema validation

Signed-off-by: Riccardo Candido <riccardo.candido@gmail.com>
Why do you want to get rid of laurea? It's more specific than thesis I think.
- Annette
On Jan 30, 2017, at 5:04 PM, Samuele Kaplun <notifications@github.com> wrote:
You can actually simplify your mapping by simply performing the lower-case transformation first (Invenio is case-insensitive).
phd -> phd
bachelor -> bachelor
ug -> bachelor
habilitation -> habilitation
thesis -> thesis
diploma -> diploma
mas -> master
laurea -> thesis
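
The mapping quoted above can be sketched in Python as a lower-case-first lookup. This is only a minimal illustration, not the code from this PR: the names `LEGACY_TO_DEGREE_TYPE` and `classify_degree_type` are hypothetical, and the table copies the quoted rows as they stand, even though the thread questions how `thesis` and `laurea` should be mapped.

```python
# Hypothetical sketch: names and structure are assumptions, not the PR's code.
# The table reproduces the mapping quoted in the thread, keyed by lower-case value.
LEGACY_TO_DEGREE_TYPE = {
    'phd': 'phd',
    'bachelor': 'bachelor',
    'ug': 'bachelor',
    'habilitation': 'habilitation',
    'thesis': 'thesis',
    'diploma': 'diploma',
    'mas': 'master',
    'laurea': 'thesis',
}


def classify_degree_type(legacy_value, default=None):
    """Map a legacy degree value to a schema value, ignoring case.

    Lower-casing first means that 'PhD', 'PHD' and 'phd' share one table row.
    Values missing from the table fall back to ``default``.
    """
    if legacy_value is None:
        return default
    key = legacy_value.strip().lower()
    return LEGACY_TO_DEGREE_TYPE.get(key, default)
```

Normalizing the case before the lookup keeps the table to one row per legacy spelling, which is the simplification suggested in the quoted comment.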
It just says that's Italian. But it doesn't say if it's master or bachelor (as much as thesis doesn't say it).
still, it's some data that we have, and that we decided to keep in the enum, so it should be mapped as
Right. Didn't remember we actually have it in the
As pointed out in inspirehep/inspire-schemas#80
BTW, I am confused: @spirosdelviniotis have you taken over this branch?
@kaplun I am working on top of master branch.
@spirosdelviniotis I guess not. Was just asking to fully understand.
Adds a classification for the degree_type field according to the schema: https://github.com/inspirehep/inspire-schemas/blob/master/inspire_schemas/records/elements/degree_type.json