jsonschema: pumped up references #1279

mihaibivol · 2016-06-30T13:05:24Z

Provide a `hep`-like reference field inside `hep` references

Closes #430

Add new reference schema (similar to hep)
Fix mapping - it was fixed already
Amend dojson to work with this new schema
~~Amend hepcrawl to work with this new schema~~ Moved to next bullet
~~Amend holdingpen to save this new schema~~ Refs not used in holdingpen
Amend HEP Detail View to work with this new schema
~~Create follow-up issue for merging good ideas into the hep schema~~

Follow-up + extra findings

Parse author names better: "references: try authors string split" #1344
Figure out the fate of the dois and isbns fields in the hep schema: "RFC: isbns and dois in Literature schema" #1349
Better rendering of reference datatables (we now curate external ids so we can link to them): "record detail: link to external reference fall-back" #1346
Fix error in double harvesting: "Article Workflow: UnmappedInstanceError on delete_self_and_stop_processing task" #1341
Fix some other errors in harvesting (won't make an issue)
Bring references back -- not yet an issue

mihaibivol · 2016-07-01T07:57:00Z

inspirehep/modules/records/jsonschemas/records/elements/reference.json

+            "type": "array",
+            "title": "Document type"
+        },
+        "urls": {


mihaibivol · 2016-07-01T08:36:10Z

We are missing

999C5m (misc)
999C5[12] - won't need them
999C5e (might merge them in authors + put some role in there) @kaplun maybe we want this to be compatible with the first draft in unifying people
hdl

We have extra fields with no MARC counterpart:

imprint -> may come from publishers
book_series -> may come from publishers
arxiv_eprints - @kaplun where to populate these from (at least in the MARC case)
persistent_identifiers (we can keep them though, even if they don't actually have a source in the schema) @kaplun do we add source?

mihaibivol · 2016-07-01T11:36:37Z

We are missing

999C5[12] - won't need them

We added back

999C5m - misc will be here for a while
999C5e - merged into authors + role 'ed.'
hdl ids will go into persistent identifiers
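
The merge described above (999C5e folded into authors with role 'ed.', 999C5m kept as misc) can be sketched roughly as follows. This is only an illustration, not the code from the PR: the function names, the input shape (a dict mapping a subfield code to a list of strings) and storing `misc` as a list are all assumptions. The `hdl` handling is left out because the thread does not say which subfield carries it.

```python
def build_reference_authors(subfields):
    """Merge 999C5 $h (authors) and $e (editors) into a single list.

    Hypothetical helper: editors are told apart only by role 'ed.'.
    """
    authors = [{'full_name': name} for name in subfields.get('h', [])]
    editors = [{'full_name': name, 'role': 'ed.'}
               for name in subfields.get('e', [])]
    return authors + editors


def build_reference(subfields):
    """Build a reference dict from 999C5 subfields (assumed input shape)."""
    reference = {}
    authors = build_reference_authors(subfields)
    if authors:
        reference['authors'] = authors
    if subfields.get('m'):
        # Assumption: misc is kept as a list of free-text strings.
        reference['misc'] = list(subfields['m'])
    return reference
```

Keeping editors inside `authors` means a consumer that only wants a name list needs no special case, while the `role` key still allows the two MARC subfields to be rebuilt later.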

We have extra fields with no MARC counterpart:

imprint -> may come from publishers
book_series -> may come from publishers
arxiv_eprints - @kaplun where to populate these from (at least in the MARC case)
persistent_identifiers (we can keep them though, even if they don't actually have a source in the schema) @kaplun do we add source?

mihaibivol · 2016-07-05T08:13:55Z

cc @annetteholtkamp
https://github.com/mihaibivol/inspire-next/blob/06d034b60a910a3545fe2a076870c48fdba73fe2/inspirehep/modules/records/jsonschemas/records/elements/reference.json here's the proposed file for porting 999C5

Also I commented in this issue how the fields are ported from the old data model to the current one.

kaplun · 2016-07-05T13:52:36Z

inspirehep/dojson/hep/fields/bd90x99x.py



-RE_VALID_PUBNOTE = re.compile(".*,.*,.*(,.*)?")
+RE_VALID_PUBNOTE = re.compile(r'.*,.*,.*(,.*)?')
+RE_VALID_ARXIV_REP_NO = re.compile(r'(arxiv:)?\d{4}.\d{4,5}|\w+-\w+/\d+|\w+/\d+r')


Wow, I am surprised. Where did you take these from?

They were in the hep schema. For the report number, I just saw that there are a lot of report numbers with an arXiv prefix.
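
To make the behaviour of the two patterns in the diff concrete, here is a small sketch that applies them with `match`. How the module actually applies them (`match` versus `search`, any lowercasing of the input) is not shown in the hunk, so that part and the two wrapper function names are assumptions.

```python
import re

# Patterns copied verbatim from the diff above.
RE_VALID_PUBNOTE = re.compile(r'.*,.*,.*(,.*)?')
RE_VALID_ARXIV_REP_NO = re.compile(
    r'(arxiv:)?\d{4}.\d{4,5}|\w+-\w+/\d+|\w+/\d+r')


def looks_like_pubnote(value):
    # Hypothetical wrapper: true when the value has at least two commas.
    return RE_VALID_PUBNOTE.match(value) is not None


def looks_like_arxiv_rep_no(value):
    # Hypothetical wrapper: match() anchors at the start only,
    # so trailing text after a valid prefix is still accepted.
    return RE_VALID_ARXIV_REP_NO.match(value) is not None


print(looks_like_pubnote('Phys.Rev.,D90,123456'))    # at least two commas
print(looks_like_arxiv_rep_no('arxiv:1607.01234'))   # new-style identifier
print(looks_like_arxiv_rep_no('hep-th/9901001'))     # old-style identifier
print(looks_like_arxiv_rep_no('arXiv:1607.01234'))   # mixed-case prefix
print(looks_like_arxiv_rep_no('1607x01234'))         # unescaped dot
```

Two things stand out when the pattern is used this way: the prefix is matched case-sensitively, so `arXiv:` is rejected unless the input is lowercased first, and the unescaped `.` accepts any character between the two digit groups.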

kaplun · 2016-07-06T14:04:45Z

inspirehep/dojson/hep/fields/bd90x99x.py

+        texkey = force_single_element(value.get('1'))
+
+        # Publication info specific.
+        cnum = force_single_element(value.get('b'))


CNUMs have a regexp to check them: `C\d\d-\d\d-\d\d(\.\d+)?` (I think)
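
A minimal sketch of such a check, using the pattern exactly as quoted in the comment (which is itself given from memory). The names `RE_VALID_CNUM` and `is_valid_cnum` are hypothetical, and `fullmatch` assumes Python 3.

```python
import re

# Pattern as quoted in the review comment; not confirmed against the codebase.
RE_VALID_CNUM = re.compile(r'C\d\d-\d\d-\d\d(\.\d+)?')


def is_valid_cnum(value):
    """Return True when the whole string has the CNUM shape."""
    return RE_VALID_CNUM.fullmatch(value) is not None
```

Using `fullmatch` rather than `match` rejects values that merely start with a valid CNUM, such as one followed by stray characters.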

kaplun · 2016-07-12T19:54:39Z

BTW, as part of your branch, drop the WIP from my commit :)

jmartinm · 2016-07-18T14:39:25Z

inspirehep/modules/theme/templates/inspirehep_theme/references.html

-      {% endif %}
-      {% if reference['isbn'] %}
-        <span class="reference-detail">{{ reference['isbn'] }}</span>
+      {% if reference['misc'] %}


Better to use dotted notation consistently?

jmartinm · 2016-07-18T15:03:10Z

Just went through it and did not see anything to change at first sight 👍

jacquerie · 2016-07-18T15:04:20Z

inspirehep/modules/records/jsonschemas/records/elements/reference.json

+                    },
+                    "serialization": {
+                        "type": "string",
+                        "description": "E.g. refextract, text, JATS, Elsevier, BibTeX..."


I don't understand what a serialization is ._.

Maybe we should call it source as in other parts of the schema?

Elsevier references come directly in a json format, while others come in text. I think the examples are not particularly good. We could actually have both source and format (text, json, bibtex, xml)

jacquerie · 2016-07-19T08:30:25Z

inspirehep/dojson/hep/fields/bd90x99x.py

+        'e': [a.get('full_name') for a in value.get('authors', [])
+              if a.get('role') == 'ed.'],
+        'h': [a.get('full_name') for a in value.get('authors', [])
+              if a.get('role') != 'ed.'],


I'm almost tempted to add a condition lambda to get_value : )
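
The idea of a condition argument could look something like the sketch below. It is a standalone helper written for illustration, not the project's `get_value` and not its signature; `get_values_where` and `authors_to_marc` are hypothetical names.

```python
def get_values_where(items, key, condition=lambda item: True):
    """Return item[key] for every dict in items that satisfies condition."""
    return [item.get(key) for item in items if condition(item)]


def authors_to_marc(reference):
    """Split reference authors back into $e (editors) and $h (others)."""
    authors = reference.get('authors', [])
    return {
        'e': get_values_where(
            authors, 'full_name', lambda a: a.get('role') == 'ed.'),
        'h': get_values_where(
            authors, 'full_name', lambda a: a.get('role') != 'ed.'),
    }
```

The output matches the two list comprehensions in the diff; the helper only moves the filter into a reusable argument.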

jacquerie · 2016-07-19T08:44:00Z

LGTM 🚢

mihaibivol added this to the Information architecture - third iteration milestone Jun 30, 2016

mihaibivol self-assigned this Jun 30, 2016

mihaibivol added the WIP label Jun 30, 2016

mihaibivol mentioned this pull request Jun 30, 2016

simplify references #1273

Closed

17 tasks

mihaibivol reviewed Jul 1, 2016

kaplun reviewed Jul 5, 2016

mihaibivol force-pushed the references-schema branch from 16000bb to be527f4 on July 5, 2016 14:09

kaplun reviewed Jul 6, 2016

mihaibivol mentioned this pull request Jul 8, 2016

Validation errors on production records #1206

Closed

mihaibivol force-pushed the references-schema branch 4 times, most recently from bbeefa6 to c2cd40d on July 12, 2016 13:42

mihaibivol mentioned this pull request Jul 14, 2016

references: first commits #1330

Merged

mihaibivol force-pushed the references-schema branch from c2cd40d to 357d191 on July 15, 2016 13:06

mihaibivol and others added 3 commits July 18, 2016 16:04

jsonschema: move json_reference.json to schema elements

3ba2ced

Signed-off-by: Mihai Bivol <mm.bivol@gmail.com>

jsonschema: first draft of HEP-like reference.

ca71627

Signed-off-by: Samuele Kaplun <samuele.kaplun@cern.ch>

dojson: working references forward dojson.

f4ec8ad

* Adds rules for builidng the new reference schema. Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

mihaibivol force-pushed the references-schema branch 2 times, most recently from 35b396a to 9cccf7f on July 18, 2016 14:07

mihaibivol force-pushed the references-schema branch from 9cccf7f to d5ee489 on July 18, 2016 14:37

mihaibivol removed the WIP label Jul 18, 2016

mihaibivol changed the title ~~WIP jsonschema: pumped up references~~ jsonschema: pumped up references Jul 18, 2016

jmartinm reviewed Jul 18, 2016

jacquerie reviewed Jul 18, 2016

mihaibivol added 6 commits July 19, 2016 09:16

dojson: working json to MARC for references.

bd6f391

Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

holdingpen: note for future integration steps

a26603a

Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

docker: dev setup

d79f865

Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

references: unified reference display

e5d6bb1

* Uses reference macros for Reference datatables. * Fixes small error in `title` field handling for references. * Adds prepend_text optinal param for publication_info macro. Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

dojson: reference pubnote building refactor

fbc2b59

Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

schema: small reference.json schema refactoring

4404af9

* Stands as an example for changing a schema using builders. Signed-off-by: Mihai Bivol <mihai.bivol@cern.ch>

mihaibivol force-pushed the references-schema branch from d5ee489 to 4404af9 on July 19, 2016 07:55

jacquerie reviewed Jul 19, 2016

kaplun mentioned this pull request Jul 19, 2016

references.items.properties.url should be plural #1345

Closed

jacquerie merged commit 104d520 into inspirehep:master Jul 19, 2016

eamonnmag mentioned this pull request Jul 19, 2016

Data model: full support for references #430

Closed

2 tasks

mihaibivol mentioned this pull request Jul 19, 2016

record detail: link to external reference fall-back #1346

Closed

mihaibivol deleted the references-schema branch July 19, 2016 09:08
