Create relaton-data-nist #53
Related to usnistgov/NIST-Tech-Pubs#1 |
@ronaldtse what should the references for those documents be? For example, the first document has citation-id 78696207 and report-number NBS BH 1. Should we cite it as "NIST 78696207" or as "NIST NBS BH 1"?
|
@andrew2net the proper citation document identifier is "NBS BH 1" in this case. NBS is the predecessor of NIST, so:
We can actually take a hint from this: <doi type="report-paper_title">10.6028/NBS.BH.1</doi> The IDs that look like integers are clearly machine-generated and possibly not meant for human citation. |
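The hint above (reading the citation identifier off the DOI) can be sketched roughly as follows. This is only an illustration, not the relaton-nist implementation: the function name `doi_to_citation_id` is hypothetical, and it assumes the DOI suffix has the `PUBLISHER.SERIES.NUMBER` shape seen in `10.6028/NBS.BH.1`.

```python
DOI_PREFIX = "10.6028/"  # DOI prefix seen in the source data


def doi_to_citation_id(doi: str) -> str:
    """Naively turn '10.6028/NBS.BH.1' into 'NBS BH 1'.

    Hypothetical helper: assumes a PUBLISHER.SERIES.NUMBER suffix and
    does not handle legacy DOIs that use a different layout.
    """
    if not doi.startswith(DOI_PREFIX):
        raise ValueError(f"unexpected DOI prefix: {doi!r}")
    suffix = doi[len(DOI_PREFIX):]
    # Split on the first two dots only, so dots inside the number survive.
    return " ".join(suffix.split(".", 2))


print(doi_to_citation_id("10.6028/NBS.BH.1"))
```

Anything beyond this simple shape (parts, revisions, supplements) would need the PubID parsing discussed later in the thread.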
@ronaldtse <publisher>
<publisher_name>error:</publisher_name>
<publisher_place>Gaithersburg, MD</publisher_place>
</publisher>
<institution>
<institution_name>error:</institution_name>
<institution_acronym>error:</institution_acronym>
<institution_place>Gaithersburg, MD</institution_place>
</institution> |
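One way to spot records affected by this would be to scan for elements whose text is the literal `error:` placeholder. A minimal sketch, assuming each record can be parsed on its own; the wrapping `<record>` element and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the fragment quoted above.
SAMPLE = """<record>
  <publisher>
    <publisher_name>error:</publisher_name>
    <publisher_place>Gaithersburg, MD</publisher_place>
  </publisher>
  <institution>
    <institution_name>error:</institution_name>
    <institution_acronym>error:</institution_acronym>
    <institution_place>Gaithersburg, MD</institution_place>
  </institution>
</record>"""


def find_error_placeholders(xml_text: str) -> list[str]:
    """Return the tags of all elements whose text is exactly 'error:'."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if (el.text or "").strip() == "error:"]


print(find_error_placeholders(SAMPLE))
```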
Yes! @andrew2net can you file a new issue here? |
@ronaldtse the source contains relations with <related_item>
<intra_work_relation relationship-type="replaces" identifier-type="doi">10.6028/NIST.SP.1108r3</intra_work_relation>
</related_item>
<related_item>
<intra_work_relation relationship-type="isVersionOf" identifier-type="doi">10.6028/NIST.SP.1108</intra_work_relation>
</related_item> |
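For illustration, the relations quoted above could be pulled out of the source XML along these lines. This is a sketch under assumptions: the `<record>` wrapper and the function name `extract_relations` are hypothetical, and mapping the DOIs to PubIDs is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the fragment quoted above.
SAMPLE = """<record>
  <related_item>
    <intra_work_relation relationship-type="replaces" identifier-type="doi">10.6028/NIST.SP.1108r3</intra_work_relation>
  </related_item>
  <related_item>
    <intra_work_relation relationship-type="isVersionOf" identifier-type="doi">10.6028/NIST.SP.1108</intra_work_relation>
  </related_item>
</record>"""


def extract_relations(xml_text: str) -> list[tuple[str, str, str]]:
    """Return (relationship type, identifier type, identifier) triples."""
    root = ET.fromstring(xml_text)
    return [
        (
            rel.get("relationship-type"),
            rel.get("identifier-type"),
            (rel.text or "").strip(),
        )
        for rel in root.iter("intra_work_relation")
    ]


for relation in extract_relations(SAMPLE):
    print(relation)
```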
Metanorma already implements the new NIST PubID scheme, which has defined transforms from machine-readable IDs to:
And we need to parse these old DOIs back to PubID. So we need to extract that code out from metanorma-nist: Then we can re-use that in relaton-nist. |
@ronaldtse there are documents like <edition_number>0</edition_number> should we ignore the |
@andrew2net usnistgov/NIST-Tech-Pubs#1 has been fixed, can you help update the location of the XML file? Thanks. |
Issue #53 (comment) is posted in #55. Can we close this ticket? |
@ronaldtse no, relaton-data-nist isn't ready. It needs to convert DOI IDs to PubIDs to be able to reference the documents, but the DOI IDs in the source aren't the same as MR IDs. I have many questions about how to map parts of DOI IDs to PubIDs. I'll ask you later; I have a lot of other tasks to finish first. |
Also, we need to move documents from the https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.zip file to this repo to solve a problem similar to relaton/relaton-calconnect#11 |
@andrew2net sure, let's merge the bibdata from CSRC into this collection. |
@ronaldtse the source has some DOI identifiers that need clarification on how they should be mapped to PubIDs:
|
https://nvlpubs.nist.gov/nistpubs/Legacy/circ/nbscircular15-April1909.pdf This is NBS CIRC ("Circular") No. 15. Yes
I think in this case it means this is an "insert" of NBS CIRC 25. The "ins" part can be considered to be in the same category as "supplement". Just as we can have "Supplement 1", we can have "Insert 1". https://www.govinfo.gov/app/details/GOVPUB-C13-45974defbd2f3d7ab324bcd3506831b7
"sup" and "supp" probably mean Supplement. Supplement is a supported type.
"sec" is Section. Treat it like "Part": just as we can have "Part 1" (pt1), we can have "Section 1" (sec1).
Both mean "index". Treat it like Supplement and Insert.
Errata. Treat it like Supplement and Insert.
Let's treat them as docnumbers, yes. But did you notice these entries have assigned numbers? Then we don't need to parse the DOIs for them. See this: https://pages.nist.gov/NIST-Tech-Pubs/CRPL.html . https://nvlpubs.nist.gov/nistpubs/Legacy/crpl/crpl-1-2_3-1.pdf
Yes.
https://nvlpubs.nist.gov/nistpubs/Legacy/IR/nistir6867es.pdf
Part C. https://nvlpubs.nist.gov/nistpubs/Legacy/IR/nistir7297c.pdf
Language: Chinese.
Language: Vietnamese.
Language: Portuguese.
https://nvlpubs.nist.gov/nistpubs/Legacy/NCSTAR/ncstar1-1av1.pdf
Docnumber is 1011. Volume is 1. Version is 2.0. https://www.nist.gov/system/files/documents/el/isd/ks/NISTSP_1011-I-2-0.pdf
This is very funny -- this is a case of a "duplicated" SP 1075!! https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication1075-NCNR.pdf https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication1075-PML.pdf So we need to find a way to resolve this... argh. In this case, "1075-NCNR" is the docnumber. Will report this to NIST.
This means Part A, Revision 1. https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar1.pdf
"Version" is a supported element just like "Revision".
Addendum to SP 800-38 Part A.
Part 1.
As above.
Supplement.
https://nvlpubs.nist.gov/nistpubs/ams/NIST.AMS.300-8r1.pdf https://nvlpubs.nist.gov/nistpubs/ams/NIST.AMS.300-8r1-upd.pdf "INCLUDES UPDATES AS OF 02-08-2021". This is an "errata update". From https://github.com/metanorma/nist-pubid/blob/master/README.adoc#4-machine-readable-form , this applies:
|
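The suffix rules in the answers above could be sketched as a small parser. This is a rough illustration only, not the nist-pubid implementation. It covers just the tokens named explicitly in this thread ("pt", "sec", "sup"/"supp", "ins", plus "r" for Revision as in `800-131Ar1`); the function name, the output keys, and the assumption that the input is the docnumber portion already split from the series are all hypothetical. Index, errata, version and language suffixes are left out because their tokens are not spelled out here.

```python
import re

# Suffix tokens named in the thread, mapped to hypothetical element names.
SUFFIX_LABELS = {
    "pt": "part",
    "sec": "section",
    "supp": "supplement",
    "sup": "supplement",
    "ins": "insert",
    "r": "revision",
}

# Try longer tokens first so "supp1" is not read as "sup" followed by "p1".
_TOKENS = "|".join(sorted(SUFFIX_LABELS, key=len, reverse=True))
_SUFFIX_RE = re.compile(rf"({_TOKENS})(\d+)$")
# A single capital letter directly after a digit is read as a part letter.
_PART_RE = re.compile(r"^(.*\d)([A-Z])$")


def parse_docnumber(docnumber: str) -> dict[str, str]:
    """Peel known suffixes off the end of a docnumber, right to left."""
    elements: dict[str, str] = {}
    rest = docnumber
    while (match := _SUFFIX_RE.search(rest)):
        elements[SUFFIX_LABELS[match.group(1)]] = match.group(2)
        rest = rest[: match.start()]
    part = _PART_RE.match(rest)
    if part:
        rest, elements["part"] = part.group(1), part.group(2)
    elements["number"] = rest
    return elements


print(parse_docnumber("800-131Ar1"))
```

With this sketch, `1075-NCNR` stays a plain docnumber, because the trailing capital letter does not directly follow a digit. Lowercase part letters such as the one in `nistir7297c` are not handled.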
@andrew2net I've updated nist-pubid's README to reflect these element changes, please check. UPDATE: I actually went through the full set of documents for all series (see metanorma/pubid-nist#4), so the PubID scheme should work. |
@ronaldtse I've tried to use the assigned numbers but some of them are duplicated. For example: |
@andrew2net do you mean that |
@ronaldtse I found UPDATE:
|
@andrew2net in this case can you create an issue at nist-pubid about that mistake? Thanks. |
@ronaldtse These references UPDATE |
@andrew2net I've moved your last comment to a new issue. Let's not stack up the requests in this issue 😉 |
@ronaldtse there are DOIs that include a language, and the documents with those DOIs have translated titles. It seems PubID doesn't support languages; instead, our data model has a language attribute on titles. So do we need to collect all the title translations into one document? |
@andrew2net we do not need to parse the set perfectly right now. Let’s make sure we have most done and then file additional issues. Relationships between translated documents are not important right now. We are in a hurry to have the first cut. |
@ronaldtse now we have 3 sources for NIST documents:
Is there a way to detect which source should be used for a given reference? |
We will only use 1 and 3 from now on. They will already represent the full information of all NIST publications. For a reference we will prioritize the information of 1 over 3. |
@ronaldtse it seems that 1 and 3 don't represent the full information. For example |
@andrew2net interesting! In this case we should consider this a bug in 1. The results from 1 and 2 are supposed to be identical. I will report and revert. |
In any case, we will migrate to a full-data approach with NIST instead of using dynamic scraping. Please help proceed. |
NIST CSRC responded that endpoint 1 is now fixed. Thanks guys! |
There are two kinds of NIST bibdata:
We should synchronise this information daily into relaton-data-nist for easy citation.
For relaton-nist, if a document is found in the former, use it. Otherwise, search in the latter set.
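The lookup order described above can be sketched as a simple fallback. This is a minimal illustration, assuming each set is available as an in-memory mapping from identifier to record; the function name, the mappings, and the sample records are hypothetical.

```python
from typing import Mapping, Optional


def find_document(
    pubid: str,
    former: Mapping[str, dict],
    latter: Mapping[str, dict],
) -> Optional[dict]:
    """Prefer the former set; fall back to the latter only when absent."""
    doc = former.get(pubid)
    if doc is not None:
        return doc
    return latter.get(pubid)


# Hypothetical sample data, for illustration only.
former = {"NIST SP 800-131Ar1": {"source": "former"}}
latter = {
    "NIST SP 800-131Ar1": {"source": "latter"},
    "NBS BH 1": {"source": "latter"},
}

print(find_document("NIST SP 800-131Ar1", former, latter))
print(find_document("NBS BH 1", former, latter))
```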