Support retrieval of Metrologia entries from IOP #2
e.g. DOI https://doi.org/10.1088/1681-7575/aa7b3f
[9] Flowers-Jacobs N-E, Pollarolo A, Coakley J J, Fox A E, Rogalla H, Tew W L and Benz S P 2017 A Boltzmann constant determination based on Johnson noise thermometry Metrologia 54, 730-737 (8 pp)
Gets forwarded to https://iopscience.iop.org/article/10.1088/1681-7575/aa7b3f |
We have many citations to Metrologia in Metanorma due to handling of BIPM documents. We need to support citation of Metrologia articles. If necessary we will need to update the Relaton BIPM bibdata model (or establish a new one for academic articles). |
@ronaldtse I'm unable to find a Metrologia article index or any search form. Do you have an idea of how to search for Metrologia articles? |
I think we should have a syntax that fetches per:
The main Metrologia page is this:
Issues in that Volume:
The first issue in that list: https://iopscience.iop.org/issue/0026-1394/29/6
The first paper/article in that issue:
Notice the citation writes this:
I believe we can have two types of searches for this article:
For citing the Issue, we can do:
For citing the Volume, we can do:
For citing the Series, we can do:
|
Instead of scraping, we can also directly take the BibTeX export of that page: |
@ronaldtse I can guess how to map an article to BibModel but I have no idea how to map Issue, Volume, and Series. Do you have a suggestion? |
Volume = series/number |
@opoudjis as I understand it, Ronald means Volume, Issue, and Series to be separate documents. I'm asking what data we can map from the Volume, Issue, and Series pages to the BibliographicItem model. |
@andrew2net reports he has encountered rate-limiting via Captcha after several fetches. This is not appropriate for users who compile documents. I don't know whether it is surmountable using User-Agent (please try). Will also seek advice from BIPM. EDIT: have sought advice. Pending reply. |
@ronaldtse I've tried using a random User-Agent, but it seems that iopscience.iop.org allows only 6 requests per minute. After 2 minutes it starts redirecting to a captcha. |
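For illustration, a client could space out its own requests to stay under an observed limit. The following is a minimal sketch only: the `Throttle` class and its parameters are hypothetical, the default of 6 calls per 60 seconds is simply the figure reported above, and throttling would not remove the underlying limit or guarantee that the captcha is avoided.

```ruby
# Hypothetical client-side throttle: at most `limit` calls per `period` seconds.
# The clock and sleeper are injectable so the behaviour can be checked without waiting.
class Throttle
  def initialize(limit: 6, period: 60.0,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @limit = limit
    @period = period
    @clock = clock
    @sleeper = sleeper
    @stamps = [] # start times of the calls made within the current window
  end

  # Waits until a slot is free in the sliding window, then runs the block.
  def call
    now = @clock.call
    @stamps.reject! { |t| now - t >= @period }
    if @stamps.size >= @limit
      # Sleep until the oldest recorded call leaves the window.
      @sleeper.call(@period - (now - @stamps.first))
      now = @clock.call
      @stamps.reject! { |t| now - t >= @period }
    end
    @stamps << now
    yield
  end
end
```

A caller would wrap each fetch, for example `throttle.call { Net::HTTP.get(uri) }`, where `throttle` is a shared `Throttle.new` instance.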
Got it. Let’s wait for BIPM’s response. |
The BIPM team has inquired with IOPP (the publisher) and they recommended the following:
Can you help implement the connection to CrossRef? Thanks. |
@ronaldtse yes, I can. They ask for an email address in HTTP requests so that they can contact us if our script causes problems. Requests without an email won't be directed to the more reliable servers. Do you have an email address for this purpose? |
Let me ask them. It would be strange to use our email address when users (not us) are doing the requests. |
@andrew2net it seems that the email address is optional? Let's implement without the email first. Later on we can make a config option with Relaton CLI so users can set their own email address for CrossRef. |
@ronaldtse yes, it's optional, but without an email it will be slower: https://github.com/CrossRef/rest-api-doc#good-manners--more-reliable-service |
@ronaldtse here is the API status page: https://status.crossref.org/#system-metrics where you can see that the "Polite API" average response time is about 1s while the "Public API" average response time is about 7s. |
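As a rough sketch of the optional-email idea discussed above: the CrossRef REST API accepts a `mailto` query parameter to identify the caller. The helper name `crossref_url` is hypothetical, and where the email comes from (for example a future Relaton CLI config option) is left open here.

```ruby
require "uri"

CROSSREF_WORKS = "https://api.crossref.org/works"

# Builds a CrossRef works query URL. The `mailto` parameter is added only
# when the user has configured an email address; otherwise the request is
# sent without one.
def crossref_url(params, email: nil)
  query = params.dup
  query["mailto"] = email if email && !email.empty?
  uri = URI(CROSSREF_WORKS)
  uri.query = URI.encode_www_form(query)
  uri.to_s
end

# Without an email address:
puts crossref_url({ "query.bibliographic" => "Johnson noise thermometry", "rows" => 5 })
# With a user-supplied address (placeholder value):
puts crossref_url({ "rows" => 5 }, email: "user@example.org")
```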
7s!??!?!?!? Why don't we just use a random email address based on the IP address?
require "net/http"
ip = Net::HTTP.get(URI("https://api.ipify.org"))
puts "My public IP Address is: " + ip
Then SHA-256 it and truncate to 16 characters for the name. We can use relaton.org for the domain to indicate it is a Relaton request, i.e. "fa9514ae...@relaton.org". |
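The hashing step described above could look roughly like this. It is a sketch only: the helper name `pseudonymous_email` is hypothetical, the IP is passed in as a string rather than fetched, and whether CrossRef would accept an unmonitored address of this kind is not settled in this thread.

```ruby
require "digest"

# Derives a pseudonymous contact address from an IP string:
# the SHA-256 hex digest truncated to 16 characters, at the given domain.
def pseudonymous_email(ip, domain: "relaton.org")
  "#{Digest::SHA256.hexdigest(ip)[0, 16]}@#{domain}"
end

# 203.0.113.7 is a documentation-range address used here as a placeholder.
puts pseudonymous_email("203.0.113.7")
```

The same IP always yields the same address, so repeated requests from one machine would present a stable identity without exposing the IP itself in the email name.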
Anyway, the API is too slow. Only the "OpenURL" and paid "Plus" services have acceptable response times. I'll investigate OpenURL. |
I've sent this to BIPM, let's see what their response is. We’re now experimenting with the CrossRef API, but it’s not ideal:
There is no mechanism to obtain exactly the Metrologia article unless the author provides the full title and authorship information. It is nearly impossible to locate a particular article with confidence.
Here’s a real example from the Candela definition MEP:
NOTE: this reference actually has the wrong title. The correct title is "Predictable Quantum Efficient Detector II: Characterization and confirmed responsivity", and this has an effect on the resulting search. This is why auto-fetch is important: to mitigate authoring errors.
The metadata attributes available here are: author, title, year, issue and page numbers.
The intention with auto-fetching is to allow the author to enter minimal identifiable input (i.e. enough information to find this unique reference). e.g.
However, the CrossRef API does not provide enough parameters to locate this information. In particular, CrossRef does not support search/filtering by volumes, issues, or page numbers. In order to use the CrossRef API, the author will be forced to provide the full title and some authorship information:
Here are two attempts to find out if it works.

Attempt 1 with author-given title
The best effort in finding this article in the CrossRef API is the following command:
This means, "find items that match the following criteria":
And it returns 20 results, where the desired article is the 3rd. This query took 7 seconds.
=> Not possible to find article

Attempt 2 with corrected title
Since the first attempt failed I did a search online and found the correct title, which is "Predictable quantum efficient detector: II. Characterization and confirmed responsivity". Now we refine the command to:
Now it returns 7 results, where the desired article is the 1st. This query took 10 seconds.
=> Works when author and title information are fully accurate.

Conclusion
The CrossRef API is unable to facilitate location of a unique article with certainty because it only supports fuzzy search, and does not support searching by volume, issue or page numbers. It could only locate an article if and only if the article title and authorship information given is fully accurate, and it would return conflicting results when the title contains words that are also used in another article’s title.
For example, these two citations will return ambiguous results, even though the volume, issue and years are vastly different:
M G Cox, The evaluation of key comparison data, Metrologia, 39, 6, 589-595, 2002.
M G Cox, The evaluation of key comparison data: determining the largest consistent subset, Metrologia, 44, 3, 2007.
(both from the Kilogram definition MEP) |
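One possible workaround, not proposed in the thread itself, would be to accept the fuzzy result list and then narrow it on the client side using the volume, issue and page values the author supplied. The sketch below assumes candidate records shaped like CrossRef work items (with `title` as an array and `volume`, `issue` and `page` as strings). The sample data is built from the two Cox citations above and is not a real API response. The helper name `narrow` is hypothetical.

```ruby
# Sample candidates built from the two Cox citations in this thread
# (not a real CrossRef response).
CANDIDATES = [
  { "title" => ["The evaluation of key comparison data"],
    "volume" => "39", "issue" => "6", "page" => "589-595" },
  { "title" => ["The evaluation of key comparison data: determining the largest consistent subset"],
    "volume" => "44", "issue" => "3" }
].freeze

# Keeps only the candidates whose fields equal every criterion supplied.
# Values are compared as strings, so 39 and "39" are treated alike.
def narrow(candidates, criteria)
  candidates.select do |item|
    criteria.all? { |field, value| item[field].to_s == value.to_s }
  end
end

matches = narrow(CANDIDATES, { "volume" => 39, "issue" => 6 })
puts matches.map { |m| m["title"].first }
```

With both citations returned by a fuzzy title search, filtering on volume and issue leaves a single candidate. An empty or multi-element result would still have to be reported to the author as ambiguous.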
In any case, I do think that we should support CrossRef separately in say relaton-crossref. There is also a Ruby client gem for CrossRef: https://github.com/sckott/serrano What do you think? |
@ronaldtse yes I used the serrano gem. |
@ronaldtse since we have the relaton-doi gem, which fetches documents from crossref.org, can we close this issue? |
@andrew2net we now have the full data set of Metrologia from BIPM. I will create a new issue and will close this one. |
Closing in favour of #28. |