Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIDAS Journal Translator #1030

Merged
merged 8 commits into from Apr 16, 2018
Merged

MIDAS Journal Translator #1030

merged 8 commits into from Apr 16, 2018

Conversation

str4w
Copy link
Contributor

@str4w str4w commented Mar 23, 2016

This is a translator for the journals hosted by Kitware - Insight Journal, VTK Journal, etc.

splitDownloadPath = downloadAllXPath.iterateNext().value.split('/');
version=splitDownloadPath[splitDownloadPath.length-1]
newItem.extra="Revision: " + version
newItem.issue=splitDownloadPath[splitDownloadPath.length-2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three lines above are missing a semicolon.

@zuphilip
Copy link
Contributor

Can you give some specific examples for your translator?

*/

function detectWeb(doc, url) {
if (url.match("browse/publication")) return "journalArticle";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already guaranteed by the target regexp above. Since the only required field for translation is item.title, you probably want to check here that you can get the title.

@aurimasv
Copy link
Contributor

As @zuphilip says, please provide some examples and, preferably, add test pages using Scaffold.

@zuphilip zuphilip added the New Translator Pull requests for new translators label Nov 12, 2017
@zuphilip
Copy link
Contributor

zuphilip commented Jan 1, 2018

@str4w What is the status here?

@str4w
Copy link
Contributor Author

str4w commented Jan 2, 2018

Oops. Honestly, I assessed the number of comments at the time as "Too large to complete this week", and then I totally forgot about it after that. It is back on my radar, I will try to look at it in the near future.

@zuphilip
Copy link
Contributor

zuphilip commented Jan 2, 2018

Okay, thank you for the response. Can you start with the test cases?

I see that there is now also several export options, which we could also use instead of scraping from the website:

midas-export

We could use for example the structured data in the XML for the authors (full first names, split into first and last names, several authors are split). What do you think?

@adam3smith
Copy link
Collaborator

looking at this quickly, wouldn't it make sense to just import BibTeX or Medline (for which we already have import translators) and then add journal title and ISSN quickly based on the URL?

@zuphilip
Copy link
Contributor

zuphilip commented Jan 2, 2018

BibTeX does not look that good, e.g. for http://www.vtkjournal.org/browse/publication/880 we have

  Author	= "I. Herrera and C. Buchart and D. Borro",

Medline format could work also as there are also the full first names there, but publication title seems missing and maybe more. That is the reason I asked for examples...

@adam3smith
Copy link
Collaborator

right, publication titles seem to be missing everywhere -- that's why I thought we'd add them based on URL.

@str4w
Copy link
Contributor Author

str4w commented Apr 10, 2018

Sorry this dragged so long. This should be improved from the last iteration. Questions -

Looking at some of the other examples, sometimes they use
Zotero.
Z.
ZU.
what is the correct way, what are these globals?

More importantly, this would be cleaner and the data would be more complete if it could pull the exported XML (sadly, as the previous comments point out, that data is incomplete and we can't just use that for everything). To get that XML, need to make an HTTP post and parse the output - how does one do that - can the Zotero.doGet method do it?

@zuphilip zuphilip removed the waiting label Apr 15, 2018
@zuphilip
Copy link
Contributor

@str4w I worked directly on top of your work here and created a PR into your fork: str4w#1 . If you merge this PR then it will be automatically update here.

There are some special functions which you can use in coding Zotero translators and they either start with Zotero. (or short Z.) or Zotero.Utilities. (short ZU.). See also https://www.zotero.org/support/dev/translators/functions

We need a post call here and therfore I used Zotero.doPost. It is important then to continue inside this function and not afterwards. See my changes.

Update MIDAS and use XML data
@str4w
Copy link
Contributor Author

str4w commented Apr 15, 2018

Thanks for the explanations and improvements. I merged with my fork so it appears here.

@zuphilip
Copy link
Contributor

Okay, great. Let me know whether there is anything left from your point of view.

@str4w
Copy link
Contributor Author

str4w commented Apr 16, 2018

I'm running a few tests to be sure. A couple things - I see that you have added multiples. However, I tried this link
http://www.vtkjournal.org/browse/journal/53
and detectWeb did not detect it.

I tried this link http://www.insight-journal.org/browse/publication/921 and the doWeb returns the tags below. I don't understand the + and -. Is it normal? I am not actually using the autogenerated tags in Zotero, so this doesn't trouble me directly, just wondering.

         "tags": [
           {
     -       "tag": "Reproducibility"
     +       "tag": "IPython"
           }
           {
     -       "tag": "Tardigrades"
     +       "tag": "Reproducibility"
           }
           {
             "tag": "Scipy"
           }
           {
     -       "tag": "IPython"
     +       "tag": "Tardigrades"
           }
           {
     -       "tag": "notebook"
     +       "tag": "dexy"
           }
           {
     -       "tag": "dexy"
     +       "tag": "docker"
           }
           {
             "tag": "github"
           }
           {
     -       "tag": "docker"
     +       "tag": "notebook"
           }
         ]

@zuphilip
Copy link
Contributor

Okay, I added another path for this multiple with some filtering on the expected urls. Moroever, some clean-ups. Please merge again: str4w#2

Yes, what you see with the tags is normal. The differences which are indicated with the -/+ are just the ordering, which is performed alphabetically for the test cases.

@str4w
Copy link
Contributor Author

str4w commented Apr 16, 2018

I checked a bunch of articles, ran the tests, looks good. I added the ISSN tag, since it is available in many cases, why not have it?

Are you sure you want the article number to go into the pages field? If that is consistent with what other translators do then it would make sense, it just doesn't fit the literal definition of that field as I understand it.

Assuming the ISSN tag is ok with you and the page field for the article number was intentional, looks good to me, go for it.

@zuphilip
Copy link
Contributor

Are you sure you want the article number to go into the pages field? If that is consistent with what other translators do then it would make sense, it just doesn't fit the literal definition of that field as I understand it.

@adam3smith Do I remember correctly? Can you confirm (or correct) that saving an article id is best done in the pages field?

(I will merge it as soon as we clarified this.)

@adam3smith
Copy link
Collaborator

Do I remember correctly? Can you confirm (or correct) that saving an article id is best done in the pages field?

Yes, that's right

@zuphilip zuphilip merged commit d8bfa59 into zotero:master Apr 16, 2018
@zuphilip
Copy link
Contributor

Merged! 🎉 Thank you @str4w for the work! (You should reset your master branch now before any new work.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
New Translator Pull requests for new translators
Development

Successfully merging this pull request may close these issues.

None yet

4 participants