Set up citoid proxy endpoint #699
|
One possible hiccup, which I'm not sure affects this change, is a bug upstream in citoid master: bibtex is currently being returned as a Buffer for some reason rather than plain text. In tests, I get a Buffer from bibtex requests. Requests behave fine with curl, but the browser doesn't seem to handle it well. https://phabricator.wikimedia.org/T115271 Also, this seems to be missing a query param, or does it all go into /{query}/ somehow? There's an additional param called basefields, documented here: https://www.mediawiki.org/wiki/Citoid/API#Arguments |
|
@mvolz Just tested the buffer issue, it seems to work fine. Also added the Also, fyi, in the current version of the patch nothing is stored in RESTBase, the requests are just proxied. |
|
I would hold off on adding the baseFields parameter until we understand how it's used, and whether it could be folded into |
|
Yeah, actually we could just roll it into format. basefields is only a valid param for the mediawiki format, so we could map a mediawikibasefields format to basefields = 1 upstream, and have plain mediawiki leave off the basefields param. |
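The mapping proposed above could be sketched roughly as follows. This is an illustration only, not RESTBase or Citoid code: `FORMAT_MAP`, `build_citoid_params`, and `build_citoid_url` are hypothetical names, the format list is limited to formats named in this thread, and the value `1` for `basefields` is taken from the comment above and should be checked against the Citoid API docs.

```python
from urllib.parse import urlencode

# Hypothetical table: public format name -> parameters sent upstream to Citoid.
# "mediawikibasefields" is the combined name floated in this thread.
FORMAT_MAP = {
    "mediawiki": {"format": "mediawiki"},
    "mediawikibasefields": {"format": "mediawiki", "basefields": "1"},
    "zotero": {"format": "zotero"},
    "bibtex": {"format": "bibtex"},
}


def build_citoid_params(public_format: str, query: str) -> dict:
    """Translate the public format name into upstream query parameters."""
    if public_format not in FORMAT_MAP:
        raise ValueError(f"unsupported format: {public_format!r}")
    params = dict(FORMAT_MAP[public_format])  # copy so the table is not mutated
    params["search"] = query
    return params


def build_citoid_url(host: str, public_format: str, query: str) -> str:
    """Build the proxied URL, percent-encoding the query string."""
    return f"{host}/api?{urlencode(build_citoid_params(public_format, query))}"
```

With this shape the client only ever sees a single `format` parameter, and `basefields` stays an upstream detail that can be dropped later without changing the public API.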
|
Naming ideas for the |
|
The addition of the base fields is meant to be temporary for backwards compatibility issues. In the future, it should be only base fields replacing the type specific fields. There's not really a completeness element to this. |
method: get
uri: '{{options.host}}/api?format={format}&search={query}'
headers:
  accept-language: '{{accept-language}}'
If the user doesn't provide the header, we should default to the assumed language based on the domain (en, fr, etc)
Hm, that's not super easy actually. For simple stuff like en or fr that's trivial, but there are a whole bunch of exceptions, like simple or mediawiki.org.
So ideally we would want to fetch the wiki language from the MediaWiki SiteInfo api general.lang property - it won't be a huge overhead since all of the siteInfo requests are cached in memory, so we normally have all of the config locally. However, siteInfo fetch returns a promise, and we don't support promises in the functions we call from the template. So, I will need to add support for promises there before we can proceed with this feature.
Another option seems a bit hackier but way simpler to me: create a /sys endpoint in RESTBase that would expose the siteInfo by domain, then fetch it in a step prior to the execution of the actual request and transclude the default from there.
I'm on the fence here. Adding promise support is hard and might have performance implications for the Template code, which would affect how fast everything works. Adding a /sys endpoint is quite hacky, but certainly easier.
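The defaulting behaviour under discussion can be sketched independently of how the site language is obtained. This is a minimal illustration under stated assumptions: `resolve_accept_language` is a hypothetical helper, and `site_language` stands in for whatever ends up providing the siteInfo general.lang value for a domain (promise support in templates or a /sys endpoint).

```python
from typing import Callable, Mapping, Optional


def resolve_accept_language(
    headers: Mapping[str, str],
    domain: str,
    site_language: Callable[[str], Optional[str]],
    fallback: str = "en",
) -> str:
    """Return the client's Accept-Language, else the wiki's own language."""
    # Header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "accept-language" and value.strip():
            return value.strip()
    # No usable header: ask the (assumed) siteInfo lookup for this domain.
    lang = site_language(domain)
    return lang if lang else fallback
```

Here the lookup runs only when the header is absent. Whether the request templates can express that conditional is a separate question.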
Why is the other option hackier? I think an endpoint like this might be useful for other services, too. I doubt it would need all of siteInfo. If we start with the minimal properties required that would be enough for a start. More could be added later if needed.
@berndsi That would be an internal RESTBase-only endpoint; we have no intention of exposing it publicly or to other services (right now, though we may change our minds if it turns out to be useful).
It's hackier because the client would normally set the accept-language header, so the information obtained from that endpoint would be useful for only a small percentage of requests, yet our current tech powering the request templates would require us to fetch it regardless of whether accept-language was provided.
+1 for exposing the site info via /sys. Agreed that it's a bit hacky, but it will certainly be useful and much easier than the alternative.
Perhaps |
|
Another option: I have to say that to an outsider, the purposes of all these very similar format options appear rather vague. The docs really need to explain why you would use format a, b, or c, and what the trade-offs are. If we struggle to find a strong reason for using one of those formats over the others, then that could be a strong hint for deprecating or even outright removing it. Less is often more. |
I don't have any experience with these formats, so some help from @mvolz on the docs side would be much appreciated. |
|
There is some documentation here: Zotero and bibtex are standard formats used by other people and software. Mediawiki uses the same field names as Zotero for convenience. Zotero has MWDeprecated was a very early format that we discarded once we made changes There is actually an even larger set of formats to export from would be
|
|
I've updated the PR trying to resolve all the comments. Because the |
|
Prod config patchset is here. |
get:
  tags:
    - Citation
  summary: Get citation data given an article identifyer.
Identifier is misspelled :).
As the first step and the beginning of the discussion, set up a simple proxying Citoid endpoint in RESTBase.
Bug: https://phabricator.wikimedia.org/T108646
cc @wikimedia/services