Query limit inconsistency and required continuation in Wiki.getPageInfo #161

Closed
PeterBowman opened this issue Sep 17, 2018 · 0 comments

PeterBowman commented Sep 17, 2018

Per Wiki.getPageInfo(String[] pages):

getparams.put("action", "query");
getparams.put("prop", "info");
getparams.put("inprop", "protection|displaytitle|watchers");
getparams.put("intestactions", "create|delete|edit|move|protect|rollback|undelete|upload");

The testactions parameter is a recent addition (ebc1c82). The MW API treats it as an expensive query: the total number of tested actions (titles * actions per title) counts towards the API slow limit (50 results for regular users, 500 for bots).
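
To make the counting concrete, here is a rough sketch of how the number of fully answered titles per request could be estimated. The class and method names are hypothetical and not part of Wiki.java. It assumes the API checks the running count of tested actions before each title and stops once the limit has been reached, so the title that crosses the limit is still answered in full.

```java
// Hypothetical helper, not part of Wiki.java.
final class TestActionsBudget {
    private TestActionsBudget() {
    }

    /**
     * Estimates how many titles receive full results in a single request.
     * Assumption: the title whose actions push the running count past the
     * limit is still processed; only the titles after it are deferred.
     */
    static int maxTitlesPerBatch(int actionsPerTitle, int slowLimit) {
        if (actionsPerTitle <= 0 || slowLimit <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        // Integer form of ceil(slowLimit / actionsPerTitle).
        return (slowLimit + actionsPerTitle - 1) / actionsPerTitle;
    }

    public static void main(String[] args) {
        String intestactions = "create|delete|edit|move|protect|rollback|undelete|upload";
        int actions = intestactions.split("\\|").length; // 8 actions
        System.out.println(maxTitlesPerBatch(actions, 50)); // prints 7
    }
}
```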

Current behavior:

  • Regular users: getPageInfo(String[]) splits the page list into chunks of 50 titles. Works fine.
  • High-limit users: this method processes batches of 500 titles. Throws a StringIndexOutOfBoundsException while parsing the next <page> element.

Due to a bug in the MW API, the 50-result limit was being applied to bot accounts while regular users could query 500 results at once; it should be the other way around. With the testactions parameter, a bot could not get full results for more than 7 titles in one batch (8 actions * 7 titles = 56 > 50). The API therefore emits a continuation parameter and returns every result beyond the 7th as <page ... /> instead of a full <page>...props...</page> element. Wiki.java is not prepared for continuation in this query, nor for missing </page> tags; the latter causes the exception at line 1638.
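
For illustration, a parser that tolerates both forms could look roughly like the sketch below. It is not the code at line 1638; PageElementSplitter and its methods are hypothetical names. It assumes every <page> element carries at least one attribute and that attribute values contain no raw '>' characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Wiki.java.
final class PageElementSplitter {
    private PageElementSplitter() {
    }

    /** Returns each <page> element, whether full or self-closing. */
    static List<String> splitPages(String xml) {
        List<String> pages = new ArrayList<>();
        int pos = 0;
        while (true) {
            // The trailing space keeps the <pages> wrapper from matching.
            int start = xml.indexOf("<page ", pos);
            if (start < 0) {
                break;
            }
            int tagEnd = xml.indexOf('>', start);
            if (tagEnd < 0) {
                break; // truncated input, nothing more to read
            }
            if (xml.charAt(tagEnd - 1) == '/') {
                // Self-closing stub: <page ... /> has no </page> to search for.
                pos = tagEnd + 1;
            } else {
                int close = xml.indexOf("</page>", tagEnd);
                if (close < 0) {
                    break; // unterminated element, stop instead of throwing
                }
                pos = close + "</page>".length();
            }
            pages.add(xml.substring(start, pos));
        }
        return pages;
    }

    /** True for <page ... /> stubs whose properties were deferred. */
    static boolean isStub(String pageElement) {
        return pageElement.endsWith("/>");
    }
}
```

Stubs detected this way mark the titles that still need to be fetched through a continuation request.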

The bug was fixed in patch 460886 and will be deployed to production WMF wikis soon.

New behavior (once MW master branch hits production):

  • Regular users: getPageInfo(String[]) will throw an exception if more than 7 pages are passed to this method. See reasons above.
  • Bot users: an exception will be thrown for input arrays of 64+ page titles (8 * 63 = 504 > 500).
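
Until the method itself is changed, a caller could stay under the threshold by splitting the input array before calling getPageInfo(String[]). The sketch below is only an illustration; TitleBatches and chunk are hypothetical names, and the batch size is left to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Wiki.java.
final class TitleBatches {
    private TitleBatches() {
    }

    /** Splits titles into consecutive batches of at most batchSize entries. */
    static List<String[]> chunk(String[] titles, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String[]> batches = new ArrayList<>();
        for (int i = 0; i < titles.length; i += batchSize) {
            int end = Math.min(i + batchSize, titles.length);
            batches.add(Arrays.copyOfRange(titles, i, end));
        }
        return batches;
    }
}
```

Each batch would then be passed to getPageInfo(String[]) separately, at the cost of one request per batch.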

Proposed solutions:

  1. Handle continuation parameters in getPageInfo(String[]).
  2. Remove testactions from getPageInfo(String[]), but keep it for single page queries (getPageInfo(String)).
  3. Factor out testactions into a separate method that will use makeListQuery.
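
As a rough idea of what solution 1 involves, the loop below repeats a request until the response carries no continuation values. Everything here is hypothetical: ContinuationLoop, Batch, and the fetch function stand in for the real request and parsing code in Wiki.java, and a batch is assumed to contain only fully populated results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not part of Wiki.java.
final class ContinuationLoop {
    private ContinuationLoop() {
    }

    /** Assumed response shape: parsed results plus continuation values. */
    static final class Batch {
        final List<String> results;
        final Map<String, String> continuation; // empty when the query is complete

        Batch(List<String> results, Map<String, String> continuation) {
            this.results = results;
            this.continuation = continuation;
        }
    }

    /** Repeats the query, feeding continuation values back in, until done. */
    static List<String> fetchAll(Map<String, String> baseParams,
                                 Function<Map<String, String>, Batch> fetch) {
        List<String> all = new ArrayList<>();
        Map<String, String> cont = Collections.emptyMap();
        do {
            // Start from the original parameters and overlay the latest continuation.
            Map<String, String> params = new HashMap<>(baseParams);
            params.putAll(cont);
            Batch batch = fetch.apply(params);
            all.addAll(batch.results);
            cont = batch.continuation;
        } while (!cont.isEmpty());
        return all;
    }
}
```

The trade-off is extra round trips: with 8 tested actions, a 50-title chunk would need several requests instead of one.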

IMO testactions is not suitable for vectorized queries. Being able to query no more than 7 titles at once instead of 50 is a severe drawback compared with the previous implementation of this method.

@MER-C MER-C closed this as completed in 48268ff Sep 22, 2018