Wiki.java FAQ/TODO/whinge list and thoughts #154

Open
2 of 41 tasks
MER-C opened this issue Mar 5, 2018 · 3 comments

Comments

@MER-C
Owner

MER-C commented Mar 5, 2018

Pull requests are welcome on some of these, please ask for my thoughts first.

MediaWiki annoyances and wishlist

a.k.a. why can't I do X?

Missing features

Vectorization

  • getLastRevision
  • getDeletedText: adding more titles gives all deleted revisions in those pages...
  • getDeletedHistory (do after reverse is culled)
  • getDeletedRevisions
  • getBlockList (users only). Add filters and support IPs properly.
  • Go wide in parse() to fetch the parsed text, original wikitext, wikilinks, categories, external links, sections and templates, all at the same time. This could be the base of a Page object (yes, I finally have meaningful data to put in there.)
  • Make text a field of Revision, have getRevisions fetch text optionally, and make Revision.getText load the text lazily (with a warning that it shouldn't be used in loops).
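
To illustrate the lazy-loading idea in the last item, here is a minimal sketch. It is not the actual Revision class: LazyRevisionSketch, its constructor and the fetcher callback (which stands in for the network request) are all hypothetical names made up for this example.

```java
import java.util.function.LongFunction;

/** Hypothetical sketch of a revision with lazily loaded text; not the real Wiki.Revision. */
final class LazyRevisionSketch {
    private final long revid;
    // Assumption: stands in for whatever network call would fetch the text by revision ID.
    private final LongFunction<String> fetcher;
    // Null until either supplied up front (e.g. by a bulk fetch) or loaded on demand.
    private String text;

    LazyRevisionSketch(long revid, String text, LongFunction<String> fetcher) {
        this.revid = revid;
        this.text = text;
        this.fetcher = fetcher;
    }

    /**
     * Returns the revision text, fetching it on first use if it was not supplied.
     * Each uncached call costs one request, so avoid calling this in a loop over
     * many revisions; fetch the text in bulk instead.
     */
    synchronized String getText() {
        if (text == null)
            text = fetcher.apply(revid);
        return text;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        LazyRevisionSketch rev = new LazyRevisionSketch(42L, null, id -> {
            calls[0]++;
            return "text of " + id;
        });
        System.out.println(rev.getText());
        System.out.println(rev.getText()); // served from the cached field
        System.out.println("fetches: " + calls[0]);
    }
}
```

Passing non-null text to the constructor models the "getRevisions fetches text optionally" case: the fetcher is then never invoked.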

General FIXMEs

  • parse+diff: missingtitle is a generic error message that, in other methods, represents a real, unrecoverable assertion error.
  • LogEntry details handling (Refactor LogEntry details to a HashMap #126)
  • Simplify site info caching.

Deprecated API removal

  • Change signatures of parse and diff. Deprecate some trampolines.

WMF specific

Utilities

  • CSV export (revisions, log entries, user info?, page info?)
  • Diff parsing -- refactoring? I'd like to see machine readable diffs first.
  • Export of tabular data to wiki table (may be useful, just an idea at the moment)
  • LogEntry -> wikitext table
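
A rough sketch of what the tabular-data-to-wiki-table export could look like. WikiTableSketch and toWikiTable are hypothetical names, not existing Wiki.java methods, and the sketch assumes cells are plain text (a pipe inside a cell is replaced with {{!}}, which would break cells that contain piped wikilinks).

```java
import java.util.List;

/** Hypothetical helper that renders rows of plain-text cells as a wikitext table. */
final class WikiTableSketch {
    static String toWikiTable(List<String> headers, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder("{| class=\"wikitable\"\n");
        for (String header : headers)
            sb.append("! ").append(escape(header)).append('\n');
        for (List<String> row : rows) {
            sb.append("|-\n"); // row separator
            for (String cell : row)
                sb.append("| ").append(escape(cell)).append('\n');
        }
        return sb.append("|}\n").toString();
    }

    // Assumption: cells are plain text, so pipes and newlines can be neutralized blindly.
    private static String escape(String s) {
        return s.replace("|", "{{!}}").replace("\n", " ");
    }

    public static void main(String[] args) {
        System.out.print(toWikiTable(
            List.of("User", "Action"),
            List.of(List.of("Example", "delete"), List.of("Example2", "block"))));
    }
}
```

A LogEntry export would then only need to map each entry to a list of strings (user, action, target and so on) before calling the helper.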

Tools

  • Explore stuff in paid for spam
  • UserLinkAdditionFinder: servlet version -- limited to one user per request if useful.
  • UserLinkAdditionFinder/CCIAnalyzer: do not return links or analyze text that was already there. Requires diff parsing refactoring.
  • CCIAnalyzer: aggressive mode (Add aggressive mode to CCIAnalyzer #97)
  • Transition user watchlist into a generic mass contribution fetcher, particularly from categories and lists of users. The tool should support new pages only (for spam sockfarms).
  • ContributionSurveyor: split long surveys into multiple text files, 2000 articles per file, and serve them ZIPped.
  • AdminStats: protections.
  • AdminStats: writeup and plots.

Non-problems and implementation notes

  • Why is X (e.g. assertion modes, log types, namespaces) not implemented as an Enum? MediaWiki has a large library of extensions, each extension may add more possible values. Furthermore, the site owner may add other possible values (e.g. more namespaces). Wiki.java only covers MediaWiki as shipped with no extensions.
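
A small sketch of the point above: with plain int identifiers and a registry filled at runtime, values added by extensions or site owners need no code change, which a closed Enum cannot offer. NamespaceRegistrySketch and its methods are hypothetical and only illustrate the design choice; the Module namespace (ID 828, added by the Scribunto extension) is used as the example of an extension-supplied value.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: namespaces as open-ended ints rather than an Enum. */
final class NamespaceRegistrySketch {
    static final int MAIN_NAMESPACE = 0;
    static final int TALK_NAMESPACE = 1;

    private final Map<String, Integer> idsByName = new HashMap<>();

    NamespaceRegistrySketch() {
        // Only core values are known at compile time.
        register("Talk", TALK_NAMESPACE);
    }

    /** Adds a namespace discovered at runtime, e.g. from the site's own configuration. */
    void register(String name, int id) {
        idsByName.put(name.toLowerCase(Locale.ROOT), id);
    }

    /** Returns the namespace ID of a title; unknown prefixes fall back to the main namespace. */
    int namespaceOf(String title) {
        int colon = title.indexOf(':');
        if (colon < 0)
            return MAIN_NAMESPACE;
        Integer id = idsByName.get(title.substring(0, colon).toLowerCase(Locale.ROOT));
        return id == null ? MAIN_NAMESPACE : id;
    }

    public static void main(String[] args) {
        NamespaceRegistrySketch registry = new NamespaceRegistrySketch();
        registry.register("Module", 828); // supplied by an extension, unknown to core
        System.out.println(registry.namespaceOf("Module:Example"));
        System.out.println(registry.namespaceOf("Talk:Example"));
        System.out.println(registry.namespaceOf("Example"));
    }
}
```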
MER-C changed the title from "Wiki.java TODO/whinge list and thoughts" to "Wiki.java FAQ/TODO/whinge list and thoughts" on Mar 6, 2018
@Abbe98

Abbe98 commented Feb 10, 2022

I was wondering if anyone else has a use case for having the two upload methods support warnings: turning ignorewarnings into a method parameter and letting the methods return a list of warnings if it's not enabled.

https://www.mediawiki.org/wiki/API:Upload#Upload_warnings

This would be very useful for Pattypan as it could serve as a neat solution to detect duplicates and raise other potential issues.

@MER-C
Owner Author

MER-C commented Feb 13, 2022

I did the obvious thing with fd9b2dd.

@Abbe98

Abbe98 commented Feb 13, 2022

I'm afraid fd9b2dd won't solve it on our end, as we want to detect duplicates prior to or during the upload. We can solve it by just sending the hash to the WM-API, but I imagine more upload tools could benefit from a way to get upload warnings.
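
For reference, the hash-based workaround could look roughly like the sketch below. It assumes the check is done with the MediaWiki API's list=allimages query and its aisha1 parameter, and that the target is the Commons endpoint; DuplicateCheckSketch and its methods are hypothetical names, and the sketch only builds the query URL without sending it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: look for duplicates by SHA-1 before uploading. */
final class DuplicateCheckSketch {
    /** Returns the lowercase hexadecimal SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data))
                sb.append(String.format("%02x", b & 0xff));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a query URL listing existing files whose content has the same SHA-1 hash. */
    static String duplicateQueryUrl(String apiBase, byte[] fileBytes) {
        return apiBase + "?action=query&list=allimages&format=json&aisha1=" + sha1Hex(fileBytes);
    }

    public static void main(String[] args) {
        byte[] fileBytes = "abc".getBytes(StandardCharsets.UTF_8); // placeholder for real file content
        // Assumption: Commons is the upload target.
        System.out.println(duplicateQueryUrl("https://commons.wikimedia.org/w/api.php", fileBytes));
    }
}
```

A non-empty allimages list in the response would indicate that a file with identical content already exists.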

4 participants
@Abbe98 @MER-C and others