
Commit

*Clarify the (lack of) dependencies.
*Add methods for fetching the list of extensions installed on a wiki,
its timezone and locale.
*Readd specific methods for getting site properties. 8d597d8 was a bad
idea. Deprecate getSiteInfo.
*Add new method WMFWiki.requiresExtension and use it.
*Add test for WMFWiki.getGlobalUsage.
MER-C committed May 15, 2018
1 parent 5ef9938 commit 1094386
Showing 9 changed files with 240 additions and 34 deletions.
14 changes: 9 additions & 5 deletions Contributing.md
@@ -1,14 +1,18 @@
## Design philosophy

Wiki.java is a bot framework contained entirely within one file, with minimal
(currently no) dependencies. Only vanilla MediaWiki is supported in Wiki.java.
Any WMF specific stuff ([Echo](https://mediawiki.org/wiki/Extension:Echo),
Wiki.java is a bot framework contained entirely within one file, with no dependencies.
Only vanilla MediaWiki is supported in Wiki.java. Any WMF specific stuff
([Echo](https://mediawiki.org/wiki/Extension:Echo),
[GlobalUsage](https://mediawiki.org/wiki/Extension:GlobalUsage), etc.) should go
to WMFWiki.java.
to WMFWiki.java. Any support for extensions not on WMF sites will go in separate
classes.

Please do not add any dependencies on Java libraries or MediaWiki extensions to
any class without asking first.

Servlets currently run on [Google App Engine](https://cloud.google.com/appengine/docs).
They should not use any Google-specific classes and fit entirely within the free
quotas.
quotas. There is a soft cap of 80 network requests per servlet invocation.

## Tests

15 changes: 15 additions & 0 deletions Readme.md
@@ -16,6 +16,21 @@ modules are required.
Latest stable version: [0.34](https://github.com/MER-C/wiki-java/releases/tag/0.34) --
MediaWiki versions 1.28+

## Dependencies

| Class/Package | Java | MediaWiki extensions |
| ------------------------ |-------------- | -------------------- |
| org.wikipedia.Wiki | None | None |
| org.wikipedia.WMFWiki | None | As indicated. Works on WMF sites. |
| org.wikipedia.*Utils | None | None |
| org.wikipedia.tools.* | None | See below. |
| org.wikipedia.servlets.* | javax.servlet | See below. |

Note: Some tools and servlets are hardcoded to work on WMF sites only, and in
some cases for just the English Wikipedia. (Some tools solve en.wp specific
problems). They should all work on WMF sites. If you would like tool coverage
for your wiki, please file a bug report.

## Bug reports

Bug reports may be filed in the Issue tracker or at [my talk page](https://en.wikipedia.org/wiki/User_talk:MER-C).
39 changes: 33 additions & 6 deletions src/org/wikipedia/WMFWiki.java
@@ -94,6 +94,7 @@ public static WMFWiki createInstance(String domain)
public static WMFWiki[] getSiteMatrix() throws IOException
{
WMFWiki wiki = createInstance("en.wikipedia.org");
wiki.requiresExtension("SiteMatrix");
wiki.setMaxLag(0);
Map<String, String> getparams = new HashMap<>();
getparams.put("action", "sitematrix");
@@ -120,21 +121,42 @@ public static WMFWiki[] getSiteMatrix() throws IOException
temp.log(Level.INFO, "WMFWiki.getSiteMatrix", "Successfully retrieved site matrix (" + size + " wikis).");
return wikis.toArray(new WMFWiki[size]);
}

/**
* Require the given extension be installed on this wiki, or throw an
* UnsupportedOperationException if it isn't.
* @param extension the name of the extension to check
* @throws UnsupportedOperationException if that extension is not
* installed on this wiki
* @throws UncheckedIOException if the site info cache is not populated
* and a network error occurs when populating it
* @see Wiki#installedExtensions
*/
public void requiresExtension(String extension)
{
if (!installedExtensions().contains(extension))
throw new UnsupportedOperationException("Extension \"" + extension
+ "\" is not installed on " + getDomain() + ". "
+ "Please check the extension name and [[Special:Version]].");
}

/**
* Get the global usage for a file (requires extension GlobalUsage).
* Get the global usage for a file.
*
* @param title the title of the page (must contain "File:")
* @return the global usage of the file, including the wiki and page the file is used on
* @throws IOException if a network error occurs
* @throws UnsupportedOperationException if <code>{@link Wiki#namespace(String)
* @throws IllegalArgumentException if <code>{@link Wiki#namespace(String)
* namespace(title)} != {@link Wiki#FILE_NAMESPACE}</code>
* @throws UnsupportedOperationException if the GlobalUsage extension is
* not installed
* @see <a href="https://mediawiki.org/wiki/Extension:GlobalUsage">Extension:GlobalUsage</a>
*/
public String[][] getGlobalUsage(String title) throws IOException
{
requiresExtension("Global Usage");
if (namespace(title) != FILE_NAMESPACE)
throw new UnsupportedOperationException("Cannot retrieve Globalusage for pages other than File pages!");
throw new IllegalArgumentException("Cannot retrieve Globalusage for pages other than File pages!");

Map<String, String> getparams = new HashMap<>();
getparams.put("prop", "globalusage");
@@ -154,14 +176,17 @@ public String[][] getGlobalUsage(String title) throws IOException

/**
* Determines whether a site is on the spam blacklist, modulo Java/PHP
* regex differences (requires extension SpamBlacklist).
* regex differences.
* @param site the site to check
* @throws IOException if a network error occurs
* @return whether a site is on the spam blacklist
* @throws IOException if a network error occurs
* @throws UnsupportedOperationException if the SpamBlacklist extension
* is not installed
* @see <a href="https://mediawiki.org/wiki/Extension:SpamBlacklist">Extension:SpamBlacklist</a>
*/
public boolean isSpamBlacklisted(String site) throws IOException
{
requiresExtension("SpamBlacklist");
if (globalblacklist == null)
{
WMFWiki meta = createInstance("meta.wikimedia.org");
@@ -208,11 +233,13 @@ public boolean isSpamBlacklisted(String site) throws IOException
* null to skip)
* @return the abuse filter log entries
* @throws IOException or UncheckedIOException if a network error occurs
* @throws UnsupportedOperationException if the AbuseFilter extension
* is not installed
* @see <a href="https://mediawiki.org/wiki/Extension:AbuseFilter">Extension:AbuseFilter</a>
*/
public List<LogEntry> getAbuseLogEntries(int[] filters, String user, String title, OffsetDateTime earliest, OffsetDateTime latest) throws IOException
{
// WARNING: don't use a BotPassword for this! See https://phabricator.wikimedia.org/T191703
requiresExtension("Abuse Filter");
Map<String, String> getparams = new HashMap<>();
getparams.put("list", "abuselog");
if (filters.length > 0)
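The guard pattern introduced in this file can be illustrated in isolation. The following is a minimal sketch, not the library's code: `ExtensionGuardSketch`, its constructor and `globalUsageFor` are hypothetical stand-ins, and the extension list is passed in directly rather than fetched from a wiki. It shows the two-stage validation used in the hunks above: a missing extension raises `UnsupportedOperationException`, while a bad argument raises `IllegalArgumentException`.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a wiki object; not part of Wiki.java or WMFWiki.java. */
public class ExtensionGuardSketch
{
    private final String domain;
    private final List<String> extensions;

    public ExtensionGuardSketch(String domain, List<String> extensions)
    {
        this.domain = domain;
        this.extensions = extensions;
    }

    /** Throws if the named extension is absent, mirroring the guard in the diff. */
    public void requiresExtension(String extension)
    {
        if (!extensions.contains(extension))
            throw new UnsupportedOperationException("Extension \"" + extension
                + "\" is not installed on " + domain + ".");
    }

    /** Hypothetical extension-dependent method that validates in two stages. */
    public String globalUsageFor(String title)
    {
        // Stage 1: the wiki itself cannot support the call.
        requiresExtension("Global Usage");
        // Stage 2: the wiki supports the call, but the argument is invalid.
        if (!title.startsWith("File:"))
            throw new IllegalArgumentException("Not a file page: " + title);
        return "usage of " + title; // placeholder result, no network access
    }

    public static void main(String[] args)
    {
        ExtensionGuardSketch wiki = new ExtensionGuardSketch("example.org",
            Arrays.asList("Global Usage", "SpamBlacklist"));
        System.out.println(wiki.globalUsageFor("File:Example.png"));
        try
        {
            wiki.requiresExtension("Abuse Filter");
        }
        catch (UnsupportedOperationException ex)
        {
            System.out.println(ex.getMessage());
        }
    }
}
```

Checking the extension before the argument means a caller on an unsupported wiki always sees the same exception, regardless of what it passed in.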
114 changes: 102 additions & 12 deletions src/org/wikipedia/Wiki.java
@@ -40,7 +40,9 @@
* Requires JDK 1.8 or greater. Uses the <a
* href="https://mediawiki.org/wiki/API:Main_page">MediaWiki API</a> for most
* operations. It is recommended that the server runs the latest version
* of MediaWiki (1.31), otherwise some functions may not work.
* of MediaWiki (1.31), otherwise some functions may not work. This framework
* requires no dependencies outside the core JDK and does not implement any
* functionality added by MediaWiki extensions.
* <p>
* Extended documentation is available
* <a href="https://github.com/MER-C/wiki-java/wiki/Extended-documentation">here</a>.
@@ -49,7 +51,7 @@
* </p>
* Please file bug reports <a href="https://en.wikipedia.org/wiki/User_talk:MER-C">here</a>
* or at the <a href="https://github.com/MER-C/wiki-java/issues">Github issue
* tracker</a>.
* tracker</a>.
*
* @author MER-C and contributors
* @version 0.34
@@ -461,19 +463,21 @@ public enum Gender
protected String query;

// wiki properties
private boolean siteinfofetched = false;
private boolean wgCapitalLinks = true;
private String mwVersion;
private ZoneId timezone = ZoneId.of("UTC");
private Locale locale = Locale.ENGLISH;
private List<String> extensions = Collections.emptyList();
private LinkedHashMap<String, Integer> namespaces = null;
private ArrayList<Integer> ns_subpages = null;

// user management
private final CookieManager cookies = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
private User user;
private int statuscounter = 0;

// various caches
private Map<String, Object> siteinfo = null;
private LinkedHashMap<String, Integer> namespaces = null;
private ArrayList<Integer> ns_subpages = null;
// watchlist cache
private List<String> watchlist = null;

// preferences
@@ -807,15 +811,19 @@ public void setThrottle(int throttle)
* @return (see above)
* @since 0.30
* @throws IOException if a network error occurs
* @deprecated This method is likely going to get renamed with the return
* type changed to void once I finish cleaning up the site info caching
* mechanism. Use the specialized methods instead.
*/
@Deprecated
public synchronized Map<String, Object> getSiteInfo() throws IOException
{
if (siteinfo == null)
Map<String, Object> siteinfo = new HashMap<>();
if (!siteinfofetched)
{
siteinfo = new HashMap<>();
Map<String, String> getparams = new HashMap<>();
getparams.put("meta", "siteinfo");
getparams.put("siprop", "namespaces|namespacealiases|general");
getparams.put("siprop", "namespaces|namespacealiases|general|extensions");
String line = makeHTTPRequest(query, getparams, null, "getSiteInfo");

// general site info
@@ -825,10 +833,19 @@ public synchronized Map<String, Object> getSiteInfo() throws IOException
siteinfo.put("scriptpath", scriptPath);
timezone = ZoneId.of(parseAttribute(bits, "timezone", 0));
siteinfo.put("timezone", timezone);
siteinfo.put("version", parseAttribute(bits, "generator", 0));
mwVersion = parseAttribute(bits, "generator", 0);
siteinfo.put("version", mwVersion);
locale = new Locale(parseAttribute(bits, "lang", 0));
siteinfo.put("locale", locale);


// parse extensions
bits = line.substring(line.indexOf("<extensions>"), line.indexOf("</extensions>"));
extensions = new ArrayList<>();
String[] unparsed = bits.split("<ext ");
for (int i = 1; i < unparsed.length; i++)
extensions.add(parseAttribute(unparsed[i], "name", 0));
siteinfo.put("extensions", extensions);

// populate namespace cache
namespaces = new LinkedHashMap<>(30);
ns_subpages = new ArrayList<>(30);
@@ -857,9 +874,82 @@ public synchronized Map<String, Object> getSiteInfo() throws IOException
ns_subpages.add(ns);
}
initVars();
siteinfofetched = true;
log(Level.INFO, "getSiteInfo", "Successfully retrieved site info for " + getDomain());
}
return new HashMap<>(siteinfo);
return siteinfo;
}

/**
* Gets the version of MediaWiki this wiki runs e.g. 1.20wmf5 (54b4fcb).
* See [[Special:Version]] on your wiki.
* @return (see above)
* @throws UncheckedIOException if the site info cache has not been
* populated and a network error occurred when populating it
* @since 0.14
* @see <a href="https://gerrit.wikimedia.org/">MediaWiki Git</a>
*/
public String version()
{
ensureNamespaceCache();
return mwVersion;
}

/**
* Detects whether a wiki forces upper case for the first character in a
* title. Example: en.wikipedia = true, en.wiktionary = false.
* @return (see above)
* @throws UncheckedIOException if the site info cache has not been
* populated and a network error occurred when populating it
* @see <a href="https://mediawiki.org/wiki/Manual:$wgCapitalLinks">MediaWiki
* documentation</a>
* @since 0.30
*/
public boolean usesCapitalLinks()
{
ensureNamespaceCache();
return wgCapitalLinks;
}

/**
* Returns the list of extensions installed on this wiki.
* @return (see above)
* @throws UncheckedIOException if the site info cache has not been
* populated and a network error occurred when populating it
* @see <a href="https://www.mediawiki.org/wiki/Manual:Extensions">MediaWiki
* documentation</a>
* @since 0.35
*/
public List<String> installedExtensions()
{
ensureNamespaceCache();
return new ArrayList<>(extensions);
}

/**
* Gets the timezone of this wiki.
* @return (see above)
* @throws UncheckedIOException if the site info cache has not been
* populated and a network error occurred when populating it
* @since 0.35
*/
public ZoneId timezone()
{
ensureNamespaceCache();
return timezone;
}

/**
* Gets the locale of this wiki.
* @return (see above)
* @throws UncheckedIOException if the site info cache has not been
* populated and a network error occurred when populating it
* @since 0.35
*/
public Locale locale()
{
ensureNamespaceCache();
return locale;
}

/**
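The extension-parsing step added to `getSiteInfo` can be tried out on its own. The sketch below is an approximation under stated assumptions: `ExtensionListParseSketch` and `parseExtensions` are hypothetical names, the sample XML is hand-written rather than a captured API response, and attribute lookup is done inline instead of through the library's `parseAttribute`. It searches for ` name="` with a leading space so that an attribute such as `license-name` is not mistaken for the extension name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not part of Wiki.java. */
public class ExtensionListParseSketch
{
    /** Extracts the name attribute of every <ext> element in the response. */
    public static List<String> parseExtensions(String xml)
    {
        int start = xml.indexOf("<extensions>");
        int end = xml.indexOf("</extensions>");
        if (start < 0 || end < start)
            return Collections.emptyList(); // no extension data in the response

        // chunks[0] is the <extensions> tag itself, so iteration starts at 1
        String[] chunks = xml.substring(start, end).split("<ext ");
        List<String> names = new ArrayList<>();
        for (int i = 1; i < chunks.length; i++)
        {
            // Prepend a space so " name=\"" also matches a leading attribute.
            String chunk = " " + chunks[i];
            int a = chunk.indexOf(" name=\"");
            if (a < 0)
                continue;
            a += 7; // length of " name=\""
            int b = chunk.indexOf('"', a);
            if (b < 0)
                continue;
            names.add(chunk.substring(a, b)); // HTML entities are not decoded here
        }
        return names;
    }

    public static void main(String[] args)
    {
        String sample = "<api><query><extensions>"
            + "<ext type=\"other\" name=\"Global Usage\" license-name=\"GPL-2.0\"/>"
            + "<ext type=\"antispam\" name=\"SpamBlacklist\"/>"
            + "</extensions></query></api>";
        System.out.println(parseExtensions(sample));
    }
}
```

Note that the names compared by `requiresExtension` are the display names reported by the wiki (for example `Global Usage` with a space), which is why the guard calls in the diff use that spelling.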
9 changes: 9 additions & 0 deletions src/org/wikipedia/package-info.java
@@ -21,5 +21,14 @@
/**
* A collection of MediaWiki/Wikimedia related utilities, including a rather
* sketchy bot framework that consists of only one file.
*
* <p>
* This package does not, and will not, have any dependencies outside of the
* core JDK. Only the java.base and java.logging modules are required.
*
* <p>
* All methods should work on a vanilla installation of MediaWiki with the
* exception of {@link WMFWiki}. Required extension(s) for any given method
* are denoted in the documentation.
*/
package org.wikipedia;
5 changes: 5 additions & 0 deletions src/org/wikipedia/servlets/package-info.java
@@ -24,6 +24,11 @@
* backends for these tools reside in {@link org.wikipedia.tools}. The tools
* are hosted on <a href="https://wikipediatools.appspot.com">
* wikipediatools.appspot.com</a> (there is no dependence on any Google API).
*
* <p>
* As the name suggests, this package requires a compliant implementation of
* the <a href="https://jcp.org/en/jsr/detail?id=369">Java servlet API.</a>
* There are no other dependencies other than the core JDK.
*
* @see <a href="https://wikipediatools.appspot.com">wikipediatools.appspot.com</a>
* @see org.wikipedia.tools
9 changes: 9 additions & 0 deletions src/org/wikipedia/tools/package-info.java
@@ -23,6 +23,15 @@
* as backends for the online tools at <a href="https://wikipediatools.appspot.com">
* wikipediatools.appspot.com</a>.
*
* <p>
* This package does not, and will not, have any dependencies outside of the
* core JDK. Only the java.base, java.logging and java.desktop modules are
* required.
*
* <p>
* It should be noted that many of these tools have hard-coded URLs that
* point to WMF wikis, and in particular, the English Wikipedia.
*
* @see <a href="https://wikipediatools.appspot.com">Online versions of these
* programs</a>
* @see org.wikipedia.servlets
32 changes: 32 additions & 0 deletions test/org/wikipedia/WMFWikiTest.java
@@ -66,6 +66,38 @@ public void getLogEntries() throws Exception
}
}

@Test
public void requiresExtension()
{
// https://en.wikipedia.org/wiki/Special:Version
enWiki.requiresExtension("SpamBlacklist");
enWiki.requiresExtension("CheckUser");
enWiki.requiresExtension("Abuse Filter");
try
{
enWiki.requiresExtension("This extension does not exist.");
fail("Required a non-existing extension.");
}
catch (UnsupportedOperationException expected)
{
}
}

@Test
public void getGlobalUsage() throws Exception
{
try
{
enWiki.getGlobalUsage("Not an image");
fail("Tried to get global usage for a non-file page");
}
catch (IllegalArgumentException expected)
{
}
// YARR!
assertEquals("getGlobalUsage: non-existing file", 0, enWiki.getGlobalUsage("File:Pirated Movie Full HD Stream.mp4").length);
}

/**
* Test fetching the abuse log. This is a semi-privileged action that
* requires the test runner to be not blocked (even though login is not
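The tests above use the try / `fail` / catch idiom to assert that a call throws. As a rough illustration of the same check without any test framework, the sketch below wraps it in a helper. `ExpectThrowsSketch` and `throwsType` are hypothetical names and are not part of the test suite; the helper only handles unchecked exceptions, which is sufficient for the two exception types exercised in this commit.

```java
/** Hypothetical helper; not part of WMFWikiTest. */
public class ExpectThrowsSketch
{
    /**
     * Runs the action and reports whether it threw an instance of the
     * expected type. Any other unchecked exception is rethrown unchanged.
     */
    public static boolean throwsType(Class<? extends RuntimeException> expected, Runnable action)
    {
        try
        {
            action.run();
            return false; // completed normally, so nothing was thrown
        }
        catch (RuntimeException ex)
        {
            if (expected.isInstance(ex))
                return true;
            throw ex; // unexpected failure: let it surface
        }
    }

    public static void main(String[] args)
    {
        boolean rejected = throwsType(IllegalArgumentException.class, () ->
        {
            throw new IllegalArgumentException("Not a file page");
        });
        System.out.println("rejected = " + rejected);
    }
}
```

Distinguishing the expected type from other failures matters here because the commit deliberately separates `IllegalArgumentException` (bad title) from `UnsupportedOperationException` (missing extension).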