Extended documentation

MER-C edited this page Sep 7, 2018 · 3 revisions

This should be read in conjunction with the Javadoc and the source code.

Using the framework

A typical setup can look like this:

  1. Instantiate the framework.
  2. Set a user-agent per Wikimedia user agent policy.
  3. Log in as a bot with the bot flag. Credentials are obtained from Special:BotPasswords.
  4. Require that the bot flag is checked with every operation.
  5. Perform any other setup, e.g. setting a global query limit or marking all edits as minor and bot by default.

In code:

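import java.io.IOException;
import javax.security.auth.login.FailedLoginException;
import org.wikipedia.Wiki;

// ...

Wiki wiki = Wiki.createInstance("en.wikipedia.org");
// set the user agent before any request is made, including the login itself
wiki.setUserAgent("My Bot/1.0");
wiki.setThrottle(5000); // throttle in milliseconds
try
{
    // password: the bot password issued by Special:BotPasswords
    wiki.login("ExampleBot@Description", password);
}
catch (FailedLoginException | IOException ex)
{
    // deal with failed login attempt
    ex.printStackTrace();
    System.exit(1);
}

// login dependent setup
wiki.setAssertionMode(Wiki.ASSERT_BOT);
// ...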

You are encouraged to read and edit the source to fit your use case. Some configuration variables (e.g. the number of times a failing request is retried) can only be changed by editing the source.

General use

General use should be self-explanatory from reading the documentation. As a rule, each action you can perform with MediaWiki in the browser has a corresponding method.

Tips and tricks

  • Some methods are vectorized (e.g. one can get the text of 50 pages with a single network request). The downside of fetching data for many pages or revisions at once is that, if you are modifying the wiki, the data is more likely to have become stale between being fetched and the modifications being made. This can result in more edit conflicts.

  • For online tools, set a global query limit to prevent denial of service attacks.

  • It is considered good etiquette to use a single thread.

  • Some exceptions are fatal while others can be skipped over (e.g. a protected page). Therefore one might write:

try
{
   List<String> pages = ...;
   for (String page : pages)
   {
       try
       {
           enWiki.edit(page, replacementText, "Bot: this is an edit summary");
       }
       catch (CredentialExpiredException ex)
       {
           // subclass of CredentialException: rethrow so that it is
           // treated as fatal below instead of as a protected page
           throw ex;
       }
       catch (CredentialException ex)
       {
           System.err.println("Skipping " + page + ": protected");
       }
   }
}
// these exceptions are fatal - we need to abandon the task
catch (SecurityException ex)
{
   // deal with trying to do something we can't
}
catch (CredentialExpiredException ex)
{
   // deal with the expiry of the cookies
}
catch (AccountLockedException ex)
{
   // deal with being blocked
}
catch (IOException ex)
{
   // deal with network error despite retries
}
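
One way to balance the first tip above (vectorized fetches versus stale data) is to process a long list of titles in fixed-size batches, fetching and then editing one batch before moving on to the next. The following is a minimal sketch of the batching step only. TitleBatcher, partition and BATCH_SIZE are hypothetical names and are not part of the framework; the batch size of 50 is taken from the example in the tip and may not suit every query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the framework.
final class TitleBatcher
{
    // Assumed batch size, from the "50 pages per request" example above.
    static final int BATCH_SIZE = 50;

    // Splits items into consecutive batches of at most batchSize elements,
    // preserving the original order.
    static <T> List<List<T>> partition(List<T> items, int batchSize)
    {
        if (batchSize <= 0)
            throw new IllegalArgumentException("batchSize must be positive");
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize)
        {
            int end = Math.min(start + batchSize, items.size());
            // copy the sublist view so each batch is independent of the source
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args)
    {
        List<String> titles = new ArrayList<>();
        for (int i = 1; i <= 120; i++)
            titles.add("Page " + i);

        for (List<String> batch : partition(titles, BATCH_SIZE))
        {
            // In a real bot: fetch the data for this batch in one request,
            // then edit these pages before fetching the next batch.
            System.out.println(batch.size() + " titles, starting at " + batch.get(0));
        }
    }
}
```

Smaller batches shorten the window between fetching and editing at the cost of more network requests.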