LinqToWiki

LinqToWiki is a library for accessing sites running MediaWiki (including Wikipedia) through the MediaWiki API from .NET languages like C# and VB.NET.

It can do almost anything that can be done from the web interface, and more: editing articles, listing articles in categories, listing all kinds of links on a page, and so on. The various lists can be queried using LINQ queries, which are then translated into efficient API requests.

The library is strongly typed, which makes it hard to write invalid requests and easy to discover the available methods and properties through IntelliSense.

Because the API can vary from wiki to wiki, it's necessary to configure the library through an automatically generated assembly.

Downloads

Usage

Simple example

For example, to edit the Sandbox on the English Wikipedia anonymously, you can use the following:

var wiki = new Wiki("TheNameOfMyBot/1.0 (http://website, myemail@site)", "en.wikipedia.org");

// get edit token, necessary to edit pages
var token = wiki.tokens(new[] { tokenstype.edit }).edittoken;

// create new section called "Hello" on the page "Wikipedia:Sandbox"
wiki.edit(
    token: token, title: "Wikipedia:Sandbox", section: "new", sectiontitle: "Hello", text: "Hello world!");

As you can see, in methods like this, you should use named parameters, because the edit() method has lots of them, and you probably don't need them all.
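To illustrate why named arguments help here, below is a minimal self-contained sketch. The `Edit` method in it is a hypothetical stand-in with a handful of optional parameters; it is not the generated `wiki.edit()` signature, and it only builds a string instead of calling a wiki.

```csharp
using System;
using System.Collections.Generic;

public static class NamedArgumentsSketch
{
    // Hypothetical stand-in for a generated method with many optional parameters.
    // The real wiki.edit() has a different (and longer) parameter list.
    public static string Edit(
        string token,
        string title = null,
        string section = null,
        string sectiontitle = null,
        string text = null,
        string summary = null)
    {
        // Only the parameters that were actually supplied end up in the result.
        var parts = new List<string> { "token=" + token };
        if (title != null) parts.Add("title=" + title);
        if (section != null) parts.Add("section=" + section);
        if (sectiontitle != null) parts.Add("sectiontitle=" + sectiontitle);
        if (text != null) parts.Add("text=" + text);
        if (summary != null) parts.Add("summary=" + summary);
        return string.Join("&", parts);
    }

    public static void Main()
    {
        // Named arguments let the call skip "section" and "sectiontitle" entirely.
        Console.WriteLine(
            Edit(token: "abc", title: "Wikipedia:Sandbox", text: "Hello world!"));
    }
}
```

With positional arguments, the same call would need explicit `null` placeholders for every skipped parameter, which gets unreadable quickly as the parameter list grows.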

The edit example looks more convoluted than necessary (can't the library get the token for me?), but that's because the methods are all generated automatically.

Queries

Where LINQ to Wiki really shines, though, is queries. To get the names of all pages in Category:Mammals of Indonesia, you can write:

var pages = (from cm in wiki.Query.categorymembers()
             where cm.title == "Category:Mammals of Indonesia"
             select cm.title)
            .ToEnumerable();

List of mammals of Indonesia
Mammals of Borneo
Agile gibbon
Andrew's Hill Rat
Anoa
…

The call to ToEnumerable() (or, alternatively, ToList()) is necessary, so that LINQ to Wiki methods don't get mixed up with LINQ to Objects methods, but the result is now an ordinary IEnumerable<string>.
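Once the result is an ordinary IEnumerable<string>, any LINQ to Objects operator can be applied to it. The following sketch shows that second, in-memory stage on its own. It assumes a hard-coded array in place of the live query result (the titles are copied from the sample output above), and the `TitlesStartingWith` helper is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocalFilteringSketch
{
    // Stand-in for the result of the categorymembers query after ToEnumerable().
    public static IEnumerable<string> SamplePages()
    {
        return new[]
        {
            "List of mammals of Indonesia",
            "Mammals of Borneo",
            "Agile gibbon",
            "Andrew's Hill Rat",
            "Anoa"
        };
    }

    // Plain LINQ to Objects: runs in memory and sends nothing to the wiki.
    public static List<string> TitlesStartingWith(IEnumerable<string> titles, string prefix)
    {
        return titles
            .Where(t => t.StartsWith(prefix, StringComparison.Ordinal))
            .OrderBy(t => t, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        foreach (var title in TitlesStartingWith(SamplePages(), "A"))
            Console.WriteLine(title);
    }
}
```

Filtering like this happens after the pages have already been fetched, so anything the query itself can express is better left on the LINQ to Wiki side of ToEnumerable().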

Well, actually, you want the list sorted backwards (maybe you want to know whether there are any Indonesian mammals whose name starts with Z):

var pages = (from cm in wiki.Query.categorymembers()
             where cm.title == "Category:Mammals of Indonesia"
             orderby cm descending
             select cm.title)
            .ToEnumerable();

Wild water buffalo
Wild boar
Whitish Dwarf Squirrel
Whitehead's Woolly Bat
White-thighed surili
…

Hmm, no luck with the Z. Okay, can I get the first section of those articles? This is where things start to get more complicated. If you were using the API directly, you would have to use generators. LINQ to Wiki can handle that for you, but since generators are quite powerful, you have to do something like this:

var pages = (from cm in wiki.Query.categorymembers()
             where cm.title == "Category:Mammals of Indonesia"
             orderby cm descending
             select cm)
    .Pages
    .Select(
        page =>
        new
        {
            title = page.info.title,
            text = page.revisions()
                .Where(r => r.section == "0")
                .Select(r => r.value)
                .FirstOrDefault()
        })
    .ToEnumerable();

Wild water buffalo
{{About|the wild species|the domestic livestock varieties descended from it|water buffalo}}

{{Taxobox|…}}

The '''wild water buffalo''' (''Bubalus arnee''), also called '''Asian buffalo''' and '''Asiatic buffalo''',
is a large [[bovinae|bovine]] native to [[Southeast Asia]].  …

This deserves some explanation. When you use Pages to access more information about the pages in some list, you then call Select() to choose exactly what you want to know. In that Select(), you can use info for basic information about the page, like its name, ID or whether you are watching it. Then there are several lists, including revisions(). You can again use LINQ methods to alter this part of the query. For example, I want only the first section (Where(r => r.section == "0")), I want to select the text of the revision (here called “value”, Select(r => r.value)) and only for the first (latest) revision (FirstOrDefault()).
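The shape of that projection can be tried out without a wiki. Below is an in-memory analogy using hypothetical `Page` and `Revision` classes and placeholder text. These are not LinqToWiki types, and LinqToWiki translates the equivalent expression into API requests instead of running it locally; the sketch only shows how the nested Where/Select/FirstOrDefault chain yields one value per page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins; these are not LinqToWiki types.
public class Revision
{
    public string Section { get; set; }
    public string Value { get; set; }
}

public class Page
{
    public string Title { get; set; }
    public List<Revision> Revisions { get; set; } = new List<Revision>();
}

public static class ProjectionSketch
{
    // One result per page: its title and the text of the first matching
    // revision, or null when the page has no matching revision.
    public static List<KeyValuePair<string, string>> FirstSections(IEnumerable<Page> pages)
    {
        return pages
            .Select(page => new KeyValuePair<string, string>(
                page.Title,
                page.Revisions
                    .Where(r => r.Section == "0")
                    .Select(r => r.Value)
                    .FirstOrDefault()))
            .ToList();
    }

    public static void Main()
    {
        var pages = new List<Page>
        {
            new Page
            {
                Title = "Wild water buffalo",
                Revisions = { new Revision { Section = "0", Value = "(lead section text)" } }
            },
            new Page { Title = "Wild boar" } // no revisions in this sample
        };

        foreach (var entry in FirstSections(pages))
            Console.WriteLine(entry.Key + ": " + (entry.Value ?? "(no text)"));
    }
}
```

Note that FirstOrDefault() gives null for a page without a matching revision, so code consuming the real query result should probably be prepared for a missing text as well.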

For examples of almost all methods in LINQ to Wiki, have a look at the LinqToWiki.Samples project.

Developer documentation

If you want to modify this code (patches are welcome) or just have a look at the implementation, here is a short overview of the projects (more details are in the project directories):

  • LinqToWiki.Core – The core of the library. This project is referenced by all other projects and contains types necessary for accessing the API, processing LINQ expressions, etc.
  • LinqToWiki.Samples – Samples of code that uses this library.