# How to create a bot

* Bot documentation: https://www.wikidata.org/wiki/Wikidata:Bots
* Bot password: https://www.mediawiki.org/wiki/Special:BotPasswords
* Bot approval: https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot

## Use pywikibot

[pywikibot](https://www.mediawiki.org/wiki/Manual:Pywikibot) is a Python library built on the MediaWiki API.
This notebook shows how to use that API from Python with pywikibot and lays the groundwork for developing a bot or tool for Wikidata later on.

Use pywikibot for Wikidata:

- https://www.mediawiki.org/wiki/Manual:Pywikibot/Wikidata
- https://www.mediawiki.org/wiki/Manual:Pywikibot/Scripts#Wikidata
- https://www.wikidata.org/wiki/Wikidata:Creating_a_bot

If you want to set up pywikibot on your computer, check this tutorial: https://www.wikidata.org/wiki/Wikidata:Pywikibot_-_Python_3_Tutorial/Setting_up_Shop

**Quick steps:**

1. Create a new directory for your project.
2. Clone pywikibot into this directory: `git clone --recursive https://gerrit.wikimedia.org/r/pywikibot/core.git pywikibot`
3. Run `python generate_user_files.py` to create `user-config.py`.
4. Run `python pwb.py login` to log in with your account.


In [2]:
# install the dependencies
!pip install requests



In [3]:
import pywikibot
import requests
import csv
import time

In [4]:
# site = pywikibot.Site("test", "wikidata")
site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

In [5]:
zurich = pywikibot.ItemPage(repo, 'Q72').get()
zurich

{'aliases': {'en': ['City of Zurich', 'Zurich', 'ZH'],
  'fr': ['Zürich', 'Zuerich', 'ville de Zurich'],
  'nl': ['Zurich'],
  'zh': ['蘇黎世'],
  'th': ['ซูริก'],
  'sq': ['Zyrih', 'Zyrihu'],
  'kk-cyrl': ['Цюрих қаласы'],
  'ta': ['ஜூரிக்'],
  'kn': ['ಜ್ಯೂರಿ\u200dಕ್'],
  'pl': ['Miasto Zurychu', 'ZH'],
  'ms': ['Zurich'],
  'es': ['Zurich'],
  'gl': ['Zürich'],
  'bho': ['ज्युरिख', 'जूरिख', 'ज़्यूरिख़', 'ज्यूरिष'],
  'sr': ['Град Цирих', 'Зурих', 'ZH']},
 'labels': {'en': 'Zürich',
  'fr': 'Zurich',
  'de': 'Zürich',
  'it': 'Zurigo',
  'ru': 'Цюрих',
  'es': 'Zúrich',
  'nl': 'Zürich',
  'pl': 'Zurych',
  'de-ch': 'Zürich',
  'gsw': 'Züri',
  'rm': 'Turitg',
  'en-ca': 'Zurich',
  'en-gb': 'Zurich',
  'af': 'Zürich',
  'pdc': 'Zurich',
  'da': 'Zürich',
  'eo': 'Zuriko',
  'lb': 'Zürich',
  'nn': 'Zürich',
  'sv': 'Zürich',
  'fi': 'Zürich',
  'uk': 'Цюрих',
  'am': 'ዙሪክ',
  'an': 'Zúrich',
  'ar': 'زيورخ',
  'arc': 'ܬܣܝܪܝܟ',
  'arz': 'زوريخ',
  'az': 'Sürix',
  'bar': 'Zirich',
  'be'
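
The dictionary returned by `get()` is keyed first by section (`'labels'`, `'aliases'`, ...) and then by language code, as the output above shows. The following is a minimal sketch of a label lookup with a language fallback. The function `pick_label` and the `sample` dictionary are hypothetical names introduced for illustration; `sample` copies a few entries from the output above so that the snippet runs without a network connection.

```python
def pick_label(data, languages=("en",)):
    """Return the first label found in the given language order, or None."""
    labels = data.get("labels", {})
    for lang in languages:
        if lang in labels:
            return labels[lang]
    return None


# A few entries copied from the output above (stand-in for `zurich`).
sample = {
    "labels": {"en": "Zürich", "fr": "Zurich", "rm": "Turitg"},
    "aliases": {"en": ["City of Zurich", "Zurich", "ZH"]},
}

print(pick_label(sample, ("rm", "en")))   # Romansh first, English as fallback
print(pick_label(sample, ("xx", "fr")))   # unknown code, falls back to French
print(pick_label(sample, ("xx",)))        # no match at all
```

With the real data, `pick_label(zurich, ("gsw", "de", "en"))` would be called the same way.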

## Let's create a bot to keep the population counts up to date

The data is available from Open Data Zurich, in the dataset [Bevölkerung nach Stadtquartier, seit 1970](https://data.stadt-zuerich.ch/dataset/bev_bestand_jahr_quartier_seit1970_od3240).

In [7]:
result = requests.get(
    'https://data.stadt-zuerich.ch/api/3/action/package_show?id=bev_bestand_jahr_quartier_seit1970_od3240'
)
dataset = result.json()['result']
population_url = dataset['resources'][0]['url']
data = requests.get(population_url)
decoded_data = data.content.decode('utf-8')
cr = csv.reader(decoded_data.splitlines(), delimiter=',')
rows = list(cr)
for row in rows:
    print(row)

['\ufeff"StichtagDatJahr"', 'QuarSort', 'QuarLang', 'AnzBestWir']
['2018', '123', 'Hirzenbach', '12801']
['2017', '123', 'Hirzenbach', '12627']
['2016', '123', 'Hirzenbach', '12463']
['2015', '123', 'Hirzenbach', '11930']
['2014', '123', 'Hirzenbach', '11679']
['2013', '123', 'Hirzenbach', '11153']
['2012', '123', 'Hirzenbach', '11404']
['2011', '123', 'Hirzenbach', '11516']
['2010', '123', 'Hirzenbach', '11459']
['2009', '123', 'Hirzenbach', '11610']
['2008', '123', 'Hirzenbach', '11478']
['2007', '123', 'Hirzenbach', '11343']
['2006', '123', 'Hirzenbach', '11205']
['2005', '123', 'Hirzenbach', '11265']
['2004', '123', 'Hirzenbach', '11336']
['2003', '123', 'Hirzenbach', '11432']
['2002', '123', 'Hirzenbach', '11434']
['2001', '123', 'Hirzenbach', '11302']
['2000', '123', 'Hirzenbach', '11281']
['1999', '123', 'Hirzenbach', '11119']
['1998', '123', 'Hirzenbach', '11015']
['1997', '123', 'Hirzenbach', '11013']
['1996', '123', 'Hirzenbach', '10990']
['1995', '123', 'Hirzenbach', '10995'
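
The cell above takes the first entry of `dataset['resources']`, which relies on the CSV file being listed first. A slightly more defensive sketch selects the resource by its format instead. It assumes each resource dictionary carries a `format` and a `url` field, as CKAN's `package_show` responses usually do; `find_resource_url` and `example_dataset` are hypothetical names, and the example URLs are placeholders.

```python
def find_resource_url(dataset, wanted_format="CSV"):
    """Return the URL of the first resource with the given format, or None."""
    for resource in dataset.get("resources", []):
        # the format field may be missing or empty, so normalise it first
        resource_format = (resource.get("format") or "").upper()
        if resource_format == wanted_format.upper():
            return resource.get("url")
    return None


# Shortened, made-up stand-in for the `dataset` dictionary used above.
example_dataset = {
    "resources": [
        {"format": "JSON", "url": "https://example.org/population.json"},
        {"format": "CSV", "url": "https://example.org/population.csv"},
    ]
}

print(find_resource_url(example_dataset))
```

If no resource matches, the function returns `None`, so the caller can stop before attempting a download.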

In [8]:
# get rid of header
rows.pop(0)

['\ufeff"StichtagDatJahr"', 'QuarSort', 'QuarLang', 'AnzBestWir']
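
The first header field above starts with `\ufeff` and still contains its double quotes. The file begins with a UTF-8 byte-order mark, which plain `utf-8` decoding keeps, so the CSV reader no longer sees the quote at the start of the field. Since the header is discarded here, this does no harm, but if the column names are needed, decoding with `utf-8-sig` is one way to avoid it. The sketch below uses a made-up two-line sample (`raw`) in the same layout as the real file.

```python
import csv
import io

# Made-up sample: a UTF-8 byte-order mark followed by a header and one data
# line in the same layout as the CSV above.
raw = (
    b'\xef\xbb\xbf"StichtagDatJahr","QuarSort","QuarLang","AnzBestWir"\n'
    b'2018,123,Hirzenbach,12801\n'
)

# 'utf-8-sig' decodes UTF-8 and drops a leading byte-order mark if present.
text = raw.decode("utf-8-sig")

# DictReader consumes the header line and uses it as the keys of each row.
records = list(csv.DictReader(io.StringIO(text)))
print(records[0]["StichtagDatJahr"], records[0]["AnzBestWir"])
```

With `csv.DictReader`, rows are addressed by column name (`record["QuarSort"]`) rather than by position (`row[1]`).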

In [9]:
# map each QuarSort code to a Wikidata item (here: the sandbox items on wikidata.org)
zurich_quarters = {
    '123': "Q13406268",
    '122': "Q15397819",
}

In [10]:
def load_item_from_repo(repo, item_id):
    item = pywikibot.ItemPage(repo, item_id)
    item.get()
    return item

In [11]:
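def existing_claim_from_year(item, year):
    # Return the first population (P1082) claim whose "point in time" (P585)
    # qualifier matches the given year, or None if there is no such claim.
    time_str = pywikibot.WbTime(year=year).toTimestr()
    for claim in item.claims.get('P1082', []):
        # a claim without a P585 qualifier must not abort the whole search
        for qualifier in claim.qualifiers.get('P585', []):
            target = qualifier.getTarget()
            if target is not None and target.toTimestr() == time_str:
                return claim
    return None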

In [17]:
def create_popultation_claim(site, repo, item, value, year, url):
    # property IDs: population, point in time, reference URL
    population_prop_id = 'P1082'
    time_prop_id = 'P585'
    url_prop_id = 'P854'

    print("Add new value: %s (%s)" % (value, year))
    print()

    # population claim
    population_claim = pywikibot.Claim(repo, population_prop_id)
    population_claim.setTarget(
        pywikibot.WbQuantity(amount=int(value), site=site)
    )
    item.addClaim(population_claim)

    # time qualifier (year precision)
    qualifier = pywikibot.Claim(repo, time_prop_id)
    qualifier.setTarget(pywikibot.WbTime(year=year))
    population_claim.addQualifier(qualifier)

    # source: the URL the value was taken from
    source = pywikibot.Claim(repo, url_prop_id)
    source.setTarget(url)
    population_claim.addSource(source)

    print("Added population claim to %s for year %d" % (item.id, year))

In [None]:
# Loop over CSV file
for row in rows:
    year = int(row[0])
    qnr = row[1]
    quarter = row[2]
    population_value = row[3]
    
    # look up the Wikidata item for this quarter
    try:
        item_id = zurich_quarters[qnr]
    except KeyError:
        print("No mapping for qnr %s found." % qnr)
        continue

    # load the item and add a claim unless one already exists for this year
    try:
        item = load_item_from_repo(repo, item_id)
        population_claim = existing_claim_from_year(item, year)
        if population_claim is None:
            # add a new statement
            create_popultation_claim(site, repo, item, population_value, year, population_url)
        else:
            print("Population claim already exists on %s for year %d, skipping" % (item_id, year))
    except pywikibot.data.api.APIError as e:
        # stop at the first API error instead of retrying blindly
        print("API Error: %s" % (e))
        break
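
Before letting a loop like the one above write to Wikidata, it can help to preview which rows would lead to an edit. The sketch below applies the same decisions (no mapping, claim already present, new claim) to plain Python data without contacting the API. `plan_updates`, `sample_rows`, `mapping` and `already_there` are hypothetical names; the row for quarter `999` and the existing claim for 2017 are made up to exercise each branch.

```python
def plan_updates(rows, quarter_items, existing):
    """Split rows into (to_add, skipped) without contacting Wikidata.

    rows: lists of [year, quarter number, quarter name, population]
    quarter_items: quarter number -> item ID
    existing: set of (item ID, year) pairs that already have a claim
    """
    to_add, skipped = [], []
    for year, qnr, _name, population in rows:
        item_id = quarter_items.get(qnr)
        if item_id is None:
            skipped.append((qnr, year, "no mapping"))
        elif (item_id, int(year)) in existing:
            skipped.append((qnr, year, "claim exists"))
        else:
            to_add.append((item_id, int(year), int(population)))
    return to_add, skipped


sample_rows = [
    ["2018", "123", "Hirzenbach", "12801"],
    ["2017", "123", "Hirzenbach", "12627"],
    ["2018", "999", "Unmapped", "1"],       # made-up row without a mapping
]
mapping = {"123": "Q13406268"}
already_there = {("Q13406268", 2017)}       # made-up existing claim

to_add, skipped = plan_updates(sample_rows, mapping, already_there)
print(to_add)
print(skipped)
```

In a real run, `existing` would have to be collected from the items first, for example with `existing_claim_from_year`.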