# Populate the portal using the Wikibase API
Goal of this notebook: Import the data prepared in `filter_papers_by_software.ipynb` into the data structure set up in `import_wikidata_properties.ipynb`.

The API documentation is [here](https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity)

In [71]:
# import common definitions and functions
%run WB_common.ipynb

## Login to the Wikibase

### Network settings
* Make sure the Wikibase is running, e.g. using [MaRDI4NFDI/portal-compose](https://github.com/MaRDI4NFDI/portal-compose)
* Make sure this Jupyter notebook is in the same Docker network as the wiki. This is configured in the docker-compose file of the notebook container:

```
networks:
  default:
    external: true
    name: portal-compose_default
```

Networks can be listed using `docker network ls`. Here, `portal-compose_default` is the name of the network started by portal-compose.
* Verify that this notebook is in the correct network using `docker network inspect portal-compose_default`

The wiki is then accessible from the notebook container at `http://mardi-wikibase`.

In [181]:
import requests
import json 
import configparser

# url of the API endpoint
WIKIBASE_API = 'http://mardi-wikibase/w/api.php?format=json'

def login(username, botpwd):
    """
    Starts a new session and logs in using a bot account.
    @username, @botpwd string: credentials of an existing bot user
    @returns requests.sessions.Session object
    """
    # create a new session
    session = requests.Session()

    # get login token
    r1 = session.get(WIKIBASE_API, params={
        'format': 'json',
        'action': 'query',
        'meta': 'tokens',
        'type': 'login'
    })
    # login with bot account
    r2 = session.post(WIKIBASE_API, data={
        'format': 'json',
        'action': 'login',
        'lgname': username,
        'lgpassword': botpwd,
        'lgtoken': r1.json()['query']['tokens']['logintoken'],
    })
    # raise when login failed
    if r2.json()['login']['result'] != 'Success':
        raise WBAPIException(r2.json()['login'])
        
    return session

### Credentials
* Login to the wiki as admin
* Go to Special:BotPasswords, create a bot user, call it "import", grant it "High-volume editing", "Edit existing pages", "Create, edit, and move pages"
* Copy `data/credentials.tpl` to `data/credentials.ini`. Replace the username and password by those of the newly created bot user (make sure not to commit this file)
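
The contents of `data/credentials.tpl` are not shown in this notebook, so the following is only a sketch of what `data/credentials.ini` could look like, inferred from the keys read below (section `default`, keys `username` and `password`). The values are placeholders. MediaWiki bot passwords are used with a login name of the form `<wiki user>@<bot name>`, so for a bot called "import" created by the user `Admin` (an assumed user name) the login name would be `Admin@import`.

```python
import configparser

# Hypothetical contents of data/credentials.ini. The section and key names
# match what the notebook reads; the values are placeholders.
EXAMPLE_INI = """
[default]
username = Admin@import
password = replace-with-generated-bot-password
"""

config = configparser.ConfigParser()
# read_string parses the text directly, so this sketch needs no file on disk
config.read_string(EXAMPLE_INI)

username = config['default']['username']
botpwd = config['default']['password']
print(username)
```

Note that `[default]` in lowercase is an ordinary section; only the uppercase `[DEFAULT]` has a special meaning in `configparser`.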

In [183]:
# read bot username and password from data/credentials.ini
config = configparser.ConfigParser()
config.read('data/credentials.ini')
username = config['default']['username']
botpwd = config['default']['password']

session = login(username, botpwd)

## Create a wikibase property
A function that creates a new Wikibase property and returns its id.

If a property with the same label already exists in the wiki, it is not overwritten; an error is raised instead.

In [173]:
def get_csrf_token(session):
    """Gets a security (CSRF) token."""
    params1 = {
        "action": "query",
        "meta": "tokens",
        "type": "csrf"
    }
    r1 = session.get(WIKIBASE_API, params=params1)
    token = r1.json()['query']['tokens']['csrftoken']

    return token
    

def create_property(session, data):
    """
    Creates a wikibase property.
    @session requests.sessions.Session: session obtained from login 
    @data python dict: creation parameters of the property
    @returns string: id of the new property
    """
    token = get_csrf_token(session)
    
    params = {
        "action": "wbeditentity",
        "format": "json",
        'new': 'property',
        'data': json.dumps(data),
        'token': token
    }
    r1 = session.post(WIKIBASE_API, data=params)
    # keep the parsed body in its own variable instead of overwriting r1.json
    result = r1.json()
    
    # raise when edit failed
    if 'error' in result:
        raise WBAPIException(result['error'])

    return result['entity']['id']

For example, creating a property with the parameters below returns an id string of the form 'Px' (where x is a number).

The property can then be viewed in the wiki at `http://localhost:8080/wiki/Property:Px`

In [174]:
data = {"labels":{"en":{"language":"en","value":"Propertylabel9"}},"descriptions":{"en":{"language":"en","value":"Propertydescription"}},"datatype":"string"}
create_property(session, data)

'P15'
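
The nested dict passed to `create_property` is easy to mistype when written by hand. As a minimal sketch, a small helper can build it from a label, a description and a datatype. The helper name `make_property_data` and its `language` parameter are hypothetical and not part of the notebook.

```python
def make_property_data(label, description, datatype, language='en'):
    """Builds the data dict for a new property (hypothetical helper)."""
    return {
        'labels': {language: {'language': language, 'value': label}},
        'descriptions': {language: {'language': language, 'value': description}},
        'datatype': datatype,
    }


# Same parameters as in the example cell above
data = make_property_data('Propertylabel9', 'Propertydescription', 'string')
print(data['datatype'])
```

The result can be passed to `create_property(session, data)` in the same way as the hand-written dict.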

## Create a wikibase entity
A function that creates a new Wikibase entity (item) and returns its id.

If an item with the same label already exists in the wiki, it is not overwritten; a new item with the same label is created.

In [175]:
def create_entity(session, data):
    """
    Creates a wikibase entity.
    @session requests.sessions.Session: session obtained from login 
    @data python dict: creation parameters of the entity
    @returns string: id of the new entity
    """
    token = get_csrf_token(session)
    
    params = {
        "action": "wbeditentity",
        "format": "json",
        'new': 'item',
        'data': json.dumps(data),
        'token': token
    }
    r1 = session.post(WIKIBASE_API, data=params)
    # keep the parsed body in its own variable instead of overwriting r1.json
    result = r1.json()
    
    # raise when edit failed
    if 'error' in result:
        raise WBAPIException(result['error'])

    return result['entity']['id']

For example, creating an item with the parameters below returns an id string of the form 'Qx' (where x is a number).

The item can then be viewed in the wiki at `http://localhost:8080/wiki/Item:Qx`

In [176]:
data={"labels":{"de":{"language":"de","value":"de-value"},"en":{"language":"en","value":"en-value"}}}
create_entity(session, data)

'Q1'
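
Because `create_entity` creates a new item even when the label already exists, rerunning an import cell produces duplicates. One possible guard, sketched here under assumptions, is to look up the label with the `wbsearchentities` API action before creating the item. The function name `find_items_by_label` is hypothetical. The search matches label prefixes and aliases, so the sketch keeps only exact label matches, and whether freshly created items show up immediately depends on how search is set up in the wiki.

```python
# Same endpoint as in the notebook; repeated so the sketch is self-contained
WIKIBASE_API = 'http://mardi-wikibase/w/api.php'


def find_items_by_label(session, label, language='en'):
    """Returns the ids of items whose label equals `label` (hypothetical helper).

    `session` is expected to behave like the requests session from login().
    """
    r = session.get(WIKIBASE_API, params={
        'action': 'wbsearchentities',
        'format': 'json',
        'search': label,
        'language': language,
        'type': 'item',
        'limit': 50,
    })
    hits = r.json().get('search', [])
    # the search also returns prefix and alias matches; keep exact labels only
    return [hit['id'] for hit in hits if hit.get('label') == label]
```

A cell could then call `create_entity(session, data)` only when `find_items_by_label(session, 'en-value')` returns an empty list.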

## Import the authors list
Before importing anything, make sure the corresponding items and properties have been imported from wikidata. See notebook `WB_wikidata_properties.ipynb`.

A subsample of the authors list was created in notebook `filter_papers_by_software.ipynb`. This list contains the authors of papers related to the first 1000 software entries in the list of software (`data/swMATH-software-list.csv`). The list of authors is in file `data/all_authors.csv.zip`.

In [165]:
import pandas as pd

# load the list of zbMath authors
authors_df = pd.read_csv('data/all_authors.csv.zip') 
authors_df.head()

Unnamed: 0,author_id,author_name
0,aardal.karen-i,"Aardal, Karen"
1,aarts.gert,"Aarts, Gert"
2,abad.alberto-j,"Abad, Alberto"
3,abada.asmaa,"Abada, Asmaa"
4,abanades.miguel-angel,"Abánades, Miguel A."


Use the `create_entity` function to import the first 100 authors into the wiki.
The Q-id returned by the Wikibase is stored in a new `qid` column of the authors dataframe.

In [210]:
for i,current in authors_df[:100].iterrows():

    data = {
        'labels':{'en':{'language':'en','value':current['author_name']}},
        'claims': [
            # instance of 'human'
            {'mainsnak':{
                'snaktype':'value', 'property':'P31', 'datavalue':{'type':'wikibase-entityid', 'value': {'entity-type':'item','id':'Q5'}}},
            'type': 'statement', 'rank': 'normal'},
            # zbMath author id
            {'mainsnak':{
                'snaktype':'value', 'property':'P1556', 'datavalue': {'type':'string', 'value': current['author_id']}},
            'type': 'statement', 'rank': 'normal'}
            ]
    }
    # import into wikibase, save Qid
    authors_df.loc[i, 'qid'] = create_entity(session, data)
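
The two claims in the loop above share the same statement structure and differ only in the data value. As a sketch, they could be built by two small helpers. The names `item_claim`, `string_claim` and `author_data` are hypothetical; the property and item ids are the ones used in the loop.

```python
def _statement(prop, datavalue):
    """Wraps a data value in a normal-rank statement (hypothetical helper)."""
    return {
        'mainsnak': {'snaktype': 'value', 'property': prop, 'datavalue': datavalue},
        'type': 'statement',
        'rank': 'normal',
    }


def item_claim(prop, qid):
    """Claim whose value is another item, e.g. instance of (P31) human (Q5)."""
    return _statement(prop, {'type': 'wikibase-entityid',
                             'value': {'entity-type': 'item', 'id': qid}})


def string_claim(prop, value):
    """Claim whose value is a plain string, e.g. a zbMath author id."""
    return _statement(prop, {'type': 'string', 'value': value})


def author_data(author_id, author_name):
    """Builds the same data dict as the import loop for one author."""
    return {
        'labels': {'en': {'language': 'en', 'value': author_name}},
        'claims': [item_claim('P31', 'Q5'), string_claim('P1556', author_id)],
    }


data = author_data('aarts.gert', 'Aarts, Gert')
print(data['claims'][1]['mainsnak']['datavalue']['value'])
```

Inside the loop, `data = author_data(current['author_id'], current['author_name'])` would then replace the hand-written dict.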

## Import the software list
All software entries have already been imported into the MaRDI portal.
Here, the first 100 (out of about 40000) software entries are imported into the local wiki for testing.

In [213]:
# load the list of swMath software
software_df = pd.read_csv('data/swMATH-software-list.csv')
software_df = software_df[:100]
software_df.head()

Unnamed: 0,qid,P13,Len,#
0,,'0',swMATH,initial csv import 2021-12-17
1,,'1',FORTRAN,initial csv import 2021-12-17
2,,'2',SuperLU-DIST,initial csv import 2021-12-17
3,,'3',WHISPAR,initial csv import 2021-12-17
4,,'4',MULTI2D,initial csv import 2021-12-17


Use the `create_entity` function to import the software into the wiki.
The Q-id returned by the Wikibase is stored in the `qid` column of the software dataframe.

In [214]:
for i,current in software_df[:100].iterrows():

    data = {
        'labels':{'en':{'language':'en','value':current['Len']}},
        'claims': [
            # instance of 'software'
            {'mainsnak':{
                'snaktype':'value', 'property':'P31', 'datavalue':{'type':'wikibase-entityid', 'value': {'entity-type':'item','id':'Q7397'}}},
            'type': 'statement', 'rank': 'normal'},
            # swMath work id
            {'mainsnak':{
                'snaktype':'value', 'property':'P6830', 'datavalue': {'type':'string', 'value': current['P13']}},
            'type': 'statement', 'rank': 'normal'}
            ]
    }
    # import into wikibase, save Qid
    software_df.loc[i, 'qid'] = create_entity(session, data)

In [215]:
software_df.head()

Unnamed: 0,qid,P13,Len,#
0,Q120,'0',swMATH,initial csv import 2021-12-17
1,Q121,'1',FORTRAN,initial csv import 2021-12-17
2,Q122,'2',SuperLU-DIST,initial csv import 2021-12-17
3,Q123,'3',WHISPAR,initial csv import 2021-12-17
4,Q124,'4',MULTI2D,initial csv import 2021-12-17
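
Both import loops stop at the first API error, and the rows after the failing one are never imported. A minimal sketch of a more tolerant loop is shown below: it collects the new ids and the errors separately, so one rejected row does not abort the run. The function `import_records` is hypothetical, and it catches every `Exception` on purpose because the exception class raised by `create_entity` is defined outside this notebook. The demo uses a stand-in for `create_entity`, so it runs without a wiki.

```python
def import_records(records, build_data, create):
    """Imports (index, record) pairs and keeps going after errors.

    records:    iterable of (index, record) pairs, e.g. DataFrame.iterrows()
    build_data: function turning one record into the API data dict
    create:     function sending the data dict and returning the new id
    Returns two dicts keyed by index: new ids and raised errors.
    """
    qids, failures = {}, {}
    for index, record in records:
        try:
            qids[index] = create(build_data(record))
        except Exception as err:  # broad on purpose, see the text above
            failures[index] = err
    return qids, failures


# Stand-in for create_entity: rejects empty labels, otherwise invents an id
def fake_create(data):
    label = data['labels']['en']['value']
    if not label:
        raise ValueError('empty label')
    return 'Q-' + label


def build(row):
    return {'labels': {'en': {'language': 'en', 'value': row['Len']}}}


rows = [(0, {'Len': 'swMATH'}), (1, {'Len': ''}), (2, {'Len': 'FORTRAN'})]
qids, failures = import_records(rows, build, fake_create)
print(qids)
print(sorted(failures))
```

With the objects from this notebook, the call could look like `import_records(software_df.iterrows(), build, lambda d: create_entity(session, d))`, and `software_df['qid'] = pd.Series(qids)` would then fill the column for the rows that succeeded.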
