# ORES API Example
This example illustrates how to generate quality scores for article revisions using [ORES](https://www.mediawiki.org/wiki/ORES). It shows how to request the score for a specific revision; the score gives a probability for each of the possible article quality levels. The API documentation can be accessed from the main [ORES](https://ores.wikimedia.org) page. However, that documentation is sparse, so if you want more information you may have to dig around.

## License
This code example was developed by Dr. David W. McDonald for use in DATA 512, a course in the UW MS Data Science degree program. This code is provided under the [Creative Commons](https://creativecommons.org) [CC-BY license](https://creativecommons.org/licenses/by/4.0/). Revision 1.0 - May 13, 2022



In [1]:
# 
# These are standard python modules
import json, time, urllib.parse
#
# The 'requests' module is not a standard Python module. You will need to install this with pip/pip3 if you do not already have it
import requests

The example relies on some constants that help make the code a bit more readable.

In [2]:
#########
#
#    CONSTANTS
#

# The current ORES API endpoint
API_ORES_SCORE_ENDPOINT = "https://ores.wikimedia.org/v3"
# A template for mapping to the URL
API_ORES_SCORE_PARAMS = "/scores/{context}/{revid}/{model}"

# Use some delays so that we do not hammer the API with our requests
API_LATENCY_ASSUMED = 0.002       # Assuming roughly 2ms latency on the API and network
API_THROTTLE_WAIT = (1.0/100.0)-API_LATENCY_ASSUMED

# When making automated requests we should include something that is unique to the person making the request
# This should include an email - your UW email would be good to put in there
REQUEST_HEADERS = {
    'User-Agent': '<uwnetid@uw.edu>, University of Washington, MSDS DATA 512 - AUTUMN 2022'
}

# A dictionary of English Wikipedia article titles (keys) and sample revision IDs (values) that are used in the ORES scoring examples below
ARTICLE_REVISIONS = { 'Bison':1085687913 , 'Northern flicker':1086582504 , 'Red squirrel':1083787665 , 'Chinook salmon':1085406228 , 'Horseshoe bat':1060601936 }

# This template lists the basic parameters for making an ORES request
ORES_PARAMS_TEMPLATE = {
    "context": "enwiki",        # which WMF project for the specified revid
    "revid" : "",               # the revision to be scored - this will probably change each call
    "model": "articlequality"   # the AI/ML scoring model to apply to the revision
}
#
# The current ML models for English wikipedia are:
#   "articlequality"
#   "articletopic"
#   "damaging"
#   "version"
#   "draftquality"
#   "drafttopic"
#   "goodfaith"
#   "wp10"
#
# The specific documentation on these is scattered so if you want to use one you'll have to look around.
#
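
To make the role of these constants concrete, here is a minimal sketch of how the endpoint and the parameter template combine into a request URL, and what the throttle wait works out to. It redefines the constants locally so that it runs on its own, it makes no network call, and it uses the sample 'Bison' revision ID listed above.

```python
# Local copies of the constants so this sketch runs on its own
API_ORES_SCORE_ENDPOINT = "https://ores.wikimedia.org/v3"
API_ORES_SCORE_PARAMS = "/scores/{context}/{revid}/{model}"

params = {"context": "enwiki", "revid": 1085687913, "model": "articlequality"}

# str.format fills each {name} placeholder from the dictionary
request_url = API_ORES_SCORE_ENDPOINT + API_ORES_SCORE_PARAMS.format(**params)
print(request_url)

# Throttle: aim for about 100 requests per second, less the assumed 2 ms latency
API_LATENCY_ASSUMED = 0.002
API_THROTTLE_WAIT = (1.0 / 100.0) - API_LATENCY_ASSUMED
print(round(API_THROTTLE_WAIT, 3))
```

With these values the wait between requests is roughly 8 ms, so a loop over many revisions stays a little under 100 requests per second.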

The API request will be made using one procedure. The idea is to make this reusable. The procedure is parameterized, but relies on the constants above for the important parameters. The underlying assumption is that this will be used to request data for a set of article revisions. Therefore, the main parameter is article_revid.

In [30]:
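#------ Load the JSON file with article titles and last revision IDs ------#
import json
import pandas as pd

def create_article_revisions(file):
    # Read the whole file; 'with' closes it even if parsing fails
    with open(file) as f:
        records = json.load(f)

    frames = []
    for record in records:   # visit every record, including the last one
        # 'pages' is already a dict after json.load; passing it to json.loads raises
        # "TypeError: the JSON object must be str, bytes or bytearray, not dict"
        pages = record['pages']
        # Assumption: 'pages' maps each page ID to a dict of page fields,
        # so orient="index" yields one row per page
        frames.append(pd.DataFrame.from_dict(pages, orient="index"))
    # pd.concat rejects an empty list, so return an empty frame in that case
    return pd.concat(frames) if frames else pd.DataFrame()

In [31]:
output_df = create_article_revisions("politicians_by_country.json")
output_df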

In [3]:
#########
#
#    PROCEDURES/FUNCTIONS
#

def request_ores_score_per_article(article_revid = None, 
                                   endpoint_url = API_ORES_SCORE_ENDPOINT, 
                                   endpoint_params = API_ORES_SCORE_PARAMS, 
                                   request_template = ORES_PARAMS_TEMPLATE,
                                   headers = REQUEST_HEADERS,
                                   features=False):
    # Make sure we have an article revision id
    if not article_revid: return None
    
    # copy the template before setting the revision id, so the shared default dictionary is never modified
    request_info = dict(request_template)
    request_info['revid'] = article_revid
    
    # now, create a request URL by combining the endpoint_url with the parameters for the request
    request_url = endpoint_url+endpoint_params.format(**request_info)
    
    # the features used by the ML model can sometimes be returned as well as scores
    if features:
        request_url = request_url+"?features=true"
    
    # make the request
    try:
        # we'll wait first, to make sure we don't exceed the limit in the situation where an exception
        # occurs during the request processing - throttling is always a good practice with a free
        # data source like ORES - or other community sources
        if API_THROTTLE_WAIT > 0.0:
            time.sleep(API_THROTTLE_WAIT)
        response = requests.get(request_url, headers=headers)
        json_response = response.json()
    except Exception as e:
        print(e)
        json_response = None
    return json_response
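

Since the procedure is meant to be reused across a set of article revisions, a small driver loop might look like the sketch below. The names `score_all` and `fake_fetch` are hypothetical and not part of the original example. The fetch function is passed in as a parameter so the loop can be tried without touching the network; in real use you would pass `request_ores_score_per_article` instead.

```python
def score_all(article_revisions, fetch):
    """Call fetch(revid) for each title and separate successes from failures.

    article_revisions: dict mapping article title -> revision ID
    fetch: callable returning parsed JSON for a revision ID, or None on failure
    """
    scores, failed = {}, []
    for title, revid in article_revisions.items():
        response = fetch(revid)
        if response is None:
            failed.append(title)      # remember the title so it can be retried later
        else:
            scores[title] = response
    return scores, failed


def fake_fetch(revid):
    # Stand-in for the network call (an assumption for illustration only)
    return {"revid": revid} if revid == 1085687913 else None


scores, failed = score_all({"Bison": 1085687913, "Missing page": 1}, fake_fetch)
print(scores)
print(failed)
```

Keeping the failed titles in a separate list makes it easy to retry them without requesting the successful ones again.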


In [4]:
ARTICLE = "Bison"
print("Getting ORES scores for '%s' with revid: %d"%(ARTICLE,ARTICLE_REVISIONS[ARTICLE]))
score = request_ores_score_per_article(ARTICLE_REVISIONS[ARTICLE])
print(json.dumps(score,indent=4))

Getting ORES scores for 'Bison' with revid: 1085687913
{
    "enwiki": {
        "models": {
            "articlequality": {
                "version": "0.9.2"
            }
        },
        "scores": {
            "1085687913": {
                "articlequality": {
                    "score": {
                        "prediction": "FA",
                        "probability": {
                            "B": 0.08229363159217795,
                            "C": 0.03896549372657358,
                            "FA": 0.5416846323313819,
                            "GA": 0.3211324199836142,
                            "Start": 0.011471779513278837,
                            "Stub": 0.0044520428529734885
                        }
                    }
                }
            }
        }
    }
}
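
The nested response above can be reduced to just the prediction and its probabilities. The helper below is a minimal sketch; `extract_quality` is a hypothetical name, and it assumes that an entry without a `"score"` key (for example, an error entry) should be treated as having no score. Note that the revision ID appears as a string key in the JSON, so an integer ID has to be converted before the lookup.

```python
def extract_quality(response, revid, context="enwiki", model="articlequality"):
    """Return (prediction, probabilities) for one revision, or None if no score is present."""
    try:
        # JSON object keys are strings, so the integer revid must be converted
        entry = response[context]["scores"][str(revid)][model]
    except (KeyError, TypeError):
        # TypeError covers a None response; KeyError covers a missing level
        return None
    score = entry.get("score")
    if score is None:
        return None
    return score["prediction"], score["probability"]


# Trimmed copy of the 'Bison' response printed above
sample = {
    "enwiki": {
        "scores": {
            "1085687913": {
                "articlequality": {
                    "score": {
                        "prediction": "FA",
                        "probability": {"B": 0.08229363159217795, "FA": 0.5416846323313819}
                    }
                }
            }
        }
    }
}

result = extract_quality(sample, 1085687913)
print(result[0])
```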


In [5]:
ARTICLE = "Red squirrel"
print("Getting ORES scores for '%s' with revid: %d"%(ARTICLE,ARTICLE_REVISIONS[ARTICLE]))
score = request_ores_score_per_article(ARTICLE_REVISIONS[ARTICLE])
print(json.dumps(score['enwiki']['scores'],indent=4))

Getting ORES scores for 'Red squirrel' with revid: 1083787665
{
    "1083787665": {
        "articlequality": {
            "score": {
                "prediction": "C",
                "probability": {
                    "B": 0.3438908635045126,
                    "C": 0.5511452526996617,
                    "FA": 0.0360496066555655,
                    "GA": 0.056484346855134204,
                    "Start": 0.009081450706888131,
                    "Stub": 0.0033484795782379364
                }
            }
        }
    }
}
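
The prediction is simply the quality level with the highest probability, and the probabilities across all levels sum to approximately 1. The sketch below checks both points using the 'Red squirrel' probabilities printed above; the variable names are illustrative only.

```python
# Probabilities copied from the 'Red squirrel' output above
probability = {
    "B": 0.3438908635045126,
    "C": 0.5511452526996617,
    "FA": 0.0360496066555655,
    "GA": 0.056484346855134204,
    "Start": 0.009081450706888131,
    "Stub": 0.0033484795782379364,
}

# Sort the quality levels from most to least likely
ranked = sorted(probability.items(), key=lambda item: item[1], reverse=True)
for label, p in ranked:
    print(f"{label:>5}: {p:.3f}")

total = sum(probability.values())        # should be very close to 1.0
margin = ranked[0][1] - ranked[1][1]     # gap between the top two levels
print(f"total={total:.6f}  margin={margin:.3f}")
```

Here "C" leads "B" by about 0.21, so the model leans towards C but is far from certain. Looking at the margin as well as the top label can help when deciding how much weight to give a single prediction.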
