# DATA 512 - Homework 2: Considering Bias in Data


The goal of this assignment is to explore the concept of bias in data using Wikipedia articles. This assignment will consider articles on political figures from different countries. 

We will perform 3 main steps in the course of this notebook: 
1. Data Acquisition
2. Data Processing
3. Data Analysis

In [34]:
# Import relevant libraries
import pandas as pd
import requests
import json
import time
import os

## 1. Data Acquisition

### Overview
We first acquire the raw data provided to us. In this section, we load the dataset of politicians and extract each Wikipedia page title from its URL, which prepares the data for the API requests that follow.

We then query the MediaWiki API for the current revision ID of each politician's article and use those IDs to request quality scores from the ORES API. Articles for which no quality score can be retrieved are handled separately.

### Outputs
- The dataset is successfully loaded from `politicians_by_country_AUG.2024.csv`, displaying the first few rows to verify its structure.
- A new column, `page_title`, is created by extracting the Wikipedia page titles from the `url` column. This makes sure that we have the necessary identifiers for making API requests later.
- Each politician's `revision_id` is retrieved and added to the dataset, allowing for accurate reference to the most recent version of their Wikipedia articles.
- Quality scores are fetched from the ORES API and added to a new column, `quality_score`. This gives a predicted quality class for each article.
- Articles that do not return a quality score are logged in a separate dataframe, and a CSV file (`missing_scores.csv`) is generated for future reference.
- The error rate for the score retrieval process is calculated and printed, indicating the proportion of articles that could not be scored, which is crucial for understanding data completeness.
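
The bookkeeping described in the last two bullets can be sketched without any API calls. This is a minimal illustration, not the notebook's actual implementation: `summarize_missing` and the sample records are hypothetical, and a `quality_score` of `None` is assumed to mean that no score could be retrieved.

```python
def summarize_missing(records):
    """Split records into scored and missing, and compute the error rate.

    Each record is a dict with 'page_title' and 'quality_score' keys;
    a 'quality_score' of None is assumed to mean the lookup failed.
    """
    scored = [r for r in records if r["quality_score"] is not None]
    missing = [r["page_title"] for r in records if r["quality_score"] is None]
    # Error rate = share of articles for which no score came back
    error_rate = len(missing) / len(records) if records else 0.0
    return scored, missing, error_rate


# Hypothetical sample data, for illustration only
sample_records = [
    {"page_title": "Article_A", "quality_score": "Stub"},
    {"page_title": "Article_B", "quality_score": None},
    {"page_title": "Article_C", "quality_score": "GA"},
    {"page_title": "Article_D", "quality_score": "Start"},
]

scored, missing, error_rate = summarize_missing(sample_records)
print(f"Missing: {missing}")
print(f"Error rate: {error_rate * 100:.2f}%")
```

The `missing` list could then be written out with pandas, for example `pd.DataFrame({'page_title': missing}).to_csv('missing_scores.csv', index=False)`.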

In [127]:
# Step 1: Read the CSV file containing article data
df = pd.read_csv('politicians_by_country_AUG.2024.csv')

# Display the first few rows to understand the structure
df.head()

Unnamed: 0,name,url,country
0,Majah Ha Adrif,https://en.wikipedia.org/wiki/Majah_Ha_Adrif,Afghanistan
1,Haroon al-Afghani,https://en.wikipedia.org/wiki/Haroon_al-Afghani,Afghanistan
2,Tayyab Agha,https://en.wikipedia.org/wiki/Tayyab_Agha,Afghanistan
3,Khadija Zahra Ahmadi,https://en.wikipedia.org/wiki/Khadija_Zahra_Ah...,Afghanistan
4,Aziza Ahmadyar,https://en.wikipedia.org/wiki/Aziza_Ahmadyar,Afghanistan


In [10]:
# Step 2: Extract Wikipedia page titles from the 'url' column
df['page_title'] = df['url'].apply(lambda x: x.split("/")[-1])

# Display the updated DataFrame with the extracted page titles
df[['name', 'url', 'page_title']].head()

Unnamed: 0,name,url,page_title
0,Majah Ha Adrif,https://en.wikipedia.org/wiki/Majah_Ha_Adrif,Majah_Ha_Adrif
1,Haroon al-Afghani,https://en.wikipedia.org/wiki/Haroon_al-Afghani,Haroon_al-Afghani
2,Tayyab Agha,https://en.wikipedia.org/wiki/Tayyab_Agha,Tayyab_Agha
3,Khadija Zahra Ahmadi,https://en.wikipedia.org/wiki/Khadija_Zahra_Ah...,Khadija_Zahra_Ahmadi
4,Aziza Ahmadyar,https://en.wikipedia.org/wiki/Aziza_Ahmadyar,Aziza_Ahmadyar
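
Splitting on `/` keeps any percent-encoding that a URL may carry and would cut a title that itself contains a slash. The sketch below shows one possible way to handle both cases with the standard library. `title_from_url` is a hypothetical helper, and the percent-encoded URL is constructed for illustration rather than copied from the dataset.

```python
from urllib.parse import unquote, urlparse


def title_from_url(url):
    """Return the page title from an English Wikipedia article URL."""
    path = urlparse(url).path               # e.g. "/wiki/Tayyab_Agha"
    title = path.split("/wiki/", 1)[-1]     # everything after the "/wiki/" prefix
    return unquote(title)                   # decode %XX escapes, if any


print(title_from_url("https://en.wikipedia.org/wiki/Tayyab_Agha"))
print(title_from_url("https://en.wikipedia.org/wiki/Sa%C3%AFd_Chibane"))
```

For URLs that are already decoded and contain no slash in the title, this returns the same value as the `split("/")[-1]` approach used in the cell above.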


Enter your Wikimedia username, email address, and access token in the next cell; they are used in later cells. For more information on how to obtain these credentials, follow the steps listed [here](https://www.mediawiki.org/wiki/API:Info).

In [67]:
USERNAME = ""
email_address = "" 
ACCESS_TOKEN = ""
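
Hard-coding credentials in a notebook makes it easy to commit them by accident. One possible alternative, sketched here, reads them from environment variables. The variable names `WIKIMEDIA_USERNAME`, `WIKIMEDIA_EMAIL`, and `WIKIMEDIA_ACCESS_TOKEN` and the helper `load_credentials` are assumptions made for this example, not names defined by any Wikimedia API.

```python
import os


def load_credentials(env=os.environ):
    """Read credentials from a mapping, falling back to empty strings."""
    return {
        "username": env.get("WIKIMEDIA_USERNAME", ""),
        "email_address": env.get("WIKIMEDIA_EMAIL", ""),
        "access_token": env.get("WIKIMEDIA_ACCESS_TOKEN", ""),
    }


creds = load_credentials()
# Report which values are still unset without printing the secrets themselves
unset = [name for name, value in creds.items() if not value]
print("Unset credentials:", unset)
```
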

The next cell instantiates some constants that are used by the functions defined throughout this notebook. Revise it as you see fit.

In [102]:
#    CONSTANTS
#    The current LiftWing ORES API endpoint and prediction model
#
API_ORES_LIFTWING_ENDPOINT = "https://api.wikimedia.org/service/lw/inference/v1/models/{model_name}:predict"
API_ORES_EN_QUALITY_MODEL = "enwiki-articlequality"


#    The throttling rate is a function of the Access token that you are granted when you request the token. The constants
#    come from dissecting the token and getting the rate limits from the granted token. An example of that is below.

API_LATENCY_ASSUMED = 0.002       # Assuming roughly 2ms latency on the API and network
API_THROTTLE_WAIT = ((60.0*60.0)/5000.0)-API_LATENCY_ASSUMED  # The key authorizes 5000 requests per hour

#    When making automated requests we should include something that is unique to the person making the request
#    This should include an email - your UW email would be good to put in there
#    
#    Because all LiftWing API requests require some form of authentication, you need to provide your access token
#    as part of the header too
REQUEST_HEADER_TEMPLATE = {
    'User-Agent': "trishp3@uw.edu, University of Washington, MSDS DATA 512 - AUTUMN 2024",
    'Content-Type': 'application/json',
    'Authorization': f"Bearer {ACCESS_TOKEN}"
}

#    This is a template for the parameters that we need to supply in the headers of an API request
REQUEST_HEADER_PARAMS_TEMPLATE = {
    'email_address' : 'trishp3@uw.edu',         # your email address should go here
    'access_token'  : ACCESS_TOKEN        # the access token you create will need to go here
}

#    This is a template of the data required as a payload when making a scoring request of the ORES model
ORES_REQUEST_DATA_TEMPLATE = {
    "lang":        "en",     # must be English - we're scoring English Wikipedia revisions
    "rev_id":      "",       # this request requires a revision id
    "features":    True
}
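
The throttle constant above follows from simple arithmetic: 5000 requests per hour allows one request every 3600 / 5000 = 0.72 seconds, and subtracting the assumed 2 ms of latency leaves a wait of about 0.718 seconds. The sketch below restates that calculation. `throttle_wait` is a hypothetical helper, and the figure of 7155 articles is taken from the progress log printed later in this notebook.

```python
def throttle_wait(requests_per_hour, assumed_latency=0.002):
    """Seconds to sleep between requests to stay under an hourly limit."""
    seconds_per_request = (60.0 * 60.0) / requests_per_hour
    # Never return a negative wait if latency exceeds the per-request budget
    return max(seconds_per_request - assumed_latency, 0.0)


wait = throttle_wait(5000)
print(f"Wait between requests: {wait:.3f} s")

# Rough lower bound on the total sleep time for 7155 articles, one request each
total_minutes = 7155 * wait / 60
print(f"Minimum sleep time for 7155 requests: {total_minutes:.1f} min")
```
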

### Data Acquisition 1.2: Creating Functions
In this section we create four functions that build a dataset of page titles, revision IDs, and article quality classifications. I split up my work here as follows:

1. read each line of `politicians_by_country_AUG.2024.csv`,
2. make a page info request to get the current page revision, and
3. make an ORES request using the current revision ID of the page.

#### Functions Created:
- **`request_pageinfo_per_article`**: This function takes a single Wikipedia page title as input and makes a request to the MediaWiki API to fetch page information for that title.
- **`get_current_revision_id`**: This function calls `request_pageinfo_per_article` for a page title and extracts the current revision ID from the response.
- **`request_ores_score_per_article`**: This function makes a request to the ORES API to obtain quality scores for an article based on its current revision ID.
- **`get_ores_quality_score`**: This function calls `request_ores_score_per_article` for a revision ID and extracts the predicted quality class from the response.

Finally, we output a file named `ores_quality_scores.csv` containing the article title, the revision ID from the Wikimedia Page Info API, and the corresponding quality score from the ORES model.

Parts of these functions were taken from or inspired by Dr. David W. McDonald's sample code in *Article Page Info MediaWiki API Example* and *Requesting ORES scores through LiftWing ML Service API*. For links to the sample code, please visit the [Github Repository](https://github.com/trishaprasant/data-512-homework_2/tree/main/sample%20code) or open the sample code files `wp_ores_liftwing_example.ipynb` and `wp_page_info_example.ipynb`.


Here I create functions to obtain the corresponding revision ID for each article. They work as follows:

`request_pageinfo_per_article`:
- Fills in the request parameters for the MediaWiki API with the given page title.
- Sends the request to the API, pausing briefly beforehand to throttle the request rate.
- Returns the parsed JSON response, or `None` if the request fails.

`get_current_revision_id`:
- Takes a page title as input and calls the previous function with it.
- Isolates the current revision ID from the response and returns it, or returns `None` if no revision is found.
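
The extraction step can be tried offline against a hand-written response. The dictionary below only imitates the shape of a MediaWiki `action=query` reply with `prop=revisions` (pages keyed by page ID, each with a `revisions` list). The page ID and `parentid` values are invented, and `extract_revision_id` is a hypothetical stand-in for the parsing logic, not one of the notebook's functions.

```python
def extract_revision_id(response_data):
    """Return the first revision ID found in a page info response, or None."""
    if not response_data:
        return None
    pages = response_data.get("query", {}).get("pages", {})
    # The response keys pages by page ID, so take the first (and only) entry
    page_info = next(iter(pages.values()), {})
    revisions = page_info.get("revisions", [])
    return revisions[0]["revid"] if revisions else None


# Invented sample that mimics the response shape; not real API output
sample_response = {
    "query": {
        "pages": {
            "12345": {
                "pageid": 12345,
                "title": "Majah Ha Adrif",
                "revisions": [{"revid": 1251034492, "parentid": 0}],
            }
        }
    }
}
# A page that does not exist carries no 'revisions' key
missing_response = {"query": {"pages": {"-1": {"title": "No_Such_Page", "missing": ""}}}}

print(extract_revision_id(sample_response))
print(extract_revision_id(missing_response))
print(extract_revision_id(None))
```

Returning `None` for every failure mode keeps the calling loop simple, at the cost of not distinguishing a missing page from a failed request.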


In [108]:
# Constants
MEDIAWIKI_API_URL = "https://en.wikipedia.org/w/api.php"
API_HEADER_AGENT = 'User-Agent'
API_THROTTLE_WAIT = 0.002  
REQUEST_HEADERS = {
    'User-Agent': "trishp3@uw.edu, University of Washington, MSDS DATA 512 - AUTUMN 2024",
    'Content-Type': 'application/json',
    'Authorization': f"Bearer {ACCESS_TOKEN}"
}

# Template for MediaWiki page info requests
PAGEINFO_PARAMS_TEMPLATE = {
    "action": "query",
    "prop": "revisions",
    "rvprop": "ids",
    "format": "json",
    "titles": None 
}

def request_pageinfo_per_article(article_title=None, 
                                 endpoint_url=MEDIAWIKI_API_URL, 
                                 request_template=PAGEINFO_PARAMS_TEMPLATE,
                                 headers=REQUEST_HEADERS):
    """This function requests page info for a Wikipedia article using the Page Info API from Wikimedia"""
    
    # Ensure the article title is provided
    if article_title:
        request_template['titles'] = article_title
    else:
        raise Exception("Must supply an article title to make a pageinfo request.")

    # Ensure headers contain the UW email
    if API_HEADER_AGENT not in headers:
        raise Exception(f"The header data should include a '{API_HEADER_AGENT}' field that contains your UW email address.")
    
    if 'uwnetid@uw' in headers[API_HEADER_AGENT]:
        raise Exception(f"Use your UW email address in the '{API_HEADER_AGENT}' field.")

    # Make the request with throttling
    try:
        if API_THROTTLE_WAIT > 0.0:
            time.sleep(API_THROTTLE_WAIT)
        response = requests.get(endpoint_url, headers=headers, params=request_template)
        json_response = response.json()
    except Exception as e:
        print(f"Error during request: {e}")
        return None
    
    return json_response

def get_current_revision_id(page_title):
    """Function to get the current revision ID for a Wikipedia page"""
    response_data = request_pageinfo_per_article(article_title=page_title)
    
    if response_data:
        pages = response_data.get('query', {}).get('pages', {})
        page_info = next(iter(pages.values()), {})
        
        if 'revisions' in page_info:
            return page_info['revisions'][0]['revid']
    
    return None

Revision ID example for debugging purposes.

In [109]:
# Get the first page title from the dataframe
first_page_title = df['page_title'].iloc[0]

# Get the revision ID for the first page title
first_revision_id = get_current_revision_id(first_page_title)
first_revision_id

1251034492

Here I create functions to get the ORES quality score for a revision ID. They work as follows:

`request_ores_score_per_article`:
- Takes a revision ID, an email address, and an access token as input and builds the request URL and headers for the ORES API.
- Sends the request and returns the parsed JSON response, or `None` if the request fails.

`get_ores_quality_score`:
- Calls the previous function for a revision ID.
- Extracts the predicted quality class from the response and returns it, or returns `None` if the response contains no score.
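
The nested lookup that pulls the prediction out of a score response can also be exercised offline. The sample below imitates the nesting that the code in this notebook indexes into (`enwiki`, `scores`, the revision ID, `articlequality`, `score`, `prediction`). The revision ID and probability values are invented, and `extract_prediction` is a hypothetical helper.

```python
def extract_prediction(score_data, revision_id, wiki="enwiki", model="articlequality"):
    """Return the predicted quality class, or None if any level is missing."""
    try:
        return score_data[wiki]["scores"][str(revision_id)][model]["score"]["prediction"]
    except (KeyError, TypeError):
        # KeyError: a level is absent; TypeError: score_data is None
        return None


# Invented sample that mimics the response nesting; not real API output
sample_scores = {
    "enwiki": {
        "scores": {
            "123456789": {
                "articlequality": {
                    "score": {
                        "prediction": "Start",
                        "probability": {"Stub": 0.3, "Start": 0.6, "C": 0.1},
                    }
                }
            }
        }
    }
}

print(extract_prediction(sample_scores, 123456789))
print(extract_prediction(sample_scores, 999))
print(extract_prediction(None, 123456789))
```
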

In [None]:
# Constants
ORES_API_URL = "https://ores.wikimedia.org/v3/scores/enwiki/"
API_THROTTLE_WAIT = 0.002 # Throttle wait time to avoid exceeding request limits


# ORES request data template
ORES_REQUEST_DATA_TEMPLATE = {
    'rev_id': None,  #placeholder
    'model': 'articlequality'
}

def request_ores_score_per_article(article_revid=None, email_address=None, access_token=None,
                                   endpoint_url=ORES_API_URL, 
                                   model_name='articlequality', 
                                   request_data=ORES_REQUEST_DATA_TEMPLATE, 
                                   header_format=REQUEST_HEADER_TEMPLATE):
    """Function to request ORES quality score for a specific article revision"""
    
    # Ensure the necessary parameters are provided
    if article_revid:
        request_data['rev_id'] = article_revid
    else:
        raise Exception("Must provide an article revision id (rev_id) to score articles")
    
    if not email_address:
        raise Exception("Must provide an 'email_address' value")
    
    if not access_token:
        raise Exception("Must provide an 'access_token' value")
    
    # Construct request URL
    request_url = f"{endpoint_url}{article_revid}/?models={model_name}"
    
    # Create request headers
    headers = {key: value.format(email_address=email_address, access_token=access_token) 
               for key, value in header_format.items()}
    
    # Make the request with throttling
    try:
        if API_THROTTLE_WAIT > 0.0:
            time.sleep(API_THROTTLE_WAIT)
        response = requests.get(request_url, headers=headers)
        json_response = response.json()
    except Exception as e:
        print(f"Error during request: {e}")
        return None
    
    return json_response

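def get_ores_quality_score(revision_id, email_address, access_token):
    """Function to get the predicted ORES quality class for a revision ID"""
    score_data = request_ores_score_per_article(article_revid=revision_id, 
                                                email_address=email_address, 
                                                access_token=access_token)
    
    try:
        return score_data['enwiki']['scores'][str(revision_id)]['articlequality']['score']['prediction']
    except (KeyError, TypeError):
        # KeyError: the response has no score; TypeError: the request failed and returned None
        return None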

missing_scores = 0 # Variables to track missing scores
total_articles = len(df)

results = [] # Store the results in a list

ORES Score Example for debugging purposes:

In [114]:
ores_score = get_ores_quality_score(first_revision_id, email_address, ACCESS_TOKEN)
print(ores_score)

FA


In this step we perform the data acquisition using the functions we created. We output a file named `ores_quality_scores.csv`.

In [13]:
# Loop through each page, get revision ID and ORES quality score
for i, row in df.iterrows():
    page_title = row['page_title']
    print(f"Processing article {i+1}/{total_articles}: {page_title}")
    
    # Get the revision ID
    revision_id = get_current_revision_id(page_title)
    
    if revision_id:
        # Get the ORES quality score
        score = get_ores_quality_score(revision_id, email_address, ACCESS_TOKEN)
        if score:
            results.append({'page_title': page_title, 'revision_id': revision_id, 'quality_score': score})
        else:
            print(f"Could not retrieve score for {page_title}")
            missing_scores += 1
    else:
        print(f"Could not retrieve revision ID for {page_title}")
        missing_scores += 1
    
    # Sleep to avoid overloading the API
    time.sleep(0.5)

# Calculate the error rate
error_rate = missing_scores / total_articles
print(f"Error Rate: {error_rate * 100:.2f}%")

# Save the results to a new CSV file
results_df = pd.DataFrame(results)
results_df.to_csv("ores_quality_scores.csv", index=False)

Processing article 1/7155: Majah_Ha_Adrif
Processing article 2/7155: Haroon_al-Afghani
Processing article 3/7155: Tayyab_Agha
Processing article 4/7155: Khadija_Zahra_Ahmadi
Processing article 5/7155: Aziza_Ahmadyar
Processing article 6/7155: Muqadasa_Ahmadzai
Processing article 7/7155: Mohammad_Sarwar_Ahmedzai
Processing article 8/7155: Amir_Muhammad_Akhundzada
Processing article 9/7155: Nasrullah_Baryalai_Arsalai
Processing article 10/7155: Abdul_Rahim_Ayoubi
Processing article 11/7155: Ismael_Balkhi
Processing article 12/7155: Abdul_Baqi_Turkistani
Processing article 13/7155: Mohammad_Ghous_Bashiri
Processing article 14/7155: Jan_Baz
Processing article 15/7155: Bashir_Ahmad_Bezan
Processing article 16/7155: Rafiullah_Bidar
Processing article 17/7155: Mohammad_Siddiq_Chakari
Processing article 18/7155: Cheragh_Ali_Cheragh
Processing article 19/7155: Nasir_Ahmad_Durrani
Processing article 20/7155: Muhammad_Hashim_Esmatullahi
Processing article 21/7155: Ezatullah_(Nangarhar)

Processing article 181/7155: Abdeslam_Bouchouareb
Processing article 182/7155: Ibrahim_Boughali
Processing article 183/7155: Yahia_Boukhari
Processing article 184/7155: Mohamed_Bouslimani
Processing article 185/7155: Foued_Chehat
Processing article 186/7155: Hamza_Al_Sid_Cheikh
Processing article 187/7155: Hocine_Cherhabil
Processing article 188/7155: Abderrahmane_Meziane_Chérif
Processing article 189/7155: Saïd_Chibane
Processing article 190/7155: Brahim_Chibout
Processing article 191/7155: Youcef_Chorfa
Processing article 192/7155: Fazia_Dahleb
Processing article 193/7155: Taha_Derbal
Processing article 194/7155: Nassim_Diafat
Processing article 195/7155: Mokhtar_Didouche
Processing article 196/7155: Bouras_Djamel
Processing article 197/7155: Fatiha_Serour
Processing article 198/7155: Ahmed_Hamiani
Processing article 199/7155: Nadir_Kassab
Processing article 200/7155: Brahim_Djamel_Kassali
Processing article 201/7155: Kaoutar_Krikou
Processing article 202/7155: Sassi_Lamouri

Processing article 353/7155: Carlos_Kunkel
Processing article 354/7155: Norberto_La_Porta
Processing article 355/7155: Andrés_Larroque
Processing article 356/7155: Julián_de_Leyva
Processing article 357/7155: Mónica_López_(politician)
Processing article 358/7155: Lucio_Victorio_Mansilla
Processing article 359/7155: Nicolás_Massot
Processing article 360/7155: Martín_Menem
Processing article 361/7155: Juan_Mignaburu
Processing article 362/7155: Guillermo_Montenegro
Processing article 363/7155: Mariano_Moreno
Processing article 364/7155: Marcelo_Muniagurria
Processing article 365/7155: Fernando_Nadra
Processing article 366/7155: Rómulo_Sebastián_Naón
Processing article 367/7155: Santiago_G._O'Farrell
Processing article 368/7155: Alfredo_Olmedo
Processing article 369/7155: Miguel_Ángel_Pesce
Processing article 370/7155: Carlos_Petroni
Processing article 371/7155: Alejo_Peyret
Processing article 372/7155: Aldo_Pignanelli
Processing article 373/7155: Abel_Posse

Processing article 522/7155: Asaf_Hajiyev
Processing article 523/7155: Haji_Molla_Ahmad_Nuruzade
Processing article 524/7155: Farhad_Hajiyev
Processing article 525/7155: Hasan_Majidov
Processing article 526/7155: Hasan_Hasanli
Processing article 527/7155: Jamil_Hasanli
Processing article 528/7155: Ali_M._Hasanov
Processing article 529/7155: Zakir_Hasanov
Processing article 530/7155: Ali_Huseynli
Processing article 531/7155: Alisahib_Huseynov
Processing article 532/7155: Ali_Ibrahimov
Processing article 533/7155: Irshad_Aliyev
Processing article 534/7155: Mamed_Iskenderov
Processing article 535/7155: Vidadi_Isgandarov
Processing article 536/7155: Ajdar_Ismailov
Processing article 537/7155: Saftar_Jafarov
Processing article 538/7155: Jeyhun_Jalilov
Processing article 539/7155: Arastun_Javadov
Processing article 540/7155: Rovshan_Javadov
Processing article 541/7155: Anar_Karimov
Processing article 542/7155: Boris_Kevorkov
Processing article 543/7155: Gurban_Khalilov

Processing article 690/7155: Maulana_Shakhawat
Processing article 691/7155: Abul_Kalam_Shamsuddin
Processing article 692/7155: Badruddoza_Ahmed_Shuja
Processing article 693/7155: Sourendra_Nath_Chakraborty
Processing article 694/7155: Mohammad_Sultan
Processing article 695/7155: Syed_Sayedul_Haque_Suman
Processing article 696/7155: Mohammad_Toaha
Processing article 697/7155: Badruddin_Umar
Processing article 698/7155: Zahidunnabi_Dewan_Shamim
Processing article 699/7155: Mohammad_Zillur_Rahman
Processing article 700/7155: Elliott_Belgrave
Processing article 701/7155: Chad_Blackman
Processing article 702/7155: Santia_Bradshaw
Processing article 703/7155: Sonia_Browne
Processing article 704/7155: Michael_A._Carrington
Processing article 705/7155: William_Fondleroy_Duguid
Processing article 706/7155: Reginald_Farley
Processing article 707/7155: Adrian_Forde
Processing article 708/7155: Cynthia_Y._Forde
Processing article 709/7155: Ian_Gooding-Edghill
Processing article 710/7155: Damien_K.

Processing article 865/7155: Rinzin_Jamtsho
Processing article 866/7155: Shingkhar_Lam
Processing article 867/7155: Tshewang_Lhamo
Processing article 868/7155: Kuenga_Loday
Processing article 869/7155: Dorji_Namgyal
Processing article 870/7155: Jigme_Namgyal_(Bhutan)
Processing article 871/7155: Kunzang_C._Namgyel
Processing article 872/7155: Wangdi_Norbu
Processing article 873/7155: Pema_Chewang
Processing article 874/7155: Pema_Gyamtsho
Processing article 875/7155: Kinga_Penjor
Processing article 876/7155: Yeshey_Penjor
Processing article 877/7155: Thakur_S._Powdyel
Processing article 878/7155: Jai_Bir_Rai
Processing article 879/7155: Karma_Rangdol
Processing article 880/7155: Chogyal_Dago_Rigdzin
Processing article 881/7155: Loknath_Sharma
Processing article 882/7155: Tenzin_(politician)
Processing article 883/7155: Sonam_Tobgye
Processing article 884/7155: Dorji_Tshering
Processing article 885/7155: Kinga_Tshering
Processing article 886/7155: Namgay_Tshering

Processing article 1040/7155: Dora_Nascimento
Processing article 1041/7155: Carlos_Neder
Processing article 1042/7155: Domingos_Gomes_de_Aguiar_Neto
Processing article 1043/7155: Manuel_Luís_Osório,_Marquis_of_Erval
Processing article 1044/7155: Hildebrando_Pascoal
Processing article 1045/7155: Astrojildo_Pereira
Processing article 1046/7155: Joaquim_Alvaro_Pereira_Leite
Processing article 1047/7155: Francisco_José_Pinheiro
Processing article 1048/7155: Mario_Pinotti
Processing article 1049/7155: Henrique_Pizzolato
Processing article 1050/7155: Plínio_de_Arruda_Sampaio_Jr.
Processing article 1051/7155: Marcio_Pochmann
Processing article 1052/7155: Presidency_of_Castelo_Branco
Processing article 1053/7155: Presidency_of_Eurico_Gaspar_Dutra
Processing article 1054/7155: Presidency_of_Fernando_Henrique_Cardoso
Processing article 1055/7155: Presidency_of_Itamar_Franco
Processing article 1056/7155: Presidency_of_Juscelino_Kubitschek
Processing article 1057/7155: Irondi_Pugliesi

Processing article 1208/7155: Gani_Markan
Processing article 1209/7155: Maw_Htun_Aung
Processing article 1210/7155: Myint_Swe_(politician,_born_1965)
Processing article 1211/7155: Myo_Yan_Naung_Thein
Processing article 1212/7155: Naw_Ohn_Hla
Processing article 1213/7155: Nay_Myo_Wai
Processing article 1214/7155: Nay_Zin_Lat
Processing article 1215/7155: Kyaw_Ni
Processing article 1216/7155: Nyi_Sein
Processing article 1217/7155: Nyo_Nyo_Thin
Processing article 1218/7155: Pu_Pa_Thang
Processing article 1219/7155: San_Tun
Processing article 1220/7155: Sao_Sanda
Processing article 1221/7155: Sasa_(politician)
Processing article 1222/7155: Saw_Sa
Processing article 1223/7155: Saya_Gyi_U_Nu
Processing article 1224/7155: Sein_Win_(politician,_born_1944)
Processing article 1225/7155: Mya_Lay_Sein
Processing article 1226/7155: Sao_Seng_Suk
Processing article 1227/7155: Shwe_Ohn
Processing article 1228/7155: Soe_Moe_Hlaing
Processing article 1229/7155: Sultan_Ahmed_(Burmese_politician)

Processing article 1386/7155: Baba_Laddé
Processing article 1387/7155: Nadji_Madou
Processing article 1388/7155: Mahmoud_Ali_Seid
Processing article 1389/7155: Mariam_Mahamat_Nour
Processing article 1390/7155: Laoukein_Kourayo_Médard
Processing article 1391/7155: Abbo_Nassour
Processing article 1392/7155: Tahir_Hamid_Nguilin
Processing article 1393/7155: Quatre_Sou_Quatre
Processing article 1394/7155: François_Tombalbaye
Processing article 1395/7155: Saadie_Goukouni_Weddeye
Processing article 1396/7155: Alejandro_Serani_Burgos
Processing article 1397/7155: Beltrán_Mathieu
Processing article 1398/7155: Alexandra_Benado
Processing article 1399/7155: Rolando_Calderón
Processing article 1400/7155: Eugenio_Cantuarias
Processing article 1401/7155: Carlos_Balmaceda_Saavedra
Processing article 1402/7155: Jorge_Cauas
Processing article 1403/7155: Alberto_Cooper
Processing article 1404/7155: Fernando_Cordero_Rusque
Processing article 1405/7155: Juan_de_Dios_Correa_de_Saa

Processing article 1549/7155: Mouzawar_Abdallah
Processing article 1550/7155: Moustadroine_Abdou
Processing article 1551/7155: Ibrahim_Aboubacar
Processing article 1552/7155: Saïd_Ibrahim_Ben_Ali
Processing article 1553/7155: Mohamed_Bacar
Processing article 1554/7155: Chamina_Ben_Mohamed
Processing article 1555/7155: Saïd_Mohamed_Cheikh
Processing article 1556/7155: Mohamed_Dahalani
Processing article 1557/7155: Dhoihir_Dhoulkamal
Processing article 1558/7155: Ahmed_Ben_Said_Djaffar
Processing article 1559/7155: Abdou_Soulé_Elbak
Processing article 1560/7155: Gaston_Feuillard
Processing article 1561/7155: Siti_Kassim
Processing article 1562/7155: Said_Ali_Kemal
Processing article 1563/7155: Fouad_Mohadji
Processing article 1564/7155: Dawiat_Mohamed
Processing article 1565/7155: Sittou_Raghadat_Mohamed
Processing article 1566/7155: Idi_Nadhoim
Processing article 1567/7155: Fahmi_Said_Ibrahim
Processing article 1568/7155: Djaffar_Ahmed_Said

Processing article 1711/7155: Siarke
Processing article 1712/7155: Tomás_Enrique_Soley_Soler
Processing article 1713/7155: José_María_Soto_Alfaro
Processing article 1714/7155: Nazario_Toledo
Processing article 1715/7155: Marco_Vinicio_Vargas_Pereira
Processing article 1716/7155: Santos_Velázquez_y_Tinoco
Processing article 1717/7155: William_Forbes_(Talamancan_king)
Processing article 1718/7155: Ivan_Anušić
Processing article 1719/7155: Mato_Arlović
Processing article 1720/7155: Branko_Bačić
Processing article 1721/7155: Fran_Barac
Processing article 1722/7155: Stephen_Nikola_Bartulica
Processing article 1723/7155: Živko_Bertić
Processing article 1724/7155: Ante_Biankini
Processing article 1725/7155: Juraj_Biankini
Processing article 1726/7155: Antun_Bonifačić
Processing article 1727/7155: Andrija_Torkvat_Brlić
Processing article 1728/7155: Josip_Broz_Tito
Processing article 1729/7155: Gajo_Bulat_(politician,_born_1867)
Processing article 1730/7155: Nadežda_Čačinovič

Processing article 1880/7155: Miroslav_Lidinský
Processing article 1881/7155: Jiří_Lobkowicz
Processing article 1882/7155: Jiří_Löw
Processing article 1883/7155: Ivan_Mašek
Processing article 1884/7155: Mikuláš_of_Hus
Processing article 1885/7155: Zdeněk_Mraček
Processing article 1886/7155: Jana_Nečasová
Processing article 1887/7155: Joseph_Nekl
Processing article 1888/7155: Jiří_Oberfalzer
Processing article 1889/7155: Roman_Onderka
Processing article 1890/7155: Vladimír_Oplt
Processing article 1891/7155: Mojmír_Povolný
Processing article 1892/7155: Vladimír_Príkazský
Processing article 1893/7155: Ondřej_Přikryl
Processing article 1894/7155: Jozef_Regec
Processing article 1895/7155: František_Reichel
Processing article 1896/7155: Bedřich_Reicin
Processing article 1897/7155: Richard_Sacher
Processing article 1898/7155: Tomáš_Eduard_Šilinger
Processing article 1899/7155: Karel_Sladkovský
Processing article 1900/7155: Karel_Štogl
Processing article 1901/7155: Matěj_Stropnický

Processing article 2048/7155: Mohammed_bin_Khalifa_Al_Maktoum
Processing article 2049/7155: Saeed_bin_Maktoum_bin_Hasher_Al_Maktoum
Processing article 2050/7155: Hawaa_Al_Mansoori
Processing article 2051/7155: Mohammed_bin_Saud_Al_Qasimi
Processing article 2052/7155: Hamed_bin_Zayed_Al_Nahyan
Processing article 2053/7155: Khaled_bin_Mohamed_Al_Nahyan
Processing article 2054/7155: Mohammed_bin_Khalifa_bin_Zayed_Al_Nahyan
Processing article 2055/7155: Mohamed_bin_Zayed_Al_Nahyan
Processing article 2056/7155: Saeed_bin_Zayed_Al_Nahyan
Processing article 2057/7155: Saqr_bin_Zayed_Al_Nahyan
Processing article 2058/7155: Shakhbut_bin_Sultan_Al_Nahyan
Processing article 2059/7155: Mohammed_Hussein_Al_Shaali
Processing article 2060/7155: Abdulaziz_Nasser_Al_Shamsi
Processing article 2061/7155: Ahmad_Al_Tayer
Processing article 2062/7155: Vice_President_of_the_United_Arab_Emirates
Processing article 2063/7155: Felipe_Hinestrosa_Ikaka
Processing article 2064/7155: Manuel_Osa_Nsue_Nsua

Processing article 2217/7155: Agénor_de_Gasparin
Processing article 2218/7155: Joseph_Matthias_Gérard_de_Rayneval
Processing article 2219/7155: Charles_Le_Bègue_de_Germiny
Processing article 2220/7155: Roland_Giberti
Processing article 2221/7155: Brigitte_Girardin
Processing article 2222/7155: Madeleine_de_Grandmaison
Processing article 2223/7155: Jean-Henry-Louis_Greffulhe
Processing article 2224/7155: Alain_Griset
Processing article 2225/7155: Odette_Grzegrzulka
Processing article 2226/7155: Guillaume_IV_de_Melun
Processing article 2227/7155: Joseph-Ignace_Guillotin
Processing article 2228/7155: Joseph_Guinard
Processing article 2229/7155: Maurice_d'Hartoy
Processing article 2230/7155: Guy_Hascoët
Processing article 2231/7155: Paul_Hay_du_Chastelet
Processing article 2232/7155: Georges_Humann
Processing article 2233/7155: Philippe_Hurault_de_Cheverny
Processing article 2234/7155: Jean_Louvet_(politician)
Processing article 2235/7155: Étienne_de_La_Grange

Processing article 2380/7155: Joseph_Musiol
Processing article 2381/7155: Theodor_Olshausen
Processing article 2382/7155: Dietrich_Heinrich_Ludwig_von_Ompteda
Processing article 2383/7155: Max_von_Oppenheim
Processing article 2384/7155: Christian_Theodor_Overbeck
Processing article 2385/7155: Hans_Paasche
Processing article 2386/7155: Walter_von_Saint_Paul-Illaire
Processing article 2387/7155: Friedrich-Carl_Peus
Processing article 2388/7155: H._Busso_Peus
Processing article 2389/7155: Franz_Seraph_von_Pfistermeister
Processing article 2390/7155: Adolf_Pilar_von_Pilchau
Processing article 2391/7155: Gunter_Pleuger
Processing article 2392/7155: Johan_Rantzau
Processing article 2393/7155: Prince_Richard_of_Hesse
Processing article 2394/7155: Nela_Riehl
Processing article 2395/7155: Moritz_Rittinghausen
Processing article 2396/7155: Hermann_Schäfer
Processing article 2397/7155: Mike_Schubert
Processing article 2398/7155: Eduard_von_Schele_zu_Schelenburg

Processing article 2545/7155: Ismaël_Touré
Processing article 2546/7155: Antony_Beaujon
Processing article 2547/7155: Stephen_Campbell
Processing article 2548/7155: John_Carter_(ambassador)
Processing article 2549/7155: Nigel_Dharamlall
Processing article 2550/7155: Roy_Fredericks
Processing article 2551/7155: Rahman_Baccus_Gajraj
Processing article 2552/7155: Winifred_Gaskin
Processing article 2553/7155: Joseph_Harmon
Processing article 2554/7155: Rashleigh_Jackson
Processing article 2555/7155: Manzoor_Nadir
Processing article 2556/7155: Ubraj_Narine
Processing article 2557/7155: Robert_Persaud
Processing article 2558/7155: Jane_Phillips-Gay
Processing article 2559/7155: Barton_Scotland
Processing article 2560/7155: C._N._Sharma
Processing article 2561/7155: A._R._F._Webber
Processing article 2562/7155: Robert_Victor_Evan_Wong
Processing article 2563/7155: President_of_Haiti
Processing article 2564/7155: Georges_Anglade
Processing article 2565/7155: Paul_Arcelin

Processing article 2718/7155: Bommidi_Narayana_Nayakar
Processing article 2719/7155: Sunil_Bose
Processing article 2720/7155: Satpal_Brahamchari
Processing article 2721/7155: Chadalavada_Aravinda_Babu
Processing article 2722/7155: Chamala_Kiran_Kumar_Reddy
Processing article 2723/7155: Chirri_Balaraju
Processing article 2724/7155: Lumbaram_Choudhary
Processing article 2725/7155: Shambhavi_Choudhary
Processing article 2726/7155: Malaiyarasan_D
Processing article 2727/7155: Anup_Dhotre
Processing article 2728/7155: Bachhav_Shobha_Dinesh
Processing article 2729/7155: Dineshbhai_Makwana
Processing article 2730/7155: Yanamala_Divya
Processing article 2731/7155: Gajraj_Bahadur_Nagar
Processing article 2732/7155: P._Geetha_Jeevan
Processing article 2733/7155: Giddi_Satyanarayana
Processing article 2734/7155: Mamidi_Govinda_Rao
Processing article 2735/7155: Nalini_Gupta
Processing article 2736/7155: Hansraj_Meena
Processing article 2737/7155: Inturi_Nageswara_Rao
Processing article 2738/7155: 

Processing article 2883/7155: Nurdin_Halid
Processing article 2884/7155: Syarif_Hamid_II_of_Pontianak
Processing article 2885/7155: Syarwan_Hamid
Processing article 2886/7155: Hanum_Salsabiela_Rais
Processing article 2887/7155: Mulfachri_Harahap
Processing article 2888/7155: Hasan_Basry
Processing article 2889/7155: Albert_Hasibuan
Processing article 2890/7155: Anang_Hermansyah
Processing article 2891/7155: Romi_Herton
Processing article 2892/7155: Hilman_Djajadiningrat
Processing article 2893/7155: Helmud_Hontong
Processing article 2894/7155: Ida_Bagus_Putra_Manuaba
Processing article 2895/7155: Rasimah_Ismail
Processing article 2896/7155: Jafar_Nainggolan
Processing article 2897/7155: Jihan_Nurlela
Processing article 2898/7155: Herman_Johannes
Processing article 2899/7155: John_Djopari
Processing article 2900/7155: S._Kabo
Processing article 2901/7155: Zulkarnain_Karim
Processing article 2902/7155: Bambang_Kesowo
Processing article 2903/7155: Ade_Komarudin
Processing article 2904/715

Processing article 3054/7155: Sassoon_Eskell
Processing article 3055/7155: Qasim_Al-Fahadawi
Processing article 3056/7155: Fakhri_al-Tabaqchali
Processing article 3057/7155: Falih_Al-Fayyadh
Processing article 3058/7155: Mohammed_Al-Ghabban
Processing article 3059/7155: Saadoun_Ghaidan
Processing article 3060/7155: Mohammed_Hadid
Processing article 3061/7155: Rustam_Haidar
Processing article 3062/7155: Mohamed_Al-Halbousi
Processing article 3063/7155: Ahmad_Hardi
Processing article 3064/7155: Hasan_Turan_(Iraqi_politician)
Processing article 3065/7155: 'Abd_al-Razzaq_al-Hasani
Processing article 3066/7155: Khairuddin_Haseeb
Processing article 3067/7155: Mohammad_Iqbal_Omar
Processing article 3068/7155: Jaafar_al-Sadr
Processing article 3069/7155: Salim_al-Jabouri
Processing article 3070/7155: Dhia_Jafar
Processing article 3071/7155: Iyad_Jamal_Al-Din
Processing article 3072/7155: Jamal_Baban
Processing article 3073/7155: Jamil_Abdul_Wahab
Processing article 3074/7155: Amin_Farhan_Jejo


Processing article 3225/7155: Pietro_Paleocapa
Processing article 3226/7155: Bartolomeo_Panciatichi
Processing article 3227/7155: Maffeo_Pantaleoni
Processing article 3228/7155: Renzo_Patria
Processing article 3229/7155: Renato_de'_Pazzi
Processing article 3230/7155: Bartolo_Pellegrino
Processing article 3231/7155: Giuseppe_Perrucchetti
Processing article 3232/7155: Pandolfo_Petrucci
Processing article 3233/7155: Vittorio_Pezzuto
Processing article 3234/7155: Augusto_Pierantoni
Processing article 3235/7155: Giancarlo_Pittelli
Processing article 3236/7155: Luigi_Pizzardi
Processing article 3237/7155: Gennaro_Placco
Processing article 3238/7155: Stefano_Porcari
Processing article 3239/7155: Giovanni_Puoti
Processing article 3240/7155: Renzo_Rabellino
Processing article 3241/7155: Ercole_Ricotti
Processing article 3242/7155: Roberto_di_Ridolfo
Processing article 3243/7155: Italo_Righi
Processing article 3244/7155: Arnaldo_Rivera
Processing article 3245/7155: Carlo_Romussi
Processing artic

Processing article 3401/7155: Yoshitoki_Sugitani
Processing article 3402/7155: Yuasa_Kurahei
Processing article 3403/7155: Abu_Sayyaf_(Jordan)
Processing article 3404/7155: Raed_Abu_Soud
Processing article 3405/7155: Musa_Habes_Almaaytah
Processing article 3406/7155: Abdul-Latif_Arabiyat
Processing article 3407/7155: Hayel_Daoud
Processing article 3408/7155: Mohammad_Daoudiyeh
Processing article 3409/7155: Haditha_Al-Khraisha
Processing article 3410/7155: Jumah_Hammad
Processing article 3411/7155: Ahmad_Hanandeh
Processing article 3412/7155: Mahmoud_Hanandeh
Processing article 3413/7155: Hani_Khasawneh
Processing article 3414/7155: Fakhri_Kawar
Processing article 3415/7155: Khaled_Kalaldeh
Processing article 3416/7155: Saleh_Ali_Al-Kharabsheh
Processing article 3417/7155: Hisham_Khatib
Processing article 3418/7155: Haditha_Jamal_Haditha_Al-Khreisha
Processing article 3419/7155: Rashed_Al-Khuzai
Processing article 3420/7155: Yahya_Kisbi
Processing article 3421/7155: Kamel_Mahadin
Proces

Processing article 3577/7155: Gonzi_Rai
Processing article 3578/7155: Ali_Roba
Processing article 3579/7155: Wesley_K._Rono
Processing article 3580/7155: Ruweida_Obo
Processing article 3581/7155: Ibrahim_Sane
Processing article 3582/7155: Ibrahim_A._Saney
Processing article 3583/7155: Kimaiyo_Sego
Processing article 3584/7155: Cornelly_Serem
Processing article 3585/7155: Peter_Safari_Shehe
Processing article 3586/7155: Mohamed_Shidiye
Processing article 3587/7155: Elias_Shill
Processing article 3588/7155: Lawrence_Sifuna
Processing article 3589/7155: Elijah_K._Sumbeiywo
Processing article 3590/7155: Quincy_Timberlake
Processing article 3591/7155: Lilian_Tomitom
Processing article 3592/7155: Tecla_Tum
Processing article 3593/7155: Badi_Twalib
Processing article 3594/7155: Benedict_Wachira
Processing article 3595/7155: Machel_Waikenda
Processing article 3596/7155: Nzioka_Waita
Processing article 3597/7155: George_Wajackoyah
Processing article 3598/7155: Ali_Wario
Processing article 3599/

Processing article 3759/7155: Naoum_Labaki
Processing article 3760/7155: Chafic_Nassif
Processing article 3761/7155: Mohamad_Osseiran
Processing article 3762/7155: Vahan_Papazian
Processing article 3763/7155: Philippe_El_Khazen
Processing article 3764/7155: Pierre_Bou_Assi
Processing article 3765/7155: Renewal_Bloc
Processing article 3766/7155: Edmond_Rizk
Processing article 3767/7155: Fady_Saad
Processing article 3768/7155: Osama_Saad
Processing article 3769/7155: Waddah_Sadek
Processing article 3770/7155: Alice_Shabtini
Processing article 3771/7155: Abu_Youssef_Sharqieh
Processing article 3772/7155: Joseph_Skaff
Processing article 3773/7155: Myriam_Skaff
Processing article 3774/7155: Maurice_Sleem
Processing article 3775/7155: Mounira_Solh
Processing article 3776/7155: Fares_Souaid
Processing article 3777/7155: Strong_Lebanon
Processing article 3778/7155: Imad_Wakim
Processing article 3779/7155: Amal_Abou_Zeid
Processing article 3780/7155: Camille_Ziade
Processing article 3781/7155: 

Processing article 3934/7155: Risto_Gjorgjiev
Processing article 3935/7155: Vasil_Ivanovski
Processing article 3936/7155: Stevčo_Jakimovski
Processing article 3937/7155: Adnan_Jashari
Processing article 3938/7155: Minčo_Jordanov
Processing article 3939/7155: Perko_Kolevski
Processing article 3940/7155: Dimitar_Kovačevski
Processing article 3941/7155: Venko_Markovski
Processing article 3942/7155: Hristijan_Mickoski
Processing article 3943/7155: Sašo_Mijalkov
Processing article 3944/7155: Goran_Mitevski
Processing article 3945/7155: Stanko_Mladenovski
Processing article 3946/7155: Ferdinand_Odžakov
Processing article 3947/7155: Xhezair_Shaqiri
Processing article 3948/7155: Vančo_Šontevski
Processing article 3949/7155: Nikola_Spasovski
Processing article 3950/7155: Ivan_Stoilković
Processing article 3951/7155: Dobri_Veličkovski
Processing article 3952/7155: Zoran_Veruševski
Processing article 3953/7155: Mile_Zečević
Processing article 3954/7155: Augustin_Andriamananoro
Processing article 

Processing article 4110/7155: Siby_Ginette_Bellegarde
Processing article 4111/7155: Cheick_Bougadary_Traoré
Processing article 4112/7155: Dramane_Dembélé
Processing article 4113/7155: Hamadoun_Dicko
Processing article 4114/7155: Mamadou_Djigué
Processing article 4115/7155: Amadou_Doucoure
Processing article 4116/7155: Ly_Taher_Dravé
Processing article 4117/7155: Mahamane_Haidara
Processing article 4118/7155: Jamille_Bittar
Processing article 4119/7155: Modibo_Keïta
Processing article 4120/7155: Mamadou_Konaté
Processing article 4121/7155: Balla_Koné
Processing article 4122/7155: Garan_Fabou_Kouyate
Processing article 4123/7155: Mamadou_M'Bodje
Processing article 4124/7155: Mohamed_Ag_Intalla
Processing article 4125/7155: Zeïni_Moulaye
Processing article 4126/7155: N'Diaye_Ramatoulaye_Diallo
Processing article 4127/7155: Yeah_Samake
Processing article 4128/7155: Foutanga_Babani_Sissoko
Processing article 4129/7155: Ousmane_Sy
Processing article 4130/7155: Tiéman_Coulibaly
Processing art

Processing article 4283/7155: Predrag_Bulatović
Processing article 4284/7155: Miodrag_Davidović
Processing article 4285/7155: Vladimir_Dobričanin
Processing article 4286/7155: Dobrilo_Dedeić
Processing article 4287/7155: Sekula_Drljević
Processing article 4288/7155: Fatmir_Gjeka
Processing article 4289/7155: Milutin_Jelić
Processing article 4290/7155: Batrić_Jovanović
Processing article 4291/7155: Jovan_Kavarić
Processing article 4292/7155: Zdravko_Krivokapić
Processing article 4293/7155: Vujica_Lazović
Processing article 4294/7155: Vladimir_Leposavić
Processing article 4295/7155: Stjepan_Mitrov_Ljubiša
Processing article 4296/7155: Stevan_Lukačević
Processing article 4297/7155: Branko_Lukovac
Processing article 4298/7155: Savić_Marković_Štedimlija
Processing article 4299/7155: Tarzan_Milošević
Processing article 4300/7155: Iko_Mirković
Processing article 4301/7155: Fuad_Nimani
Processing article 4302/7155: Darko_Pajović
Processing article 4303/7155: Vladimir_Pavićević
Processing artic

Processing article 4453/7155: Bam_Shah
Processing article 4454/7155: Bishnu_Pratap_Shah
Processing article 4455/7155: Santoshi_Shahi
Processing article 4456/7155: Shukraraj_Shastri
Processing article 4457/7155: Sonam_Gyalchhen_Sherpa
Processing article 4458/7155: Sanu_Siva
Processing article 4459/7155: Bhuwan_Bahadur_Sunar
Processing article 4460/7155: Surya_Raj_Acharya
Processing article 4461/7155: Thapa_dynasty
Processing article 4462/7155: Ranabir_Singh_Thapa
Processing article 4463/7155: Goma_Devi_Timilsina
Processing article 4464/7155: Yagya_Raj_Joshi
Processing article 4465/7155: Dharma_Ratna_Yami
Processing article 4466/7155: Guillermo_Arce_Castaño
Processing article 4467/7155: Oscar_Danilo_Blandón
Processing article 4468/7155: José_Francisco_Cardenal
Processing article 4469/7155: Pedro_Joaquín_Chamorro_Barrios
Processing article 4470/7155: Juan_Sebastián_Chamorro
Processing article 4471/7155: Edipcia_Dubón
Processing article 4472/7155: Juan_Espinosa_(politician)
Processing arti

Processing article 4625/7155: Sebastian_Okechukwu_Mezu
Processing article 4626/7155: Zubby_Michael
Processing article 4627/7155: Mohammed_Sani_Idriss
Processing article 4628/7155: Danladi_Mohammed
Processing article 4629/7155: Janet_Nwadiogo_Mokelu
Processing article 4630/7155: Fati_Muhammad
Processing article 4631/7155: Sunusi_Musa
Processing article 4632/7155: Saliu_Mustapha
Processing article 4633/7155: Umar_Namadi
Processing article 4634/7155: Muhammad_Mamman_Nami
Processing article 4635/7155: Ajuri_Ngelale
Processing article 4636/7155: Suleiman_Oba_Nimota
Processing article 4637/7155: Uchechukwu_Nnam-Obi
Processing article 4638/7155: Frank_Nneji
Processing article 4639/7155: Justin_Nnorom
Processing article 4640/7155: Blessing_Nwagba
Processing article 4641/7155: Dozie_Nwankwo
Processing article 4642/7155: Nnia_Nwodo
Processing article 4643/7155: Oba_C._D._Akran
Processing article 4644/7155: Patrick_Obahiagbon
Processing article 4645/7155: Gaius_Obaseki
Processing article 4646/715

Processing article 4796/7155: Mian_Muhibullah_Kakakhel
Processing article 4797/7155: Niat_Qabool_Hayat_Kakakhel
Processing article 4798/7155: Syed_Kamal
Processing article 4799/7155: Kazi_Abdul_Kader
Processing article 4800/7155: Amir_Habibullah_Khan_Saadi
Processing article 4801/7155: Khan_Muhammad_Khan
Processing article 4802/7155: Malik_Allahyar_Khan
Processing article 4803/7155: Omar_Asghar_Khan
Processing article 4804/7155: Rana_Khudadad_Khan
Processing article 4805/7155: Rao_Muhammad_Afzal_Khan
Processing article 4806/7155: Mohammad_Aslam_Khan_(Pakistan_Peoples_Party_politician)
Processing article 4807/7155: Tufail_Ahmad_Khan
Processing article 4808/7155: Wajih-uz-Zaman_Khan
Processing article 4809/7155: Shahal_Khan_Khoso
Processing article 4810/7155: Mir_Dariya_Khan_Khoso
Processing article 4811/7155: Naseer_Khan_Khoso
Processing article 4812/7155: Zahoor_Hussain_Khoso
Processing article 4813/7155: Mumtaz_Hasan_Kizilbash
Processing article 4814/7155: Abdul_Aziz_Kurd
Processing a

Processing article 4967/7155: José_Félix_Fernández_Estigarribia
Processing article 4968/7155: Amilcar_Ferreira
Processing article 4969/7155: Mario_Ferreiro
Processing article 4970/7155: Lalo_Gomes
Processing article 4971/7155: Domingo_Gribeo
Processing article 4972/7155: Walter_Harms
Processing article 4973/7155: Ramón_Jiménez_Gaona
Processing article 4974/7155: Fidel_Maíz
Processing article 4975/7155: Óscar_Rodríguez_(Paraguayan_politician)
Processing article 4976/7155: Silvio_Ovelar
Processing article 4977/7155: Vicente_Rodríguez_(politician)
Processing article 4978/7155: Ramón_Romero_Roa
Processing article 4979/7155: Oscar_Salomón_(politician)
Processing article 4980/7155: Marcial_Samaniego
Processing article 4981/7155: Marcos_Zeida
Processing article 4982/7155: Nelson_Chui
Processing article 4983/7155: Fausto_Alvarado
Processing article 4984/7155: Roger_Amuruz
Processing article 4985/7155: Pedro_Angulo_Arana
Processing article 4986/7155: Carlos_Arana
Processing article 4987/7155: E

Processing article 5136/7155: Tadeusz_Mostowski
Processing article 5137/7155: Marian_Moszoro
Processing article 5138/7155: Kazimierz_Narutowicz
Processing article 5139/7155: Andrzej_Niegolewski
Processing article 5140/7155: Szymon_Niemiec
Processing article 5141/7155: Bonawentura_Niemojowski
Processing article 5142/7155: Wincenty_Niemojowski
Processing article 5143/7155: Jerzy_Nos
Processing article 5144/7155: Mieczysław_Nowicki
Processing article 5145/7155: Juliusz_Nowina-Sokolnicki
Processing article 5146/7155: Wacław_Olszak
Processing article 5147/7155: Antoni_Jan_Ostrowski
Processing article 5148/7155: Henryk_Ostrowski
Processing article 5149/7155: Rudolf_Paszek
Processing article 5150/7155: Grzegorz_Piechowiak
Processing article 5151/7155: Kazimierz_Pietkiewicz
Processing article 5152/7155: Leon_Piniński
Processing article 5153/7155: Edward_Pomorski
Processing article 5154/7155: Dominik_Potocki
Processing article 5155/7155: Ignacy_Potocki
Processing article 5156/7155: Gustaw_Przec

Processing article 5299/7155: Aleksandr_Aleksandrovich_Bublikov
Processing article 5300/7155: Salau_Aliyev
Processing article 5301/7155: Viktor_Anpilov
Processing article 5302/7155: Pavel_Astakhov
Processing article 5303/7155: Marat_Baglai
Processing article 5304/7155: Mikhail_Barsukov
Processing article 5305/7155: Mikhail_Batin
Processing article 5306/7155: Odes_Baysultanov
Processing article 5307/7155: Alexander_Bekovich-Cherkassky
Processing article 5308/7155: Oleg_Belozyorov
Processing article 5309/7155: Bogdan_Belsky
Processing article 5310/7155: Andrey_Belyaninov
Processing article 5311/7155: Ivan_Besedin
Processing article 5312/7155: Nikolay_Bogachyov
Processing article 5313/7155: Pavel_Borodin
Processing article 5314/7155: Alexander_Bortnikov
Processing article 5315/7155: Anton_Budilovich
Processing article 5316/7155: Anatoly_Bykov
Processing article 5317/7155: Sergey_Chebotaryov
Processing article 5318/7155: Sergey_Chemezov
Processing article 5319/7155: Vladimir_Chernukhin
Pro

Processing article 5465/7155: Juan_Vicente_Villacorta
Processing article 5466/7155: César_Yanes_Urías
Processing article 5467/7155: Xavier_Zablah_Bukele
Processing article 5468/7155: Sua_Rimoni_Ah_Chong
Processing article 5469/7155: Le_Mamea_Matatumua_Ata
Processing article 5470/7155: William_Coe_(governor)
Processing article 5471/7155: Va'aelua_Eti_Alesana
Processing article 5472/7155: Lauaki_Namulauulu_Mamoe
Processing article 5473/7155: Tui_Manuʻa_Elisala
Processing article 5474/7155: Toleʻafoa_Solomona_Toʻailoa
Processing article 5475/7155: Tamaseu_Leni_Warren
Processing article 5476/7155: Abdullah_bin_Faisal_Al_Saud_(1831–1889)
Processing article 5477/7155: Abdullatif_bin_Abdulaziz_Al-Sheikh
Processing article 5478/7155: Dahham_ibn_Dawwas
Processing article 5479/7155: Faisal_bin_Turki_Al_Saud_(1785–1865)
Processing article 5480/7155: Khalid_bin_Saud_Al_Saud_(1811–1865)
Processing article 5481/7155: Armand-Pierre_Angrand
Processing article 5482/7155: Léopold_Angrand
Processing arti

Processing article 5635/7155: Andrea_Kalavská
Processing article 5636/7155: Ján_Kollár
Processing article 5637/7155: Juraj_Košút
Processing article 5638/7155: Magda_Košútová
Processing article 5639/7155: Ján_Krošlák
Processing article 5640/7155: Ivan_Lesay
Processing article 5641/7155: Ivan_Lexa
Processing article 5642/7155: Ján_Lunter
Processing article 5643/7155: József_Nagy_(politician)
Processing article 5644/7155: Viliam_Novotný
Processing article 5645/7155: Štefan_Osuský
Processing article 5646/7155: Gabriel_Palacka
Processing article 5647/7155: Lucia_Plaváková
Processing article 5648/7155: Jozef_Pribilinec
Processing article 5649/7155: Ján_Richter
Processing article 5650/7155: Tatiana_Rosová
Processing article 5651/7155: Jozef_Sivák
Processing article 5652/7155: Radovan_Sloboda_(politician)
Processing article 5653/7155: Ľudovít_Štúr
Processing article 5654/7155: Jaroslav_Svěchota
Processing article 5655/7155: Ľubomír_Vážny
Processing article 5656/7155: 8th_National_Assembly_of_S

Processing article 5804/7155: Nafisat_Yusuf_Mohammed
Processing article 5805/7155: Ismail_Haji_Nour
Processing article 5806/7155: Osman_Saleban_Jama
Processing article 5807/7155: Jibrell_Ali_Salad
Processing article 5808/7155: Faysal_Ali_Warabe
Processing article 5809/7155: Dulton_Adams
Processing article 5810/7155: Bongani_Baloyi
Processing article 5811/7155: Japie_Basson
Processing article 5812/7155: Christiaan_Frederik_Beyers
Processing article 5813/7155: Bongani_Bongo
Processing article 5814/7155: Shaun_Byneveldt
Processing article 5815/7155: Marshall_Campbell
Processing article 5816/7155: Badih_Chaaban
Processing article 5817/7155: Geoffrey_Cronjé
Processing article 5818/7155: Rowan_Cronjé
Processing article 5819/7155: David_Dichaba
Processing article 5820/7155: Cathy_Dlamini
Processing article 5821/7155: Nhlanhla_Lux
Processing article 5822/7155: Stephanus_Jacobus_du_Toit
Processing article 5823/7155: Elijah_Mdolomba
Processing article 5824/7155: Jennifer_Ferguson
Processing arti

Processing article 5980/7155: Augustino_Kiri_Gwolo
Processing article 5981/7155: Paul_Malong_Awan
Processing article 5982/7155: Joseph_Bakosoro
Processing article 5983/7155: Tito_Biel
Processing article 5984/7155: Joseph_Bol_Chan
Processing article 5985/7155: Abraham_Makoi_Bol
Processing article 5986/7155: Bona_Malwal
Processing article 5987/7155: Peter_Cirillo
Processing article 5988/7155: Cirino_Hiteng_Ofuho
Processing article 5989/7155: Deng_Deng_Akuei
Processing article 5990/7155: William_Deng_Nhial
Processing article 5991/7155: Aldo_Deng
Processing article 5992/7155: Ronald_Ruai_Deng
Processing article 5993/7155: Teker_Riek_Dong
Processing article 5994/7155: John_Malish_Dujuk
Processing article 5995/7155: George_Echom
Processing article 5996/7155: Jacob_Kuwinsuk_Gale
Processing article 5997/7155: Naphtali_Hassan_Gale
Processing article 5998/7155: Joseph_Garang
Processing article 5999/7155: Isaac_Cleto_Hassan
Processing article 6000/7155: Sarah_Cleto_Hassan
Processing article 6001/

Processing article 6137/7155: Pere_Navarro_Olivella
Processing article 6138/7155: Trinitat_Neras_i_Plaja
Processing article 6139/7155: Juan_Everardo_Nithard
Processing article 6140/7155: Pablo_de_Olavide
Processing article 6141/7155: Álvaro_d'Ors_Pérez-Peix
Processing article 6142/7155: Román_Oyarzun_Oyarzun
Processing article 6143/7155: Juan_Pacheco
Processing article 6144/7155: Juan_Palarea_Blanes
Processing article 6145/7155: Berenguer_de_Palou_II
Processing article 6146/7155: Javier_María_Pascual_Ibañez
Processing article 6147/7155: Pedro_Antonio_de_Aragón
Processing article 6148/7155: Juan_Manuel_de_la_Peña_Bonifaz
Processing article 6149/7155: Antonio_Pérez_Crespo
Processing article 6150/7155: Antonio_Pérez_(statesman)
Processing article 6151/7155: Manuel_Pizarro_Moreno
Processing article 6152/7155: Luis_Gabriel_Portillo
Processing article 6153/7155: Pablo_de_Porturas_y_Landázuri
Processing article 6154/7155: Jesús_Quijano
Processing article 6155/7155: Domingo_Ram_y_Lanaja
Proces

Processing article 6301/7155: Abraham_Brodersson
Processing article 6302/7155: Leila_Ali_Elmi
Processing article 6303/7155: Bengt_Snivil
Processing article 6304/7155: Rudolf_Fredrik_Berg
Processing article 6305/7155: Birger_Jarl
Processing article 6306/7155: Johan_Ludvig_Boye
Processing article 6307/7155: Birger_Brosa
Processing article 6308/7155: Axel_Brusewitz
Processing article 6309/7155: Nils_Claëson
Processing article 6310/7155: Petrus_Olai_Dalekarlus
Processing article 6311/7155: Christoffer_Dulny
Processing article 6312/7155: Klas_Eklund
Processing article 6313/7155: Martin_Ekström
Processing article 6314/7155: Nils_Elowsson
Processing article 6315/7155: Folke_the_Fat
Processing article 6316/7155: Hans_Reinhold_von_Fersen
Processing article 6317/7155: Hemming_Gadh
Processing article 6318/7155: Gösta_Hallberg-Cuula
Processing article 6319/7155: Evelina_Hahne
Processing article 6320/7155: Greger_Helin
Processing article 6321/7155: Anders_Henriksson_(politician)
Processing article 

Processing article 6475/7155: Jihad_Makdissi
Processing article 6476/7155: Mohammad_Jumah
Processing article 6477/7155: Sami_al-Jundi
Processing article 6478/7155: Kamal_al-Qassab
Processing article 6479/7155: Kamil_Pasha_al-Qudsi
Processing article 6480/7155: Abdul-Aziz_al-Khair
Processing article 6481/7155: Fayez_al-Khoury
Processing article 6482/7155: Mar'i_Pasha_al-Mallah
Processing article 6483/7155: Hrant_Maloyan
Processing article 6484/7155: Antun_Maqdisi
Processing article 6485/7155: Mahmoud_Ahmad_Marei
Processing article 6486/7155: Muhammad_Talab_Hilal
Processing article 6487/7155: Mustafa_al-Mousa
Processing article 6488/7155: Mustafa_al-Siba'i
Processing article 6489/7155: Shukri_al-Quwatli
Processing article 6490/7155: Nasim_al-Safarjalani
Processing article 6491/7155: Salem_al-Meslet
Processing article 6492/7155: Muhammad_al-Sufi
Processing article 6493/7155: Mohammad_Farouk_Tayfour
Processing article 6494/7155: Huang_Chaoqin
Processing article 6495/7155: Li_Huang
Processi

Processing article 6649/7155: Rached_Ghannouchi
Processing article 6650/7155: Sadok_Ghileb
Processing article 6651/7155: Nebiha_Gueddana
Processing article 6652/7155: Mezri_Haddad
Processing article 6653/7155: Radhia_Haddad
Processing article 6654/7155: Ali_Bach_Hamba
Processing article 6655/7155: Olfa_Hamdi
Processing article 6656/7155: Mohamed_Hechmi_Hamdi
Processing article 6657/7155: Kamel_Idir
Processing article 6658/7155: Chaima_Issa
Processing article 6659/7155: Kamel_Jendoubi
Processing article 6660/7155: Faten_Kallel
Processing article 6661/7155: Ibrahim_Kassas
Processing article 6662/7155: Othman_Kechrid
Processing article 6663/7155: Omezzine_Khelifa
Processing article 6664/7155: Khemaïs_Ksila
Processing article 6665/7155: Slaheddine_Maaoui
Processing article 6666/7155: Selma_Hédia_Mabrouk
Processing article 6667/7155: Zouhair_Maghzaoui
Processing article 6668/7155: Seifeddine_Makhlouf
Processing article 6669/7155: Mohamed_Masmoudi
Processing article 6670/7155: Mustapha_Masmo

Processing article 6827/7155: Ivan_Gel
Processing article 6828/7155: Moshe_Gutman
Processing article 6829/7155: Olena_Halushka
Processing article 6830/7155: Izet_Hdanov
Processing article 6831/7155: Artur_Herasymov
Processing article 6832/7155: Valeriy_Holovko
Processing article 6833/7155: Volodymyr_Hryshchenko
Processing article 6834/7155: Stepan_Ivakhiv
Processing article 6835/7155: Borys_Kachura
Processing article 6836/7155: Dmytro_Kashchuk
Processing article 6837/7155: Serhii_Khlan
Processing article 6838/7155: Borys_Klimchuk_(politician)
Processing article 6839/7155: Klitschko_brothers
Processing article 6840/7155: Mykola_Kolisnyk
Processing article 6841/7155: Mykhailo_Korolenko
Processing article 6842/7155: Ivan_Korshynskyi
Processing article 6843/7155: Roman_Kostenko
Processing article 6844/7155: Heorhiy_Kryuchkov
Processing article 6845/7155: Mykola_Kulinich
Processing article 6846/7155: Vitalii_Kurylo
Processing article 6847/7155: Anatolii_Kutsevol
Processing article 6848/7155

Processing article 6999/7155: Lewis_Pérez
Processing article 7000/7155: Arkiely_Perfecto
Processing article 7001/7155: Eduardo_Piñate
Processing article 7002/7155: Humberto_Prado
Processing article 7003/7155: Omar_Prieto
Processing article 7004/7155: Benjamín_Rausseo
Processing article 7005/7155: Manuel_Felipe_Rugeles
Processing article 7006/7155: Carlos_Santana_Tovar
Processing article 7007/7155: Milena_Sardi_de_Selle
Processing article 7008/7155: Carlos_Eduardo_Stolk
Processing article 7009/7155: Bolivia_Suárez
Processing article 7010/7155: Gustavo_Tarre
Processing article 7011/7155: José_Herrera_Uslar
Processing article 7012/7155: Addy_Valero
Processing article 7013/7155: José_Antonio_Velutini
Processing article 7014/7155: Henry_Ventura
Processing article 7015/7155: Nervis_Villalobos
Processing article 7016/7155: Bùi_Tiến_Dũng_(politician)
Processing article 7017/7155: Chu_Văn_An
Processing article 7018/7155: Nguyễn_Quốc_Định
Processing article 7019/7155: Đinh_Xuân_Quảng
Processing 

I've printed each article as it is processed for debugging purposes. Clear the cell output if you would rather not see it.

## 2. Data Processing

### Overview
This section combines the provided datasets (`population_by_country_AUG.2024.csv` and `politicians_by_country_AUG.2024.csv`) with the ORES page quality dataset (`ores_quality_scores.csv`) created in the step above.

Some entries could not be merged because a country appears in only one of the datasets; these are logged separately. The processing step produces the output files listed below.

### Outputs
- **`wp_countries-no_match.txt`**: Text file of all countries for which there are no matches, with each country on a separate line.
- **`wp_politicians_by_country.csv`**: The remaining data, with columns for the `country`, `region`, `population`, `article_title`, `revision_id`, and `article_quality` of each article about a given politician in the provided dataset.


In [145]:
#Step 3: Combining the Datasets
population_df = pd.read_csv('population_by_country_AUG.2024.csv')
ores_df

Unnamed: 0,page_title,revision_id,quality_score
0,Majah_Ha_Adrif,1233202991,Start
1,Haroon_al-Afghani,1230459615,B
2,Tayyab_Agha,1225661708,Start
3,Khadija_Zahra_Ahmadi,1234741562,Stub
4,Aziza_Ahmadyar,1195651393,Start
...,...,...,...
7141,Josiah_Tongogara,1203429435,C
7142,Langton_Towungana,1246280093,Stub
7143,Sengezo_Tshabangu,1228478288,Start
7144,Herbert_Ushewokunze,959111842,Stub


The next cells clean, process, and merge all of the given datasets.

**Key Steps:**
- Identify uppercase rows in `population_by_country_AUG.2024.csv` as regions
- Extract page title from URL
- Identify unmatched countries in two ways:
    - `left_only`: countries in `df` (the politicians data) that have no match in `population_by_country_AUG.2024.csv`
    - `right_only`: countries in `population_df` that have no match in `politicians_by_country_AUG.2024.csv`
- Save the final merged dataset to `wp_politicians_by_country.csv`.
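
The two unmatched categories can be illustrated on toy data. This is a minimal sketch, not the notebook's real data: the frames `politicians` and `population` and every row in them are made up for illustration. Only the `pd.merge` call with `how='outer'` and `indicator=True` mirrors the step described above.

```python
import pandas as pd

# Hypothetical toy inputs (not the real CSV contents)
politicians = pd.DataFrame({
    "name": ["A", "B", "C"],
    "country": ["Kenya", "Kenya", "Atlantis"],
})
population = pd.DataFrame({
    "Geography": ["Kenya", "Narnia"],
    "Population": [55.1, 1.0],  # illustrative values only
})

# An outer merge keeps every row from both sides; the `_merge` column
# records whether a row matched ('both') or came from one side only.
merged = pd.merge(politicians, population, left_on="country",
                  right_on="Geography", how="outer", indicator=True)

# Countries present only in the politicians data
left_only = merged.loc[merged["_merge"] == "left_only", "country"].unique().tolist()
# Countries present only in the population data
right_only = merged.loc[merged["_merge"] == "right_only", "Geography"].unique().tolist()
# Rows that can be carried forward into the final dataset
matched = merged[merged["_merge"] == "both"]
```

For `right_only` rows the `country` column is empty, so the name has to be read from `Geography`, which is why the two lists are pulled from different columns.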

In [164]:
population_df['region'] = None  # Add a region column to store the region

current_region = None
for i, row in population_df.iterrows():
    if row['Geography'].isupper():  # Identify uppercase rows as regions
        current_region = row['Geography']
    else:
        population_df.at[i, 'region'] = current_region  # Assign region to country rows

# Remove rows that are regions themselves (we only need countries)
population_df_clean = population_df[population_df['Geography'].apply(lambda x: not x.isupper())]

# Step 1: Extract page_title from URL
df['page_title'] = df['url'].apply(lambda x: x.split("/")[-1])  # Extract page_title from URL

# Step 2: Perform the first merge (df with population_df_clean)
merged_df_1 = pd.merge(df, population_df_clean, left_on='country', right_on='Geography', how='outer', indicator=True)

# Step 3: Identify unmatched countries
# 'left_only' indicates countries in df that have no match in population_df
# 'right_only' indicates countries in population_df that have no match in df

no_match_countries = merged_df_1[merged_df_1['_merge'] != 'both']

with open('wp_countries-no_match.txt', 'w') as f:
    # Log countries in df but not in population_df (left_only)
    left_only_countries = no_match_countries[no_match_countries['_merge'] == 'left_only']['country'].unique()
    f.write("Countries in df but not in population_df:\n")
    for country in left_only_countries:
        f.write(f"{country}\n")

    # Log countries in population_df but not in df (right_only)
    right_only_countries = no_match_countries[no_match_countries['_merge'] == 'right_only']['Geography'].unique()
    f.write("\nCountries in population_df but not in df:\n")
    for country in right_only_countries:
        f.write(f"{country}\n")

# Step 4: Filter out unmatched rows, keep only rows where there's a match
matched_df = merged_df_1[merged_df_1['_merge'] == 'both'].drop(columns=['_merge'])

# Step 5: Perform the second merge 
final_merged_df = pd.merge(matched_df, ores_df, on='page_title', how='left')

# Step 6: Save the final merged dataset 
final_merged_df.to_csv('wp_politicians_by_country.csv', index=False)
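
As an aside, the region-labelling loop at the top of this cell could also be written without `iterrows`. The sketch below is one possible alternative, assuming the same layout (uppercase region header rows followed by their countries); the rows and population values are hypothetical.

```python
import pandas as pd

# Hypothetical rows in the same layout: region headers in uppercase, then countries
population = pd.DataFrame({
    "Geography": ["AFRICA", "Kenya", "Mali", "EUROPE", "Italy"],
    "Population": [0.0, 1.0, 2.0, 0.0, 3.0],  # illustrative values only
})

# True for region header rows, False for country rows
is_region = population["Geography"].str.isupper()

# Keep the name only on region rows, then carry it forward to the rows below
population["region"] = population["Geography"].where(is_region).ffill()

# Drop the region header rows so only countries remain
countries = population[~is_region].reset_index(drop=True)
```

Unlike the loop, this version also writes a region name onto the header rows themselves, which makes no difference once those rows are filtered out.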

In this step I rename and reorder the columns of our final merged dataframe.

In [165]:
# Rename columns
final_merged_df = final_merged_df.rename(columns={'name': 'article_title', 'quality_score': 'article_quality', 'Population': 'population'})
final_merged_df.drop('url', axis=1, inplace=True)
final_merged_df.drop('page_title', axis=1, inplace=True)

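# Rearrange columns
final_merged_df = final_merged_df[['country', 'region', 'population', 'article_title', 'revision_id', 'article_quality']]

# Save again so the CSV on disk has the renamed, reordered columns described in Outputs
final_merged_df.to_csv('wp_politicians_by_country.csv', index=False)
final_merged_df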

Unnamed: 0,country,region,population,article_title,revision_id,article_quality
0,Afghanistan,SOUTH ASIA,42.4,Majah Ha Adrif,1.233203e+09,Start
1,Afghanistan,SOUTH ASIA,42.4,Haroon al-Afghani,1.230460e+09,B
2,Afghanistan,SOUTH ASIA,42.4,Tayyab Agha,1.225662e+09,Start
3,Afghanistan,SOUTH ASIA,42.4,Khadija Zahra Ahmadi,1.234742e+09,Stub
4,Afghanistan,SOUTH ASIA,42.4,Aziza Ahmadyar,1.195651e+09,Start
...,...,...,...,...,...,...
7103,Zimbabwe,EASTERN AFRICA,16.7,Josiah Tongogara,1.203429e+09,C
7104,Zimbabwe,EASTERN AFRICA,16.7,Langton Towungana,1.246280e+09,Stub
7105,Zimbabwe,EASTERN AFRICA,16.7,Sengezo Tshabangu,1.228478e+09,Start
7106,Zimbabwe,EASTERN AFRICA,16.7,Herbert Ushewokunze,9.591118e+08,Stub


## 3. Data Analysis

### Overview
In the analysis section, we examine the distribution and characteristics of the quality scores that can be observed in the merged dataset we created. 

The analysis calculates total-articles-per-capita (the number of articles per person) and high-quality-articles-per-capita (the number of high-quality articles per person) on a country-by-country and regional basis.

For this analysis, a "high quality" article is one that ORES predicted to be in either the "FA" (featured article) or "GA" (good article) class.

Because `population_by_country_AUG.2024.csv` reports population in millions, the ratios calculated in this step are effectively articles per million people. Expressed as a true per-person ratio, they would be very small numbers.
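
To make the units concrete, here is a small worked example. Dividing by a population expressed in millions yields articles per million people, while dividing by the full head count yields the per-person ratio. The population figure is Afghanistan's value from the merged table above; the article count is hypothetical and chosen only for illustration.

```python
# Population as stored in the CSV, in millions of people
population_millions = 42.4

# Hypothetical article count, for illustration only
total_articles = 85

# What the notebook's division produces: articles per million people
per_million = total_articles / population_millions

# The same quantity as a true per-person ratio
per_person = total_articles / (population_millions * 1_000_000)
```

The two values differ only by a factor of one million, so rankings of countries are the same whichever unit is used.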

### Outputs
We output 6 tables as follows: 
- The 10 countries with the highest total articles per capita
- The 10 countries with the lowest total articles per capita
- The 10 countries with the highest high quality articles per capita
- The 10 countries with the lowest high quality articles per capita
- A rank-ordered list of geographic regions by total articles per capita
- A rank-ordered list of geographic regions by high quality articles per capita
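
The highest and lowest tables follow the same pattern: sort on the ratio column and keep the first rows. The sketch below uses a hypothetical four-row frame and keeps two rows instead of ten; only the column name `total_articles_per_capita` is taken from this notebook.

```python
import pandas as pd

# Hypothetical per-country statistics, for illustration only
stats = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "total_articles_per_capita": [4.0, 0.5, 2.5, 1.0],
})

# Highest ratios first
top = stats.sort_values("total_articles_per_capita", ascending=False).head(2)

# Lowest ratios first
bottom = stats.sort_values("total_articles_per_capita", ascending=True).head(2)
```

The regional rankings work the same way, without the `head` call, since every region is listed.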


In [174]:

# Count articles per country
country_article_count = final_merged_df.groupby('country')['article_title'].count().reset_index(name='total_articles')

# Count high-quality articles (FA or GA)
high_quality_df = final_merged_df[final_merged_df['article_quality'].isin(['FA', 'GA'])]
country_high_quality_count = high_quality_df.groupby('country')['article_title'].count().reset_index(name='high_quality_articles')

# Merge article counts with the original data to get population and region
country_stats_df = pd.merge(final_merged_df[['country', 'region', 'population']].drop_duplicates(), 
                            country_article_count, on='country', how='left')

country_stats_df = pd.merge(country_stats_df, country_high_quality_count, on='country', how='left').fillna(0)

# Calculate per capita ratios
country_stats_df['total_articles_per_capita'] = country_stats_df['total_articles'] / country_stats_df['population']
country_stats_df['high_quality_articles_per_capita'] = country_stats_df['high_quality_articles'] / country_stats_df['population']

# aggregate on a regional basis
region_article_count = final_merged_df.groupby('region')['article_title'].count().reset_index(name='total_articles_region')

# Count high-quality articles by region
region_high_quality_count = high_quality_df.groupby('region')['article_title'].count().reset_index(name='high_quality_articles_region')

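# Merge region stats
# Note: drop_duplicates() on ['region', 'population'] keeps one row per distinct
# (region, population) pair, i.e. roughly one row per country, not one row per region.
# The regional ratios below are therefore divided by individual country populations.
region_stats_df = pd.merge(final_merged_df[['region', 'population']].drop_duplicates(), 
                           region_article_count, on='region', how='left')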

region_stats_df = pd.merge(region_stats_df, region_high_quality_count, on='region', how='left').fillna(0)

# Calculate per capita ratios for regions
region_stats_df['total_articles_per_capita_region'] = region_stats_df['total_articles_region'] / region_stats_df['population']
region_stats_df['high_quality_articles_per_capita_region'] = region_stats_df['high_quality_articles_region'] / region_stats_df['population']
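
A possible alternative for the regional denominators, sketched here on hypothetical toy data, is to sum each country's population within its region first, so that every region appears exactly once. The `toy_df` values and the `region_population` and `region_stats` names are assumptions for illustration; this is not the code that produced the outputs shown in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for final_merged_df: one row per article.
toy_df = pd.DataFrame({
    "country": ["A", "A", "B", "C", "C", "C"],
    "region": ["R1", "R1", "R1", "R2", "R2", "R2"],
    "population": [1.0, 1.0, 3.0, 2.0, 2.0, 2.0],   # millions, repeated per article
    "article_title": ["a1", "a2", "b1", "c1", "c2", "c3"],
    "article_quality": ["GA", "Stub", "C", "FA", "GA", "Start"],
})

# Reduce to one row per country, then sum populations within each region.
region_population = (
    toy_df[["country", "region", "population"]]
    .drop_duplicates()
    .groupby("region", as_index=False)["population"]
    .sum()
)

# Article counts per region: all articles, and FA/GA articles only.
total = toy_df.groupby("region").size().reset_index(name="total_articles_region")
high = (
    toy_df[toy_df["article_quality"].isin(["FA", "GA"])]
    .groupby("region")
    .size()
    .reset_index(name="high_quality_articles_region")
)

# Left merges keep every region; regions with no FA/GA articles get 0.
region_stats = (
    region_population
    .merge(total, on="region", how="left")
    .merge(high, on="region", how="left")
    .fillna(0)
)
region_stats["total_articles_per_capita_region"] = (
    region_stats["total_articles_region"] / region_stats["population"]
)
region_stats["high_quality_articles_per_capita_region"] = (
    region_stats["high_quality_articles_region"] / region_stats["population"]
)
print(region_stats)
```

With this shape, sorting `region_stats` by either ratio gives one ranked row per region.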

1. Top 10 countries by coverage: The 10 countries with the highest total articles per capita (in descending order).

Mostly small island nations and microstates with small populations (only Monaco and Montenegro are European).

In [176]:
top_10_by_coverage = country_stats_df[['country', 'total_articles_per_capita']].sort_values(by='total_articles_per_capita', ascending=False).head(10)
top_10_by_coverage

Unnamed: 0,country,total_articles_per_capita
155,Tuvalu,inf
99,Monaco,inf
4,Antigua and Barbuda,330.0
97,Federated States of Micronesia,140.0
95,Marshall Islands,130.0
150,Tonga,100.0
12,Barbados,83.333333
101,Montenegro,63.333333
130,Seychelles,60.0
92,Maldives,55.0
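
The `inf` values for Tuvalu and Monaco arise because the ratio divides a nonzero article count by a population of zero. One possible way to keep such rows from dominating the ranking, sketched here on hypothetical toy data, is to treat a zero population as missing before dividing, so the ratio becomes `NaN` and can be excluded from the sort. The `toy_stats` values are made up, and whether to drop these countries or report them separately is a judgment call that the analysis above does not make.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the 0.0 mimics a population recorded as zero (assumption).
toy_stats = pd.DataFrame({
    "country": ["X", "Y", "Z"],
    "population": [0.0, 0.1, 10.0],     # millions
    "total_articles": [5, 33, 20],
})

# Replace zero populations with NaN so the division yields NaN instead of inf.
safe_population = toy_stats["population"].replace(0, np.nan)
toy_stats["total_articles_per_capita"] = toy_stats["total_articles"] / safe_population

# Drop countries whose ratio is undefined, then rank the rest.
ranked = (
    toy_stats.dropna(subset=["total_articles_per_capita"])
    .sort_values(by="total_articles_per_capita", ascending=False)
)
print(ranked[["country", "total_articles_per_capita"]])
```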


2. Bottom 10 countries by coverage: The 10 countries with the lowest total articles per capita (in ascending order).

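This list is led by large, populous countries such as China and India, although it also includes smaller countries such as Norway and Israel. Lower GDP may be a factor, but this analysis does not test that.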

In [181]:
bottom_10_by_coverage = country_stats_df[['country', 'total_articles_per_capita']].sort_values(by='total_articles_per_capita').head(10)
bottom_10_by_coverage

Unnamed: 0,country,total_articles_per_capita
32,China,0.011337
66,India,0.105698
57,Ghana,0.117302
127,Saudi Arabia,0.135501
164,Zambia,0.148515
109,Norway,0.181818
70,Israel,0.204082
46,Egypt,0.304183
72,Cote d'Ivoire,0.323625
103,Mozambique,0.353982


3. Top 10 countries by high quality: The 10 countries with the highest high quality articles per capita (in descending order).

Mostly small European countries. Higher GDP or greater internet access may play a role, but this analysis does not test either.

In [180]:
top_10_by_high_quality = country_stats_df[['country', 'high_quality_articles_per_capita']].sort_values(by='high_quality_articles_per_capita', ascending=False).head(10)
top_10_by_high_quality

Unnamed: 0,country,high_quality_articles_per_capita
101,Montenegro,5.0
87,Luxembourg,2.857143
1,Albania,2.592593
77,Kosovo,2.352941
86,Lithuania,2.068966
92,Maldives,1.666667
38,Croatia,1.315789
62,Guyana,1.25
112,Palestinian Territory,1.090909
134,Slovenia,0.952381


4. Bottom 10 countries by high quality: The 10 countries with the lowest high quality articles per capita (in ascending order).

In [179]:
bottom_10_by_high_quality = country_stats_df[['country', 'high_quality_articles_per_capita']].sort_values(by='high_quality_articles_per_capita').head(10)
bottom_10_by_high_quality

Unnamed: 0,country,high_quality_articles_per_capita
165,Zimbabwe,0.0
115,Paraguay,0.0
59,Grenada,0.0
119,Qatar,0.0
55,Gambia,0.0
122,St. Kitts and Nevis,0.0
123,St. Lucia,0.0
50,Estonia,0.0
49,Eritrea,0.0
48,Equatorial Guinea,0.0
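
All ten rows above have a ratio of 0.0, so the order among them reflects only how the sort breaks ties, not any difference in coverage. A small sketch, on hypothetical toy data, of listing every country with no high-quality articles instead of an arbitrary ten (the `toy_stats` values and the `zero_high_quality` name are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data standing in for country_stats_df.
toy_stats = pd.DataFrame({
    "country": ["P", "Q", "R", "S"],
    "high_quality_articles": [0, 2, 0, 1],
    "population": [1.5, 4.0, 0.3, 8.0],   # millions
})
toy_stats["high_quality_articles_per_capita"] = (
    toy_stats["high_quality_articles"] / toy_stats["population"]
)

# Every country with no FA/GA articles, listed alphabetically.
zero_high_quality = (
    toy_stats[toy_stats["high_quality_articles"] == 0]
    .sort_values(by="country")
)
print(f"{len(zero_high_quality)} countries have no high-quality articles")
print(zero_high_quality["country"].tolist())
```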


5. Geographic regions by total coverage: A rank ordered list of geographic regions (in descending order) by total articles per capita.

In [178]:
regions_by_total_coverage = region_stats_df[['region', 'total_articles_per_capita_region']].sort_values(by='total_articles_per_capita_region', ascending=False)
regions_by_total_coverage

Unnamed: 0,region,total_articles_per_capita_region
149,OCEANIA,inf
97,WESTERN EUROPE,inf
125,EASTERN AFRICA,6730.000000
4,CARIBBEAN,2200.000000
93,SOUTHERN EUROPE,1365.000000
...,...,...
132,SOUTHERN AFRICA,2.026359
66,SOUTHEAST ASIA,1.435235
72,EAST ASIA,1.228916
65,SOUTH ASIA,0.473191


6. Geographic regions by high quality coverage: Rank ordered list of geographic regions (in descending order) by high quality articles per capita.


In [177]:
regions_by_high_quality_coverage = region_stats_df[['region', 'high_quality_articles_per_capita_region']].sort_values(by='high_quality_articles_per_capita_region', ascending=False)
regions_by_high_quality_coverage

Unnamed: 0,region,high_quality_articles_per_capita_region
149,OCEANIA,inf
97,WESTERN EUROPE,inf
125,EASTERN AFRICA,190.000000
4,CARIBBEAN,90.000000
93,SOUTHERN EUROPE,88.333333
...,...,...
36,MIDDLE AFRICA,0.078201
104,WESTERN AFRICA,0.058088
72,EAST ASIA,0.024096
65,SOUTH ASIA,0.014700
