# Background

The United Nations peacekeeping mandate comes from Chapters VI and VII of the United Nations Charter. Since its establishment, the UN has deployed peacekeepers around the world to monitor and support ceasefires, creating a path to sustainable peace. Sadly, some peacekeepers serving in these missions lose their lives to malicious acts, accidents, illnesses, and self-inflicted causes. These deaths are registered in a publicly accessible dataset, which gives us an opportunity to build a data-informed overview of the UN peacekeeping missions.

# Situation & Task

We get a [dataset listing all registered deaths of UN peacekeepers](https://psdata.un.org/dataset/DPPADPOSS-FATALITIES). We want to quickly visualise the information from this dataset using QGIS to create an animation that would explain the changing geography and intensity of the missions.

# Scope & Limitations

We will map all UN peacekeeping missions that have had at least 1 fatality on the world map. We will be using the present borders of the countries. A country will be the smallest unit in our mapping since we won’t be able to pinpoint the location of the UN missions to sub-national geographical units. We will be using AI-assisted Python, AI-based data preprocessing, and QGIS.

# Result

![Peacekeeping missions gif](./images/peacekeeping_missions_full_size.gif)


# Data exploration

The initial data exploration process can be found [in an auxiliary notebook](./data_exploration.ipynb). 

![Data Exploration](./images/casualties_by_year.png)

# Process

1. I [download the dataset in .csv format](https://drive.google.com/open?id=1YTJ2V4vfMRSBIkoGl17EUlgnDP-9vZMp&usp=drive_fs).

2. I use the OpenAI API to help me determine the geography and timeline of each unique mission. I then send the results through the OpenAI API once more, asking it to act as a fact checker to minimise errors.

In [None]:
%%capture
!pip install pandas requests openai python-dotenv

In [None]:
import pandas as pd
import requests
import openai
import os
from dotenv import load_dotenv

from preprocessing import send_to_chatgpt, get_mission_data, add_verification_column

url = 'https://drive.google.com/uc?export=download&id=1YTJ2V4vfMRSBIkoGl17EUlgnDP-9vZMp'
df = pd.read_csv(url)

#use openai API to get data for every mission
mission_data = get_mission_data(df, send_to_chatgpt)

#use openai API to fact-check the answers received on the previous step
mission_data = add_verification_column(mission_data, send_to_chatgpt)

#export the data to excel file for manual processing
mission_data.to_excel("mission_countries_years_with_check.xlsx", index=False)

# Print the updated DataFrame for inspection
#print("\n--- Updated Mission Data with Verification Column ---")
#print(mission_data)

<div style="background-color: #fff8b0; border: 1px solid #e6c200; padding: 15px; border-radius: 8px; box-shadow: 2px 2px 5px rgba(0,0,0,0.1); width: fit-content; font-family: sans-serif;">
  📌 <strong>Note on using OpenAI for data preprocessing:</strong> In this example, I am using gpt-4o as the model for my requests. It gives better results than gpt-4o-mini but is significantly more expensive: running the query above cost me about USD 0.10 with gpt-4o and about USD 0.01 with gpt-4o-mini (April 2025). 
Regardless of the model used, the output is prone to error: the model both hallucinates wrong answers and fails to retrieve easily accessible ones. Therefore, using OpenAI for data preprocessing is advisable only for non-critical tasks with some tolerance for wrong answers.
To minimise errors, I introduce a feedback loop: I send the results to OpenAI again with a prompt asking it to perform fact-checking.
</div>
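
The feedback loop from the note above could be sketched roughly as follows. The real `get_mission_data` and `add_verification_column` helpers live in `preprocessing.py` and are not shown here, so `verify_answers`, `ask_model`, `fake_model`, the row keys, and the prompt wording are all hypothetical. The model call is passed in as a plain function so that the sketch runs offline, without an API key.

```python
from typing import Callable, Dict, List


def verify_answers(
    answers: List[Dict[str, str]],
    ask_model: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Send each first-pass answer back to the model and record its verdict.

    `ask_model` is a hypothetical stand-in for whatever function sends a
    prompt to the API and returns the reply text.
    """
    checked = []
    for row in answers:
        prompt = (
            "Act as a fact checker. Is the following statement about a UN "
            "mission correct? Reply 'OK' or 'REVIEW'.\n"
            f"Mission: {row['mission']}; country: {row['country']}"
        )
        verdict = ask_model(prompt).strip().upper()
        # Anything other than a clear 'OK' is flagged for manual review.
        flag = "OK" if verdict == "OK" else "REVIEW"
        checked.append({**row, "verification": flag})
    return checked


def fake_model(prompt: str) -> str:
    """Offline stub used only to demonstrate the control flow."""
    return "OK" if "ONUC" in prompt else "not sure"


rows = [
    {"mission": "ONUC", "country": "Democratic Republic of the Congo"},
    {"mission": "UNSecretariat", "country": ""},
]
result = verify_answers(rows, fake_model)
```

Treating every reply other than a clear "OK" as "REVIEW" is a deliberately conservative choice: an ambiguous answer from the model ends up in the manual check rather than silently passing.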


3. I manually clean the resulting table of missions, checking the entries that the fact-checking function marked for review.

One mission_acronym, UNSecretariat, doesn't have a corresponding geography. There are 9 fatalities associated with this code. Four of those come from the [1961 Ndola Transair Sweden DC-6 crash](https://en.wikipedia.org/wiki/1961_Ndola_Transair_Sweden_DC-6_crash), which killed Dag Hammarskjöld and his staff. I manually reclassify these four deaths to ONUC. Two more come from 12 March 2017, when [Zaida Catalán and Michael Sharp were murdered in the DRC](https://en.wikipedia.org/wiki/Zaida_Catal%C3%A1n). I reclassify these two deaths to MONUSCO. I wasn't able to find details on the remaining three deaths, so I excluded them from the analysis.
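
I did this reclassification by hand, but the same logic could be expressed in pandas. The snippet below is only a sketch on toy rows: the `incident_date` column name, the date format, and the `reassign` lookup are assumptions, not the dataset's actual schema.

```python
import pandas as pd

# Toy rows; the column names are assumptions, not the dataset's real schema.
fatalities = pd.DataFrame(
    {
        "mission_acronym": ["UNSecretariat", "UNSecretariat", "UNSecretariat", "ONUC"],
        "incident_date": ["1961-09-18", "2017-03-12", "1990-01-01", "1961-01-05"],
    }
)

# Incident date -> mission that the UNSecretariat deaths are reassigned to.
reassign = {"1961-09-18": "ONUC", "2017-03-12": "MONUSCO"}

is_secretariat = fatalities["mission_acronym"] == "UNSecretariat"
new_mission = fatalities["incident_date"].map(reassign)

# Overwrite only UNSecretariat rows that have a known reassignment.
mask = is_secretariat & new_mission.notna()
fatalities.loc[mask, "mission_acronym"] = new_mission[mask]

# Drop the UNSecretariat rows that could not be attributed to a mission.
fatalities = fatalities[fatalities["mission_acronym"] != "UNSecretariat"].reset_index(drop=True)
```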

Resulting tables:
[Fatalities](https://drive.google.com/file/d/10aIygNb5I5LXXCJgEDKZr064keo9QM2K/view?usp=sharing) 
[Missions](https://docs.google.com/spreadsheets/d/1SZgPl5hS7UHwdS8DKiUMrsT5GqMmQPCM/edit?usp=sharing&ouid=102406437055907246206&rtpof=true&sd=true)

4. I compare the unique country names in the missions table with the country names in the QGIS default world map to catch spelling mismatches. I then group the casualties by mission and by year.

In [None]:
import pandas as pd

from processing import check_country_names, replace_country_names, create_mission_year_fatalities

#Load data
url_missions = 'https://drive.google.com/uc?export=download&id=1SZgPl5hS7UHwdS8DKiUMrsT5GqMmQPCM'
url_countries = 'https://drive.google.com/uc?export=download&id=1RJtogT44ejo8jEv4f3rt-zpAauXUXjjV'
url_casualties = 'https://drive.google.com/uc?export=download&id=10aIygNb5I5LXXCJgEDKZr064keo9QM2K'

missions_df = pd.read_excel(url_missions)
countries_df = pd.read_excel(url_countries)
casualties_df = pd.read_csv(url_casualties)

# Check that the resulting country names match the spelling in the QGIS default map layer
check_country_names(missions_df, countries_df)

# Replace identified inconsistent names with names from QGIS
replace_country_names(missions_df)

# Group casualties by mission and year
create_mission_year_fatalities(missions_df, casualties_df)
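
The helpers imported from `processing.py` are not shown here, so the following is only a sketch of what the three steps might look like on toy data. The column names (`mission`, `country`, `incident_date`), the set of map names, and the replacement lookup are assumptions for illustration.

```python
import pandas as pd

# Toy inputs; column names are assumptions for illustration only.
missions = pd.DataFrame({"mission": ["ONUC", "UNIFIL"], "country": ["DR Congo", "Lebanon"]})
qgis_names = {"Democratic Republic of the Congo", "Lebanon"}
casualties = pd.DataFrame(
    {
        "mission": ["ONUC", "ONUC", "UNIFIL", "ONUC"],
        "incident_date": ["1961-09-18", "1961-01-05", "1978-05-01", "1962-03-03"],
    }
)

# 1. Names used in the missions table that the map layer does not know.
unmatched = sorted(set(missions["country"]) - qgis_names)

# 2. Replace them using a hand-written lookup.
missions["country"] = missions["country"].replace(
    {"DR Congo": "Democratic Republic of the Congo"}
)

# 3. Count fatalities per mission and year.
casualties["year"] = pd.to_datetime(casualties["incident_date"]).dt.year
per_year = casualties.groupby(["mission", "year"]).size().reset_index(name="fatalities")
```

Running the name check before the replacement keeps the lookup table small: only the names reported in `unmatched` need a manual mapping.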

5. I create a new layer in QGIS with the polygons for each mission using the present-day boundaries for the countries (this code is to be run in the QGIS Python console).

[Code for the QGIS Python console](./create_polygons.txt)

The resulting polygons look like this:
![polygons.png](./images/polygons.png)

We see that the majority of missions happen in Africa and the Middle East, with some missions occurring in Latin America and the Caribbean, and in Southeast Asia. This will inform our layout choice. 

6. I add the .csv file to my QGIS project as a delimited text layer.
7. I create a new virtual layer that combines data from the csv and the polygon layers:

[SQL code to run in the "New Virtual Layer" console](./create_virtual_layer.txt)

Using the field calculator, I create a new date field by applying the `to_date` function: `to_date("date")`

I create a rule-based symbology: light blue fill for active missions with no casualties, and different shades of orange and red for 1 casualty, 2-10, 11-20, 21-50, and more than 50 casualties.
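
The class breaks behind these rules can be written down as a small Python function. This is only a sketch of the binning logic, not the QGIS rule expressions themselves; the function name and the class labels are hypothetical, and it assumes that a count of exactly 50 belongs to the 21-50 class.

```python
def casualty_class(count: int) -> str:
    """Return the symbology class for a mission-year casualty count."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return "no casualties"
    if count == 1:
        return "1"
    if count <= 10:
        return "2-10"
    if count <= 20:
        return "11-20"
    if count <= 50:
        # Assumption: exactly 50 falls into this class, not the top one.
        return "21-50"
    return "more than 50"


classes = [casualty_class(n) for n in (0, 1, 2, 10, 11, 20, 21, 50, 51)]
```

Checking the boundary values (10, 11, 20, 21, 50, 51) is a quick way to confirm that the rules don't overlap or leave gaps.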

8. I then activate the Dynamic Temporal Control for that layer (Single Field with Date/Time; event duration 1 day).
To show the year in the animation, I create rule-based labeling and attach one label with the year (year(date)) to a single feature ("mission" = 'UNOWAS').
I then use data-defined placement to put this label where it is easy to notice and doesn't obstruct the view. I also add labels with the mission names, conditional on "active_operation" = 1, and move them around the map to arrange them properly.
![colours_and_labels.png](./images/colours_and_labels.png)


9. I create a layout with the Layout Manager, where I use a map of Africa, the Middle East, and Europe as the main map and add parts of Latin America and the Caribbean, and Southeast Asia as insets.

10. I use the Atlas feature to create a time-based animation. I create a new .csv layer with the years 1947 through 2025 and make it the atlas coverage layer. I then use the following expressions for the dynamic temporal control of each of my 3 maps (main and insets):

`to_datetime(attribute(@atlas_feature, 'year') || '-01-01')` for Start

`to_datetime(attribute(@atlas_feature, 'year') || '-12-31')` for End
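
The coverage file and the date range of each atlas page can be reproduced in plain Python. This is a sketch under assumptions: the single `year` column matches the attribute used in the expressions above, while `frame_range` is a hypothetical helper that mirrors the Start and End expressions rather than anything QGIS runs.

```python
import csv
import io
from datetime import datetime
from typing import Tuple

FIRST_YEAR, LAST_YEAR = 1947, 2025

# One row per year; this is the content of the atlas coverage CSV.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["year"])
for year in range(FIRST_YEAR, LAST_YEAR + 1):
    writer.writerow([year])
csv_text = buffer.getvalue()


def frame_range(year: int) -> Tuple[datetime, datetime]:
    """Python equivalent of the Start and End expressions for one atlas page."""
    start = datetime.strptime(f"{year}-01-01", "%Y-%m-%d")
    end = datetime.strptime(f"{year}-12-31", "%Y-%m-%d")
    return start, end


rows = csv_text.splitlines()
start_1961, end_1961 = frame_range(1961)
```

To produce the actual file, `csv_text` can be written to disk and then added to the project as a delimited text layer.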

11. I export the atlas, getting 79 pictures which I then combine into a gif:
![Peacekeeping missions gif](./images/peacekeeping_missions_full_size.gif)

# Nationalities

Additionally, I analyse the fatalities by nationality. The analysis is demonstrated in a [dedicated notebook](./countries_of_origin.ipynb).

![fatalities by year](./images/fatalities_by_year.gif)

# Insights

Looking at this gif, we can get some valuable insights into the UN peacekeeping missions:
* The number of missions increased after the end of the Cold War.
* Most missions happen in Africa.
* Africa is also the place of most deaths of UN personnel.

These insights are demonstrated via a 40-second gif, which is arguably less information-dense than text but more convincing.