# MAPPING AFRICA'S CONFLICT RELATIONSHIPS

**An exploration of actor-to-actor conflict dynamics across the African continent (1997–2014) using ACLED Dyadic Data**

## Business Understanding  
In Africa’s conflict zones, it’s not just about what happened; it’s about **who keeps coming back to fight whom**, and **where things are heating up**.  
That’s the part most datasets skip. But the ACLED Dyadic data? That’s where the real signal lives.

We’re not here to count bullets. We’re here to **map relationships**, **track escalations**, and surface the patterns that matter before things spiral.  
This project leans into that gap, turning actor-to-actor conflict data into something **actionable** for NGOs, peacebuilders, analysts, and anyone serious about understanding violence from the inside out.

## Project Overview  
We’re breaking down nearly two decades of conflict (**who fought whom**, **where**, and **how it played out**) to answer the questions that lead to better decisions:

- What dyads keep reappearing?
- Who’s triggering the worst violence?
- Which areas are consistently volatile?
- How do these relationships evolve?

From mapping conflict webs to scoring high-risk actor pairs, the goal is simple:  
**Give the right people the right lens before the next crisis hits.**

## Deliverables

- **Cleaned + enriched dataset** (dyad ID, conflict region, actor normalization, year breakdown)
- **Conflict dyad explorer**: who fights whom, how often, and with what impact
- **Escalation curves** for the most volatile dyads
- **Hotspot heatmaps** and regional breakdowns
- **Network graphs** showing actor relationships and central nodes
- **Dyadic Risk Score**: composite risk index based on frequency, intensity, and recency
- **Notebooks + visuals + ready-to-use summaries** for stakeholders

## Success Metrics

- Top 10 riskiest dyads identified and profiled  
- Escalation trends clearly visualized for key actor pairs  
- Accurate hotspot detection by region and year  
- Reusable code and clean outputs for policy teams or analysts  
- Project structured for future ACLED updates or country-focused expansions  

> This isn’t just a dataset. It’s a lens. One that tells us not just what happened, **but who’s likely to make it happen again.**
> Powered by data. Grounded in people.

# 2. Data Loading & Initial Exploration
- Load the Excel file
- Preview rows, columns, and datatypes
- Summary stats (rows, years covered, unique actors, countries)
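
The summary stats listed above can be sketched in a few lines of pandas. This is a minimal illustration on a toy frame, not the real ACLED file: the rows are made up, the column names assume the lowercase standardisation applied later in this notebook, and `quick_summary` is a hypothetical helper name.

```python
import pandas as pd

# Toy stand-in for the ACLED dyadic table (illustrative values, not real records)
toy_df = pd.DataFrame({
    "year": [1997, 1997, 2013, 2014],
    "country": ["Algeria", "Algeria", "Egypt", "Egypt"],
    "actor1": ["Group A", "Group A", "Rioters", "Rioters"],
    "actor2": ["Civilians", None, "Police", "Police"],
})

def quick_summary(df):
    """Headline counts: rows, year span, unique actors (either side), countries."""
    # Pool both actor columns so an actor is counted once wherever it appears
    actors = pd.concat([df["actor1"], df["actor2"]]).dropna()
    return {
        "rows": len(df),
        "year_min": int(df["year"].min()),
        "year_max": int(df["year"].max()),
        "unique_actors": int(actors.nunique()),
        "countries": int(df["country"].nunique()),
    }

summary = quick_summary(toy_df)
print(summary)
```

Pooling `actor1` and `actor2` before counting is a choice on my part; counting each column separately (as `describe` does) answers a slightly different question.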

# 3. Data Cleaning & Preprocessing
- Handle missing values
- Rename ambiguous columns
- Create new features: dyad name, year, month, actor interaction type
- Normalize actor names (optional)
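
A rough sketch of the feature-creation step, assuming lowercase column names and a datetime `event_date`. The actor names, the `"No second actor"` placeholder, and the `add_dyad_features` helper are all assumptions for illustration, not ACLED conventions.

```python
import pandas as pd

# Toy rows (made-up actors) standing in for the real table
toy_df = pd.DataFrame({
    "event_date": pd.to_datetime(["1997-01-02", "2013-05-30", "2014-01-15"]),
    "actor1": ["Militia X", "Army Y ", "Group A"],
    "actor2": ["Army Y", "Militia X", None],
})

def add_dyad_features(df):
    """Add an order-independent dyad label plus year and month columns."""
    out = df.copy()
    a1 = out["actor1"].str.strip()
    # Placeholder for one-sided events (assumed label, chosen for this sketch)
    a2 = out["actor2"].fillna("No second actor").str.strip()
    # Sort each pair so "X vs Y" and "Y vs X" collapse into one dyad
    out["dyad"] = [" vs ".join(sorted(pair)) for pair in zip(a1, a2)]
    out["year"] = out["event_date"].dt.year
    out["month"] = out["event_date"].dt.month
    return out

features = add_dyad_features(toy_df)
print(features[["dyad", "year", "month"]])
```

Sorting the pair treats a dyad as undirected. If who initiated matters (as in the directed graph planned for the network section), keep `actor1` and `actor2` in their original order instead.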

# 4. Exploratory Data Analysis (EDA)
## A. Univariate
- Most common Actor1 / Actor2
- Top countries, event types, interactions

## B. Bivariate
- Fatalities by dyad
- Dyad frequency vs. fatalities
- Temporal trend per country / actor
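
For the dyad-level comparisons above, one possible shape for the aggregation is a single groupby that yields event counts and fatality totals side by side. The sketch assumes a `dyad` column already exists (built during preprocessing) and uses made-up numbers.

```python
import pandas as pd

# Toy events; one fatality value is missing on purpose
toy_df = pd.DataFrame({
    "dyad": ["A vs B", "A vs B", "A vs C", "B vs C", "A vs B"],
    "fatalities": [10.0, None, 0.0, 3.0, 5.0],
})

dyad_stats = (
    toy_df.groupby("dyad")
    .agg(
        events=("fatalities", "size"),     # size counts rows, including missing values
        fatalities=("fatalities", "sum"),  # sum skips missing values
    )
    .sort_values("fatalities", ascending=False)
)
dyad_stats["fatalities_per_event"] = dyad_stats["fatalities"] / dyad_stats["events"]
print(dyad_stats)
```

Note that `sum` silently treats a missing fatality count as zero, so dyads with many unreported counts will look milder than they may be. Whether to impute, drop, or flag those rows is a decision for the cleaning step.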

## C. Geospatial
- Choropleth: conflicts by country
- Heatmap: event locations
- Regional focus maps

# 5. Network Analysis
- Construct directed graph of actors
- Degree, centrality, clustering
- Visualize with NetworkX or Plotly
- Highlight conflict communities
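
Before any graph library gets involved, the directed graph boils down to a weighted edge list. Here is a pandas-only sketch of that list plus a simple degree count, using toy actors and assuming `actor1` is treated as the source of each edge (my reading, not something the data guarantees).

```python
import pandas as pd

# Toy events; the last row has no second actor
toy_df = pd.DataFrame({
    "actor1": ["A", "A", "B", "C", "A"],
    "actor2": ["B", "B", "A", "A", None],
    "fatalities": [2.0, 3.0, 0.0, 1.0, 4.0],
})

# One row per directed pair; events without a second actor cannot form an edge
edges = (
    toy_df.dropna(subset=["actor2"])
    .groupby(["actor1", "actor2"], as_index=False)
    .agg(events=("fatalities", "size"), fatalities=("fatalities", "sum"))
)

# Degree = number of distinct opponents, split by direction
out_degree = edges.groupby("actor1")["actor2"].nunique()
in_degree = edges.groupby("actor2")["actor1"].nunique()
degree = out_degree.add(in_degree, fill_value=0).astype(int)

print(edges)
print(degree.sort_values(ascending=False))
```

An edge table in this shape should be usable with `networkx.from_pandas_edgelist` (passing `create_using=nx.DiGraph` for direction) when it is time for centrality, clustering, and community detection.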

# 6. Modeling & Scoring
## A. Fatality Classifier (LogReg / Tree)
- Predict high-fatality dyadic events

## B. Dyad Risk Scoring
- Create a composite score per dyad
- Rank and visualize
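
One way the composite score could be assembled from the three ingredients named in the deliverables (frequency, intensity, recency). Everything specific here is an assumption for illustration: the toy data, the min-max scaling, mean fatalities as the intensity measure, last active year as recency, and the 0.4 / 0.4 / 0.2 weights.

```python
import pandas as pd

# Toy events (made-up numbers)
toy_df = pd.DataFrame({
    "dyad": ["A vs B", "A vs B", "A vs C", "B vs C", "B vs C", "B vs C"],
    "year": [2013, 2014, 2001, 2010, 2011, 2012],
    "fatalities": [10.0, 30.0, 100.0, 0.0, 1.0, 2.0],
})

def min_max(s):
    """Scale a Series to [0, 1]; a constant Series maps to all zeros."""
    span = s.max() - s.min()
    return (s - s.min()) / span if span else s * 0.0

per_dyad = toy_df.groupby("dyad").agg(
    frequency=("year", "size"),        # how often the pair clashes
    intensity=("fatalities", "mean"),  # average deaths per event
    recency=("year", "max"),           # last year the pair was active
)

# Assumed weights; tune these with domain experts
weights = {"frequency": 0.4, "intensity": 0.4, "recency": 0.2}
scaled = per_dyad[list(weights)].apply(min_max)
per_dyad["risk_score"] = sum(scaled[col] * w for col, w in weights.items())

ranking = per_dyad.sort_values("risk_score", ascending=False)
print(ranking)
```

Min-max scaling is sensitive to outliers, and the fatalities column in this dataset is heavily skewed, so a log transform or rank-based scaling may be worth testing before trusting the ordering.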

## C. Temporal Prediction (Optional)
- Predict future dyadic recurrence or escalation

# 7. Insights & Dashboards
- Top 10 riskiest dyads
- Timeline of escalation by actor
- Region-wise conflict summaries
- Downloadable actor profiles

# 8. Deliverables
- Cleaned dataset
- Visuals + charts
- Notebook (.ipynb)
- PDF/HTML report
- GitHub repo with README

# 9. Conclusion & Next Steps
- Insights recap
- Policy relevance
- Future data integrations (refugees, elections, natural resources)

## INITIAL DATA EXPLORATION (IDE)

Every dataset tells a story, but before I dive into any narratives, I'll flip through the table of contents. This phase is about getting comfortable with the data: seeing what’s there, what’s missing, and what might surprise me later if I don’t pay attention now.

#### What's happening:
- Importing key libraries like 'pandas', 'numpy', 'seaborn', 'matplotlib', and 'plotly': the usual suspects for slicing, dicing, and visualizing data.
- Previewing the first few rows to get a feel for the dataset’s structure, naming conventions, and early red flags (no one likes nasty surprises 30 cells in).
- Checking the shape of the data, because whether it's 500 rows or 50,000 completely changes the game.
- Getting metadata (column datatypes and non-null counts).
- Getting basic summary statistics for both numerical and categorical columns.

This might not be the flashiest part of the workflow, but it’s where trust is built between me and the dataset. And as I’ve learned from previous projects, a few extra minutes spent here can save hours of confusion down the road.

Exploration done right is part instinct, part structure; this is BOTH!

In [1]:
# Mathematical computation and data manipulation libraries
import numpy as np
import pandas as pd

# Data visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.figure_factory as ff
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Modeling and ML libraries
from sklearn.preprocessing import LabelEncoder

# Load the data
conflict_df = pd.read_excel('ACLED Dyadic Relationships.xlsx')

# Preview first 5
conflict_df.head()

| | GWNO | EVENT_ID_CNTY | EVENT_ID_NO_CNTY | EVENT_DATE | YEAR | TIME_PRECISION | EVENT_TYPE | ACTOR1 | ALLY_ACTOR_1 | INTER1 | ... | ADMIN1 | ADMIN2 | ADMIN3 | LOCATION | LATITUDE | LONGITUDE | GEO_PRECIS | SOURCE | NOTES | FATALITIES |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 615 | 1ALG | 1 | 1997-01-02 | 1997 | 1 | Violence against civilians | GIA: Armed Islamic Group | | 2 | ... | Blida | Blida | | Blida | 36.4686 | 2.8289 | 1 | www.algeria-watch.org | 4 January: 16 citizens were murdered in the vi... | 16.0 |
| 1 | 615 | 2ALG | 2 | 1997-01-03 | 1997 | 1 | Violence against civilians | GIA: Armed Islamic Group | | 2 | ... | Tipaza | Douaouda | | Douaouda | 36.6725 | 2.7894 | 1 | www.algeria-watch.org | 5 January: Massacre of 18 citizens in the Oliv... | 18.0 |
| 2 | 615 | 3ALG | 3 | 1997-01-04 | 1997 | 1 | Violence against civilians | GIA: Armed Islamic Group | | 2 | ... | Tipaza | Hadjout | | Hadjout | 36.5139 | 2.4178 | 1 | www.algeria-watch.org | 6 January: 23 citizens were horribly mutilated... | 23.0 |
| 3 | 615 | 4ALG | 4 | 1997-01-05 | 1997 | 1 | Remote violence | GIA: Armed Islamic Group | | 2 | ... | Alger | Bouzareah | | Algiers | 36.766 | 3.05 | 1 | www.algeria-watch.org | 7 January: Explosion of a bomb in the Didouche... | 20.0 |
| 4 | 615 | 5ALG | 5 | 1997-01-09 | 1997 | 1 | Violence against civilians | GIA: Armed Islamic Group | | 2 | ... | Alger | Ouled Chebel | | Ouled Chebel | 36.5994 | 2.9944 | 1 | www.algeria-watch.org | 11 January: 5 citizens massacred in Ouled Cheb... | 5.0 |


In [2]:
# Check how many rows and columns I am working with
print(f'The dataset has {conflict_df.shape[0]} rows and {conflict_df.shape[1]} columns')

# Check column names to inform on standardisation needs
print('\nColumn Names:\n', conflict_df.columns)

The dataset has 99548 rows and 25 columns

Column Names:
 Index(['GWNO', 'EVENT_ID_CNTY', 'EVENT_ID_NO_CNTY', 'EVENT_DATE', 'YEAR',
       'TIME_PRECISION', 'EVENT_TYPE', 'ACTOR1', 'ALLY_ACTOR_1', 'INTER1',
       'ACTOR2', 'ALLY_ACTOR_2', 'INTER2', 'INTERACTION', 'COUNTRY', 'ADMIN1',
       'ADMIN2', 'ADMIN3', 'LOCATION', 'LATITUDE', 'LONGITUDE', 'GEO_PRECIS',
       'SOURCE', 'NOTES', 'FATALITIES'],
      dtype='object')


In [4]:
# Standardise column names
conflict_df.columns = (conflict_df.columns.str.strip().str.lower())

# Preview changes
conflict_df.sample(3)

| | gwno | event_id_cnty | event_id_no_cnty | event_date | year | time_precision | event_type | actor1 | ally_actor_1 | inter1 | ... | admin1 | admin2 | admin3 | location | latitude | longitude | geo_precis | source | notes | fatalities |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 23882 | 651 | 2104EGY | 23882 | 2013-05-30 | 2013 | 1 | Riots/Protests | Rioters (Egypt) | Students (Egypt) | 5 | ... | Giza | Zemam out | | Abou Rawash | 30.043548 | 31.097562 | 1 | Daily News Egypt | Minor clashes broke out with security forces a... | 0.0 |
| 75597 | 560 | 1476SAF | 75599 | 2012-05-14 | 2012 | 1 | Riots/Protests | Protesters (South Africa) | | 6 | ... | Gauteng | Johannesburg | | Johannesburg | -26.20227 | 28.04363 | 1 | South African Press Association | Disgruntled taxi drivers march to protest agai... | 0.0 |
| 25965 | 651 | 4182EGY | 25960 | 2014-01-15 | 2014 | 1 | Riots/Protests | Rioters (Egypt) | Muslim Brotherhood | 5 | ... | Fayoum | Zemam Out | | QaSr al Basil | 29.109865 | 30.821798 | 1 | Egyptian Organization for Human Rights (Cairo) | In the province of Fayoum, clashes in the vill... | 0.0 |


In [5]:
# Get metadata
conflict_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 99548 entries, 0 to 99547
Data columns (total 25 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   gwno              99548 non-null  int64         
 1   event_id_cnty     99548 non-null  object        
 2   event_id_no_cnty  99548 non-null  int64         
 3   event_date        99548 non-null  datetime64[ns]
 4   year              99548 non-null  int64         
 5   time_precision    99548 non-null  int64         
 6   event_type        99548 non-null  object        
 7   actor1            99548 non-null  object        
 8   ally_actor_1      14384 non-null  object        
 9   inter1            99548 non-null  int64         
 10  actor2            77440 non-null  object        
 11  ally_actor_2      8594 non-null   object        
 12  inter2            99548 non-null  int64         
 13  interaction       99548 non-null  int64         
 14  country           9954

In [6]:
# Get basic statistical info of numerical variables
conflict_df.describe()

| | gwno | event_id_no_cnty | year | time_precision | inter1 | inter2 | interaction | latitude | longitude | geo_precis | fatalities |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 99548.0 | 95070.0 |
| mean | 531.333096 | 49774.5 | 2007.951199 | 1.172259 | 3.414825 | 3.157864 | 30.286746 | 4.711653 | 23.942672 | 1.275535 | 6.208815 |
| std | 61.163793 | 28737.176636 | 5.678059 | 0.500496 | 2.121612 | 2.814157 | 17.573653 | 15.365206 | 16.858715 | 0.544663 | 100.121537 |
| min | 230.0 | 1.0 | 1997.0 | 1.0 | 1.0 | 0.0 | 10.0 | -34.71011 | -17.47389 | 1.0 | 0.0 |
| 25% | 490.0 | 24887.75 | 2003.0 | 1.0 | 2.0 | 1.0 | 13.0 | -1.466667 | 13.20841 | 1.0 | 0.0 |
| 50% | 520.0 | 49774.5 | 2010.0 | 1.0 | 3.0 | 2.0 | 27.0 | 4.31887 | 29.2833 | 1.0 | 0.0 |
| 75% | 560.0 | 74661.25 | 2013.0 | 1.0 | 5.0 | 7.0 | 38.0 | 11.01667 | 34.0 | 1.0 | 1.0 |
| max | 651.0 | 99548.0 | 2014.0 | 3.0 | 8.0 | 8.0 | 88.0 | 37.274423 | 51.2668 | 3.0 | 25000.0 |


In [7]:
# Get basic statistical info of categorical variables
conflict_df.describe(include = 'O').T

| | count | unique | top | freq |
|---|---|---|---|---|
| event_id_cnty | 99548 | 99548 | 7064SOM | 1 |
| event_type | 99548 | 9 | Battle-No change of territory | 30131 |
| actor1 | 99548 | 2627 | Unidentified Armed Group (Somalia) | 4301 |
| ally_actor_1 | 14384 | 2060 | Muslim Brotherhood | 1344 |
| actor2 | 77440 | 2245 | Civilians (Somalia) | 4165 |
| ally_actor_2 | 8594 | 1488 | MDC: Movement for Democratic Change | 904 |
| country | 99548 | 50 | Somalia | 15150 |
| admin1 | 99548 | 651 | Banaadir | 4752 |
| admin2 | 96169 | 3522 | Mogadisho | 4752 |
| admin3 | 48168 | 2854 | Harare City Council | 1686 |


In [8]:
# Check for duplicates and nulls
print('Duplicates:', conflict_df.duplicated().sum())
print('\nNull Values:\n', conflict_df.isna().sum())

Duplicates: 0

Null Values:
 gwno                    0
event_id_cnty           0
event_id_no_cnty        0
event_date              0
year                    0
time_precision          0
event_type              0
actor1                  0
ally_actor_1        85164
inter1                  0
actor2              22108
ally_actor_2        90954
inter2                  0
interaction             0
country                 0
admin1                  0
admin2               3379
admin3              51380
location                0
latitude                0
longitude               0
geo_precis              0
source                187
notes               10929
fatalities           4478
dtype: int64
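
Raw null counts are easier to judge as a share of the 99,548 rows. The sketch below redoes that arithmetic for a handful of columns, with the counts copied from the output above so it runs without the Excel file; on the real frame, `conflict_df.isna().mean() * 100` should give the same percentages in one step.

```python
import pandas as pd

# Null counts copied from the isna() output above (subset of columns)
null_counts = pd.Series({
    "ally_actor_1": 85164,
    "actor2": 22108,
    "ally_actor_2": 90954,
    "admin3": 51380,
    "notes": 10929,
    "fatalities": 4478,
})
n_rows = 99548  # total rows reported by the shape check

# Percentage of rows missing per column, largest first
null_share = (null_counts / n_rows * 100).round(1).sort_values(ascending=False)
print(null_share)
```

Seen this way, roughly one event in five has no second actor, which matters for a dyad-focused project: those rows need an explicit rule (drop, or label as one-sided) before any pair-level analysis.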
