<a href="https://colab.research.google.com/github/mohsud56/sudan-displacement-analysis/blob/main/sudan_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install plotly openpyxl



In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
from google.colab import files
from datetime import datetime

In [3]:
# Upload your Excel file: sudan_hrp_political_violence_events_and_fatalities_by_month-year_as-of-15jan2026.xlsx
uploaded = files.upload()
# After upload, note the exact file name
file_name = 'sudan_hrp_political_violence_events_and_fatalities_by_month-year_as-of-15jan2026.xlsx'  # Update if different
df = pd.read_excel(file_name, sheet_name='Data')  # Reads the 'Data' sheet

Saving sudan_hrp_political_violence_events_and_fatalities_by_month-year_as-of-15jan2026.xlsx to sudan_hrp_political_violence_events_and_fatalities_by_month-year_as-of-15jan2026.xlsx
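
Hardcoding the file name breaks if Colab saves the upload under a different name, for example when it appends a suffix because the same file was uploaded twice. `files.upload()` returns a dict mapping each saved file name to its bytes, so the name can be taken from that dict instead. A minimal sketch; `pick_uploaded_excel` is a hypothetical helper, not part of the notebook, and the dict below only stands in for a real upload:

```python
def pick_uploaded_excel(uploaded):
    """Return the name of the single .xlsx file in a files.upload() result."""
    names = [name for name in uploaded if name.lower().endswith('.xlsx')]
    if len(names) != 1:
        raise ValueError(f"Expected exactly one .xlsx upload, got: {names}")
    return names[0]


# Stand-in for the dict returned by files.upload(); the bytes are dummy values.
example_upload = {'sudan_data (1).xlsx': b'placeholder'}
file_name = pick_uploaded_excel(example_upload)
print(file_name)
```

In the notebook this would replace the hardcoded assignment: `file_name = pick_uploaded_excel(uploaded)`.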


In [5]:
# Create date column (assume day 1 for monthly data)
df['date'] = pd.to_datetime(df['Year'].astype(str) + '-' + df['Month'] + '-01')

# Filter since April 2023 (war start) to latest (Jan 2026)
df_war = df[df['date'] >= '2023-04-01']

# Group by month/year to get national totals (sum across regions)
df_national = df_war.groupby(['Year', 'Month', 'date']).agg({
    'Events': 'sum',
    'Fatalities': 'sum'
}).reset_index()

# Create a year_month label (e.g. '2023-04') for the chart x-axis
df_national['year_month'] = df_national['date'].dt.to_period('M').astype(str)

# Total fatalities and events since war start
total_fatalities = df_national['Fatalities'].sum()
total_events = df_national['Events'].sum()
print(f"Total reported fatalities since April 2023: {total_fatalities}")
print(f"Total reported events since April 2023: {total_events}")

Total reported fatalities since April 2023: 52125
Total reported events since April 2023: 14581
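
The date column is built by string concatenation and parsed without an explicit format, so a misspelled or unexpected month value could be parsed inconsistently or raise an error. The sketch below assumes `Month` holds full English month names (the notebook does not confirm this); `build_month_start` is a hypothetical helper that parses with a fixed format and turns bad rows into `NaT` so they can be inspected:

```python
import pandas as pd


def build_month_start(df, fmt='%Y-%B-%d'):
    """Return first-of-month timestamps from 'Year' and 'Month'; unparseable rows become NaT."""
    text = df['Year'].astype(str) + '-' + df['Month'].astype(str) + '-01'
    return pd.to_datetime(text, format=fmt, errors='coerce')


# Toy rows; 'Aprill' is a deliberate typo to show how bad values surface.
sample = pd.DataFrame({'Year': [2023, 2023, 2024],
                       'Month': ['April', 'Aprill', 'January']})
sample['date'] = build_month_start(sample)
print(sample[sample['date'].isna()])  # rows that failed to parse
```

If the column instead holds abbreviations or numbers, the `fmt` argument would need to change accordingly (for example `'%Y-%b-%d'` or `'%Y-%m-%d'`).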


In [13]:
# Melt to long format so Events and Fatalities plot as side-by-side (grouped) bars
df_melt = df_national.melt(id_vars=['year_month'], value_vars=['Events', 'Fatalities'],
                           var_name='Type', value_name='Count')

fig2 = px.bar(df_melt, x='year_month', y='Count', color='Type',
              title='Monthly Conflict Events and Fatalities in Sudan (2023–2026)',
              barmode='group')
fig2.show()

In [16]:
# Interactive dashboard: Explore the charts above!
# For more interactivity, you can add sliders or filters in Plotly if needed.
print("Dashboard: Use the interactive charts to zoom/explore trends!")

Dashboard: Use the interactive charts to zoom/explore trends!


In [15]:
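# Figures below are quoted from UNHCR reporting, not computed from the dataset.
# The two listed categories sum to ~10.6 million, not ~11.8 million;
# check the source before reusing these figures.
print("""
UNHCR Displacement Data (Latest as of Jan 2026):
- Internally Displaced Persons (IDPs): ~7.1 million
- Refugees who fled Sudan: ~3.5 million
- Total forcibly displaced: ~11.8 million

Displacement on this scale is consistent with the fatality and event counts above, though this notebook does not test the relationship directly.
""")


UNHCR Displacement Data (Latest as of Jan 2026):
- Internally Displaced Persons (IDPs): ~7.1 million
- Refugees who fled Sudan: ~3.5 million
- Total forcibly displaced: ~11.8 million

Displacement on this scale is consistent with the fatality and event counts above, though this notebook does not test the relationship directly.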
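
The link between displacement and violence can be measured if a monthly displacement series is available. The notebook has none, so the sketch below runs on synthetic placeholder values only; `lagged_correlations` is a hypothetical helper that computes the Pearson correlation between fatalities in one month and displacement a given number of months later:

```python
import pandas as pd


def lagged_correlations(fatalities, displaced, max_lag=3):
    """Pearson correlation between fatalities in month t and displacement in month t+lag."""
    return {lag: fatalities.corr(displaced.shift(-lag)) for lag in range(max_lag + 1)}


months = pd.date_range('2023-04-01', periods=6, freq='MS')
# Synthetic placeholders: displacement is built as 100x the previous month's fatalities.
toy_fatalities = pd.Series([10, 50, 20, 80, 30, 60], index=months)
toy_displaced = pd.Series([0, 1000, 5000, 2000, 8000, 3000], index=months)

result = lagged_correlations(toy_fatalities, toy_displaced, max_lag=2)
print(result)
```

Because the toy displacement series was constructed with a one-month delay, the correlation peaks at lag 1. On real data a peak at some lag would suggest, but not prove, that displacement follows violence.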



In [8]:
negative_fatalities = df_war[df_war['Fatalities'] < 0]

if not negative_fatalities.empty:
    print("Rows with negative 'Fatalities' values:")
    print(negative_fatalities)
else:
    print("No negative 'Fatalities' values found in df_war.")

No negative 'Fatalities' values found in df_war.
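
A negative-value check catches only one kind of problem. A slightly broader sketch, assuming the column names used elsewhere in this notebook (`Admin1`, `Year`, `Month`, `Events`, `Fatalities`); `basic_quality_report` is a hypothetical helper and the rows are toy data:

```python
import pandas as pd


def basic_quality_report(df):
    """Count a few common data problems in the region-by-month table."""
    keys = ['Admin1', 'Year', 'Month']
    values = ['Events', 'Fatalities']
    return {
        'negative_values': int((df[values] < 0).sum().sum()),
        'missing_values': int(df[keys + values].isna().sum().sum()),
        'duplicate_region_months': int(df.duplicated(subset=keys).sum()),
    }


# Toy rows with one duplicated region-month and one missing count.
toy = pd.DataFrame({
    'Admin1': ['Khartoum', 'Khartoum', 'Sennar'],
    'Year': [2023, 2023, 2023],
    'Month': ['April', 'April', 'May'],
    'Events': [5, 5, 2],
    'Fatalities': [12, 12, None],
})
report = basic_quality_report(toy)
print(report)
```

Duplicated region-months matter here because the later `groupby(...).sum()` calls would silently double-count them.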


In [9]:
df_regional_fatalities = df_war.groupby('Admin1')['Fatalities'].sum().reset_index()
print(df_regional_fatalities.head())

           Admin1  Fatalities
0           Abyei         545
1      Al Jazirah        4728
2  Bahr el Ghazal           0
3       Blue Nile         346
4  Central Darfur         647


In [10]:
df_regional_fatalities = df_regional_fatalities.sort_values(by='Fatalities', ascending=False)
print(df_regional_fatalities.head(10))

            Admin1  Fatalities
10    North Darfur       14469
9         Khartoum       11177
18     West Darfur        5247
1       Al Jazirah        4728
11  North Kordofan        4256
16    South Darfur        3097
19   West Kordofan        2883
17  South Kordofan        2450
15          Sennar         942
20      White Nile         765
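
Raw counts are easier to compare as shares of the national total. The sketch below takes the top four rows from the output above and the national total of 52,125 printed earlier; `add_shares` is a hypothetical helper, not part of the notebook:

```python
import pandas as pd


def add_shares(regional, total):
    """Add each region's percentage of `total` and the running cumulative percentage."""
    out = regional.sort_values('Fatalities', ascending=False).reset_index(drop=True)
    out['share_pct'] = (out['Fatalities'] / total * 100).round(1)
    out['cumulative_pct'] = (out['Fatalities'].cumsum() / total * 100).round(1)
    return out


# Top four rows copied from the output above; 52125 is the national total printed earlier.
top_regions = pd.DataFrame({
    'Admin1': ['North Darfur', 'Khartoum', 'West Darfur', 'Al Jazirah'],
    'Fatalities': [14469, 11177, 5247, 4728],
})
shares = add_shares(top_regions, total=52125)
print(shares)
```

In the notebook the same function could be applied to the full table with `add_shares(df_regional_fatalities, df_regional_fatalities['Fatalities'].sum())`.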


In [11]:
fig = px.bar(
    df_regional_fatalities,
    x='Admin1',
    y='Fatalities',
    title='Total Conflict Fatalities by Region in Sudan (April 2023-Jan 2026)',
    labels={'Admin1': 'Region', 'Fatalities': 'Total Fatalities'},
    color='Fatalities',  # Color by fatalities
    color_continuous_scale=px.colors.sequential.Reds  # Light-to-dark red scale: darker bars mean more fatalities
)

fig.update_layout(
    xaxis_tickangle=-45, # Rotate x-axis labels
    xaxis_title='Region',
    yaxis_title='Total Fatalities'
)
fig.show()

## Insights from Regional Fatalities Visualization

The regional bar chart shows where conflict fatalities have been concentrated since April 2023. Key observations:

*   **North Darfur** has the highest toll, with 14,469 reported fatalities, about 28% of the 52,125 national total.
*   **Khartoum** is second with 11,177 (about 21%), reflecting intense fighting in the capital.
*   **West Darfur** (5,247) and **Al Jazirah** (4,728) follow at less than half of Khartoum's count, showing that the conflict extends well beyond the capital.
*   Fatalities fall off steeply after that: the top four regions account for roughly 68% of the total, while regions such as Abyei (545), Bahr el Ghazal (0), and Red Sea report far fewer or none.

This geographic distribution of the conflict's human cost points to the areas most likely to need urgent humanitarian intervention and peace-building efforts.


## Summary:

### Q&A
Conflict fatalities since April 2023 are unevenly distributed across Sudan. North Darfur (14,469) and Khartoum (11,177) together account for about 49% of the 52,125 reported fatalities, followed by West Darfur (5,247) and Al Jazirah (4,728). In contrast, regions such as Abyei (545), Bahr el Ghazal (0), and Red Sea report far fewer fatalities or none.

### Data Analysis Key Findings
*   The `df_war` DataFrame contains no negative 'Fatalities' values, so it passes this basic sanity check before aggregation.
*   **North Darfur** has the highest number of conflict fatalities (14,469), making it the most severely affected region.
*   **Khartoum** is second (11,177), reflecting intense conflict in the capital.
*   **West Darfur** (5,247) and **Al Jazirah** (4,728) also record high fatalities, showing that the conflict's impact extends beyond the capital.
*   Fatalities are heavily concentrated: the top four regions account for roughly 68% of the national total.

### Insights or Next Steps
*   The concentration of fatalities in specific regions like North Darfur and Khartoum highlights critical areas for targeted humanitarian and peace-building interventions.
*   Further investigation into the specific causes and dynamics of conflict in the most affected regions could inform more effective conflict resolution strategies.
