# Conflict Model Prediction Analysis

**Objective:**
Develop a prediction model to analyze political violence and predict future events, trends, and fatality risks in Africa based on historical data.

**Questions a successful model could answer**:

1) Where are future political violence events likely to occur?

*Predict regions/countries most prone to future incidents.*

2) Which events are likely to have the highest fatality rates?

*Forecast event types with high mortality, enabling early interventions.*

3) What are the most frequent sub-event types linked to disorder types?

*Identify patterns linking specific sub-event types to disorder outcomes.*

4) Which actors are most involved in escalating violence?

*Highlight key actors contributing to increased violent activities over time.*

5) How do geographic and temporal trends correlate with event severity?


A successful model could help to:

*   Analyze the impact of location and time on event escalation and fatalities.
*   Inform strategic planning and conflict prevention by government agencies and humanitarian organizations.




In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

In [None]:
# Load the csv dataset
conflict_df = pd.read_csv("/content/Africa_1997-2024_Aug23.csv")

# Print the first 5 rows
print(conflict_df.head())

# Check datatypes and missing values

print(conflict_df.info())

  event_id_cnty  event_date  year  time_precision       disorder_type  \
0       ANG4104  2024-08-23  2024               1  Political violence   
1      BFO12464  2024-08-23  2024               1  Political violence   
2      BFO12471  2024-08-23  2024               1  Political violence   
3      BFO12472  2024-08-23  2024               1  Political violence   
4      CAO14533  2024-08-23  2024               1  Political violence   

  event_type sub_event_type  \
0      Riots   Mob violence   
1    Battles    Armed clash   
2    Battles    Armed clash   
3    Battles    Armed clash   
4    Battles    Armed clash   

                                              actor1  \
0                                   Rioters (Angola)   
1       JNIM: Group for Support of Islam and Muslims   
2       JNIM: Group for Support of Islam and Muslims   
3       JNIM: Group for Support of Islam and Muslims   
4  Islamic State (West Africa) and/or Boko Haram ...   

              assoc_actor_1  inter1  

In [None]:
conflict_df.head()

| | event_id_cnty | event_date | year | time_precision | disorder_type | event_type | sub_event_type | actor1 | assoc_actor_1 | inter1 | ... | location | latitude | longitude | geo_precision | source | source_scale | notes | fatalities | tags | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ANG4104 | 2024-08-23 | 2024 | 1 | Political violence | Riots | Mob violence | Rioters (Angola) | Vigilante Group (Angola) | 5 | ... | Luanda | -8.8383 | 13.2344 | 1 | Ango Noticias; Correio da Kianda; Novo Journal | National | On 23 August 2024, a mob assaulted a police of... | 1 | crowd size=no report | 1724714023 |
| 1 | BFO12464 | 2024-08-23 | 2024 | 1 | Political violence | Battles | Armed clash | JNIM: Group for Support of Islam and Muslims | | 2 | ... | Niempourou | 12.6018 | -3.2158 | 2 | Signal | New media | On 23 August 2024, JNIM ambushed a patrol of s... | 0 | | 1724714023 |
| 2 | BFO12471 | 2024-08-23 | 2024 | 1 | Political violence | Battles | Armed clash | JNIM: Group for Support of Islam and Muslims | | 2 | ... | Djibo | 14.0875 | -1.6418 | 1 | Al Zallaqa | New media | On 23 August 2024, JNIM claimed to have killed... | 3 | | 1724714023 |
| 3 | BFO12472 | 2024-08-23 | 2024 | 1 | Political violence | Battles | Armed clash | JNIM: Group for Support of Islam and Muslims | | 2 | ... | Diougo | 11.2472 | 0.1221 | 1 | Facebook; Whatsapp | New media | On 23 August 2024, JNIM militants attacked vol... | 10 | | 1724714023 |
| 4 | CAO14533 | 2024-08-23 | 2024 | 1 | Political violence | Battles | Armed clash | Islamic State (West Africa) and/or Boko Haram ... | | 2 | ... | Moskota | 10.9508 | 13.8671 | 2 | Humanity Purpose | New media | On 23 August 2024, ISWAP or Boko Haram militan... | 0 | | 1724714031 |
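
The temporal questions listed in the objective depend on `event_date` being a real datetime rather than a string. A minimal sketch of one way to parse it and aggregate fatalities by month follows; the toy rows and the `month` column are assumptions made for illustration, not values from the dataset.

```python
import pandas as pd

# Toy rows with the same column names as the dataset (hypothetical values)
toy = pd.DataFrame({
    "event_date": ["2024-08-23", "2024-08-23", "2024-07-02"],
    "fatalities": [1, 3, 10],
})

# Convert the string dates to datetime64 so date accessors become available
toy["event_date"] = pd.to_datetime(toy["event_date"])

# Derive a monthly period and sum fatalities per month
toy["month"] = toy["event_date"].dt.to_period("M")
monthly = toy.groupby("month")["fatalities"].sum()
print(monthly)
```

The same two steps could be applied to `conflict_df` before feature selection if monthly or seasonal features are wanted.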


In [None]:
# Feature Selection
# Pick region, country, year, latitude, longitude, fatalities, interaction, disorder_type, event_type

conflict_pred = conflict_df[["region", "country","year", "fatalities", "latitude", "longitude",
                              "interaction", "disorder_type", "event_type"]]

conflict_pred.head()



| | region | country | year | fatalities | latitude | longitude | interaction | disorder_type | event_type |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Middle Africa | Angola | 2024 | 1 | -8.8383 | 13.2344 | 15 | Political violence | Riots |
| 1 | Western Africa | Burkina Faso | 2024 | 0 | 12.6018 | -3.2158 | 12 | Political violence | Battles |
| 2 | Western Africa | Burkina Faso | 2024 | 3 | 14.0875 | -1.6418 | 24 | Political violence | Battles |
| 3 | Western Africa | Burkina Faso | 2024 | 10 | 11.2472 | 0.1221 | 24 | Political violence | Battles |
| 4 | Middle Africa | Cameroon | 2024 | 0 | 10.9508 | 13.8671 | 24 | Political violence | Battles |


In [None]:
from sklearn.preprocessing import LabelEncoder

# Initialize LabelEncoder for categorical variables
label_encoder = LabelEncoder()

# Apply encoding to the categorical columns
conflict_pred['event_type'] = label_encoder.fit_transform(conflict_pred['event_type'])
conflict_pred['disorder_type'] = label_encoder.fit_transform(conflict_pred['disorder_type'])
conflict_pred['region'] = label_encoder.fit_transform(conflict_pred['region'])
conflict_pred['country'] = label_encoder.fit_transform(conflict_pred['country'])


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  conflict_pred['event_type'] = label_encoder.fit_transform(conflict_pred['event_type'])
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  conflict_pred['disorder_type'] = label_encoder.fit_transform(conflict_pred['disorder_type'])
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  conflict_pred['region'] =
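
The warning above appears because `conflict_pred` was created by selecting columns from `conflict_df`, so pandas cannot tell whether the assignment targets a view or a copy. One possible way to avoid it is to take an explicit `.copy()` of the subset. The sketch below also keeps one `LabelEncoder` per column, since reusing a single encoder overwrites the earlier mappings. The toy frame and the `encoders` dictionary are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for conflict_df (hypothetical rows, same column names)
toy_df = pd.DataFrame({
    "region": ["Middle Africa", "Western Africa", "Western Africa"],
    "country": ["Angola", "Burkina Faso", "Burkina Faso"],
    "disorder_type": ["Political violence"] * 3,
    "event_type": ["Riots", "Battles", "Battles"],
    "fatalities": [1, 0, 3],
})

categorical_cols = ["region", "country", "disorder_type", "event_type"]

# An explicit copy makes the subset independent of the source frame
toy_pred = toy_df[categorical_cols + ["fatalities"]].copy()

# One encoder per column, so every label mapping is retained
encoders = {}
for col in categorical_cols:
    encoders[col] = LabelEncoder()
    toy_pred[col] = encoders[col].fit_transform(toy_pred[col])

# Map the integer codes back to the original event types
decoded = encoders["event_type"].inverse_transform(toy_pred["event_type"])
print(list(decoded))
```

Keeping the fitted encoders makes it possible to translate the integer class labels in a classification report back into event-type names.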

In [None]:
# Split the data into training and test sets

X = conflict_pred.drop("event_type", axis=1)
y = conflict_pred["event_type"]

# Split the data into 80% training and 20% test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Check the shape of the training and testing sets
print("Training set shape:", X_train.shape, y_train.shape)
print("Testing set shape:", X_test.shape, y_test.shape)

Training set shape: (305597, 8) (305597,)
Testing set shape: (76400, 8) (76400,)
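
The event types are not equally frequent, so a purely random split can shift class proportions between the training and test sets. A minimal sketch of a stratified split follows, using synthetic arrays (`X_toy` and `y_toy` are assumptions standing in for `X` and `y`).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows with imbalanced labels (80 of class 0, 20 of class 1)
X_toy = np.arange(200).reshape(100, 2)
y_toy = np.array([0] * 80 + [1] * 20)

# stratify=y_toy keeps the class proportions the same in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# Class counts in the training and test sets
print(np.bincount(y_tr), np.bincount(y_te))
```

In the notebook this would amount to passing `stratify=y` to the existing `train_test_split` call.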


In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Initialize and train the Logistic Regression model
logreg = LogisticRegression()
logreg.fit(X_train, y_train)

# Make predictions on the test set
y_pred = logreg.predict(X_test)

# Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Classification Report:\n", classification_report(y_test, y_pred))

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression


Accuracy: 0.5685471204188481
Classification Report:
               precision    recall  f1-score   support

           0       0.48      0.79      0.60     19526
           1       0.11      0.00      0.00      5648
           2       0.68      0.84      0.75     17914
           3       0.56      0.13      0.21      7981
           4       0.00      0.00      0.00      6827
           5       0.58      0.65      0.61     18504

    accuracy                           0.57     76400
   macro avg       0.40      0.40      0.36     76400
weighted avg       0.49      0.57      0.50     76400
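
The convergence warning above suggests raising `max_iter` or scaling the data. A minimal sketch of both follows, wrapping a `StandardScaler` and the classifier in a single pipeline. The data comes from `make_classification`, an assumption standing in for `X_train` and `y_train`, and `zero_division=0` is used so that classes that are never predicted report a precision of 0 without a warning.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic three-class data (hypothetical, not the conflict dataset)
X_syn, y_syn = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=42
)
# Exaggerate the scale of one feature, similar to mixing years with coordinates
X_syn[:, 0] *= 1000

X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42, stratify=y_syn
)

# Scaling happens inside the pipeline, so it is fitted on the training data only
scaled_logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_logreg.fit(X_tr, y_tr)

y_hat = scaled_logreg.predict(X_te)
print(classification_report(y_te, y_hat, zero_division=0))
```

Scores on this synthetic data say nothing about the conflict dataset; the point is only the pipeline structure.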





In [None]:
from sklearn.ensemble import RandomForestClassifier

# Initialize and train a Random Forest model
rf_model = RandomForestClassifier()
rf_model.fit(X_train, y_train)

# Predict on the test set
y_pred_rf = rf_model.predict(X_test)

# Evaluate the Random Forest model
print("Accuracy:", accuracy_score(y_test, y_pred_rf))
print("Classification Report:\n", classification_report(y_test, y_pred_rf))


Accuracy: 0.943913612565445
Classification Report:
               precision    recall  f1-score   support

           0       0.91      0.94      0.93     19526
           1       0.68      0.53      0.60      5648
           2       1.00      1.00      1.00     17914
           3       1.00      0.99      1.00      7981
           4       1.00      1.00      1.00      6827
           5       0.95      0.98      0.96     18504

    accuracy                           0.94     76400
   macro avg       0.92      0.91      0.91     76400
weighted avg       0.94      0.94      0.94     76400
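
Class 1 has the lowest recall (0.53) in the report above. A confusion matrix shows which classes such events are mistaken for. The sketch below uses made-up labels: the toy label lists are hypothetical and unrelated to the results above.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels, for illustration only
y_true_toy = ["Battles", "Battles", "Riots", "Riots", "Protests", "Protests"]
y_pred_toy = ["Battles", "Riots", "Riots", "Riots", "Protests", "Battles"]

# Fix the label order so rows and columns are easy to read
labels = ["Battles", "Protests", "Riots"]
cm = confusion_matrix(y_true_toy, y_pred_toy, labels=labels)

# Rows are true classes, columns are predicted classes
cm_df = pd.DataFrame(
    cm,
    index=[f"true: {name}" for name in labels],
    columns=[f"pred: {name}" for name in labels],
)
print(cm_df)
```

With the notebook's variables, the call would take `y_test` and `y_pred_rf`, and the integer codes could be mapped back to event-type names if the fitted encoder for `event_type` is kept.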

