### Systemic Crisis Prediction in Africa

This project uses a dataset covering 13 African countries from 1860 to 2014 to predict the emergence of systemic crises from indicators such as annual inflation rates.

Dataset: [**Dataset Link**](https://drive.google.com/file/d/1fTQ9R29kgAhInFO0HMqvkcAfSZWg6fCx/view)

**Steps**

- Data Exploration
  - Imported and inspected the dataset.
  - Used pandas profiling for insights on missing values, duplicates, and outliers.

- Data Cleaning
  - Removed duplicates and addressed outliers using statistical methods.
  - Encoded categorical features using Label Encoding.
  - Selected the target variable and relevant features based on correlation analysis.

- Model Training and Evaluation
  - Split data into training (80%) and test (20%) sets.
  - Trained classification models (e.g., Logistic Regression, Random Forest).
  - Evaluated using metrics like Accuracy, Precision, Recall, F1-Score, and ROC-AUC.

- Model Improvement
  - Explored feature engineering and hyperparameter tuning.
  - Discussed balancing techniques like SMOTE and advanced models (XGBoost, CatBoost).
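The Data Exploration step listed above can be approximated with plain pandas calls. This is a minimal sketch, not the profiling report itself: the toy DataFrame and its column names (`country`, `year`, `inflation_annual_cpi`, `systemic_crisis`) are assumptions made for illustration and may not match the real dataset's schema.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset (assumed column names).
df = pd.DataFrame({
    "country": ["Egypt", "Egypt", "Kenya", "Nigeria", "Nigeria"],
    "year": [1990, 1990, 1991, 1992, 1993],
    "inflation_annual_cpi": [3.1, 3.1, np.nan, 6.2, 950.0],
    "systemic_crisis": [0, 0, 1, 0, 1],
})

# Count missing values per column.
missing_per_column = df.isna().sum()

# Count rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())

# Summary statistics help spot extreme values such as the 950.0 entry.
summary = df.describe()

print(df.dtypes)
print(missing_per_column)
print("duplicate rows:", duplicate_rows)
print(summary)
```

A profiling report gives the same kind of information in one HTML page; the calls above are just the quickest way to check the same three things by hand.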
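The Data Cleaning steps listed above might look roughly like the sketch below. The source only says outliers were handled with "statistical methods", so the 1.5 × IQR rule used here is an assumption, as are the helper name `clean`, the toy data, and the column names.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; column names are illustrative assumptions.
raw = pd.DataFrame({
    "country": ["Egypt", "Egypt", "Kenya", "Kenya",
                "Nigeria", "Nigeria", "Nigeria", "Kenya"],
    "inflation_annual_cpi": [3.1, 3.1, 4.5, 5.0, 6.2, 950.0, 5.8, 4.9],
    "systemic_crisis": [0, 0, 0, 1, 0, 1, 0, 0],
})


def clean(df: pd.DataFrame, numeric_col: str, categorical_col: str) -> pd.DataFrame:
    """Drop duplicates, filter IQR outliers, and label-encode one column."""
    out = df.drop_duplicates().copy()

    # Assumed outlier rule: keep values within 1.5 * IQR of the quartiles.
    q1, q3 = out[numeric_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    within = out[numeric_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    out = out[within].copy()

    # Label Encoding maps each category to an integer code.
    out[categorical_col] = LabelEncoder().fit_transform(out[categorical_col])
    return out.reset_index(drop=True)


cleaned = clean(raw, "inflation_annual_cpi", "country")

# Rank features by absolute correlation with the (assumed) target column.
correlations = (
    cleaned.corr()["systemic_crisis"]
    .drop("systemic_crisis")
    .abs()
    .sort_values(ascending=False)
)

print(cleaned)
print(correlations)
```

Dropping outlier rows is only one option. For a crisis dataset, extreme inflation values may be exactly the signal of interest, so capping or log-transforming them could be a better fit than removal.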
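The Model Training and Evaluation steps listed above can be sketched as follows. Synthetic data from `make_classification` stands in for the real features, and the stratified split, `max_iter`, and `random_state` values are assumptions rather than settings taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the crisis data (about 10% positives).
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.9, 0.1], random_state=42
)

# 80/20 split; stratifying keeps the class ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # ROC-AUC needs scores, so use the predicted probability of class 1.
    y_score = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_score),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With a rare positive class, accuracy alone can look high even when few crises are detected, which is why recall, F1, and ROC-AUC are reported next to it.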
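For the Model Improvement step, hyperparameter tuning could be sketched with scikit-learn's `GridSearchCV`. The parameter grid, the F1 scoring choice, and the synthetic data are assumptions. Note also that `class_weight="balanced"` is used here as a scikit-learn-only stand-in for class balancing; it is a different technique from SMOTE, which resamples the training data and lives in the imbalanced-learn package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the crisis data.
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical search space; real ranges depend on the dataset.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
    "class_weight": [None, "balanced"],
}

# 3-fold cross-validated search, scored by F1 on the positive class.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=3,
)
search.fit(X_train, y_train)

# The best estimator is refit on the full training set by default.
test_f1 = f1_score(y_test, search.predict(X_test), zero_division=0)

print("best params:", search.best_params_)
print("cv f1:", round(search.best_score_, 3))
print("test f1:", round(test_f1, 3))
```

Tuning only on the training split and scoring once on the held-out test set keeps the final estimate honest; searching against the test set would overstate performance.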

**Tools and Libraries**

- Python, Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, Pandas Profiling, Imbalanced-learn.

**Key Insights**

- Effective preprocessing improved model performance.

- Collaboration provided strategies for enhancement.

In [2]:
# Import libraries: data handling and plotting
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# scikit-learn: preprocessing, models, and evaluation
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

In [None]:
# Importing dataset using pandas
dataset = pd.read_csv(r