# Terrorism in the Middle East - Data Cleaning & EDA

This notebook covers the data cleaning process and the exploratory data analysis (EDA) used to map the landscape of terrorism in the Middle East.

## Project Objectives

The project as a whole has four objectives:
<ol>
    <li>Gain an understanding of the landscape of terrorism in the Middle East.</li>
    <li>Establish a timeline of the development of terrorism in the Middle East.</li>
    <li>Understand the motivations of terrorists in the Middle Eastern region.</li>
    <li>Draw parallels to the observations made about terrorism in Europe.</li>
</ol>

Before these objectives can be addressed, the dataset must be cleaned. An initial analysis of the processed dataset can then be carried out.

### Importing Libraries & Dataset

In [None]:
# General purpose libraries.
import pandas as pd
import numpy as np
import sys

# Visualisation libraries.
import altair as alt
alt.data_transformers.disable_max_rows() # Disable max rows for Altair.
from vega_datasets import data

# Import functions.
sys.path.insert(1, '../Functions') # Setting path to 'Functions' folder.
from data_cleaning import data_cleaner # Import data cleaner object.
from map_creator import simple_map # Import simple map creator.
from eda import eda # Import EDA class.

# Setting path to data.
path = '../Data/'

In [None]:
# Importing data.
raw_data = pd.read_excel(path + 'globalterrorism.xlsx')

### Cleaning Data

A few pre-processing steps are required: the dataset is restricted to the Middle East & North Africa region, and columns with a high proportion of NaN values are dropped.

In [None]:
# Use the data cleaner function to clean the dataset. 
# For a full description of its functionality, please refer to the
# data_cleaning.py file.

# Set the region to 10 (Middle East & North Africa), the intended name of the cleaned file,
# a path where it should be saved, and insert the original dataset.

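# Stored as `cleaned_data` so the `data` module imported from vega_datasets is not overwritten.
cleaned_data = data_cleaner(region=10, file_name='region_middle_east.csv', path=path, df=raw_data)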
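
The internals of `data_cleaner` live in data_cleaning.py and are not shown in this notebook. As a rough illustration of the two steps described above (keeping one region and dropping sparse columns), a minimal pandas sketch could look like the following. The function name `filter_region_and_drop_sparse`, the `max_nan_fraction` threshold of 0.5, and the toy data are assumptions made for illustration only; the actual cleaner may use a different threshold and perform further steps.

```python
import numpy as np
import pandas as pd


def filter_region_and_drop_sparse(df, region, max_nan_fraction=0.5, region_col="region"):
    """Hypothetical helper: keep one region, then drop mostly-empty columns."""
    # Keep only the rows belonging to the requested region code.
    subset = df[df[region_col] == region]
    # Fraction of missing values per column, computed on the filtered rows.
    nan_fraction = subset.isna().mean()
    # Keep the columns whose missing fraction does not exceed the threshold.
    keep = nan_fraction[nan_fraction <= max_nan_fraction].index
    return subset.loc[:, keep].reset_index(drop=True)


# Toy data standing in for the raw dataset (values are made up).
toy = pd.DataFrame({
    "region": [10, 10, 8, 10],
    "iyear": [2001, 2005, 2003, 2010],
    "motive": [np.nan, np.nan, "x", np.nan],
    "gname": ["A", "B", "C", np.nan],
})
cleaned = filter_region_and_drop_sparse(toy, region=10)
print(cleaned)
```

In this toy example the `motive` column is entirely missing for region 10 and is dropped, while `gname` is missing in only one of three rows and is kept.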

In [None]:
# Load cleaned dataset.
df = pd.read_csv(path + 'region_middle_east.csv')

### Exploratory Data Analysis 

In [None]:
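# Instantiate the EDA class under a different name so that the imported
# `eda` class is not overwritten and the cell can be re-run.
analysis = eda(df)
plot = analysis.group_plot()
time, activity = analysis.simple_plots()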

#### Timeline of Terrorist Activity in the Middle East

In [None]:
time
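
How the EDA class builds this chart is defined in eda.py and not shown here. The aggregation behind a timeline of this kind is typically a count of incidents per year, which can be sketched in pandas as follows. The helper `incidents_per_year` and the toy data are hypothetical, and the sketch assumes the cleaned file still carries a year column named `iyear`.

```python
import pandas as pd


def incidents_per_year(df, year_col="iyear"):
    """Hypothetical helper: count rows (incidents) for each year."""
    # Each row is treated as one incident, so the group size is the yearly count.
    counts = df.groupby(year_col).size()
    return counts.rename("incidents").reset_index()


# Toy data standing in for the cleaned dataset (values are made up).
toy = pd.DataFrame({"iyear": [2001, 2001, 2003, 2003, 2003]})
yearly = incidents_per_year(toy)
print(yearly)
```

The resulting two-column table (year, incident count) is the shape a line or bar chart would consume.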

#### Activity per Group

In [None]:
activity

#### Timeline of Most Active Groups

In [None]:
plot

#### Map of Terrorist Activity in the Middle East Region

In [None]:
# Translation and scale values chosen to center the map on the Middle East & North Africa region.
map_projection = simple_map(df, x_translation=160, y_translation=650, map_scale=1200)
map_projection