# TURKEY'S POSITION IN THE GLOBAL TERRORISM DATABASE (1970-2017)

### AN EXPLORATORY DATA ANALYSIS (EDA) PREPARED BY ABDULKADİR GERÇEKSEVER

### Introduction 

The Global Terrorism Database (GTD) is an open-source database of terrorist attacks around the world from 1970 through 2017. It contains systematic data on more than 180,000 attacks that occurred during this period. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland. This analysis uses the GTD as its data source.

### The Aim of Analysis

The main aim of this exploratory data analysis is to get a clearer picture of terrorist activity, both worldwide and in Turkey, during the years 1970-2017. By the end of the analysis we should better understand how terrorism has developed in the world and in Turkey.

### General View of the Data

Variables which are used during EDA:

Year: The year in which the incident occurred.

Month: The number of the month in which the incident occurred.

Day: The numeric day of the month on which the incident occurred.

Country: The country or location where the incident occurred. 

Region: The region in which the incident occurred. The regions are divided into 12 categories.

City: The name of the city, village, or town in which the incident occurred. 

Lat.: The latitude (based on WGS1984 standards) of the city in which the event occurred.

Long.: The longitude (based on WGS1984 standards) of the city in which the event occurred.

AttackType: The general method of attack, which often reflects the broad class of tactics used. It consists of nine categories.

Killed: The total number of confirmed fatalities for the incident.

Wounded: The number of confirmed non-fatal injuries to both perpetrators and victims.

Target: The specific person, building, installation, etc., that was targeted and/or victimized and is a part of the entity. 

Group: The name of the group that carried out the attack.

Target_type: The general type of target/victim. This variable consists of 22 categories.

Weapon_type: The type of weapon. Up to four weapon types are recorded for each incident. It consists of 13 categories.

Motive: When reports explicitly mention a specific motive for the attack, this motive is recorded here.
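
The Year, Month and Day variables described above can be combined into a single date column. The snippet below is a minimal sketch on made-up rows, not part of the original analysis; it assumes that an unknown month or day is coded as 0 (as the GTD codebook describes), so such rows cannot form a valid date and become `NaT`. The `sample` frame and the `Date` column are hypothetical names used only for illustration.

```python
import pandas as pd

# Toy rows using the renamed columns from this notebook (values are made up).
sample = pd.DataFrame({
    "Year": [2016, 2016, 1999],
    "Month": [3, 0, 12],   # 0 is assumed to mean "month unknown"
    "Day": [13, 5, 31],
})

# pd.to_datetime can assemble dates from columns named year/month/day;
# errors="coerce" turns impossible dates (e.g. month 0) into NaT.
parts = sample.rename(columns={"Year": "year", "Month": "month", "Day": "day"})
sample["Date"] = pd.to_datetime(parts[["year", "month", "day"]], errors="coerce")

print(sample)
```

A real date column would make it possible to resample by month or quarter instead of working only with the yearly counts.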

In [None]:
# importing the modules used during the EDA

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as stats
from scipy.stats.mstats import winsorize
from statsmodels.stats.weightstats import ttest_ind
import io
import codecs
import base64
%matplotlib inline
from IPython.display import HTML, display
from matplotlib import animation,rc
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
from mpl_toolkits.basemap import Basemap
import plotly.tools as tls
import time
import warnings

warnings.filterwarnings('ignore')

In [None]:
# the first look through the data 

Terror_World = pd.read_csv("GTD.csv", encoding = "ISO-8859-1")
Terror_World.head()

In [None]:
# data types and numbers of variables

Terror_World.info()

In [None]:
# we need to see 135 columns to understand what we have in the data

for i in Terror_World.columns:
    print(i)

In [None]:
# we need to see rows and columns together to grasp the data

for i in range (0, 136, 10):
    display (Terror_World.iloc[0:3, i:i+10])

In [None]:
# rename the columns which we need to use during our EDA

Terror_World.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','country_txt':'Country', 'region_txt':'Region',
                             'city':'City', 'latitude':'Lat.', 'longitude':'Long.', 'attacktype1_txt':'AttackType','target1':'Target','nkill':'Killed',
                       'nwound':'Wounded', 'gname':'Group','targtype1_txt':'Target_type',
                       'weaptype1_txt':'Weapon_type','motive':'Motive'},
              inplace=True)

In [None]:
# new Terror data set which we are going to work on

Terror_World=Terror_World[['Year','Month','Day','Country','Region', 'City', 'Lat.','Long.', 'AttackType','Killed','Wounded',
                           'Target', 'Group','Target_type','Weapon_type','Motive']]

Terror_World

In [None]:
# new data types and numbers of variables which we are going to work on

Terror_World.info()

### Data Wrangling

In [None]:
# percentage of NaN values in each column

Terror_World.isnull().sum()*100/Terror_World.shape[0]

In [None]:
Terror=Terror_World.copy()

In [None]:
# dropping the NaNs from 'City' and 'Target'

Terror=Terror.dropna(subset=['City', 'Target'])

In [None]:
# filling the NaNs of each column with that column's own median

Terror['Killed'] = Terror['Killed'].fillna(Terror['Killed'].median())
Terror['Wounded'] = Terror['Wounded'].fillna(Terror['Wounded'].median())
# total casualties = fatalities + non-fatal injuries
Terror['Casualties']=Terror['Killed']+Terror['Wounded']

In [None]:
# filling the NaNs of 'Lat.' with 39 and the NaNs of 'Long.' with 35.15

Terror['Lat.'] = Terror['Lat.'].fillna(39)
Terror['Long.'] = Terror['Long.'].fillna(35.15)

In [None]:
Terror.isnull().sum()*100/Terror.shape[0]

In [None]:
Terror.describe()
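
The median imputation used in the wrangling step can be checked by hand on a few made-up numbers. This is only an illustrative sketch: the `toy` frame and its values are hypothetical, and each column is filled with its own median before the two are summed.

```python
import numpy as np
import pandas as pd

# Made-up values; NaN marks a missing count.
toy = pd.DataFrame({
    "Killed": [0.0, 2.0, np.nan, 10.0],
    "Wounded": [1.0, np.nan, 3.0, 50.0],
})

# Fill each column with its own median (NaNs are ignored when computing it).
for col in ["Killed", "Wounded"]:
    toy[col] = toy[col].fillna(toy[col].median())

# Total casualties per incident.
toy["Casualties"] = toy["Killed"] + toy["Wounded"]
print(toy)
```

The median is less sensitive than the mean to a few very large incidents, which is one reason to prefer it for count data with a long right tail.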

### Categorical Variables
- Country
- Region
- City
- AttackType
- Target
- Group
- Target_type
- Weapon_type
- Motive

### Continuous Variables

- Year
- Month
- Day
- Lat.
- Long.
- Killed
- Wounded
- Casualties
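
The split into categorical and numerical variables listed above can also be derived from the column dtypes. The following is a rough sketch on a hypothetical `toy` frame with made-up values; on the real data the same two `select_dtypes` calls could be applied to `Terror`.

```python
import pandas as pd

# Toy frame with a mix of numeric and text columns (values are made up).
toy = pd.DataFrame({
    "Year": [1994, 2016],
    "Country": ["Turkey", "Iraq"],
    "Killed": [2.0, 5.0],
    "AttackType": ["Bombing/Explosion", "Armed Assault"],
})

# Numeric columns (int/float) versus everything else.
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
categorical_cols = toy.select_dtypes(exclude="number").columns.tolist()

print("Numeric:", numeric_cols)
print("Categorical:", categorical_cols)
```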

### Data Exploration

In [None]:
# Number of terrorist activities by year

plt.subplots(figsize=(14,6))
sns.countplot(x='Year', data=Terror, palette='Spectral')
plt.xticks(rotation=70)
plt.title('Number Of Terrorist Activities By Year', color='red')
plt.show()

In [None]:
# region and country with the most attacks, and the deadliest single attack

print('Regions with Highest Terrorist Attacks:',Terror['Region'].value_counts().index[0])
print('Country with Highest Terrorist Attacks:',Terror['Country'].value_counts().index[0])
print('Maximum people killed in an attack are:',Terror['Killed'].max(),'that happened in',Terror.loc[Terror['Killed'].idxmax()].Country)

In [None]:
# Number of terrorist attacks by region on chart

plt.subplots(figsize=(14,6))
sns.countplot(x='Region', data=Terror, palette='Spectral', order=Terror['Region'].value_counts().index)
plt.xticks(rotation=90)
plt.title('Number Of Terrorist Activities By Region', color='red')
plt.show()

In [None]:
# The 18 countries with the most people killed in terrorist attacks

# summing only 'Killed' avoids aggregating the text columns
KilledByTerror=Terror.groupby('Country')['Killed'].sum().sort_values(ascending=False).iloc[:18]

In [None]:
# Number of terrorist attacks and number of killed by country on charts

plt.figure(figsize=(12,6))
plt.subplot(1,2,1)
sns.countplot(x='Country', data=Terror, palette='Spectral',
              order=Terror.Country.value_counts().iloc[:15].index)
plt.xticks(rotation=70)
plt.title('Number Of Terrorist Attacks by Country', fontsize=16, color='red')

plt.subplot(1,2,2)
plt.bar(KilledByTerror.index, KilledByTerror)
plt.xticks(rotation=70)
plt.title('Number Of Killed by Country', fontsize=16, color='red')

plt.show()

In [None]:
# Attacking methods by terrorists

plt.subplots(figsize=(14,6))
sns.countplot(x='AttackType', data=Terror, palette='Spectral', order=Terror['AttackType'].value_counts().index)
plt.xticks(rotation=70)
plt.title('Attacking Methods by Terrorists', fontsize=16, color='red')
plt.show()

In [None]:
# Favorite targets of attacks

plt.subplots(figsize=(14,6))
sns.countplot(x='Target_type', data=Terror, palette='Spectral', order=Terror['Target_type'].value_counts().index)
plt.xticks(rotation=90)
plt.title('Favorite Targets', fontsize=16, color='red')
plt.show()

In [None]:
# Terrorist groups by highest terror attacks

# .iloc[1:15] skips the most frequent label and keeps the next 14 groups
top_groups = Terror['Group'].value_counts().iloc[1:15]
sns.barplot(x=top_groups.values, y=top_groups.index, palette='Spectral')
plt.xticks(rotation=90)
fig=plt.gcf()
fig.set_size_inches(10,8)
plt.title('Terrorist Groups by Highest Terror Attacks', fontsize=16, color='red')
plt.show()

In [None]:
TR_Terror=Terror[Terror['Country']=='Turkey']
TR_Terror

In [None]:
TR_Terror.info()

In [None]:
TR_Terror.isnull().sum()*100/TR_Terror.shape[0]

In [None]:
# Number of attacks by year

# bin edges run to 2018 so that 2016 and 2017 each get their own bin
TR_Terror.Year.plot(kind = 'hist', color = 'b', bins=range(1970, 2019), figsize = (15,7), alpha=0.5, grid=True)
plt.xticks(range(1970, 2018), rotation=70, fontsize=14)
plt.yticks(fontsize=14)
plt.xlabel("Year", fontsize=15)
plt.ylabel("Number of Attacks", fontsize=15)
plt.title("Number of Attacks By Year", fontsize=16, color = 'r')
plt.show()

In [None]:
# Most targeted part of Turkey, split into quadrants at 39°N and 35.15°E

# Boolean masks; strict comparisons leave out rows whose missing coordinates
# were filled with exactly 39 / 35.15 during data wrangling
east = TR_Terror['Long.'] > 35.15
west = TR_Terror['Long.'] < 35.15
north = TR_Terror['Lat.'] > 39
south = TR_Terror['Lat.'] < 39

# Masks are combined element-wise with &; Python's `or` raises an error on DataFrames
NorthEast = TR_Terror[north & east]
NorthWest = TR_Terror[north & west]
SouthEast = TR_Terror[south & east]
SouthWest = TR_Terror[south & west]

# Number of attacks in each part
Parts = pd.Series({"NorthEast": len(NorthEast), "NorthWest": len(NorthWest),
                   "SouthEast": len(SouthEast), "SouthWest": len(SouthWest)})

Parts.sort_values(ascending=False).plot.bar(figsize=[12,6], grid=True, alpha=0.8)
plt.yticks(fontsize=10)
plt.xticks(rotation=70, fontsize=10)
plt.xlabel("Parts of Turkey", fontsize=15)
plt.ylabel("Number of Attacks", fontsize=15)
plt.title("Most Targeted Parts of Turkey", fontsize=16, color = 'r')
plt.show()

In [None]:
# Most targeted cities

TR_Terror.City.value_counts().drop('Unknown').head(10).plot.bar(figsize=[12,6], grid=True, alpha=0.8)
plt.yticks(fontsize=10)
plt.xticks(fontsize=10)
plt.xlabel("Cities", fontsize=15)
plt.ylabel("Number of Attacks", fontsize=15)
plt.xticks(rotation=70)
plt.title("Most Targeted Cities", fontsize=16, color = 'r')
plt.show()

In [None]:
!pip install wordcloud

In [None]:
from wordcloud import WordCloud
df = TR_Terror[TR_Terror.City != 'Unknown']
wordcloud = WordCloud(max_font_size=80, max_words=100, background_color="yellow").generate(" ".join(df.City))
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.savefig("graph.png")
plt.show()

In [None]:
# Attacking methods by terrorists in Turkey

plt.subplots(figsize=(14,6))
sns.countplot(x='AttackType', data=TR_Terror, palette='Spectral', order=TR_Terror['AttackType'].value_counts().index)
plt.xticks(rotation=70)
plt.title('Attacking Methods by Terrorists', fontsize=16, color='red')
plt.show()

In [None]:
# Favorite targets of terrorists in Turkey

plt.subplots(figsize=(14,6))
sns.countplot(x='Target_type', data=TR_Terror, palette='Spectral', order=TR_Terror['Target_type'].value_counts().index)
plt.xticks(rotation=90)
plt.title('Favorite Targets of Terrorists', fontsize=16, color='red')
plt.show()

In [None]:
# The 12 most frequent group labels in Turkey

tr_groups = TR_Terror['Group'].value_counts().iloc[0:12]
sns.barplot(x=tr_groups.values, y=tr_groups.index, palette='Spectral')
plt.xticks(rotation=90)
fig=plt.gcf()
fig.set_size_inches(10,8)
plt.title('Terrorist Groups by Highest Terror Attacks', fontsize=16, color='red')
plt.show()

In [None]:
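# correlations between the numeric columns only (text columns cannot be correlated)
TR_Terror.select_dtypes('number').corr()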

In [None]:
f,ax = plt.subplots(figsize=(13, 13))
sns.heatmap(TR_Terror.select_dtypes('number').corr(), annot=True, linewidths=.5, fmt= '.1f',ax=ax)
plt.show()

In [None]:
PKK=TR_Terror[TR_Terror['Group']=="Kurdistan Workers' Party (PKK)"]
PKK

In [None]:
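# Assumed comparison (not stated in the original): fatalities per attack, PKK vs. all other groups in Turkey, Welch's t-test
stats.ttest_ind(PKK['Killed'], TR_Terror.loc[TR_Terror['Group'] != "Kurdistan Workers' Party (PKK)", 'Killed'], equal_var=False)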

### Conclusions

From this analysis we conclude the following:
- Worldwide, the number of terrorist activities stays roughly stable for about three years after 9/11 and then rises steadily from 2005 to 2014. Although it declines from 2014 to 2017, activity after 2005 remains well above the levels seen between 1970 and 2005.
- The number of terrorist activities is highest in the Middle East & North Africa, followed by South Asia. Iraq, in the Middle East, ranks first in both terrorist activities and people killed, followed by Pakistan and Afghanistan. Bombing/explosion is the most common attack method. The most frequent targets are private citizens and property, followed by military, police, government and business targets.
- The Taliban carried out the most attacks, followed by ISIL.
- In Turkey, the number of terrorist activities increases from 1987 to 1999, a rise associated with the emergence of the Kurdistan Workers' Party (PKK). Activity increases again after 2014. The pattern in Turkey does not appear to be related to 9/11; it seems to reflect characteristics specific to the country and related events, such as its geographical position between Europe and Asia.
- Within Turkey, the number of terrorist activities is highest in İstanbul, followed by Ankara and Diyarbakır. Bombing/explosion is the most common attack method, followed by armed assault. As in the rest of the world, the most frequent targets are private citizens and property, followed by military, police, government and business targets.
- Most of the terror attacks in Turkey were carried out by the Kurdistan Workers' Party (PKK).
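
The statements about periods with higher or lower activity are read off the yearly count plots. One way to put a number on such a comparison is the average number of attacks per year in each period. The sketch below uses a hypothetical helper, `mean_attacks_per_year`, and a made-up `toy` frame; it is an illustration under those assumptions, not a result from the GTD.

```python
import pandas as pd

def mean_attacks_per_year(df, start, end):
    """Average number of rows (attacks) per calendar year in [start, end]."""
    years = df.loc[df["Year"].between(start, end), "Year"]
    # reindex so that years without any attack count as 0
    counts = years.value_counts().reindex(range(start, end + 1), fill_value=0)
    return counts.mean()

# Made-up incidents: one row per attack.
toy = pd.DataFrame({"Year": [2003, 2004, 2006, 2006, 2007, 2007, 2007]})

before = mean_attacks_per_year(toy, 2003, 2004)
after = mean_attacks_per_year(toy, 2005, 2007)
print(before, after)
```

On the real data the same helper could be called with `Terror` or `TR_Terror` and the periods named in the conclusions, for example 1970-2004 against 2005-2017.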


