# Global Terrorism Analysis

An analysis of the [Global Terrorism Database](https://www.kaggle.com/START-UMD/gtd) (GTD), as published on Kaggle.



* Direct link: https://www.kaggle.com/START-UMD/gtd (requires a Kaggle account)
* Terrorist attacks in the period 1970-2017
* Each row represents a terrorist attack
* Used fields:
    * country_txt: country name
    * region_txt: macro-region {Western Europe, Eastern Europe, Middle East, North America, ...}
    * nkill: number of people killed
    * nwound: number of people wounded
    * latitude
    * longitude
    * attacktype1_txt: type of attack {assassination, hostage taking, bombing, ...}

We analyze several aspects:  
 * Most affected countries
 * Most affected macro-regions in the world
 * Most active terrorist groups
 * Trends over time: world-wide and Western Europe

### Import & Load dataset

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns

In [None]:
df_terr = pd.read_csv('globalterrorismdb_0718dist.csv', delimiter=",", low_memory=False, encoding='ISO-8859-1')

In [None]:
df_terr.sample(3)

In [None]:
print(f"Number of rows: {df_terr.shape[0]}")
print(f"Number of columns: {df_terr.shape[1]}")
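
Since only a subset of the columns is kept later on, one option is to read just those columns with the `usecols` parameter of `pd.read_csv`. The snippet below is a minimal sketch on a tiny in-memory CSV: the rows and the `extra_col` column are invented for illustration and are not taken from the GTD file.

```python
import io
import pandas as pd

# Tiny stand-in for the real CSV; all values are made up for illustration.
csv_text = (
    "eventid,iyear,country_txt,nkill,nwound,extra_col\n"
    "1,1970,Italy,1,0,a\n"
    "2,1971,Iraq,,3,b\n"
)

# Only the listed columns are parsed; the others are skipped while reading.
wanted = ["iyear", "country_txt", "nkill", "nwound"]
df_small = pd.read_csv(io.StringIO(csv_text), usecols=wanted)

print(df_small.columns.tolist())
print(df_small.shape)
```

With the real file, the same `usecols` list would be passed alongside the existing `encoding` argument.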

### Data preparation: column rename & pre-processing

In [None]:
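# Rename the date and target columns to shorter names
df_terr = df_terr.rename(columns={'iyear': 'year',
                                  'imonth': 'month',
                                  'iday': 'day',
                                  'target1': 'target'})

# Keep only the columns used in the analysis; .copy() avoids a SettingWithCopyWarning later
df_terr = df_terr[['year', 'month', 'day', 'country_txt', 'region_txt',
                   'city', 'latitude', 'longitude', 'attacktype1_txt',
                   'nkill', 'nwound', 'target', 'summary', 'gname',
                   'motive']].copy()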


In [None]:
df_terr = df_terr.assign(nhit=df_terr["nkill"] + df_terr["nwound"])  # added column for analysis: killed + wounded (NaN if either count is missing)
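
One detail worth keeping in mind: adding two columns with `+` yields NaN whenever either value is missing, so such attacks contribute nothing to later sums of `nhit`. The sketch below compares that behaviour with `Series.add(..., fill_value=0)`, which treats a single missing count as zero. The `toy` frame is hypothetical, and treating missing counts as zero is an assumption, not something the notebook does.

```python
import pandas as pd

# Hypothetical casualty counts; NaN marks a missing value.
toy = pd.DataFrame({
    "nkill": [2.0, float("nan"), 1.0],
    "nwound": [5.0, 4.0, float("nan")],
})

# Plain addition: the result is NaN if either operand is missing.
strict = toy["nkill"] + toy["nwound"]

# Alternative (assumption): count a missing value as 0 when the other one is known.
lenient = toy["nkill"].add(toy["nwound"], fill_value=0)

print(strict.sum(), lenient.sum())
```

Which variant is appropriate depends on whether a missing count should be read as "zero" or as "not recorded".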

### Most hit countries



In [None]:
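# Sum per country; numeric_only=True skips text columns such as city and summary
df_terr_most_hit = df_terr.groupby("country_txt").sum(numeric_only=True)
df_terr_most_hit = df_terr_most_hit.sort_values(by="nhit", ascending=False)
df_terr_most_hit = df_terr_most_hit.head(10)  # ten most affected countries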

In [None]:
fig=px.bar(df_terr_most_hit, y="nhit", x=df_terr_most_hit.index, color=df_terr_most_hit.index, width=1600, height=900)
fig.layout["yaxis"]["title"] = "Number of persons affected"
fig.layout["xaxis"]["title"] = "Country"
fig.layout["title"] = "Number of killed + wounded, by country"
fig.update_layout(font_size=20)
fig.show()

Iraq is by far the most affected country.
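
A statement like this can be made more concrete by expressing each country's total as a share of the overall total. The following is a minimal sketch on invented data: the country labels `A`, `B`, `C` and their counts are hypothetical and do not come from the GTD.

```python
import pandas as pd

# Hypothetical per-attack casualty counts for three made-up countries.
toy = pd.DataFrame({
    "country_txt": ["A", "A", "B", "C"],
    "nhit": [10, 50, 30, 10],
})

# Total per country, largest first.
by_country = toy.groupby("country_txt")["nhit"].sum().sort_values(ascending=False)

# Fraction of all casualties attributed to each country.
share = by_country / by_country.sum()

print(share)
```

Applied to the real data, `share` would show how far the top country stands above the rest.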

### Most hit macro-regions

In [None]:
df_terr_by_region = df_terr.groupby("region_txt").sum(numeric_only=True)  # skip text columns
df_terr_by_region = df_terr_by_region.sort_values(by="nhit", ascending=False)

In [None]:
fig = px.bar(df_terr_by_region, y="nhit", x=df_terr_by_region.index, color=df_terr_by_region.index, width=1600, height=800)
fig.layout["yaxis"]["title"] = "Number of persons affected"
fig.layout["xaxis"]["title"] = "Macro-region"
fig.layout["title"] = "Number of killed + wounded, by macro-region"
fig.update_layout(font_size=20)
fig.show()

Middle East & North Africa is the most affected macro-region.

### Number of victims by terrorist group

In [None]:
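# Sum per group; numeric_only=True skips text columns such as city and summary
df_terr_group = df_terr.groupby("gname").sum(numeric_only=True)
df_terr_group = df_terr_group.sort_values(by="nkill", ascending=False)
df_terr_group = df_terr_group.head(20)  # twenty groups with the most people killed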

In [None]:
fig = px.bar(df_terr_group, y=df_terr_group.index, x="nkill", color=df_terr_group.index, width=1600, height=900)
fig.layout["xaxis"]["title"] = "Victims"
fig.layout["yaxis"]["title"] = "Terrorist Groups"
fig.layout["title"] = "Number of people killed, by terrorist group"
fig.update_layout(font_size=22)
fig.show()

The largest number of deaths comes from attacks whose perpetrator group is unknown.
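
To rank only the identified groups, the rows without an attributed group can be filtered out before aggregating. This is a sketch on invented data; it assumes that unattributed attacks carry the literal label `"Unknown"` in `gname`, and the group names `Group X` and `Group Y` are hypothetical.

```python
import pandas as pd

# Hypothetical attacks; "Unknown" is assumed to be the placeholder for unattributed ones.
toy = pd.DataFrame({
    "gname": ["Unknown", "Group X", "Unknown", "Group Y", "Group X"],
    "nkill": [5, 3, 7, 1, 4],
})

# Drop unattributed attacks, then total the deaths per group.
known = toy[toy["gname"] != "Unknown"]
ranking = known.groupby("gname")["nkill"].sum().sort_values(ascending=False)

print(ranking)
```

The same filter could be applied to `df_terr` before the `groupby` so that the bar chart shows named groups only.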

### Trend of attacks worldwide

In [None]:
df_terr_year_count = df_terr.groupby(["year"]).size()
fig = px.line(x=df_terr_year_count.index, y=df_terr_year_count.values, width=1400, height=600)
fig.layout["title"] = "Terrorist Attacks in the World"
fig.layout["xaxis"]["title"] = "Year"
fig.layout["yaxis"]["title"] = "Total attacks per year"
fig.update_layout(showlegend=False, font_size=22)
fig.show()

The number of attacks rises sharply from 2011, coinciding with the start of the Syrian civil war, and peaks in 2014.
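
Reading the peak and the steepest rise off the chart can be cross-checked numerically with `idxmax` and `pct_change`. The sketch below uses an invented yearly series: the years and counts are hypothetical and deliberately do not mirror the GTD figures.

```python
import pandas as pd

# Hypothetical attacks per year (made-up values, indexed by year).
counts = pd.Series({2001: 100, 2002: 110, 2003: 220, 2004: 260, 2005: 250})

# Relative change with respect to the previous year (first year is NaN).
yoy = counts.pct_change()

peak_year = counts.idxmax()  # year with the most attacks
jump_year = yoy.idxmax()     # year with the largest relative increase

print(peak_year, jump_year)
```

On the real data, `counts` would be `df_terr.groupby("year").size()`.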

### Attacks in Western Europe

In [None]:
df_terr_eu = df_terr[df_terr["region_txt"] == "Western Europe"]

In [None]:
df_terr_year_count = df_terr_eu.groupby(["year"]).size()
fig = px.line(x=df_terr_year_count.index, y=df_terr_year_count.values, width=1400, height=600)
fig.layout["title"] = "Terrorist attacks in Western Europe"
fig.layout["xaxis"]["title"] = "Year"
fig.layout["yaxis"]["title"] = "Total attacks per year"
fig.update_layout(showlegend=False, font_size=22)
fig.show()

Terrorism in Western Europe had two main peaks: in the late 70s and mid-90s.

### Attacks in Europe

In [None]:
# Animated map of all attacks, one frame per year, initially centered on Europe
worldPlot = px.scatter_mapbox(df_terr, lat="latitude", lon="longitude", zoom=3,
                              hover_name="city", hover_data=["gname"],
                              labels={"attacktype1_txt": "attack type"},
                              color="attacktype1_txt", animation_frame="year",
                              center={'lon': 8, 'lat': 48}, height=800, width=1400)
worldPlot.update_layout(mapbox_style="open-street-map")
worldPlot.update_layout(title="Attack types and locations change over the years", font_size=22)
worldPlot.show()

The map shows several terrorist attacks in Italy during the 1970s.
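
The observation from the map can also be checked by counting rows for one country and decade. A minimal sketch on invented rows follows; the `toy` frame is hypothetical, and only the column names `year` and `country_txt` match the notebook.

```python
import pandas as pd

# Hypothetical attacks (made-up rows).
toy = pd.DataFrame({
    "year": [1972, 1978, 1985, 1976, 1979],
    "country_txt": ["Italy", "Italy", "Italy", "France", "Italy"],
})

# Select attacks in Italy between 1970 and 1979 (both ends inclusive).
mask = (toy["country_txt"] == "Italy") & toy["year"].between(1970, 1979)
italy_70s = toy[mask]

# Number of attacks per year within the selection.
per_year = italy_70s.groupby("year").size()

print(len(italy_70s))
print(per_year)
```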

<center>
    <h1>Thanks for your attention.</h1><br/>
    <h2>Questions?</h2>
</center>