<h2><center><b><i>Cluster bomb</i></b>: Uncovering Patterns in Terrorist Group Beliefs and Attacks</center></h2>

#### **COM-480: Data Visualization**

**Team**: Alexander Sternfeld, Silvia Romanato & Antoine Bonnet

**Dataset**: [Global Terrorism Database (GTD)](https://www.start.umd.edu/gtd/) 

**Additional dataset**: [Profiles of Perpetrators of Terrorism in the United States (PPTUS)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl%3A1902.1/17702)

## **Terrorist groups**

 

In [1]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from load_data import *

pd.set_option('display.max_columns', None)

GTD = load_GTD()
PPTUS_data, PPTUS_sources = load_PPTUS()


GTD pickle file found, loading...
PPTUS pickle files found, loading...


### Ideologies

In [2]:
# Rename some of the columns
PPTUS_data.rename(columns={'DOM_I': 'dominant_ideology', 'I_ETHNO': 'ethno_nationalist',  'I_REL': 'religious', 'I_RACE':  'racist',
                            'I_LEFT': 'extreme_left', 'I_RIGHT':  'extreme_right', 'G_POL_1':  'politic_reasons', 'G_SOC_1':  'social_reasons',
                            'G_ECO_1': 'economic_reasons', 'G_REL_1':  'religious_reasons'}, inplace=True)


In [3]:
# Print number of rows where gname is not "Unknown"
print("-"*30, "GTD", "-"*30)
print("Number of attacks where group name is not Unknown: ", len(GTD[GTD.gname != "Unknown"]))
print("Number of distinct groups in GTD:", len(GTD.gname.unique())-1)

print("")
print("-"*30, "PPTUS", "-"*30)
known_groups = PPTUS_data['ORGNAME'].unique()

print("Number of known groups: ", len(known_groups))
print("Number of attacks executed by a known group: ", len(GTD[GTD.gname.isin(known_groups)]))

------------------------------ GTD ------------------------------
Number of attacks where group name is not Unknown:  120991
Number of distinct groups in GTD: 3766

------------------------------ PPTUS ------------------------------
Number of known groups:  145
Number of attacks executed by a known group:  7131
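
The match between the two datasets relies on exact equality of group names, so a PPTUS group spelled differently in the GTD would silently drop out of the count. A minimal sketch of how the unmatched names could be listed, using made-up names in place of the real data (`gtd_names` and the toy group names are hypothetical):

```python
import pandas as pd

# Hypothetical stand-ins for GTD.gname and PPTUS_data['ORGNAME'].unique()
gtd_names = pd.Series(["Group A", "Group A", "Group B", "Unknown"])
known_groups = pd.Series(["Group A", "Group B", "Group C"]).unique()

# PPTUS names that never appear as a GTD gname (exact string comparison)
unmatched = sorted(set(known_groups) - set(gtd_names))
print(unmatched)
```

Any names listed this way would be candidates for manual harmonisation before filtering the GTD.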


In [4]:
# Retain only attacks executed by a known group
GTD_gattacks = GTD[GTD.gname.isin(known_groups)]

# Merge GTD_gattacks with PPTUS_data
GTD_gattacks = pd.merge(GTD_gattacks, PPTUS_data, left_on='gname', right_on='ORGNAME', how='left')
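
The left merge above attaches one PPTUS profile to every GTD attack, which preserves the number of attacks only if each `ORGNAME` appears once in `PPTUS_data`. A minimal sketch of how that could be checked, using small made-up frames in place of the real data (the `attacks` and `profiles` frames and their values are illustrative, not part of the notebook):

```python
import pandas as pd

# Toy stand-ins for GTD_gattacks and PPTUS_data (illustrative values only)
attacks = pd.DataFrame({"eventid": [1, 2, 3],
                        "gname": ["Group A", "Group A", "Group B"]})
profiles = pd.DataFrame({"ORGNAME": ["Group A", "Group B"],
                         "dominant_ideology": [3, 4]})

# validate="many_to_one" raises pandas.errors.MergeError if ORGNAME is duplicated;
# indicator=True adds a "_merge" column showing where each row came from
merged = pd.merge(attacks, profiles, left_on="gname", right_on="ORGNAME",
                  how="left", validate="many_to_one", indicator=True)

# With unique ORGNAME values the merge keeps exactly one row per attack
rows_preserved = len(merged) == len(attacks)
all_matched = (merged["_merge"] == "both").all()
print(rows_preserved, all_matched)
```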

In [5]:
# Build a dataframe with one row per dominant ideology and the number of attacks as a column
ideologies = {1: "Extreme Right Wing", 2: "Extreme Left Wing",
              3: "Religious", 4: "Ethno-nationalist/separatist", 5: "Single issue"}

# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
ideology_counts = pd.DataFrame(
    [{'ideology': name, 'count': len(GTD_gattacks[GTD_gattacks.dominant_ideology == code])}
     for code, name in ideologies.items()])

ideology_counts.set_index('ideology', inplace=True)

ideology_counts

| ideology | count |
|---|---|
| Extreme Right Wing | 61 |
| Extreme Left Wing | 430 |
| Religious | 2722 |
| Ethno-nationalist/separatist | 3495 |
| Single issue | 415 |
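
The per-ideology tally can also be expressed with `value_counts` instead of an explicit loop. A minimal sketch on made-up ideology codes (the `toy` frame is hypothetical; the code-to-label mapping mirrors the notebook's):

```python
import pandas as pd

ideologies = {1: "Extreme Right Wing", 2: "Extreme Left Wing", 3: "Religious",
              4: "Ethno-nationalist/separatist", 5: "Single issue"}

# Hypothetical stand-in for GTD_gattacks with one ideology code per attack
toy = pd.DataFrame({"dominant_ideology": [3, 4, 4, 2, 3, 4]})

# Count attacks per code, keep every code (missing ones become 0), then label them
counts = (toy["dominant_ideology"]
          .value_counts()
          .reindex(list(ideologies), fill_value=0)
          .rename(index=ideologies))
print(counts)
```

The `reindex` step keeps ideologies with no attacks in the table, which the plain `value_counts` result would omit.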


In [6]:
religious_groups = GTD_gattacks[GTD_gattacks.dominant_ideology == 3]
ethno_nat_groups = GTD_gattacks[GTD_gattacks.dominant_ideology == 4]
extreme_left_groups = GTD_gattacks[GTD_gattacks.dominant_ideology == 2]

# Print the top 5 groups with the most attacks for each of the three ideologies
print("-"*30, "Religious groups", "-"*30)
print(religious_groups.groupby('gname').size().sort_values(ascending=False).head(5))

print("")

print("-"*30, "Ethno-nationalist/separatist groups", "-"*30)
print(ethno_nat_groups.groupby('gname').size().sort_values(ascending=False).head(5))

print("")

print("-"*30, "Extreme left wing groups", "-"*30)
print(extreme_left_groups.groupby('gname').size().sort_values(ascending=False).head(5))


------------------------------ Religious groups ------------------------------
gname
Tehrik-i-Taliban Pakistan (TTP)                  1509
Al-Qaida in the Arabian Peninsula (AQAP)         1121
Al-Qaida                                           74
Jamaat-al-Fuqra                                     5
Covenant, Sword and the Arm of the Lord (CSA)       4
dtype: int64

------------------------------ Ethno-nationalist/separatist groups ------------------------------
gname
Irish Republican Army (IRA)                           2670
Armenian Secret Army for the Liberation of Armenia     186
Black September                                        120
Fuerzas Armadas de Liberacion Nacional (FALN)          120
Jewish Defense League (JDL)                             82
dtype: int64

------------------------------ Extreme left wing groups ------------------------------
gname
Mujahedin-e Khalq (MEK)              112
New World Liberation Front (NWLF)     86
Weather Underground, Weathermen       46
Blac

In [7]:
# Print the top 20 groups with the most attacks in GTD
print("-"*30, "Top 20 groups with most attacks in GTD", "-"*30)
print(GTD[GTD.gname != "Unknown"].groupby('gname').size().sort_values(ascending=False).head(20))

------------------------------ Top 20 groups with most attacks in GTD ------------------------------
gname
Taliban                                             12936
Islamic State of Iraq and the Levant (ISIL)          7479
Shining Path (SL)                                    4567
Al-Shabaab                                           4547
Houthi extremists (Ansar Allah)                      3516
Boko Haram                                           3459
New People's Army (NPA)                              3441
Farabundo Marti National Liberation Front (FMLN)     3351
Irish Republican Army (IRA)                          2670
Kurdistan Workers' Party (PKK)                       2612
Revolutionary Armed Forces of Colombia (FARC)        2490
Maoists                                              2164
Communist Party of India - Maoist (CPI-Maoist)       2113
Basque Fatherland and Freedom (ETA)                  2024
National Liberation Army of Colombia (ELN)           1842
Liberation Tigers of Ta

## Group analysis

In [8]:
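# Get the groups of interest
# Note: despite its name, taliban_attacks holds attacks by Tehrik-i-Taliban Pakistan (TTP),
# which the GTD lists separately from the group named "Taliban"
taliban_attacks = GTD_gattacks[GTD_gattacks.gname ==
                               "Tehrik-i-Taliban Pakistan (TTP)"]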
mek_attacks = GTD_gattacks[GTD_gattacks.gname == "Mujahedin-e Khalq (MEK)"]
IRA_attacks = GTD_gattacks[GTD_gattacks.gname == "Irish Republican Army (IRA)"]

#### Differences in targets

In [10]:
# Import Plotly
import plotly.graph_objects as go
import plotly.express as px

# Make an interactive stacked bar chart of the target types ('targtype1_txt') for the groups of interest
# Use percentages instead of absolute numbers
# Categories: "Military", "Government (General)", "Police", "Private Citizens & Property", "Business" and "Other"
# Note: "Other" matches only rows whose targtype1_txt is literally "Other"; target types
# outside this list are not counted, so a bar can sum to less than 100%

types = ['Military', 'Government (General)', 'Police', 'Private Citizens & Property', 'Business', 'Other']
# Collect one row per (group, target type); DataFrame.append was removed in pandas 2.0
rows = []
for group in [taliban_attacks, mek_attacks, IRA_attacks]:
    # target_type avoids shadowing the built-in type()
    for target_type in types:
        rows.append({'target_type': target_type,
                     'percentage': len(group[group.targtype1_txt == target_type]) / len(group) * 100,
                     'group': group.gname.iloc[0]})
target_types = pd.DataFrame(rows, columns=['target_type', 'percentage', 'group'])

# Make a stacked bar chart, each bar represents a group, the different colors represent the different target types
fig = px.bar(target_types, x="group", y="percentage", color="target_type", barmode="stack")
fig.show()

# Save the figure as a html file to display on a website, save in the folder 'plots'
fig.write_html("../plots/group_target_types.html")
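
Because the loop above counts only the listed categories, attacks on any other target type drop out of the bars. One possible variant, sketched on made-up data, relabels everything outside the named categories as "Other" so that each group's percentages sum to 100 (the `toy` frame, the `target_group` column, and the group names are hypothetical):

```python
import pandas as pd

named = ["Military", "Government (General)", "Police",
         "Private Citizens & Property", "Business"]

# Hypothetical stand-in for the attacks of the groups of interest
toy = pd.DataFrame({
    "gname": ["G1", "G1", "G1", "G1", "G2", "G2"],
    "targtype1_txt": ["Military", "Police", "Utilities", "Military",
                      "Business", "Transportation"],
})

# Keep the named categories and relabel every other target type as "Other"
toy["target_group"] = toy["targtype1_txt"].where(
    toy["targtype1_txt"].isin(named), "Other")

# Row-normalised cross-tabulation: share of each category within a group, in percent
shares = pd.crosstab(toy["gname"], toy["target_group"], normalize="index") * 100

# Long format (one row per group and category), the shape px.bar expects
target_types = shares.reset_index().melt(
    id_vars="gname", var_name="target_type", value_name="percentage")
print(target_types)
```

The resulting long-format frame could be passed to `px.bar` with `x="gname"`, `y="percentage"` and `color="target_type"`, in the same way as the cell above.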


#### Differences in

### Ideas for plots

* For each group, show how the number of attacks changed over time (ideally on a map?)
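
As a starting point for the first idea, the yearly attack counts per group could be tabulated before any plotting. A minimal sketch on made-up rows, assuming the loaded GTD frame keeps the database's `iyear` column for the attack year (the `toy` frame and group names are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for GTD_gattacks with a group name and attack year per row
toy = pd.DataFrame({
    "gname": ["G1", "G1", "G1", "G2", "G2"],
    "iyear": [1990, 1990, 1991, 1990, 1992],
})

# Number of attacks per (year, group); size() counts the rows in each combination,
# and unstack turns the groups into columns, filling missing years with 0
attacks_per_year = (toy.groupby(["iyear", "gname"])
                    .size()
                    .unstack(fill_value=0))
print(attacks_per_year)
```

Each column is then a yearly time series for one group and could be drawn as a line chart, for instance with `attacks_per_year.plot()`.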
