<a name='toc'></a>
#<font color=#F46767><b> 💣 Exploring Terror in Europe</b> (2000 - 2020) 🔥</font>

> _"The GTD--Global Terrorism Database-- defines a terrorist attack as the threatened or actual use of illegal force and violence by a nonstate actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation."_

<br>

## &#9889; _General_<br>

  - <font color='darkturquoise'>Scope</font>:
  Analyze terrorist incidents in Europe from January 1, 2000 through December 31, 2020 using the GTD dataset.

  - <font color='darkturquoise'>Objective</font>:
    - `Primary Q0`: _"How did the nature, lethality, and modus operandi of terrorist activity evolve across Europe from 2000 to 2020?"_
    - `Secondary Q1`: _"Which perpetrator groups changed their preferred attack types or target profiles over the two decades, and when did those shifts occur?"_
    - `Secondary Q2`: _"What country-to-country “spill-over” patterns exist?"_


<br>

##<img src="https://img.icons8.com/?size=100&id=yTvVS6whPDpp&format=png&color=000000" width="30" height="30"/> Main File Structure

&emsp;&emsp;
<img src="https://img.icons8.com/?size=100&id=43817&format=png&color=FCC419" width='25' height='25'/> gtd_project/<br>
&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=43817&format=png&color=FCC419" width='25' height='25'/> source_files/<br>
&emsp;&emsp;&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=13425&format=png&color=000000" width='25' height='25'/> globalterrorismdb_1970_2020.xlsx<br>
&emsp;&emsp;&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=13395&format=png&color=000000" width='25' height='25'/> included_features.txt<br>
&emsp;&emsp;&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=21Ss4P1afplY&format=png&color=000000" width='25' height='25'/> european_countries.csv<br>
&emsp;&emsp;&emsp;&emsp;
&#x2514;&#x2500; <img src="https://img.icons8.com/?size=100&id=21Ss4P1afplY&format=png&color=000000" width='25' height='25'/> countries_iso_codes.csv<br>
&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=43817&format=png&color=FCC419" width='25' height='25'/> notebooks/<br>
&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=13441&format=png&color=000000" width='25' height='25'/> gtd_util/<br>
&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=43817&format=png&color=FCC419" width='25' height='25'/> etl_outputs/<br>
&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=43817&format=png&color=FCC419" width='25' height='25'/> images/<br>
&emsp;&emsp;
&#x251C;&#x2500; <img src="https://img.icons8.com/?size=100&id=43817&format=png&color=FCC419" width='25' height='25'/> processed_data/<br>
&emsp;&emsp;
&#x2514;&#x2500; <img src="https://img.icons8.com/?size=100&id=VUckOuTyLQ7W&format=png&color=19B1FC" width='25' height='25'/> README.md<br>
<br>

###<img src="https://img.icons8.com/?size=100&id=XOQ8AO4LZthX&format=png&color=000000" width="30" height="30"/>References
1. START (National Consortium for the Study of Terrorism and Responses to Terrorism). (2022). Global Terrorism Database, 1970 - 2020. <a href="https://www.start.umd.edu/gtd">Dataset</a>&emsp;<a href="https://www.start.umd.edu/sites/default/files/2024-10/Codebook.pdf">Codebook</a>&emsp;<a href="https://www.start.umd.edu/gtd-terms">TermsOfUse</a>
2. Icons by [icons8](https://icons8.com)

## *Libraries, General Settings & Aux Functions*

In [None]:
import warnings
import numpy as np
import pandas as pd
from datetime import datetime

In [None]:
# defining the project directory to work with

PROJECT_PATH = '/content/drive/MyDrive/pf_pjs/gtd_project'

%cd $PROJECT_PATH

/content/drive/MyDrive/pf_pjs/gtd_project


In [None]:
# importing custom Python package implementing the ETL process

from gtd_util.etl_process import run_etl

In [None]:
# pandas global setting
pd.options.display.max_columns = None

# ignore warnings
warnings.filterwarnings('ignore')

<a name='etl'></a>
##<img src="https://img.icons8.com/?size=100&id=y9cF2cRMjkU6&format=png&color=06A9E7" width="25" height="25"/> <font color=orange><b>ETL</b> (Extract - Transform - Load)</font>
<br>

The ETL process runs in three stages:
1. <font color=ff5733><b>Extract</b></font>: In this stage, we'll perform the following actions:
    - Retrieve European countries to process (.csv file).
    - Retrieve the ISO-3 country codes to process (.csv file).
    - Retrieve features (.txt file).
    - Retrieve and filter the GTD data (.xlsx file).
    
2. <font color=33ffd4><b>Transform</b></font>: In this stage, we'll preprocess the retrieved data, as follows:
    - Alter country names where needed for readability and consistency.
    - Change data types where needed.
    - Create new columns if necessary.
    - Filter our data to include only European countries' data and merge our sources.
    - Handle missing values.
3. <font color=d18bcc><b>Load</b></font>: In this stage, the transformed data will be exported to a .pkl file for the analysis phase.
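
The three stages above can be sketched on toy data. This is only an illustration, not the actual `gtd_util.etl_process` code: the in-memory CSV strings, the toy records, the helper names `extract`, `transform`, and `load`, and the choice to fill missing fatality counts with `0` are all assumptions. The raw column names (`iyear`, `imonth`, `iday`, `country_txt`, `nkill`) follow the GTD codebook.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the source files (toy data, not real GTD records).
EUROPE_CSV = "country\nSpain\nGreece\n"
ISO_CSV = "country,alpha2,alpha3\nSpain,ES,ESP\nGreece,GR,GRC\nJapan,JP,JPN\n"
RAW = pd.DataFrame({
    "iyear": [1999, 2000, 2010, 2020],
    "imonth": [5, 1, 6, 12],
    "iday": [3, 1, 15, 31],
    "country_txt": ["Spain", "Spain", "Japan", "Greece"],
    "nkill": [0.0, None, 2.0, 0.0],
})


def extract(raw, start=2000, end=2020):
    """Read the lookup tables and keep only events inside the year range."""
    europe = pd.read_csv(io.StringIO(EUROPE_CSV))
    iso = pd.read_csv(io.StringIO(ISO_CSV))
    events = raw[raw["iyear"].between(start, end)]
    return events, europe, iso


def transform(events, europe, iso):
    """Filter to European countries, merge ISO codes, fix types, fill gaps."""
    df = events[events["country_txt"].isin(europe["country"])].copy()
    df = df.merge(iso, left_on="country_txt", right_on="country", how="left")
    # Build a proper datetime column from the three integer date parts.
    parts = df[["iyear", "imonth", "iday"]].rename(
        columns={"iyear": "year", "imonth": "month", "iday": "day"}
    )
    df["date"] = pd.to_datetime(parts)
    df["quarter"] = df["date"].dt.quarter.astype("Int8")
    # Assumption: a missing fatality count is treated as zero.
    df["fatalities_total"] = df["nkill"].fillna(0.0)
    df["country"] = df["country"].astype("category")
    cols = ["date", "quarter", "country", "alpha2", "alpha3", "fatalities_total"]
    return df[cols].reset_index(drop=True)


def load(df, path):
    """Persist the transformed frame for the analysis phase."""
    df.to_pickle(path)


events, europe, iso = extract(RAW)
gtd_toy = transform(events, europe, iso)

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "gtd_toy.pkl")
    load(gtd_toy, out_path)
    restored = pd.read_pickle(out_path)

print(restored)
```

Pickle is a convenient target here because, unlike CSV, it preserves the `category` and nullable integer dtypes set during the transform stage.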


In [None]:
run_etl()

Data extraction, Finished!
	-Initial GTD memory usage: 170.69 MB
	-Extracted 139872 GTD records from year 2000 to 2020
Data transformation, Finished!
	-Final GTD memory usage: 3.44 MB
	-Transformed 10046 GTD records from year 2000 to 2020
Data Loading, Finished!

ETL Process, Complete!

-ETL Duration: 0hrs - 4mins - 48.69secs
-Percentage difference in memory usage: -97.98%
-The shape of the dataframe is: (10046, 35)
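
The memory figures in the log can be related to each other with a simple relative-change formula. The sketch below applies it to the reported values and to a toy frame whose columns are converted to compact dtypes. How `run_etl` actually measures memory is not shown in this notebook, so the helpers `memory_mb` and `pct_change` are assumptions, and most of the real reduction presumably comes from dropping rows and columns rather than from dtype changes alone.

```python
import numpy as np
import pandas as pd


def memory_mb(df):
    """Total frame memory in MB, counting the contents of object columns."""
    return df.memory_usage(deep=True).sum() / 1024**2


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Toy frame: 64-bit integers and plain strings, as pandas would load them.
rng = np.random.default_rng(0)
n = 50_000
toy = pd.DataFrame({
    "year": rng.integers(2000, 2021, n),
    "country": rng.choice(["Spain", "Greece", "Russia"], n),
})
before = memory_mb(toy)

# Compact dtypes: nullable 16-bit integers and categoricals.
slim = toy.assign(
    year=toy["year"].astype("Int16"),
    country=toy["country"].astype("category"),
)
after = memory_mb(slim)
print(f"toy frame: {before:.2f} MB -> {after:.2f} MB "
      f"({pct_change(before, after):.2f}%)")

# The same formula applied to the figures reported in the ETL log.
reported = round(pct_change(170.69, 3.44), 2)
print(reported)  # -97.98
```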


In [None]:
gtd = pd.read_pickle('etl_outputs/gtd_final.pkl')

In [None]:
gtd.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10046 entries, 0 to 10045
Data columns (total 35 columns):
 #   Column                 Non-Null Count  Dtype         
---  ------                 --------------  -----         
 0   id                     10046 non-null  int64         
 1   date                   10046 non-null  datetime64[ns]
 2   five_year              10046 non-null  category      
 3   quarter                10046 non-null  Int8          
 4   year                   10046 non-null  Int16         
 5   month                  10046 non-null  Int16         
 6   month_name             10046 non-null  object        
 7   day                    10046 non-null  Int16         
 8   region                 10046 non-null  category      
 9   country                10046 non-null  category      
 10  alpha2                 10046 non-null  category      
 11  alpha3                 10046 non-null  category      
 12  province_state         10046 non-null  category      
 13  c
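
The derived calendar columns listed in the `info()` output (`five_year`, `quarter`, `year`, `month`, `month_name`) could be built from `date` roughly as follows. This is a sketch under assumptions: only the labels `2000-2005` and `2016-2020` appear in this notebook, so the two middle bin labels and the use of `pd.cut` are guesses, not the confirmed ETL logic.

```python
import pandas as pd

# Toy dates covering each assumed period.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2000-01-01", "2005-12-31", "2006-01-01", "2014-05-19", "2020-12-31"]
    )
})

# Calendar parts, stored in small nullable integer dtypes.
df["year"] = df["date"].dt.year.astype("Int16")
df["month"] = df["date"].dt.month.astype("Int16")
df["quarter"] = df["date"].dt.quarter.astype("Int8")
df["month_name"] = df["date"].dt.strftime("%b")  # abbreviated, locale-dependent

# Assumed period edges: (1999, 2005], (2005, 2010], (2010, 2015], (2015, 2020].
bins = [1999, 2005, 2010, 2015, 2020]
labels = ["2000-2005", "2006-2010", "2011-2015", "2016-2020"]
df["five_year"] = pd.cut(df["year"].astype("int64"), bins=bins, labels=labels)

print(df)
```

`pd.cut` returns a categorical column, which matches the `category` dtype reported for `five_year`.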

In [None]:
gtd.head(5)

Unnamed: 0,id,date,five_year,quarter,year,month,month_name,day,region,country,alpha2,alpha3,province_state,city,lat,lon,is_success,is_suicide,is_property_damaged,terr_group,is_claimed,attack_type,weapon_type,weapon_subtype,target_type,target_subtype,target_nationality,fatalities_total,fatalities_terrorists,wounded_total,wounded_terrorists,is_hostage,hostages_total,hostage_duration,is_ransom
0,200001010004,2000-01-01,2000-2005,1,2000,1,Jan,1,Southern,Kosovo,XK,XKX,Kosovo,Peje,42.659809,20.307119,1,0,UKN,UKN,0.0,Bombing,Explosives,UKN,Civilian,Residence,Kazakhstan,0.0,0.0,1.0,0.0,0.0,0,0.0,UKN
1,200001010008,2000-01-01,2000-2005,1,2000,1,Jan,1,Southern,Kosovo,XK,XKX,Kosovo,Gorazhdec,42.640556,20.369722,0,0,0,UKN,0.0,Armed Assault,Firearms,UKN,Civilian,Race_Ethnicity,Serbia-Montenegro,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN
2,200001010009,2000-01-01,2000-2005,1,2000,1,Jan,1,Eastern,Turkiye,TR,TUR,Istanbul,Istanbul,41.106178,28.689863,1,0,1,UKN,0.0,Bombing,Explosives,Pipe Bomb,Civilian,Retail_Facility,Turkey,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN
3,200001010010,2000-01-01,2000-2005,1,2000,1,Jan,1,Southern,Spain,ES,ESP,Basque Country,Galdacano,43.230556,-2.845833,1,0,1,UKN,0.0,Armed Assault,Incendiary,Gasoline/Alcohol,Security,Military_Facility,Spain,0.0,0.0,1.0,0.0,0.0,0,0.0,UKN
4,200001010011,2000-01-01,2000-2005,1,2000,1,Jan,1,Southern,Spain,ES,ESP,Basque Country,Guernica,43.317073,-2.678975,1,0,1,UKN,0.0,Armed Assault,Incendiary,Gasoline/Alcohol,Civilian,Finance_Facility,Spain,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN


In [None]:
gtd.tail(5)

Unnamed: 0,id,date,five_year,quarter,year,month,month_name,day,region,country,alpha2,alpha3,province_state,city,lat,lon,is_success,is_suicide,is_property_damaged,terr_group,is_claimed,attack_type,weapon_type,weapon_subtype,target_type,target_subtype,target_nationality,fatalities_total,fatalities_terrorists,wounded_total,wounded_terrorists,is_hostage,hostages_total,hostage_duration,is_ransom
10041,202012280017,2020-12-28,2016-2020,4,2020,12,Dec,28,Eastern,Russia,RU,RUS,Chechnya,Grozny,43.320228,45.654493,1,0,0,Caucasus Province of the Islamic State,1.0,Armed Assault,Melee,Sharp Object,Security,Police_Personnel,Russia,3.0,2.0,1.0,0.0,0.0,0,0.0,UKN
10042,202012280018,2020-12-28,2016-2020,4,2020,12,Dec,28,Southern,Cyprus,CY,CYP,Nicosia,Strovolos,35.127271,33.323645,1,0,UKN,UKN,0.0,Bombing,Explosives,UKN,Civilian,Medical_Facility,Cyprus,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN
10043,202012300012,2020-12-30,2016-2020,4,2020,12,Dec,30,Eastern,Russia,RU,RUS,Krasnodar,Starominskaya,46.533307,39.040516,0,0,0,UKN,0.0,Bombing,Explosives,UKN,Transport,Rail_System,Russia,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN
10044,202012310010,2020-12-31,2016-2020,4,2020,12,Dec,31,Southern,Greece,GR,GRC,Attica,Athens,37.997492,23.762727,1,0,1,UKN,0.0,Bombing,Explosives,UKN,Security,Police_Facility,Greece,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN
10045,202012310017,2020-12-31,2016-2020,4,2020,12,Dec,31,Western,Germany,DE,DEU,Lower Saxony,Leipzig,51.342239,12.374772,1,0,1,Left-wing extremists,1.0,Infrastructure Attack,Incendiary,Arson/Fire,Security,Military_Vehicle,Germany,0.0,0.0,0.0,0.0,0.0,0,0.0,UKN


In [None]:
# descriptive stats
gtd.describe().T[['count', 'min', 'max', 'mean', 'std', '50%']]

Unnamed: 0,count,min,max,mean,std,50%
id,10046.0,200001010004.0,202012310017.0,201172085997.854462,564622564.102012,201405190047.5
date,10046.0,2000-01-01 00:00:00,2020-12-31 00:00:00,2012-02-25 08:06:47.087397888,,2014-05-19 00:00:00
quarter,10046.0,1.0,4.0,2.518813,1.07896,3.0
year,10046.0,2000.0,2020.0,2011.65439,5.646286,2014.0
month,10046.0,1.0,12.0,6.499303,3.30936,7.0
day,10046.0,1.0,31.0,15.527673,8.747425,15.0
lat,10046.0,34.672216,67.143672,45.347288,6.044082,43.320698
lon,10046.0,-21.89521,158.383333,25.754414,19.461907,34.637833
