# **Project Name**    -



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Author**    - Allan Cheerakunnil Alex

# **Project Summary -**

This project embarks on a comprehensive exploration of the Global Terrorism Database (GTD) using advanced data analysis and visualization techniques. Leveraging Python libraries such as Pandas for robust data manipulation, NumPy for efficient numerical operations, and Matplotlib/Seaborn for compelling visualizations, the analysis aims to uncover intricate trends and patterns in terrorism-related activities spanning from 1970 to 2017. With over 180,000 recorded incidents, the project employs a diverse set of visualizations, including bar plots, scatter plots, histograms, and heatmaps, to elucidate relationships between variables and present a vivid picture of the dataset's characteristics.

The investigation delves into critical dimensions such as attack frequency, targeted countries, methods employed, weapon types, casualties, and the evolution of terrorist organizations. By unraveling these facets, the project seeks to provide a nuanced understanding of global terrorism trends over the decades. The derived insights are poised to inform counter-terrorism strategies and policies, offering valuable perspectives on the characteristics of regions prone to attacks and the underlying factors contributing to their vulnerability. Through this exploration, the project contributes to a more informed and data-driven approach to addressing the complex challenges posed by global terrorism.

In conclusion, this project delves into terrorism's intricate patterns within the extensive Global Terrorism Database (GTD), using data manipulation, numerical computation, and visualizations to extract crucial insights that can significantly aid counter-terrorism efforts and guide future research in the field.

# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


### What are the key security issues and strategic insights that can be derived through an in-depth Exploratory Data Analysis (EDA) of the Global Terrorism Database (GTD)?

**Through a comprehensive Exploratory Data Analysis (EDA) of the Global Terrorism Database (GTD), the objective is to pinpoint the hot zones of terrorism globally and discern evolving patterns of terrorist activities. By analyzing geographical distribution, frequency, and severity of incidents, this study aims to uncover insights that can inform the development of effective counter-terrorism strategies. The analysis will delve into identifying regions with the highest incidence of terrorist activities, understanding the changing tactics and methods employed by terrorist groups, and discerning potential correlations with socio-political factors. The derived insights will be instrumental in shaping security policies, allowing for proactive measures to mitigate the impact of terrorism, allocate resources strategically, and enhance international collaboration to address specific security challenges in identified hot zones.**



#### **Define Your Business Objective?**

The primary objective of this project is to harness the wealth of information within the Global Terrorism Database (GTD) through a thorough exploratory data analysis (EDA), aiming to derive actionable insights into terrorist activities worldwide from 1970 to 2017. The overarching goal is to facilitate informed decision-making for security analysts, policy-makers, and counter-terrorism agencies.

The specific objectives include:

1. **Identification of Global Hot Zones:** Through EDA, pinpointing regions with the highest concentration of terrorist activities to guide the optimal allocation of resources for preventing future attacks.

2. **Analysis of Attack Frequency and Intensity:** Examining the evolution of attack frequency and intensity over time to uncover trends and changing dynamics, allowing for more accurate risk assessments.

3. **Examination of Attack Methodologies and Weapons Used:** Delving into the methodologies and weapons employed in terrorist attacks to gain insights into operational preferences, potentially providing early indicators of future threats.

4. **Assessment of Casualty Trends:** Analyzing patterns related to casualties to identify the most devastating types of attacks and inform targeted response planning to minimize human loss.

5. **Unveiling Patterns Related to Terrorist Organizations:** Exploring patterns associated with various terrorist organizations to aid in understanding their strategies, supporting intelligence agencies in effective counter-terrorism efforts.

This project seeks to contribute valuable knowledge that goes beyond identifying patterns to provide practical, data-driven insights for enhancing global counter-terrorism efforts.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that each chart follows the format below.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive/')

### Dataset First View

In [None]:
# Dataset First Look
data_path = "/content/drive/MyDrive/AlmaBetter Projects/Python project 2/"

# Loading the Global Terrorism Dataset
data = pd.read_csv(data_path + 'Global Terrorism Data.csv', encoding='latin-1')
data.head()


### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
rows, cols = data.shape
print(f'There are {rows} rows and {cols} columns in the dataset.')

### Dataset Information

In [None]:
# Dataset Info
data.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_rows = data.duplicated().sum()

print(f'There are {duplicate_rows} duplicate_rows in the dataset')

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
missing_values = data.isnull().sum()
print(missing_values)

In [None]:
# Visualize the missing values with a seaborn heatmap
# (a missingno matrix would be an alternative)
plt.figure(figsize=(10, 6))
sns.heatmap(data.isnull(), cbar=False, cmap='viridis')
plt.title('Missing Values Heatmap')
plt.show()

### What did you know about your dataset?

In [None]:
# Dataset Size
print("Dataset Size:", len(data))

# Feature Quantity
print("Number of Features:", len(data.columns))

# Data Types
print("Data Types:")
print(data.dtypes.value_counts())

# Memory Usage
print("Memory Usage:")
# info() prints its report directly and returns None, so it is not wrapped in print()
data.info(memory_usage='deep')

# Missing Values
print("Missing Values:")
missing_values = data.isnull().sum()
print(missing_values[missing_values > 0].sort_values(ascending=False))

- **Dataset Size:** The dataset is quite large, containing 181,691 entries or rows.

- **Feature Quantity:** The dataset contains 135 features or columns.

- **Data Types:** The dataset has a mix of data types. There are 55 features with floating-point numbers (float64), 22 features with integers (int64), and 58 features with objects (object). The object datatype in pandas typically means the column contains string (text) data.

- **Memory Usage:** The dataset uses over 626.8 MB of memory.

- **Missing Values:** There are some columns with a large number of missing values. For example, the 'gsubname3' column has 181,671 missing values, 'weapsubtype4' and 'weapsubtype4_txt' columns have 181,621 missing values each, 'weaptype4' and 'weaptype4_txt' columns have 181,618 missing values each. However, several columns do not have any missing values, such as 'eventid', 'iyear', 'imonth', 'iday', 'INT_LOG', 'INT_IDEO', 'INT_MISC', and 'INT_ANY'. There are also columns like 'guncertain1' with 380 missing values, 'ishostkid' with 178 missing values, 'specificity' with 6 missing values, 'doubtterr' and 'multiple' with 1 missing value each.
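
The raw counts above are easier to compare when expressed as a share of all rows. The following is a minimal sketch of that idea, assuming a hypothetical helper named `missing_summary` and a small made-up frame in place of the real dataset; only the column names are borrowed from the GTD.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return count and percentage of missing values per column (hypothetical helper)."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing_count": counts,
        "missing_pct": (counts / len(df) * 100).round(2),
    })
    # Keep only columns that actually have gaps, worst first
    summary = summary[summary["missing_count"] > 0]
    return summary.sort_values("missing_count", ascending=False)


# Toy frame for illustration only: the values are made up
toy = pd.DataFrame({
    "eventid": [1, 2, 3, 4],
    "gsubname3": [np.nan, np.nan, np.nan, "x"],
    "ishostkid": [0.0, 1.0, np.nan, 0.0],
})
toy_missing = missing_summary(toy)
print(toy_missing)
```

In the notebook, the same helper could be called as `missing_summary(data)` to see which columns are almost entirely empty and are therefore poor candidates for analysis.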

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
columns = data.columns

print("Columns in the dataset")

for column in columns:
  print(column)

In [None]:
# Dataset Describe
summary = data.describe()

print(summary)

### Variables Description

**eventid**: A unique identifier for each terrorist incident.

**iyear, imonth, iday**: Date components of the incident, indicating the year, month, and day, respectively.

**country, country_txt**: Numeric and textual representations of the country where the incident occurred.

**region, region_txt**: Numeric and textual representations of the region where the incident occurred.

**provstate**: The name or abbreviation of the province or state where the incident occurred.

**city**: The name of the city or location where the incident occurred.

**attacktype1, attacktype1_txt**: Numeric and textual representations of the primary method of attack.

**targtype1, targtype1_txt**: Numeric and textual representations of the primary target type.

**weaptype1, weaptype1_txt**: Numeric and textual representations of the primary weapon type used.

**nkill**: Number of confirmed kills.

**nwound**: Number of confirmed injuries.

**gname**: The name of the terrorist group responsible for the incident.

**summary**: A brief description or summary of the incident.

**motive**: The perceived motive or reason behind the terrorist incident.

**related**: Information on related incidents.

**ishostkid**: Indicates whether hostages were taken (1 if hostages taken, 0 if not).

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
unique_countries = data['country_txt'].unique()
print(unique_countries)

print()  # blank line between the two outputs

unique_year = data['iyear'].unique()
print(unique_year)

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.
print(data.isnull().sum())

In [None]:
pd.set_option('display.max_rows', None)
print(data.dtypes)

In [None]:
pd.reset_option('display.max_rows')

In [None]:
data.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','country_txt':'Country','provstate':'state','region_txt':'Region','attacktype1_txt':'AttackType','target1':'Target','nkill':'Killed','nwound':'Wounded','summary':'Summary','gname':'Group','targtype1_txt':'Target_type','weaptype1_txt':'Weapon_type','motive':'Motive'},inplace=True)

In [None]:
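# .copy() makes the subset an independent DataFrame, so adding columns later does not raise SettingWithCopyWarning
data = data[['Year','Month','Day','Country','state','Region','city','latitude','longitude','AttackType','Killed','Wounded','Target','Summary','Group','Target_type','Weapon_type','Motive']].copy()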

In [None]:
data.head()

In [None]:
print(data.dtypes)

### What all manipulations have you done and insights you found?

### The dataset has 135 columns, far more than this analysis needs. To keep the work focused, the relevant columns were renamed to more readable names, and only those 18 features were retained for the rest of the analysis.
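
The rename-then-select step can fail silently if a column name is misspelled in the rename mapping, because `rename` ignores keys it cannot find. The sketch below shows one way to guard against that. The helper `rename_and_select` is hypothetical and is not part of the original notebook, and the toy frame is made up.

```python
import pandas as pd


def rename_and_select(df, rename_map, keep):
    """Rename columns, then keep a subset, failing loudly on missing names.

    Hypothetical helper for illustration.
    """
    renamed = df.rename(columns=rename_map)
    missing = [col for col in keep if col not in renamed.columns]
    if missing:
        raise KeyError(f"Columns not found after renaming: {missing}")
    # .copy() returns an independent frame, so later column assignments
    # do not trigger pandas' SettingWithCopyWarning
    return renamed[keep].copy()


# Toy frame for illustration only
toy = pd.DataFrame({"iyear": [1970, 1971], "nkill": [1.0, 0.0], "extra": ["a", "b"]})
slim = rename_and_select(toy, {"iyear": "Year", "nkill": "Killed"}, ["Year", "Killed"])
print(slim)
```

Raising a `KeyError` with the list of missing names is a design choice made here for easier debugging; pandas would raise its own `KeyError` at the selection step anyway, but with a less targeted message.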

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code: number of attacks per year
plt.figure(figsize=(12, 6))
attacks_per_year = data['Year'].value_counts().sort_index()
sns.lineplot(x=attacks_per_year.index, y=attacks_per_year.values)
plt.title('Number of Terrorist Attacks Over the Years')
plt.xlabel('Year')
plt.ylabel('Number of Attacks')
plt.show()

##### 1. Why did you pick the specific chart?

**The line chart was chosen to represent the number of terrorist attacks over the years. A line chart is suitable for showing trends and patterns over a continuous variable, in this case, the progression of attacks over different years.**

##### 2. What is/are the insight(s) found from the chart?

**The line chart visually depicts the trend in the number of terrorist attacks over the years. It helps in identifying whether there is a significant increase, decrease, or any noticeable pattern in the frequency of attacks.**
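
The trend described above can also be quantified. The sketch below, which uses made-up years rather than the real `data['Year']` column, finds the year with the most attacks and the percentage change from one recorded year to the next.

```python
import pandas as pd

# Made-up years for illustration; in the notebook this would be data['Year']
years = pd.Series([2012, 2012, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2015])

attacks_per_year = years.value_counts().sort_index()
peak_year = attacks_per_year.idxmax()
# Percentage change relative to the previous row, i.e. the previous recorded year
yoy_change_pct = attacks_per_year.pct_change().mul(100).round(1)

print(f"Peak year: {peak_year} ({attacks_per_year.max()} attacks)")
print(yoy_change_pct)
```

Note that `pct_change` compares consecutive rows, so if a calendar year is absent from the data the change is measured against the last year that is present.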

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for businesses, governments, or organizations involved in security and risk management. Understanding the trend in terrorist attacks over the years allows for better preparation, resource allocation, and planning to address security concerns. It can contribute to the development of strategies aimed at preventing and mitigating the impact of terrorist incidents.**

**For businesses operating in regions affected by terrorism, this information can be crucial for risk assessment and business continuity planning. It may influence decisions related to security investments, insurance coverage, and overall risk management.**

**Government agencies can use this data to enhance security measures, allocate resources effectively, and develop policies to counter terrorism. The insights gained can contribute to the formulation of evidence-based counterterrorism strategies.**

#### Chart - 2

In [None]:
# Create a horizontal bar plot
plt.figure(figsize=(12, 8))
attack_type_distribution = data['AttackType'].value_counts()
sns.barplot(x=attack_type_distribution.values, y=attack_type_distribution.index, hue=attack_type_distribution.index, palette='viridis', dodge=False, orient='h')

plt.title('Distribution of Terrorist Attack Types')
plt.ylabel('Attack Type')
plt.xlabel('Number of Attacks')
plt.legend(title='Attack Type', labels=attack_type_distribution.index, bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()  # Adjust layout to prevent clipping of labels
plt.show()


##### 1. Why did you pick the specific chart?

**The horizontal bar plot was chosen to represent the distribution of terrorist attack types. This type of chart is effective for comparing the frequency of different categories (attack types) and allows for easy visualization of the most prevalent types.**

##### 2. What is/are the insight(s) found from the chart?

**The chart ranks the attack types by frequency, so the most prevalent types and the size of the gaps between them can be read directly from the bar lengths. Each bar's length relative to the others shows how much that attack type contributes to the overall number of attacks.**
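
To put numbers on the comparison, the counts can be converted to percentage shares. This is a minimal sketch on a made-up sample; in the notebook the input would be `data['AttackType']`.

```python
import pandas as pd

# Made-up sample for illustration only
attack_types = pd.Series([
    "Bombing/Explosion", "Bombing/Explosion", "Armed Assault", "Assassination",
    "Bombing/Explosion", "Armed Assault", "Hostage Taking (Kidnapping)",
    "Bombing/Explosion",
])

# normalize=True returns fractions instead of counts
share_pct = attack_types.value_counts(normalize=True).mul(100).round(1)
print(share_pct)
```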

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, security organizations, and businesses operating in regions prone to terrorism.**

**Government Agencies: Government bodies responsible for security can use this information to allocate resources effectively. It helps in understanding the most common types of attacks, enabling the development of targeted counterterrorism strategies and policies.**

**Security Organizations: Private security firms can tailor their services based on the prevalent attack types. For example, if bombings are more common, there might be an increased focus on bomb detection and prevention measures.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. For instance, businesses might implement security measures specific to the prevalent attack types to safeguard employees and assets.**

#### Chart - 3

In [None]:
# Chart - 3 visualization code: top 10 groups by number of attacks (excluding 'Unknown')
plt.figure(figsize=(15, 7))

# Get the count of each group
group_data = data[data['Group'] != 'Unknown']['Group'].value_counts().head(10)

# Use this count for plotting (horizontal bar plot with hue)
ax = sns.barplot(x=group_data.values, y=group_data.index, hue=group_data.index, palette='Set2', dodge=False)

plt.title('Top 10 terrorist groups with highest number of attacks')
plt.xlabel('Count')
plt.ylabel('Group')

# Customize the legend
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, title='Groups', loc='upper right')

plt.show()

##### 1. Why did you pick the specific chart?

**A horizontal bar plot was chosen to represent the top 10 terrorist groups with the highest number of attacks, excluding incidents attributed to 'Unknown'. This chart is effective for comparing the frequencies of different terrorist groups and understanding their relative contribution to the overall number of attacks.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides a clear visual representation of the top 10 terrorist groups and their respective frequencies of attacks. It allows for easy comparison and identification of the groups with the highest impact.**

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, security organizations, and businesses operating in regions affected by terrorism.**

**Government Agencies: Security agencies can use this information to prioritize efforts and resources in countering the activities of the most active terrorist groups. It aids in strategic planning and intelligence gathering.**

**Security Organizations: Private security firms can tailor their services based on the threat posed by specific terrorist groups. For example, heightened security measures might be implemented in areas where the most active groups operate.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. It helps in understanding the specific threats posed by certain groups and allows for the implementation of targeted security measures.**

#### Chart - 4

In [None]:
# Chart - 4 visualization code
pd.crosstab(data.Year, data.Region).plot(kind='area',figsize = (15,6))
plt.title('Terrorist Activities by Region in each Year')
plt.ylabel('Number of Attacks')
plt.show()

##### 1. Why did you pick the specific chart?

**The area plot depicting terrorist activities by region over each year was chosen. This chart is suitable for visualizing the trends and patterns of terrorist activities in different regions over time. The stacked areas provide a sense of the overall volume and the relative contribution of each region to the total.**

##### 2. What is/are the insight(s) found from the chart?

**The chart offers insights into how terrorist activities have evolved over the years across different regions. By observing the areas under each curve, one can identify regions that have consistently experienced high levels of terrorist activities and those that have seen fluctuations. It also helps in comparing the overall distribution of attacks across regions.**
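
The area plot shows absolute counts, which makes it hard to see how each region's relative share shifts when the overall volume changes. A sketch of the same cross-tabulation expressed as per-year shares follows; the frame is made up and stands in for the `Year` and `Region` columns.

```python
import pandas as pd

# Made-up records for illustration only
toy = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2001, 2001, 2001, 2001],
    "Region": ["South Asia", "South Asia", "Western Europe",
               "South Asia", "Western Europe", "Western Europe", "Western Europe"],
})

counts = pd.crosstab(toy["Year"], toy["Region"])
# normalize='index' divides each row by its total, giving per-year shares
shares = pd.crosstab(toy["Year"], toy["Region"], normalize="index").mul(100).round(1)

print(counts)
print(shares)
```

Plotting `shares` with `kind='area'` instead of the raw counts would give a chart in which every year sums to 100 percent.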

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be beneficial for various stakeholders, including government agencies, security organizations, and businesses operating in regions prone to terrorism.**

**Government Agencies: Government bodies can use this information to allocate resources effectively and prioritize regions that require heightened security measures. It aids in the development of regional-specific counterterrorism strategies.**

**Security Organizations: Private security firms can tailor their services based on the historical patterns of terrorist activities in different regions. Understanding the dynamics allows for better preparation and risk mitigation strategies.**

**Businesses: Companies operating in regions with a history of terrorism can use this information for risk assessment and business continuity planning. It helps in identifying regions with higher security risks, allowing businesses to implement targeted security measures.**

#### Chart - 5

In [None]:
# List the columns available after data wrangling
column_names = data.columns
print(column_names)


In [None]:
# Create a bar plot
plt.figure(figsize=(12, 6))
most_affected_countries = data['Country'].value_counts().head(15)
sns.barplot(x=most_affected_countries.values, y=most_affected_countries.index, hue=most_affected_countries.index, palette='Set1', dodge=False)

plt.title('Top 15 Most Affected Countries by Terrorism')
plt.xlabel('Number of Attacks')
plt.ylabel('Country')
plt.legend(title='Country', labels=most_affected_countries.index, loc='upper right')
plt.show()

##### 1. Why did you pick the specific chart?

**A horizontal bar plot was chosen to represent the 15 countries most affected by terrorism. This chart is effective for comparing the frequencies of terrorist attacks across different countries and understanding their relative impact.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides a clear visual representation of the countries most affected by terrorism, with the length of each bar indicating the number of attacks recorded in that country. It allows for easy comparison and identification of the countries with the highest impact.**

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, security organizations, and businesses operating globally.**

**Government Agencies: Security agencies can use this information to allocate resources effectively and prioritize efforts in regions that are most affected by terrorism. It aids in the development of targeted counterterrorism strategies.**

**Security Organizations: Private security firms can tailor their services based on the threat posed by specific countries. For example, if certain countries consistently experience high levels of terrorism, businesses may implement heightened security measures in those regions.**

**Businesses: Companies operating globally can use this information for risk assessment and business continuity planning. Understanding the countries with a high risk of terrorism allows businesses to implement targeted security measures, ensuring the safety of employees and assets.**

#### Chart - 6

In [None]:
# Keep only the 10 years with the most recorded attacks
top_10_years = data['Year'].value_counts().head(10).index
filtered_data = data[data['Year'].isin(top_10_years)]

# Group by year and calculate the sum of killed and wounded
casualties_per_year = filtered_data.groupby('Year')[['Killed', 'Wounded']].sum()

# Create a stacked bar chart
casualties_per_year.plot(kind='bar', stacked=True, figsize=(12, 6), colormap='viridis')
plt.title('Casualties (Killed and Wounded) in the 10 Years with the Most Attacks')
plt.xlabel('Year')
plt.ylabel('Number of Casualties')
plt.show()


##### 1. Why did you pick the specific chart?

**The stacked bar chart was chosen to represent the casualties (both killed and wounded) in the 10 years with the most recorded attacks. This chart is effective for visualizing the total impact of terrorist attacks on casualties over a specific period, broken down by the number of killed and wounded individuals.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides a visual overview of the casualties (killed and wounded) caused by terrorist attacks in the 10 years with the most recorded attacks. Each bar stacks the killed and wounded totals for one year, so the years can be compared both on total casualties and on the split between deaths and injuries.**
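
When reading these totals, it helps to know that pandas' `sum()` skips missing values, so incidents with an unknown casualty count contribute zero. The sketch below reproduces the aggregation on a made-up frame and also counts how many records lack a fatality figure. The variable `top_n` is set to 2 only to keep the toy example small.

```python
import numpy as np
import pandas as pd

# Made-up records for illustration only
toy = pd.DataFrame({
    "Year":    [2014, 2014, 2014, 2015, 2015, 2016],
    "Killed":  [3.0, np.nan, 1.0, 2.0, 0.0, 5.0],
    "Wounded": [4.0, 2.0, np.nan, np.nan, 1.0, 0.0],
})

top_n = 2  # the notebook uses 10
busiest_years = toy["Year"].value_counts().head(top_n).index
subset = toy[toy["Year"].isin(busiest_years)]

# sum() skips NaN by default, so unknown counts are effectively treated as zero
casualties = subset.groupby("Year")[["Killed", "Wounded"]].sum()
casualties["Total"] = casualties["Killed"] + casualties["Wounded"]

# Number of records per year without a fatality count
unknown_killed = subset["Killed"].isna().groupby(subset["Year"]).sum()

print(casualties)
print(unknown_killed)
```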

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, security organizations, and businesses operating in regions affected by terrorism.**

**Government Agencies: Security agencies can use this information to assess the overall impact of terrorist attacks on public safety. It helps in resource allocation, emergency response planning, and the development of policies aimed at reducing casualties.**

**Security Organizations: Private security firms can tailor their services based on the historical patterns of casualties. Understanding the trends allows for better preparation and risk mitigation strategies to minimize the impact of future attacks.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. It helps in understanding the potential impact on employees and infrastructure, allowing businesses to implement measures to enhance safety.**

#### Chart - 7

In [None]:
# Chart - 7 visualization code
plt.figure(figsize=(15,7))
sns.lineplot(data=data, x='Year', y='Killed', estimator='sum')
plt.title('Number of people killed by terror attack')
plt.xticks(rotation=90)
plt.show()

##### 1. Why did you pick the specific chart?

**The line plot was chosen to represent the number of people killed by terror attacks over the years. This chart is suitable for visualizing trends and variations in the total number of casualties (killed) across different years.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into the trends in the number of people killed by terrorist attacks over the years. By observing the line, one can identify periods of escalation or decline in casualties, helping to understand the overall impact of terrorism on human lives.**

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can have implications for various stakeholders, including government agencies, security organizations, and businesses.**

**Government Agencies: Security agencies can use this information to assess the overall impact of terrorism on public safety. It aids in resource allocation, policy development, and the implementation of measures to reduce the number of casualties.**

**Security Organizations: Private security firms can tailor their services based on the historical patterns of casualties. Understanding the trends allows for better preparation, risk mitigation strategies, and the development of security protocols to minimize the impact on human lives.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. Understanding the potential impact on human lives allows businesses to implement measures to enhance employee safety and contribute to the well-being of the local community.**

#### Chart - 8

In [None]:
# Aggregate data by weapon type and get the top 10
weapon_type_distribution = data.groupby('Weapon_type').size().reset_index(name='Number_of_Attacks')
weapon_type_distribution = weapon_type_distribution.sort_values(by='Number_of_Attacks', ascending=False).head(10)

# Create a bar plot with a categorical colormap
plt.figure(figsize=(12, 6))
plt.barh(weapon_type_distribution['Weapon_type'][::-1], weapon_type_distribution['Number_of_Attacks'][::-1], color=plt.get_cmap('tab10').colors)

plt.title('Weapon Type Distribution in Terrorist Attacks')
plt.xlabel('Number of Attacks')
plt.ylabel('Weapon Type')
plt.show()

##### 1. Why did you pick the specific chart?

**The horizontal bar chart was selected to represent the distribution of terrorist attacks based on weapon types. This chart is effective for comparing the frequency of attacks involving different weapons and identifying the most commonly used weapons in terrorist activities.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into the prevalence of specific weapon types in terrorist attacks. By examining the horizontal bars, one can identify which weapons are most frequently used, offering valuable information for understanding the tactics employed by terrorist groups.**

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, security organizations, and businesses.**

**Government Agencies: Security agencies can use this information to assess the types of weapons commonly used in terrorist activities. This knowledge is crucial for developing effective counterterrorism strategies, implementing relevant regulations, and improving security measures.**

**Security Organizations: Private security firms can tailor their services based on the prevalent weapons in the region. Understanding the types of weapons used allows for better preparation, risk mitigation strategies, and the development of security protocols to minimize the impact of attacks.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. Understanding the prevalent weapons helps businesses implement measures to protect employees and assets.**

#### Chart - 9

In [None]:
# Create a bar plot
plt.figure(figsize=(12, 6))
sns.barplot(x='AttackType', y='Killed', data=data, estimator=sum, hue='AttackType', palette='viridis', dodge=False)

plt.title('Attack Method vs. Number Killed in Terrorist Attacks')
plt.xlabel('Attack Method')
plt.ylabel('Total Killed')
plt.xticks(rotation=45, ha='right')  # Rotate x-axis labels for better visibility
plt.legend(title='Attack Type', labels=data['AttackType'].unique(), loc='upper right')
plt.show()

##### 1. Why did you pick the specific chart?

**A bar plot was chosen to represent the relationship between attack methods and the total number of people killed in terrorist attacks. This chart visually compares the death toll associated with each attack method.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into how fatalities are distributed across attack methods. By comparing the heights of the bars, one can identify which methods account for the most deaths overall. Because the bars show totals, a frequently used method can rank high even if individual attacks of that type are not especially deadly.**
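
Totals alone cannot separate frequently used methods from deadly ones. One way to tell them apart is to compute the average number killed per attack alongside the total. A minimal sketch on made-up data follows; the output column names `total_killed`, `attacks_with_count`, and `mean_killed` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Made-up records for illustration only
toy = pd.DataFrame({
    "AttackType": ["Bombing/Explosion"] * 4 + ["Armed Assault"] * 2,
    "Killed": [1.0, 0.0, 2.0, np.nan, 4.0, 6.0],
})

lethality = (
    toy.groupby("AttackType")["Killed"]
    # count and mean both ignore rows where the fatality count is missing
    .agg(total_killed="sum", attacks_with_count="count", mean_killed="mean")
    .sort_values("mean_killed", ascending=False)
)
print(lethality)
```

In this toy example the method with fewer attacks has the higher average, which is exactly the kind of difference a chart of totals can hide.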

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, security organizations, and businesses.**

**Government Agencies: Security agencies can use this information to prioritize counterterrorism efforts based on the effectiveness and impact of various attack methods. It aids in resource allocation, policy development, and the implementation of measures to mitigate casualties.**

**Security Organizations: Private security firms can tailor their services based on the prevalent attack methods. Understanding the impact of different methods allows for better preparation, risk mitigation strategies, and the development of security protocols to minimize casualties.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. Understanding the impact of different attack methods helps businesses implement measures to protect employees and assets.**

#### Chart - 10

In [None]:
# Work on a copy so that rows dropped here do not affect the other charts
data_copy = data.copy()

# Build a datetime column from 'Year', 'Month', and 'Day'; invalid dates become NaT
data_copy['Date'] = pd.to_datetime(data_copy[['Year', 'Month', 'Day']], errors='coerce')

# Drop rows whose date could not be built
data_copy = data_copy.dropna(subset=['Date'])

# Extract day of the week from the 'Date' column
data_copy['Day_of_Week'] = data_copy['Date'].dt.day_name()

# Create a bar chart for the number of attacks by day of the week
plt.figure(figsize=(10, 6))
sns.countplot(x='Day_of_Week', data=data_copy, order=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'], palette='viridis', hue='Day_of_Week', legend=False)
plt.title('Number of Attacks by Day of the Week')
plt.xlabel('Day of the Week')
plt.ylabel('Number of Attacks')
plt.show()

##### 1. Why did you pick the specific chart?

**A count plot (bar chart) was selected to visualize the number of terrorist attacks on each day of the week. This type of chart effectively displays the frequency of events for different categories, in this case, the days of the week.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into the distribution of terrorist attacks across the days of the week. By examining the heights of the bars, one can identify patterns or variations in the number of attacks on specific days. This information can help understand if there are certain days more prone to terrorist activities.**
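
To compare days numerically rather than by eye, the counts can be converted to shares of all dated incidents. This is a minimal sketch, assuming the 'Year', 'Month', and 'Day' column names used above; `weekday_share` is a hypothetical helper and the rows are toy values, not GTD records.

```python
import pandas as pd

WEEKDAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

def weekday_share(df):
    """Share of dated incidents per weekday; rows with invalid dates are skipped."""
    # Rename to the lowercase keys that pandas documents for date assembly
    parts = df[['Year', 'Month', 'Day']].rename(columns=str.lower)
    dates = pd.to_datetime(parts, errors='coerce')  # e.g. month 0 or day 0 becomes NaT
    counts = dates.dropna().dt.day_name().value_counts()
    # Reindex so that every weekday appears, even with zero incidents
    return counts.reindex(WEEKDAYS, fill_value=0) / counts.sum()

# Toy rows for illustration only; the last row has an unknown month and day
toy = pd.DataFrame({
    'Year': [2017, 2017, 2017, 2017],
    'Month': [1, 1, 1, 0],
    'Day': [2, 3, 9, 0],
})
shares = weekday_share(toy)
print(shares.round(3))
```

A uniform distribution would put each day near 1/7 (about 0.143), which gives a simple baseline for judging whether any day stands out.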

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be relevant for various stakeholders, including government agencies, security organizations, and businesses.**

**Government Agencies: Security agencies can use this information to allocate resources more effectively, especially if there are trends indicating higher activity on specific days. It aids in planning and implementing security measures to address potential threats on certain days.**

**Security Organizations: Private security firms can adjust their protocols and staffing based on the observed patterns. This understanding allows for better preparation and response strategies, contributing to enhanced security.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. Understanding the days with higher risk may influence scheduling, employee safety measures, and security protocols.**

#### Chart - 11



In [None]:
# Create a bar chart for the number of attacks by month
plt.figure(figsize=(12, 6))
sns.countplot(x='Month', data=data, palette='viridis', order=range(1, 13), hue='Month', legend=False)
plt.title('Number of Attacks by Month')
plt.xlabel('Month')
plt.ylabel('Number of Attacks')
plt.show()


##### 1. Why did you pick the specific chart?

**A count plot (bar chart) with the 'Month' variable on the x-axis was selected to visualize the distribution of terrorist attacks across different months. This chart is suitable for comparing the frequency of events in each month.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into the seasonal variation of terrorist attacks throughout the year. By examining the heights of the bars for each month, one can identify patterns or trends in the occurrence of attacks. For example, certain months may exhibit higher or lower activity levels.**
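
Months differ in length, so raw monthly counts slightly favor 31-day months. One possible adjustment is to divide each count by the number of calendar days that month contributes. This is a sketch under assumptions: it uses the 'Year' and 'Month' columns from above, treats months outside 1 to 12 as unknown, and counts only years that appear in the data; `attacks_per_day_by_month` is a hypothetical helper and the rows are toy values.

```python
import calendar
import pandas as pd

def attacks_per_day_by_month(df):
    """Attacks per calendar day for each month, pooled over the years in df."""
    known = df[df['Month'].between(1, 12)]  # drop rows with an unknown month (e.g. 0)
    counts = known.groupby('Month').size().reindex(range(1, 13), fill_value=0)
    years = known['Year'].unique()
    # Total number of calendar days each month contributes across those years
    days = pd.Series({
        month: sum(calendar.monthrange(int(year), month)[1] for year in years)
        for month in range(1, 13)
    })
    return counts / days

# Toy rows for illustration only; 2016 is a leap year, so February has 29 days
toy = pd.DataFrame({'Year': [2016, 2016, 2016, 2016], 'Month': [1, 2, 2, 0]})
rates = attacks_per_day_by_month(toy)
print(rates.round(4))
```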

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be relevant for various stakeholders, including government agencies, security organizations, and businesses.**

**Government Agencies: Security agencies can use this information to understand if there are specific months with heightened risks of terrorist activities. This understanding aids in planning and implementing targeted security measures during those periods.**

**Security Organizations: Private security firms can adjust their protocols and staffing based on the observed seasonal patterns. This information allows for better preparation and response strategies, contributing to enhanced security during peak periods.**

**Businesses: Companies operating in regions with a high risk of terrorism can use this information for risk assessment and business continuity planning. Seasonal variations may impact scheduling, employee safety measures, and security protocols.**

#### Chart - 12

In [None]:
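# Create a bar chart for the top 10 target locations (cities)
top_cities = data['city'].value_counts().nlargest(10).index
plt.figure(figsize=(12, 6))
# Plot only the top 10 cities so that hue (used just for coloring) has 10 levels
sns.countplot(x='city', data=data[data['city'].isin(top_cities)], order=top_cities, hue='city', hue_order=top_cities, palette='viridis', legend=False)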
plt.title('Top 10 Target Locations (Cities)')
plt.xlabel('City')
plt.ylabel('Number of Attacks')
plt.xticks(rotation=45, ha='right')
plt.show()

##### 1. Why did you pick the specific chart?

**A count plot (bar chart) with the 'city' variable on the x-axis was selected to visualize the distribution of terrorist attacks across different cities. This chart is suitable for comparing the frequency of attacks in various cities, helping identify the top 10 target locations.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into the cities that have been most frequently targeted in terrorist attacks. By examining the heights of the bars, one can quickly identify which cities have experienced a higher number of attacks.**
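
A ranking by city name can be dominated by placeholder values when the location was not recorded. The sketch below shows one way to exclude them before ranking; it assumes the placeholder label is 'Unknown', which should be checked against the actual values in the 'city' column. `rank_cities` is a hypothetical helper and the city names are toy values.

```python
import pandas as pd

def rank_cities(df, n=10, placeholders=('Unknown',)):
    """Most frequently attacked cities, skipping missing and placeholder names."""
    cities = df['city'].dropna()
    cities = cities[~cities.isin(placeholders)]
    return cities.value_counts().head(n)

# Toy rows for illustration only (not real GTD values)
toy = pd.DataFrame({
    'city': ['City A', 'Unknown', 'Unknown', 'Unknown', 'City B', 'City A', None],
})
ranking = rank_cities(toy)
print(ranking)
```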

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from this chart can be valuable for various stakeholders, including government agencies, law enforcement, and businesses.**

**Government Agencies and Law Enforcement: Security agencies can use this information to allocate resources and implement targeted security measures in cities that are more prone to terrorist attacks. This can aid in improving overall public safety and security.**

**Businesses: Companies operating in or near these cities can use this information for risk assessment and business continuity planning. Understanding the areas with higher risks allows businesses to implement additional security measures and contingency plans.**

**Urban Planning: City officials and urban planners may use this data to assess vulnerabilities and prioritize security measures. It can influence policies related to public safety and emergency response planning.**

#### Chart - 13

In [None]:
# Get the top 5 target types
top_target_types = data['Target_type'].value_counts().nlargest(5)

# Create a pie chart for the distribution of the top 5 target types
plt.figure(figsize=(8, 8))
top_target_types.plot.pie(autopct='%1.1f%%', colors=sns.color_palette('viridis'), startangle=90)
plt.title('Distribution of Top 5 Target Types')
plt.ylabel('')
plt.show()


##### 1. Why did you pick the specific chart?

**A pie chart was chosen to represent the distribution of the top 5 target types. Pie charts are effective for displaying the proportion of different categories within a whole. Here the whole is the set of attacks on the top 5 target types only, so each percentage is a share of that subset rather than of all recorded attacks.**

##### 2. What is/are the insight(s) found from the chart?

**The chart provides insights into the proportion of terrorist attacks targeting different types of entities. By looking at the slices of the pie, one can quickly grasp the relative significance of each target type within the top 5.**
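
The pie percentages are computed within the top 5 only, so it can help to show them next to each type's share of all attacks. This is a minimal sketch, assuming the 'Target_type' column used above; `top_share_table` is a hypothetical helper and the rows are toy values.

```python
import pandas as pd

def top_share_table(df, column='Target_type', n=5):
    """Shares of the n most common categories, within the top n and overall."""
    counts = df[column].value_counts()
    top = counts.nlargest(n)
    return pd.DataFrame({
        'share_of_top_n': top / top.sum(),   # what the pie chart displays
        'share_of_all': top / counts.sum(),  # share of every recorded attack
    })

# Toy rows for illustration only: 4 of type A, 3 of B, 2 of C, 1 of D
toy = pd.DataFrame({'Target_type': ['A'] * 4 + ['B'] * 3 + ['C'] * 2 + ['D']})
table = top_share_table(toy, n=2)
print(table.round(3))
```

In the toy example, type A makes up about 57% of the top 2 but only 40% of all rows, which is the gap a top-n pie chart hides.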

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Security Planning: The insights gained from this chart can be valuable for security planning and resource allocation. Understanding which types of targets are more frequently attacked allows for the implementation of targeted security measures.**

**Risk Management: Businesses and organizations can use this information for risk assessment. Knowing the types of targets that are more susceptible to attacks helps in developing risk mitigation strategies.**

**Public Awareness: The information from this chart can also contribute to public awareness and education. By understanding the common targets, the public, as well as relevant authorities, can be better prepared and vigilant.**

#### Chart - 14

In [None]:
from wordcloud import WordCloud

# Create a word cloud for the most common target keywords
plt.figure(figsize=(12, 8))
# Cast to str so that any non-string entries do not break the join
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(' '.join(data['Target'].dropna().astype(str)))
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Word Cloud: Most Common Target Keywords')
plt.axis('off')
plt.show()


##### 1. Why did you pick the specific chart?

**A word cloud was chosen to show the most common keywords in the 'Target' column. It gives a quick, intuitive overview of the text by scaling each word according to its frequency.**

##### 2. What is/are the insight(s) found from the chart?

**The word cloud reveals the words or phrases that occur most frequently in the "Target" column of the dataset. Larger words in the cloud indicate higher frequency. It helps identify patterns, trends, or recurring themes in the targets of terrorist attacks.**
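
Word sizes are hard to compare precisely, so a plain frequency table can complement the cloud. The sketch below uses only the standard library; `top_words` and its small stop-word set are hypothetical, and because WordCloud applies its own tokenization and stop-word handling, these counts are not expected to match the cloud exactly. The example strings are toy values.

```python
import re
from collections import Counter

def top_words(texts, n=10, stopwords=frozenset({'of', 'the', 'and', 'in'})):
    """Return the n most common lowercase words across an iterable of strings."""
    counter = Counter()
    for text in texts:
        if not isinstance(text, str):
            continue  # skip missing or non-string entries
        words = re.findall(r"[a-z']+", text.lower())
        counter.update(word for word in words if word not in stopwords)
    return counter.most_common(n)

# Toy strings for illustration only (not real GTD values)
toy_targets = ['Police Station', 'Police patrol', None, 'Village of civilians']
frequencies = top_words(toy_targets, n=3)
print(frequencies)
```

On the real data, the call would be `top_words(data['Target'])`.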

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from the word cloud can be valuable for understanding the common targets of terrorist attacks. This information might be useful for businesses or organizations involved in risk management, security, or international affairs. It can aid in making informed decisions related to security measures, threat assessments, and preparedness strategies, potentially contributing to a safer environment.**

#### Chart - 15

In [None]:
# Chart - 15 visualization code
# Create a heatmap to visualize the correlation matrix of numeric columns
plt.figure(figsize=(10, 8))
sns.heatmap(data[['Killed', 'Wounded', 'latitude', 'longitude']].corr(), annot=True, cmap='viridis')
plt.title('Correlation Matrix of Numeric Columns')
plt.show()

##### 1. Why did you pick the specific chart?

**The heatmap of the correlation matrix is chosen to visualize the relationships between numeric columns in the dataset. It provides a clear and concise representation of the correlation coefficients, making it easy to identify patterns and dependencies between variables.**

##### 2. What is/are the insight(s) found from the chart?

**The heatmap displays the correlation matrix for the numeric columns 'Killed', 'Wounded', 'latitude', and 'longitude'. The sign and strength of each correlation are shown by the cell color and the annotated coefficient. For instance, one can check whether the numbers of people killed and wounded in attacks are correlated, or whether the geographic coordinates (latitude and longitude) correlate with casualties.**
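
The default coefficient in `DataFrame.corr` is Pearson, which measures linear association and can be strongly influenced by a few very large incidents. Comparing it with the rank-based Spearman coefficient is one way to check how robust a relationship is. This is a minimal sketch, assuming the 'Killed' and 'Wounded' columns used above; `killed_wounded_correlations` is a hypothetical helper and the rows are toy values.

```python
import pandas as pd

def killed_wounded_correlations(df):
    """Pearson and Spearman correlation between 'Killed' and 'Wounded'."""
    pair = df[['Killed', 'Wounded']].dropna()  # use only rows with both values
    return {
        'pearson': pair.corr(method='pearson').loc['Killed', 'Wounded'],
        'spearman': pair.corr(method='spearman').loc['Killed', 'Wounded'],
        'rows_used': len(pair),
    }

# Toy rows for illustration only; one extreme incident and one missing value
toy = pd.DataFrame({
    'Killed': [0, 1, 2, 100, 4],
    'Wounded': [1, 2, 3, 4, None],
})
result = killed_wounded_correlations(toy)
print(result)
```

In the toy data the ranks agree perfectly, so Spearman is 1, while the single extreme value pulls Pearson below 1.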

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**The insights gained from the correlation matrix can be beneficial for understanding how different numeric variables are related. For businesses or organizations involved in security, risk assessment, or policy planning, this information can aid in making data-driven decisions. For example, if there is a strong correlation between certain variables, it could inform strategies for allocating resources, enhancing security measures, or improving emergency response plans.**

## **5. Solution to Business Objective**

#### What do you suggest the client do to achieve the business objective?

Based on the exploratory data analysis conducted on the Global Terrorism Dataset, there are several recommendations that could be provided to a client interested in using this information:

1. **Prioritize Hotspot Regions**: Allocate resources and security measures to regions with the highest frequencies of terrorist activities.

2. **Adapt Counter-Terrorism Strategies**: Monitor yearly trends to proactively adjust counter-terrorism strategies based on evolving patterns.

3. **Focus on Major Threat Groups**: Concentrate intelligence efforts on high-impact terrorist groups to prevent and mitigate potential attacks.

4. **Specialize Measures for Common Attack Types**: Tailor preventive measures and response strategies based on the most common types of terrorist attacks.

5. **Encourage International Collaboration**: Foster collaboration with international organizations and neighboring countries to strengthen the global response to terrorism.

6. **Implement Socio-Economic Programs**: Address root causes of terrorism in hotspot regions through education, employment, and community development programs.

7. **Establish Early Warning Systems**: Develop systems that use data analytics and intelligence for early detection and swift response to emerging threats.
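
Recommendation 2 relies on tracking how attack counts change from year to year. The following is a minimal sketch of that calculation, assuming the 'Year' column used earlier in the notebook; `yearly_change` is a hypothetical helper and the rows are toy values. If a year is absent from the data, the change is computed against the previous year that is present.

```python
import pandas as pd

def yearly_change(df):
    """Attack count per year and percentage change from the previous listed year."""
    counts = df['Year'].value_counts().sort_index()
    return pd.DataFrame({
        'attacks': counts,
        'pct_change': counts.pct_change() * 100,
    })

# Toy rows for illustration only: 2 attacks in 2015, 4 in 2016, 1 in 2017
toy = pd.DataFrame({'Year': [2015, 2015, 2016, 2016, 2016, 2016, 2017]})
change = yearly_change(toy)
print(change)
```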

# **Conclusion**


The Exploratory Data Analysis (EDA) on the Global Terrorism Dataset uncovered significant insights into patterns and trends spanning 1970 to 2017. Utilizing Python libraries such as Pandas, Matplotlib, Seaborn, and NumPy, we decoded complex terrorism-related data, revealing key findings.

The analysis highlighted temporal trends, identified hotspots, recognized dominant terrorist groups, and outlined prevalent attack methods. These insights serve as vital inputs for crafting effective counter-terrorism strategies.

The power of data-driven decision-making was evident throughout this process. Converting raw data into meaningful insights can help security agencies and policymakers allocate resources more efficiently, which in turn has the potential to save lives and protect property.

It's crucial to note, however, that addressing terrorism requires a holistic approach beyond historical data analysis. A comprehensive strategy must encompass real-time intelligence, geopolitical nuances, and ground-level dynamics.

In essence, this project showcases the value of data analysis in informing counter-terrorism efforts. While serving as a valuable foundation, it emphasizes the ongoing need for continuous data collection, analysis, and interpretation to effectively address global security challenges like terrorism.
