<a href="https://colab.research.google.com/github/DataWithAaditya/United-Nation-Global-Terrorism-Analysis/blob/main/EDA_United_Nation_Global_Terrorism_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - United Nation Global Terrorism Analysis



##### **Project Type**    - EDA
##### **Contribution**    - Individual

# **Project Summary -**

#### ***Objective:***
The goal of this Exploratory Data Analysis (EDA) project is to analyze and uncover insights from global terrorism incidents recorded in the Global Terrorism Database (GTD). This analysis will help understand trends in terrorism, the most affected regions, attack methods, casualty rates, and other key patterns.

By conducting a structured EDA, this project aims to provide meaningful insights that could support security agencies, policymakers, and researchers in making informed decisions.

#### ***Key Steps in the Project:***
1. Data Collection & Understanding
2. Exploratory Data Analysis (EDA) Using the UBM Rule
3. Data Visualization & Insights
4. Deployment-Ready Code
5. Final Presentation & Report

#### ***Expected Outcome:***
This project will deliver data-driven insights into terrorism trends, identifying key risk factors, vulnerable regions, and evolving attack patterns. The findings will be valuable for:
- Policymakers – Formulating counter-terrorism policies.
- Security Agencies – Strengthening security measures in high-risk areas.
- Researchers & Analysts – Understanding global terrorism patterns.

# **GitHub Link -**

GitHub Link: https://github.com/DataWithAaditya/United-Nation-Global-Terrorism-Analysis/tree/main

# **Problem Statement**


#### ***Problem Statement:***
Terrorism remains one of the most critical global challenges, causing widespread loss of life, economic damage, and geopolitical instability. Governments and international organizations continuously strive to understand terrorism patterns to develop effective countermeasures. However, due to the complexity and vast amount of data, deriving actionable insights remains a challenge.

This project aims to analyze historical terrorism data from the Global Terrorism Database (GTD) to answer key questions such as:

- Which regions and countries are most affected by terrorism?
- What are the most common attack methods and target types?
- How have terrorist activities evolved over time?
- What is the impact of terrorism in terms of casualties and property damage?
- Are there any patterns that indicate possible future risks?

By conducting Exploratory Data Analysis (EDA), this project seeks to uncover hidden trends and relationships, providing valuable insights for security agencies, policymakers, and researchers to make data-driven decisions.

#### **Define Your Business Objective?**

#### ***Business Objective:***
Analyze global terrorism trends to identify high-risk regions, attack methods, and evolving patterns, supporting better counter-terrorism strategies and decision-making.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

### Dataset Loading

In [None]:
# Mount Drive
from google.colab import drive
drive.mount('/content/drive')


In [None]:
# Load Dataset
df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Global Terrorism Data.csv', encoding='latin1')


### Dataset First View

In [None]:
# Display top few rows
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
rows, columns = df.shape
print(f"Number of rows: {rows}")
print(f"Number of columns: {columns}")

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_count = df.duplicated().sum()
print(f"Number of duplicate rows: {duplicate_count}")

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
missing_values_count = df.isnull().sum()

print("Missing Values Count for Each Column:")
print(missing_values_count)

In [None]:
# Visualizing the missing values
missing_data = df.isnull().sum()
missing_cols = missing_data[missing_data > 0].index  # Select only columns with missing values

if len(missing_cols) > 0:
    plt.figure(figsize=(12, 6))
    sns.heatmap(df[missing_cols].isnull(), cmap="coolwarm", cbar=False)
    plt.title("Missing Values Heatmap (Filtered)")
    plt.show()
else:
    print("No missing values found in the dataset.")

#### ***Why Picked:***

- It visually highlights missing data across rows and columns, making patterns easy to spot.

#### ***Insights Found:***

- Columns like approxdate and related have significant missing data, requiring attention.

#### ***Business Impact:***

- Positive: Helps improve data quality by addressing missing values.
- Negative: Missing key data could lead to incomplete or biased insights,
affecting decisions.

This visualization aids in ensuring high-quality data for better decision-making.
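
The heatmap shows where gaps are, but not how large they are. A minimal sketch of how the share of missing values per column could be quantified is given here; the `toy` frame is a hypothetical stand-in for the GTD DataFrame `df`, and its values are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the GTD DataFrame `df`
toy = pd.DataFrame({
    "iyear": [1970, 1971, 1972, 1973],
    "approxdate": [np.nan, np.nan, np.nan, "March 1973"],
    "nkill": [1.0, np.nan, 0.0, 2.0],
})

# Share of missing values per column, as a percentage
missing_pct = toy.isnull().mean().mul(100).sort_values(ascending=False)

# Keep only the columns that actually have gaps
missing_pct = missing_pct[missing_pct > 0]
print(missing_pct.round(1))
```

Running the same three lines on `df` instead of `toy` would give a ranked list that can guide which columns to drop and which to impute.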

### What did you know about your dataset?

The Global Terrorism Database (GTD) contains worldwide terrorism incidents from 1970 onward, covering attack types, locations, casualties, weapons, and terrorist groups. It helps analyze trends, most affected regions, common attack methods, and impact.

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
df.columns

In [None]:
# Dataset Describe
df.describe()

### Variables Description

The key variables of the Global Terrorism Database (GTD) used in this analysis are described below:

#### ***Date & Location:***

**eventid-** Unique ID for each terrorist incident.

**iyear-**	Year when the attack occurred.

**imonth-**	Month of the attack (1-12).

**iday-**	Day of the attack (1-31).

**country_txt-**	Country where the attack took place.

**region_txt-**	Region where the attack took place.

**provstate-**	State/Province of the attack location.

**city-**	City where the attack occurred.

**latitude-**	Latitude of the attack location.

**longitude-**	Longitude of the attack location.

#### ***Attack Details:***

**attacktype1_txt-**	Primary type of attack (e.g., bombing, assassination).

**attacktype2_txt-**	Secondary attack type (if applicable).

**attacktype3_txt-**	Tertiary attack type (if applicable).

**targtype1_txt-**	Primary target type (e.g., military, civilians).

**targsubtype1_txt-**	More specific target type (e.g., police, religious figures).

**gname-**	Name of the terrorist group responsible for the attack.

**gsubname-**	Subgroup of the terrorist organization.

**motive-**	Stated or suspected motive of the attack.

#### ***Casualties & Damage:***

**nkill-**	Number of people killed.

**nkillus-**	Number of U.S. citizens killed.

**nwound-**	Number of people wounded.

**nwoundus-**	Number of U.S. citizens wounded.

**propextent_txt-**	Extent of property damage (e.g., minor, major, destroyed).


#### ***Incident Outcome:***

**success-**	Whether the attack was successful (1 = Yes, 0 = No).

**suicide-**	Whether it was a suicide attack (1 = Yes, 0 = No).

**ransom-**	Whether a ransom was demanded (1 = Yes, 0 = No).

**claimed-**	Whether a terrorist group claimed responsibility (1 = Yes, 0 = No).


#### ***Weapons & Tactics:***

**weaptype1_txt-**	Primary weapon type (e.g., firearms, explosives).

**weapsubtype1_txt-**	More specific weapon category.

**weapdetail-**	Additional details on the weapon used.


### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for column in df.columns:
    unique_values = df[column].nunique()
    print(f"{column}: {unique_values} unique values")

## Detect Outliers

In [None]:
# List of numerical columns to check for outliers
num_cols = ['nkill', 'nwound']  # Add more columns as needed

# Define subplot grid dynamically
num_plots = len(num_cols)  # Number of boxplots
cols_per_row = 3  # Adjust this for better spacing
rows = (num_plots // cols_per_row) + (num_plots % cols_per_row > 0)  # Auto-adjust rows

plt.figure(figsize=(cols_per_row * 4, rows * 5))  # Adjust figure size dynamically

for i, col in enumerate(num_cols, 1):
    plt.subplot(rows, cols_per_row, i)
    sns.boxplot(y=df[col], color='red')
    plt.title(f"Boxplot of {col}")
    plt.ylabel(col)
    plt.yscale("log")  # Log scale to handle extreme outliers

plt.tight_layout()
plt.show()


Why Boxplot?
- Detects outliers effectively.
- Handles skewed data.
- Quick comparison across features.

Insights:
- Extreme outliers exist (high casualties in some attacks).
- Most incidents have low casualties.
- Possible data quality issues (check extreme values).
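
The boxplot whiskers can be cross-checked numerically. The sketch below computes the 1.5 × IQR fences and counts the values outside them; `iqr_bounds` and `toy_nkill` are hypothetical names and data introduced for illustration, not part of the notebook.

```python
import pandas as pd

def iqr_bounds(series: pd.Series, k: float = 1.5):
    """Return the (lower, upper) fences of the k * IQR rule."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical fatality counts: mostly small values plus one extreme incident
toy_nkill = pd.Series([0, 0, 1, 1, 2, 2, 3, 50])

lower, upper = iqr_bounds(toy_nkill)

# Count values that fall outside the fences
n_outliers = int(((toy_nkill < lower) | (toy_nkill > upper)).sum())
print(f"fences: [{lower}, {upper}], outliers: {n_outliers}")
```

Counting the flagged rows before removing them makes it easier to judge whether the rule is discarding rare but real high-casualty incidents.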

## Remove Outliers Using IQR Method

In [None]:
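# Function to remove outliers using the IQR rule
def remove_outliers(df, column):
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    # Note: comparisons with NaN evaluate to False, so rows with a missing
    # value in `column` are dropped together with the outliers.
    return df[(df[column] >= lower_bound) & (df[column] <= upper_bound)]

# Remove outliers for the columns listed in num_cols (nkill, nwound)
rows_before = len(df)
for col in num_cols:
    df = remove_outliers(df, col)

print(f"Rows removed (outliers or missing values): {rows_before - len(df)}")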

### After handling outliers!

In [None]:
# List of numerical columns to check for outliers after handling

num_cols = ['nkill', 'nwound']  # Add more columns as needed

# Define subplot grid dynamically
num_plots = len(num_cols)  # Number of boxplots
cols_per_row = 3  # Adjust this for better spacing
rows = (num_plots // cols_per_row) + (num_plots % cols_per_row > 0)  # Auto-adjust rows

plt.figure(figsize=(cols_per_row * 4, rows * 5))  # Adjust figure size dynamically

for i, col in enumerate(num_cols, 1):
    plt.subplot(rows, cols_per_row, i)
    sns.boxplot(y=df[col], color='red')
    plt.title(f"Boxplot of {col}")
    plt.ylabel(col)
    plt.yscale("log")  # Log scale to handle extreme outliers

plt.tight_layout()
plt.show()

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

# Display initial shape of dataset
print(f"Original Dataset Shape: {df.shape}")

#Remove Irrelevant Columns
columns_to_drop = [
    'eventid', 'scite1', 'scite2', 'scite3', 'dbsource',  # Source columns
    'summary', 'provstate', 'alternative_txt', 'motive',  # Text-based columns (not useful for EDA)
    'propcomment', 'addnotes'  # Free-text columns
]
df.drop(columns=columns_to_drop, axis=1, inplace=True)

#Handle Missing Values
df.fillna({
    'nkill': 0,
    'nwound': 0,
    'weaptype1_txt': 'Unknown',
    'gname': 'Unknown',
    'city': 'Unknown',
    'targtype1_txt': 'Unknown'
}, inplace=True)

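# Fill remaining missing values in numeric columns with 0 (text columns are left unchanged)
numeric_cols = df.select_dtypes(include='number').columns
df[numeric_cols] = df[numeric_cols].fillna(0)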

#Convert Data Types
df['iyear'] = df['iyear'].astype(int)  # Year as integer
df['success'] = df['success'].astype(bool)  # Convert to boolean
df['suicide'] = df['suicide'].astype(bool)  # Convert to boolean

#Remove Duplicates (if any)
df.drop_duplicates(inplace=True)

#Create New Features
df['total_casualties'] = df['nkill'] + df['nwound']

#Save the Cleaned Dataset
df.to_csv("Cleaned_Global_Terrorism_Data.csv", index=False)

# Display final shape after cleaning
print(f"Cleaned Dataset Shape: {df.shape}")

Explanation of the outlier step performed before wrangling:

- Boxplots help visualize extreme values.
- Data points beyond the whiskers (more than 1.5 times the IQR below Q1 or above Q3) are treated as outliers.

### What all manipulations have you done and insights you found?

### Data Manipulations Done & Insights Found

#### ***Data Manipulations:***

1. Dropped irrelevant columns (e.g., eventid, summary, provstate, etc.) to remove unnecessary text fields.
2. Handled missing values by filling numeric columns (nkill, nwound) with 0 and categorical columns (gname, city, weaptype1_txt) with 'Unknown'.
3. Converted data types (iyear as integer, success & suicide as boolean) for better analysis.
4. Removed duplicate rows to maintain data integrity.
5. Created a new feature: total_casualties = nkill + nwound, which helps in understanding attack severity.
6. Saved the cleaned dataset for further analysis and visualization.
7. Detected outliers in nkill and nwound with boxplots and removed them using the IQR method (done before the cleaning steps above).
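
As a rough illustration of steps 2 and 5, the sketch below applies the same idea to a small made-up frame. The helper name `clean_casualties` and the toy values are assumptions for demonstration and do not come from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in numeric and text columns
toy = pd.DataFrame({
    "nkill": [1.0, np.nan, 3.0],
    "nwound": [np.nan, 2.0, 4.0],
    "gname": ["Group A", None, "Group B"],
})

def clean_casualties(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill gaps and add a total_casualties column without mutating the input."""
    out = frame.copy()
    # Numeric gaps become 0; this treats "not reported" as "none"
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(0)
    # Text gaps become an explicit 'Unknown' label
    out["gname"] = out["gname"].fillna("Unknown")
    # Derived severity feature: killed + wounded
    out["total_casualties"] = out["nkill"] + out["nwound"]
    return out

cleaned = clean_casualties(toy)
print(cleaned)
```

Working on a copy keeps the raw frame intact, which helps when a cell is re-run out of order in a notebook.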

#### ***Insights Found:***

1. Yearly Trend: The number of terrorist attacks has increased significantly in recent decades.
2. Most Affected Countries: Some countries, like Iraq, Afghanistan, and Pakistan, have the highest number of attacks.
3. Deadliest Attacks: A few incidents have caused extremely high casualties, affecting global security.
4. Most Used Weapons: Explosives and firearms are the most commonly used weapons in attacks.
5. Primary Targets: Government, military, and civilians are the most targeted groups.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1: Count of Terrorist Attacks by Year. (Univariate Analysis)

In [None]:
# Plot the number of terrorist attacks per year
attacks_per_year = df['iyear'].value_counts().sort_index()  # Count once, in chronological order
plt.figure(figsize=(8, 6))
sns.lineplot(x=attacks_per_year.index, y=attacks_per_year.values, marker='o', color='red')
plt.xlabel("Year")
plt.ylabel("Number of Attacks")
plt.title("Trend of Terrorist Attacks Over the Years")
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

**Answer:** Line charts show trends over time.

##### 2. What is/are the insight(s) found from the chart?

**Answer:** Terrorist attacks increased sharply after 2000, peaking in 2014-2015.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps governments allocate resources for counter-terrorism strategies.

2. Negative growth: Unchecked terrorism increases global instability.
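
The peak read off the line chart can also be confirmed from the counts themselves. This is a minimal sketch on made-up years; `toy_years` is a hypothetical stand-in for `df['iyear']`.

```python
import pandas as pd

# Hypothetical incident years standing in for df['iyear']
toy_years = pd.Series([2012, 2013, 2013, 2014, 2014, 2014, 2015, 2015], name="iyear")

# Attacks per year in chronological order
attacks_per_year = toy_years.value_counts().sort_index()

# Year with the most recorded attacks
peak_year = int(attacks_per_year.idxmax())

# Year-over-year change in percent (first year has no predecessor, so NaN)
yoy_change = attacks_per_year.pct_change().mul(100)

print(f"Peak year: {peak_year}")
print(yoy_change.round(1))
```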

#### Chart - 2: Most Affected Countries

In [None]:
# Plot the top 10 most affected countries
plt.figure(figsize=(12, 6))
sns.barplot(x=df['country_txt'].value_counts().head(10).values, y=df['country_txt'].value_counts().head(10).index, palette="Reds_r")
plt.xlabel("Number of Attacks")
plt.ylabel("Country")
plt.title("Top 10 Most Affected Countries by Terrorist Attacks")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** Bar charts compare categorical data effectively.

##### 2. What is/are the insight(s) found from the chart?

**Answer-** Iraq, Afghanistan, and Pakistan have the most attacks.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps prioritize global security efforts.
2. Negative growth: Affects economic stability and tourism.

#### Chart - 3: Deadliest Terrorist Groups. (Bivariate Analysis)

In [None]:
# Calculate total casualties for each terrorist group
df_filtered = df[df['gname'] != "Unknown"]
top_groups_filtered = df_filtered.groupby('gname')['total_casualties'].sum().sort_values(ascending=False).head(10)

# Plot top terrorist groups by casualties
plt.figure(figsize=(12, 6))
sns.barplot(x=top_groups_filtered.values, y=top_groups_filtered.index, palette="magma")
plt.xlabel("Total Casualties (Killed + Wounded)")
plt.ylabel("Terrorist Group")
plt.title("Top 10 Deadliest Identified Terrorist Groups")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** A bar chart makes it easy to compare total casualties across terrorist groups.

##### 2. What is/are the insight(s) found from the chart?

**Answer-**  Taliban, Shining Path (SL), and Al-Shabaab rank among the deadliest groups, showing their significant impact.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps in counter-terrorism strategies.
2. Negative growth: Higher casualties increase fear and instability.

#### Chart - 4: Relationship Between Attack Type & Number of Attacks

In [None]:
# Show distribution of attack types
attack_types = df['attacktype1_txt'].value_counts().head(5)
plt.figure(figsize=(8, 8))
plt.pie(attack_types, labels=attack_types.index, autopct='%1.1f%%', colors=['red', 'blue', 'green', 'purple', 'orange'])
plt.title("Most Common Types of Terrorist Attacks")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** Pie charts show proportions clearly; here the shares are computed among the five most common attack types only.

##### 2. What is/are the insight(s) found from the chart?

**Answer-** Bombings are the most common type of attack.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps in preventive security measures.
2. Negative growth: Bombings cause mass casualties and economic damage.
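
Because the pie uses only the top categories, its percentages are shares of that subset rather than of all attacks. One possible way to report shares of the whole, sketched on hypothetical data, is to fold the remaining categories into an "Other" slice:

```python
import pandas as pd

# Hypothetical attack types standing in for df['attacktype1_txt']
toy_types = pd.Series(
    ["Bombing/Explosion"] * 5 + ["Armed Assault"] * 3 + ["Assassination"] + ["Hijacking"],
    name="attacktype1_txt",
)

counts = toy_types.value_counts()
top_n = 2  # illustrative; the chart above uses 5

# Keep the top categories and lump everything else together
top = counts.head(top_n)
other = counts.iloc[top_n:].sum()
shares = pd.concat([top, pd.Series({"Other": other})])

# Percentages now refer to all attacks, not only the top categories
share_pct = shares / shares.sum() * 100
print(share_pct.round(1))
```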

#### Chart - 5: Relationship Between Target Type & Number of Attacks

In [None]:
# Plot most targeted groups in terrorist attacks
top_targets = df['targtype1_txt'].value_counts().head(10)
plt.figure(figsize=(12, 6))
sns.barplot(x=top_targets.values, y=top_targets.index, palette="coolwarm")
plt.xlabel("Number of Attacks")
plt.ylabel("Target Type")
plt.title("Top 10 Most Targeted Groups in Terrorist Attacks")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** Clearly shows the most affected groups.

##### 2. What is/are the insight(s) found from the chart?

**Answer-** Civilians and government officials are most targeted.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps law enforcement protect vulnerable groups.
2. Negative growth: Instills fear and disrupts governance.

#### Chart - 6: Relationship Between Weapon Type & Number of Attacks

In [None]:
# Show most used weapon types in terrorist attacks
weapon_types = df['weaptype1_txt'].value_counts().head(5)
plt.figure(figsize=(8, 8))
plt.pie(weapon_types, labels=weapon_types.index, autopct='%1.1f%%', colors=['cyan', 'purple', 'orange', 'red', 'green'], wedgeprops={'edgecolor': 'black'})
plt.gca().add_artist(plt.Circle((0, 0), 0.5, color='white'))  # Donut effect
plt.title("Most Common Types of Weapons Used in Terrorism")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** Highlights weapon preferences in attacks.

##### 2. What is/are the insight(s) found from the chart?

**Answer-** Explosives and firearms are the most used.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps in weapon tracking policies.
2. Negative growth: Easy weapon access worsens terrorism.

#### Chart - 7: Pairplot of Year, Fatalities, and Injuries. (Multivariate Analysis)

In [None]:
# Pairplot to analyze relationships between variables
sns.pairplot(df[['iyear', 'nkill', 'nwound']], diag_kind='kde', corner=True)
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** Shows correlations between multiple variables.

##### 2. What is/are the insight(s) found from the chart?

**Answer-** Recent years show higher casualties.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Justifies stronger counter-terrorism policies.
2. Negative growth: Indicates worsening security conditions.

#### Chart - 8: Relationship Between Attack Type, Target Type, and Casualties

In [None]:
!pip install squarify

In [None]:
import squarify

# Sum fatalities (nkill) for each attack type and target type combination
df['attack_target'] = df['attacktype1_txt'] + ' - ' + df['targtype1_txt']
attack_target_counts = df.groupby('attack_target')['nkill'].sum().reset_index()

# Sort by fatalities and pick the top 10 combinations
top_attacks = attack_target_counts.sort_values(by='nkill', ascending=False).head(10)

# Define figure size
plt.figure(figsize=(12, 6))

# Create treemap
squarify.plot(
    sizes=top_attacks['nkill'],
    label=top_attacks['attack_target'],
    alpha=0.7,
    text_kwargs={'fontsize': 10}
)

# Formatting
plt.title("Top 10 Attack Type and Target Combinations by Fatalities", fontsize=14)
plt.axis("off")  # Hide axis

# Show plot
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** A treemap shows the relative share of each attack type and target combination at a glance.

##### 2. What is/are the insight(s) found from the chart?

**Answer-** Bombings targeting civilians account for the highest number of fatalities.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
1. Impact: Helps security agencies focus on high-risk attack types.
2. Negative growth: Leads to increased fear among civilians.

#### Chart - 9: Distribution of Casualties Across Regions

In [None]:
# Box plot to compare casualties across regions.
plt.figure(figsize=(12, 6))
sns.boxplot(x='region_txt', y='total_casualties', data=df)
plt.xticks(rotation=90)
plt.xlabel("Region")
plt.ylabel("Total Casualties")
plt.title("Casualties Distribution Across Regions")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** To compare how different regions are impacted by terrorism.

##### 2. What is/are the insight(s) found from the chart?

**Answer-**
- Some regions have higher median casualties, indicating more severe attacks.
- Outliers show exceptionally high-casualty incidents in certain regions.


##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
- Business Impact:
Helps in designing targeted security strategies.
- Negative Growth Insights?
High-casualty regions may experience lower foreign investments and slower economic growth.

#### Chart - 10: Attack Types vs Regions

In [None]:
# Heatmap of total casualties for each attack type in each region.
plt.figure(figsize=(12, 6))
pivot_table = df.pivot_table(index='attacktype1_txt', columns='region_txt', values='total_casualties', aggfunc='sum')
sns.heatmap(pivot_table, cmap='coolwarm', annot=True, fmt=".0f")
plt.xlabel("Region")
plt.ylabel("Attack Type")
plt.title("Heatmap of Total Casualties by Attack Type and Region")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** To identify which attack types account for the most casualties in each region.

##### 2. What is/are the insight(s) found from the chart?

**Answer-**
- Certain attack types dominate specific regions.
- Bombings/explosions and armed assaults account for high casualty totals in some regions.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
- Business Impact:
Helps governments and organizations focus on high-risk attack types.
- Negative Growth Insights?
If a particular region experiences repeated high-casualty attacks, it may face economic decline.
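
Frequency and severity are different questions: counting attacks and summing casualties can rank the same cells differently. The sketch below contrasts the two on a made-up frame; column names follow the notebook, while the values are hypothetical.

```python
import pandas as pd

# Hypothetical incidents; values are illustrative only
toy = pd.DataFrame({
    "attacktype1_txt": ["Bombing/Explosion", "Bombing/Explosion", "Armed Assault", "Armed Assault"],
    "region_txt": ["South Asia", "South Asia", "South Asia", "Western Europe"],
    "total_casualties": [10, 0, 3, 1],
})

# How often each attack type occurs in each region
attack_counts = pd.crosstab(toy["attacktype1_txt"], toy["region_txt"])

# How many casualties each attack type causes in each region
casualty_sums = toy.pivot_table(
    index="attacktype1_txt",
    columns="region_txt",
    values="total_casualties",
    aggfunc="sum",
    fill_value=0,
)

print(attack_counts)
print(casualty_sums)
```

Either table can be passed to `sns.heatmap`; which one is appropriate depends on whether the question is about how common or how harmful an attack type is.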

#### Chart - 11: Casualties vs Year

In [None]:
# Scatter plot to observe trends in casualties over time.
plt.figure(figsize=(12, 6))
sns.scatterplot(x=df['iyear'], y=df['total_casualties'], alpha=0.5, color='red')
plt.xlabel("Year")
plt.ylabel("Total Casualties")
plt.title("Casualties Over the Years")
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-** To identify trends in casualties across different years.

##### 2. What is/are the insight(s) found from the chart?

**Answer-**
- Some years have significantly higher casualties, indicating periods of heightened terrorist activity.
- A decline in casualties over time could reflect improved counter-terrorism measures, although this chart alone cannot establish the cause.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer-**
- Business Impact:
Countries can use this insight to assess policy effectiveness.
- Negative Growth Insights?
Years with high casualties may cause economic instability and global security concerns.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Selecting only relevant numerical columns (nwound is included because the insights below discuss it)
num_cols = ['iyear', 'nkill', 'nwound', 'property', 'propextent', 'ransompaid', 'total_casualties']

# Compute correlation matrix
corr_matrix = df[num_cols].corr()

# Plot the heatmap with improved readability
plt.figure(figsize=(8, 6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=0.5)
plt.title("Correlation Heatmap of Key Numerical Variables")
plt.show()


##### 1. Why did you pick the specific chart?

**Answer-** A correlation heatmap is a compact way to visualize relationships between multiple numerical variables in a dataset.
Reasons for Choosing It:
- Shows Strength of Relationships → Helps identify strong and weak correlations between variables.
- Quickly Highlights Patterns → Uses color coding to make relationships easy to understand.
- Identifies Redundant Variables → If two variables are highly correlated, one can be removed to avoid redundancy.

##### 2. What is/are the insight(s) found from the chart?

**Answer-**
- nkill (killed) and nwound (wounded) are positively correlated → attacks with more fatalities also tend to cause more injuries.
- total_casualties is the sum of nkill and nwound, so its strong correlation with both is expected by construction.
- Property damage does not always mean high casualties (low correlation).
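
A derived column correlates strongly with the columns it is built from, which is worth separating from correlations between independently recorded variables. This is a minimal sketch on hypothetical values, using the Pearson correlation that `DataFrame.corr` computes by default:

```python
import pandas as pd

# Hypothetical toy data; the real values come from the cleaned GTD DataFrame `df`
toy = pd.DataFrame({
    "nkill": [0, 1, 2, 3, 10],
    "nwound": [0, 2, 3, 5, 30],
    "property": [1, 0, 1, 0, 1],
})
toy["total_casualties"] = toy["nkill"] + toy["nwound"]

corr = toy.corr()  # Pearson correlation by default

# Correlation of the derived column with its own components
print(corr.loc["total_casualties", ["nkill", "nwound"]].round(2))

# Correlation between two independently recorded columns
print(round(corr.loc["nkill", "nwound"], 2))
```

When reading the heatmap, the cells involving the derived total say little on their own; the informative cells are those between independently measured variables.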

#### Chart - 15 - Pair Plot

In [None]:
# Selecting numerical variables for the pair plot
# (attacktype1, targtype1 and weaptype1 are numeric category codes, not measurements)
selected_columns = ['iyear', 'nkill', 'nwound', 'attacktype1', 'targtype1', 'weaptype1']
df_selected = df[selected_columns]

# Creating pair plot (pairplot builds its own figure, so plt.figure() is not needed)
sns.pairplot(df_selected, diag_kind="kde", plot_kws={'alpha':0.5, 's':10})

# Show plot
plt.show()

##### 1. Why did you pick the specific chart?

**Answer-**
1. Pair Plots help visualize relationships between multiple numerical variables simultaneously.
2. Shows scatter plots for variable pairs and distribution of individual variables.
3. Identifies patterns, correlations, and outliers effectively.

##### 2. What is/are the insight(s) found from the chart?

**Answer-**
1. Possible Correlations → If two variables show a linear pattern, they may be correlated.
2. Outliers Detection → Some scatter points may indicate unusual behavior (e.g., extremely high casualties).
3. Distribution Understanding → The diagonal plots (KDE) show how each variable is distributed.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

**Answer-**

#### ***Suggestion to the Client:***
**Enhanced Security Measures:**
- Focus on high-risk regions and time periods (based on EDA findings) to allocate security forces efficiently.
- Implement AI-powered surveillance and predictive analytics to detect potential threats.

**Policy and Law Enforcement:**
- Strengthen policies against frequently used attack types and weapons to minimize damage.
- Collaborate with international intelligence agencies for early threat detection.

**Public Awareness & Emergency Response:**
- Educate people in high-target areas about safety protocols.
- Improve emergency response teams in regions with high casualties.

**Investment in Technology & Cybersecurity:**
- Use machine learning models to predict potential future attacks based on past data.
- Secure digital communication channels to prevent cyber-terrorism threats.

**Strategic Resource Allocation:**
- Deploy resources where maximum damage occurs (based on casualty analysis).
- Increase security at frequently attacked locations like government buildings and public places.

#### ***Business Impact:***
1. Risk Reduction: Proactive security can prevent attacks.
2. Cost Efficiency: Optimized resource allocation reduces unnecessary spending.
3. Public Safety: Improved response time and awareness lower casualties.

# **Conclusion**

The United Nations Global Terrorism Analysis provided valuable insights into the patterns and impacts of terrorism worldwide. Through Exploratory Data Analysis (EDA), we identified key trends in attack types, target groups, casualties, and regional distribution.

**Key Takeaways:**
1. Most Common Attack Type: Bombing/Explosions are the most frequent and deadly.
2. High-Risk Regions: Specific countries and cities face repeated attacks, demanding focused security measures.
3. Casualty Insights: Certain attacks cause significantly higher casualties, requiring better emergency preparedness.
4. Weapon Usage Trends: A few weapon types dominate, indicating a need for strict monitoring and regulation.

By leveraging these findings, organizations and governments can implement targeted security strategies, allocate resources efficiently, and strengthen policies to combat terrorism. Proactive measures based on data-driven insights can significantly reduce the impact of terrorist activities.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***