# **Project Name**    - EDA - Global Terrorism Database (GTD) 



### Author: **Paras Hirapara**






# **Project Summary -**

The Global Terrorism Database (GTD) is a database of terrorist incidents starting in 1970. The list continued through 2017, noting more than 200,000 events. The National Consortium for the Study of Terrorism and Responses to Terrorism (START) at the University of Maryland, College Park in the United States is responsible for maintaining the database. It serves as the foundation for further terrorism-related metrics, including the Institute for Economics and Peace's Global Terrorism Index (GTI). 

In its 2021 version, the GTD lists nearly 200,000 terrorist attacks and bills itself as the "largest comprehensive declassified data base on terrorist occurrences in the world." It records almost 95,000 bombings, more than 15,000 kidnappings, and more than 20,000 murders. The GTD supports several broad conclusions on the types and locations of terrorist strikes. For instance, only about one percent of terrorist attacks in the GTD result in 25 or more fatalities, but these highly lethal attacks killed more than 140,000 people overall between 1970 and 2017. About half of all terrorist attacks in the GTD are non-lethal. The attacks are attributed to more than 2,000 specifically identified perpetrator organisations and more than 700 other generic groupings, such as "Tamil separatists." However, two-thirds of these groups were active for less than a year and carried out fewer than four attacks. Similarly, from 1970 to 2017, only 20 perpetrator groups account for 50% of all incidents for which a perpetrator could be identified.

The Institute for Economics and Peace (IEP) in Australia publishes the Global Terrorism Index (GTI) every year with the goal of thoroughly examining the effects of terrorism on 163 nations, representing 99.7% of the world's population. The GTI takes into consideration terrorist events during the previous five years and defines terrorism as "the threatened or actual use of illegal force and violence by a non-state actor to gain a political, economic, religious, or social goal through fear, coercion, or intimidation." 

The 2019 GTI Report draws some shocking findings, with India's ranking moving from 8th in 2017 and 2018 to 7th in 2019, which may not necessarily reflect a worsening of the nation's overall security condition due to terrorism. India has historically lagged behind countries with a history of armed conflict, such as the Democratic Republic of the Congo, South Sudan, Sudan, Burkina Faso, Palestine, and Lebanon.

The report also draws other striking conclusions, such as the fact that all 10 of the nations with the greatest impact of terrorism, including India, are now involved in at least one armed conflict. To give decision-makers a fuller, more comprehensive picture, data analysis of the GTD may also offer a short- and long-term prognosis of terrorist attacks globally and regionally, coupled with additional analysis of prior trends.

# **GitHub Link -**

https://github.com/parashirapara/Paras_Global_Terrorism_Database_-GTD-.git

# **Problem Statement**


In the START terrorism dataset, incidents rarely recur at the same geolocation, and most of them are irregular or infrequent. Consequently, it is challenging to make quantitative projections from similar events, and different categorization methods yield different outcomes.

The fact that various studies reach different conclusions presents another significant challenge when working with this dataset. Current flaws and restrictions in data gathering methods, disagreements over definitions, irregularities in coding and analysis lead to disagreements among researchers, which in turn invalidates their conclusions.

Definitional disputes surrounding terrorist incidents also hold back the development of this field. Experts and relevant authorities need to reach a consensus on the standard norms and procedures for what counts as legitimate information on terrorism, so that suitable research can be built on it.

Thus, the research and analysis conducted for this project were highly influenced by the data from the Global Terrorism Dataset, and our findings may not be comparable to those of other studies due to the variety of sources we used.

#### **Define Your Business Objective?**

This project's implementation includes system design, backend design, graphic design, and user interface design.

(1). Perform ‘Exploratory Data Analysis’ on dataset ‘Global Terrorism’

(2). As a security/defense analyst, try to find out the hot zone of terrorism.

(3). What security issues and insights can you derive from EDA?

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required. 
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits. 
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the visualization in a structured way while following the "UBM" rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries

import pandas as pd
import numpy as np
import missingno as msno
import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt

### Dataset Loading

In [None]:
# Load Dataset

from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Load File Path for Read

file_path = '/content/drive/MyDrive/Data Science/Capston Project 1/Global Terrorism Data.csv'
Data = pd.read_csv(file_path, encoding='ISO-8859-1')

### Dataset First View

In [None]:
# Dataset First Look

Data.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count

num_rows = Data.shape[0]
num_columns = Data.shape[1]

print(f'The dataset has {num_rows} rows and {num_columns} columns.')

### Dataset Information

In [None]:
# Dataset Information

# Dataset Information for year
Data['iyear'].info()

# Dataset Information for approxdate
Data['approxdate'].info()

# Dataset Information for country_txt
Data['country_txt'].info()

# Dataset Information for latitude
Data['latitude'].info()

# All Dataset Information
Data.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count

duplicates = Data.duplicated().sum()

# Print the result

print("Number of duplicate values:", duplicates)

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count

missing_values_count = Data.isnull().sum()

print(missing_values_count)

In [None]:
# Visualizing the missing values

msno.bar(Data)

# Each bar corresponds to one column of the dataset.
# The 0.0 to 1.0 scale shows data completeness, where 1.0 means 100% of the values are present.
# The count scale and labels show the number of non-null values in each column.

### What did you know about your dataset?

The dataset covers terrorism across the world. It has 181,691 rows and 135 columns, of which 55 are float, 22 are integer, and 58 are object columns. There are no duplicate rows, but there are many null values, so we used the missingno library to plot them for a better understanding of where values are missing.
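The checks above (shape, dtypes, duplicates, nulls) can be gathered into one small helper. This is a minimal sketch: the `overview` function and the tiny `sample` frame are hypothetical and the values are made up, so the printed numbers do not describe the real GTD file.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame standing in for the GTD data (values are made up).
sample = pd.DataFrame({
    "iyear": [1970, 1971, 1971, 1972],
    "country_txt": ["Iraq", "India", "India", None],
    "latitude": [33.3, np.nan, 28.6, np.nan],
})

def overview(df: pd.DataFrame) -> dict:
    """Summarize shape, dtype counts, duplicate rows and missing-value share."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # How many columns there are of each dtype
        "dtype_counts": df.dtypes.astype(str).value_counts().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        # Percentage of missing values per column
        "missing_pct": (df.isnull().mean() * 100).round(1).to_dict(),
    }

summary = overview(sample)
print(summary)
```

Calling `overview(Data)` on the full dataset should give the same kind of summary in one place, which may be easier to scan than several separate cells.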

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns

x = list(Data.columns)

print(x)

In [None]:
# iyear Dataset Describe

Data['iyear'].describe()

In [None]:
# extended Dataset Describe

Data['extended'].describe()

In [None]:
# region Dataset Describe

Data['region'].describe()

In [None]:
# All Dataset Description

Data.describe()

### Variables Description 

**attacktype1_txt** = The type of attack. It consists of categories such as bombing/explosion, armed assault, assassination, kidnapping, and unarmed assault.

**targtype1_txt** = Type of target involved in the attack. It consists of categorical values such as private citizens, military, police, government officials, transportation, educational institutions, religious institutions, airports, etc.

**success** = ‘1’ if attack was a success. ‘0’ if attack was a failure.

**multiple** = ‘1’ if the attack is part of a multiple incident. ‘0’ if it is not.

**natlty1** = Nationality of the target/victim.

**weaptype1** = Type of weapon used in the attack. It contains values such as firearms, explosives, melee, vehicles, etc.

**nkill** = Number of people killed in any event.

**nwound** = Number of people wounded in any event.

**region_txt** = Name of the region where the attack happened. Region_txt consists values like East Asia, South Asia, Western Europe, etc.

**longitude** = Longitude of the location.

**latitude** = Latitude of the location.

**property** = ‘1’ if the attack caused property damage. ‘0’ if it did not. ‘-9’ if unknown.

**suicide** = ‘1’ if attack was a suicide attempt. ‘0’ if attack was not a suicide attempt.

**motive** = Known motive of the attacker.

**age** = Estimated age of the attacker.

**iday, imonth, iyear** = Calendar details of the event.


### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.

# Apply the unique() method to all columns
unique_values = Data.apply(lambda x: x.unique())

# Print the unique values
print(unique_values)

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

# Renaming Columns

rename_data = Data.rename(columns={"nkill":"Killed","nwound":"Wounded","attacktype1":"Attacktype","eventid":"Eventid","iyear":"Year","imonth":"Month","iday":"Day","country_txt":"Country","region_txt":"Region","city":"City","attacktype1_txt":"Attack_Type","targtype1_txt":"Target_Type","target1":"Target","extended":"Extended","natlty1_txt":"Nationality","gname":"Group_Name","ishostkid":"Kidnapped","dbsource":"Source","weaptype1_txt":"Weapon_Type","guncertain1":"Gang_Sus",'motive':'Motive','latitude':'Latitude','longitude':'Longitude','nhostkid':'Hostages','nperps':'Number of Perpetrators'})

rename_data.head()


In [None]:
# Values 0, 1 and -9 represent "No", "Yes" and "Unknown" respectively, so we relabel them

change_values = ["crit1","crit2","crit3","doubtterr","multiple","suicide","property","Kidnapped","INT_LOG","INT_IDEO","INT_MISC"]
for col in change_values:
    # Assign the result back instead of a chained in-place replace, which may not update the DataFrame
    rename_data[col] = rename_data[col].replace({0: "No", 1: "Yes", -9: "Unknown"})

rename_data.head()

In [None]:
# Select only the columns useful for our analysis; the other columns are dropped

rename_data = rename_data[['Wounded','Killed','Attacktype','Eventid','Year','Month','Day','Country','Region','City','Attack_Type','Target_Type','Target','Extended','Nationality','Group_Name','Kidnapped','Source','Weapon_Type','Motive','Hostages']]
rename_data.head()


### What all manipulations have you done and insights you found?

Here,

First, we renamed the columns we need.

Second, we replaced the values 0, 1 and -9 with No, Yes and Unknown in the selected columns for better understanding.

Third, we eliminated unnecessary columns and kept only the columns used in the analysis.
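The three steps can also be sketched as a single function that returns a new frame and leaves its input untouched. The `wrangle` helper, the `FLAG_LABELS` mapping and the small `raw` frame below are hypothetical illustrations, not part of the notebook's code; only a few of the real columns are used to keep the sketch short.

```python
import pandas as pd

# Hypothetical rows; the column names follow the GTD fields used above.
raw = pd.DataFrame({
    "iyear": [2001, 2014, 2016],
    "nkill": [3.0, 0.0, 12.0],
    "suicide": [0, 1, 0],
    "ishostkid": [0.0, -9.0, 1.0],
    "extra_col": ["a", "b", "c"],
})

FLAG_LABELS = {0: "No", 1: "Yes", -9: "Unknown"}

def wrangle(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, recode flag columns and keep a subset, without modifying `df`."""
    out = df.rename(columns={"iyear": "Year", "nkill": "Killed", "ishostkid": "Kidnapped"})
    for col in ["suicide", "Kidnapped"]:
        # Assigning the result back avoids relying on chained in-place replacement.
        out[col] = out[col].replace(FLAG_LABELS)
    # Keep only the columns needed for the analysis
    return out[["Year", "Killed", "suicide", "Kidnapped"]]

clean = wrangle(raw)
print(clean)
```

Keeping the raw frame unchanged makes it possible to rerun the wrangling cell without reloading the CSV.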

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 Terrorism Transition in Regions

pd.crosstab (rename_data.Year, rename_data.Region).plot(kind='area', figsize=(15,5))
plt.title('Terrorism Transition in Regions')
plt.ylabel('Number of Attacks')
plt.xlabel('Year')
plt.show()

##### 1. Why did you pick the specific chart?

Here I chose an area chart because it clearly represents the relationship between numerical and non-numerical values. In this graph I present the relationship between region and the number of terror attacks over time.

##### 2. What is/are the insight(s) found from the chart?

Figure gives more information about how terrorism has changed over time in different areas. 

By examining this pattern, we can see that the graph has clearly risen and fallen in all areas.

We can see that South America has made little to no contribution to the present terrorism trend. Terrorism affected South America from the early 1980s to the mid-1990s, and since then there have been few attacks. Therefore, although South America has a higher overall attack count than many other regions, it does not significantly contribute to the present global terrorism scenario. The Middle East and North Africa region is an exception to this rule.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The figure highlights the fact that almost every region has experienced a rise and fall. Therefore, although terrorism has been prevalent throughout the period, no region has shown a consistent history of involvement with terrorism.
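To compare how much each region contributes in a given year, the year-by-region table behind the area chart can be normalized per row. The sketch below uses a hypothetical `events` frame with made-up rows; `pd.crosstab` and its `normalize="index"` option are standard pandas.

```python
import pandas as pd

# Hypothetical incident records (one row per attack); values are illustrative only.
events = pd.DataFrame({
    "Year": [1985, 1985, 1985, 2014, 2014, 2014, 2014],
    "Region": ["South America", "South America", "Western Europe",
               "Middle East & North Africa", "Middle East & North Africa",
               "South Asia", "South America"],
})

# Rows are years, columns are regions, cells are attack counts.
counts = pd.crosstab(events["Year"], events["Region"])

# normalize="index" turns each row into shares that sum to 1, which makes
# a region's changing contribution easier to compare across years.
shares = pd.crosstab(events["Year"], events["Region"], normalize="index")

print(counts)
print(shares.round(2))
```

Plotting `shares` instead of `counts` with `kind='area'` would show relative contribution rather than absolute volume, which is one way to check the claim about a region's role in the recent trend.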

#### Chart - 2

In [None]:
# Chart - 2 Number Of Terrorist Activities Each Year

plt.subplots(figsize=(13,6))
sns.countplot(x='Year',data=rename_data)
plt.xticks(rotation=90)
plt.title('Number Of Terrorist Activities Each Year')
plt.show()

##### 1. Why did you pick the specific chart?

Here I use a count plot to gain insight into the evolution of terrorism and the pace at which it affects the world each year, by counting all terrorist attacks over time.

##### 2. What is/are the insight(s) found from the chart?

Figure depicts statistics for the total number of attacks that occurred each year from 1970 to 2017. 

The 1970s saw a relatively low amount of terrorist attacks. Then, after a moderate rise in the 1980s and early 1990s and a significant decline in the following decade, terrorism began to rise again in the early 2000s and peaked like never before in history.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The rise in attacks over the past few years has contributed to a more hostile atmosphere and raised tension worldwide. This finding can aid in the investigation of the factors behind the sudden increase in attack frequency.
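The decade-level pattern described above can be checked numerically by grouping years into decades. This is a small sketch on a hypothetical `years` series with made-up values; integer division by 10 is just one simple way to form decade buckets.

```python
import pandas as pd

# Hypothetical attack years, one entry per incident (values are made up).
years = pd.Series([1972, 1975, 1988, 1989, 1991, 2003, 2014, 2014, 2014, 2016], name="Year")

# Attacks per year, in chronological order
per_year = years.value_counts().sort_index()

# Map each year to the start of its decade, e.g. 1988 -> 1980
decade = (years // 10) * 10
per_decade = decade.value_counts().sort_index()

# Decade with the largest number of attacks
peak_decade = int(per_decade.idxmax())

print(per_decade)
print("Peak decade:", peak_decade)
```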

#### Chart - 3

In [None]:
# Chart - 3 Number of Terrorist Activities in Each Region

region_attacks = rename_data.Region.value_counts().to_frame().reset_index()
region_attacks.columns = ['Region', 'Total Attacks']
plt.subplots(figsize=(15,5))
sns.barplot(x=region_attacks.Region, y=region_attacks['Total Attacks'])
plt.xticks(rotation=90)
plt.title('Number of Terrorist Activities in Each Region')
plt.show()

##### 1. Why did you pick the specific chart?

I use a bar plot because the dataset is large and the attacks cannot be counted one by one for each region. The plot shows the total count for each region name in the dataset. It is an easy way to represent numerical data.

##### 2. What is/are the insight(s) found from the chart?

As shown in the figure, nations have been divided into twelve regions based on their geographic locations in order to assess the prevalence of terrorism in each of them.

The greatest number of attacks occur in the Middle East and North Africa, followed by South Asia and South America. The spread of terrorism is not uniform across regions.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Different levels of focus are needed for each area depending on the quantity of attacks.

The most affected regions, such as the Middle East & North Africa, South Asia, South America, Sub-Saharan Africa, Southeast Asia, and Western Europe, need to take solid steps to tackle terror attacks.

#### Chart - 4

In [None]:
# Chart - 4 Preferable Attacking Methods by Terrorist

plt.subplots(figsize=(13,6))
sns.countplot(y='Attack_Type',data=rename_data)
plt.title('Preferable Attacking Methods by Terrorist')
plt.show()

##### 1. Why did you pick the specific chart?

Here, I use a count plot because it clearly shows each attack type category and the total number of attacks of that type.

##### 2. What is/are the insight(s) found from the chart?

Attackers have employed a variety of tools and tactics. The attack type variable has 8 named categories: barricade incidents, unarmed assaults, facility/infrastructure attacks, kidnappings, hijackings, bombings/explosions, armed assaults, and assassinations. These categories clarify which types of attacks are deployed most frequently.

The figure also hints at the attacker's possible objective or area of focus. For instance, unarmed assaults typically target a single target or a small group of targets. A ransom is typically demanded in a hijacking.

The most frequent type of attack is bombing/explosion, followed by armed assaults, assassinations, hostage takings, and so forth. Here, the number of bombing/explosion attacks is nearly twice as high as that of the next most frequent type, armed assault.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

This finding suggests that the majority of assaults targeted citizens in an effort to terrorise a large number of targets.
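A statement such as "nearly twice as high" can be verified directly from the category counts rather than read off the chart. The sketch below uses a hypothetical `attack_types` series with made-up frequencies, so the ratio it prints is illustrative only.

```python
import pandas as pd

# Hypothetical attack-type labels, one per incident (frequencies are made up).
attack_types = pd.Series(
    ["Bombing/Explosion"] * 6 + ["Armed Assault"] * 3 + ["Assassination"] * 2 + ["Hijacking"],
    name="Attack_Type",
)

# value_counts() sorts categories from most to least frequent
ranked = attack_types.value_counts()

# Ratio between the most frequent and the second most frequent category
top_two_ratio = ranked.iloc[0] / ranked.iloc[1]

print(ranked)
print(f"{ranked.index[0]} occurs {top_two_ratio:.1f}x as often as {ranked.index[1]}")
```

On the notebook's data the same two lines could be applied to `rename_data['Attack_Type']`.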

#### Chart - 5

In [None]:
# Chart - 5 Top Affected Countries

plt.subplots(figsize=(13,45))
sns.countplot(y='country_txt',data=Data)
plt.title('Top Affected Countries')
plt.ylabel('Country Name')
plt.show()

##### 1. Why did you pick the specific chart?

Here, I use a count plot because it clearly shows every country and the total number of attacks in each.

##### 2. What is/are the insight(s) found from the chart?

Based on the overall number of attacks, the figure reveals that some of the most impacted nations are Iraq, Pakistan, Afghanistan, and India.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The dataset shows that some countries are more prone to violent actions and to differences in ideology, which can lead to extreme terrorism.

#### Chart - 6

In [None]:
# Chart - 6 Total Number Of Terrorist Activities in each region by month

# Replace zero values in month with Unknown
# (assign the result back instead of a chained in-place replace)

rename_data["Month"] = rename_data["Month"].replace({0: "Unknown"})

# Number Of Terrorist Activities in each region by month

plt.subplots(figsize=(20,5))
sns.countplot(x='Region', data=rename_data, edgecolor=sns.color_palette('dark',16),hue ='Month')
plt.xticks(rotation=30)
plt.title('Number Of Terrorist Activities in each region by month')
plt.show()

##### 1. Why did you pick the specific chart?

I use a count plot to understand the month-wise distribution of terrorist attacks in all regions.

##### 2. What is/are the insight(s) found from the chart?

The graph clearly shows that the Middle East and North Africa region is the most affected by terror activities, with a peak during June and July.

Other regions follow a similar trend: attacks peak in the middle of the year and decline during the last month.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Based on this analysis, all regions need to focus on the middle of the year to tackle terror activities.
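The peak month per region can also be computed instead of read from the grouped bars. This is a minimal sketch that assumes `Month` is still numeric with 0 meaning unknown (that is, before the relabelling to "Unknown"); the `attacks` frame is hypothetical and its values are made up.

```python
import pandas as pd

# Hypothetical incidents; Month 0 stands for an unknown month.
attacks = pd.DataFrame({
    "Region": ["South Asia"] * 4 + ["Western Europe"] * 4,
    "Month": [5, 5, 7, 12, 0, 1, 6, 6],
})

# Keep only rows with a known calendar month
known = attacks[attacks["Month"].between(1, 12)]

# Rows are regions, columns are months, cells are attack counts
by_month = pd.crosstab(known["Region"], known["Month"])

# For each region, the month with the highest count
peak_month = by_month.idxmax(axis=1)

print(by_month)
print(peak_month)
```

Note that `idxmax` returns only the first month when several months tie, so ties would need a separate check.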

#### Chart - 7

In [None]:
# Chart - 7 Preferable Location For Attack

plt.subplots(figsize=(13,6))
sns.countplot(y='Target_Type',data=rename_data)
plt.title('Preferable Location For Attack')
plt.show()

##### 1. Why did you pick the specific chart?

A count plot clearly shows the number of attacks for each target type.

##### 2. What is/are the insight(s) found from the chart?

Perpetrators always attempt to make a statement through their choice of victims. Understanding the type of target helps us comprehend their goal and probably their motivations. Terrorism is fuelled by an ideology that seeks to alter the status quo or enforce its own. Analyzing the target types that are most frequently attacked will reveal the attackers' goals and the nature of terrorism in general.

There are over a hundred different target types, grouped into 22 categories. The figure demonstrates that the most frequent targets are private citizens, the military, the government, and the police. This graph illustrates how terrorist organisations or individuals oppose national or state power.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Their primary goal is to either influence political change or impose their beliefs on government officials by retaliating against them.

#### Chart - 8

In [None]:
# Count of kidnapping values from the dataset

Kidnapped_value = rename_data['Kidnapped'].value_counts()
print(Kidnapped_value)

In [None]:
# Chart - 8 Terrorist preference: Kidnapping or Not

plt.subplots(figsize=(13,6))
sns.countplot(y='Kidnapped',data=rename_data)
plt.title('Terrorist preference for Kidnapping')
plt.show()


##### 1. Why did you pick the specific chart?

Here, I need to know whether terrorists prefer kidnapping or not, so I chose this chart format.

##### 2. What is/are the insight(s) found from the chart?

Hostage taking is an act whose main goal is to seize control of hostages with the intention of accomplishing a political goal by interfering with regular operations.

Based on this observation, terrorists do not often resort to kidnapping. Some attacks do involve kidnapping, but they are a small proportion compared with those that do not.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

This insight will be helpful for understanding the mindset of terrorists and how they try to accomplish a political goal through concessions or by interfering with regular operations.


1.   **Insights that lead to negative growth**

As per the above data insights, the number of hostage-taking incidents is not particularly high, but it still suggests that anti-terrorism agencies need to improve their leads and operations to neutralize hostage-taking/kidnapping acts in order to prevent disruption of regular operations and loss of life.

#### Chart - 9

In [None]:
# Chart - 9 Preferable Weapon Type

plt.subplots(figsize=(13,6))
sns.countplot(y='Weapon_Type',data=rename_data)
plt.title('Preferable Weapon Type')
plt.show()

##### 1. Why did you pick the specific chart?

A count plot easily presents the total number of attacks for each weapon type.

##### 2. What is/are the insight(s) found from the chart?

Attackers have utilised a range of tools and techniques. The weapon type variable has 12 categories of weapons used in attacks, including explosives, incendiary, firearms, chemical, melee, sabotage equipment, vehicle, fake weapons, radiological, biological, and other weapon types. These categories clarify which weapon types are most frequently employed.

The most frequent are explosives, followed by firearms, incendiary weapons, and so forth. Here, the number of attacks using explosives is higher than that of any other weapon type.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

This finding suggests that the majority of assaults targeted citizens in an effort to terrorise a large number of targets.
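A count plot tallies incidents, not fatalities, so the two measures are worth computing side by side before drawing conclusions about deaths. The sketch below is one possible way to do that with a grouped aggregation; the `incidents` frame is hypothetical and its values are made up.

```python
import pandas as pd

# Hypothetical incidents; a missing Killed value means the death toll is unknown.
incidents = pd.DataFrame({
    "Weapon_Type": ["Explosives", "Explosives", "Explosives", "Firearms", "Firearms", "Incendiary"],
    "Killed": [1.0, 0.0, None, 4.0, 6.0, 0.0],
})

# "size" counts every incident (including unknown death tolls),
# while "sum" adds up the known fatalities.
summary = incidents.groupby("Weapon_Type").agg(
    attacks=("Killed", "size"),
    deaths=("Killed", "sum"),
).sort_values("attacks", ascending=False)

print(summary)
```

In this made-up sample the most frequent weapon type is not the deadliest one, which illustrates why the ranking by attacks and the ranking by deaths should be reported separately.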

#### Chart - 10

In [None]:
# Chart - 10 Number of People Killed by Active Terrorist Groups

# Total number of people killed by each terrorist group (top 20)

group_killed= rename_data[['Group_Name','Killed']].groupby(['Group_Name']).sum().sort_values('Killed', ascending=False).head(20)
group_killed

# People Killed by each group in terrorist activity

group_killed.plot(kind = "bar",figsize = (12,6))
plt.title('Number of People Killed by Active Terrorist Groups')
plt.xlabel('Active Terrorist Groups')
plt.ylabel('Number of people killed')
plt.xticks(rotation= 90)
plt.show()

##### 1. Why did you pick the specific chart?

The bar chart clearly shows the terror groups and the total number of people each has killed, which is why I chose it.

##### 2. What is/are the insight(s) found from the chart?

Some of the most infamous terrorist organisations and the years they were active are shown in Figure. Organisations like the Shining Path and the Farabundo Marti National Liberation Front were very busy, but after that, there was no sign of them in the years that followed. 

The main drivers of the recent increase in attacks, particularly in the Middle East and North Africa, are the Unknown group, the Taliban, and ISIL. There are more active terrorist organisations spreading fear and bloodshed today than ever before. Almost all groups experience numerous peaks and valleys over the course of their life.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

However, new organisations emerge with various motivations and work to disseminate their ideologies. As a result, it is necessary to use various tactics when dealing with various populations. Different factions cannot be served by the same tactics that have been employed in the past.
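Because "Unknown" is a placeholder rather than a real organisation, it can be useful to rank groups with that label filtered out. This is a small sketch on a hypothetical `records` frame; "Group A" and "Group B" are invented names and the death tolls are made up.

```python
import pandas as pd

# Hypothetical incidents; "Unknown" marks attacks with no identified perpetrator.
records = pd.DataFrame({
    "Group_Name": ["Unknown", "Unknown", "Group A", "Group B", "Group A"],
    "Killed": [50.0, 20.0, 10.0, 7.0, 5.0],
})

# Drop unattributed incidents before ranking
known = records[records["Group_Name"] != "Unknown"]

# Total deaths per identified group, largest first
deaths_by_group = (
    known.groupby("Group_Name")["Killed"].sum().sort_values(ascending=False)
)

print(deaths_by_group)
```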

#### Chart - 11

In [None]:
# Chart - 11 Attacks to Kill Comparison

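# Attack counts for the 15 most attacked countries; name the column so the legend is readable
coun_terror = rename_data['Country'].value_counts()[:15].to_frame(name='Attacks')
# Total number of people killed per country
coun_kill = rename_data.groupby('Country')['Killed'].sum()
coun_terror.merge(coun_kill, left_index=True, right_index=True, how='left').plot.bar(width=0.9)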
fig = plt.gcf()
fig.set_size_inches(15,5)
plt.title('Attacks to Kill Comparison')
plt.show()

##### 1. Why did you pick the specific chart?

The graph shows the ratio between attacks and deaths in each country in an easily understandable way.

##### 2. What is/are the insight(s) found from the chart?

The kills-to-attacks ratio for the worst-affected nations can be examined in the figure. That ratio is extremely high for Iraq, at over 3 kills per attack on average. The Philippines, Peru, and the United Kingdom have experienced roughly the same number of attacks but noticeably different numbers of fatalities.



##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

As a consequence of this association, different counterterrorism tactics in various nations may produce a different number of fatalities for a given number of attacks. When developing new counterterrorism strategies, this international contrast can be taken into account.
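The kills-per-attack ratio discussed above can be computed explicitly rather than estimated from bar heights. The following is a sketch under stated assumptions: the `data` frame is hypothetical, its values are made up, and the column names `Attacks` and `Kills_per_Attack` are introduced only for this example.

```python
import pandas as pd

# Hypothetical incidents; a missing Killed value means the death toll is unknown.
data = pd.DataFrame({
    "Country": ["Iraq", "Iraq", "Peru", "Peru", "Peru", "Peru"],
    "Killed": [5.0, 1.0, 2.0, 0.0, None, 0.0],
})

# Number of attacks per country
attacks = data["Country"].value_counts().rename("Attacks")

# Total known fatalities per country (missing values are skipped by sum)
killed = data.groupby("Country")["Killed"].sum()

# Align both series on the country index and derive the ratio
stats = pd.concat([attacks, killed], axis=1)
stats["Kills_per_Attack"] = stats["Killed"] / stats["Attacks"]

print(stats)
```

Sorting `stats` by `Kills_per_Attack` would rank countries by lethality per incident instead of by volume.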

#### Chart - 12

In [None]:
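# Chart - 12 Top 10 cities affected by terrorist attacks

# Exclude the 'Unknown' placeholder explicitly instead of skipping the first row by position
Cities = rename_data.loc[rename_data['City'] != 'Unknown', 'City'].value_counts()[:10]

Cities.plot(kind = "bar",figsize = (17,5))
plt.title("Top 10 cities affected by terrorist attacks",fontsize = 13)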
plt.xlabel("Cities",fontsize=13)
plt.xticks(fontsize = 12)
plt.ylabel("Number of Attacks",fontsize = 13)
plt.show()

##### 1. Why did you pick the specific chart?

A bar chart easily represents the total number of attacks for each city.

##### 2. What is/are the insight(s) found from the chart?

Here, I present the top ten cities affected by terrorist attacks, of which Baghdad, Karachi, and Lima are the three most affected.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

Baghdad shows approximately 2.5 times more terror attacks than the other cities, which indicates its weakness in tackling terrorist attacks.

#### Chart - 13

In [None]:
# Chart - 13 Killed in each City

kill = rename_data[["City","Killed"]].groupby("City").sum().sort_values(by="Killed",ascending=False).drop("Unknown")

# Total Kill city wise

fig = plt.figure()
ax0=fig.add_subplot(1,2,1)
kill[:10].plot(kind="bar",color="cornflowerblue",figsize=(30,5),ax=ax0)
ax0.set_title("People Killed in each City")
ax0.set_xlabel("Cities")
ax0.set_ylabel("Number of People Killed")

##### 1. Why did you pick the specific chart?

I use a bar plot because the dataset is large and the deaths cannot be counted one by one for each city. The plot shows the total for each city name in the dataset. It is an easy way to represent numerical data.

##### 2. What is/are the insight(s) found from the chart?

As in the previous chart, Baghdad is the most affected city, and it also ranks first in the total number of people killed, followed by Mosul, Mogadishu, Karachi, New York City, etc.

The insight drawn here is that these 10 cities have the highest number of killings by terrorists, which indicates that they require more protection and security to halt terrorist activity, along with more resources and strategies to build a secure ecosystem that protects the day-to-day operations of stakeholders and keeps the resident population safe.

##### 3. Will the gained insights help creating a positive business impact? 
Are there any insights that lead to negative growth? Justify with specific reason.

The insight gained from the above chart is the need to build a strong security network and to create awareness among people against terrorism. It also points us toward investigating why terrorist attacks and killings are so high in these cities: whether it is because of wrong government policies, unemployment, or lack of education.

This insight will help to identify the areas in which authorities/government can build and frame policies accordingly.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

rename = Data[['iyear','imonth','iday','nkill','attacktype1','targtype1','weaptype1','success','alternative','individual','natlty1','nkillus','nkillter']]

corr_matrix = rename.corr()
plt.figure(figsize=[15,6])
sns.heatmap(corr_matrix, annot=True)
plt.show()

##### 1. Why did you pick the specific chart?

This type of graph gives correlation between two or more number of variables.

##### 2. What is/are the insight(s) found from the chart?

One important trend about the character of terrorism can be found by looking for dependencies between the different factors in the dataset. We have chosen 13 of the most important factors out of a total of 135 for this chart. These include the day, month, year, number of kills, attack success, attack type, target type, and weapon type, among others.

This finding demonstrates that most assaults are carried out by nationals of the target nation. Given that the proportion of foreign terrorism is considerably lower than that of domestic terrorism, this relationship offers an intriguing window into how to view the phenomenon. As the attack type is determined by the weaponry used in that event, the weapon used in the attack and attack have a close relationship as well.

A darker shade on the block denotes a negative relationship between year and success. As a result, the likelihood of any attack succeeding has decreased over time. This is a notable finding: counterterrorism forces can now limit the likelihood that an attack will succeed more effectively than they could previously.
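A single cell of the heatmap, such as the year and success pair, can be read out as a number to confirm its sign and strength. The sketch below uses a hypothetical `sample` frame with made-up values; `DataFrame.corr()` computes Pearson correlation by default. Keep in mind that coded categorical columns such as `attacktype1` are labels rather than quantities, so their correlation coefficients should be interpreted with caution.

```python
import pandas as pd

# Hypothetical incidents (values are made up for illustration).
sample = pd.DataFrame({
    "iyear":   [1970, 1980, 1990, 2000, 2010],
    "success": [1, 1, 1, 0, 0],
    "nkill":   [0.0, 2.0, 1.0, 3.0, 5.0],
})

# Pearson correlation matrix of all numeric columns
corr = sample.corr()

# Pick out one pair instead of reading the colour from the heatmap
year_success = corr.loc["iyear", "success"]

print(corr.round(2))
print("iyear vs success:", round(year_success, 3))
```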

#### Chart - 15 - Pair Plot 

In [None]:
# Pair Plot visualization code

data = Data[['region','attacktype1','targtype1','alternative']][0:101]

sns.pairplot(data, hue="region", diag_kind="hist", kind="hist", height = 2)

##### 1. Why did you pick the specific chart?

A pair plot helps reveal relationships between variables in a dataset, which makes it a popular choice for data analysis.

This plot allows us to simultaneously visualize the relationships between multiple variables in a dataset, making it easier to identify patterns and correlations between them.

##### 2. What is/are the insight(s) found from the chart?

The pair plot shows a positive correlation among the selected variables. As per the result, incidents classified as other crime types mostly use bombing/explosion against the business category.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ? 
This study can be used to comprehend terrorism and its character by academic scholars, international groups holding global events, foreign investors, security-related policy makers, and curious civilians.

Following is a list of directions which can enhance the quality and quantity of this current project work,

1. Improve dataset quality
2. Prediction
3. Enhance work for accurate data collection
4. Connections with other datasets

# **Conclusion**


The objective of this project was to create an instrument that helps users comprehend and interpret the nature of terrorism. Users can explore the START dataset through graphical patterns.

A dynamic way to examine this dataset is provided by visualisations that can be used to determine the total number of attacks, total death tolls, and locations based on the chosen region and year.

Through the use of visual analysis and the accompanying description, users can comprehend various patterns, trends, and correlations in terrorism.

Users can combine this tool with the START dataset and other terrorism-related sources to conduct further study.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***