![title](Capture.PNG)

<h1>Author: Achal Mate</h1>

    

<h1>Task: Exploratory Data Analysis - Terrorism</h1>

<h2>Objectives</h2><h3>
● Perform exploratory data analysis (EDA) on the ‘Global Terrorism’ dataset.<br>

● As a security/defense analyst, try to identify the hot zones of terrorism.<br>

● What security issues and insights can you derive from EDA?</h3>

<h3>Import All Necessary Libraries</h3>

In [None]:
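# Install folium first so that the import below succeeds.
# (`py` is the Windows Python launcher; adjust the command on other platforms.)
!py -m pip install folium
import warnings

import folium
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import seaborn as sns
from folium.plugins import MarkerCluster

%matplotlib inline
warnings.filterwarnings('ignore')  # suppress library warnings in the output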

<h3>Read the DataSet</h3>

In [None]:
df_globalterrorism = pd.read_csv("globalterrorism_data.csv",encoding='ISO-8859-1')

In [None]:
df_globalterrorism.head()

<h3>Dimensions of the Dataset</h3>

In [None]:
df_globalterrorism.shape

<h3>Columns of the dataset</h3>

In [None]:
df_globalterrorism.columns

<h3>Concise Summary</h3>

In [None]:
df_globalterrorism.info()

<h3>Cleaning or Pre-Processing of data<br>Selection of required columns</h3>

In [None]:
df_globalterrorism.rename(columns={'iyear':'Year','imonth':'Month','iday':'Day','extended':'Extended','country_txt':'Country',
                                   'provstate':'state','region_txt':'Region','attacktype1_txt':'AttackType','target1':'Target',
                                   'nkill':'Killed','city':'City','latitude':'Latitude','longitude':'Longitude',    
                                    'property':'Property','nwound':'Wounded','summary':'Summary','gname':'Group',
                                   'targtype1_txt':'Target_type','multiple':'Multiple','success':'Success',      
                                    'suicide':'Suicide','nperps':'Nperps' ,'claimed':'Claimed'  ,'nkillter':'Killer',
                                   'weaptype1_txt':'Weapon_type','motive':'Motive'},inplace = True)

In [None]:
df_globalterrorism=df_globalterrorism[['Year','Month','Day','Extended','Country','state',
                                       'Region','AttackType','Target','Killed','City','Latitude','Longitude',    
                                       'Property','Wounded','Summary','Group','Target_type','Multiple','Success',                  
                                        'Suicide','Nperps' ,'Claimed'  ,'Killer','Weapon_type','Motive']]

In [None]:
df_globalterrorism.head()# First 5 rows 

<h3> Statistical Summary of data</h3>

In [None]:
df_globalterrorism.describe()

<h3>Check the Missing Data</h3>

In [None]:
df_globalterrorism.isnull().sum()
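Raw counts are hard to compare across columns, so it can help to view missing values as a share of all rows. The sketch below uses a small made-up frame (`sample` is hypothetical, not part of the dataset) to show the idea; the same expression should also work on `df_globalterrorism`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
sample = pd.DataFrame({
    "Killed": [1.0, np.nan, 0.0, 3.0],
    "Wounded": [np.nan, np.nan, 2.0, 5.0],
    "Country": ["Iraq", "Peru", "India", "Iraq"],
})

# isnull().mean() gives the fraction of missing rows in each column.
missing_pct = (sample.isnull().mean() * 100).sort_values(ascending=False)
print(missing_pct)
```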

<h3>Remove rows with missing coordinates</h3>

In [None]:
# Keep only rows that have both a latitude and a longitude.
df_globalterrorism = df_globalterrorism.dropna(subset=['Latitude', 'Longitude'])

<h3>Fill the null values</h3>

In [None]:
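# fillna returns a new DataFrame, so assign the result back to keep the change.
df_globalterrorism = df_globalterrorism.fillna(0)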
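Note that `fillna` returns a new DataFrame and leaves the original untouched unless the result is assigned. A minimal sketch on made-up data (the `sample` frame and the fill values are assumptions for illustration) also shows how a dictionary can give numeric and text columns different defaults:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one numeric and one text column.
sample = pd.DataFrame({
    "Killed": [2.0, np.nan],
    "Motive": [None, "Unknown motive"],
})

# Calling fillna without assignment does not change `sample`.
sample.fillna(0)
still_missing = int(sample["Killed"].isnull().sum())

# Column-specific defaults keep numeric and text columns consistent.
filled = sample.fillna({"Killed": 0, "Motive": "Unknown"})
print(still_missing, int(filled.isnull().sum().sum()))
```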

<h3>Descriptive Features of the Data</h3>


<h4>Year with the most attacks</h4>

In [None]:
df_globalterrorism['Year'].value_counts().idxmax()

<h4>Month with the most attacks</h4>

In [None]:
df_globalterrorism['Month'].value_counts().idxmax()

<h4>Group with the most attacks</h4>

In [None]:
# 'Unknown' is the most frequent label (index 0), so take the next entry.
df_globalterrorism['Group'].value_counts().index[1]
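Picking position 1 works only while 'Unknown' happens to be the most frequent label. A slightly more robust sketch, shown here on a made-up Series (`groups` is hypothetical), filters the placeholder out by name before counting:

```python
import pandas as pd

# Hypothetical toy data standing in for the Group column.
groups = pd.Series(
    ["Unknown", "Taliban", "Unknown", "Shining Path (SL)", "Taliban", "Unknown"]
)

# Drop the placeholder label explicitly instead of relying on its rank.
known = groups[groups != "Unknown"]
top_group = known.value_counts().idxmax()
print(top_group)
```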

<h4>Most common attack type</h4>

In [None]:
df_globalterrorism['AttackType'].value_counts().idxmax()

<h4>Country with the most attacks</h4>

In [None]:
df_globalterrorism['Country'].value_counts().idxmax()

<h4>Region with the most attacks</h4>

In [None]:
df_globalterrorism['Region'].value_counts().idxmax()

<h3>Distribution of the data</h3>

In [None]:
df_globalterrorism.hist(figsize=(14,10))

<h3>Plot the number of attacks per year, categorized as successful and unsuccessful</h3>

In [None]:
plt.figure(figsize = (13,8))
sns.countplot(x='Year',hue='Success',data = df_globalterrorism)
plt.xticks(rotation=65)
plt.title("Number of attacks per year, successful vs. unsuccessful")

<h3>Total number of attacks in 1970 and 2017, and the percentage by which attacks have increased globally</h3>

In [None]:
year = df_globalterrorism.Year.value_counts().to_dict()
# Percentage increase is measured relative to the starting year (1970).
rate = ((year[2017] - year[1970]) / year[1970]) * 100

<h4>Number of attacks in 2017</h4>

In [None]:
year[2017]

<h4>Number of attacks in 1970</h4>

In [None]:
year[1970]

<h4>Percentage increase in the number of attacks from 1970 to 2017</h4>

In [None]:
np.round(rate,0)
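A percentage increase is conventionally measured relative to the starting value. The helper below is a small sketch of that formula; `percent_change` is a hypothetical name, and the counts are made up rather than taken from the dataset.

```python
def percent_change(old, new):
    """Return the percentage change from old to new, relative to old."""
    if old == 0:
        raise ValueError("percentage change is undefined when the base is 0")
    return (new - old) / old * 100


# Made-up counts for illustration only, not the dataset's real figures.
print(percent_change(200, 500))  # 150.0
```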

<h3>Count Distribution of Successful and Unsuccessful Attacks</h3>

In [None]:
plt.figure(figsize = (10,5))
# Pass the column by keyword; recent seaborn versions no longer accept it positionally.
sns.countplot(x='Success', data=df_globalterrorism)
plt.title("Count distribution of successful and unsuccessful attacks from 1970 to 2017")


0 = unsuccessful attack<br>
1 = successful attack

<h3>Correlation between numeric columns</h3>

In [None]:
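# numeric_only=True skips text columns, which newer pandas versions would otherwise reject.
df_globalterrorism.corr(numeric_only=True)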

In [None]:
plt.figure(figsize = (15,10))
sns.heatmap(df_globalterrorism.corr(numeric_only=True),annot=True)

<h3>Number of Terrorist Activities each Year</h3>

In [None]:
plt.figure(figsize = (10,10))
sns.countplot(x='Year',data=df_globalterrorism,palette = 'YlOrBr',edgecolor=sns.color_palette("RdYlGn_r", 10),label="Number of Attack per year")
plt.xticks(rotation=65)
plt.title('Number Of Terrorist Activities Each Year')
plt.legend(loc='upper left') 


<h3>Deaths and Injuries Across All Years</h3>

In [None]:
df_globalterrorism.plot(kind = 'scatter', x = 'Killed', y = 'Wounded', alpha = 0.5, color = 'red', figsize = (6,6), fontsize=15)
plt.xlabel('Kill', fontsize=15)
plt.ylabel('Wound', fontsize=15)
plt.title('Kill - Wound Scatter Plot')


<h4>In most acts of terrorism the numbers of deaths and injuries were low, but a small number of attacks caused very high numbers of deaths and injuries.</h4>

<h3>Average number of people killed per attack in each Region</h3>

In [None]:
plt.figure(figsize = (10,10))
# sns.barplot plots the mean of y for each category by default.
sns.barplot(x='Region',y='Killed',data=df_globalterrorism)
plt.title('Average number of people killed per attack by Region',fontsize=15)
plt.ylabel('Killed (mean per attack)',fontsize=15)
plt.xlabel('Region',fontsize=15)
plt.xticks(rotation=65)

<h4>Sub-Saharan Africa has the highest average number of people killed per attack.</h4>
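`sns.barplot` aggregates with the mean by default, so totals and per-attack averages can rank regions differently. A minimal sketch with made-up numbers (regions `A` and `B` are hypothetical) puts both side by side:

```python
import pandas as pd

# Hypothetical toy data: region A has more attacks, region B has one deadly attack.
sample = pd.DataFrame({
    "Region": ["A", "A", "A", "B"],
    "Killed": [4, 5, 6, 10],
})

# Compare the total, the per-attack mean, and the number of attacks per region.
by_region = sample.groupby("Region")["Killed"].agg(["sum", "mean", "count"])
print(by_region)
```

In this toy example, region A leads on total deaths while region B leads on the per-attack mean, which is why the chart title should say which measure is shown.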

<h3>Average number of people wounded per attack in each Region</h3>

In [None]:
plt.figure(figsize = (10,10))
# sns.barplot plots the mean of y for each category by default.
sns.barplot(x='Region',y='Wounded',data=df_globalterrorism)
plt.title('Average number of people wounded per attack by Region',fontsize=15)
plt.ylabel('Wounded (mean per attack)',fontsize=15)
plt.xlabel('Region',fontsize=15)
plt.xticks(rotation=65)

<h4>East Asia has the highest average number of people wounded per attack.</h4>

<h3>Terrorist Activities by Region in each Year through Area Plot</h3>

In [None]:
pd.crosstab(df_globalterrorism.Year, df_globalterrorism.Region).plot(kind='area',figsize=(15,6))
plt.title('Terrorist Activities by Region in each Year',fontsize=15)
plt.ylabel('Number of Attacks',fontsize=15)
plt.xlabel('Years',fontsize=15)

<h3>Frequency of Terrorist Actions in a Selected Region</h3>


In [None]:
df_globalterrorism.Region.unique()

Let's analyze the Middle East & North Africa region.

In [None]:
middleEastData = df_globalterrorism[df_globalterrorism.Region == 'Middle East & North Africa']

middleEastData.Year.plot(kind = 'hist', bins = 30,  color = 'red', fontsize=15,figsize=(15,6))
plt.xlabel('Year', fontsize=15)
plt.ylabel('Frequency', fontsize=15)
plt.title('Frequency of Middle East & North Africa Terrorism Actions by Years',fontsize=15)


<h3>Top Countries affected by Terror Attacks</h3>

In [None]:
df_globalterrorism.Country.value_counts()[:15]

In [None]:
plt.subplots(figsize=(15,6))
# Compute the top 15 once and pass x and y by keyword.
top_countries = df_globalterrorism['Country'].value_counts()[:15]
sns.barplot(x=top_countries.index, y=top_countries.values, palette='rocket')
plt.title('Top Countries Affected',fontsize=15)
plt.xlabel('Countries',fontsize=15)
plt.ylabel('Count',fontsize=15)
plt.xticks(rotation= 65)


<h3>ANALYSIS ON FILTERED DATA</h3>

<h4>Terrorist Attacks of a Particular year and their Locations</h4>

Let's look at the terrorist acts in the world over a certain year.

In [None]:
filterYear = df_globalterrorism['Year'] == 1970

In [None]:
 # filter data
filterData = df_globalterrorism[filterYear]
filterData 

In [None]:
# filterData.info()
reqFilterData = filterData.loc[:,'City':'Longitude'] # getting the required fields
reqFilterDataList = reqFilterData.values.tolist()
reqFilterDataList 

In [None]:
# Avoid naming the variable `map`, which would shadow Python's built-in map().
year_map = folium.Map(location=[0, 30], tiles='CartoDB positron', zoom_start=2)
# Cluster nearby markers so the map stays readable.
marker_cluster = MarkerCluster().add_to(year_map)
# Each row of the list is [City, Latitude, Longitude].
for city, lat, lon in reqFilterDataList:
    folium.Marker(location=[lat, lon], popup=city).add_to(marker_cluster)
year_map

<h4>84% of the terrorist attacks in 1970 were carried out on the American continent. In 1970, the Middle East and North Africa, currently the center of wars and terrorist attacks, faced only one terrorist attack.</h4>
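A regional share like the one quoted above can be computed rather than read off the map. The sketch below uses made-up rows (the `sample` frame is hypothetical, so its percentages are not the dataset's); applying the same two lines to `df_globalterrorism` should give the real figures.

```python
import pandas as pd

# Hypothetical toy data standing in for the Year and Region columns.
sample = pd.DataFrame({
    "Year": [1970, 1970, 1970, 1970, 1971],
    "Region": ["North America", "North America", "South America",
               "Western Europe", "North America"],
})

# Restrict to one year, then express each region's count as a percentage.
in_year = sample[sample["Year"] == 1970]
share = in_year["Region"].value_counts(normalize=True) * 100
print(share.round(1))
```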

<h3>Check which terrorist organizations have carried out operations in each country. A value count gives the organizations that have carried out the most attacks. We index from 1 to exclude the 'Unknown' label.</h3>

In [None]:
df_globalterrorism.Group.value_counts()[1:15]

In [None]:
test = df_globalterrorism[df_globalterrorism.Group.isin(['Shining Path (SL)','Taliban','Islamic State of Iraq and the Levant (ISIL)'])]

In [None]:
test.Country.unique()

In [None]:
terror_df_group = df_globalterrorism.drop_duplicates(subset=['Country','Group'])

In [None]:
terrorist_groups = df_globalterrorism.Group.value_counts()[1:8].index.tolist()
terror_df_group = terror_df_group.loc[terror_df_group.Group.isin(terrorist_groups)]
print(terror_df_group.Group.unique())

In [None]:
# Avoid naming the variable `map`, which would shadow Python's built-in map().
group_map = folium.Map(location=[20, 0], tiles="CartoDB positron", zoom_start=2)
# Markers are added directly to the map here, without clustering.
for _, row in terror_df_group.iterrows():
    folium.Marker([row['Latitude'], row['Longitude']],
                  popup='Group:{}<br>Country:{}'.format(row['Group'], row['Country'])).add_to(group_map)
group_map

<h4>The map above looks untidy, even though it can be zoomed in to view a particular country. In the next chart, Folium's MarkerCluster groups these icons, which makes the map easier to read and more interactive.</h4>

In [None]:
m1 = folium.Map(location=[20, 0], tiles="CartoDB positron", zoom_start=2)
marker_cluster = MarkerCluster(
    name='clustered icons',
    overlay=True,
    control=False,
    icon_create_function=None
)
for i in range(0,len(terror_df_group)):
    marker=folium.Marker([terror_df_group.iloc[i]['Latitude'],terror_df_group.iloc[i]['Longitude']]) 
    popup='Group:{}<br>Country:{}'.format(terror_df_group.iloc[i]['Group'],
                                          terror_df_group.iloc[i]['Country'])
    folium.Popup(popup).add_to(marker)
    marker_cluster.add_child(marker)
marker_cluster.add_to(m1)
folium.TileLayer('openstreetmap').add_to(m1)
folium.TileLayer('cartodbdark_matter').add_to(m1)
folium.TileLayer('stamentoner').add_to(m1)
folium.LayerControl().add_to(m1)

m1

<h3> Total Number of people killed in terror attack</h3>

In [None]:
killData = df_globalterrorism.loc[:,'Killed']
print('Number of people killed by terror attack:', int(killData.sum()))  # sum() skips NaN values

<h3>Types of terrorist attacks that cause deaths</h3>

In [None]:
attackData = df_globalterrorism.loc[:,'AttackType']
# attackData
typeKillData = pd.concat([attackData, killData], axis=1)


In [None]:
typeKillFormatData = typeKillData.pivot_table(columns='AttackType', values='Killed', aggfunc='sum')
typeKillFormatData
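The same totals can be read more easily as a sorted list. The following sketch, on made-up rows (the `sample` frame is hypothetical), uses `groupby` instead of `pivot_table` and orders attack types from most to fewest deaths:

```python
import pandas as pd

# Hypothetical toy data standing in for the AttackType and Killed columns.
sample = pd.DataFrame({
    "AttackType": ["Bombing/Explosion", "Armed Assault",
                   "Bombing/Explosion", "Assassination"],
    "Killed": [5.0, 2.0, None, 1.0],
})

deaths_by_type = (
    sample.groupby("AttackType")["Killed"]
    .sum()                          # missing values are skipped by default
    .sort_values(ascending=False)   # deadliest attack type first
)
print(deaths_by_type)
```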


<h3>Number of wounded vs. killed in each country per year</h3>

In [None]:
# Refer to columns by name so that the axis labels and animation stay aligned.
px.scatter(df_globalterrorism, x='Wounded', y='Killed', hover_name='Country',
           animation_frame='Year', animation_group='Country',
           size='Success', size_max=40, range_color=[0, 1], color='Suicide',
           labels={'Killed': 'Deaths', 'Wounded': 'Injuries'},
           title='Number of wounded vs. killed in each country per year')

<h4>Terrorist acts in the Middle East and North Africa have had fatal consequences, and the region stands out as a site of serious terrorist attacks. The charts suggest that Iraq, Afghanistan and Pakistan are the most heavily affected countries.</h4>

<h1><center>Thank You</center></h1>