## Exploration of IndiaTerror data using Python


In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [2]:
df = pd.read_csv('indiadata.csv')

Before exploring, we should know what the data is, how large it is and what types it holds, so that it is easier to get into the details. Let me first check the shape of the data, the columns it has and their data types.

In [3]:
df.shape

(4972, 14)

In [4]:
df.columns

Index([u'Year', u'City', u'Country', u'latitude', u'longitude', u'attack type',
       u'Target Type', u'Target Sub Type', u'Target', u'Weapon Type',
       u'Weapon sub type', u'Terrorist Organization', u'motive', u'summary'],
      dtype='object')

In [5]:
df.dtypes

Year                        int64
City                       object
Country                    object
latitude                  float64
longitude                 float64
attack type                object
Target Type                object
Target Sub Type            object
Target                     object
Weapon Type                object
Weapon sub type            object
Terrorist Organization     object
motive                     object
summary                    object
dtype: object

As we can see, there are 4972 rows and 14 columns. Before doing any analysis we should first deal with null values, as these cause problems.

In [6]:
df.isnull().sum()

Year                      0
City                      0
Country                   0
latitude                  0
longitude                 0
attack type               0
Target Type               0
Target Sub Type           0
Target                    0
Weapon Type               0
Weapon sub type           0
Terrorist Organization    0
motive                    0
summary                   0
dtype: int64

Fortunately the dataset does not contain any null values. Let's explore it. I will first look at the first 3 rows to get an idea of the dataset.

In [7]:
df.head(3)

Unnamed: 0,Year,City,Country,latitude,longitude,attack type,Target Type,Target Sub Type,Target,Weapon Type,Weapon sub type,Terrorist Organization,motive,summary
0,1975,Samastipur,India,25.863042,85.781004,Bombing/Explosion,Government (General),"Government Personnel (excluding police, military)",Lalit Narayan Mishra and a legislator,Explosives/Bombs/Dynamite,Unknown Explosive Type,Ananda Marga,Unknown,"1/2/1975: The Indian Railway Minister, Lalit N..."
1,1997,Unknown,India,33.778175,76.576171,Bombing/Explosion,Transportation,Bus Station/Stop,A bus station in Kashmir,Explosives/Bombs/Dynamite,Vehicle,Muslim Rebels,"Specific motive is unknown; however, the blast...",3/29/1997: Two explosions occurred at a bus st...
2,1997,Dhalai district,India,23.846698,91.909924,Bombing/Explosion,Military,Military Unit/Patrol/Convoy,Border Patrol Guards,Explosives/Bombs/Dynamite,Land Mine,National Liberation Front of Tripura (NLFT),"While the motive for this attack is unknown, t...",11/7/1997: A suspected anti-tank land mine exp...
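
The rows above show that missing information is recorded as the string "Unknown" rather than as a null, so `isnull()` does not count it. A minimal sketch of how such placeholders could be counted per column, assuming that the exact cell values "Unknown" and "." are the only placeholders (this list is an assumption, and the toy frame only imitates the real columns):

```python
import pandas as pd

# Toy frame imitating three of the columns; the values are illustrative only
toy = pd.DataFrame({
    'City': ['Samastipur', 'Unknown', 'Dhalai district'],
    'Weapon sub type': ['Unknown Explosive Type', '.', 'Land Mine'],
    'motive': ['Unknown', 'Specific motive is unknown; however, ...', 'Unknown'],
})

# Assumed placeholder values; isin() matches whole cells, not substrings,
# so 'Unknown Explosive Type' is not counted
placeholders = ['Unknown', '.']

# Number of placeholder cells in each column
placeholder_counts = toy.isin(placeholders).sum()
print(placeholder_counts)
```

Run against the full `df`, the same `isin(...).sum()` call would give a rough picture of how much information is effectively missing in each column.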


The date of the attack is given at the start of the summary column. I will extract it from the summary into a separate column.

In [8]:
# Split only on the first colon so that colons inside the summary text are kept
parts = df.summary.str.split(':', n=1)
df['date'] = parts.str.get(0)
df['summary'] = parts.str.get(1).str.strip()

In [9]:
df.head(2)

Unnamed: 0,Year,City,Country,latitude,longitude,attack type,Target Type,Target Sub Type,Target,Weapon Type,Weapon sub type,Terrorist Organization,motive,summary,date
0,1975,Samastipur,India,25.863042,85.781004,Bombing/Explosion,Government (General),"Government Personnel (excluding police, military)",Lalit Narayan Mishra and a legislator,Explosives/Bombs/Dynamite,Unknown Explosive Type,Ananda Marga,Unknown,"The Indian Railway Minister, Lalit Narayan Mis...",1/2/1975
1,1997,Unknown,India,33.778175,76.576171,Bombing/Explosion,Transportation,Bus Station/Stop,A bus station in Kashmir,Explosives/Bombs/Dynamite,Vehicle,Muslim Rebels,"Specific motive is unknown; however, the blast...",Two explosions occurred at a bus station in th...,3/29/1997
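
The extracted `date` column is still a string. A small sketch of how it could be turned into a proper datetime, assuming every summary starts with a date in month/day/year form as in the rows above (the toy summaries below are made up for illustration):

```python
import pandas as pd

# Toy summaries in the same 'date: text' layout; the text is illustrative
toy = pd.DataFrame({'summary': [
    '1/2/1975: The minister was attacked: details follow.',
    '3/29/1997: Two explosions occurred at a bus station.',
]})

# Split on the first colon only, so later colons stay in the summary
parts = toy['summary'].str.split(':', n=1, expand=True)

# Unparseable dates become NaT instead of raising an error
toy['date'] = pd.to_datetime(parts[0].str.strip(), format='%m/%d/%Y', errors='coerce')
toy['summary'] = parts[1].str.strip()

print(toy[['date', 'summary']])
```

With a datetime column, month or weekday breakdowns become possible through the `.dt` accessor.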


### What are the attack types and the weapons used?
Let's first check what types of attacks have been carried out and which weapons were used in them.

In [10]:
df['attack type'].value_counts()

Bombing/Explosion                      2041
Armed Assault                          1516
Hostage Taking (Kidnapping)             593
Facility/Infrastructure Attack          439
Assassination                           201
Unknown                                 127
Unarmed Assault                          33
Hijacking                                13
Hostage Taking (Barricade Incident)       9
Name: attack type, dtype: int64

In [11]:
df['Weapon Type'].value_counts()

Explosives/Bombs/Dynamite                                                      2127
Firearms                                                                       1815
Incendiary                                                                      382
Unknown                                                                         374
Melee                                                                           243
Sabotage Equipment                                                               20
Chemical                                                                          7
Vehicle (not to include vehicle-borne explosives, i.e., car or truck bombs)       4
Name: Weapon Type, dtype: int64
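
Raw counts are easier to compare as percentages. The sketch below rebuilds the weapon counts from the output above as a Series and converts them to shares of all attacks; on the real data, `df['Weapon Type'].value_counts(normalize=True)` should give the same proportions directly.

```python
import pandas as pd

# Counts copied from the value_counts() output above
weapon_counts = pd.Series({
    'Explosives/Bombs/Dynamite': 2127,
    'Firearms': 1815,
    'Incendiary': 382,
    'Unknown': 374,
    'Melee': 243,
    'Sabotage Equipment': 20,
    'Chemical': 7,
    'Vehicle (not to include vehicle-borne explosives, i.e., car or truck bombs)': 4,
})

# Percentage of all attacks for each weapon type, rounded to one decimal
weapon_share = (weapon_counts / weapon_counts.sum() * 100).round(1)
print(weapon_share)
```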

As we can see, a small number of weapon types accounts for most attacks. The most used weapon type is Explosives/Bombs/Dynamite, which appears 2127 times, followed by Firearms with 1815 and Incendiary weapons with 382. Chemicals and vehicles were also used, but only rarely (7 and 4 times). Among the attack types, Hostage Taking (Kidnapping) ranks third with 593 incidents, after Bombing/Explosion and Armed Assault. Many attack types and weapon types are also reported as Unknown (127 and 374). Now let's dig deeper and look at the actual weapons used in the attacks.

In [12]:
df.groupby('Weapon Type')['Weapon sub type'].value_counts()

Weapon Type                                                                  Weapon sub type                          
Chemical                                                                     .                                               4
                                                                             Poisoning                                       3
Explosives/Bombs/Dynamite                                                    Unknown Explosive Type                        674
                                                                             Grenade                                       536
                                                                             Other Explosive Type                          316
                                                                             Land Mine                                     266
                                                                             Remote Trigger                            

### Who were the targets?

In [13]:
df['Target Type'].value_counts()

Private Citizens & Property       1538
Government (General)               878
Police                             787
Business                           423
Transportation                     369
Military                           313
Educational Institution            130
Terrorists/Non-State Militia       103
Telecommunication                   89
Violent Political Party             86
Unknown                             83
Religious Figures/Institutions      66
Utilities                           47
Journalists & Media                 25
Tourists                            12
NGO                                  9
Other                                6
Airports & Aircraft                  3
Food or Water Supply                 3
Maritime                             1
Government (Diplomatic)              1
Name: Target Type, dtype: int64

We can see that private citizens and property are the most attacked targets (1538 attacks). Government (General) comes next with 878, followed by the police with 787. Business, transportation and the military follow, and then come educational institutions, telecommunication and religious institutions. There are also targets such as NGOs and airports, but their numbers are not significant. Now let's go deeper to see who the actual victims are.

In [14]:
df.groupby('Target Type')['Target Sub Type'].value_counts()

Target Type                     Target Sub Type                                     
Airports & Aircraft             Aircraft (not at an airport)                              2
                                Airport                                                   1
Business                        Construction                                            142
                                .                                                        73
                                Retail/Grocery/Bakery                                    51
                                Mining                                                   27
                                Gas/Oil                                                  23
                                Medical/Pharmaceutical                                   22
                                Industrial/Textiles/Factory                              19
                                Farm/Ranch                                             

Airports seem to be the most secure places, as hardly any attacks happened there. We can see that most of the attacks are aimed at government personnel, buildings, political party meetings and so on, which suggests they are at the top of the attackers' list. Many attacks were also made on educational institutions, but these were aimed at buildings or instructors, not at students. In transportation, trains are the most common victims, and in business it is construction.
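
The grouped output above is long, so it can help to keep only the most frequent sub types per target type. A minimal sketch on a toy frame (the rows are illustrative, not taken from the dataset), using `groupby(level=0).head(n)` on the grouped counts:

```python
import pandas as pd

# Toy data; sub type labels and frequencies are illustrative only
toy = pd.DataFrame({
    'Target Type': ['Business'] * 6 + ['Transportation'] * 4,
    'Target Sub Type': ['Construction'] * 3 + ['Mining'] * 2 + ['Gas/Oil']
                       + ['Train'] * 3 + ['Bus Station/Stop'],
})

# Counts per sub type within each target type, sorted descending per group
counts = toy.groupby('Target Type')['Target Sub Type'].value_counts()

# Keep the two most frequent sub types of every target type
top2 = counts.groupby(level=0).head(2)
print(top2)
```

On the real `df` the same two lines should work unchanged; ties at the cut-off are kept in whatever order `value_counts()` returns them.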


### Who is conducting the attacks?

In [15]:
df['Terrorist Organization'].value_counts().head(50)

Communist Party of India - Maoist (CPI-Maoist)                  1547
Unknown                                                         1415
Maoists                                                          385
United Liberation Front of Assam (ULFA)                          238
National Democratic Front of Bodoland (NDFB)                     106
Lashkar-e-Taiba (LeT)                                            100
Other                                                             92
Hizbul Mujahideen (HM)                                            88
Garo National Liberation Army                                     82
People's War Group (PWG)                                          62
National Liberation Front of Tripura (NLFT)                       62
National Socialist Council of Nagaland-Isak-Muivah (NSCN-IM)      50
Indian Mujahideen                                                 34
Naxalites                                                         34
People's Liberation Army (India)  

As we can see, most attacks were carried out by Maoists, including the CPI-Maoist, after which ULFA, NDFB, LeT and HM appear to be the most active in terrorism. A large number of attacks (1415) are attributed to unknown perpetrators. Some smaller anti-government organisations such as SIMI, NLFT and NSCN-IM have also carried out a significant number of attacks.
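
Since the same movement appears under more than one label, the counts can be merged before ranking. The sketch below uses the top counts from the output above and a hypothetical alias table; treating "Maoists" as the CPI-Maoist is an assumption made for illustration, not something the dataset states.

```python
import pandas as pd

# Counts copied from the value_counts() output above (top four rows)
org_counts = pd.Series({
    'Communist Party of India - Maoist (CPI-Maoist)': 1547,
    'Unknown': 1415,
    'Maoists': 385,
    'United Liberation Front of Assam (ULFA)': 238,
})

# Hypothetical alias table: which labels to fold into which name (assumption)
aliases = {'Maoists': 'Communist Party of India - Maoist (CPI-Maoist)'}

# Relabel the aliases, then add up rows that now share a label
merged = (org_counts.rename(index=aliases)
                    .groupby(level=0).sum()
                    .sort_values(ascending=False))
print(merged)
```

On the full data the mapping would be applied to the column first, for example with `df['Terrorist Organization'].replace(aliases)`, before calling `value_counts()`.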

### What are the most common cities where attacks are conducted?

In [16]:
df.City.value_counts().head(10)

Imphal                     230
Srinagar                   222
Unknown                    137
Guwahati                    61
Sopore                      40
New Delhi                   38
Latehar district            37
West Midnapore district     36
Malkangiri district         34
Anantnag                    32
Name: City, dtype: int64

We can see that most of the attacks happen in the north and northeast, where the terrorist organisations are most active. In the east there are also areas such as Malkangiri in Odisha where Maoists hide and are active.

### But what do they want?

In [17]:
df.motive.value_counts().head(10)

The specific motive for the attack is unknown.                                                                                                                                                                             1641
Unknown                                                                                                                                                                                                                    1226
The attack was carried out because the victim was accused of being a police informer.                                                                                                                                        49
The specific motive for the attack is unknown                                                                                                                                                                                27
The attack was carried out because the victims were accused of being police informers.                  

Among the top 10 motives, the motive is unknown for a large share of the attacks. A common stated reason is "The attack was carried out because the victim was accused of being a police informer.", and the others are related to politics.
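
The output above lists the unknown motive under three spellings ("Unknown", and the same sentence with and without a final period). A rough sketch of how such variants could be merged before counting, using the four complete rows of the output; the `normalise` helper and its list of "unknown" phrasings are assumptions made for illustration:

```python
import pandas as pd

# Counts copied from the complete rows of the output above
motive_counts = pd.Series({
    'The specific motive for the attack is unknown.': 1641,
    'Unknown': 1226,
    'The attack was carried out because the victim was accused of being a police informer.': 49,
    'The specific motive for the attack is unknown': 27,
})

def normalise(text):
    """Lower-case, drop the trailing period and fold 'unknown' variants together."""
    cleaned = text.strip().rstrip('.').lower()
    if cleaned in ('unknown', 'the specific motive for the attack is unknown'):
        return 'unknown'
    return cleaned

# Relabel with the normalised text, then sum rows that share a label
merged = (motive_counts.rename(index=normalise)
                       .groupby(level=0).sum()
                       .sort_values(ascending=False))
print(merged)
```

On the full data the helper would be applied with `df.motive.map(normalise)` before `value_counts()`; longer free-text motives would still need manual grouping.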