# Final Project for the Data Analysis Course

### Analysis of the 'Global Terrorism Database', a dataset containing over 180,000 acts of terrorism from 1970 to 2017.
### The main focus of the project is to demonstrate the ability to:
#### 1) Read data
#### 2) Clean data
#### 3) Transform data
#### 4) Present results and conclusions

##### Dataset url: https://www.kaggle.com/datasets/START-UMD/gtd/data

##### Codebook with description of the data in the dataset: http://start.umd.edu/gtd/downloads/Codebook.pdf

### 0) Loading and initial data cleaning

In [97]:
import pandas as pd
pd.set_option('display.max_columns', None) # required to see all 135 columns in dataset

In [98]:
df = pd.read_csv('globalterrorismdb_0718dist.csv',
                 encoding_errors='ignore',
                 low_memory=False)

df.head()
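
A note on `encoding_errors='ignore'`: it makes pandas skip any bytes that are not valid in the chosen encoding (UTF-8 by default), so accented characters in place or group names can disappear without warning. The sketch below illustrates the effect on a tiny made-up file. That the real CSV is ISO-8859-1 encoded is an assumption worth verifying, not something established in this notebook.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-line CSV stored as ISO-8859-1 bytes (not real GTD data).
raw = "city,nkill\nBogotá,0\n".encode("iso-8859-1")

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

# Default UTF-8 decoding with errors ignored: the invalid byte is dropped.
ignored = pd.read_csv(path, encoding_errors="ignore")

# Explicit single-byte encoding: the accented character is preserved.
decoded = pd.read_csv(path, encoding="ISO-8859-1")

os.remove(path)

print(ignored.loc[0, "city"])
print(decoded.loc[0, "city"])
```

If the two printed values differ on a sample of the real file, passing an explicit `encoding` is likely the safer choice.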

#### 0.1) According to the Codebook, when the exact month of an attack cannot be determined, the month is recorded as 0.
#### Let's check these cases.

In [113]:
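# Note: ignore_index=True resets the index to 0..n-1, so the left column
# of the output is a rank by count, not the month number.
df['imonth'].value_counts().sort_values(ascending=True,
                                        ignore_index=True)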

0        20
1     13496
2     13879
3     14180
4     14906
5     14936
6     15152
7     15257
8     15359
9     15563
10    15800
11    16268
12    16875
Name: count, dtype: int64
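
An alternative that keeps each count attached to its month is `sort_index()`. A minimal sketch on made-up values, where month 3 is deliberately the rarest so the difference between the two sort orders is visible:

```python
import pandas as pd

# Toy month column (not real GTD data); month 3 is the rarest here.
imonth = pd.Series([1, 1, 1, 2, 2, 3], name="imonth")

counts = imonth.value_counts()

# Sorting by count with ignore_index=True discards the month labels.
by_count = counts.sort_values(ascending=True, ignore_index=True)

# Sorting by index keeps each count attached to its month.
by_month = counts.sort_index()

print(by_count.index.tolist())  # positions, not months
print(by_month.to_dict())       # month -> number of rows
```

With `sort_index()`, the count for month 0 in the real data could be read directly from the row labelled 0.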

#### 0.2) As we can see, there are 20 such cases. Let's look at the information recorded for them.

In [129]:
df.loc[df['imonth'] == 0]

Unnamed: 0,eventid,iyear,imonth,iday,approxdate,extended,resolution,country,country_txt,region,region_txt,provstate,city,latitude,longitude,specificity,vicinity,location,summary,crit1,crit2,crit3,doubtterr,alternative,alternative_txt,multiple,success,suicide,attacktype1,attacktype1_txt,attacktype2,attacktype2_txt,attacktype3,attacktype3_txt,targtype1,targtype1_txt,targsubtype1,targsubtype1_txt,corp1,target1,natlty1,natlty1_txt,targtype2,targtype2_txt,targsubtype2,targsubtype2_txt,corp2,target2,natlty2,natlty2_txt,targtype3,targtype3_txt,targsubtype3,targsubtype3_txt,corp3,target3,natlty3,natlty3_txt,gname,gsubname,gname2,gsubname2,gname3,gsubname3,motive,guncertain1,guncertain2,guncertain3,individual,nperps,nperpcap,claimed,claimmode,claimmode_txt,claim2,claimmode2,claimmode2_txt,claim3,claimmode3,claimmode3_txt,compclaim,weaptype1,weaptype1_txt,weapsubtype1,weapsubtype1_txt,weaptype2,weaptype2_txt,weapsubtype2,weapsubtype2_txt,weaptype3,weaptype3_txt,weapsubtype3,weapsubtype3_txt,weaptype4,weaptype4_txt,weapsubtype4,weapsubtype4_txt,weapdetail,nkill,nkillus,nkillter,nwound,nwoundus,nwoundte,property,propextent,propextent_txt,propvalue,propcomment,ishostkid,nhostkid,nhostkidus,nhours,ndays,divert,kidhijcountry,ransom,ransomamt,ransomamtus,ransompaid,ransompaidus,ransomnote,hostkidoutcome,hostkidoutcome_txt,nreleased,addnotes,scite1,scite2,scite3,dbsource,INT_LOG,INT_IDEO,INT_MISC,INT_ANY,related
1,197000000002,1970,0,0,,0,,130,Mexico,1,North America,Federal,Mexico city,19.371887,-99.086624,1.0,0,,,1,1,1,0.0,,,0.0,1,0,6,Hostage Taking (Kidnapping),,,,,7,Government (Diplomatic),45.0,"Diplomatic Personnel (outside of embassy, cons...",Belgian Ambassador Daughter,"Nadine Chaval, daughter",21.0,Belgium,,,,,,,,,,,,,,,,,23rd of September Communist League,,,,,,,0.0,,,0,7.0,,,,,,,,,,,,13,Unknown,,,,,,,,,,,,,,,,0.0,,,0.0,,,0,,,,,1.0,1.0,0.0,,,,Mexico,1.0,800000.0,,,,,,,,,,,,PGIS,0,1,1,1,
1123,197200000002,1972,0,0,,0,,160,Philippines,5,Southeast Asia,Capiz,Roxas,11.586558,122.753716,1.0,0,,,1,1,1,0.0,,,0.0,1,0,3,Bombing/Explosion,,,,,6,Airports & Aircraft,42.0,Aircraft (not at an airport),,air manila fokker F-27p,160.0,Philippines,,,,,,,,,,,,,,,,,Unknown,,,,,,,0.0,,,0,,,,,,,,,,,,,6,Explosives,7.0,Grenade,6.0,Explosives,16.0,Unknown Explosive Type,,,,,,,,,Explosive; Grenade,0.0,,,0.0,,,1,3.0,Minor (likely < $1 million),200000.0,,0.0,,,,,,,0.0,,,,,,,,,,,,,PGIS,-9,-9,0,-9,
1690,197300000001,1973,0,0,,1,12/1/1973,45,Colombia,3,South America,Unknown,unknown,,,5.0,0,,,1,1,1,0.0,,,0.0,1,0,6,Hostage Taking (Kidnapping),,,,,1,Business,9.0,Farm/Ranch,,"Alirio Serrano Sanchez, rancher",45.0,Colombia,,,,,,,,,,,,,,,,,National Liberation Army of Colombia (ELN),,,,,,,0.0,,,0,,,,,,,,,,,,,13,Unknown,,,,,,,,,,,,,,,,0.0,,,0.0,,,0,,,,,1.0,1.0,0.0,,,,Colombia,1.0,20000.0,,20000.0,,,2.0,Hostage(s) released by perpetrators,1.0,,,,,PGIS,0,0,0,0,
2164,197400000002,1974,0,0,,0,,69,France,8,Western Europe,Paris,Paris,48.856644,2.34233,1.0,0,,,1,1,1,-9.0,,,0.0,0,0,3,Bombing/Explosion,,,,,1,Business,3.0,Bank/Commerce,,Bank Lazard,69.0,France,,,,,,,,,,,,,,,,,Unknown,,,,,,,0.0,,,0,,,,,,,,,,,,,6,Explosives,28.0,Dynamite/TNT,,,,,,,,,,,,,Dynamite,0.0,,,0.0,,,1,,,,,0.0,,,,,,,0.0,,,,,,,,,,,,,PGIS,-9,-9,0,-9,
2165,197400000003,1974,0,0,,0,,98,Italy,8,Western Europe,Lazio,Rome,41.890961,12.490069,1.0,0,,,1,1,1,0.0,,,0.0,1,0,3,Bombing/Explosion,,,,,6,Airports & Aircraft,42.0,Aircraft (not at an airport),,TWA Boeing 707,217.0,United States,,,,,,,,,,,,,,,,,Unknown,,,,,,,0.0,,,0,,,,,,,,,,,,,6,Explosives,16.0,Unknown Explosive Type,,,,,,,,,,,,,Explosive,0.0,0.0,,0.0,0.0,,1,3.0,Minor (likely < $1 million),200000.0,,0.0,,,,,,,0.0,,,,,,,,,,,,,PGIS,-9,-9,1,1,
2744,197500000001,1975,0,0,,0,,153,Pakistan,6,South Asia,Punjab,Rawalpindi,33.594013,73.069077,1.0,0,,,1,1,1,0.0,,,0.0,1,0,3,Bombing/Explosion,,,,,6,Airports & Aircraft,42.0,Aircraft (not at an airport),,Pakistan Airlines Boeing 707,153.0,Pakistan,,,,,,,,,,,,,,,,,Unknown,,,,,,,0.0,,,0,,,,,,,,,,,,,6,Explosives,16.0,Unknown Explosive Type,,,,,,,,,,,,,Explosive,,,,,,,1,4.0,Unknown,,,0.0,,,,,,,0.0,,,,,,,,,,,,,PGIS,-9,-9,0,-9,
3484,197600000001,1976,0,0,,0,,209,Turkey,10,Middle East & North Africa,Istanbul,Istanbul,41.106178,28.689863,1.0,0,,,1,1,0,1.0,1.0,Insurgency/Guerilla Action,0.0,1,0,9,Unknown,,,,,4,Military,35.0,Military Transportation/Vehicle (excluding con...,,Turkish Army Vehicle,209.0,Turkey,,,,,,,,,,,,,,,,,Armenian Secret Army for the Liberation of Arm...,,,,,,,0.0,,,0,,,,,,,,,,,,,13,Unknown,,,,,,,,,,,,,,,,,,,,,,1,,,,,0.0,,,,,,,0.0,,,,,,,,,,,,,Hyland,0,1,0,1,
3485,197600000002,1976,0,0,,0,,209,Turkey,10,Middle East & North Africa,Ankara,Ankara,39.930771,32.76754,1.0,0,,,1,1,0,1.0,1.0,Insurgency/Guerilla Action,0.0,1,0,9,Unknown,,,,,4,Military,27.0,Military Barracks/Base/Headquarters/Checkpost,,military base,209.0,Turkey,,,,,,,,,,,,,,,,,Armenian Secret Army for the Liberation of Arm...,,,,,,,0.0,,,0,,,,,,,,,,,,,13,Unknown,,,,,,,,,,,,,,,,,,,,,,1,,,,,0.0,,,,,,,0.0,,,,,,,,,,,,,Hyland,0,1,0,1,
4407,197700000001,1977,0,0,,0,,101,Japan,4,East Asia,Tokyo,Tokyo,35.689125,139.747742,1.0,0,,,1,1,1,0.0,,,0.0,1,0,3,Bombing/Explosion,,,,,8,Educational Institution,49.0,School/University/Educational Building,,Tokyo University,101.0,Japan,,,,,,,,,,,,,,,,,Tribal Battlefront,,,,,,,0.0,,,0,,,,,,,,,,,,,6,Explosives,16.0,Unknown Explosive Type,,,,,,,,,,,,,Explosive,,,,,,,1,4.0,Unknown,,,0.0,,,,,,,0.0,,,,,,,,,,,,,PGIS,0,0,0,0,
4408,197700000002,1977,0,0,,0,,101,Japan,4,East Asia,Tokyo,Tokyo,35.689125,139.747742,1.0,0,,,1,1,1,0.0,,,0.0,1,0,3,Bombing/Explosion,,,,,1,Business,,,,Private Residence of President of a leading al...,101.0,Japan,,,,,,,,,,,,,,,,,Tribal Battlefront,,,,,,,0.0,,,0,,,,,,,,,,,,,6,Explosives,16.0,Unknown Explosive Type,,,,,,,,,,,,,Explosive,,,,,,,1,4.0,Unknown,,,0.0,,,,,,,0.0,,,,,,,,,,,,,PGIS,0,0,0,0,
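
To put a number on how sparse such records are, one option is to compute the share of empty fields per row. A minimal sketch on a made-up frame; the column subset and values are illustrative only:

```python
import numpy as np
import pandas as pd

# Toy frame with a few GTD-style columns (values are made up).
toy = pd.DataFrame({
    "imonth": [0, 0, 5],
    "iday": [0, 0, 14],
    "summary": [np.nan, np.nan, "text"],
    "nkill": [0.0, np.nan, 2.0],
})

unknown_month = toy.loc[toy["imonth"] == 0]

# Fraction of missing (NaN) fields in each selected row.
missing_share = unknown_month.isna().mean(axis=1)
print(missing_share)
```

Comparing this share with the same figure for the rest of the dataset would show whether the month-0 rows are unusually incomplete.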


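#### 0.3) All of these cases lack a lot of information (in the rows shown, the day is also recorded as 0 and many fields are empty), and they account for only 20 of 181,691 records, so we can safely discard them.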
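
Discarding these rows can be done with a boolean mask. A minimal sketch on a toy frame; `cleaned` is a hypothetical name, and the second and third event IDs are made up:

```python
import pandas as pd

# Toy frame standing in for the full dataset.
toy = pd.DataFrame({
    "eventid": [197000000002, 197001010001, 197002050003],
    "imonth": [0, 1, 2],
})

# Keep only rows whose month is known, then renumber the index.
cleaned = toy.loc[toy["imonth"] != 0].reset_index(drop=True)

print(len(toy) - len(cleaned))  # number of rows removed
```

Printing the number of removed rows gives a quick check that exactly the expected cases were dropped.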