## Insurgency and EU law

#### Integration of datasets

In [1]:
import pandas as pd

Helper function that keeps the desired variables of a table and returns the unique group names

In [2]:
def filter_tables_groups(table, proposed_variables, group_name):
    """Keep the proposed variables that exist in `table` and list its unique groups.

    table: raw pandas DataFrame.
    proposed_variables: variable names taken from the Excel sheet.
    group_name: name of the column that holds the group name.
    """
    existing_columns = list(table.columns)
    # Proposed variables absent from the table are dropped silently; set order is arbitrary.
    columns_selected = list(set(proposed_variables) & set(existing_columns))
    table = table[columns_selected]
    groups = list(table[group_name].unique())
    return table, groups
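
Because the function keeps only the intersection of proposed and existing columns, a misspelled or absent variable name disappears without any warning. The following is a minimal sketch of a check that lists such names. The helper `missing_variables` and the toy table are hypothetical and are not part of the original notebook.

```python
import pandas as pd

def missing_variables(table, proposed_variables):
    """Return the proposed variables that are not columns of `table`, in their original order."""
    existing = set(table.columns)
    return [name for name in proposed_variables if name not in existing]

# Toy data standing in for a raw dataset (values are made up for illustration).
toy = pd.DataFrame({
    'gname': ['Group A', 'Group A', 'Group B'],
    'year': [1990, 1991, 1990],
    'nkill': [2, 0, 5],
})
proposed = ['gname', 'nkill', 'tcode']  # 'tcode' is absent from the toy table

missing = missing_variables(toy, proposed)
print(missing)
```

Running a check like this before filtering makes it easier to tell a genuinely unavailable variable from a spelling difference between the variable sheet and the data file.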

---
#### Reputation of Terror Groups (RTG) Dataset
Description: The dataset contains all domestic terrorist groups, as defined in Enders et al. (2011) and based on the Global Terrorism Database, that carried out more than 5 terrorist attacks from 1980 to 2011. The data are in group name-year format. The dataset codes the actions of terrorist groups that can build reputation among the constituency and the out-group. Researchers can find originally coded variables on building positive and negative reputation among the audience, as well as existing group-level variables.

[Link to data](http://www.efetokdemir.com/data.html)

In [3]:
rtg_table = pd.read_stata('datasets/replicationdatajpr-oldstata.dta')

In [4]:
rtg_table.head()

Unnamed: 0,year,gname,ffund,childrec,frec,rebel,parterr,terpwing,teraff,govcaus,...,nat,civcausreal,civcauseffreal,outnegrep,cleavage,reputation,last,counter,endedtype,endedtype2
0,1989,1 May Group,0,0,0,0,0,0,0,1,...,0.0,0.25,1.0,1.0,1.0,0.0,3.0,1.0,0.0,0.0
1,1991,1 May Group,0,0,0,0,0,0,0,0,...,0.0,2.333333,0.0,0.0,1.0,0.0,3.0,2.0,0.0,0.0
2,1992,1 May Group,0,0,0,0,0,0,0,0,...,0.0,0.0,0.0,0.0,1.0,0.0,3.0,3.0,1.0,1.0
3,1989,16 January Organization for the Liberation of ...,0,0,0,0,0,0,0,1,...,,22.625,0.0,0.0,,0.0,1.0,1.0,1.0,1.0
4,1983,2 April Group,0,0,0,0,0,0,0,0,...,0.0,0.0,0.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0


In [5]:
rtg_proposed_variables = ['gname',
'tcode',
'ffund',
'frec',
'parterr',
'terpwing',
'teraff',
'politics',
'media',
'pgood',
'intposrep',
'intnegrep',
'outnegrep',
'intnetrep',
'netrep',
'reputation',
'age',
'rebel',
'goal',
'broadgoal',
'type',
'terrstrong',
'statespons',
'international',
'nmbrtrr',
'left',
'rel',
'nat',
'peaksize',
'cleavage',
'endedtype',
'endedtype2',
'govcaus',
'educcaus',
'civcaus',
'nkill',
'nkillter',
'nwound',
'propertycount',
'govtargcount',
'civtargcount',
'eductargcount',
'eductargexist',
'civcausreal',
'civcausrealeff',
'ccodecow',
'logarea',
'loggdp',
'logpop',
'logmil']

In [6]:
df_rtg, groups_rtg = filter_tables_groups(rtg_table, rtg_proposed_variables, 'gname')

In [7]:
df_rtg.shape

(2641, 43)

In [8]:
len(groups_rtg)

443

In [11]:
df_rtg[df_rtg['gname'] == '1 May Group']

Unnamed: 0,pgood,nmbrtrr,nkill,teraff,rel,civtargcount,outnegrep,peaksize,media,reputation,...,ccodecow,international,logpop,tcode,civcaus,cleavage,terpwing,civcausreal,loggdp,eductargexist
0,0,4.0,2,0,0.0,1,1.0,,0,0.0,...,350.0,0,2.311965,1 May Group,1,1.0,0,0.25,9.766566,0
1,0,3.0,0,0,0.0,3,0.0,,0,0.0,...,350.0,0,2.326998,1 May Group,0,1.0,0,2.333333,9.780699,0
2,0,3.0,0,0,0.0,3,0.0,,0,0.0,...,350.0,0,2.337105,1 May Group,0,1.0,0,0.0,9.776662,0


---
#### The Foundations of Rebel Group Emergence (FORGE) Dataset

It provides information on the origins of violent non-state actors engaged in armed conflict against their government resulting in 25+ yearly battle deaths, active between 1946 and 2011. The unit of observation in this dataset is the rebel group organization. We also include information on the dyad and conflict in which these groups are participants for ease of integration with various Uppsala Conflict Data Program (UCDP) datasets. We draw upon the population of groups included in the Non-State Actor database described in greater detail here:
    
[Link to data](http://ksgleditsch.com/eacd.html)

In [9]:
forge_table = pd.read_excel('datasets/forge_v1.0_public.xlsx')

In [10]:
forge_table.head()

Unnamed: 0,conflict_id,dyadid,NSAdyadid,actorid,gacronym,gname,ccode,cname,foundloc,foundyear,...,preorgfmr,preorgrel,preorgfor,preorgref,preorgeth,preorgoth,preorgname,merger,splinter,splinterUCDP
0,333.0,725.0,731.0,293.0,Harakat-i Inqilab-i Islami-yi Afghanistan,Movement of the Islamic Revolution/Uprising of...,700,Afghanistan,Pakistan,1978,...,0,0,0,0,0,0,Muslim Youth,0,,
1,333.0,731.0,737.0,298.0,Harakat-i Islami-yi Afghanistan,Islamic Movement,700,Afghanistan,"Qom, Iran",1978,...,0,0,0,0,0,0,,0,,
2,333.0,726.0,412.0,299.0,Hizb-i Islami-yi Afghanistan,Islamic Party of Afghanistan,700,Afghanistan,Pakistan,1976,...,0,0,0,0,0,0,Jam'iyyat-i Islami-yi Afghanistan,0,1.0,0.0
3,333.0,727.0,760.0,294.0,Hizb-i Islami-yi Afghanistan - Khalis faction,Islamic Party of Afghanistan - Khalis faction,700,Afghanistan,"Khugiani, Afghanistan",1979,...,0,0,0,0,0,0,Hizb-i Islami-yi Afghanistan,0,1.0,1.0
4,333.0,732.0,413.0,300.0,Hizb-i Wahdat,Unity Party,700,Afghanistan,"Tehran, Iran",1979,...,0,0,0,0,0,0,"Sazman-i-Nasr, Sepah-i Pasdaran/Pasdaran-i-Jih...",1,0.0,


In [11]:
forge_proposed_variables = [
'conflict_id',
'dyadid',
'NSAdyadid',
'actorid',
'gacronym',
'gname',
'ccode',
'cname',
'foundloc',
'foundyear',
'foundmo',
'foundday',
'fightyear',
'fightmo',
'fightday',
'goalnominal',
'goalindep',
'goalauto',
'goalrights',
'goalrep',
'goalchange',
'goaldem',
'goalother',
'goalnote',
'ideology',
'ideolcom',
'ideolleft',
'ideolright',
'ideolnat',
'ideolanti',
'ideolrel',
'ideoloth',
'ideolnote',
'religious',
'religion',
'ethnic',
'ethnicity',
'preorg',
'preorgno',
'preorgreb',
'preorgpar',
'preorgmvt',
'preorgyou',
'preorglab',
'preorgrel',
'preorgmil',
'preorgfmr',
'preorggov',
'preorgfor',
'preorgref',
'preorgeth',
'preorgoth',
'preorgname',
'merger',
'splinter',
'splinterUCDP'
]

In [12]:
df_forge, groups_forge = filter_tables_groups(forge_table, forge_proposed_variables, 'gname')

In [13]:
df_forge.shape

(430, 56)

In [14]:
len(groups_forge)

415

---
#### Big Allied and Dangerous (BAAD) Dataset Version 2.0

This dataset is an extract from the Big Allied and Dangerous (BAAD) Version 2.0 dataset used to create the results published in the article entitled "Crime, Conflict and the Legitimacy Tradeoff: Explaining Variation in Insurgents' Participation in Crime".
The Big Allied and Dangerous (BAAD) project focuses on the creation and maintenance of a comprehensive database of terrorist and insurgent organizations, collectively referred to as “violent nonstate actors” (VNSAs), and on linking that data to prominent event, insurgency, and country characteristics datasets. BAAD Version 2.0 (BAAD2) contains data on nearly 600 terrorist and insurgent organizations active 1998-2015 (with extension through 2017 planned). Organized into yearly time slices, BAAD2 records information on organizational characteristics (demographics, ideology, political activity, structure, leadership, exposure to counter-terrorism activity, social service provision, and engagement in violence) and organizational network relationships (both positive and negative).

[Link to data](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/JT6GFR)

In [15]:
baad_table = pd.read_excel('datasets/BAAD2 Insurgency Crime Dataset.xlsx')

In [16]:
baad_table.head()

Unnamed: 0,org,torg,year,torg_year,hbase,hbccode,hbiso,hb_iso_cc,left,reli,...,lead_hierarch,fddrugtk,fdextort,fdkidnap,fdrob,fdsmuggl,fdstate,ucdpbd,socsvcs,crim_degr_py
0,Abu Sayyaf Group (ASG),4,1998,4_1998,Philippines,840,PHL,608,0,1,...,1,1,1,1,1,1,0,55,0,
1,Abu Sayyaf Group (ASG),4,1999,4_1999,Philippines,840,PHL,608,0,1,...,1,1,1,1,1,1,0,0,0,1.0
2,Abu Sayyaf Group (ASG),4,2000,4_2000,Philippines,840,PHL,608,0,1,...,0,1,1,1,1,1,0,379,0,0.0
3,Abu Sayyaf Group (ASG),4,2001,4_2001,Philippines,840,PHL,608,0,1,...,0,1,1,1,1,1,0,333,0,0.0
4,Abu Sayyaf Group (ASG),4,2002,4_2002,Philippines,840,PHL,608,0,1,...,0,1,1,1,1,1,0,249,0,1.0


In [17]:
baad_proposed_variables = [
'ORG',
'TORG',
'YEAR',
'TORG_YEAR',
'HBASE',
'HBCCODE',
'HBISO',
'HB_ISO_CC',
'LEFT',
'RELI',
'ETHN',
'AGE',
'SIZE_REC',
'TERRCNTRL',
'LEAD_HIERARCH',
'FDDRUGTK',
'FDEXTORT',
'FDKIDNAP',
'FDROB',
'FDSMUGGL',
'FDSTATE',
'UCDPBD',
'SOCSVCS',
'CRIM_DEGR_PY'
]


In [18]:
baad_proposed_variables = [x.lower() for x in baad_proposed_variables]
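
Lowercasing the proposed names works here because the BAAD columns are all lowercase. As a sketch under the assumption that a file might mix cases, the lookup can instead be made case-insensitive on both sides. The helper `match_columns_casefold` and the toy table are hypothetical.

```python
import pandas as pd

def match_columns_casefold(table, proposed_variables):
    """Return the actual column names whose lowercase form matches a proposed name."""
    # Map lowercase column name -> column name as spelled in the table.
    lookup = {str(col).lower(): col for col in table.columns}
    return [lookup[name.lower()] for name in proposed_variables if name.lower() in lookup]

# Toy table with mixed-case columns (made up for illustration).
toy = pd.DataFrame({'org': ['X'], 'Year': [1998], 'fdrob': [1]})
matched = match_columns_casefold(toy, ['ORG', 'YEAR', 'SOCSVCS'])
print(matched)
```

The returned names are spelled as in the table, so they can be passed directly to `table[...]`.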

In [19]:
df_baad, groups_baad = filter_tables_groups(baad_table, baad_proposed_variables, 'org')

In [20]:
df_baad.shape

(1386, 24)

In [21]:
len(groups_baad)

140

---
#### Do Good Borders Make Good Rebels? Territorial Control and Civilian Casualties
Description: Replication data and code for "Do Good Borders Make Good Rebels?" (2016-06-25) 

[Link to data](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/RTJ38N)

In [22]:
good_table = pd.read_stata('datasets/PKK osv synth.dta')

In [23]:
good_table.head()

Unnamed: 0,ccode,stateab,year,total,land,sea,location,total_border,acr,id,...,inslocdum324,inslocdum325,inslocdum326,s,dyloc,dylocnum,test,dup,firstapp,lastapp
0,100.0,COL,1989.0,10,5,5,Colombia,5079.5,COL,COL-1989,...,0,0,0,1.0,1920-1Colombia,1920-1Colombia,1.0,0.0,1966.0,2003.0
1,100.0,COL,1990.0,10,5,5,Colombia,5079.5,COL,COL-1990,...,0,0,0,1.0,1920-1Colombia,1920-1Colombia,1.0,0.0,1966.0,2003.0
2,100.0,COL,1991.0,10,5,5,Colombia,5079.5,COL,COL-1991,...,0,0,0,1.0,1920-1Colombia,1920-1Colombia,1.0,0.0,1966.0,2003.0
3,100.0,COL,1992.0,10,5,5,Colombia,5079.5,COL,COL-1992,...,0,0,0,1.0,1920-1Colombia,1920-1Colombia,1.0,0.0,1966.0,2003.0
4,100.0,COL,1993.0,10,5,5,Colombia,5079.5,COL,COL-1993,...,0,0,0,1.0,1920-1Colombia,1920-1Colombia,1.0,0.0,1966.0,2003.0


In [24]:
good_proposed_variables = ['ccode',
'stateab',
'year',
'total',
'land',
'sea',
'location',
'total_border',
'acr',
'id',
'obsid',
'confid',
'dyadid',
'dyadperiod',
'side_a',
'side_b',
'ex_sideb',
'state_b',
'startdate',
'enddate',
'terr',
'rebpolwing',
'rebestimate',
'rebestlow',
'rebesthigh',
'rebstrength',
'centcontrol',
'strengthcent',
'mobcap',
'armsproc',
'fightcap',
'terrcont',
'terrname',
'effterrcont',
'conflicttype',
'transconstsupp',
'rebextpart',
'rebpresosts',
'presname',
'rebsuport',
'rtypesop',
'rsupname',
'govsuport',
'gtypesup',
'gsupname',
'govextpart',
'ended',
'codedfromuctdp']

In [25]:
df_good, groups_good = filter_tables_groups(good_table, good_proposed_variables, 'side_b')

In [26]:
df_good.shape

(70, 46)

In [27]:
len(groups_good)

10

---
#### Rebel Governance: Military Boon or Military Bust?
Abstract: What is the effect of rebel governance on rebel military strength? Most existing research assumes that rebel governance enhances the military strength of the rebel group. I test this assumption with an original dataset of rebel governance services. The quantitative evidence presents a more complicated picture that belies a straightforward link between the two: governance appears to have either no relationship with rebel strength, or a negative and statistically significant relationship with rebel military capacity. To explain this surprising result, I generate a set of empirically grounded mechanisms using case vignettes that incorporate primary and secondary qualitative data. As a whole, the paper calls for greater theorizing and testing about the consequences of rebel governance, as well as the strategic motivations for its implementation.
Original source https://www.meganastewart.org/research

[Link to data](https://191418d8-c73b-4d83-9b9d-697421330acd.filesusr.com/archives/1a24bf_cdc60d12908a4642bb84c1764e1a68d1.zip?dn=CMPS%20Information.zip)

In [28]:
mas_table = pd.read_stata('datasets/MAS_Main Data_CS.dta')

In [29]:
mas_table.head()

Unnamed: 0,side_b,location,sec2,seccol,secbroad2,secbroad3,secbroad4,secbroad5,communist,rebord,...,pg_ord,pg_zero,pg_one,dec40,dec50,dec60,dec70,dec80,dec90,dec00
0,ABSDF,Burma,0.0,0.0,0.0,0.0,0.0,,1.0,0.0,...,2.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1,ABSU,India,0.0,0.0,0.0,0.0,0.0,,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
2,ADF,Uganda,0.0,0.0,0.0,0.0,0.0,,0.0,1.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,AFDL,Congo/Zaire,0.0,0.0,0.0,0.0,0.0,,0.0,2.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
4,AFRC,Sierra Leone,0.0,0.0,0.0,0.0,0.0,,0.0,2.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0


In [30]:
mas_proposed_variables = ['side_b',
'location',
'communist',
'terrcontrol',
'ethnic',
'totalpopmin',
'yearmin',
'yearmax',
'maxedu',
'maxhealth',
'preconflict_edu',
'preconflict_health']

In [31]:
df_mas, groups_mas = filter_tables_groups(mas_table, mas_proposed_variables, 'side_b')

In [32]:
df_mas.shape

(313, 12)

In [33]:
len(groups_mas)

303

---
#### Mass Protests and the Resource Curse: The Politics of Demobilization in Rentier Autocracies
Abstract: Why are some dictators more successful at demobilizing protest movements than others? Repression sometimes stamps out protest movements (Bahrain in 2011) but can also cause a backlash (Egypt and Tunisia in 2011), sometimes leading to the overthrow of the regime. This article argues that the effectiveness of repression in quelling protests varies depending upon the income sources of the authoritarian regime. Resource-rich autocracies are relatively shielded from domestic and international criticism. They therefore have a greater capacity to quell protests through force. Because resource-poor dictators lack such shielding, repression is more likely to trigger a backlash of increased protests. The argument is supported by analysis of newly available data on mass protests from the Nonviolent and Violent Campaigns and Outcomes (NAVCO 2.0) dataset, which covers all countries (1945-2006). The article implies that publics respond strategically to repression, and tend to demobilize when the government is capable of continually employing repression with impunity.
Original source https://www.meganastewart.org/research

[Link to data](https://www.dropbox.com/s/ekuta4zps0qxrsa/replication%20materials%202.zip?dl=1)

In [34]:
gsw_table = pd.read_stata('datasets/GSW Final Data, Including Civil Wars.dta')

In [35]:
gsw_table.head()

Unnamed: 0,campaign,location,year,cyear,lccode,wbcode,target,tccode,navco1designation,demob,...,negxpol,repression,progress,prim_method,dom_media,l1gdppc_log,l1gdppc_change,l1popdlog,effinal,regime_support
0,Active Forces,Madagascar,1991.0,0,580,MDG,Didier Radsiraka,580,1,1.0,...,3.0,3.0,3.0,1.0,2.0,6.788859,-0.047113,2.988396,0.861482,1.0
1,Active Forces,Madagascar,1992.0,1,580,MDG,Didier Radsiraka,580,1,0.0,...,,3.0,3.0,1.0,2.0,6.728581,-0.058564,3.018468,0.861482,1.0
2,Active Forces,Madagascar,1993.0,2,580,MDG,Didier Radsiraka,580,1,0.0,...,-7.0,3.0,4.0,1.0,2.0,6.706031,-0.022324,3.048735,0.861482,1.0
3,Afar insurgency,Djibouti,1991.0,0,522,DJI,Djibouti regime,522,0,0.0,...,4.0,3.0,1.0,0.0,2.0,7.646382,-0.102078,3.236704,0.6058,1.0
4,Afar insurgency,Djibouti,1992.0,1,522,DJI,Djibouti regime,522,0,0.0,...,4.0,3.0,2.0,0.0,2.0,7.491863,-0.143242,3.274404,0.6058,1.0


In [36]:
gsw_proposed_variables = ['campaign',
'location',
'year',
'cyear',
'lccode',
'wbcode',
'region',
'target',
'repression',
'camp_size',
'nonviolentany',
'totalcamp']

In [37]:
df_gsw, groups_gsw = filter_tables_groups(gsw_table, gsw_proposed_variables, 'campaign')

In [38]:
df_gsw.shape

(1726, 8)

In [39]:
len(groups_gsw)

250

---
#### Civil War as State Building: Strategic Governance in Civil War
Abstract: Why do rebel groups provide public goods? Some insurgencies divert critical financial and personnel resources to provide benefits to a population that includes non-supporters (e.g., Karen National Union, Hezbollah, Eritrean People’s Liberation Front). Other groups offer no services or limit their service provision to only those people who actively support, or are likely to support, the insurgency. The existing literature examines why some insurgencies provide selective incentives for members to join and how insurgencies use social services to recruit members, yet no research addresses why insurgencies provide public goods. I argue that insurgent public goods provision is a strategic tool secessionist insurgents use to achieve their long-term strategic goal of independence. With new and original data, I use a large-n analysis to test this hypothesis. The results of the analysis support the theory, underscoring the importance of insurgent non-violent behavior and addressing key issues such as sovereignty and governance.
Original source https://www.meganastewart.org/research

[Link to data](https://www.dropbox.com/s/vvuk09umjweszdo/Stewart_replication.zip?dl=0)

In [5]:
war_table = pd.read_stata('datasets/MAS_Original Data_Panel.dta')

In [6]:
war_table.head()

Unnamed: 0,acr,id,obsid,confid,dyadid,dyadperiod,side_a,side_b,location,year,...,communist,ethnonsec,rebord,dur,l1imr,l1gdppc_log,l1democracy,l1totalpop_log,lmtnest,terrcontrol
0,MYA,MYA-1990,EACD.2.2-112,1240.2,1240.2-1,1,Burma,ABSDF,Burma,1990.0,...,1.0,0.0,0.0,0.0,77.900002,,0.0,,3.600048,0.0
1,MYA,MYA-1991,EACD.2.2-112,1240.2,1240.2-1,1,Burma,ABSDF,Burma,1991.0,...,1.0,0.0,0.0,1.0,76.099998,,0.0,,3.600048,0.0
2,MYA,MYA-1992,EACD.2.2-112,1240.2,1240.2-1,1,Burma,ABSDF,Burma,1992.0,...,1.0,0.0,0.0,2.0,74.300003,,0.0,,3.600048,0.0
3,MYA,MYA-1993,EACD.2.2-112,1240.2,1240.2-1,1,Burma,ABSDF,Burma,1993.0,...,1.0,0.0,0.0,3.0,72.5,,0.0,,3.600048,0.0
4,MYA,MYA-1994,EACD.2.2-112,1240.2,1240.2-1,1,Burma,ABSDF,Burma,1994.0,...,1.0,0.0,0.0,4.0,70.599998,,0.0,,3.600048,0.0


In [7]:
war_proposed_variables = ['acr',
'id',
'obsid',
'confid',
'dyadid',
'dyadperiod',
'side_a',
'side_b',
'location',
'year',
'communist',
'ethnonsec',
'l1democracy',
'l1totalpop_log',
'totalpop_change_log',
'preconflict_edu',
'preconflict_health',
'terrcontrol',
'yearmin',
'yearpubeducount',
'yearcount',
'yearpubhealthcount',
'eduannual',
'healthannual',
'yearnoeducount',
'yearsexclusiveeducount',
'yearnohealthcount',
'yearexclusivehealthcount',
'exclude',
'id_use',
'yearpubeducountsec',
'yearpubeducountnonsec',
'countnonsec',
'countsec',
'yearpubhealthcountsec',
'yearpubhealthcountnonsec']

In [8]:
df_war, groups_war = filter_tables_groups(war_table, war_proposed_variables, 'side_b')

In [9]:
df_war.shape

(2426, 15)

In [10]:
len(groups_war)

327

---
#### Intersections

In [46]:
print('Rebel groups per dataset\nFORGE: {}\nRTG: {}\nBAAD: {}'\
      .format(len(groups_forge), len(groups_rtg), len(groups_baad)))

Rebel groups per dataset
FORGE: 415
RTG: 443
BAAD: 140


In [47]:
len(set(groups_forge) & set(groups_rtg))

14

In [48]:
len(set(groups_baad) & set(groups_rtg))

69

In [49]:
len(set(groups_baad) & set(groups_forge))

17
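
The pairwise counts above can also be computed in one pass, which scales to all seven group lists. This is a minimal sketch: the helper `pairwise_overlaps` is hypothetical, and the short lists below are illustrative stand-ins, not the real contents of the datasets.

```python
from itertools import combinations

def pairwise_overlaps(group_lists):
    """Return {(name_a, name_b): number of shared group names} for every pair of datasets."""
    sets = {name: set(groups) for name, groups in group_lists.items()}
    return {
        (a, b): len(sets[a] & sets[b])
        for a, b in combinations(sets, 2)
    }

# Illustrative stand-ins for groups_forge, groups_rtg and groups_baad.
toy_groups = {
    'FORGE': ['Al-Shabaab', 'Unity Party'],
    'RTG': ['Al-Shabaab', '1 May Group'],
    'BAAD': ['Al-Shabaab', 'Abu Sayyaf Group (ASG)'],
}
overlaps = pairwise_overlaps(toy_groups)
for pair, count in overlaps.items():
    print(pair, count)
```

In the notebook, the dictionary would map each dataset label to its `groups_*` list.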

In [50]:
set(groups_baad) & set(groups_forge) & set(groups_rtg)

{'Al-Shabaab', 'Karen National Union', 'Oromo Liberation Front'}

#### Only 3 exact matches among the 3 datasets
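
Exact string comparison misses groups whose names differ only in case, punctuation, or an appended acronym such as `Abu Sayyaf Group (ASG)`. One possible way to surface candidate matches, not used in the original notebook, is to normalize the names and compare them with the standard-library `difflib`. The helpers `normalize_name` and `candidate_matches`, the cutoff of 0.85, and the example lists are all assumptions made for illustration. Any candidate pair should still be reviewed by hand.

```python
import re
from difflib import get_close_matches

def normalize_name(name):
    """Lowercase, drop a trailing parenthetical such as '(ASG)', and collapse punctuation."""
    name = re.sub(r'\s*\([^)]*\)\s*$', '', str(name))
    name = re.sub(r'[^a-z0-9]+', ' ', name.lower())
    return name.strip()

def candidate_matches(groups_a, groups_b, cutoff=0.85):
    """For each name in groups_a, return the closest name in groups_b after normalization."""
    normalized_b = {normalize_name(g): g for g in groups_b}
    matches = {}
    for g in groups_a:
        close = get_close_matches(normalize_name(g), list(normalized_b), n=1, cutoff=cutoff)
        if close:
            matches[g] = normalized_b[close[0]]
    return matches

# Made-up name variants for illustration only.
a = ['Abu Sayyaf Group (ASG)', 'Karen National Union']
b = ['Abu Sayyaf Group', 'Karen National Union (KNU)', 'Unity Party']
found = candidate_matches(a, b)
print(found)
```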

In [51]:
pd.Series(groups_baad).to_csv('groups_baad.csv')
pd.Series(groups_forge).to_csv('groups_forge.csv')
pd.Series(groups_rtg).to_csv('groups_rtg.csv')
pd.Series(groups_good).to_csv('groups_good.csv')
pd.Series(groups_mas).to_csv('groups_mas.csv')
pd.Series(groups_gsw).to_csv('groups_gsw.csv')
pd.Series(groups_war).to_csv('groups_war.csv')