# EUR-LEX Sources
---

### The CEPS EurLex dataset: 142,036 EU laws from 1952-2019 with full text and 22 variables

In [1]:
# Source: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/0EGYWY

The dataset contains 142,036 EU laws, almost the entire corpus of the EU's digitally available legal acts passed between 1952 and 2019. It covers the three types of legally binding acts passed by the EU institutions: 102,304 regulations, 4,070 directives and 35,798 decisions, all in English. The dataset was scraped from the official EU legal database (Eur-lex.eu) and converted into machine-readable CSV format using R and Python.
The dataset was collected by the Centre for European Policy Studies (CEPS) for the TRIGGER project (https://trigger-project.eu/). We hope that it will facilitate future quantitative and computational research on the EU.


In [2]:
import pandas as pd
import time
start = time.time()

In [3]:
data = pd.read_csv('EurLex_all.csv')

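Since the file carries the full text of every act, loading all of it takes a while. When only a few metadata columns are needed for a first look, the file can be streamed row by row instead. The following is a minimal sketch using the standard library's `csv` module rather than pandas; the helper name `peek_rows` and the sample rows are invented for illustration, and the raised field size limit is an assumption about how long individual `act_raw_text` values may be.

```python
import csv
import io


def peek_rows(handle, columns, limit=5):
    """Yield up to `limit` rows from an open CSV handle, keeping only `columns`."""
    # Full-text fields may exceed the csv module's default field size limit,
    # so raise it before reading (the value here is an arbitrary large number).
    csv.field_size_limit(10**9)
    reader = csv.DictReader(handle)
    for i, row in enumerate(reader):
        if i >= limit:
            break
        yield {name: row.get(name, "") for name in columns}


# Tiny in-memory stand-in for EurLex_all.csv (values invented for illustration).
sample = io.StringIO(
    "CELEX,Act_type,Status,act_raw_text\n"
    "X0001,Decision,In Force,some text\n"
    "X0002,Directive,In Force,more text\n"
)
preview = list(peek_rows(sample, ["CELEX", "Act_type"], limit=1))
print(preview)
```

With the real file, the handle would come from something like `open('EurLex_all.csv', newline='', encoding='utf-8')`; the encoding is an assumption and may need adjusting.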


In [4]:
data.head()

Unnamed: 0,CELEX,Act_name,Act_type,Status,EUROVOC,Subject_matter,Treaty,Legal_basis_celex,Authors,Procedure_number,...,Act_cites,Cites_links,Act_ammends,Ammends_links,Eurlex_link,ELI_link,Proposal_link,Oeil_link,Additional_info,act_raw_text
0,32019D0276,Decision (EU) 2019/276 of the European Parliam...,Decision,In Force,aid to refugees; budget appropriation; EC gene...,cooperation policy; budget; EU finance; int...,TFEU,32013Q1220(01),European Parliament; European Council,,...,32013R1311,http://data.europa.eu/eli/reg/2013/1311/oj,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2019/276/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,22.2.2019 EN Official Journal of the European ...
1,32019D0277,Decision (EU) 2019/277 of the European Parliam...,Decision,In Force,aid to catastrophe victims; emergency aid; EC ...,cooperation policy; EU finance; budget; det...,TFEU,32002R2012; 32013Q1220(01),European Parliament; European Council,,...,32013R1311,http://data.europa.eu/eli/reg/2013/1311/oj,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2019/277/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,22.2.2019 EN Official Journal of the European ...
2,32019D0275,Decision (EU) 2019/275 of the European Parliam...,Decision,In Force,professional reintegration; Attica; EGF; EC ge...,employment; regions of EU Member States; EU ...,TFEU,32013Q1220(01); 32013R1309,European Parliament; European Council,,...,32013R1311,http://data.europa.eu/eli/reg/2013/1311/oj,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2019/275/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,22.2.2019 EN Official Journal of the European ...
3,32018D1859,Decision (EU) 2018/1859 of the European Parlia...,Decision,In Force,commitment appropriation; Latvia; payment appr...,budget; Europe; EU finance; cooperation pol...,TFEU,32002R2012; 32013Q1220(01),European Council; European Parliament,,...,32018D508; 32013R1311,http://data.europa.eu/eli/dec/2018/508/oj; htt...,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2018/1859/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,28.11.2018 EN Official Journal of the European...
4,32018D1720,Decision (EU) 2018/1720 of the European Parlia...,Decision,In Force,Northern Portugal; Portugal; employment aid; e...,regions of EU Member States; Europe; economi...,TFEU,32013Q1220(01); 32013R1309,European Council; European Parliament,,...,32013R1311,http://data.europa.eu/eli/reg/2013/1311/oj,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2018/1720/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,16.11.2018 EN Official Journal of the European...


In [5]:
data.columns

Index(['CELEX', 'Act_name', 'Act_type', 'Status', 'EUROVOC', 'Subject_matter',
       'Treaty', 'Legal_basis_celex', 'Authors', 'Procedure_number',
       'Date_document', 'Date_publication', 'First_entry_into_force',
       'Temporal_status', 'Act_cites', 'Cites_links', 'Act_ammends',
       'Ammends_links', 'Eurlex_link', 'ELI_link', 'Proposal_link',
       'Oeil_link', 'Additional_info', 'act_raw_text'],
      dtype='object')

In [6]:
len(data)

142036

In [7]:
data.groupby(['Status','Act_type']).agg({'CELEX': 'count'})

Status,Act_type,CELEX
In Force,Decision,6515
In Force,Decision_DEL,26
In Force,Decision_ENTSCHEID,6309
In Force,Decision_FRAMW,22
In Force,Decision_IMPL,1853
In Force,Delegated Regulation,632
In Force,Directive,1294
In Force,Directive_DEL,58
In Force,Directive_IMPL,59
In Force,Implementing Regulation,4196
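The `Act_type` column is more fine-grained than the three families named in the introduction (regulations, directives, decisions). If counts per family are wanted, the labels can be collapsed first. The sketch below is one possible mapping, based only on the labels visible in the output above; the function name `broad_type` and the `"Other"` fallback are assumptions, and labels that do not appear in this truncated output may need extra rules.

```python
def broad_type(act_type):
    """Collapse a detailed Act_type label into one of three broad families."""
    label = str(act_type).strip()
    if label.startswith("Decision"):
        return "Decision"
    if label.startswith("Directive"):
        return "Directive"
    # Covers labels such as "Delegated Regulation" and "Implementing Regulation".
    if "Regulation" in label:
        return "Regulation"
    # Anything unrecognised is kept apart rather than guessed at.
    return "Other"


for label in ["Decision_ENTSCHEID", "Directive_DEL", "Implementing Regulation"]:
    print(label, "->", broad_type(label))
```

Applied to the DataFrame, this could be used as `data['Act_type'].map(broad_type)` before grouping.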


In [8]:
from flashtext import KeywordProcessor

In [9]:
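def extract(vec, dictionary, info=False):
    """Run the keyword processor over every text in `vec`.

    Returns one list of matched keywords per input text; with `info=True`
    each match also carries its start and end offsets.
    """
    # str() guards against missing (NaN) texts; lowercasing makes matching case-insensitive.
    return [dictionary.extract_keywords(str(line).lower(), span_info=info) for line in vec]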
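To make the shape of the result concrete, here is a small stand-in that mimics what `extract` returns for a single text: a list with one entry per keyword occurrence, in document order. It uses the standard library's `re` module and is not how flashtext works internally; the names `build_matcher` and `find_keywords` are hypothetical, and the sample sentence is invented.

```python
import re


def build_matcher(keywords):
    """Compile one case-insensitive, word-bounded pattern covering all keywords."""
    # Longest first, so a longer phrase wins over a shorter one it contains.
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


def find_keywords(text, matcher, canonical):
    """Return the canonical spelling of every keyword occurrence in `text`."""
    return [canonical[m.group(0).lower()] for m in matcher.finditer(text)]


keywords = ["Boko Haram", "Taliban"]
canonical = {k.lower(): k for k in keywords}
matcher = build_matcher(keywords)

text = "Measures against the TALIBAN and Boko Haram; Taliban assets."
hits = find_keywords(text, matcher, canonical)
print(hits)
```

Repeated mentions appear repeatedly in the list, which is why a later step can derive both a set of distinct names and a total count from the same result.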

In [10]:
# terms_list = [
#     'Liberation Tigers of Tamil Eelam',
#     'Revolutionary Armed Forces of Colombia',
#     'Communist Party of Nepal-Maoist',
#     'Free Aceh Movement',
#     'National Union for the Total Independence of Angola',
# ]

In [11]:
terms_list = [
    'Al Qa’ida',
    'Islamic State of Iraq and al Sham',
    'Islamic Jihad Group',
    'Boko Haram',
    'Al-Shabaab',
    'Taliban',
    'Al-Nusrah Front',
    'Hezbollah',
]

In [12]:
terms_dict = KeywordProcessor()
terms_dict.add_keywords_from_list(terms_list)
terms_extracted = extract(data.act_raw_text, terms_dict)
# Distinct matched names per act, rendered as a comma-separated string (empty if none).
rows = [list(set(i)) if len(i) > 0 else '' for i in terms_extracted]
data['matches'] = [str(i).replace('[', '').replace(']', '') for i in rows]
# Total number of keyword occurrences per act, repeats included.
data['count_matches'] = [len(i) for i in terms_extracted]
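Two details of the cell above are easy to miss. First, `count_matches` counts every occurrence, while `matches` lists each distinct name once. Second, `'Al Qa’ida'` is written with a typographic apostrophe, so texts that spell the name with a plain `'` would presumably not match unless both sides are normalised. The sketch below illustrates both points with the standard library only; the helper names and the set of apostrophe variants are assumptions, not part of the original notebook.

```python
from collections import Counter

# Typographic characters that may stand in for a plain apostrophe (illustrative set).
_QUOTE_TABLE = str.maketrans({"\u2019": "'", "\u2018": "'", "\u02bc": "'"})


def normalise_quotes(text):
    """Replace typographic apostrophes with the plain ASCII one."""
    return text.translate(_QUOTE_TABLE)


def summarise(hits):
    """Return (distinct names joined by '; ', total number of occurrences)."""
    counts = Counter(hits)
    return "; ".join(sorted(counts)), sum(counts.values())


print(normalise_quotes("Al Qa\u2019ida"))
print(summarise(["Taliban", "Boko Haram", "Taliban"]))
```

Applying `normalise_quotes` to both the keyword list and the act texts before matching would make the two spellings equivalent.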

In [13]:
#data_laws.sort_values('count_matches', ascending=False).to_excel('datasets/laws_and_policies_AGRI.xlsx')
data.head()

Unnamed: 0,CELEX,Act_name,Act_type,Status,EUROVOC,Subject_matter,Treaty,Legal_basis_celex,Authors,Procedure_number,...,Act_ammends,Ammends_links,Eurlex_link,ELI_link,Proposal_link,Oeil_link,Additional_info,act_raw_text,matches,count_matches
0,32019D0276,Decision (EU) 2019/276 of the European Parliam...,Decision,In Force,aid to refugees; budget appropriation; EC gene...,cooperation policy; budget; EU finance; int...,TFEU,32013Q1220(01),European Parliament; European Council,,...,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2019/276/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,22.2.2019 EN Official Journal of the European ...,,0
1,32019D0277,Decision (EU) 2019/277 of the European Parliam...,Decision,In Force,aid to catastrophe victims; emergency aid; EC ...,cooperation policy; EU finance; budget; det...,TFEU,32002R2012; 32013Q1220(01),European Parliament; European Council,,...,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2019/277/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,22.2.2019 EN Official Journal of the European ...,,0
2,32019D0275,Decision (EU) 2019/275 of the European Parliam...,Decision,In Force,professional reintegration; Attica; EGF; EC ge...,employment; regions of EU Member States; EU ...,TFEU,32013Q1220(01); 32013R1309,European Parliament; European Council,,...,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2019/275/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,22.2.2019 EN Official Journal of the European ...,,0
3,32018D1859,Decision (EU) 2018/1859 of the European Parlia...,Decision,In Force,commitment appropriation; Latvia; payment appr...,budget; Europe; EU finance; cooperation pol...,TFEU,32002R2012; 32013Q1220(01),European Council; European Parliament,,...,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2018/1859/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,28.11.2018 EN Official Journal of the European...,,0
4,32018D1720,Decision (EU) 2018/1720 of the European Parlia...,Decision,In Force,Northern Portugal; Portugal; employment aid; e...,regions of EU Member States; Europe; economi...,TFEU,32013Q1220(01); 32013R1309,European Council; European Parliament,,...,,,https://eur-lex.europa.eu/legal-content/EN/ALL...,http://data.europa.eu/eli/dec/2018/1720/oj,https://eur-lex.europa.eu/legal-content/EN/ALL...,,,16.11.2018 EN Official Journal of the European...,,0


In [14]:
# Export acts with at least one match, most matches first.
cols = ['CELEX', 'Eurlex_link', 'Act_type', 'Status', 'Treaty', 'Authors', 'Subject_matter', 'matches', 'count_matches']
(data.loc[data['count_matches'] > 0, cols]
     .sort_values('count_matches', ascending=False)
     .to_excel('insurgency_groups_in_law2.xlsx'))

In [15]:
end = time.time()
print('Elapsed time: {}'.format(time.strftime("%H:%M:%S", time.gmtime(end - start))))

Elapsed time: 00:04:34
