<p>The EU Open Data Portal publishes data produced by the institutions and other bodies of the European Union. These open data are free and can be reused for research, applications, and other commercial or non-commercial purposes.
The dataset contains the projects funded by the European Union from 1994 to 2020 under the framework programmes (FP) for research and technological development. For each project it provides information such as reference, acronym, dates, funding, programmes, participant countries, subjects and objectives.
The datasets published on the Open Data Portal are produced on a monthly basis.</p>

In [1]:
import pandas as pd
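import xlrd  # Excel reader backend; pandas needs an engine installed to parse Excel files (xlrd >= 2.0 reads only .xls, openpyxl handles .xlsx)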

In [2]:
"""
Load datasets of all projects funded by the European Union for research and technological development under the:
- FP4: fourth framework programme (1994-1998)
- FP5: fifth framework programme (1998–2002)
- FP6: sixth framework programme (2002–2006)
- FP7: seventh framework programme (2007–2013)
- H2020: Horizon 2020 framework programme (2014-2020)
"""

xlsxFP4 = pd.ExcelFile("dataset/cordisfp4projects.xlsx")
xlsxFP5 = pd.ExcelFile("dataset/cordis-fp5projects.xlsx")
xlsxFP6 = pd.ExcelFile("dataset/cordis-fp6projects.xlsx")
xlsxFP7 = pd.ExcelFile("dataset/cordis-fp7projects.xlsx")
xlsxH2020 = pd.ExcelFile("dataset/cordis-h2020projects.xlsx")

In [3]:
dataFP4 = xlsxFP4.parse()
dataFP5 = xlsxFP5.parse()
dataFP6 = xlsxFP6.parse()
dataFP7 = xlsxFP7.parse()
dataH2020 = xlsxH2020.parse()
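
<p>The two cells above can also be written as one loop. The following is a minimal sketch, not the notebook's own method: the names <code>CORDIS_FILES</code> and <code>load_cordis</code> and the <code>reader</code> parameter are hypothetical, the file paths are copied from the loading cell, and it assumes an Excel engine that supports <code>.xlsx</code> (for example openpyxl) is installed.</p>

```python
import pandas as pd

# Paths copied from the loading cell; the dictionary keys are illustrative labels.
CORDIS_FILES = {
    "FP4": "dataset/cordisfp4projects.xlsx",
    "FP5": "dataset/cordis-fp5projects.xlsx",
    "FP6": "dataset/cordis-fp6projects.xlsx",
    "FP7": "dataset/cordis-fp7projects.xlsx",
    "H2020": "dataset/cordis-h2020projects.xlsx",
}


def load_cordis(files=CORDIS_FILES, reader=pd.read_excel):
    """Read the first sheet of each workbook into a DataFrame keyed by programme.

    `reader` is a hypothetical hook so the function can be tried without the files.
    """
    return {name: reader(path) for name, path in files.items()}
```

<p>With the files in place, <code>frames = load_cordis()</code> would give <code>frames["FP7"]</code> in place of <code>dataFP7</code>. Like <code>ExcelFile.parse()</code>, <code>pd.read_excel</code> reads the first sheet by default.</p>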

<h4>Common attributes in all files:</h4>
<ul>
<li>rcn: reference code (type: int)</li>
<li>project title (type: string)</li>
<li>start date (type: datetime)</li>
<li>end date (type: datetime)</li>
<li>status (type: string):
    <ul>
    <li>completed/accepted (FP4)</li>
    <li>null (FP5, FP6)</li>
    <li>ong(oing)/can(celled) (FP7)</li>
    <li>signed (H2020)</li>
    </ul></li>
<li>acronym  (type: string)</li>
<li>programme/pga (type: string)</li>
<li>framework programme (type: string)</li>
<li>total Cost (type: int)</li>
<li>objective (type: string)</li>
<li>projectUrl</li>
<li>(project) call (type: string)</li>
<li>subject (type: list)</li>
<li>coordinatorCountry (type: string)</li>
<li>participantCountries (type: list)</li>
</ul>
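
<p>The list of common attributes can be checked programmatically. The sketch below is an illustration under assumptions: the helper names <code>common_columns</code> and <code>stack_common</code> and the extra <code>source</code> column are hypothetical, and the toy frames only stand in for <code>dataFP4</code> to <code>dataH2020</code>.</p>

```python
import pandas as pd


def common_columns(frames):
    """Return the column names present in every frame, in the order of the first frame."""
    frames = list(frames)
    shared = set(frames[0].columns)
    for df in frames[1:]:
        shared &= set(df.columns)
    return [col for col in frames[0].columns if col in shared]


def stack_common(frames_by_name):
    """Concatenate frames on their shared columns.

    A hypothetical 'source' column records which frame each row came from.
    """
    cols = common_columns(frames_by_name.values())
    parts = [df[cols].assign(source=name) for name, df in frames_by_name.items()]
    return pd.concat(parts, ignore_index=True)


# Toy frames (not real CORDIS rows) standing in for the loaded datasets.
demo = {
    "FP4": pd.DataFrame({"rcn": [1], "title": ["a"], "keywords": ["k"]}),
    "FP5": pd.DataFrame({"rcn": [2], "title": ["b"], "topics": ["t"]}),
}
combined = stack_common(demo)
```

<p>On the real data the call would be <code>stack_common({"FP4": dataFP4, "FP5": dataFP5, ...})</code>. Columns that exist in only some files, such as the extra attributes, are dropped by this approach.</p>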

<h3>FP4: Fourth framework programme (1994-1998)</h3>
<br>
https://data.europa.eu/euodp/en/data/dataset/cordisfp4projects 

<p>Extra attributes: Contract Number, Keywords, Date of Signature, Total Funding, General Information, Achievements, Activity Area, Contract Type</p>


In [28]:
# dataFP4.head(10)
df4 = dataFP4[['rcn', 'title', 'objective', 'subjects', 'frameworkProgramme']]
df4.head(10)

Unnamed: 0,rcn,title,objective,subjects,frameworkProgramme
0,29005,Spot IV-Végétation,,Environmental Protection; Forecasting; Meteoro...,Fourth Framework Programme
1,30802,Formation and occurrence of nitrous acd in the...,%LTo understand the mechanisms leading to the ...,Environmental Protection; Forecasting; Measure...,Fourth Framework Programme
2,31031,Process for Production of Light Olefins by Deh...,,Industrial Manufacture; Materials Technology,Fourth Framework Programme
3,30803,High resolution diode laser carbon dioxide env...,%LTo develop a new instrument for measuring at...,Environmental Protection; Measurement Methods;...,Fourth Framework Programme
4,31004,Subsurface Radar as a Tool for Non-destructive...,,Industrial Manufacture; Materials Technology; ...,Fourth Framework Programme
5,30804,Pollution from aircraft emissions In the North...,To determine by measurements and analysis the ...,Environmental Protection; Forecasting; Measure...,Fourth Framework Programme
6,30805,Diversity Effects in Grassland Ecosystems of E...,DEGREE aims at investigating the modifications...,Environmental Protection; Meteorology,Fourth Framework Programme
7,31108,Improvement of moisture content measuring syst...,,Industrial Manufacture; Measurement Methods; R...,Fourth Framework Programme
8,31109,Robust process analytical methods for industri...,The problem of the agreement between analytica...,Industrial Manufacture; Measurement Methods; R...,Fourth Framework Programme
9,31110,Improvement of robot industrial standardisation,,"Electronics, Microelectronics; Industrial Manu...",Fourth Framework Programme


<h3>FP5: fifth framework programme (1998–2002)</h3>
<br>
https://data.europa.eu/euodp/en/data/dataset/cordisfp5projects

<p>Extra attributes: id, topics, ecMaxContribution, fundingScheme, coordinator, participants</p>


In [27]:
# dataFP5.head(10)
df5 = dataFP5[['rcn', 'title', 'objective', 'subjects', 'frameworkProgramme']]
df5.head(10)

Unnamed: 0,rcn,title,objective,subjects,frameworkProgramme
0,64570,Genetic diversity in agriculture: temporal flu...,The overall objective of this project is to de...,ECO;SEA;LIF;ENV;AGR,FP5-LIFE QUALITY
1,64192,Sensing and controlling single molecules by no...,This project concerns controlling and sensing ...,BIO;LIF;ENV;MED;WAS;ITT,FP5-LIFE QUALITY
2,61977,Transduction mechanisms for non-noxious and no...,,,FP5-HUMAN POTENTIAL
3,54932,Portable measurement systems for atmospheric p...,The primary objective of the proposed project ...,SEA;MET;ENV;FOR,FP5-EESD
4,56044,Benthic primary production - carbon cycling an...,,,FP5-HUMAN POTENTIAL
5,54627,Probiotics and gastrointestinal disorders - co...,The PROGID project proposes to perform 2 disti...,ECO;LIF;MED;FOO;AGR;SAF;IND,FP5-LIFE QUALITY
6,67317,Alternative fuels for industrial gas turbines ...,This project aims to contribute to the optimis...,ENV;RSE;ESV,FP5-EESD
7,53880,New polyolefin materials via metal catalysed c...,,,FP5-HUMAN POTENTIAL
8,51440,From gene regulation to gene function: regulat...,"As unicellular organisms, bacteria must match ...",BIO;LIF;SCI;MED,FP5-LIFE QUALITY
9,58975,Nutritional enhancement of probiotics and preb...,The purpose of the project is to address and t...,ECO;LIF;MED;FOO;AGR;SAF;IND,FP5-LIFE QUALITY
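
<p>The <code>subjects</code> column is listed as a list-valued attribute, but the output above shows it stored as one semicolon-separated string per project. A minimal sketch of converting it to Python lists follows. The helper name <code>split_subjects</code> is hypothetical, and treating missing values as empty lists is an assumption rather than something the dataset prescribes.</p>

```python
import pandas as pd


def split_subjects(series):
    """Turn 'ECO;SEA;LIF'-style strings into lists; non-string values become []."""
    def to_list(value):
        if not isinstance(value, str):
            return []
        # Strip whitespace so 'A; B' (FP4 style) and 'A;B' (FP5 style) behave the same.
        return [part.strip() for part in value.split(";") if part.strip()]

    return series.apply(to_list)


# Strings copied from the displayed outputs, plus one missing value.
sample = pd.Series(["ECO;SEA;LIF;ENV;AGR", None, "Environmental Protection; Meteorology"])
subject_lists = split_subjects(sample)
```

<p>Applied to the notebook's data this would read <code>split_subjects(dataFP5['subjects'])</code>.</p>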


<h3>FP6: sixth framework programme (2002–2006)</h3>
<br>
https://data.europa.eu/euodp/en/data/dataset/cordisfp6projects

<p>Extra attributes: reference, topics, ecMaxContribution, fundingScheme, coordinator, participants</p>


In [26]:
# dataFP6.head(2)
df6 = dataFP6[['rcn', 'title', 'objective', 'subjects', 'frameworkProgramme']]
df6.head(10)

Unnamed: 0,rcn,title,objective,subjects,frameworkProgramme
0,71920,Amigo Ambient Intelligence for the networked h...,The networked home environment leads to many n...,IPS,
1,85502,Genetic component of the low dose risk of thyr...,Cancer of the non-medullary (follicular epithe...,BIO;RAD,
2,74968,European food information resource network,EuroFIR will form a world-leading collaboratio...,IPS;FOO,
3,74155,Global allergy and asthma european network,Allergic diseases and asthma pose an important...,SEA;LIF;MED;FOO;AGR,
4,74297,Advanced Protection Systems (APROSYS),The IP on Advanced Protective Systems (APROSYS...,,
5,81228,Flavonoids and related phenolics for healthy L...,There is growing evidence that bioactives in t...,MED;FOO;AGR;SAF,
6,82431,Multi-functional carbon nanotubes for biomedic...,We will exploit the potential of multi-functio...,,
7,79163,Enzyme Microarrays-An integrated technology fo...,The deciphering of the human genome laid the g...,,
8,75937,Healthy Lifestyle in Europe by Nutrition in Ad...,The key to health promotion and disease preven...,SOC;MED;FOO,
9,73971,"Crystalline Silicon PV: Low-cost, highly effic...",Crystal-Clear intends to develop innovative ma...,,
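
<p>In the output above the <code>frameworkProgramme</code> column is empty for the displayed FP6 rows. If a label is needed later, for example before combining the files, one option is to fill the blanks with a constant. This is only a sketch: the helper name <code>label_programme</code> and the label <code>"FP6"</code> are assumptions, as is the idea that empty cells are loaded as missing values or blank strings.</p>

```python
import pandas as pd


def label_programme(df, label, column="frameworkProgramme"):
    """Return a copy of df with missing or blank values in `column` replaced by `label`."""
    out = df.copy()
    values = out[column]
    # A cell counts as blank if it is missing or a whitespace-only string.
    blank = values.isna() | values.map(lambda v: isinstance(v, str) and not v.strip())
    out[column] = values.where(~blank, label)
    return out


# Toy frame (not real CORDIS rows) with one missing, one blank and one filled value.
toy = pd.DataFrame({"rcn": [1, 2, 3], "frameworkProgramme": [None, "  ", "FP6-X"]})
labelled = label_programme(toy, "FP6")
```

<p>The input frame is left unchanged, so the original values remain available for comparison.</p>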


<h3>FP7: seventh framework programme (2007–2013)</h3>
<br>
https://data.europa.eu/euodp/en/data/dataset/cordisfp7projects 

<p>Extra attributes: reference, topics, ecMaxContribution, fundingScheme, coordinator, participants</p>

In [25]:
# dataFP7.head(10)
df7 = dataFP7[['rcn', 'title', 'objective', 'subjects', 'frameworkProgramme']]
df7.head(10)

Unnamed: 0,rcn,title,objective,subjects,frameworkProgramme
0,110629,ALFRED - Personal Interactive Assistant for In...,***Personal Interactive Assistant for Independ...,INF,FP7
1,104117,Microbial Biomarker Records in Tibetan Peats: ...,It is crucial to understand terrestrial microb...,SCI,FP7
2,188177,Post-glacial recolonisation and Holocene anthr...,"At the end of last glaciation, ca. 15 000 cal....",,FP7
3,188066,Molecular Mechanisms Employed by the Newly Ass...,Posttranscriptional gene regulation is an esse...,,FP7
4,187919,Identifying the targets and mechanism of actio...,The Ubiquitin (UB) and SUMO modification pathw...,,FP7
5,107182,Archaeological Investigations of the Extra-Urb...,'The research project 'ARIEL' proposes the arc...,LIF,FP7
6,109557,Nano -structural and -dynamic events in the T-...,The organization of the T-cell in its resting ...,LIF,FP7
7,107890,Synthesis and Biological Evaluation of a Poten...,'This project is directed toward the total syn...,LIF,FP7
8,108334,Microbially catalyzed electricity driven biopr...,'The breakthroughs in extracellular electron t...,SCI,FP7
9,109580,Thin-film Hybrid Interfaces: a training initia...,Organic thin-films constitute a fast growing a...,LIF,FP7


<h3>H2020: Horizon 2020 framework programme (2014-2020)</h3>
<br>
https://data.europa.eu/euodp/en/data/dataset/cordisH2020projects

<p>Extra attributes: reference, topics, ecMaxContribution, fundingScheme, coordinator, participants</p>

In [24]:
# dataH2020.head(10)
dfH2020 = dataH2020[['rcn', 'title', 'objective', 'subjects', 'frameworkProgramme']]
dfH2020.head(10)
