### *Data Collection - European Parliament*
## Preparing Raw Data
---
**Sample Text 11**
Title: Digital Europe programme <br>
Date: April 29, 2021 - Brussels

In [28]:
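# Import the libraries used in this notebook.
# Note: `from __future__ import division` was dropped; it is redundant in Python 3
# and raises a SyntaxError when it is not the first statement in a cell.
import os.path
import pprint
import re
import time
import urllib.request
from urllib import request

import nltk
import pandas as pd
import requests
from bs4 import BeautifulSoup
from nltk import FreqDist, word_tokenize
from requests_html import HTMLSession

# word_tokenize requires NLTK's Punkt tokenizer data, installed via nltk.download()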

---
### Process: Trimming the debate by inserting the original or translated English text and tokenizing it.
*Note*: Due to time constraints, the process has been streamlined.

- English parts of the debate will be added manually as a string and then tokenized. 

- The same translation method is applied to all EU Parliament debates: non-English parts are copied from the original web pages, translated to English with a single tool throughout, Google Translate (https://translate.google.com/?hl=de&tab=TT), and pasted in as a string. 

- Afterwards, the same steps are applied as per usual (tokenizing, standardizing).
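
The steps listed above (paste the text as a string, tokenize, standardize) can be sketched as one small helper. This is only an illustration under stated assumptions: `simple_tokenize`, `standardize` and `prepare` are hypothetical names, the regex tokenizer is a stand-in so the sketch runs without NLTK data, and "standardizing" is assumed here to mean lowercasing and dropping punctuation-only tokens, which the notebook does not define at this point.

```python
import re
from typing import Callable, List


def simple_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): words with internal hyphens or
    apostrophes stay together, every other non-space character is its own token."""
    return re.findall(r"\w+(?:[-'’]\w+)*|[^\w\s]", text)


def standardize(tokens: List[str]) -> List[str]:
    """Assumed standardization: lowercase and keep only tokens containing a letter."""
    return [t.lower() for t in tokens if any(ch.isalpha() for ch in t)]


def prepare(text: str, tokenizer: Callable[[str], List[str]] = simple_tokenize) -> List[str]:
    """Tokenize a pasted speech string, then standardize the tokens."""
    return standardize(tokenizer(text))


# Short sentence taken from one of the speeches in this sample
sample = "Mr President, the Digital Europe Programme is pan-European."
prepared = prepare(sample)
print(prepared)
```

To stay with the tokenizer used in this notebook, `word_tokenize` can be passed as the `tokenizer` argument, for example `prepare(raw1_1, tokenizer=word_tokenize)`.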

Because of the changed process, the URL and the web-scraping step are technically no longer necessary; they are nevertheless included for completeness. 

In [29]:
# url = "https://www.europarl.europa.eu/doceo/document/CRE-9-2021-04-29-ITM-014_EN.html"
# html = requests.get(url)
# raw = BeautifulSoup(html.content, 'html.parser').get_text()
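
For readers who want to see what the text-extraction step amounts to, here is a minimal sketch that pulls visible text out of an HTML string using only the standard library. It is a stand-in, not the BeautifulSoup call from the commented-out cell, and its output is not guaranteed to match `get_text()`. `TextExtractor`, `html_to_text` and `sample_html` are hypothetical names, and the sample HTML is invented for illustration rather than taken from the Parliament page.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects text nodes while skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0          # > 0 while inside <script> or <style>
        self.chunks: List[str] = []   # non-empty text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text fragments of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Invented sample markup (assumption), not the real debate page
sample_html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>Mr President,</p><p>thank you.</p></body></html>"
)
page_text = html_to_text(sample_html)
print(page_text)
```

Fetching the page itself is left out on purpose so the sketch runs offline; in the notebook that step is the `requests.get(url)` call shown in the cell.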

In [30]:
raw1_1 = 'Valter Flego, Reporter. - Honorable President, Commissioner, the European Parliament will give the green light to the Digital Europe program today. As a citizen of the European Union and as a program rapporteur, I am very happy about that. First of all, I would like to thank all of you from the bottom of my heart for your great help and support. We succeeded on time and without delay. Digital Europe today is getting its formal end, that is, its big operational start, which we honestly hardly all expect together. From virtual meetings and remote voting, paperless offices through supercomputers and artificial intelligence to smart homes and driverless cars. The digital revolution is happening fast and irreversibly. At the same time, the negative economic consequences of covida-19 will be severe and severe, and it is here that the digital industry plays a key role as a major tool for increasing European competitiveness and resilience. As a powerful tool for a better standard of living and a more comfortable life and daily work, as a driver of significant new sources of income based on the latest business models and services. It is digital as well as green. The main goal of this mandate. The main goal of this decade in our modern Europe, and the widespread application of digital technologies and solutions, is key to overcoming the investment gap that affects both the private and the public. In 2021, when we nostalgically remember the distant 2019, the importance of digital is unquestionable even for the biggest skeptics. Yes, we’ve all, whether we like it or not, digitized over the past year. Our work, business, private and business communication, socializing and free time. In the past year, everything has been digitized at an enviable speed, but in order to achieve a turning point in economies, this digitalization must be supported by a true digital transformation. 
This means a deeper change that deals with internal processes, human skills and our culture, because the rear of Europe is reserved for those who underestimate the importance of digital transformation. That is why digital Europe wants to contribute to strengthening a Europe in which digital technologies also enable better health and all public services in general. Better education, more successful fight against climate change, digital transformation of public and private companies, transport, tourism, industry, medicine, the whole economy and the whole industry. Equally important, with Digital Europe we want to strengthen the trust our citizens have in digital technologies by absolutely protecting their privacy, security and consumer rights, while increasing the transparency and speed of services provided. With Digital Europe, we want to contribute to achieving European digital independence and strengthening competitiveness in the global digital economy, but at the same time we are responsibly approaching new generations, new processes and new jobs. The Digital Europe program will play a significant and powerful role in the coming years, but I want to remain absolutely realistic in my predictions. 7.5 billion euros is a lot of money. With this money, for example, one Croatia can significantly accelerate its transition to e-society. However, Digital Europe is only one and in itself limited aid, one program, and it is not a program to build digital capacity with the most European money. Therefore, it is important that there is cooperation and synergy between European programs, that there is coordination, that it is known who is working and in what way. Complementarity with other programs is important, for example with the Horizon Europe program, which with its almost 100 billion euros covers a much wider area, as well as with the multiannual financial framework 2021-2027. 
Digital and green, ladies and gentlemen, are therefore two indispensable items of every next European program and project, and the key priorities of the programming period ahead. If Europe wants to become a leader in the coming years and not remain a follower, green digital Europe really has no alternative. Take, for example, the field of health and vaccine development. The challenges we have been facing for the last year due to the coronavirus pandemic are absolutely unprecedented. In such difficult times, we need a ready and strong Europe more than ever before. A Europe that has the most advanced technologies, a Europe that has top experts, a Europe that can defy the virus on its own, a Europe that does not lag behind the global technological superpowers but catches up with them, a Europe that develops its vaccines and a Europe that invests huge amounts of money. We need that. Digital Europe is our biggest ally on this path and one of the most important programs whose activities and funding contribute to the further development of all European countries, both today and in the future, in all aspects of life, including health and medicine. Today, digital Europe has its long-awaited and needed operational start. From today, Member States such as Croatia are taking great responsibility. Digital Europe offers us big money and opens up huge opportunities, and it is up to us how much'
tokens1_1 = word_tokenize(raw1_1)
for word in tokens1_1:
    print(word, end=' ')

Valter Flego , Reporter . - Honorable President , Commissioner , the European Parliament will give the green light to the Digital Europe program today . As a citizen of the European Union and as a program rapporteur , I am very happy about that . First of all , I would like to thank all of you from the bottom of my heart for your great help and support . We succeeded on time and without delay . Digital Europe today is getting its formal end , that is , its big operational start , which we honestly hardly all expect together . From virtual meetings and remote voting , paperless offices through supercomputers and artificial intelligence to smart homes and driverless cars . The digital revolution is happening fast and irreversibly . At the same time , the negative economic consequences of covida-19 will be severe and severe , and it is here that the digital industry plays a key role as a major tool for increasing European competitiveness and resilience . As a powerful tool for a better st

In [31]:
raw1_2 = 'Virginijus Sinkevičius, Member of the Commission. – Mr President, it’s my pleasure to be here today. This debate could not come at a better time. A little more than a month ago, the Commission presented Europe’s Digital Decade – a vision, targets and avenues for a successful digital transformation of Europe by 2030 in a direction that benefits all our citizens and businesses. It sets the course towards a digitally—empowered Europe by 2030 and provides for a digital compass to guide us there. The ambition is to pursue digital policies that empower people and businesses to seize a human—centred, sustainable and more prosperous digital future. This required setting the right regulatory framework for the digital space, and we are working closely with the Parliament on this, but also accelerating investments in our digital capabilities. The Digital Europe Programme will be one of the core EU instruments to achieve the Digital Decade. Let me underline that it will be the first ever EU programme entirely dedicated to digital. Here I would like to warmly thank the rapporteur, Mr Flego, and the Committee on Industry, Research and Energy (ITRE), as well as the shadow rapporteurs and all the associated committees. Your support in the negotiations and swift political agreement on digital Europe shows that we are all working together on shaping and supporting the digital transformation of Europe’s society and economy. The COVID crisis has demonstrated how crucial digital technologies and skills are for all of us to work, to study and to do business across our single market. The pandemic has also exposed Europe’s vulnerabilities and where we need to get better. It has only accentuated the need for a programme such as digital Europe that will enable us today to deploy digital capacities and strengthen our resilience. 
It will fill the deployment gap in digital technologies to make sure that the fruits of research and innovation investment get successfully deployed throughout Europe and benefit all our businesses, notably SMEs. Digital Europe will provide funding for projects in five crucial areas: supercomputing, artificial intelligence, cybersecurity, advanced digital skills and ensuring the wide use of digital technologies across the economy and society. These investments will support the Union’s twin objectives of a green transition and digital transformation and strengthen the Union’s resilience. The Digital Europe Programme is thus one of the key tools to make sure that the digital transition will propel the recovery. With the new EU budget and with the Recovery and Resilience Facility and its 20% target for digital, we have mobilised unprecedented resources to invest in digital transition. This marks a key opportunity for Europe to strengthen its digital capacities and deliver the Digital Decade targets by 2030. We know we now have our digital compass and resources to get us there.'
tokens1_2 = word_tokenize(raw1_2)
for word in tokens1_2:
    print(word, end=' ')

Virginijus Sinkevičius , Member of the Commission . – Mr President , it ’ s my pleasure to be here today . This debate could not come at a better time . A little more than a month ago , the Commission presented Europe ’ s Digital Decade – a vision , targets and avenues for a successful digital transformation of Europe by 2030 in a direction that benefits all our citizens and businesses . It sets the course towards a digitally—empowered Europe by 2030 and provides for a digital compass to guide us there . The ambition is to pursue digital policies that empower people and businesses to seize a human—centred , sustainable and more prosperous digital future . This required setting the right regulatory framework for the digital space , and we are working closely with the Parliament on this , but also accelerating investments in our digital capabilities . The Digital Europe Programme will be one of the core EU instruments to achieve the Digital Decade . Let me underline that it will be the f

In [32]:
raw1_3 = 'Pilar del Castillo Vera, on behalf of the PPE Group. – Mr President, Commissioner, this debate marks the end of the legislative process by which the first European digital program is created. Today we are fully immersed in a data economy and the stakes are high. We need to improve our digital infrastructure a lot. It must be said that, in this field, we are lagging behind the United States and China. Steps have recently been taken to overcome these shortcomings, for example, the GAIA-X program to develop a digital infrastructure for the cloud. It is a project that is also paving the way towards a European cloud, which is precisely the objective of the recently created European Alliance on Industrial Data and Cloud. The Digital Europe programme, with an endowment of 7.5 billion euros, reinforces these digital capacities through the acquisition of state-of-the-art supercomputers that the European Union does not currently have, but it is also going to invest in strategic projects related to artificial intelligence , cybersecurity and advanced digital skills. In short, the first European digital program will benefit all sectors of our economy, but also research, universities and countless other sectors. I want to make a special mention of SMEs, because they will be able to count on the digital infrastructure, with the supercomputers that allow them to process data through which they can optimize technologies that are so important, such as artificial intelligence.'
tokens1_3 = word_tokenize(raw1_3)
for word in tokens1_3:
    print(word, end=' ')

Pilar del Castillo Vera , on behalf of the PPE Group . – Mr President , Commissioner , this debate marks the end of the legislative process by which the first European digital program is created . Today we are fully immersed in a data economy and the stakes are high . We need to improve our digital infrastructure a lot . It must be said that , in this field , we are lagging behind the United States and China . Steps have recently been taken to overcome these shortcomings , for example , the GAIA-X program to develop a digital infrastructure for the cloud . It is a project that is also paving the way towards a European cloud , which is precisely the objective of the recently created European Alliance on Industrial Data and Cloud . The Digital Europe programme , with an endowment of 7.5 billion euros , reinforces these digital capacities through the acquisition of state-of-the-art supercomputers that the European Union does not currently have , but it is also going to invest in strategic

In [33]:
raw1_4 = 'Carlos Zorrinho, on behalf of the S&D Group. – Mr President, Commissioner, the approval of the Digital Europe Program is important because of the volume of investment it contains, which amounts to EUR 7.6 billion, and because of its scientific and technological priorities, but also because it is imbued with the common values ​​of European Union and an ethical commitment of reference that will make it possible to affirm the geopolitical relevance of the Union in the second wave of digitalization, contributing to make it democratic, green and inclusive. Unfortunately, the programme, which responds to such important areas as high-performance computing, artificial intelligence, cybersecurity, the transformation of public administrations, interoperability, digital skills, has seen its financial envelope reduced compared to the proposals of the European Commission. and the European Parliament. However, the Parliament managed to reinforce the implementation model through decentralized and networked investments, with flexible governance, and this enhances its articulation with the national recovery and resilience programs, thus making the digital bet more robust at the various levels at which will have to be done, contributing to digital inclusion and territorial cohesion. The program thus fully assumes the commitments to the ecological pact and to biodiversity and defines a case-by-case decision matrix for international cooperation in a logic of reciprocity and alignment of scientific and technological activities and an ethical approach. As shadow rapporteur for the S&D Group, I salute the work done, the results achieved, and I call for the joint work to continue to harness the full potential of this important programme.'
tokens1_4 = word_tokenize(raw1_4)
for word in tokens1_4:
    print(word, end=' ')

Carlos Zorrinho , on behalf of the S & D Group . – Mr President , Commissioner , the approval of the Digital Europe Program is important because of the volume of investment it contains , which amounts to EUR 7.6 billion , and because of its scientific and technological priorities , but also because it is imbued with the common values ​​of European Union and an ethical commitment of reference that will make it possible to affirm the geopolitical relevance of the Union in the second wave of digitalization , contributing to make it democratic , green and inclusive . Unfortunately , the programme , which responds to such important areas as high-performance computing , artificial intelligence , cybersecurity , the transformation of public administrations , interoperability , digital skills , has seen its financial envelope reduced compared to the proposals of the European Commission . and the European Parliament . However , the Parliament managed to reinforce the implementation model throug

In [34]:
raw1_5 = 'Susana Solís Pérez, on behalf of the Renew Group. – Mr President, Commissioner, we have a great challenge ahead: the digital transformation of our society and our economy. Today we approve Digital Europe, a European program devoted entirely to digitization, but that is not enough and requires much more effort. Private capital must be attracted and recovery funds must be invested to accelerate the digital transition. Europe needs a lot of investment: in artificial intelligence, in supercomputers, in cybersecurity and in data centers, so as not to be left behind by the United States and China. We also need to invest in connectivity, so that 5G reaches rural areas and does not empty them out. And we need all citizens and SMEs, whether it is a neighborhood store, a car workshop or a rural hotel, to be able to benefit from digitization. And for that it is essential to invest in people: to train all workers, to train experts to cover the thousands of jobs that are not covered now and that all of us, young and old, are prepared for an increasingly important digital administration. Finally, we need a regulatory framework that allows us to trust technology, that respects our rights and fundamental values, and that gives companies security so that they can innovate, invest and grow. The European digital decade begins. Let us put all the resources so that no one is left behind.'
tokens1_5 = word_tokenize(raw1_5)
for word in tokens1_5:
    print(word, end=' ')

Susana Solís Pérez , on behalf of the Renew Group . – Mr President , Commissioner , we have a great challenge ahead : the digital transformation of our society and our economy . Today we approve Digital Europe , a European program devoted entirely to digitization , but that is not enough and requires much more effort . Private capital must be attracted and recovery funds must be invested to accelerate the digital transition . Europe needs a lot of investment : in artificial intelligence , in supercomputers , in cybersecurity and in data centers , so as not to be left behind by the United States and China . We also need to invest in connectivity , so that 5G reaches rural areas and does not empty them out . And we need all citizens and SMEs , whether it is a neighborhood store , a car workshop or a rural hotel , to be able to benefit from digitization . And for that it is essential to invest in people : to train all workers , to train experts to cover the thousands of jobs that are not 

In [35]:
raw1_6 = 'Isabella Tovaglieri, on behalf of the ID Group. - (IT) Mr President, Commissioner, ladies and gentlemen, digital technologies have never shown their full potential but also their numerous pitfalls as in this pandemic period. I am not referring only to cyberbullying, to gender-based violence on the net, with their dramatic consequences both on a psychological and social level. But I am also referring to online scams to threats and digital security of companies and public bodies for geopolitical or profit-making purposes, which seriously jeopardize our economy, services, our digital identities and our sensitive data. These phenomena during the lockdown have had an exponential increase all over the world. In Italy alone in 2020 online fraud, data theft and server seizure in companies increased by 246%, reaching peaks of 100,000 units, causing enormous damage to the economy already brought to its knees by the health emergency. The regions most affected are also the most dynamic, such as Lombardy which records an average of 7 cybercrimes per day. The use of smart working, online commerce, Internet services and apps have favored the criminal activities of hackers to the detriment of consumers and businesses, especially smaller ones, which are more vulnerable precisely due to the lack of know-how. and the resources needed to invest in cybersecurity. According to the Politecnico di Milano, 59% of companies fear being the victim of cyber attacks, while one in four, when they receive them, pays the ransom. It is estimated that a cyber attack on a SME costs an average of 120,000 euros. In the dramatic phase we are going through, a similar event could really represent the coup de grace for many activities. Small and medium-sized enterprises are the backbone of the Italian and European economy. They are fundamental and crucial for the recovery and it is our duty to protect them from these pitfalls. 
This is why we will strongly support this European regulation, which aims to create a new digital single market that defends, at a supranational level, the economic interests and above all the fundamental rights of citizens.'
tokens1_6 = word_tokenize(raw1_6)
for word in tokens1_6:
    print(word, end=' ')

Isabella Tovaglieri , on behalf of the ID Group . - ( IT ) Mr President , Commissioner , ladies and gentlemen , digital technologies have never shown their full potential but also their numerous pitfalls as in this pandemic period . I am not referring only to cyberbullying , to gender-based violence on the net , with their dramatic consequences both on a psychological and social level . But I am also referring to online scams to threats and digital security of companies and public bodies for geopolitical or profit-making purposes , which seriously jeopardize our economy , services , our digital identities and our sensitive data . These phenomena during the lockdown have had an exponential increase all over the world . In Italy alone in 2020 online fraud , data theft and server seizure in companies increased by 246 % , reaching peaks of 100,000 units , causing enormous damage to the economy already brought to its knees by the health emergency . The regions most affected are also the mos

In [36]:
raw1_7 = 'Damian Boeselager, on behalf of the Verts/ALE Group. – Mr President, the Digital Europe Programme is the best practice for how EU digital policy could be done. It is pan-European, which will benefit all citizens and companies in all Member States. It puts digital skills and support for digitisation of SMEs and start-ups at the core of EU policy, and it ensures parliamentary scrutiny over every euro spent under the programme. I therefore thank the rapporteur and I also thank the shadows and the Commission and the Council for the good work on this file – but it is clearly only a first step. Our SMEs and start-ups have now all embarked on their digital journey. There are different levels of sophistication in this process, and we can help them succeed if we do a couple of things. First, we need to ensure that they’re not forced into a technical infrastructure that they can’t escape from. Second, we need to ensure that the dominance of certain actors, for example in the data market, does not translate itself automatically onto the machine-learning and AI markets. And third, we need to ensure that the new standards that are being set for data sharing and such are not decided in a closed circle just between a couple of Member States and between a couple of big companies. In the end, our European digital policy will only be successful if we understand that innovation can come from anywhere, and especially the smallest actors.'
tokens1_7 = word_tokenize(raw1_7)
for word in tokens1_7:
    print(word, end=' ')

Damian Boeselager , on behalf of the Verts/ALE Group . – Mr President , the Digital Europe Programme is the best practice for how EU digital policy could be done . It is pan-European , which will benefit all citizens and companies in all Member States . It puts digital skills and support for digitisation of SMEs and start-ups at the core of EU policy , and it ensures parliamentary scrutiny over every euro spent under the programme . I therefore thank the rapporteur and I also thank the shadows and the Commission and the Council for the good work on this file – but it is clearly only a first step . Our SMEs and start-ups have now all embarked on their digital journey . There are different levels of sophistication in this process , and we can help them succeed if we do a couple of things . First , we need to ensure that they ’ re not forced into a technical infrastructure that they can ’ t escape from . Second , we need to ensure that the dominance of certain actors , for example in the 

In [37]:
raw1_8 = 'Jessica Stegrud, on behalf of the ECR Group. - Mr President! How did Microsoft, Apple and Google come about? Was it because the US government in the 70s presented a proposal to Congress to create a Digital America? Did they debate for hours how many billions of dollars would be spent on digital hubs in Alaska or Florida and how to train federal workers in Texas in the new digital technology through federal initiatives, or by regulating the gender distribution in boardrooms? No, creative industries grow best when politics stays away. I do not think that a number of digital hubs, innovation councils, detailed regulation, mandatory gender distribution of capital or even a few billion euros in grants will do the big job. What the EU should do to promote innovation, together with the Member States, is to invest in entrepreneurship and create clear frameworks, efficient administration, legal certainty, rapid judicial decisions and simplified tax rules. As it looks today, the Apple, Google or Microsoft of tomorrow will not come from Europe either.'
tokens1_8 = word_tokenize(raw1_8)
for word in tokens1_8:
    print(word, end=' ')

Jessica Stegrud , on behalf of the ECR Group . - Mr President ! How did Microsoft , Apple and Google come about ? Was it because the US government in the 70s presented a proposal to Congress to create a Digital America ? Did they debate for hours how many billions of dollars would be spent on digital hubs in Alaska or Florida and how to train federal workers in Texas in the new digital technology through federal initiatives , or by regulating the gender distribution in boardrooms ? No , creative industries grow best when politics stays away . I do not think that a number of digital hubs , innovation councils , detailed regulation , mandatory gender distribution of capital or even a few billion euros in grants will do the big job . What the EU should do to promote innovation , together with the Member States , is to invest in entrepreneurship and create clear frameworks , efficient administration , legal certainty , rapid judicial decisions and simplified tax rules . As it looks today ,

In [38]:
raw1_9 = 'Edina Tóth (NI). - Dear Mr. PRESIDENT! The Internet and digital technologies play a key role in all areas of our lives. Proper modernization and financial support of existing infrastructure is a prerequisite for an effective digital transformation. Innovative technologies will play a key role in post-epidemic economic recovery. I believe that a budget of EUR 7.6 billion can effectively promote green renewal and accelerate the digital transformation. The main goal is to reduce the digital divide between businesses and geographical areas, and to ensure the availability and use of infrastructure and innovative services. It is particularly welcome that the fight against digital exclusion in the program, which is of paramount importance to Hungary, is an important part of the framework budget, which provides long-term training opportunities.'
tokens1_9 = word_tokenize(raw1_9)
for word in tokens1_9:
    print(word, end=' ')

Edina Tóth ( NI ) . - Dear Mr. PRESIDENT ! The Internet and digital technologies play a key role in all areas of our lives . Proper modernization and financial support of existing infrastructure is a prerequisite for an effective digital transformation . Innovative technologies will play a key role in post-epidemic economic recovery . I believe that a budget of EUR 7.6 billion can effectively promote green renewal and accelerate the digital transformation . The main goal is to reduce the digital divide between businesses and geographical areas , and to ensure the availability and use of infrastructure and innovative services . It is particularly welcome that the fight against digital exclusion in the program , which is of paramount importance to Hungary , is an important part of the framework budget , which provides long-term training opportunities . 

In [39]:
raw1_10 = 'Cristian-Silviu Buşoi (PPE). - Mr President, Commissioner, ladies and gentlemen, a digital Europe is at the heart of the transformation that the European Union has set out to achieve. How to make Europe more digital and green at the same time are the challenges of our generation. The Digital Europe program will provide strategic funding to address these crucial challenges in areas such as supercomputing, artificial intelligence, cybersecurity, advanced digital skills and the widespread use of digital technologies throughout the economy and society. Geopolitically, if we succeed in transforming today, we will have an advantage in international markets tomorrow. Digitization and new technologies will change our lives and allow us to solve many of the challenges we face. For example, our ability to collect, store and process data through an artificial intelligence algorithm will increase efficiency and productivity and strengthen the overall competitiveness of the European Union. The Digital Europe Program is the first financial instrument of the EU focused on bringing digital technology to the citizens, industry and public administrations of the Member States, and this will contribute to the digital transformation of the Union. To this end, we in the European Parliament have made sure that digital Europe will be synergistic and complementary to the instruments in the new MFF, such as the Connecting Europe Facility, with the Europe Horizon Program, and that it will receive adequate funding. With a budget of 7.5 billion euros and synergies with other financial instruments, it will accelerate the economic recovery and make the digital transformation of European society and economy, bringing benefits to all.'
tokens1_10 = word_tokenize(raw1_10)
for word in tokens1_10:
    print(word, end=' ')

Cristian-Silviu Buşoi ( PPE ) . - Mr President , Commissioner , ladies and gentlemen , a digital Europe is at the heart of the transformation that the European Union has set out to achieve . How to make Europe more digital and green at the same time are the challenges of our generation . The Digital Europe program will provide strategic funding to address these crucial challenges in areas such as supercomputing , artificial intelligence , cybersecurity , advanced digital skills and the widespread use of digital technologies throughout the economy and society . Geopolitically , if we succeed in transforming today , we will have an advantage in international markets tomorrow . Digitization and new technologies will change our lives and allow us to solve many of the challenges we face . For example , our ability to collect , store and process data through an artificial intelligence algorithm will increase efficiency and productivity and strengthen the overall competitiveness of the Europe

In [40]:
raw1_11 = 'Lina Galvez Muñoz (S&D). – Mr President, our political group has been working on this agreement for years to strengthen the capacities of the European Union in areas that will be key in the digital world that we are building. I especially appreciate the mention of the digital divide in the Program, because technology has to help put an end to inequalities, reduce social inequalities. In this sense, I would like to focus on the pillar dedicated to advanced digital skills and stress the need to actively strengthen the participation of women and girls in the ICT sector, because women are underrepresented, especially in employment, since we occupy only 18% of jobs in this sector. Likewise, I want to highlight the need to ensure the training of the younger generations in the digital field, also in cybersecurity, to combat disinformation, build critical and active digital citizenship and, ultimately, ensure a successful, inclusive and democratic digital transition.'
tokens1_11 = word_tokenize(raw1_11)
for word in tokens1_11:
    print(word, end=' ')

Lina Galvez Muñoz ( S & D ) . – Mr President , our political group has been working on this agreement for years to strengthen the capacities of the European Union in areas that will be key in the digital world that we are building . I especially appreciate the mention of the digital divide in the Program , because technology has to help put an end to inequalities , reduce social inequalities . In this sense , I would like to focus on the pillar dedicated to advanced digital skills and stress the need to actively strengthen the participation of women and girls in the ICT sector , because women are underrepresented , especially in employment , since we occupy only 18 % of jobs in this sector . Likewise , I want to highlight the need to ensure the training of the younger generations in the digital field , also in cybersecurity , to combat disinformation , build critical and active digital citizenship and , ultimately , ensure a successful , inclusive and democratic digital transition . 

In [41]:
raw1_12 = 'Bart Groothuis (Renew). – Mr President, today we vote on a pan—European digital programme. And one of the illustrations of how far we are already on track with this programme is, unfortunately, the doubling of cybercrime in 2019. Ransomware attacks have gone up 300% in the year after, in 2020. And I think we all know the metrics for 2021. It might get worse before it gets better. But it will get better, because we are currently drafting good new legislation. Yet here in Europe, we spend 41% less on cybersecurity than in the United States, and yet there is still no pan—European cybersecurity infrastructure. The good news of today is that we also vote for EUR 1.6 billion on cybersecurity, and I commend that. We are talking about trans—European networks in real networks: on energy networks, on water, and even connecting our digital networks through GAIA—X. How about laying down a cybersecurity network for Europe? Is it not time? I believe so. For coordinated vulnerability disclosure: yes. For incident reporting: yes. For cyber threat intelligence sharing: yes, that’s what we need. We need to prevent our citizens and entities from these things occurring before they happen. Now, the future is indeed digital, but we must also make sure that it is secure. Therefore, I thank the Commission and all the institutions for agreeing on this great package.'
tokens1_12 = word_tokenize(raw1_12)
for word in tokens1_12:
    print(word, end=' ')

Bart Groothuis ( Renew ) . – Mr President , today we vote on a pan—European digital programme . And one of the illustrations of how far we are already on track with this programme is , unfortunately , the doubling of cybercrime in 2019 . Ransomware attacks have gone up 300 % in the year after , in 2020 . And I think we all know the metrics for 2021 . It might get worse before it gets better . But it will get better , because we are currently drafting good new legislation . Yet here in Europe , we spend 41 % less on cybersecurity than in the United States , and yet there is still no pan—European cybersecurity infrastructure . The good news of today is that we also vote for EUR 1.6 billion on cybersecurity , and I commend that . We are talking about trans—European networks in real networks : on energy networks , on water , and even connecting our digital networks through GAIA—X . How about laying down a cybersecurity network for Europe ? Is it not time ? I believe so . For coordinated vu

In [42]:
raw1_13 = 'Kosma Złotowski (ECR). - Mr Chairman! In addition to huge data sets, an innovative economy also needs huge investments in research and development, digital infrastructure, cybersecurity and modern education. An important challenge faced by the Member States is to accelerate the digitization of public services, including public services such as healthcare, justice, consumer protection and administration. Without the support of European funds, many innovative projects in these areas could not be implemented. The experience of recent months shows that digitization in the area of healthcare helps save lives. Innovations such as e-prescriptions or the online vaccination registration system were an important support in the fight against the epidemic in Poland. We need more of these modern solutions in all areas of our lives. The Digital Europe Program will allow them to be finalized.'
tokens1_13 = word_tokenize(raw1_13)
for word in tokens1_13:
    print(word, end=' ')

Kosma Złotowski ( ECR ) . - Mr Chairman ! In addition to huge data sets , an innovative economy also needs huge investments in research and development , digital infrastructure , cybersecurity and modern education . An important challenge faced by the Member States is to accelerate the digitization of public services , including public services such as healthcare , justice , consumer protection and administration . Without the support of European funds , many innovative projects in these areas could not be implemented . The experience of recent months shows that digitization in the area of healthcare helps save lives . Innovations such as e-prescriptions or the online vaccination registration system were an important support in the fight against the epidemic in Poland . We need more of these modern solutions in all areas of our lives . The Digital Europe Program will allow them to be finalized . 

In [43]:
raw1_14 = 'Ivan Štefanec (EPP). - Mr President, the digital transformation is affecting all areas of our lives. It penetrates all sectors of the economy and provides most communication during a pandemic. Europe needs to increase its investment in digital technologies if we are to remain competitive and maintain leadership in digitalisation. A successful digital transformation is certainly a precondition for a successful economic recovery in Europe. We need to improve not only digital infrastructure, but also to invest more in digital skills and ensure a clear and stable legal framework. We use new digital technologies every day. Even if we do not realize it, they can sometimes be a threat, but especially an opportunity. The proposed balanced and systematic regulation can be an investment with a very fast return. New technologies bring great potential for economic growth, creating millions of new jobs, saving lives in healthcare, but also simplifying our way of life. Within the Digital Europe program, I appreciate the emphasis on the importance of educating the new generation and retraining the current workforce. Investing in cyber security and artificial intelligence research and innovation is a key area to work on. But I want to emphasize that when we talk about new technologies, we should always keep in mind the fact that the human being must stay at the center of new technologies and always make the final decision.'
tokens1_14 = word_tokenize(raw1_14)
for word in tokens1_14:
    print(word, end=' ')

Ivan Štefanec ( EPP ) . - Mr President , the digital transformation is affecting all areas of our lives . It penetrates all sectors of the economy and provides most communication during a pandemic . Europe needs to increase its investment in digital technologies if we are to remain competitive and maintain leadership in digitalisation . A successful digital transformation is certainly a precondition for a successful economic recovery in Europe . We need to improve not only digital infrastructure , but also to invest more in digital skills and ensure a clear and stable legal framework . We use new digital technologies every day . Even if we do not realize it , they can sometimes be a threat , but especially an opportunity . The proposed balanced and systematic regulation can be an investment with a very fast return . New technologies bring great potential for economic growth , creating millions of new jobs , saving lives in healthcare , but also simplifying our way of life . Within the 

In [44]:
raw1_15 = 'Tsvetelina Penkova (S&D). – Mr President, the Digital Europe Programme has an overall budget of EUR 7.5 billion and it will definitely lay down the foundation for Europe’s business, societal and industrial transformation. Digital technologies innovation and artificial intelligence will provide EU citizens with competitive jobs, better health and better public services. Technological progress would definitely help us achieve our climate—neutral, green, fair and social goals. The planned investment in supercomputing, AI, cybersecurity and advanced digital skills will expand Europe’s global competitiveness, and this will bring us closer towards the so—wanted digital independence. Europe has the capacity to be the global leader in that – a technological leadership where innovation advancement, fair and green economic growth, social inclusion and business competitiveness are inseparable.'
tokens1_15 = word_tokenize(raw1_15)
for word in tokens1_15:
    print(word, end=' ')

Tsvetelina Penkova ( S & D ) . – Mr President , the Digital Europe Programme has an overall budget of EUR 7.5 billion and it will definitely lay down the foundation for Europe ’ s business , societal and industrial transformation . Digital technologies innovation and artificial intelligence will provide EU citizens with competitive jobs , better health and better public services . Technological progress would definitely help us achieve our climate—neutral , green , fair and social goals . The planned investment in supercomputing , AI , cybersecurity and advanced digital skills will expand Europe ’ s global competitiveness , and this will bring us closer towards the so—wanted digital independence . Europe has the capacity to be the global leader in that – a technological leadership where innovation advancement , fair and green economic growth , social inclusion and business competitiveness are inseparable . 

In [45]:
raw1_16 = 'Nicola Danti (Renew). - (IT) Mr President, Commissioner, ladies and gentlemen, the Digital Europe program is an important tool for regaining European leadership in a sector in which we have not been able to keep pace with China and the United States for several years now. A sector on which the spotlight has turned, even more, during the COVID crisis we are experiencing. Diffusion and adoption of innovative technologies, artificial intelligence, supercomputing, cybersecurity are decisive challenges to make our production system and services globally competitive, with the essential contribution of the digital innovation poles, foreseen by the program to support the digitalization of companies. Let me also emphasize that the program places particular emphasis on developing adequate skills for the workers and entrepreneurs of today and tomorrow. Human capital will be an increasingly determining factor in the face of a rapid digital transformation of the economy and society. This must be accompanied by the creation of a digital legislative framework consistent with the European identity, starting from the data we are already working on. This is the Europe that challenges the future, that we like and that we need.'
tokens1_16 = word_tokenize(raw1_16)
for word in tokens1_16:
    print(word, end=' ')

Nicola Danti ( Renew ) . - ( IT ) Mr President , Commissioner , ladies and gentlemen , the Digital Europe program is an important tool for regaining European leadership in a sector in which we have not been able to keep pace with China and the United States for several years now . A sector on which the spotlight has turned , even more , during the COVID crisis we are experiencing . Diffusion and adoption of innovative technologies , artificial intelligence , supercomputing , cybersecurity are decisive challenges to make our production system and services globally competitive , with the essential contribution of the digital innovation poles , foreseen by the program to support the digitalization of companies . Let me also emphasize that the program places particular emphasis on developing adequate skills for the workers and entrepreneurs of today and tomorrow . Human capital will be an increasingly determining factor in the face of a rapid digital transformation of the economy and soc

In [46]:
raw1_17 = 'Adam Bielan (ECR). - Mr Chairman! We all know that, compared to the United States or China, the European Union is lagging behind in terms of investment in digital capacity and advanced technologies. Therefore, close cooperation between the Member States is necessary in order to build the strategic digital capabilities of the Union, including the development of cybersecurity, supercomputers, digital skills and artificial intelligence. Actions to support these areas simultaneously will help create a thriving data economy and contribute to social and economic development. I also believe that the introduction of the Digital Europe Program will provide more opportunities to develop breakthrough solutions to societal challenges such as healthcare and climate change. I also consider it extremely important to guarantee support for small and medium-sized enterprises in adapting to digital change. I am pleased that this is the first Union program that comprehensively and horizontally addresses issues related to digitization.'
tokens1_17 = word_tokenize(raw1_17)
for word in tokens1_17:
    print(word, end=' ')

Adam Bielan ( ECR ) . - Mr Chairman ! We all know that , compared to the United States or China , the European Union is lagging behind in terms of investment in digital capacity and advanced technologies . Therefore , close cooperation between the Member States is necessary in order to build the strategic digital capabilities of the Union , including the development of cybersecurity , supercomputers , digital skills and artificial intelligence . Actions to support these areas simultaneously will help create a thriving data economy and contribute to social and economic development . I also believe that the introduction of the Digital Europe Program will provide more opportunities to develop breakthrough solutions to societal challenges such as healthcare and climate change . I also consider it extremely important to guarantee support for small and medium-sized enterprises in adapting to digital change . I am pleased that this is the first Union program that comprehensively and horizonta

In [47]:
raw1_18 = 'Seán Kelly (PPE). – Mr President, whilst we do our best to fight the pandemic with the vaccine rollout – which is picking up steam, thankfully – we must also look beyond it to a greener, more digital Europe that has the ability to remain prosperous and innovative without detriment to the environment. The benefits of digitalisation have been clear for many years, but the pandemic has brought the digital world closer to home for many people that would otherwise have been adverse or uninterested. If the pandemic had hit ten years ago, with less developed teleworking and e—commerce technology, we would be in a far worse situation. This is why the digital Europe programme is so important. It also complements other programmes, such as Horizon Europe. It is the first financial instrument of the EU focused on bringing digital technology to businesses and citizens by investing in cutting-edge technology. State—of—the—art digital services are crucial for Europe in remaining competitive on the global stage. The establishment of the digital innovation hubs is particularly positive. Essentially, this is an innovation gateway providing a one-stop shop that helps companies to become more competitive by utilising digitalisation in their processes of production procedure. I was also pleased that the programme specifically allocates a sizeable portion to artificial intelligence. AI offers practical solutions to many long-standing technical problems. Digital Europe, here it comes.'
tokens1_18 = word_tokenize(raw1_18)
for word in tokens1_18:
    print(word, end=' ')

Seán Kelly ( PPE ) . – Mr President , whilst we do our best to fight the pandemic with the vaccine rollout – which is picking up steam , thankfully – we must also look beyond it to a greener , more digital Europe that has the ability to remain prosperous and innovative without detriment to the environment . The benefits of digitalisation have been clear for many years , but the pandemic has brought the digital world closer to home for many people that would otherwise have been adverse or uninterested . If the pandemic had hit ten years ago , with less developed teleworking and e—commerce technology , we would be in a far worse situation . This is why the digital Europe programme is so important . It also complements other programmes , such as Horizon Europe . It is the first financial instrument of the EU focused on bringing digital technology to businesses and citizens by investing in cutting-edge technology . State—of—the—art digital services are crucial for Europe in remaining compe

In [48]:
raw1_19 = 'Josianne Cutajar (S&D). - Mr President, the last year has shown how much our daily routine, our society and our economy depend on digital technology. COVID-19 has accelerated digitization and once the pandemic wave passes, we need to increase its rate. I am pleased that our citizens and SMEs can make full use of the Digital Europe program to shape the technological future of the European Union. This crucial initiative cannot be taken in isolation; increasing the use of technology, digital skills, sound cybersecurity practices, will only succeed if we do not forget anyone in this transition. From traditional small businesses, to chambers of commerce, from the public sector to research centers, the digital revolution starts from the bottom. We need to ensure that every person and every region, including our islands, is involved in this challenge but also a crucial opportunity.'
tokens1_19 = word_tokenize(raw1_19)
for word in tokens1_19:
    print(word, end=' ')

Josianne Cutajar ( S & D ) . - Mr President , the last year has shown how much our daily routine , our society and our economy depend on digital technology . COVID-19 has accelerated digitization and once the pandemic wave passes , we need to increase its rate . I am pleased that our citizens and SMEs can make full use of the Digital Europe program to shape the technological future of the European Union . This crucial initiative can not be taken in isolation ; increasing the use of technology , digital skills , sound cybersecurity practices , will only succeed if we do not forget anyone in this transition . From traditional small businesses , to chambers of commerce , from the public sector to research centers , the digital revolution starts from the bottom . We need to ensure that every person and every region , including our islands , is involved in this challenge but also a crucial opportunity . 

In [49]:
raw1_20 = 'Svenja Hahn (Renew). - Mister President! We have to strategically shape the digital transformation on a European basis, because progress and innovation do not just happen. Good ideas need investments and a political and social climate that encourages a new beginning. The Digital Europe program is the first pan-European digital program designed to shape this change. Public and private investments, education, research and the use of digital technologies are the drivers of progress. That is why I would have wished for more courage and financial clout. I find it more than regrettable that the Member States pushed through significant cuts in the negotiations. However, I am pleased that the focus is on research and the use of artificial intelligence. However, there is still room for improvement with the draft law by the Commission on the use of artificial intelligence. In particular, there must be no biometric mass surveillance of public space – without exception. So that good ideas and business models from Europe can actually become big, they need a common market. That is why the common digital single market must have absolute priority. The Digital Europe program is an important first step, because investments in digitization are investments in the future. Let us make Europe an innovation continent!'
tokens1_20 = word_tokenize(raw1_20)
for word in tokens1_20:
    print(word, end=' ')

Svenja Hahn ( Renew ) . - Mister President ! We have to strategically shape the digital transformation on a European basis , because progress and innovation do not just happen . Good ideas need investments and a political and social climate that encourages a new beginning . The Digital Europe program is the first pan-European digital program designed to shape this change . Public and private investments , education , research and the use of digital technologies are the drivers of progress . That is why I would have wished for more courage and financial clout . I find it more than regrettable that the Member States pushed through significant cuts in the negotiations . However , I am pleased that the focus is on research and the use of artificial intelligence . However , there is still room for improvement with the draft law by the Commission on the use of artificial intelligence . In particular , there must be no biometric mass surveillance of public space – without exception . So that 

In [50]:
raw1_21 = 'Jadwiga Wiśniewska (ECR). - Mr Chairman! The "Digital Europe" program is the first program of the European Union that comprehensively covers issues related to digitization and treats digitization as a horizontal phenomenon. It will certainly increase European capacity for high performance computing, artificial intelligence, cybersecurity and advanced digital skills, as well as their widespread use in the economy and society. However, we must ensure that all regions and all sectors of society are digitized. The COVID-19 pandemic has made this difference very clear. In some countries, as much as 32% of students did not have access to education. Therefore, it is important to adopt National Reconstruction Plans, which will contribute to the acceleration of digitization, because at least 20% of the expenditures from the National Reconstruction Plans must be allocated to strengthening the digital potential, which is why I appeal to the Polish total opposition from the PPE to support this program.'
tokens1_21 = word_tokenize(raw1_21)
for word in tokens1_21:
    print(word, end=' ')

Jadwiga Wiśniewska ( ECR ) . - Mr Chairman ! The `` Digital Europe '' program is the first program of the European Union that comprehensively covers issues related to digitization and treats digitization as a horizontal phenomenon . It will certainly increase European capacity for high performance computing , artificial intelligence , cybersecurity and advanced digital skills , as well as their widespread use in the economy and society . However , we must ensure that all regions and all sectors of society are digitized . The COVID-19 pandemic has made this difference very clear . In some countries , as much as 32 % of students did not have access to education . Therefore , it is important to adopt National Reconstruction Plans , which will contribute to the acceleration of digitization , because at least 20 % of the expenditures from the National Reconstruction Plans must be allocated to strengthening the digital potential , which is why I appeal to the Polish total opposition from the

In [51]:
raw1_22 = 'Miapetra Kumpula-Natri (S&D). – Mr President, the whole S&D Group and I are standing strongly behind enforcing European capacities on key technologies. We need this programme – Digital Europe – to help us to compete on super computing, artificial intelligence, cybersecurity and skills, and to fight against the digital divide and help SMEs to get on board in digitalisation. This programme closely links to the European data strategy and, with the governance act, the Committee on Industry, Research and Energy (ITRE) is currently working on the framework of how to enable European datashare, bringing data out of the silos for SMEs and the public sector so as to innovate and benefit from the data economy. The European data strategy builds on the data spaces that help and link ecosystem and environment to help intensify data sharing, help interoperability and ensure a level playing field. This programme is planned to finance data spaces as the Commission communication has promised. We need to see this finance happen to enable the data economy.'
tokens1_22 = word_tokenize(raw1_22)
for word in tokens1_22:
    print(word, end=' ')

Miapetra Kumpula-Natri ( S & D ) . – Mr President , the whole S & D Group and I are standing strongly behind enforcing European capacities on key technologies . We need this programme – Digital Europe – to help us to compete on super computing , artificial intelligence , cybersecurity and skills , and to fight against the digital divide and help SMEs to get on board in digitalisation . This programme closely links to the European data strategy and , with the governance act , the Committee on Industry , Research and Energy ( ITRE ) is currently working on the framework of how to enable European datashare , bringing data out of the silos for SMEs and the public sector so as to innovate and benefit from the data economy . The European data strategy builds on the data spaces that help and link ecosystem and environment to help intensify data sharing , help interoperability and ensure a level playing field . This programme is planned to finance data spaces as the Commission communication ha

In [52]:
raw1_23 = 'Virginijus Sinkevičius, Member of the Commission. – Mr President, I am truly grateful for the support for the digital Europe programme expressed by the European Parliament, not only during this debate but throughout the negotiations. I have taken very good note of your important comments today, from the importance of SMEs to the need to provide adequate training for the next generation, and from the urgency of addressing the challenges of cybersecurity to the potential offered by the twin transition towards a more digital and green Europe. Let me say once again that for the Commission, this programme is essential to deliver on the vision for the Digital Decade, on the twin green and digital transitions and on the digital sovereignty of the Union. As I mentioned in my introduction, we have mobilised unprecedented resources for the digital transition. The digital programme, underpinned by the national recovery and resilience plans, with 20% of funds reserved for digital, will play a catalytic role. This is a historic opportunity for all Member States and for all of us. We have to address gaps in the identified critical capacities and critical technologies and support an interconnected, interoperable and secure single market. Next week, the Commission will present an update of our industrial strategy with an assessment of our strategic dependences. The digital Europe programme will be a central point of gravity for developing key digital projects across the single market and an important source of funding. I believe we are fully equipped and ready to engage in this Digital Decade journey and to start this ambitious digital Europe programme.'
tokens1_23 = word_tokenize(raw1_23)
for word in tokens1_23:
    print(word, end=' ')

Virginijus Sinkevičius , Member of the Commission . – Mr President , I am truly grateful for the support for the digital Europe programme expressed by the European Parliament , not only during this debate but throughout the negotiations . I have taken very good note of your important comments today , from the importance of SMEs to the need to provide adequate training for the next generation , and from the urgency of addressing the challenges of cybersecurity to the potential offered by the twin transition towards a more digital and green Europe . Let me say once again that for the Commission , this programme is essential to deliver on the vision for the Digital Decade , on the twin green and digital transitions and on the digital sovereignty of the Union . As I mentioned in my introduction , we have mobilised unprecedented resources for the digital transition . The digital programme , underpinned by the national recovery and resilience plans , with 20 % of funds reserved for digital ,

---
### Combine all parts

In [53]:
tokens = tokens1_1 + tokens1_2 + tokens1_3 + tokens1_4 + tokens1_5 + tokens1_6 + tokens1_7 + tokens1_8 + tokens1_9 + tokens1_10 + tokens1_11 + tokens1_12 + tokens1_13 + tokens1_14 + tokens1_15 + tokens1_16 + tokens1_17 + tokens1_18 + tokens1_19 + tokens1_20 + tokens1_21 + tokens1_22 + tokens1_23 
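
*Note*: The concatenation above can also be written as a loop over the segments, which avoids listing every `tokens1_N` variable by hand. The following is only a minimal sketch and not the method used in this notebook: `tokenize_segments` and the sample strings are hypothetical, and `str.split` stands in for `word_tokenize` so that the example runs without NLTK data.

```python
from itertools import chain


def tokenize_segments(segments, tokenizer=str.split):
    """Tokenize each speech segment and flatten the results into one list.

    `tokenizer` is any callable that maps a string to a list of tokens.
    In this notebook one would pass nltk's `word_tokenize` instead of the
    whitespace-splitting default used here.
    """
    return list(chain.from_iterable(tokenizer(seg) for seg in segments))


# Hypothetical stand-ins for raw1_1 ... raw1_23
sample_segments = [
    "Mr President , the digital transformation matters .",
    "We need more of these modern solutions .",
]
sample_tokens = tokenize_segments(sample_segments)
print(len(sample_tokens))
```

Keeping the raw strings in a list would also make it harder to forget a segment when a new speaker is added.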

---
### Normalize the words 

In [54]:
# lowercase every token so that later word counts are case-insensitive
eutext11 = [w.lower() for w in tokens]
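
*Note*: Lowercasing merges tokens that differ only in capitalization (for example `Digital` and `digital`), so later frequency counts treat them as one word type. The small illustration below uses made-up tokens rather than the debate text, and `collections.Counter` stands in for NLTK's `FreqDist` so that it runs with the standard library alone. Punctuation tokens pass through unchanged.

```python
from collections import Counter

# Hypothetical token sample, not taken from the debate
sample = ["Digital", "Europe", "is", "digital", ",", "and", "Europe", "is", "green", "."]

lowered = [w.lower() for w in sample]

# Number of distinct word types before and after lowercasing
types_before = len(set(sample))
types_after = len(set(lowered))

# Frequency of each lowercased token
counts = Counter(lowered)

print(types_before, types_after)
print(counts["digital"])
```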

---
**Save Output**

In [55]:
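save_path = '/Users/charlottekaiser/Documents/uni/Hertie/master_thesis/00_data/20_intermediate_files'
file_name = "EU11_Digital Europe programme.txt"
completeName = os.path.join(save_path, file_name)
# write as UTF-8 so speaker names with non-ASCII characters are stored reliably;
# the with-block closes the file once the list has been written
with open(completeName, 'w', encoding='utf-8') as output:
    print(eutext11, file=output)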
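
*Note*: Because `print(eutext11, file=output)` writes the list in its Python literal form, the saved text can later be turned back into a list with `ast.literal_eval`. The following is a minimal sketch under stated assumptions: it uses a temporary directory and a made-up token list (`sample_tokens`) rather than the real output path, and it assumes the file is read with the same encoding it was written in.

```python
import ast
import os
import tempfile

# Hypothetical token list standing in for eutext11
sample_tokens = ["buşoi", "(", "ppe", ")", ".", "digital", "europe"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_tokens.txt")

    # Write the list the same way as above: its literal form on one line
    with open(path, "w", encoding="utf-8") as handle:
        print(sample_tokens, file=handle)

    # Read the text back and parse it into a list again
    with open(path, encoding="utf-8") as handle:
        restored = ast.literal_eval(handle.read())

print(restored == sample_tokens)
```

`ast.literal_eval` only accepts Python literals, which makes it a safer choice than `eval` for reloading such a file.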