### *Data Collection - European Parliament*
## Preparing Raw Data
---
**Sample Text 9**
Title: Preventing the dissemination of terrorist content online <br>
Date: April 28, 2021 - Brussels

In [50]:
# import necessary libraries
import os.path
import pprint
import re
import time
import urllib.request
from urllib import request

import nltk
import pandas as pd
import requests
from bs4 import BeautifulSoup
from nltk import FreqDist, word_tokenize
from requests_html import HTMLSession

---
### Process: Trimming debate by inserting the original English or translated English files and tokenizing them.
*Note*: Due to time constraints, the process has been streamlined.

- English parts of the debate will be added manually as a string and then tokenized. 

- All EU Parliament debates are translated in the same way: non-English parts are copied from the original web pages, translated to English with a single tool, Google Translate (https://translate.google.com/?hl=de&tab=TT), and pasted in as a string.

- Afterwards, the usual steps are applied (tokenizing, standardizing).

Because of the changed process, the URL and the web-scraping step are technically no longer necessary; they are nevertheless included (commented out) for completeness.
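The steps listed above (paste a speech as a string, tokenize, standardize) can be sketched as one helper. This is a minimal sketch under assumptions: the notebook does not show what "standardizing" involves, so lowercasing and keeping only alphabetic tokens is a guess, and `simple_tokenize` and `prepare_speech` are hypothetical names. The regex tokenizer is only a dependency-free stand-in; `nltk.word_tokenize` can be passed as `tokenizer` instead.

```python
import re


def simple_tokenize(text):
    """Split text into word and punctuation tokens (stand-in for word_tokenize)."""
    return re.findall(r"\w+|[^\w\s]", text)


def prepare_speech(raw, tokenizer=simple_tokenize):
    """Tokenize a pasted speech and return (tokens, standardized tokens)."""
    tokens = tokenizer(raw)
    # Assumed standardization: lowercase, drop punctuation and numbers.
    standardized = [tok.lower() for tok in tokens if tok.isalpha()]
    return tokens, standardized


sample = "Madam President, we live in difficult times."
tokens, standardized = prepare_speech(sample)
print(tokens)
print(standardized)
```

Keeping both lists makes it possible to inspect the raw tokens, as the cells in this section do, while using the standardized list for later counts.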

In [51]:
# url = "https://www.europarl.europa.eu/doceo/document/CRE-9-2021-04-28-ITM-019_EN.html"
# html = requests.get(url)
# raw = BeautifulSoup(html.content, 'html.parser').get_text()

In [52]:
raw1_1 = 'Patryk Jaki, rapporteur. – Madam President, we live in difficult times. The terrorist threat to our community is still present and it is our responsibility to keep citizens safe and to prevent terrorist attacks. However, in order to do so, it is necessary to understand the terrorists’ modus operandi and its evolution. As an example, I will use the long—standing statement of Umar Patek, the organiser of terrorist attacks on Bali in which more than 200 people were killed. He said, during his trial, for those who do not know how to commit Jihad, they should understand that there are several ways to know Jihad. This is not the Stone Age. This is the internet age. We have Facebook, we have Twitter. As he said, this is what happened. The internet has become the most important safe haven and tool for terrorists today. On the internet, terrorists recruit by spreading propaganda, including among children and young people. They provide attacks and violence from the internet, spreading propaganda to youngsters and children. There they show their attack, call for attack and spread a radical agenda. As a continuation, let me show some examples. In 2020, Facebook removed 43.4 million pieces of content of a terrorist nature or containing incitement to attack. In the previous year, it was 25.5 million pieces of content, so as you can see, it is a significant increase with the advent of pandemics and people being locked in their homes. In 2019, Twitter reported that it blocked more than 300 000 accounts linked to a terrorist organisation or used to spread terrorist propaganda both linked to radical Islam and to left- and right-wing extremes. In 2016, it was 100 000 accounts, so one can see an upward trend. Today, terrorist groups have their own media, magazines, editors. The internet and modern technology have allowed terrorist groups to post content that they fully control. 
A staff of specialists from such terrorist groups prepares professional information and propaganda campaigns, documentaries and magazines published online. They’ve directed executions. That is why our response was needed. The regulation will allow Member State to immediately remove terrorist content. It has introduced, among others, the principle of one hour, which means that the most dangerous content will be removed as soon as possible after publication. For example, Member States finally received a working tool to stop live broadcasting from a terrorist attack or mass shooting. This is a groundbreaking project and the first in the history of the EU that implements such a powerful tool in cross-border cooperation between Member States in this area. One country will have the right to send a removal order to another country. The order will apply to everyone in the EU, including those outside the EU. Failure to comply with the order will result in large financial penalties. The new regulation will help to counter the spread of extremist ideology online. The regulation will ensure that what is illegal offline is illegal online. Removal orders can be sent by any Member State to any online platform established within the EU. However, according to the compromise reached, the authorities of the Member States where the content host is located will have 72 hours to analyse and possibly challenge the withdrawal order if it sees, for example, a clear violation of freedom of expression. In addition, we are carrying a new framework for the cooperation between public institutions and the private sector. We will all now have a duty to act in solidarity to tackle terrorism. This regulation will ensure that the online platforms play a more active role in detecting terrorist content online and that such content is removed within a maximum of one hour. At the same time, potential threats to freedom have been removed. There will be no mandatory filtering of the internet. 
New safeguards for freedom of speech, freedom of press and media, as well as freedom for legal content, for instance, journalists’ research, artists, educational content, too. Moreover, it safeguards content which represents an expression of polemics or controversial views in the course of public debate. It was a very difficult project. Thanks to the involvement of Parliament’s excellent negotiation team, whom I would like to thank, it was possible to reach a compromise. I would also like to thank the Commissioner and the Council. I strongly believe that the Terrorist Content (TC) Regulation is a good and balanced text which balances security and the freedom of speech and expression on the internet, protecting legal content and access to information for every citizen in the EU. It is a new tool based on mutual cooperation and trust between States fighting together in the one cause against terrorists. The Europe Union today has gained a powerful new tool to fight terrorists, and this is very good information for our community.'
tokens1_1 = word_tokenize(raw1_1)
for word in tokens1_1:
    print(word, end=' ')

Patryk Jaki , rapporteur . – Madam President , we live in difficult times . The terrorist threat to our community is still present and it is our responsibility to keep citizens safe and to prevent terrorist attacks . However , in order to do so , it is necessary to understand the terrorists ’ modus operandi and its evolution . As an example , I will use the long—standing statement of Umar Patek , the organiser of terrorist attacks on Bali in which more than 200 people were killed . He said , during his trial , for those who do not know how to commit Jihad , they should understand that there are several ways to know Jihad . This is not the Stone Age . This is the internet age . We have Facebook , we have Twitter . As he said , this is what happened . The internet has become the most important safe haven and tool for terrorists today . On the internet , terrorists recruit by spreading propaganda , including among children and young people . They provide attacks and violence from the inte
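The cells in this section repeat the same pattern for `raw1_1`, `raw1_2`, and so on. A possible alternative, sketched here with hypothetical names (`speeches`, `tokenize_all`) and shortened placeholder strings rather than the full speeches, keeps the raw strings in a dictionary and tokenizes them in one loop. The whitespace split and `collections.Counter` are stdlib stand-ins for `word_tokenize` and `FreqDist`, used only so the sketch runs without NLTK data.

```python
from collections import Counter

# Shortened placeholder excerpts; in the notebook these would be the full raw1_N strings.
speeches = {
    "raw1_1": "Madam President, we live in difficult times.",
    "raw1_2": "Madam President, this is an extremely important debate.",
}


def tokenize_all(raw_by_name, tokenizer=str.split):
    """Return a dict mapping each speech name to its list of tokens."""
    return {name: tokenizer(raw) for name, raw in raw_by_name.items()}


tokens_by_name = tokenize_all(speeches)

# Count tokens across all speeches (FreqDist offers a similar interface).
counts = Counter(tok for toks in tokens_by_name.values() for tok in toks)
print(counts.most_common(3))
```

With this layout, adding another speech means adding one dictionary entry instead of a new cell with its own variable names.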

In [53]:
raw1_2 = 'Ylva Johansson, Member of the Commission. – Madam President, this is an extremely important debate. This regulation will make it harder for terrorists to exploit the internet to spread fear, violence and terror. This regulation will make it more difficult for terrorists to abuse the internet, to recruit online, to incite attacks online, to glorify their atrocities online. With this regulation, we take European action against terrorist activities online: action that is quick, cross—border and compulsory. Quick: Member States will be able to order internet providers to take down terrorist content within the hour. Cross-border: Member States will be able to issue orders to take down terrorist content hosted by any service provider anywhere in the European Union. Compulsory: providers must carry out these demands and pay fines if they don’t. Proportionate fines: small companies must not fall victim to terrorists abusing their systems. I’m sure you all remember the Christchurch attacks: 60 minutes and 55 seconds of mass murder broadcast live to the world. Some platforms still have copies of that attack online. What’s worse, animated versions aimed at children keep appearing on videogaming platforms. That too can soon be taken down in the whole of Europe within the hour. As we speak, jihadi groups are calling for attacks online: calling on followers to exploit political unrest during the pandemic, to kill people and damage property. That too, soon, can be taken down in the whole of Europe within the hour. With this regulation, we will strike a blow against terrorists. Without online manuals to tell you how, it’s harder to make bombs. Without flashy propaganda videos, it’s harder to poison the minds of young people. Without streaming attacks online, it’s harder to inspire copycat attacks. It’s difficult to measure: people are not radicalised, bombs not made, attacks not carried out. 
We may never be able to count how many, but this regulation will save lives, and we can be proud of this result. This regulation strikes a major blow against terrorists, but it’s only one of many steps in the fight against terrorism, offline and online. In December 2020, I launched our counterterrorism agenda. Besides fighting terrorists online, we will, among other things, boost European police cooperation and information exchange, make it more difficult for foreign fighters to enter the European Union undetected, protect people in public spaces like churches, mosques, synagogues, stadiums and public transport. This regulation upholds our values and fundamental rights by fighting terrorists who attack our freedoms and our democracies, by putting safeguards in place, and by limiting removal of content to what is clearly illegal. We will take down terrorist content and uphold freedom of expression. I would like to warmly thank the rapporteur, Patryk Jaki, and also all the shadow rapporteurs for these results. A lot of work has gone into this. I would like to thank the Parliament for your support. Thank you for making this regulation a reality, and thank you for the very good cooperation. As a society, we are facing the enormous challenge of digital change, which fundamentally affects the way we live and the way we work. It affects our security and our privacy. We must face this challenge as law—makers and as a union. In December, the Commission launched the Digital Services Act to foster online markets and to protect fundamental rights online. We must also give law enforcement the right rules and tools to fight crime and terrorism in the digital age. It’s up to us now to find agreements on child sexual abuse online, our short—term and future long-term proposals on detecting and reporting child sexual exploitation online, and to find agreements on e—evidence – 85% of evidence is electronic. 
To fight crime and terrorism, law enforcement needs timely access to cross—border evidence. To find agreement on data retention, we need to look at possibilities in line with EU law so that digital evidence of crimes or leads to investigations do not accidentally disappear. To find agreement on encryption, police must be able to gain effective access to encrypted electronic evidence when they are granted the legal authorisation to do so. To make agreements on artificial intelligence, law enforcement should use the full potential of AI to find missing children and very dangerous criminals, to prevent terrorist attacks and to respond to the malicious use of technology by criminals. All of these policies are as sensitive as they are important. The successful negotiating result achieved on this file – the terrorist content online – gives me the confidence that we will also find agreement on the Commission’s other proposals on digital law enforcement. So let this result inspire us together to make Europe safe for everyone.'
tokens1_2 = word_tokenize(raw1_2)
for word in tokens1_2:
    print(word, end=' ')

Ylva Johansson , Member of the Commission . – Madam President , this is an extremely important debate . This regulation will make it harder for terrorists to exploit the internet to spread fear , violence and terror . This regulation will make it more difficult for terrorists to abuse the internet , to recruit online , to incite attacks online , to glorify their atrocities online . With this regulation , we take European action against terrorist activities online : action that is quick , cross—border and compulsory . Quick : Member States will be able to order internet providers to take down terrorist content within the hour . Cross-border : Member States will be able to issue orders to take down terrorist content hosted by any service provider anywhere in the European Union . Compulsory : providers must carry out these demands and pay fines if they don ’ t . Proportionate fines : small companies must not fall victim to terrorists abusing their systems . I ’ m sure you all remember the
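The output above shows two side effects of typographic characters in the pasted text: the curly apostrophe splits "don’t" into three tokens (`don ’ t`), and the compound written with an em dash after "quick" stays a separate word type from the hyphenated `Cross-border` that follows. If that matters for later counts, one option is to normalize these characters before tokenizing. This is a sketch, not part of the original process: `normalize_typography` is a hypothetical helper, and mapping every em dash to a hyphen is an assumption that suits compounds like these but would be wrong for dashes used as sentence punctuation.

```python
# Map typographic characters to ASCII equivalents (assumed choices).
TYPOGRAPHY_MAP = str.maketrans({
    "\u2019": "'",  # right single quotation mark (curly apostrophe)
    "\u2018": "'",  # left single quotation mark
    "\u2014": "-",  # em dash used as a hyphen in the transcripts
})


def normalize_typography(text):
    """Replace curly quotes and em dashes with plain ASCII characters."""
    return text.translate(TYPOGRAPHY_MAP)


print(normalize_typography("cross\u2014border"))
print(normalize_typography("if they don\u2019t"))
```

The helper would be applied to each raw string before it is passed to the tokenizer.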

In [54]:
raw1_3 = 'Javier Zarzalejos, on behalf of the PPE Group. – Madam President, Madam Commissioner, Samuel Paty, falsely accused of Islamophobia, was lynched online before having his throat slit at the door of his school. Last Friday a French police officer, a mother of two children, was murdered in Rambouillet by a terrorist who had consumed large amounts of jihadist propaganda on the web before her murder. As the commissioner recalled, two years ago, a white supremacist in Christchurch broadcast on Facebook the massacre of Muslim worshipers at the Al Noor mosque. These and many other cases lead us to the enormous problem of the dissemination of terrorist content on the Internet. Sometimes it is about exposing and spreading the crime itself while it is being committed; others, to feed the radicalization processes, from which new terrorists will emerge, or, in the case of Paty, the networks were used to spread the threat and incite murder. That is why the Regulation that we are going to adopt must make a clear difference in terms of improving efficiency, cooperation and the continued effort to fight terrorism on this front, the online world, which we know is essential in the strategy of terrorists. It has taken time and a lot of effort, and I want to highlight the role of our rapporteur in this regulation. But the use of the Internet for terrorist purposes, which is a phenomenon without borders, must be responded to with cooperation that is also without borders. We have struck a satisfactory balance between new cooperation instruments and procedures and the necessary safeguards of fundamental rights. Withdrawal orders – they have been explained here – constitute a crucial innovation in the framework of cross-border cooperation against terrorism. It is clear that nothing can replace the action on the ground of the military forces, the security forces, the information services, the judges... 
But it is also essential to prevent terrorism from spreading its shadow, contaminating minds and exalt criminals.'
tokens1_3 = word_tokenize(raw1_3)
for word in tokens1_3:
    print(word, end=' ')

Javier Zarzalejos , on behalf of the PPE Group . – Madam President , Madam Commissioner , Samuel Paty , falsely accused of Islamophobia , was lynched online before having his throat slit at the door of his school . Last Friday a French police officer , a mother of two children , was murdered in Rambouillet by a terrorist who had consumed large amounts of jihadist propaganda on the web before her murder . As the commissioner recalled , two years ago , a white supremacist in Christchurch broadcast on Facebook the massacre of Muslim worshipers at the Al Noor mosque . These and many other cases lead us to the enormous problem of the dissemination of terrorist content on the Internet . Sometimes it is about exposing and spreading the crime itself while it is being committed ; others , to feed the radicalization processes , from which new terrorists will emerge , or , in the case of Paty , the networks were used to spread the threat and incite murder . That is why the Regulation that we are 

In [55]:
raw1_4 = 'Marina Kaljurand, on behalf of the S&D Group. – Madam President, there is no place for terrorism, neither in the offline nor in the online world. We all agree to that, and therefore the negotiations between EU institutions focused on how this was to be achieved. I am happy that after a year and a half of negotiations, we were able to reach a political agreement that strengthens our society while safeguarding fundamental rights and freedoms. I would like to recognise the work of our rapporteur, but also all other parliamentary colleagues. Together, we have made many important changes so that today we can be clear. The regulation will increase the effectiveness of current measures to detect, identify and remove terrorist content online without encroaching on fundamental rights such as freedom of expression and information. We worked hard to improve cooperation between Member States and make the removal of terrorist content online legally watertight. I would like to highlight some aspects. Firstly, as to cross-border removal orders: competent authorities in the Member States will have the ability to remove terrorist content, but with a number of important safeguards. Member State authorities where the hosting provider is located will be involved in the removal process from the beginning and have the final word on removing content. Secondly, we have ensured that the agreement will not include any obligation to use automated filtering and technical measures cannot be imposed on service providers. Thirdly, educational, journalistic, artistic or research content will be protected. Fourthly, service providers will not be penalised if there are objective, technical and operational reasons for not being able to take down content in one hour. And finally, to prevent non—relevant content from being taken down, terrorist content is now strictly defined across the EU. 
Today I can say that I am pleased with the final outcome and can strongly support the final text adopted. I would also like to thank the Commissioner and all the presidencies who were involved in the negotiations.'
tokens1_4 = word_tokenize(raw1_4)
for word in tokens1_4:
    print(word, end=' ')

Marina Kaljurand , on behalf of the S & D Group . – Madam President , there is no place for terrorism , neither in the offline nor in the online world . We all agree to that , and therefore the negotiations between EU institutions focused on how this was to be achieved . I am happy that after a year and a half of negotiations , we were able to reach a political agreement that strengthens our society while safeguarding fundamental rights and freedoms . I would like to recognise the work of our rapporteur , but also all other parliamentary colleagues . Together , we have made many important changes so that today we can be clear . The regulation will increase the effectiveness of current measures to detect , identify and remove terrorist content online without encroaching on fundamental rights such as freedom of expression and information . We worked hard to improve cooperation between Member States and make the removal of terrorist content online legally watertight . I would like to high

In [56]:
raw1_5 = 'Maite Pagazaurtundúa, on behalf of the Renew Group. – Madam President, Commissioner, ladies and gentlemen, I would like to start my speech by remembering David and Roberto, the two Spanish journalists murdered in Burkina Faso, and also the French police who had their throats cut in Paris. Terrorists kill for propaganda, to show it off, to frighten, to capture and turn some into murderers, whether they are suicidal or not. In a world like today, in our relationships conditioned by the digital framework and the Internet, terrorists take advantage of it. In the last five years, we have legislated and managed to reduce lethality, but not fanaticism. Jihadi-inspired fanatics or otherwise adapt to continue exercising their macabre narcissism. This regulation will allow the rapid removal of terrorist content online, which will make it more difficult for the internet to be used to facilitate or carry out online crimes that are defined in the Counter-Terrorism Directive that the European Parliament approved four years ago. The debate has been intense. We have reached a broad consensus thanks to the proportionality of the measures and respect for fundamental rights. The matter, of course, deserves debate because, in addition to the clearest crimes, the line that separates freedom of expression from criminal conspiracy to destroy it must be clearly addressed. I have to tell you that I am very grateful to all the colleagues and, of course, to our rapporteur, the responsibility, because we look ahead and not in profile at the challenge. Because we have been responsible. Because they think that, if the terrorists who murder manage to hide behind the mask of freedom of expression, if they manage to make us believe that stopping them is a form of censorship, we will be digging our own grave. Solving the problems of citizens today is gaining security, not lowering our guard against violent fanatics, not giving in to fear. 
I assure you that I am satisfied, because the groups that were most belligerent in committee have not asked for the rejection of this agreement in plenary. The citizens would not forgive us if we acted irresponsibly. European citizens need us to minimize terrorist barbarity without losing their freedom. We have not let them down. We must not let them down in the future.'
tokens1_5 = word_tokenize(raw1_5)
for word in tokens1_5:
    print(word, end=' ')

Maite Pagazaurtundúa , on behalf of the Renew Group . – Madam President , Commissioner , ladies and gentlemen , I would like to start my speech by remembering David and Roberto , the two Spanish journalists murdered in Burkina Faso , and also the French police who had their throats cut in Paris . Terrorists kill for propaganda , to show it off , to frighten , to capture and turn some into murderers , whether they are suicidal or not . In a world like today , in our relationships conditioned by the digital framework and the Internet , terrorists take advantage of it . In the last five years , we have legislated and managed to reduce lethality , but not fanaticism . Jihadi-inspired fanatics or otherwise adapt to continue exercising their macabre narcissism . This regulation will allow the rapid removal of terrorist content online , which will make it more difficult for the internet to be used to facilitate or carry out online crimes that are defined in the Counter-Terrorism Directive tha

In [57]:
raw1_6 = 'Nicolaus Fest, on behalf of the ID Group. – Madam President, Commissioner! As a former journalist, I am always suspicious when states or the EU want to restrict the dissemination of information - even if there are very, very good reasons for doing so, as with terrorist content. However, we should always be aware that this is a very sharp sword. Anyone who is allowed to define what terrorist content is can enforce any censorship and permanently switch off the free exchange of opinions. This is already the case in China and in other totalitarian regimes as well. But you can live with the present proposal. Patryk Jaki and his shadows did a very good job. You have clearly defined what counts as terrorist content. They have retained the sovereignty of nation states. You prevented Facebook, Twitter and others from becoming censors. And they have received the freedom of scientific, political, artistic and academic debate. In this respect, the work should also serve as a benchmark for similar projects. This is especially true for the LIBE sub-committee INGE, which deals with the fight against hate speech and so-called disinformation. Unfortunately, almost nothing is defined there. You work with rubber terms, suspicions, mere assertions and dubious statistics. As good as the work on the template of today is, it is bad - even miserable - there. Today, anyone who wants to protect free speech but want to prevent terrorist content can agree to this bill. But the coming proposals to tackle hate speech and disinformation should make everyone look very, very carefully.'
tokens1_6 = word_tokenize(raw1_6)
for word in tokens1_6:
    print(word, end=' ')

Nicolaus Fest , on behalf of the ID Group . – Madam President , Commissioner ! As a former journalist , I am always suspicious when states or the EU want to restrict the dissemination of information - even if there are very , very good reasons for doing so , as with terrorist content . However , we should always be aware that this is a very sharp sword . Anyone who is allowed to define what terrorist content is can enforce any censorship and permanently switch off the free exchange of opinions . This is already the case in China and in other totalitarian regimes as well . But you can live with the present proposal . Patryk Jaki and his shadows did a very good job . You have clearly defined what counts as terrorist content . They have retained the sovereignty of nation states . You prevented Facebook , Twitter and others from becoming censors . And they have received the freedom of scientific , political , artistic and academic debate . In this respect , the work should also serve as a 

In [58]:
raw1_7 = 'Marcel Kolaja, on behalf of the Verts/ALE Group. – Madam President, online public space is a forum for the public debate that underpins our democracy. Of course, this space, unfortunately, can also be abused by malicious actors to recruit or make propaganda for terrorist attacks. However, no legislation foreseeing only the deletion of online content can possibly eradicate terrorism. Today, the Parliament approves legislation that will merely result in hiding the symptoms of the deeply-rooted problem of terrorism and will shrink this public space for legitimate public debate. Far too often, anti—terrorist legislation around the world is used against social protesters, minorities, environmental activists or refugees: a risk that has been pointed out to Members of the European Parliament by the UN Special Rapporteur, NGOs and journalists. This legislation, with its broad provision, risks not to prevent any of this. As a matter of fact, because of this new regulation, any authority without a court order or independent assessment will be able to mandate superfast removal of content in another Member State. Spanish, Hungarian or Polish authorities can decide what citizens in Czechia or Germany can see online, and that is why our Group cannot support the outcome of the negotiations.'
tokens1_7 = word_tokenize(raw1_7)
for word in tokens1_7:
    print(word, end=' ')

Marcel Kolaja , on behalf of the Verts/ALE Group . – Madam President , online public space is a forum for the public debate that underpins our democracy . Of course , this space , unfortunately , can also be abused by malicious actors to recruit or make propaganda for terrorist attacks . However , no legislation foreseeing only the deletion of online content can possibly eradicate terrorism . Today , the Parliament approves legislation that will merely result in hiding the symptoms of the deeply-rooted problem of terrorism and will shrink this public space for legitimate public debate . Far too often , anti—terrorist legislation around the world is used against social protesters , minorities , environmental activists or refugees : a risk that has been pointed out to Members of the European Parliament by the UN Special Rapporteur , NGOs and journalists . This legislation , with its broad provision , risks not to prevent any of this . As a matter of fact , because of this new regulation 

In [59]:
raw1_8 = 'Joachim Stanisław Brudziński, on behalf of the ECR Group. - Madam President! Commissioner! The words spoken by the rapporteur here: "everything that is forbidden in the real world must also be forbidden in the virtual world" are extremely important. I think that this wonderful space of freedom, which is the Internet, cannot be an instrument and a tool for degenerates, criminals, terrorists who, through the Internet, sometimes model the mind of young people, women and children, forcing them to commit acts of terrorism. Democratic countries, Europe cannot be defenseless here. This space of freedom cannot serve, as I said, to these barbarians to have our citizens murdered. I think that the representatives of all the countries represented here in this room have martyrs such as my compatriot Piotr Stańczyk murdered in Pakistan, in whose case the Internet was also used to inform the world about this crime. We remember the thirty Christian Coptic martyrs murdered by the Islamic jihadists, barbarians in Libya. Today, European states, secret services, and the police receive an excellent instrument in the form of this regulation. Finally, I would like to say: Patryk, great job!'
tokens1_8 = word_tokenize(raw1_8)
for word in tokens1_8:
    print(word, end=' ')

Joachim Stanisław Brudziński , on behalf of the ECR Group . - Madam President ! Commissioner ! The words spoken by the rapporteur here : `` everything that is forbidden in the real world must also be forbidden in the virtual world '' are extremely important . I think that this wonderful space of freedom , which is the Internet , can not be an instrument and a tool for degenerates , criminals , terrorists who , through the Internet , sometimes model the mind of young people , women and children , forcing them to commit acts of terrorism . Democratic countries , Europe can not be defenseless here . This space of freedom can not serve , as I said , to these barbarians to have our citizens murdered . I think that the representatives of all the countries represented here in this room have martyrs such as my compatriot Piotr Stańczyk murdered in Pakistan , in whose case the Internet was also used to inform the world about this crime . We remember the thirty Christian Coptic martyrs murdered 

In [60]:
raw1_9 = 'Balázs Hidvéghi (NI). - Madam President. The online presence of terrorists is growing, so it is crucial to remove terrorist propaganda spreading on the internet and prevent it from appearing at all in every way possible. This regulation is a milestone in this fight, and I would like to thank the rapporteur, my Polish colleague Patryk Jaki, for his persistent work in this long and difficult process. On the one hand, the compromise reached increases the responsibility of ISPs in identifying and removing terrorist content. On the other hand, it offers the authorities, in addition to the appropriate guarantees, a quick, cross-border solution that allows them to remove online terrorist content within 1 hour by seeking a service provider in another country. In addition, service providers must ensure that removed content cannot be re-uploaded elsewhere. And to the few Members who have even come to mind Viktor Orbán on this subject, I would like to say: do not fight the Hungarian Prime Minister, but the terrorists! Believe me, this would better serve the interests of the people of Europe.'
tokens1_9 = word_tokenize(raw1_9)
for word in tokens1_9:
    print(word, end=' ')

Balázs Hidvéghi ( NI ) . - Madam President . The online presence of terrorists is growing , so it is crucial to remove terrorist propaganda spreading on the internet and prevent it from appearing at all in every way possible . This regulation is a milestone in this fight , and I would like to thank the rapporteur , my Polish colleague Patryk Jaki , for his persistent work in this long and difficult process . On the one hand , the compromise reached increases the responsibility of ISPs in identifying and removing terrorist content . On the other hand , it offers the authorities , in addition to the appropriate guarantees , a quick , cross-border solution that allows them to remove online terrorist content within 1 hour by seeking a service provider in another country . In addition , service providers must ensure that removed content can not be re-uploaded elsewhere . And to the few Members who have even come to mind Viktor Orbán on this subject , I would like to say : do not fight the H

In [61]:
raw1_10 = 'Jeroen Lenaers (PPE). – Madam President, ‘get terrorist propaganda off the internet’: that is our simple but important message today. Because the internet, as we all know, is an extremely powerful communication and networking tool that is used as a force for good in so many ways, but we can never be naive about those who use it as a force for evil, using it to spread hatred, violence and terrorism. We know the digital environment offers easy ways to radicalise and we know it plays an important role in recruitment in terrorist training and in dissemination of terrorist ideas. The influence of illegal content is so great that it is posing a security risk to the whole of the EU, and we’ve seen its destructive power in terrorist attacks in Europe and beyond. The attack against French teacher Samuel Paty was a direct result of an online hate campaign. Madam Commissioner, you already mentioned Christchurch. The videos of that attack: 1.5 million times it was deleted from Facebook in 24 hours. As you rightly said, some of those videos are still online because these videos and these hate campaigns spread across online platforms like a virus, and they’re equally deadly, and as we are all very aware these days, the only way to stop a virus is with a rapid and a determined response. This is why this regulation is so important. I congratulate my colleagues for the good result. Any propaganda that prepares, incites or glorifies acts of terrorism will be removed within one hour. And that is good news because we need to get terrorist propaganda off our internet.'
tokens1_10 = word_tokenize(raw1_10)
for word in tokens1_10:
    print(word, end=' ')

Jeroen Lenaers ( PPE ) . – Madam President , ‘ get terrorist propaganda off the internet ’ : that is our simple but important message today . Because the internet , as we all know , is an extremely powerful communication and networking tool that is used as a force for good in so many ways , but we can never be naive about those who use it as a force for evil , using it to spread hatred , violence and terrorism . We know the digital environment offers easy ways to radicalise and we know it plays an important role in recruitment in terrorist training and in dissemination of terrorist ideas . The influence of illegal content is so great that it is posing a security risk to the whole of the EU , and we ’ ve seen its destructive power in terrorist attacks in Europe and beyond . The attack against French teacher Samuel Paty was a direct result of an online hate campaign . Madam Commissioner , you already mentioned Christchurch . The videos of that attack : 1.5 million times it was deleted fr

In [62]:
raw1_11 = 'Juan Fernando López Aguilar (S&D). – Madam President, Commissioner Johansson, when this Parliament debates the prevention of the dissemination of terrorist content on the Internet, I too pay tribute to the memory of Roberto Fraile and David Beriain, two Spanish journalists killed by jihadist terrorism in Burkina Faso. There was a time when, when terrorism hit European citizens, the Member States of which they were nationals felt alone. But that time has passed, because now there is an area of freedom, justice and security that makes the fight against terrorism a European issue. And it makes this European Parliament a criminal legislator, a procedural legislator, a legislator of guarantees and a legislator of police and judicial cooperation against the transnational crime that offends us the most. Terrorism is a case and, therefore, we are doing the right thing and I am pleased that the negotiator and his team have managed, after hard work, to culminate it with this definitive parliamentary act of approval of the Regulation. Because it started in 2019, and, with elections in between, it has been very difficult to reach a meeting point that would clear up all the problems raised: withdrawal orders, which were the competent authorities, not only judicial, that can order the withdrawal in one hour of content that threatens to radicalize, spread, incite hatred, violence, or finance terrorist actions. And this, therefore, protects the security of Europeans and is an example that this Parliament is also guaranteeing that right to security for Europeans, which is a fundamental right, on a par with the right to liberty and enshrined in the same article —6— of the Charter of Fundamental Rights of the European Union.'
tokens1_11 = word_tokenize(raw1_11)
for word in tokens1_11:
    print(word, end=' ')

Juan Fernando López Aguilar ( S & D ) . – Madam President , Commissioner Johansson , when this Parliament debates the prevention of the dissemination of terrorist content on the Internet , I too pay tribute to the memory of Roberto Fraile and David Beriain , two Spanish journalists killed by jihadist terrorism in Burkina Faso . There was a time when , when terrorism hit European citizens , the Member States of which they were nationals felt alone . But that time has passed , because now there is an area of freedom , justice and security that makes the fight against terrorism a European issue . And it makes this European Parliament a criminal legislator , a procedural legislator , a legislator of guarantees and a legislator of police and judicial cooperation against the transnational crime that offends us the most . Terrorism is a case and , therefore , we are doing the right thing and I am pleased that the negotiator and his team have managed , after hard work , to culminate it with 

In [63]:
raw1_12 = 'Fabienne Keller (Renew). – Madam President, Commissioner, the attack on the young policewoman in Rambouillet, France, and the murder of Professor Samuel Paty in Conflans-Sainte-Honorine have recently demonstrated this: social networks have a relay effect and considerable amplification for calls for violence and terrorist propaganda. Let’s face reality: the perpetrators of these crimes are radicalizing, coming together, organizing themselves through the internet and social networks. It is time to equip ourselves with effective tools to combat these modern threats. Thanks to this text, Member States will be able to demand the removal of terrorist content on the internet in less than an hour by directly addressing online platforms, anywhere in Europe. The European Parliament has obtained solid guarantees in terms of the protection of personal data and the right of appeal for any person who would be harmed by an unfortunate withdrawal. The platforms will also be supervised in their activities. I would like to salute here the determination of our Renew Europe group and particularly of my colleague Maite Pagazaurtundúa, who was one of the main architects of this agreement. I am delighted with this important step forward for the security of our fellow citizens, in Europe and in the world.'
tokens1_12 = word_tokenize(raw1_12)
for word in tokens1_12:
    print(word, end=' ')

Fabienne Keller ( Renew ) . – Madam President , Commissioner , the attack on the young policewoman in Rambouillet , France , and the murder of Professor Samuel Paty in Conflans-Sainte-Honorine have recently demonstrated this : social networks have a relay effect and considerable amplification for calls for violence and terrorist propaganda . Let ’ s face reality : the perpetrators of these crimes are radicalizing , coming together , organizing themselves through the internet and social networks . It is time to equip ourselves with effective tools to combat these modern threats . Thanks to this text , Member States will be able to demand the removal of terrorist content on the internet in less than an hour by directly addressing online platforms , anywhere in Europe . The European Parliament has obtained solid guarantees in terms of the protection of personal data and the right of appeal for any person who would be harmed by an unfortunate withdrawal . The platforms will also be supervi

In [64]:
raw1_13 = 'Gilles Lebreton (ID). – Madam President, Commissioner, ladies and gentlemen, Islamist terrorism is one of the greatest scourges of our time. France has again recently paid the price, in particular with the attack and beheading of a teacher, Samuel Paty, on October 16, 2020, on leaving his college, then with the assassination of a police officer on April 23 last. However, in both cases, the attack was perpetrated through the Internet. In the Paty case, the victim was identified and located by his killer because of a call for murder broadcast by a third party, and in the case of April 23, the killer had become radicalized because of propaganda sites Islamist. These tragedies therefore prove, if need be, the need to strengthen the prevention of the online dissemination of terrorist content. I therefore approve of the proposal for a regulation presented to us for this purpose today. Since the Internet is by nature cross-border in nature, the European Union is justified here in making its contribution to this work of public safety. The text has the merit of being firm because it requires digital platforms to remove their terrorist content within a maximum of one hour after receiving a removal order. Any platform that disobeys may be fined up to 4% of its global turnover. In addition, it is respectful of national sovereignty, as it entrusts each State with the task of designating the competent public authority to issue the withdrawal order. Finally, it preserves the right to a fair trial by giving the recipients or victims of the injunction the right to bring proceedings before the national courts concerned. For all these reasons, I support this text.'
tokens1_13 = word_tokenize(raw1_13)
for word in tokens1_13:
    print(word, end=' ')

Gilles Lebreton ( ID ) . – Madam President , Commissioner , ladies and gentlemen , Islamist terrorism is one of the greatest scourges of our time . France has again recently paid the price , in particular with the attack and beheading of a teacher , Samuel Paty , on October 16 , 2020 , on leaving his college , then with the assassination of a police officer on April 23 last . However , in both cases , the attack was perpetrated through the Internet . In the Paty case , the victim was identified and located by his killer because of a call for murder broadcast by a third party , and in the case of April 23 , the killer had become radicalized because of propaganda sites Islamist . These tragedies therefore prove , if need be , the need to strengthen the prevention of the online dissemination of terrorist content . I therefore approve of the proposal for a regulation presented to us for this purpose today . Since the Internet is by nature cross-border in nature , the European Union is just

In [65]:
raw1_14 = 'Gwendoline Delbos-Corfield (Greens/ALE). – Madam President, in some states of this European Union, one can be officially targeted as a traitor to the nation or an enemy of the state for simply having questioned a government decision. In some states of this European Union, the constitution can be interpreted as judging the fact of organizing a referendum as a terrorist act. In some states of this European Union, members of the government and senior officials have described as an act of terrorism minor material damage caused by environmental activists. We were not allowed to vote at second reading in this Parliament on the regulation preventing the dissemination of terrorist content online. However, this regulation will have decisive consequences on our collective freedoms. Tomorrow, the Ministry of the Interior of a country will be able to have content that it has declared terrorist in the neighboring country deleted in one hour by directly addressing the platform that hosts it, without any judicial authority of either of these two countries has ever had a say. Of course, it will then be possible to challenge this decision in court and, perhaps, to see justice done if it is legitimate, but only afterwards. In the meantime, the European Union will have created the opportunity for a form of prior censorship, which goes against the fundamental elements of freedom of expression.'
tokens1_14 = word_tokenize(raw1_14)
for word in tokens1_14:
    print(word, end=' ')

Gwendoline Delbos-Corfield ( Greens/ALE ) . – Madam President , in some states of this European Union , one can be officially targeted as a traitor to the nation or an enemy of the state for simply having questioned a government decision . In some states of this European Union , the constitution can be interpreted as judging the fact of organizing a referendum as a terrorist act . In some states of this European Union , members of the government and senior officials have described as an act of terrorism minor material damage caused by environmental activists . We were not allowed to vote at second reading in this Parliament on the regulation preventing the dissemination of terrorist content online . However , this regulation will have decisive consequences on our collective freedoms . Tomorrow , the Ministry of the Interior of a country will be able to have content that it has declared terrorist in the neighboring country deleted in one hour by directly addressing the platform that hos

In [66]:
raw1_15 = 'Mislav Kolakusic (NI). - Honorable Chair, the internet platform Facebook has spread like a smiling parasite, first to the United States, then to Europe and then to the rest of the world. That platform is, falsely presenting itself as apolitical, we would say a platform for all citizens, a platform on which everyone can speak freely. However, two years ago, they revealed themselves as a highly political platform, extremely dangerous because it is a political platform that does not go to the polls, but decides, and on its own, which is true and what is a lie. But worst of all, they got directly involved, so Facebook got directly involved in the 2019 European Parliament elections, and then got involved in the election, and in a completely wrong way, in the United States during the presidential election. We have to stop them. They are the greatest social threat to democracy.'
tokens1_15 = word_tokenize(raw1_15)
for word in tokens1_15:
    print(word, end=' ')

Mislav Kolakusic ( NI ) . - Honorable Chair , the internet platform Facebook has spread like a smiling parasite , first to the United States , then to Europe and then to the rest of the world . That platform is , falsely presenting itself as apolitical , we would say a platform for all citizens , a platform on which everyone can speak freely . However , two years ago , they revealed themselves as a highly political platform , extremely dangerous because it is a political platform that does not go to the polls , but decides , and on its own , which is true and what is a lie . But worst of all , they got directly involved , so Facebook got directly involved in the 2019 European Parliament elections , and then got involved in the election , and in a completely wrong way , in the United States during the presidential election . We have to stop them . They are the greatest social threat to democracy . 

In [67]:
raw1_16 = 'Geoffroy Didier (EPP). – Madam President, a European law requiring platforms to remove their terrorist content within one hour is a historic step forward. Europe is thus sending a signal of mistrust to the Islamists, to their accomplices, but also to the social networks which have so far masked their culpable passivity and their concern for profit behind the screen of a supposedly absolute freedom of expression. And I also want to say it here strongly and solemnly: I still do not understand that it was necessary to wait for the international emotion aroused by the murder of the French school teacher Samuel Paty, beheaded by a booze Islamist with hate content, so that some political groups – I am thinking of the Greens and the Socialists – are finally agreeing to join us in this fight. How many messages of hatred, how many revolting threats would it still have had to be endured for certain political leaders, their eyes clearly clouded by their ideology, to resolve to see reality in the face? This European regulation is there. The political fact is major, necessary, salutary, but the fight that had to be waged to achieve it says a lot about the state of our democracy. The challenge that we are going to have to take up together to protect the peoples is clearly just beginning.'
tokens1_16 = word_tokenize(raw1_16)
for word in tokens1_16:
    print(word, end=' ')

Geoffroy Didier ( EPP ) . – Madam President , a European law requiring platforms to remove their terrorist content within one hour is a historic step forward . Europe is thus sending a signal of mistrust to the Islamists , to their accomplices , but also to the social networks which have so far masked their culpable passivity and their concern for profit behind the screen of a supposedly absolute freedom of expression . And I also want to say it here strongly and solemnly : I still do not understand that it was necessary to wait for the international emotion aroused by the murder of the French school teacher Samuel Paty , beheaded by a booze Islamist with hate content , so that some political groups – I am thinking of the Greens and the Socialists – are finally agreeing to join us in this fight . How many messages of hatred , how many revolting threats would it still have had to be endured for certain political leaders , their eyes clearly clouded by their ideology , to resolve to see 

In [68]:
raw1_17 = 'Evin Incir (S&D). - Madam President! The regulation on the prevention of the spread of terrorist content online is EU policy at its most important. Terrorism must be cracked no matter where it shows its ugly face, regardless of ideological or religious affiliation. It is about defending human life and about defending our democracy and open society. We must fight it together, for common problems we only eliminate together. I would like to thank Commissioner Ylva Johansson, the European Parliament rapporteur and the shadow rapporteurs for your important work. Right-wing extremism and radical Islamist terrorism today seem to be vying for the worst, and they use the Internet as a tool to spread their heinous ideology: terrorism. The Wild West is over online. Online platforms must be given a clearer responsibility than today for keeping the platforms free from terrorist content. There must be an end to radicalization online. Everything that comes up should be taken down immediately. However, we must of course do so with respect for integrity and free conversation, art, satire and journalism, which the proposal is also clear about. We are now stepping up our work against terrorism and for democracy and human rights and freedoms. We are now stepping up our work in defense of our democracy.'
tokens1_17 = word_tokenize(raw1_17)
for word in tokens1_17:
    print(word, end=' ')

Evin Incir ( S & D ) . - Madam President ! The regulation on the prevention of the spread of terrorist content online is EU policy at its most important . Terrorism must be cracked no matter where it shows its ugly face , regardless of ideological or religious affiliation . It is about defending human life and about defending our democracy and open society . We must fight it together , for common problems we only eliminate together . I would like to thank Commissioner Ylva Johansson , the European Parliament rapporteur and the shadow rapporteurs for your important work . Right-wing extremism and radical Islamist terrorism today seem to be vying for the worst , and they use the Internet as a tool to spread their heinous ideology : terrorism . The Wild West is over online . Online platforms must be given a clearer responsibility than today for keeping the platforms free from terrorist content . There must be an end to radicalization online . Everything that comes up should be taken down 

In [69]:
raw1_18 = 'Moritz Körner (Renew). – Madam President, Commissioner, ladies and gentlemen! Of course, we must fight terrorism relentlessly in Europe. But what do terrorists want to sow apart from violence? They want to scare us. They want to attack free society. And if you want to defend free society against it, then you must not restrict freedom of expression disproportionately. I am very worried that this will happen with this instrument. I am not sure that automatic filters, which could be used, limit our freedom of expression. I am very concerned that the authorities that can issue cross-border erasure orders are actually narrowly defined enough for us to guarantee proportionality and the rule of law. There was a similar French law that failed in the highest courts. I regret that we cannot actually vote on this law again here in plenary at second reading. If I had had the opportunity, I would have voted against this regulation today. I appeal to the European Commission, when implementing and enforcing this instrument, to really ensure that the protective rights, for which the European Parliament has definitely fought, are actually respected. I am very worried that this will ultimately have no effect in practice.'
tokens1_18 = word_tokenize(raw1_18)
for word in tokens1_18:
    print(word, end=' ')

Moritz Körner ( Renew ) . – Madam President , Commissioner , ladies and gentlemen ! Of course , we must fight terrorism relentlessly in Europe . But what do terrorists want to sow apart from violence ? They want to scare us . They want to attack free society . And if you want to defend free society against it , then you must not restrict freedom of expression disproportionately . I am very worried that this will happen with this instrument . I am not sure that automatic filters , which could be used , limit our freedom of expression . I am very concerned that the authorities that can issue cross-border erasure orders are actually narrowly defined enough for us to guarantee proportionality and the rule of law . There was a similar French law that failed in the highest courts . I regret that we can not actually vote on this law again here in plenary at second reading . If I had had the opportunity , I would have voted against this regulation today . I appeal to the European Commission , 

In [70]:
raw1_19 = 'Maximilian Krah (ID). – Madam President, Commissioner, ladies and gentlemen! Of course we want to fight terrorism, and of course terrorism is also fueled via the Internet. But it is clear that there is always a risk when state institutions are given more powers to monitor the Internet and thus also to filter opinions. Here I am very grateful to Patryk Jaki and the shadow rapporteurs for finding a way to limit this risk and to present a law that, in an exemplary manner, succeeds in combating the real danger in a targeted manner, without us having to fear that unorthodox or dissident ones will opinions are limited. But one thing must be clear to everyone: we are fighting the symptoms of terrorism, not the causes. A police officer was recently slaughtered in France. This is not the result of a lack of internet surveillance, but of a lack of immigration control. We should therefore not only ensure that we monitor the Internet and give the police new powers, but also finally stop importing terrorism into Europe.'
tokens1_19 = word_tokenize(raw1_19)
for word in tokens1_19:
    print(word, end=' ')

Maximilian Krah ( ID ) . – Madam President , Commissioner , ladies and gentlemen ! Of course we want to fight terrorism , and of course terrorism is also fueled via the Internet . But it is clear that there is always a risk when state institutions are given more powers to monitor the Internet and thus also to filter opinions . Here I am very grateful to Patryk Jaki and the shadow rapporteurs for finding a way to limit this risk and to present a law that , in an exemplary manner , succeeds in combating the real danger in a targeted manner , without us having to fear that unorthodox or dissident ones will opinions are limited . But one thing must be clear to everyone : we are fighting the symptoms of terrorism , not the causes . A police officer was recently slaughtered in France . This is not the result of a lack of internet surveillance , but of a lack of immigration control . We should therefore not only ensure that we monitor the Internet and give the police new powers , but also fin

In [71]:
raw1_20 = 'Lukas Mandl (EPP). – Madam President, dear colleagues! I am Austrian. There was a devastating terrorist attack in our federal capital, Vienna, in the autumn. Terrorist attacks recently took place in France. All of Europe has already been affected by terrorist attacks, and we know that our European way of life does not suit everyone. We know that social networks, online networks, have many positive aspects on the one hand, but are also misused by criminal networks, especially terrorist networks, to recruit and to inform each other. Of course, we in Europe are responsible for ensuring that this abuse cannot take place. We protect life and limb with it. But what we also have to protect is what we defend against terrorism, as the European way of life: these are civil liberties, this is human dignity, this is the rule of law, this is liberal democracy, and this includes the values of the Enlightenment, including freedom of expression which also include freedom of speech. And you can only do that - rule out one thing and defend the other - if in the end it is always a person who checks whether content is terrorist content, and if the legal process is available for those who publish it , is open. A value of education is also that publication comes with responsibility, something that we need to understand more and more in the Internet age.'
tokens1_20 = word_tokenize(raw1_20)
for word in tokens1_20:
    print(word, end=' ')

Lukas Mandl ( EPP ) . – Madam President , dear colleagues ! I am Austrian . There was a devastating terrorist attack in our federal capital , Vienna , in the autumn . Terrorist attacks recently took place in France . All of Europe has already been affected by terrorist attacks , and we know that our European way of life does not suit everyone . We know that social networks , online networks , have many positive aspects on the one hand , but are also misused by criminal networks , especially terrorist networks , to recruit and to inform each other . Of course , we in Europe are responsible for ensuring that this abuse can not take place . We protect life and limb with it . But what we also have to protect is what we defend against terrorism , as the European way of life : these are civil liberties , this is human dignity , this is the rule of law , this is liberal democracy , and this includes the values of the Enlightenment , including freedom of expression which also include freedom

In [72]:
raw1_21 = 'Ylva Johansson, Member of the Commission. – Madam President, dear Members of Parliament, in 2002, al-Qaeda terrorists beheaded American journalist Daniel Pearl and posted the video online: the first video of a murder posted by terrorists, the first of many. In 2014, Daesh terrorists murdered journalist James Foley and posted the video online. They murdered journalists Steven Sotloff and Kenji Goto and shared the videos online – professionally-produced videos, the union of high technology and low barbarism. After these attacks, several news agencies stopped hiring freelance reporters in the region. Journalists are among the first targets and first victims of terrorists, and freedom of expression is their first casualty. Or take the murder of Samuel Paty, a teacher of history, geography and civics. His brutal murder was a direct attack on our values, our rights and our freedoms. The freedom of expression and information. The right to education. The freedom of thought, conscience and religion. The terrorists posted the picture of his dead body online. Terrorists exploit our fundamental rights against us. They abuse our open society and our freedoms, to spread fear and terror and ultimately to destroy those freedoms. Terrorists abuse the internet, which should be a tool for progress, and abuse it as an instrument of barbarity to spread death and destruction. This we can never ever accept. This is what we said loud and clear in today’s debate. And we say that, with the regulation we debated today – a firewall against terrorist content online, to take it down and to stop it spreading – together we strike a major blow against terrorists. Again, I would like to thank the Parliament, especially the rapporteur and the shadow rapporteurs for the good cooperation for this achievement.'
tokens1_21 = word_tokenize(raw1_21)
for word in tokens1_21:
    print(word, end=' ')

Ylva Johansson , Member of the Commission . – Madam President , dear Members of Parliament , in 2002 , al-Qaeda terrorists beheaded American journalist Daniel Pearl and posted the video online : the first video of a murder posted by terrorists , the first of many . In 2014 , Daesh terrorists murdered journalist James Foley and posted the video online . They murdered journalists Steven Sotloff and Kenji Goto and shared the videos online – professionally-produced videos , the union of high technology and low barbarism . After these attacks , several news agencies stopped hiring freelance reporters in the region . Journalists are among the first targets and first victims of terrorists , and freedom of expression is their first casualty . Or take the murder of Samuel Paty , a teacher of history , geography and civics . His brutal murder was a direct attack on our values , our rights and our freedoms . The freedom of expression and information . The right to education . The freedom of thoug

---
### Combine all parts

In [73]:
tokens = tokens1_1 + tokens1_2 + tokens1_3 + tokens1_4 + tokens1_5 + tokens1_6 + tokens1_7 + tokens1_8 + tokens1_9 + tokens1_10 + tokens1_11 + tokens1_12 + tokens1_13 + tokens1_14 + tokens1_15 + tokens1_16 + tokens1_17 + tokens1_18 + tokens1_19 + tokens1_20 + tokens1_21 
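
The chain of `+` operators works, but with this many speakers it is easy to drop or duplicate a part. A minimal sketch of an alternative, assuming the per-speaker token lists are collected in one list (here the hypothetical name `parts`, filled with toy data rather than the notebook's variables), flattens them in order with `itertools.chain`:

```python
from itertools import chain

# Toy stand-ins for tokens1_1 ... tokens1_21 (hypothetical data for illustration).
parts = [
    ["Madam", "President", ","],
    ["we", "live", "in", "difficult", "times", "."],
    ["Thank", "you", "."],
]

# chain.from_iterable concatenates the lists in order, like part1 + part2 + ...
combined = list(chain.from_iterable(parts))

# Sanity check: no token was lost or duplicated while combining.
assert len(combined) == sum(len(p) for p in parts)
print(combined)
```

The length check is the useful part: the same comparison against the sum of the individual `tokens1_*` lengths would confirm that the combined list is complete.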

---
### Normalize the words 

In [74]:
# lowercase every token so that later frequency counts are case-insensitive
eutext09 = [w.lower() for w in tokens]
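
Lowercasing is the only normalization applied in this step, so punctuation tokens such as `,` and `–` remain in the list. If they should be excluded before counting, one possible extension (an assumption, not a step this notebook performs) is to keep only tokens that contain at least one letter or digit. The names `tokens_demo`, `lowered` and `content_only` are hypothetical:

```python
# Toy token list in the style of the word_tokenize output above (hypothetical data).
tokens_demo = ["Madam", "President", ",", "the", "Internet", "–", "1.5", "S", "&", "D", "."]

# Step 1: lowercase, as in the notebook.
lowered = [w.lower() for w in tokens_demo]

# Step 2 (assumed extension): drop tokens made up only of punctuation or symbols.
content_only = [w for w in lowered if any(ch.isalnum() for ch in w)]

print(content_only)
```

Note that this also splits apart group labels such as `S&D`, so whether the filter is appropriate depends on the later analysis.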

---
**Save Output**

In [75]:
save_path = '/Users/charlottekaiser/Documents/uni/Hertie/master_thesis/00_data/20_intermediate_files'
file_name = "EU09_Preventing the dissemination of terrorist content online.txt"
completeName = os.path.join(save_path, file_name)
# the with-block closes the file, so the full token list is flushed to disk
with open(completeName, 'w', encoding='utf-8') as output:
    print(eutext09, file=output)
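
Because `print(eutext09, file=output)` writes the Python representation of the list (brackets, quotes and commas included), the file is not plain running text. A minimal sketch of how such a file could be read back into a list, assuming it was written this way; the temporary directory, the file name `demo_tokens.txt` and the toy list are hypothetical stand-ins for the real path and data:

```python
import ast
import os
import tempfile

# Toy token list standing in for eutext09 (hypothetical data).
eutext_demo = ["madam", "president", ",", "we", "’", "ve"]

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_tokens.txt")

    # Write the list the same way the notebook does: as its printed representation.
    with open(demo_path, "w", encoding="utf-8") as fh:
        print(eutext_demo, file=fh)

    # Read it back: literal_eval safely parses the text into a Python list.
    with open(demo_path, encoding="utf-8") as fh:
        restored = ast.literal_eval(fh.read().strip())

# The round trip should return exactly the tokens that were saved.
assert restored == eutext_demo
```

`ast.literal_eval` only accepts Python literals, which makes it a safer choice than `eval` for files like this.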