### *Data Collection - European Parliament*
## Preparing Raw Data
---
**Sample Text 13**
Title: Artificial intelligence in education, culture and the audiovisual sector <br>
Date: May 18, 2021 - Brussels

In [31]:
# Import the libraries used in this notebook
import os.path
import pprint
import re
import time
import urllib.request
from urllib import request

import nltk
import pandas as pd
import requests
from bs4 import BeautifulSoup
from nltk import FreqDist, word_tokenize
from requests_html import HTMLSession

---
### Process: Trimming the debate by inserting the original English or translated English text and tokenizing it.
*Note*: Due to time constraints, the process has been streamlined.

- English parts of the debate are added manually as a string and then tokenized.

- All EU Parliament debates are translated and added in the same way: non-English parts are copied from the original web pages, translated to English with one consistent tool, Google Translate (https://translate.google.com/?hl=de&tab=TT), and pasted in as a string.

- Afterwards, the usual steps are applied (tokenizing, standardizing).

Because of the changed process, the URL and the web-scraping step are technically no longer necessary; they are nevertheless included for completeness.
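The paste, tokenize and standardize steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `tokenize` is a hypothetical regex-based stand-in for `nltk.word_tokenize` (chosen so the sketch runs without downloading NLTK data, and it splits contractions differently), and since the notebook does not spell out what "standardizing" involves, lowercasing and dropping punctuation and number tokens is assumed here.

```python
import re

# Hypothetical stand-in for nltk.word_tokenize, used here so the sketch runs
# without NLTK data: a token is a word (optionally with inner hyphens or
# apostrophes) or a single punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+(?:[-'’]\w+)*|[^\w\s]")


def tokenize(text):
    """Split a pasted debate string into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def standardize(tokens):
    """Assumed standardization: lowercase and drop tokens without any letter."""
    return [tok.lower() for tok in tokens if any(ch.isalpha() for ch in tok)]


sample = "Mr President, on behalf of the Committee on Civil Liberties"
print(standardize(tokenize(sample)))
```

In the notebook itself, `word_tokenize` would take the place of the hypothetical `tokenize` helper.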

In [32]:
# url = "https://www.europarl.europa.eu/doceo/document/CRE-9-2021-05-18-ITM-023_EN.html"
# html = requests.get(url)
# raw = BeautifulSoup(html.content, 'html.parser').get_text()

In [33]:
raw1_1 = (
    'Sabine Verheyen, rapporteur. – Mr President, Commissioner, dear colleagues! First of all, even at this late hour, I would like to thank my shadow rapporteurs, the secretariat, and the staff for the good working relationship, cooperation and support. Through this cooperation I think a good comprehensive report was possible. We wrote a report on artificial intelligence in the fields of education, culture and audiovisual media. In this report, we call for a clear ethical framework for the use of AI technology in media to ensure people have access to culturally and linguistically diverse content. Such a framework should also address the misuse of artificial intelligence to spread fake news and disinformation. We need clear rules for AI technologies so that they protect anti-discrimination, gender equality, pluralism and cultural and linguistic diversity. The use of data that reflects existing inequality or discrimination should be excluded when training artificial intelligence. Instead, an inclusive and ethical framework should be developed for the use of datasets that are applied during the deep learning process. There is no doubt that artificial intelligence technologies will encounter us in all areas of life in the future and will certainly also enrich them. However, we must put people at the center of the use of all technology, and especially in the field of education, artificial intelligence technology must be geared towards the human being, who makes the final decision and holds the responsibility in his hands. Nevertheless, I think it is wrong to generally classify the education sector as a high-risk sector in the discussion about artificial intelligence. In the education sector in particular, there are all conceivable ways in which AI can be used. Some carry higher risks, others far lower. '
    'Here, however, a clear risk assessment is required and not a generalization of an entire sector in order to be able to classify regulation and security measures depending on the actual current risk. At this point I would also like to remind you of the importance of strengthening digital skills at Union level as a prerequisite for the use of artificial intelligence in education and of the need, above all, to train teachers so that they can adapt to the realities of AI-based Education can adapt and risks from misuse of technology can be reduced. Teachers must always be able to correct decisions made by the AI, such as when grading students. Teachers must never be replaced by AI technologies, and especially not in early childhood education. For the cultural and creative sectors, we call for a coherent vision of the use of artificial intelligence at European level and in the Member States. Algorithm-based content recommendations, especially for video and music streaming services, should not be able to negatively impact cultural and linguistic diversity in the European Union. To this end, specific indicators will be developed to assess diversity and ensure that European works are promoted. In this way, we want to reduce filter bubbles and enable more transparency for consumers, who can then better understand algorithm decisions. We have fought for decades to establish our values of inclusion, non-discrimination, multilingualism and cultural diversity, which our citizens see as an essential part of European identity. These values must also be reflected in the online world, in which algorithms and artificial intelligence applications are increasingly being used. Maximum transparency and the development of high-quality and inclusive data systems for the deployment of deep learning are crucial, as is a clear ethical framework to ensure access to culturally and linguistically diverse content. '
    'Artificial intelligence is a technology that influences many areas of our lives and is a key driver of digital change. In my opinion, one thing is certain: Ultimately, technology must serve people. The human being is the focus and has the responsibility as well as the control and the final decision.'
)
tokens1_1 = word_tokenize(raw1_1)
for word in tokens1_1:
    print(word, end=' ')

Sabine Verheyen , rapporteur . – Mr President , Commissioner , dear colleagues ! First of all , even at this late hour , I would like to thank my shadow rapporteurs , the secretariat , and the staff for the good working relationship , cooperation and support . Through this cooperation I think a good comprehensive report was possible . We wrote a report on artificial intelligence in the fields of education , culture and audiovisual media . In this report , we call for a clear ethical framework for the use of AI technology in media to ensure people have access to culturally and linguistically diverse content . Such a framework should also address the misuse of artificial intelligence to spread fake news and disinformation . We need clear rules for AI technologies so that they protect anti-discrimination , gender equality , pluralism and cultural and linguistic diversity . The use of data that reflects existing inequality or discrimination should be excluded when training artificial intel

In [34]:
raw1_2 = 'Ondřej Kovařík, rapporteur for the opinion of the Committee on Civil Liberties, Justice and Home Affairs. – Mr President, on behalf of the Committee on Civil Liberties, Justice and Home Affairs (LIBE), let me present the main elements from our opinion. Primarily, we insist that any use of artificial intelligence in the education, culture and audiovisual sectors must fully respect fundamental rights and freedoms as set down in the Treaties. AI can enhance the education process if it’s used wisely, contributing to digital skills which will be so important in the future. About 40% of children today will work in jobs that do not exist yet. I therefore welcome the Commission’s intention to update the digital education action plan in this regard. We highlight the need for safeguards in the use of AI for children, who are more vulnerable and therefore deserve particular attention and protection. There must always be the possibility for human intervention to ensure that no bias is found in the systems and there are equal opportunities for all. Let me conclude my remarks by calling for an open regulatory framework for artificial intelligence where benefits in the education, audiovisual and cultural sectors are maximised and risks minimised.'
tokens1_2 = word_tokenize(raw1_2)
for word in tokens1_2:
    print(word, end=' ')

Ondřej Kovařík , rapporteur for the opinion of the Committee on Civil Liberties , Justice and Home Affairs . – Mr President , on behalf of the Committee on Civil Liberties , Justice and Home Affairs ( LIBE ) , let me present the main elements from our opinion . Primarily , we insist that any use of artificial intelligence in the education , culture and audiovisual sectors must fully respect fundamental rights and freedoms as set down in the Treaties . AI can enhance the education process if it ’ s used wisely , contributing to digital skills which will be so important in the future . About 40 % of children today will work in jobs that do not exist yet . I therefore welcome the Commission ’ s intention to update the digital education action plan in this regard . We highlight the need for safeguards in the use of AI for children , who are more vulnerable and therefore deserve particular attention and protection . There must always be the possibility for human intervention to ensure that 
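The cells in this section repeat the same tokenize-and-print steps for each speaker. As a possible alternative, the segments could be kept in one dictionary and processed in a loop. The sketch below is illustrative only: `tokenize` is a hypothetical regex-based stand-in for `nltk.word_tokenize`, the `segments` dictionary is not part of the notebook, and the strings are shortened to the opening words of the first two speeches.

```python
import re


def tokenize(text):
    """Hypothetical stand-in for nltk.word_tokenize (words or single punctuation marks)."""
    return re.findall(r"\w+(?:[-'’]\w+)*|[^\w\s]", text)


# Shortened placeholders; in the notebook these would be the full raw1_1, raw1_2, ... strings.
segments = {
    "raw1_1": "Sabine Verheyen, rapporteur. – Mr President, Commissioner, dear colleagues!",
    "raw1_2": "Ondřej Kovařík, rapporteur for the opinion of the Committee on Civil Liberties, Justice and Home Affairs.",
}

# Tokenize every segment once and keep the results under the same key.
tokens_by_segment = {name: tokenize(text) for name, text in segments.items()}

for name, tokens in tokens_by_segment.items():
    # Print the segment name, its token count and the first few tokens.
    print(name, len(tokens), " ".join(tokens[:8]))
```

Keeping the tokens keyed by segment name preserves the one-variable-per-speaker structure while removing the copy-pasted loop.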

In [35]:
raw1_3 = (
    'Mariya Gabriel, Member of the Commission. – Mr President, ladies and gentlemen Members of the European Parliament, allow me first of all to thank the rapporteur Sabine Verheyen, chair of the Committee on Culture and Education, and the whole of this committee for this report and for highlighting the importance of artificial intelligence in education, culture and the audiovisual sector. The European Parliament and the European Commission fully recognize the strategic importance of artificial intelligence and other emerging technologies for the Union. Artificial intelligence is already transforming our lives and it will continue to have an increasing impact on the way we learn, work and enjoy our free time. However, this change is not without risk. Artificial intelligence technologies are increasingly being used to spread fake news, distort reality, hijack data or even marginalize groups of people. Now is the time to seize the opportunities and meet the challenges. It is time for adaptation and this is precisely where education comes in. Education can help and add a bridge between technological advances and the training needed to use them safely, effectively and ethically. The action plan for digital education exploits the possibilities offered by technologies such as artificial intelligence, virtual reality or robotics. The goal is to scale up successful practices and better understand the ethical, privacy and security implications. It will offer targeted funding and support measures and act as a catalyst to promote a strategic approach to digital education in the Union. Through this plan, we are strengthening digital skills and developing guidelines that will help our Member States support educators, learners, researchers and future innovators, and thus contribute to the development of a successful digital education ecosystem. . For example, we will prepare ethical guidelines on artificial intelligence and the use of data in teaching and learning. '
    'I would like to tell you that at this very moment, the call for experts who will form this group has been launched since April 23. The idea is that they start their work next month, that is to say in June, and to have a first report by the end of this year. Similar guidelines on digital literacy and disinformation will also be developed, to tackle the spread of fake news and disinformation online. Raising awareness is equally important, which is why after the publication of the guidelines, the Commission will support awareness-raising activities aimed at educators and students. At the same time, we are updating the digital skills framework of the action plan, adding artificial intelligence and data skills. This framework defines the skills required for citizens who are digitally competent in the 21st century. Finally, we also support research and innovation in artificial intelligence in education. We have started to explore the use of artificial intelligence and learning analytics in education to support the education system to meet skills needs and trends. Two online artificial intelligence services are being developed right now. However, as highlighted in your report, the impact of digital technologies, such as artificial intelligence, goes far beyond the education and training sector. They also have an impact on the cultural and creative sectors, as well as on the heritage sector. An obvious example is the issue of the control of online streaming services by artificial intelligence, which is mentioned in your report and which generated a lot of interest during a recent online conference entitled "Diversity and competitiveness of the European sector music", which my services organized with experts from the Member States and representatives of the sector. Streaming will fundamentally change the way music reaches consumers. It can create opportunities to build diversity, but it can also favor some artists, languages or genres over others. '
    'It is therefore essential to ensure greater transparency in the use of algorithms by streaming services, and fair financial compensation for creators and rights holders. At the same time, digital technologies, such as artificial intelligence, offer new possibilities to make cultural heritage more accessible to all audiences, and also to preserve cultural content. The crisis has also made clearer than ever the need to strengthen the link between digitization, heritage and education as digital possibilities make learning about heritage more accessible. A good example is the Europeana platform, which was able to reach nearly 50,000 schoolchildren with online courses and new platforms during the pandemic. We are also exploring the relationship between digital technologies and cultural heritage, through the Commissionary expert group on cultural heritage, which meets in April to inform Member State representatives of the possibilities offered by the application of artificial intelligence to cultural heritage. With regard to research and innovation, the new project of the program "Horizon Europe" includes a new fully dedicated culture cluster, which will at this time support culture and inclusive society and which will help the fields of culture and cultural heritage to benefit from projects dealing with the economic and social impact of technological applications and digital services linked to the evolution of the sector. Finally and lastly, technology also has potential to promote the concept of sustainable cultural tourism. The working group of European experts on sustainable cultural tourism, within the framework of the open method of coordination and in the context of previous work plans for culture, has published a report which includes the first definition of sustainable cultural tourism , as well as recommendations and guidelines for policy makers. '
    'I think it is first up to us to see how all these elements can truly serve the recommendations in the report, for which I thank you once again from the bottom of my heart.'
)
tokens1_3 = word_tokenize(raw1_3)
for word in tokens1_3:
    print(word, end=' ')

Mariya Gabriel , Member of the Commission . – Mr President , ladies and gentlemen Members of the European Parliament , allow me first of all to thank the rapporteur Sabine Verheyen , chair of the Committee on Culture and Education , and the whole of this committee for this report and for highlighting the importance of artificial intelligence in education , culture and the audiovisual sector . The European Parliament and the European Commission fully recognize the strategic importance of artificial intelligence and other emerging technologies for the Union . Artificial intelligence is already transforming our lives and it will continue to have an increasing impact on the way we learn , work and enjoy our free time . However , this change is not without risk . Artificial intelligence technologies are increasingly being used to spread fake news , distort reality , hijack data or even marginalize groups of people . Now is the time to seize the opportunities and meet the challenges . It is 

In [36]:
raw1_4 = 'Kim Van Sparrentak, rapporteur for the opinion of the Committee on the Internal Market and Consumer Protection. – Mr President, this week in the Netherlands secondary schools have started their final exams, which is an important moment in their lives and for their future. Last year, in the UK, not exams but artificial intelligence determined A-level exam results, which resulted in arbitrary and discriminatory results. Kids who went to an expensive private school were systematically marked up and kids from state schools were systematically given a lower grade than expected. We need to protect pupils and students against situations where a computer or algorithm decides their future without human oversight. But we also have to protect them from companies harvesting their personal data through educational software. In the Netherlands, 70% of all primary schools are now Google schools, which is increasingly raising privacy concerns. For crucial digital infrastructure, like our education systems, we must not be dependent on a small number of large companies. We need public investment in education technology, clear public procurement rules and always human oversight on all decisions about the future of pupils and students.'
tokens1_4 = word_tokenize(raw1_4)
for word in tokens1_4:
    print(word, end=' ')

Kim Van Sparrentak , rapporteur for the opinion of the Committee on the Internal Market and Consumer Protection . – Mr President , this week in the Netherlands secondary schools have started their final exams , which is an important moment in their lives and for their future . Last year , in the UK , not exams but artificial intelligence determined A-level exam results , which resulted in arbitrary and discriminatory results . Kids who went to an expensive private school were systematically marked up and kids from state schools were systematically given a lower grade than expected . We need to protect pupils and students against situations where a computer or algorithm decides their future without human oversight . But we also have to protect them from companies harvesting their personal data through educational software . In the Netherlands , 70 % of all primary schools are now Google schools , which is increasingly raising privacy concerns . For crucial digital infrastructure , like 

In [37]:
raw1_5 = 'Angel Djambazki, rapporteur for the opinion of the Committee on Legal Affairs. - Mr President, Commissioner Gabriel, artificial intelligence and its application are becoming more common in various areas of our lives, such as defense. We have witnessed the complete transition of activities from our daily lives in absentia, online. The same happened in the field of education, which showed a significant problem that needs to be addressed. 42% of the population of the European Union do not have basic digital skills. I believe that teachers cannot and should not be replaced by machines and programs, in the foreseeable future at least, but funds must be set aside for their qualification, for working with artificial intelligence, interactive teaching. Artificial intelligence also plays a role in spreading fake news. It is important to be able to counteract and use it to combat misinformation. I do not think it should be unattended and without people, because more than once the use of algorithms from major social networks and online platforms leads to downloads, blocking content that is pure censorship. And this is a serious part, for which I thank and congratulate the rapporteur. However, there is a very frivolous part, and I admit, I am impressed, even in this report, when it comes to artificial intelligence, you managed to include your favorite topics such as gender propaganda, LGBT and gender, gender, equality, etc. Even the poor robots in the future will have to comply with this propaganda, which makes their future quite sad, I think, but what to do is the option in which we live.'
tokens1_5 = word_tokenize(raw1_5)
for word in tokens1_5:
    print(word, end=' ')

Angel Djambazki , rapporteur for the opinion of the Committee on Legal Affairs . - Mr President , Commissioner Gabriel , artificial intelligence and its application are becoming more common in various areas of our lives , such as defense . We have witnessed the complete transition of activities from our daily lives in absentia , online . The same happened in the field of education , which showed a significant problem that needs to be addressed . 42 % of the population of the European Union do not have basic digital skills . I believe that teachers can not and should not be replaced by machines and programs , in the foreseeable future at least , but funds must be set aside for their qualification , for working with artificial intelligence , interactive teaching . Artificial intelligence also plays a role in spreading fake news . It is important to be able to counteract and use it to combat misinformation . I do not think it should be unattended and without people , because more than onc

In [38]:
raw1_6 = 'Maria da Graça Carvalho, rapporteur of the opinion of the Committee on Womens Rights and Gender Equality. – Mr President, Commissioner, Ladies and Gentlemen, I congratulate the rapporteur, Ms Sabine Verheyen, for her work on this report, hoping that we can effectively apply the full potential of artificial intelligence in education, culture and audiovisual media. As rapporteur for the opinion of the FEMM Committee, I consider it essential that the European Union, when regulating artificial intelligence, consider ethical aspects, namely from a gender perspective. It is also essential to develop strategies to increase the participation of women in the areas of digital and, in particular, in artificial intelligence, namely through the education system. The audiovisual and cultural sectors can also combat prejudices that associate new technologies with the masculine gender, for example by presenting good examples in the feminine, both in information and in fiction. We need everyone, men and women, to succeed in this transformation that lies ahead and we have to succeed.'
tokens1_6 = word_tokenize(raw1_6)
for word in tokens1_6:
    print(word, end=' ')

Maria da Graça Carvalho , rapporteur of the opinion of the Committee on Womens Rights and Gender Equality . – Mr President , Commissioner , Ladies and Gentlemen , I congratulate the rapporteur , Ms Sabine Verheyen , for her work on this report , hoping that we can effectively apply the full potential of artificial intelligence in education , culture and audiovisual media . As rapporteur for the opinion of the FEMM Committee , I consider it essential that the European Union , when regulating artificial intelligence , consider ethical aspects , namely from a gender perspective . It is also essential to develop strategies to increase the participation of women in the areas of digital and , in particular , in artificial intelligence , namely through the education system . The audiovisual and cultural sectors can also combat prejudices that associate new technologies with the masculine gender , for example by presenting good examples in the feminine , both in information and in fiction . We

In [39]:
raw1_7 = 'Miriam Lexmann, on behalf of the PPE Group. - Mr President, human dignity and good humanity must always be at the heart of the development and use of unique artificial intelligence technology. It already affects almost all areas of our lives today. It can help us in many situations, and provides completely new possibilities. In addition to allowing us to tailor learning, which is of great importance for the future of education systems, it can also influence what information comes to us through social media, thus changing our thinking and decisions. And here is the point where we need to pay particular attention to the development of critical thinking and digital and media literacy. Artificial intelligence must always be based on clear requirements for transparency. Users must always be aware that they are coming into contact with artificial intelligence and understand the basic features of its operation. Member States should therefore pay particular attention to the quality of education, not only in schools, but also in education in society of all ages. The key is the ethical dimension of artificial intelligence. We must ensure that it does not distort reality. dissemination of misinformation or manipulation in order to make a profit, but on the contrary, so that its goal is to serve man and his good.'
tokens1_7 = word_tokenize(raw1_7)
for word in tokens1_7:
    print(word, end=' ')

Miriam Lexmann , on behalf of the PPE Group . - Mr President , human dignity and good humanity must always be at the heart of the development and use of unique artificial intelligence technology . It already affects almost all areas of our lives today . It can help us in many situations , and provides completely new possibilities . In addition to allowing us to tailor learning , which is of great importance for the future of education systems , it can also influence what information comes to us through social media , thus changing our thinking and decisions . And here is the point where we need to pay particular attention to the development of critical thinking and digital and media literacy . Artificial intelligence must always be based on clear requirements for transparency . Users must always be aware that they are coming into contact with artificial intelligence and understand the basic features of its operation . Member States should therefore pay particular attention to the quali

In [40]:
raw1_8 = 'Ibán García Del Blanco, on behalf of the S&D Group. – Mr. President, artificial intelligence, especially when it is related to freedoms or fundamental rights, has to reach the highest levels of compliance with ethical standards. Education is a clear case: it must be considered, within that division that the Commission itself has established, as a high-risk sector and, therefore, be controlled from its inception to the development of technologies that can be used to improve the field of education, to improve the experience, but never to replace the relationship that education must also have, the people being educated, with educators, with human beings: this possibility must never be replaced by the use of technology. In the same way, speaking of the culture and audiovisual sector, the implication that the use of these techniques has, the change in that ecosystem, may be so great that it is appropriate that we make a second more specific reading in this sector, and that This is what also mandates this report in some way, and it is an aspect that I want to highlight. And lastly, we have digital literacy: it is not just about having digital skills, it is about deeply understanding what this phenomenon implies, and, therefore, I also want to highlight the need we have to introduce that understanding of change in education. of which we are simply already observing its maximum dimension.'
tokens1_8 = word_tokenize(raw1_8)
for word in tokens1_8:
    print(word, end=' ')

Ibán García Del Blanco , on behalf of the S & D Group . – Mr. President , artificial intelligence , especially when it is related to freedoms or fundamental rights , has to reach the highest levels of compliance with ethical standards . Education is a clear case : it must be considered , within that division that the Commission itself has established , as a high-risk sector and , therefore , be controlled from its inception to the development of technologies that can be used to improve the field of education , to improve the experience , but never to replace the relationship that education must also have , the people being educated , with educators , with human beings : this possibility must never be replaced by the use of technology . In the same way , speaking of the culture and audiovisual sector , the implication that the use of these techniques has , the change in that ecosystem , may be so great that it is appropriate that we make a second more specific reading in this sector , a

In [41]:
raw1_9 = 'Laurence Farreng, on behalf of the Renew group. – Mr President, Commissioner, thank you first of all to the rapporteur, Mrs Verheyen, for this very important report. Artificial intelligence represents an immense opportunity for players in education, culture and the European audiovisual sector. Whether it involves more interactive learning devices, particularly linguistic ones, or enhancement of our heritage, artificial intelligence opens up many possibilities for our performance and our competitiveness. It is therefore essential, as we are asking in this group for this report, to allow everyone equitable and fair access to this digital technology. However, this also obliges us to take up several challenges: first, the strengthening of the digital skills of Europeans and the adequate training of teaching staff; then, the structuring of a competitive and attractive cutting-edge research sector for European researchers; finally, the development of an artificial intelligence that is ethical from the design stage and emblematic of our European values, such as cultural diversity, gender equality or freedom of expression and information. Europe therefore has its full role to play in meeting these challenges, through its “Horizon Europe”, “Digital Europe” and “Creative Europe” programmes. However, artificial intelligence also has its dark side: deep fakes, which make it possible to fake speeches, or the accelerated dissemination of false information. Against these practices, we will have to preserve high-risk areas, such as education, and also fight even more strongly against disinformation. In this respect, we expect a great deal from the framework project proposed by the Commission a few weeks ago.'
tokens1_9 = word_tokenize(raw1_9)
for word in tokens1_9:
    print(word, end=' ')

Laurence Farreng , on behalf of the Renew group . – Mr President , Commissioner , thank you first of all to the rapporteur , Mrs Verheyen , for this very important report . Artificial intelligence represents an immense opportunity for players in education , culture and the European audiovisual sector . Whether it involves more interactive learning devices , particularly linguistic ones , or enhancement of our heritage , artificial intelligence opens up many possibilities for our performance and our competitiveness . It is therefore essential , as we are asking in this group for this report , to allow everyone equitable and fair access to this digital technology . However , this also obliges us to take up several challenges : first , the strengthening of the digital skills of Europeans and the adequate training of teaching staff ; then , the structuring of a competitive and attractive cutting-edge research sector for European researchers ; finally , the development of an artificial inte

In [42]:
raw1_10 = 'Marcel Kolaja, on behalf of the Verts/ALE Group. – Mr President, artificial intelligence will bring changes to our societies. However, only if we eliminate the risks will these changes be positive. For instance, AI systems evaluating applications for universities must be considered high risk. Imagine that a technology faculty employs such a system and uses graduates’ profiles for machine learning. Women are under-represented in technology, so the system may reject applications from women, based on that pattern, and such discrimination is unacceptable. Systems with such an impact on people’s lives must be verifiable at every stage of their life cycle by authorities and by civil society. Let me remind you that free and open source solutions serve this purpose best. We also need to update curricula so that students are prepared to fully participate in the digital society and to understand how AI systems may influence their decisions, shopping patterns or what they see online. And finally, my Group has also been calling for a ban on facial recognition technologies in public space. Among other things, they can be easily abused to persecute investigative journalists. And please let me finish by thanking the rapporteur and other shadow rapporteurs for their work on this file.'
tokens1_10 = word_tokenize(raw1_10)
for word in tokens1_10:
    print(word, end=' ')

Marcel Kolaja , on behalf of the Verts/ALE Group . – Mr President , artificial intelligence will bring changes to our societies . However , only if we eliminate the risks will these changes be positive . For instance , AI systems evaluating applications for universities must be considered high risk . Imagine that a technology faculty employs such a system and uses graduates ’ profiles for machine learning . Women are under-represented in technology , so the system may reject applications from women , based on that pattern , and such discrimination is unacceptable . Systems with such an impact on people ’ s lives must be verifiable at every stage of their life cycle by authorities and by civil society . Let me remind you that free and open source solutions serve this purpose best . We also need to update curricula so that students are prepared to fully participate in the digital society and to understand how AI systems may influence their decisions , shopping patterns or what they see o

In [43]:
raw1_11 = 'Beata Mazurek, on behalf of the ECR Group. - Mr Chairman! The development of artificial intelligence is a chance for another civilization leap. For this to happen, we need to coordinate investment activities in this area and disseminate knowledge on this subject among the society, which still shows a lack of trust in this technology. In addition, if we want to efficiently implement solutions brought about by artificial intelligence, we have to fight digital exclusion, which unfortunately affects a large part of the inhabitants of Europe. Learning about artificial intelligence and digitization should take place at all stages of education. An educated society is the best capital that will have a significant impact on its development. The governments of the Member States should support the implementation of modern digital solutions through cooperation with innovative companies from Europe, as is the case in Poland through the government GovTech platform, which strengthens the involvement of external entities in improving digital skills in various areas of public life.'
tokens1_11 = word_tokenize(raw1_11)
for word in tokens1_11:
    print(word, end=' ')

Beata Mazurek , on behalf of the ECR Group . - Mr Chairman ! The development of artificial intelligence is a chance for another civilization leap . For this to happen , we need to coordinate investment activities in this area and disseminate knowledge on this subject among the society , which still shows a lack of trust in this technology . In addition , if we want to efficiently implement solutions brought about by artificial intelligence , we have to fight digital exclusion , which unfortunately affects a large part of the inhabitants of Europe . Learning about artificial intelligence and digitization should take place at all stages of education . An educated society is the best capital that will have a significant impact on its development . The governments of the Member States should support the implementation of modern digital solutions through cooperation with innovative companies from Europe , as is the case in Poland through the government GovTech platform , which strengthens t

In [44]:
raw1_12 = 'Martina Michels, on behalf of The Left Group. – Mr President, Commissioner! The culture committee naturally asks different questions than the industry, employment or consumer committees. But the EU AI strategy is not just about liability issues for self-driving cars, to put it bluntly. Think of Cambridge Analytica and its electoral influence, of machines that win literary competitions and have long since distributed more than self-written weather reports. Then it becomes clear: A socio-political approach to a European AI strategy is overdue. The use of AI in education, culture and media is full of the digital gender gap. According to studies, one in ten women in the EU has suffered some form of cyber violence since the age of fifteen. AI deployment may block, discriminate, or influence opinions by using data without consent. The report now calls for the Commission to consider recognizing education as a high-risk area in its AI strategy. And the bottom line is that he recommends rewriting the AI white paper so that our culture, our democratic dialogue of the future, also takes place in it with an ethical, transparent and non-discriminatory AI. Our group will definitely agree with this approach.'
tokens1_12 = word_tokenize(raw1_12)
for word in tokens1_12:
    print(word, end=' ')

Martina Michels , on behalf of The Left Group . – Mr President , Commissioner ! The culture committee naturally asks different questions than the industry , employment or consumer committees . But the EU AI strategy is not just about liability issues for self-driving cars , to put it bluntly . Think of Cambridge Analytica and its electoral influence , of machines that win literary competitions and have long since distributed more than self-written weather reports . Then it becomes clear : A socio-political approach to a European AI strategy is overdue . The use of AI in education , culture and media is full of the digital gender gap . According to studies , one in ten women in the EU has suffered some form of cyber violence since the age of fifteen . AI deployment may block , discriminate , or influence opinions by using data without consent . The report now calls for the Commission to consider recognizing education as a high-risk area in its AI strategy . And the bottom line is that h

In [45]:
raw1_13 = 'Paul Tang (S&D). – Mr President, one year ago I read Weapons of Math Destruction by Cathy O’Neil. She noticed that companies started AI for good reasons but forgot those reasons in due process. A month ago, the Commission proposed legislation for ethical AI and today the report adds what is missing to the Commission proposal, especially in paragraphs 18 and 45. The first concern is the use of AI in advertising. The advertising business harms privacy, harms competition and above all, harms our democracy. That’s why this calls for strictly limiting personalised ads and tracking of uses. The second concern is biometric recognition. As the Reclaim Your Face movement makes clear, it endangers our society and our children are the first ones to suffer. This calls for a ban on facial recognition in educational use: no more online proctoring, no more cameras on school premises. Let’s use AI only for good reasons.'
tokens1_13 = word_tokenize(raw1_13)
for word in tokens1_13:
    print(word, end=' ')

Paul Tang ( S & D ) . – Mr President , one year ago I read Weapons of Math Destruction by Cathy O ’ Neil . She noticed that companies started AI for good reasons but forgot those reasons in due process . A month ago , the Commission proposed legislation for ethical AI and today the report adds what is missing to the Commission proposal , especially in paragraphs 18 and 45 . The first concern is the use of AI in advertising . The advertising business harms privacy , harms competition and above all , harms our democracy . That ’ s why this calls for strictly limiting personalised ads and tracking of uses . The second concern is biometric recognition . As the Reclaim Your Face movement makes clear , it endangers our society and our children are the first ones to suffer . This calls for a ban on facial recognition in educational use : no more online proctoring , no more cameras on school premises . Let ’ s use AI only for good reasons . 

In [46]:
raw1_14 = 'Svenja Hahn (Renew). – Mr. Chairman! We are currently doing pioneering work with a legal framework for artificial intelligence. Ms Verheyen, your report makes important points from an educational and cultural point of view. However, these must fundamentally apply beyond these areas if we want to use the opportunities of artificial intelligence. Ethical principles and civil rights must be our guideline for the development and use of artificial intelligence. There must be no misuse of digital technologies, the protection of our fundamental rights is non-negotiable. This distinguishes us and must always distinguish us from authoritarian and totalitarian regimes. Artificial intelligence must always conform to fundamental rights. Seizing opportunities therefore also means excluding applications. That is why there needs to be a clear ban on general surveillance obligations such as upload filters or biometric surveillance in public spaces. Let us create a smart, sustainable and technology-open framework for artificial intelligence based on ethical principles and European core values!'
tokens1_14 = word_tokenize(raw1_14)
for word in tokens1_14:
    print(word, end=' ')

Svenja Hahn ( Renew ) . – Mr. Chairman ! We are currently doing pioneering work with a legal framework for artificial intelligence . Ms Verheyen , your report makes important points from an educational and cultural point of view . However , these must fundamentally apply beyond these areas if we want to use the opportunities of artificial intelligence . Ethical principles and civil rights must be our guideline for the development and use of artificial intelligence . There must be no misuse of digital technologies , the protection of our fundamental rights is non-negotiable . This distinguishes us and must always distinguish us from authoritarian and totalitarian regimes . Artificial intelligence must always conform to fundamental rights . Seizing opportunities therefore also means excluding applications . That is why there needs to be a clear ban on general surveillance obligations such as upload filters or biometric surveillance in public spaces . Let us create a smart , sustainable a

In [47]:
raw1_15 = 'Pernando Barrena Arza (The Left). – Mr President, the report on artificial intelligence in education, culture and the audiovisual sector is a good one, and I’d like to congratulate all the colleagues involved. The rapid and exponential development of artificial intelligence in a growing number of areas poses many challenges. In this regard, the report is the first one to take a socio—political approach to an artificial-intelligence strategy, to democracy, inclusion, securing freedom of speech and data security by looking at education, culture and media in the development and obligations of AI. Nevertheless, I would like to raise a couple of issues: first, the importance of stressing non—discrimination, eliminating gender stereotypes, and developing inclusivity, also in data collection and its development; and, second, a gain for linguistic diversity and language learning. In this regard, we advocate for AI as a tool offering a specific potential for innovation, specifically for lesser—used languages in Europe.'
tokens1_15 = word_tokenize(raw1_15)
for word in tokens1_15:
    print(word, end=' ')

Pernando Barrena Arza ( The Left ) . – Mr President , the report on artificial intelligence in education , culture and the audiovisual sector is a good one , and I ’ d like to congratulate all the colleagues involved . The rapid and exponential development of artificial intelligence in a growing number of areas poses many challenges . In this regard , the report is the first one to take a socio—political approach to an artificial-intelligence strategy , to democracy , inclusion , securing freedom of speech and data security by looking at education , culture and media in the development and obligations of AI . Nevertheless , I would like to raise a couple of issues : first , the importance of stressing non—discrimination , eliminating gender stereotypes , and developing inclusivity , also in data collection and its development ; and , second , a gain for linguistic diversity and language learning . In this regard , we advocate for AI as a tool offering a specific potential for innovatio

In [48]:
raw1_16 = 'Victor Negrescu (S&D). - Mr President, Commissioner, ladies and gentlemen, new technologies have a substantial impact on education, culture and the audiovisual sector. The European Parliament calls for the assessment of the role played by artificial intelligence, the development of the opportunities offered by these technologies, but also the adequate information and protection of users. We are the authors of this digital transformation and we can set the rules and principles, including ethical ones, based on which we develop these technologies, making the process more humane and inclusive. Artificial intelligence cannot replace teacher empathy, but technology can help increase the quality of education. Artist creativity should not be replaced by an algorithm, but future content creators can use technology to generate higher revenue. Developing artificial intelligence education and skills is essential. Europe can be a pioneer in the field of artificial intelligence, but it can also be a leader in ethics.'
tokens1_16 = word_tokenize(raw1_16)
for word in tokens1_16:
    print(word, end=' ')

Victor Negrescu ( S & D ) . - Mr President , Commissioner , ladies and gentlemen , new technologies have a substantial impact on education , culture and the audiovisual sector . The European Parliament calls for the assessment of the role played by artificial intelligence , the development of the opportunities offered by these technologies , but also the adequate information and protection of users . We are the authors of this digital transformation and we can set the rules and principles , including ethical ones , based on which we develop these technologies , making the process more humane and inclusive . Artificial intelligence can not replace teacher empathy , but technology can help increase the quality of education . Artist creativity should not be replaced by an algorithm , but future content creators can use technology to generate higher revenue . Developing artificial intelligence education and skills is essential . Europe can be a pioneer in the field of artificial intelligen

In [49]:
raw1_17 = 'Mariya Gabriel, Member of the Commission. – Mr President, ladies and gentlemen, Madam rapporteur, thank you very much for this interesting discussion. As you can see, the use of artificial intelligence in the education, cultural and creative sectors presents many opportunities, but also challenges. We want to preserve the European cultural diversity in the rapidly changing environment of today. We therefore want to ensure that the great diversity of European cultural and creative players can benefit from these developments. To explore the possibilities, the Commission has launched a study which will identify inspiring case studies for all creative sectors; results are expected in December. As for the challenges, your report rightly highlights, among others, the importance of protecting privacy, combating discrimination, promoting gender equality, respecting human rights intellectual property, environmental protection, and consumer rights. All of these challenges are being examined through various ongoing initiatives and studies. Creative Europe also recognizes these challenges and encourages collaboration between the culture and media sectors to address these challenges, through its newly created cross-sector innovation lab. I would like to end on a subject that I find particularly important. We must ensure that these technological developments do not change our values and our European way of life. Progress without inclusion and equality, dissemination of news or data without ethics are just empty words. We need to ensure that no one will be left behind, that no one will feel their privacy or rights have been violated, and that everyone will have an equal opportunity to thrive. After four councils in recent days and four debates this evening, I would like to thank you once again for the extraordinary energy I felt this evening, and also tell you that you can count on my support.'
tokens1_17 = word_tokenize(raw1_17)
for word in tokens1_17:
    print(word, end=' ')

Mariya Gabriel , Member of the Commission . – Mr President , ladies and gentlemen , Madam rapporteur , thank you very much for this interesting discussion . As you can see , the use of artificial intelligence in the education , cultural and creative sectors presents many opportunities , but also challenges . We want to preserve the European cultural diversity in the rapidly changing environment of today . We therefore want to ensure that the great diversity of European cultural and creative players can benefit from these developments . To explore the possibilities , the Commission has launched a study which will identify inspiring case studies for all creative sectors ; results are expected in December . As for the challenges , your report rightly highlights , among others , the importance of protecting privacy , combating discrimination , promoting gender equality , respecting human rights intellectual property , environmental protection , and consumer rights . All of these challenges
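
*Note*: Strings pasted from a web page or a translation tool can carry invisible characters such as zero-width spaces (U+200B), which may end up inside tokens. The sketch below shows one possible way to strip them before tokenizing. The helper name `strip_invisible` and the sample string are hypothetical and not part of the original notebook.

```python
# Characters to delete: zero-width space, zero-width non-joiner,
# zero-width joiner and byte-order mark (an assumed, non-exhaustive list).
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\ufeff"))


def strip_invisible(text: str) -> str:
    """Remove zero-width characters and collapse runs of whitespace."""
    cleaned = text.translate(ZERO_WIDTH)
    return " ".join(cleaned.split())


# Hypothetical sample with two zero-width spaces after "values ".
sample = "our values \u200b\u200band our European way of life"
cleaned_sample = strip_invisible(sample)
print(cleaned_sample)
```

Applying such a helper to each `raw1_*` string before `word_tokenize` would keep the invisible characters out of the token list; whether that step is needed depends on how the text was copied.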

---
### Combine all parts

In [50]:
tokens = (
    tokens1_1 + tokens1_2 + tokens1_3 + tokens1_4 + tokens1_5 + tokens1_6
    + tokens1_7 + tokens1_8 + tokens1_9 + tokens1_10 + tokens1_11 + tokens1_12
    + tokens1_13 + tokens1_14 + tokens1_15 + tokens1_16 + tokens1_17
)
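
*Note*: The same concatenation could be expressed by collecting the per-speaker lists in one list and flattening it. This is a minimal sketch under the assumption that the parts are plain Python lists; `parts` and its contents are hypothetical stand-ins for `tokens1_1` to `tokens1_17`.

```python
from itertools import chain

# Hypothetical stand-ins for the per-speaker token lists.
parts = [
    ["Mr", "President", ","],
    ["artificial", "intelligence", "."],
]

# chain.from_iterable flattens the list of lists while preserving order.
combined = list(chain.from_iterable(parts))
print(combined)
```

Keeping the parts in a list makes it harder to forget one of them when a speech is added or removed.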

---
### Normalize the words 

In [51]:
# lowercase every token so that case variants are treated as the same word
eutext13 = [w.lower() for w in tokens]
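
*Note*: Lowercasing matters for later frequency counts, because otherwise "Artificial" and "artificial" are counted as different words. The sketch below illustrates this with `collections.Counter` on a hypothetical token list; it is not part of the original notebook.

```python
from collections import Counter

# Hypothetical sample tokens, standing in for the combined `tokens` list.
sample_tokens = ["Artificial", "intelligence", "and", "artificial", "systems", "."]

lowered = [w.lower() for w in sample_tokens]

counts_raw = Counter(sample_tokens)
counts_lowered = Counter(lowered)

# Before lowercasing the two spellings are separate keys; afterwards they merge.
print(counts_raw["artificial"], counts_lowered["artificial"])
```

Punctuation tokens such as "." are left unchanged by `str.lower()` and remain in the list.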

---
**Save Output**

In [52]:
save_path = '/Users/charlottekaiser/Documents/uni/Hertie/master_thesis/00_data/20_intermediate_files'
file_name = "EU13_Artificial intelligence in education, culture and the audiovisual sector.txt"
completeName = os.path.join(save_path, file_name)
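# the context manager closes the file so the output is fully written to disk
with open(completeName, 'w', encoding='utf-8') as output:
    print(eutext13, file=output)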
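
*Note*: Because the list is written with `print`, the file contains the Python representation of the list rather than one token per line. A minimal sketch of how such a file could be read back, assuming it holds exactly one list literal; the temporary file and the sample tokens are hypothetical.

```python
import ast
import os
import tempfile

# Hypothetical sample, standing in for `eutext13`.
sample = ["mr", "president", ",", "o", "’", "neil"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_tokens.txt")

    # Write the list the same way as the cell above.
    with open(path, "w", encoding="utf-8") as fh:
        print(sample, file=fh)

    # ast.literal_eval safely parses the list literal back into a list.
    with open(path, encoding="utf-8") as fh:
        restored = ast.literal_eval(fh.read().strip())

print(restored == sample)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for reading such files.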