# Assessing actor diversity
   
In this Jupyter Notebook I use SpaCy and dictionary-based approaches in order to assess the overall actor diversity for each article in the dataset. The steps I perform are the following:

1) Load packages and data   
2) Create lists for dictionary-based NER   
3) Count frequency of actor groups per article (this is done through functions that draw on SpaCy's tokenisation, the lists of relevant political actors and SpaCy's NER feature for different types of entities.   
4) Inspect and save the data

## 1) Load packages and data

In [2]:
#import pandas
import pandas as pd
import numpy as np
from pandas import read_excel
#load SpaCy
import spacy
#import German language model
import de_core_news_md
#define nlp pipe
nlp = de_core_news_md.load()

In [3]:
#read in the data
df = read_excel("complete_data_cleaned.xlsx")
#inspect dataframe
df.head(3)

Unnamed: 0.1,Unnamed: 0,ID,Newspaper,Date,Length,Category,Author,Headline,Teaser,Article,Modality,url,clean text,words in clean text,reach_dummy,modality_dummy
0,6,100006,sueddeutschet politik (www),2020-05-28T15:34:08,367,,,SZ Espresso: Nachrichten kompakt - die Übersic...,<p>Was heute wichtig war - und was Sie auf SZ....,Das Wichtigste zum Coronavirus. Berufstätige M...,online,https://www.sueddeutsche.de/politik/nachrichte...,"das wichtig coronavirus . berufstat mutt vat ,...",224,1,0
1,8,100008,sueddeutschet politik (www),2020-05-28T17:01:43,200,,,Kommunalpolitik: Abgeblendet,<p>Bayreuths Stadtrat im Stream</p>,"Livestream aus dem Stadtrat, das klingt transp...",online,https://www.sueddeutsche.de/bayern/kommunalpol...,"livestream stadtrat , klingt transparent erstr...",104,1,0
2,24,100024,aachener zeitung (www),2020-05-28T03:01:52,512,Politik,,Länder planen Öffnung: Streit über Schulen und...,"<img src=""https://www.aachener-zeitung.de/imgs...",Der Streit über die Wiederöffnung von Schulen ...,online,https://www.aachener-zeitung.de/politik/deutsc...,der streit wiederoffn schul kindergart kris ve...,318,0,0


## 2) Create lists for NER

### Lists of political elite actors on the national level

In [4]:
cdu_national = ["CDU", 
                "CDU-Fraktion",
                "Michael von Abercron",
                "Stephan Albani",
                "Norbert Altenkamp",
                "Peter Altmaier",
                "Philipp Amthor",
                "Thomas Bareiß",
                "Norbert Barthle",
                "Maik Beermann",
                "Manfred Behrens",
                "Veronika Bellmann",
                "Sybille Benning",
                "André Berghegger",
                "Melanie Bernstein",
                "Christoph Bernstiel",
                "Peter Beyer",
                "Marc Biadacz",
                "Steffen Bilger",
                "Peter Bleser",
                "Norbert Brackmann",
                "Michael Brand",
                "Helge Braun",
                "Silvia Breher",
                "Heike Brehmer",
                "Ralph Brinkhaus",
                "Carsten Brodesser",
                "Gitta Connemann",
                "Astrid Damerow",
                "Michael Donth",
                "Marie-Luise Dött",
                "Hermann Färber",
                "Uwe Feiler",
                "Enak Ferlemann",
                "Axel Fischer",
                "Maria Flachsbarth",
                "Thorsten Frei",
                "Hans-Joachim Fuchtel",
                "Ingo Gädechens",
                "Thomas Gebhart",
                "Alois Gerig",
                "Eberhard Gienger",
                "Eckhard Gnodtke",
                "Ursula Groden-Kranich",
                "Hermann Gröhe",
                "Klaus-Dieter Gröhler",
                "Michael Grosse-Brömer",
                "Astrid Grotelüschen",
                "Markus Grübel",
                "Manfred Grund",
                "Oliver Grundmann",
                "Monika Grütters",
                "Fritz Güntzler",
                "Olav Gutting",
                "Christian Haase",
                "Jürgen Hardt",
                "Matthias Hauer",
                "Mark Hauptmann",
                "Matthias Heider",
                "Mechthild Heil",
                "Thomas Heilmann",
                "Frank Heinrich",
                "Mark Helfrich",
                "Rudolf Henke",
                "Michael Hennrich",
                "Marc Henrichmann",
                "Ansgar Heveling",
                "Christian Hirte",
                "Heribert Hirte",
                "Hendrik Hoppenstedt",
                "Hans-Jürgen Irmer",
                "Thomas Jarzombek",
                "Andreas Jung",
                "Ingmar Jung",
                "Anja Karliczek",
                "Torbjörn Kartes",
                "Volker Kauder",
                "Stefan Kaufmann",
                "Ronja Kemmer",
                "Roderich Kiesewetter",
                "Georg Kippels",
                "Volkmar Klein",
                "Axel Knoerig",
                "Jens Koeppen",
                "Markus Koob",
                "Carsten Körber",
                "Alexander Krauß",
                "Gunther Krichbaum",
                "Günter Krings",
                "Rüdiger Kruse",
                "Roy Kühne",
                "Karl A. Lamers",
                "Andreas Lämmel",
                "Katharina Landgraf",
                "Jens Lehmann",
                "Katja Leikert",
                "Antje Lezius",
                "Carsten Linnemann",
                "Patricia Lips",
                "Nikolas Löbel",
                "Jan-Marco Luczak", 
                "Saskia Ludwig", 
                "Karin Maag", 
                "Yvonne Magwas", 
                "Thomas de Maizière", 
                "Gisela Manderla", 
                "Astrid Mannes",
                "Matern von Marschall",
                "Hans-Georg von der Marwitz",
                "Andreas Mattfeldt",
                "Michael Meister",
                "Angela Merkel",
                "Jan Metzler",
                "Mathias Middelberg",
                "Dietrich Monstadt",
                "Karsten Möring",
                "Elisabeth Motschmann",
                "Axel Müller",
                "Carsten Müller",
                "Sepp Müller",
                "Andreas Nick",
                "Petra Nicolaisen",
                "Michaela Noll",
                " Wilfried Oellers",
                "Josef Oster",
                "Henning Otte",
                "Ingrid Pahlmann",
                "Sylvia Pantel",
                "Martin Patzelt",
                "Joachim Pfeiffer",
                "Christoph Ploß",
                "Eckhard Pols",
                "Thomas Rachel",
                "Kerstin Radomski",
                "Eckhardt Rehberg",
                "Lothar Riebsamen",
                "Josef Rief",
                "Johannes Röring",
                "Norbert Röttgen",
                "Stefan Rouenhoff",
                "Erwin Rüddel",
                "Stefan Sauer",
                "Anita Schäfer",
                "Wolfgang Schäuble",
                "Jana Schimke",
                "Tankred Schipanski",
                "Claudia Schmidtke",
                "Patrick Schnieder",
                "Nadine Schön",
                "Felix Schreiner",
                "Klaus-Peter Schulze",
                "Uwe Schummer",
                "Armin Schuster",
                "Torsten Schweiger",
                "Detlef Seif",
                "Johannes Selle",
                "Reinhold Sendker",
                "Patrick Sensburg",
                "Björn Simon",
                "Tino Sorge",
                "Jens Spahn",
                "Frank Steffel",
                "Albert Stegemann",
                "Andreas Steier",
                "Peter Stein",
                "Sebastian Steineke",
                "Johannes Steiniger",
                "Christian von Stetten",
                "Dieter Stier",
                "Gero Storjohann",
                "Karin Strenz",
                "Peter Tauber",
                "Hermann-Josef Tebroke",
                "Hans-Jürgen Thies",
                "Alexander Throm",
                "Dietlind Tiemann",
                "Antje Tillmann",
                "Markus Uhl",
                "Arnold Vaatz",
                "Oswin Veith",
                "Kerstin Vieregge",
                "Volkmar Vogel",
                "Christoph de Vries",
                "Kees de Vries",
                "Johann Wadephul",
                "Marco Wanderwitz",
                "Nina Warken",
                "Kai Wegner",
                "Albert Weiler",
                "Marcus Weinberg",
                "Peter Weiß",
                "Sabine Weiss",
                "Ingo Wellenreuther",
                "Marian Wendt",
                "Kai Whittaker",
                "Annette Widmann-Mauz",
                "Bettina Wiesmann",
                "Klaus-Peter Willsch",
                "Elisabeth Winkelmeier-Becker", 
                "Oliver Wittke", 
                "Paul Ziemiak", 
                "Matthias Zimmer"]

In [5]:
csu_national = ["CSU",
                "CSU-Fraktion",
                "Artur Auernhammer",
                "Peter Aumer",
                "Dorothee Bär",
                "Reinhard Brandl",
                "Sebastian Brehm",
                "Alexander Dobrindt",
                "Hansjörg Durz",
                "Thomas Erndl",
                "Astrid Freudenstein",
                "Hans-Peter Friedrich",
                "Michael Frieser",
                "Florian Hahn",
                "Alexander Hoffmann",
                "Karl Holmeier",
                "Erich Irlstorfer",
                "Alois Karl",
                "Michael Kießling",
                "Michael Kuffer",
                "Ulrich Lange",
                "Silke Launert",
                "Paul Lehrieder",
                "Andreas Lenz",
                "Andrea Lindholz",
                "Bernhard Loos",
                "Daniela Ludwig",
                "Stephan Mayer",
                "Hans Michelbach",
                "Gerd Müller",
                "Stefan Müller",
                "Georg Nüßlein",
                "Florian Oßner",
                "Stephan Pilsinger",
                "Alexander Radwan",
                "Alois Rainer",
                "Peter Ramsauer",
                "Albert Rupprecht",
                "Andreas Scheuer",
                "Christian Schmidt",
                "Thomas Silberhorn",
                "Katrin Staffler",
                "Wolfgang Stefinger",
                "Stephan Stracke",
                "Max Straubinger",
                "Volker Ullrich",
                "Anja Weisgerber",
                "Emmi Zeulner"]

In [6]:
national_elite_other = ["Bundesregierung",
                       "Bundesamt für Wirtschaft",
                       "Regierung",
                       "Kanzlerin",
                       "Minister",
                        "Bildungsministerin",
                        "Wisschenschaftsministerin",
                        "Verkehrsminister",
                        "Finanzminister",
                        "Außenminister",
                        "Umweltministerin",
                        "Justizministerin",
                        "Gesundheitsminister",
                        "Landwirtschaftsministerin",
                        "Verteidigungsministerin",
                        "Innenminister"
                       "Union"]

In [7]:
spd_national = ["SPD",
                "Sozialdemokraten"
                "SPD-Fraktion",
                "Niels Annen",
                "Ingrid Arndt-Brauer",
                "Bela Bach",
                "Heike Baehrens",
                "Ulrike Bahr",
                "Nezahat Baradari",
                "Doris Barnett",
                "Matthias Bartke",
                "Sören Bartol",
                "Bärbel Bas",
                "Lothar Binding",
                "Eberhard Brecht",
                "Leni Breymaier",
                "Karl-Heinz Brunner",
                "Katrin Budde",
                "Lars Castellucci",
                "Bernhard Daldrup",
                "Karamba Diaby",
                "Esther Dilcher",
                "Sabine Dittmar",
                "Wiebke Esdar",
                "Saskia Esken",
                "Yasmin Fahimi",
                "Johannes Fechner",
                "Fritz Felgentreu",
                "Edgar Franke",
                "Ulrich Freese",
                "Dagmar Freitag",
                "Michael Gerdes",
                "Martin Gerster",
                "Angelika Glöckner",
                "Timon Gremmels",
                "Kerstin Griese",
                "Michael Groß",
                "Uli Grötsch",
                "Bettina Hagedorn",
                "Rita Hagl-Kehl",
                "Metin Hakverdi",
                "Sebastian Hartmann",
                "Dirk Heidenblut",
                "Hubertus Heil",
                "Gabriela Heinrich",
                "Marcus Held",
                "Wolfgang Hellmich",
                "Barbara Hendricks",
                "Gustav Herzog",
                "Gabriele Hiller-Ohm",
                "Thomas Hitschler",
                "Eva Högl",
                "Frank Junge",
                "Josip Juratovic",
                "Thomas Jurk",
                "Oliver Kaczmarek",
                "Johannes Kahrs",
                "Elisabeth Kaiser",
                "Ralf Kapschack",
                "Gabriele Katzmarek",
                "Cansel Kiziltepe",
                "Arno Klare",
                "Lars Klingbeil",
                "Bärbel Kofler",
                "Daniela Kolbe",
                "Elvan Korkmaz",
                "Anette Kramme",
                "Christine Lambrecht",
                "Christian Lange",
                "Karl Lauterbach",
                "Sylvia Lehmann",
                "Helge Lindh",
                "Kirsten Lühmann",
                "Heiko Maas",
                "Isabel Mackensen",
                "Caren Marks",
                "Katja Mast",
                "Christoph Matschie",
                "Hilde Mattheis",
                "Matthias Miersch",
                "Klaus Mindrup",
                "Susanne Mittag",
                "Falko Mohrs",
                "Claudia Moll",
                "Siemtje Möller",
                "Bettina Müller",
                "Detlef Müller",
                "Michelle Müntefering",
                "Rolf Mützenich",
                "Dietmar Nietan",
                "Ulli Nissen",
                "Thomas Oppermann",
                "Josephine Ortleb",
                "Mahmut Özdemir",
                "Aydan Özoğuz",
                "Markus Paschke",
                "Christian Petry",
                "Detlev Pilger",
                "Sabine Poschmann",
                "Achim Post",
                "Florian Post",
                "Florian Pronold",
                "Sascha Raabe",
                "Martin Rabanus",
                "Daniela De Ridder",
                "Andreas Rimkus",
                "Sönke Rix",
                "Dennis Rohde",
                "Martin Rosemann",
                "René Röspel",
                "Ernst Dieter Rossmann",
                "Michael Roth",
                "Susann Rüthrich",
                "Bernd Rützel",
                "Sarah Ryglewski",
                "Johann Saathoff",
                "Axel Schäfer",
                "Nina Scheer",
                "Marianne Schieder",
                "Udo Schiefner",
                "Nils Schmid",
                "Dagmar Schmidt",
                "Ulla Schmidt",
                "Uwe Schmidt",
                "Carsten Schneider",
                "Johannes Schraps",
                "Michael Schrodi",
                "Ursula Schulte",
                "Martin Schulz",
                "Swen Schulz",
                "Frank Schwabe",
                "Stefan Schwartze",
                "Andreas Schwarz",
                "Rita Schwarzelühr-Sutter",
                "Rainer Spiering",
                "Svenja Stadler",
                "Martina Stamm-Fibich",
                "Sonja Steffen",
                "Mathias Stein",
                "Kerstin Tack",
                "Claudia Tausend",
                "Michael Thews",
                "Markus Töns",
                "Carsten Träger",
                "Ute Vogt",
                "Marja-Liisa Völlers",
                "Dirk Vöpel",
                "Gabi Weber",
                "Joe Weingarten",
                "Bernd Westphal",
                "Dirk Wiese",
                "Gülistan Yüksel",
                "Dagmar Ziegler",
                "Stefan Zierke",
                "Jens Zimmermann"]

### Lists of political opposition actors on the national level

In [8]:
linke_national = ["Die Linke",
                  "Doris Achelwilm",
                  "Gökay Akbulut",
                  "Simone Barrientos",
                  "Dietmar Bartsch",
                  "Lorenz Gösta Beutin",
                  "Matthias W. Birkwald",
                  "Heidrun Bluhm",
                  "Michel Brandt",
                  "Christine Buchholz",
                  "Birke Bull-Bischoff",
                  "Jörg Cezanne",
                  "Sevim Dağdelen",
                  "Diether Dehm",
                  "Anke Domscheit-Berg",
                  "Klaus Ernst",
                  "Susanne Ferschl",
                  "Brigitte Freihold",
                  "Sylvia Gabelmann",
                  "Nicole Gohlke",
                  "Gregor Gysi",
                  "André Hahn",
                  "Anja Hajduk",
                  "Heike Hänsel",
                  "Matthias Höhn",
                  "Andrej Hunko",
                  "Ulla Jelpke",
                  "Kerstin Kassner",
                  "Achim Kessler",
                  "Katja Kipping",
                  "Jan Korte",
                  "Jutta Krellmann",
                  "Caren Lay",
                  "Sabine Leidig",
                  "Ralph Lenkert",
                  "Michael Leutert",
                  "Stefan Liebich",
                  "Gesine Lötzsch",
                  "Thomas Lutze",
                  "Fabio De Masi",
                  "Pascal Meiser",
                  "Amira Mohamed Ali",
                  "Cornelia Möhring",
                  "Niema Movassat",
                  "Norbert Müller",
                  "Zaklin Nastic",
                  "Alexander Neu",
                  "Thomas Nord",
                  "Petra Pau",
                  "Sören Pellmann",
                  "Victor Perli",
                  "Tobias Pflüger",
                  "Ingrid Remmers",
                  "Martina Renner",
                  "Bernd Riexinger",
                  "Eva Schreiber",
                  "Petra Sitte",
                  "Evrim Sommer",
                  "Kersten Steinke",
                  "Friedrich Straetmanns",
                  "Kirsten Tackmann",
                  "Jessica Tatti",
                  "Alexander Ulrich",
                  "Kathrin Vogler",
                  "Sahra Wagenknecht",
                  "Andreas Wagner",
                  "Harald Weinberg",
                  "Katrin Werner",
                  "Hubertus Zdebel",
                  "Pia Zimmermann",
                  "Sabine Zimmermann"
                 ]

In [9]:
gruene_national = ["Grünen",
                   "Die Grünen",
                   "Bündnis 90 Die Grünen",
                   "Luise Amtsberg",
                   "Lisa Badum",
                   "Annalena Baerbock",
                   "Margarete Bause",
                   "Danyal Bayaz",
                   "Canan Bayram",
                   "Franziska Brantner",
                   "Agnieszka Brugger",
                   "Anna Christmann",
                   "Ekin Deligöz",
                   "Katja Dörner",
                   "Katharina Dröge",
                   "Harald Ebner",
                   "Matthias Gastel",
                   "Kai Gehring",
                   "Stefan Gelbhaar",
                   "Katrin Göring-Eckardt",
                   "Erhard Grundl",
                   "Britta Haßelmann",
                   "Bettina Hoffmann",
                   "Anton Hofreiter",
                   "Ottmar von Holtz",
                   "Dieter Janecek",
                   "Kirsten Kappert-Gonther",
                   "Uwe Kekeritz",
                   "Katja Keul",
                   "Sven-Christian Kindler",
                   "Maria Klein-Schmeink",
                   "Sylvia Kotting-Uhl",
                   "Oliver Krischer",
                   "Christian Kühn",
                   "Stephan Kühn",
                   "Renate Künast",
                   "Markus Kurth",
                   "Monika Lazar",
                   "Sven Lehmann",
                   "Steffi Lemke",
                   "Tobias Lindner",
                   "Irene Mihalic",
                   "Claudia Müller",
                   "Beate Müller-Gemmeke",
                   "Ingrid Nestle",
                   "Konstantin von Notz",
                   "Omid Nouripour",
                   "Friedrich Ostendorff",
                   "Cem Özdemir",
                   "Lisa Paus",
                   "Filiz Polat",
                   "Tabea Rößner",
                   "Claudia Roth",
                   "Manuela Rottmann",
                   "Corinna Rüffer",
                   "Manuel Sarrazin",
                   "Ulle Schauws",
                   "Frithjof Schmidt",
                   "Stefan Schmidt",
                   "Charlotte Schneidewind-Hartnagel"
                   "Kordula Schulz-Asche",
                   "Wolfgang Strengmann-Kuhn",
                   "Margit Stumpp",
                   "Markus Tressel",
                   "Jürgen Trittin",
                   "Julia Verlinden",
                   "Daniela Wagner",
                   "Beate Walter-Rosenheimer",
                   "Gerhard Zickenheiner"
                  ]

In [10]:
afd_national = ["AFD",
                "Alternative für Deutschland",
                "Bernd Baumann",
                "Marc Bernhard",
                "Andreas Bleck",
                "Peter Boehringer",
                "Stephan Brandner",
                "Jürgen Braun",
                "Marcus Bühl",
                "Matthias Büttner",
                "Petr Bystron",
                "Tino Chrupalla",
                "Joana Cotar",
                "Gottfried Curio",
                "Siegbert Droese",
                "Thomas Ehrhorn",
                "Berengar Elsner von Gronow",
                "Michael Espendiller",
                "Peter Felser",
                "Dietmar Friedhoff",
                "Anton Friesen",
                "Markus Frohnmaier",
                "Götz Frömming",
                "Alexander Gauland",
                "Axel Gehrke",
                "Albrecht Glaser",
                "Franziska Gminder",
                "Wilhelm von Gottberg",
                "Kay Gottschalk",
                "Armin-Paul Hampel",
                "Mariana Harder-Kühnel",
                "Roland Hartwig",
                "Jochen Haug",
                "Martin Hebner",
                "Udo Hemmelgarn",
                "Waldemar Herdt",
                "Martin Hess",
                "Heiko Heßenkemper",
                "Karsten Hilse",
                "Nicole Höchst",
                "Martin Hohmann",
                "Bruno Hollnagel",
                "Leif-Erik Holm",
                "Johannes Huber",
                "Fabian Jacobi",
                "Marc Jongen",
                "Jens Kestner",
                "Stefan Keuter",
                "Norbert Kleinwächter",
                "Enrico Komning",
                "Jörn König",
                "Steffen Kotré",
                "Rainer Kraft",
                "Rüdiger Lucassen",
                "Frank Magnitz",
                "Jens Maier",
                "Lothar Maier",
                "Birgit Malsack-Winkemann",
                "Corinna Miazga",
                "Andreas Mrosek",
                "Hansjörg Müller",
                "Volker Münz",
                "Sebastian Münzenmaier",
                "Christoph Neumann",
                "Jan Nolte",
                "Ulrich Oehme",
                "Gerold Otten",
                "Frank Pasemann",
                "Tobias Peterka",
                "Paul Podolay",
                "Jürgen Pohl",
                "Stephan Protschka",
                "Martin Reichardt",
                "Martin Renner",
                "Roman Reusch",
                "Ulrike Schielke-Ziesing",
                "Robby Schlund",
                "Jörg Schneider",
                "Uwe Schulz",
                "Thomas Seitz",
                "Martin Sichert",
                "Detlev Spangenberg",
                "Dirk Spaniel",
                "René Springer",
                "Beatrix von Storch",
                "Alice Weidel",
                "Harald Weyel",
                "Wolfgang Wiehle",
                "Heiko Wildberg",
                "Christian Wirth",
                "Uwe Witt",
               ]

In [11]:
independents_national = ["Verena Hartmann",
                         "Lars Hermann",
                         "Uwe Kamann",
                         "Mario Mieruch",
                         "Frauke Petry",
                         "Marco Bülow"
                        ]

In [12]:
fdp_national = ["FDP",
                "Freie Demokraten",
                "Grigorios Aggelidis",
                "Renata Alt",
                "Christine Aschenberg-Dugnus",
                "Nicole Bauer",
                "Jens Beeck",
                "Jens Brandenburg",
                "Mario Brandenburg",
                "Sandra Bubendorfer-Licht",
                "Marco Buschmann",
                "Karlheinz Busen",
                "Carlo Cronenberg",
                "Britta Katharina Dassler",
                "Bijan Djir-Sarai",
                "Christian Dürr",
                "Hartmut Ebbing",
                "Marcus Faber",
                "Otto Fricke",
                "Daniel Föst",
                "Alexander Graf Lambsdorff",
                "Thomas Hacker",
                "Reginald Hanke",
                "Peter Heidt",
                "Katrin Helling-Plahr",
                "Markus Herbrand",
                "Torsten Herbst",
                "Katja Hessel",
                "Gero Hocker",
                "Christoph Hoffmann",
                "Reinhard Houben",
                "Manuel Höferlin",
                "Ulla Ihnen",
                "Olaf in der Beek",
                "Gyde Jensen",
                "Christian Jung",
                "Karsten Klein",
                "Marcel Klinge",
                "Daniela Kluckert",
                "Pascal Kober",
                "Carina Konrad",
                "Wolfgang Kubicki",
                "Konstantin Kuhle",
                "Alexander Kulitz",
                "Lukas Köhler",
                "Ulrich Lechte",
                "Christian Lindner",
                "Michael Link",
                "Oliver Luksic",
                "Till Mansmann",
                "Jürgen Martens",
                "Christoph Meyer",
                "Alexander Müller",
                "Roman Müller-Böhm",
                "Frank Müller-Rosentritt",
                "Martin Neumann",
                "Hagen Reinhold",
                "Bernd Reuther",
                "Thomas Sattelberger",
                "Christian Sauter",
                "Wieland Schinnenburg",
                "Frank Schäffler",
                "Matthias Seestern-Pauly",
                "Frank Sitta",
                "Judith Skudelny",
                "Hermann Otto Solms",
                "Bettina Stark-Watzinger",
                "Marie-Agnes Strack-Zimmermann",
                "Benjamin Strasser",
                "Katja Suding",
                "Linda Teuteberg",
                "Michael Theurer",
                "Stephan Thomae",
                "Manfred Todtenhausen",
                "Florian Toncar",
                "Andrew Ullmann",
                "Gerald Ullrich",
                "Johannes Vogel",
                "Sandra Weeser",
                "Nicole Westig", 
                "Katharina Willkomm"
               ]

### Lists of political elite actors on the regional level

In [13]:
gruene_bw = ["Muhterem Aras",
           "Theresia Bauer",
           "Susanne Bay",
           "Hans-Peter Behrens",
           "Andrea Bogner-Unden",
           "Sandra Boser",
           "Martina Braun",
           "Nese Erikli",
           "Jürgen Filius",
           "Josha Frey",
           "Martin Grath",
           "Petra Häffner",
           "Martin Hahn",
           "Willi Halder",
           "Thomas Hentschel",
           "Winfried Hermann",
           "Hermann Katzenstein",
           "Manfred Kern",
           "Petra Krebs",
           "Winfried Kretschmann",
           "Daniel Lede Abal",
           "Ute Leidig",
           "Andrea Lindlohr",
           "Brigitte Lösch"
           "Manfred Lucha",
           "Alexander Maier",
           "Thomas Marwein",
           "Bärbl Mielich",
           "Bernd Murschel",
           "Jutta Niemann",
           "Reinhold Pix",
           "Thomas Poreski",
           "Daniel Renkonen",
           "Markus Rösler",
           "Barbara Saebel",
           "Alexander Salomon",
           "Alexander Schoch",
           "Andrea Schwarz",
           "Andreas Schwarz",
           "Uli Sckerl",
           "Stefanie Seemann",
           "Edith Sitzmann",
           "Franz Untersteller",
           "Thekla Walker",
           "Jürgen Walter",
           "Dorothea Wehinger",
           "Elke Zimmer"
          ]

In [14]:
cdu_bw = ["Norbert Beck",
          "Alexander Becker",
          "Thomas Blenke",
          "Klaus Burger",
          "Andreas Deuschle",
          "Thomas Dörflinger",
          "Konrad Epple",
          "Arnulf Freiherr von Eyb",
          "Marion Gentges",
          "Fabian Gramling",
          "Friedlinde Gurr-Hirsch",
          "Manuel Hagel",
          "Sabine Hartmann-Müller",
          "Raimund Haser",
          "Peter Hauk",
          "Ulli Hockenberger",
          "Nicole Hoffmeister-Kraut",
          "Isabell Huber",
          "Karl Klein",
          "Wilfried Klenk",
          "Joachim Kößler",
          "Sabine Kurtz",
          "Siegfried Lorek",
          "Winfried Mack",
          "Claudia Martin",
          "Paul Nemeth",
          "Christine Neumann",
          "Claus Paal",
          "Julia Philippi",
          "Patrick Rapp",
          "Nicole Razavi",
          "Wolfgang Reinhart",
          "Karl-Wilhelm Röhm",
          "Karl Rombach",
          "Volker Schebesta",
          "Stefan Scheffold",
          "August Schuler",
          "Albrecht Schütte",
          "Willi Stächele",
          "Stefan Teufel",
          "Tobias Wald",
          "Guido Wolf",
          "Karl Zimmermann"
         ]

In [15]:
cdu_nrw = ["Günther Bergmann",
           "Peter Biesenbach",
           "Jörg Blöming",
           "Marc Blondin",
           "Frank Boss",
           "Florian Braun",
           "Rainer Deppe",
           "Guido Déus",
           "Angela Erwin",
           "Björn Franken",
           "Heinrich Frieling",
           "Anke Fuchs-Dreisbach",
           "Katharina Gebauer",
           "Jörg Geerlings",
           "Matthias Goeken",
           "Gregor Golland",
           "Daniel Hagemeier",
           "Wilhelm Hausmann",
           "Bernhard Hoppe-Biermeyer",
           "Josef Hovenjürgen",
           "Klaus Kaiser",
           "Jens Kamieth",
           "Christos Georg Katzidis",
           "Oliver Kehrl",
           "Matthias Kerkhoff",
           "Jochen Klenner",
           "Kirstin Korte",
           "Wilhelm Korth",
           "Oliver Krauß",
           "Bernd Krückel",
           "André Kuper",
           "Armin Laschet",
           "Olaf Lehne",
           "Lutz Lienenkämper",
           "Bodo Löttgen",
           "Arne Moritz",
           "Stefan Nacke",
           "Jens-Peter Nettekoven",
           "Ralf Nolten",
           "Britta Oellers",
           "Marcus Optendrenk",
           "Dietmar Panske",
           "Patricia Peill",
           "Bernd Petelkau",
           "Romina Plonsker",
           "Peter Preuß",
           "Charlotte Quik",
           "Henning Rehbaum",
           "Jochen Ritter",
           "Frank Rock",
           "Thorsten Schick",
           "Claudia Schlottmann",
           "Hendrik Schmitz",
           "Marco Schmitz",
           "Thomas Schnelle",
           "Rüdiger Scholz",
           "Fabian Schrumpf",
           "Christina Schulze",
           "Föcking Daniel",
           "Sieveke Martin Sträßer",
           "Andrea Stullich",
           "Raphael Tigges",
           "Heike Troles",
           "Christian Untrieser",
           "Marco Voge",
           "Petra Vogt",
           "Margret Voßeler",
           "Klaus Voussem",
           "Simone Wendland",
           "Heike Wermer",
           "Bianca Winkelmann",
           "Hendrik Wüst"
          ]

In [16]:
fdp_nrw = ["Daniela Beihl",
           "Ralph Bombis",
           "Dietmar Brockes",
           "Alexander Brockmeier",
           "Lorenz Deutsch",
           "Markus Diekhoff",
           "Angela Freimuth",
           "Jörn Freynick",
           "Yvonne Gebauer",
           "Marcel Hafke",
           "Martina Hannen",
           "Stephan Haupt",
           "Henning Höne",
           "Stefan Lenzen",
           "Marc Lürbke",
           "Christian Mangen",
           "Rainer Matheisen",
           "Bodo Middeldorf",
           "Franziska Müller-Rech",
           "Thomas Nückel",
           "Stephen Paul",
           "Werner Pfeil",
           "Christof Rasche",
           "Ulrich Reuter",
           "Susanne Schneider",
           "Joachim Stamp",
           "Andreas Terhaag",
           "Ralf Witzel"
          ]

### Lists of political opoosition actors on the regional level

In [17]:
afd_nrw = ["Roger Beckamp",
           "Christian Blex",
           "Iris Dworeck-Danielowski",
           "Andreas Keith-Volkmer",
           "Christian Loose",
           "Thomas Röckemann",
           "Helmut Seifen",
           "Herbert Strotebeck",
           "Sven Tritschler",
           "Martin Vincentz",
           "Markus Wagner",
           "Gabriele Walger-Demolsky"
          ]

In [18]:
independents_nrw = ["Alexander Langguth",
                    "Frank Neppe",
                    "Marcus Pretzell",
                    "Nic Peter Vogel"
                   ]

In [19]:
gruene_nrw = ["Berivan Aymaz",
              "Horst Becker",
              "Sigrid Beer",
              "Matthi Bolte-Richter",
              "Wibke Brems",
              "Monika Düker",
              "Stefan Engstfeld",
              "Oliver Martin Keymis",
              "Arndt Klocke",
              "Mehrdad Mostofizadeh",
              "Josefine Paul",
              "Johannes Remmel",
              "Norwich Rüße",
              "Verena Schäffer"
             ]

In [20]:
spd_nrw = ["Britta Altenkamp",
           "Volkan Baran",
           "Andreas Becker",
           "Dietmar Bell",
           "Andreas Bialas",
           "Rainer Bischoff",
           "Inge Blask",
           "Sonja Bongers",
           "Frank Börner",
           "Martin Börschel",
           "Rainer Bovermann",
           "Nadja Büteführ",
           "Anja Butschkau",
           "Christian Dahm",
           "Gordan Dudas",
           "Georg Fortmeier",
           "Hartmut Ganzke",
           "Heike Gebhard",
           "Thomas Göddertz",
           "Carina Gödecke",
           "Gabriele Hammelrath",
           "Marc Herter",
           "Michael Hübner",
           "Ralf Jäger",
           "Armin Jahl",
           "Wolfgang Jörg",
           "Stefan Kämmerling",
           "Christina Kampmann",
           "Lisa-Kristin Kapteinat",
           "Regina Kopp-Herr",
           "Hans-Willi Körfges",
           "Andreas Kossiski",
           "Hannelore Kraft",
           "Hubertus Kramer",
           "Thomas Kutschaty",
           "Carsten Löcker",
           "Angela Lück",
           "Nadja Lüders",
           "Josefa Maria (Eva) Lux",
           "Dennis Maelzer",
           "Frank Müller",
           "Elisabeth Müller-Witt",
           "Josef Neumann",
           "Jochen Ott",
           "Sarah Philipp",
           "Ernst-Wilhelm Rahe",
           "Norbert Römer",
           "Karsten Rudolph",
           "Susana dos Santos Herrmann",
           "Rainer Schmeltzer",
           "René Schneider",
           "Karl Schultheis",
           "Ina Spanier-Oppermann",
           "André Stinka",
           "Ellen Stock",
           "Marlies Stotz",
           "Frank Sundermann",
           "Alexander Vogt",
           "Eva-Maria Voigt-Küppers",
           "Annette Watermann-Krass",
           "Sebastian Watermeier",
           "Rüdiger Weiß",
           "Christina Weng",
           "Markus Weske",
           "Sven Wolf",
           "Ibrahim Yetim",
           "Serdar Yüksel",
           "Stefan Zimkeit"
          ]

In [21]:
afd_bw = ["Rainer Balzer",
          "Anton Baron",
          "Christina Baum",
          "Klaus Dürr",
          "Bernd Gögel",
          "Bernd Grimmer",
          "Rüdiger Klos",
          "Heiner Merz",
          "Thomas Axel Palka",
          "Rainer Podeswa",
          "Stefan Räpple",
          "Daniel Rottmann",
          "Emil Sänze",
          "Doris Senger",
          "Hans Peter Stauch",
          "Udo Stein",
          "Klaus-Günther Voigtmann",
          "Carola Wolle"
         ]

In [22]:
fdp_bw = ["Stephen Brauer",
          "Rudi Fischer",
          "Ulrich Goll",
          "Jochen Haußmann",
          "Klaus Hoher",
          "Daniel Karrais",
          "Jürgen Keck",
          "Timm Kern",
          "Gabriele Reich-Gutjahr",
          "Hans-Ulrich Rülke",
          "Erik Schweickert",
          "Nico Weinmann"
         ]

In [23]:
independents_bw = ["Heinrich Fiechtner",
                   "Wolfgang Gedeon",
                   "Stefan Herre",
                   "Harald Pfeiffer"
                  ]

In [24]:
spd_bw = ["Sascha Binder",
          "Daniel Born",
          "Nicolas Fink",
          "Stefan Fulst-Blei",
          "Reinhold Gall",
          "Gernot Gruber",
          "Rainer Hinderer",
          "Peter Hofelich",
          "Andreas Kenner",
          "Gerhard Kleinböck",
          "Georg Nelius",
          "Martin Rivoir",
          "Gabi Rolland",
          "Ramazan Selcuk",
          "Rainer Stickelberger",
          "Andreas Stoch",
          "Jonas Weber",
          "Boris Weirauch",
          "Sabine Wölfle"
         ]

### List with all elite actors

In [25]:
elite_actors = cdu_national + csu_national + spd_national + national_elite_other + cdu_bw + gruene_bw + cdu_nrw + fdp_nrw

### List with all opposition actors

In [26]:
opposition_actors = linke_national + gruene_national + afd_national + independents_national + fdp_national + afd_nrw + independents_nrw + gruene_nrw + spd_nrw + afd_bw + fdp_nrw + independents_bw + spd_bw

## 3) Count actor groups

In [27]:
#functions to calculate number of different actors

#function for national elite actors
def count_national_elite_actors(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    #turn the list into a df
    entities = pd.DataFrame(entities)
    #rename columns
    entities = entities.rename(columns = {0:"entity",1:"type"})
    #remove duplicates
    duplicates = entities[entities.duplicated(["entity"])]
    entities = entities[~entities["entity"].isin(duplicates["entity"])]
    #filter out irrelevant entities such as locations
    relevant_entities = ["PER", "ORG"]
    entities_rel = entities.loc[entities["type"].isin(relevant_entities)]
    #count the actors
    number= 0
    for e in entities_rel["entity"]:
        if e in cdu_national or e in csu_national or e in spd_national or e in national_elite_other:
            number = number +1
    #return the count
    return number

#function for national elite actors
def count_national_opposition_actors(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    #turn the list into a df
    entities = pd.DataFrame(entities)
    #rename columns
    entities = entities.rename(columns = {0:"entity",1:"type"})
    #remove duplicates
    duplicates = entities[entities.duplicated(["entity"])]
    entities = entities[~entities["entity"].isin(duplicates["entity"])]
    #filter out irrelevant entities such as locations
    relevant_entities = ["PER", "ORG"]
    entities_rel = entities.loc[entities["type"].isin(relevant_entities)]
    #count the actors
    number= 0
    for e in entities_rel["entity"]:
        if e in afd_national or e in gruene_national or e in fdp_national or e in linke_national or e in independents_national:
            number = number +1
    #return the count
    return number

#function for national elite actors
def count_regional_elite_actors(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    #turn the list into a df
    entities = pd.DataFrame(entities)
    #rename columns
    entities = entities.rename(columns = {0:"entity",1:"type"})
    #remove duplicates
    duplicates = entities[entities.duplicated(["entity"])]
    entities = entities[~entities["entity"].isin(duplicates["entity"])]
    #filter out irrelevant entities such as locations
    relevant_entities = ["PER", "ORG"]
    entities_rel = entities.loc[entities["type"].isin(relevant_entities)]
    #count the actors
    number= 0
    for e in entities_rel["entity"]:
        if e in cdu_bw or e in gruene_bw or e in cdu_nrw or e in fdp_nrw:
            number = number +1
    #return the count
    return number

#function for national elite actors
def count_regional_opposition_actors(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    #turn the list into a df
    entities = pd.DataFrame(entities)
    #rename columns
    entities = entities.rename(columns = {0:"entity",1:"type"})
    #remove duplicates
    duplicates = entities[entities.duplicated(["entity"])]
    entities = entities[~entities["entity"].isin(duplicates["entity"])]
    #filter out irrelevant entities such as locations
    relevant_entities = ["PER", "ORG"]
    entities_rel = entities.loc[entities["type"].isin(relevant_entities)]
    #count the actors
    number= 0
    for e in entities_rel["entity"]:
        if e in afd_bw or e in independents_bw or e in fdp_bw or e in spd_bw or e in independents_nrw or e in spd_nrw or e in gruene_nrw or e in afd_nrw:
            number = number +1
    #return the count
    return number

#function to count all persons that are mentioned
def count_persons(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    #turn the list into a df
    entities = pd.DataFrame(entities)
    #rename columns
    entities = entities.rename(columns = {0:"entity",1:"type"})
    #remove duplicates
    duplicates = entities[entities.duplicated(["entity"])]
    entities = entities[~entities["entity"].isin(duplicates["entity"])]
    #filter out irrelevant entities such as locations
    relevant_entities = ["PER", "ORG"]
    entities_rel = entities.loc[entities["type"].isin(relevant_entities)]
    entities_rel = entities_rel[(entities_rel.type == "PER")]
    #count the actors
    number= 0
    for e in entities_rel["entity"]:
        if e not in elite_actors or e not in opposition_actors:
            number = number +1
    #return the count
    return number

#function to count all organisations that are mentioned
def count_organisations(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    #turn the list into a df
    entities = pd.DataFrame(entities)
    #rename columns
    entities = entities.rename(columns = {0:"entity",1:"type"})
    #remove duplicates
    duplicates = entities[entities.duplicated(["entity"])]
    entities = entities[~entities["entity"].isin(duplicates["entity"])]
    #filter out irrelevant entities such as locations
    relevant_entities = ["PER", "ORG"]
    entities_rel = entities.loc[entities["type"].isin(relevant_entities)]
    entities_rel = entities_rel[(entities_rel.type == "ORG")]
    #count the actors
    number= 0
    for e in entities_rel["entity"]:
        if e not in elite_actors or e not in opposition_actors:
            number = number +1
    #return the count
    return number

In [28]:
#apply all functions and save results for each article in a newsly created column
#overall person count
df["persons"] = [count_persons(text) for text in df["Article"]]

In [29]:
#overall organisation count
df["organisations"] = [count_organisations(text) for text in df["Article"]]

In [30]:
#number of elite actors on the national level
df["national_elite_actors"] = [count_national_elite_actors(text) for text in df["Article"]]

In [31]:
#number of opposition actors on the national level
df["national_opposition_actors"] = [count_national_opposition_actors(text) for text in df["Article"]]

In [43]:
#number of elite actors on the regional level from the states where the regional newspapers are located
df["regional_elite_actors"] = [count_regional_elite_actors(text) for text in df["Article"]]

In [44]:
#number of opposition actors on the regional level
df["regional_opposition_actors"] = [count_regional_opposition_actors(text) for text in df["Article"]]

## Compute diversity index

In [45]:
df["person dummy"] = np.where(df["persons"]>=1, 1, 0)
df["organisation dummy"] = np.where(df["organisations"]>=1, 1, 0)
df["nea dummy"] = np.where(df["national_elite_actors"]>=1, 1, 0)
df["noa dummy"] = np.where(df["national_opposition_actors"]>=1, 1, 0)
df["rea dummy"] = np.where(df["regional_elite_actors"]>=1, 1, 0)
df["roa dummy"] = np.where(df["regional_opposition_actors"]>=1, 1, 0)

In [46]:
df["ea dummy"] = np.where(df["national_elite_actors"]>=1, 1, np.where(df["regional_elite_actors"]>=1, 1, 0))
df["oa dummy"] = np.where(df["national_opposition_actors"]>=1, 1, np.where(df["regional_opposition_actors"]>=1, 1, 0))

In [47]:
df["diversity index"] = df["person dummy"] + df["organisation dummy"] + df["ea dummy"] + df["oa dummy"] 
df["diversity index all actors"] = df["person dummy"] + df["organisation dummy"] + df["nea dummy"] +df["noa dummy"] + df["rea dummy"] +df["roa dummy"]

## 4) Inspect and save the data

In [48]:
df.head(3)

Unnamed: 0.1,Unnamed: 0,ID,Newspaper,Date,Length,Category,Author,Headline,Teaser,Article,...,person dummy,organisation dummy,nea dummy,noa dummy,rea dummy,roa dummy,ea dummy,oa dummy,diversity index,diversity index all actors
0,6,100006,sueddeutschet politik (www),2020-05-28T15:34:08,367,,,SZ Espresso: Nachrichten kompakt - die Übersic...,<p>Was heute wichtig war - und was Sie auf SZ....,Das Wichtigste zum Coronavirus. Berufstätige M...,...,1,1,0,0,0,0,0,0,2,2
1,8,100008,sueddeutschet politik (www),2020-05-28T17:01:43,200,,,Kommunalpolitik: Abgeblendet,<p>Bayreuths Stadtrat im Stream</p>,"Livestream aus dem Stadtrat, das klingt transp...",...,1,0,0,0,0,0,0,0,1,1
2,24,100024,aachener zeitung (www),2020-05-28T03:01:52,512,Politik,,Länder planen Öffnung: Streit über Schulen und...,"<img src=""https://www.aachener-zeitung.de/imgs...",Der Streit über die Wiederöffnung von Schulen ...,...,1,1,1,1,0,0,1,1,4,4


In [49]:
df.groupby("Newspaper").describe()

Unnamed: 0_level_0,Unnamed: 0,Unnamed: 0,Unnamed: 0,Unnamed: 0,Unnamed: 0,Unnamed: 0,Unnamed: 0,Unnamed: 0,ID,ID,...,diversity index,diversity index,diversity index all actors,diversity index all actors,diversity index all actors,diversity index all actors,diversity index all actors,diversity index all actors,diversity index all actors,diversity index all actors
Unnamed: 0_level_1,count,mean,std,min,25%,50%,75%,max,count,mean,...,75%,max,count,mean,std,min,25%,50%,75%,max
Newspaper,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
Aachener Zeitung,970.0,492.615464,286.104359,0.0,245.25,491.5,739.75,990.0,970.0,493.615464,...,3.0,4.0,970.0,2.61134,0.976755,0.0,2.0,3.0,3.0,6.0
Der Tagesspiegel,1286.0,10927.010886,375.730454,10278.0,10601.25,10924.5,11252.75,11575.0,1286.0,10928.010886,...,3.0,4.0,1286.0,2.691291,0.765218,0.0,2.0,3.0,3.0,5.0
Die Welt,831.0,1407.464501,240.577279,991.0,1199.5,1408.0,1615.5,1823.0,831.0,1408.464501,...,3.0,4.0,831.0,2.67148,0.852575,0.0,2.0,3.0,3.0,5.0
Rheinische Post,2375.0,3486.029474,945.658182,1824.0,2680.5,3490.0,4302.5,5107.0,2375.0,3487.029474,...,3.0,4.0,2375.0,2.598316,0.918497,0.0,2.0,3.0,3.0,6.0
Stuttgarter Zeitung,1237.0,5749.415521,373.245187,5109.0,5430.0,5744.0,6071.0,6402.0,1237.0,5750.415521,...,3.0,4.0,1237.0,2.732417,0.978916,0.0,2.0,3.0,3.0,6.0
Süddeutsche Zeitung (inkl. Regionalausgaben),3720.0,8341.211828,1120.275724,6403.0,7376.75,8343.5,9309.25,10277.0,3720.0,8342.211828,...,3.0,4.0,3720.0,2.505645,0.799281,0.0,2.0,2.0,3.0,5.0
aachener zeitung (www),168.0,4215.535714,2451.087736,24.0,1971.75,4245.0,6544.75,7928.0,168.0,104215.535714,...,3.0,4.0,168.0,2.470238,0.947266,0.0,2.0,2.0,3.0,5.0
der tagesspiegel (www),264.0,4147.094697,2213.800853,77.0,2322.25,4323.0,5922.25,7910.0,264.0,104147.094697,...,3.0,4.0,264.0,2.375,0.764903,1.0,2.0,2.0,3.0,4.0
die welt (www),177.0,3980.39548,2289.970877,111.0,1863.0,4132.0,5963.0,7914.0,177.0,103980.39548,...,3.0,4.0,177.0,2.581921,0.901584,0.0,2.0,2.0,3.0,5.0
rheinische post (www),173.0,4260.479769,2246.251336,106.0,2294.0,4329.0,6090.0,7978.0,173.0,104260.479769,...,3.0,4.0,173.0,2.260116,0.846658,0.0,2.0,2.0,3.0,5.0


In [50]:
df.groupby("Newspaper").mean()

Unnamed: 0_level_0,Unnamed: 0,ID,Length,words in clean text,reach_dummy,modality_dummy,persons,organisations,national_elite_actors,national_opposition_actors,...,person dummy,organisation dummy,nea dummy,noa dummy,rea dummy,roa dummy,ea dummy,oa dummy,diversity index,diversity index all actors
Newspaper,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
Aachener Zeitung,492.615464,493.615464,484.008247,294.691753,0.0,1.0,5.470103,3.743299,0.71134,0.198969,...,0.94433,0.918557,0.442268,0.159794,0.11134,0.035052,0.485567,0.180412,2.528866,2.61134
Der Tagesspiegel,10927.010886,10928.010886,574.946345,345.008554,1.0,1.0,7.534992,5.557543,0.920684,0.264386,...,0.993002,0.975117,0.506998,0.173406,0.042768,0.0,0.514774,0.173406,2.656299,2.691291
Die Welt,1407.464501,1408.464501,774.438026,458.566787,1.0,1.0,7.873646,6.399519,0.90012,0.334537,...,0.974729,0.975933,0.476534,0.186522,0.056558,0.001203,0.488568,0.186522,2.625752,2.67148
Rheinische Post,3486.029474,3487.029474,377.456,227.909895,0.0,1.0,5.380211,3.716211,0.586947,0.234526,...,0.942737,0.933895,0.409263,0.188632,0.097684,0.026105,0.456421,0.208,2.541053,2.598316
Stuttgarter Zeitung,5749.415521,5750.415521,394.241714,236.194826,0.0,1.0,5.268391,3.881164,0.777688,0.203719,...,0.952304,0.94422,0.469685,0.138238,0.187551,0.04042,0.538399,0.163298,2.598222,2.732417
Süddeutsche Zeitung (inkl. Regionalausgaben),8341.211828,8342.211828,529.366129,312.950806,1.0,1.0,6.907527,4.667473,0.654301,0.233065,...,0.964785,0.938441,0.402957,0.169086,0.030376,0.0,0.409946,0.169086,2.482258,2.505645
aachener zeitung (www),4215.535714,104215.535714,401.488095,247.35119,0.0,0.0,5.553571,4.678571,0.857143,0.214286,...,0.940476,0.928571,0.416667,0.142857,0.041667,0.0,0.416667,0.142857,2.428571,2.470238
der tagesspiegel (www),4147.094697,104147.094697,574.189394,340.988636,1.0,0.0,6.935606,4.590909,0.57197,0.132576,...,0.988636,0.924242,0.337121,0.083333,0.041667,0.0,0.352273,0.083333,2.348485,2.375
die welt (www),3980.39548,103980.39548,578.751412,346.711864,1.0,0.0,7.429379,4.858757,0.988701,0.333333,...,0.966102,0.954802,0.40113,0.20904,0.050847,0.0,0.40113,0.20904,2.531073,2.581921
rheinische post (www),4260.479769,104260.479769,338.699422,206.323699,0.0,0.0,4.786127,3.473988,0.549133,0.150289,...,0.936416,0.895954,0.312139,0.086705,0.023121,0.00578,0.323699,0.092486,2.248555,2.260116


In [51]:
df.to_excel("complete_data_cleaned_with_actor_diversity.xlsx")