# Definition Phrasing

## 0. Requirements

In [20]:
# import required libraries
import pandas as pd
from ast import literal_eval
from collections import Counter
import numpy as np
import random

# import definition strings script
import definition_strings as ds

## 1. Load Knowledge Base
We load the knowledge base that was created in the `textmining.ipynb` notebook and drop the unnecessary columns.

In [21]:
# load knowledge base
knowledge_base = pd.read_csv("../output/knowledge_base.csv", sep=",")

# drop unnecessary columns
to_drop = ["noun_forms", "related_words", "hypernyms", "roots", "en_hypernyms", "path", "wup", "stem_cistem", "stem_porter",
       "stem_lancaster", "stem_snowball", "share_cistem", "share_porter", "share_lancaster", "share_snowball", "dist_stemmer"]
knowledge_base = knowledge_base.drop(to_drop, axis=1)

## 2. Preprocessing of the Information
Reading the CSV file with `pandas` returns list- and dictionary-valued cells as plain strings, so we parse them back into Python objects with `literal_eval`. Missing values (`nan`) are left untouched, except in the collocation columns, where they are replaced by empty lists.

In [22]:
# evaluate literals
knowledge_base["compound_forms"] = knowledge_base.compound_forms.apply(lambda x: literal_eval(str(x)))
knowledge_base["definition"] = knowledge_base.definition.apply(lambda x: literal_eval(str(x)))
knowledge_base["PERS_pro"] = knowledge_base.PERS_pro.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["PERS_con"] = knowledge_base.PERS_con.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["ORG_pro"] = knowledge_base.ORG_pro.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["ORG_con"] = knowledge_base.ORG_con.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["similar_words"] = knowledge_base.similar_words.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["pro_mods"] = knowledge_base.pro_mods.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["con_mods"] = knowledge_base.con_mods.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else x)
knowledge_base["pro_attr"] = knowledge_base.pro_attr.apply(lambda x: "".join(literal_eval(str(x))) if(str(x) != "nan") else x)
knowledge_base["con_attr"] = knowledge_base.con_attr.apply(lambda x: "".join(literal_eval(str(x))) if(str(x) != "nan") else x)
knowledge_base["pro_colls"] = knowledge_base.pro_colls.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else list())
knowledge_base["con_colls"] = knowledge_base.con_colls.apply(lambda x: literal_eval(str(x)) if(str(x) != "nan") else list())
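
Why this step is needed: a CSV file stores every cell as text, so a list written by the previous notebook comes back as its string representation. The sketch below mirrors the lambdas above in a single function. `parse_cell` is a hypothetical helper that is not part of the notebook, and the cell values are made up for illustration.

```python
from ast import literal_eval


def parse_cell(cell, empty=float("nan")):
    """Turn the string form of a Python literal back into an object.

    Hypothetical helper: cells that pandas read as NaN are replaced
    by `empty` instead of being parsed.
    """
    if str(cell) == "nan":
        return empty
    return literal_eval(str(cell))


# made-up example cell as it would look after reading the CSV
raw_list = "['Person A', 'Person A', 'Person B']"
parsed = parse_cell(raw_list)

# a missing cell can be mapped to an empty list, as for the collocation columns
missing = parse_cell(float("nan"), empty=[])

print(type(raw_list).__name__, "->", type(parsed).__name__)
print(missing)
```

Passing `empty=[]` corresponds to the treatment of `pro_colls` and `con_colls`, while the default keeps the value missing as in the other columns.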

Next, we apply several preprocessing steps that translate and condense the information contained in the knowledge base:
- replace empty entity values with nan
- get counts of entities
- translate polarity labels into German counterparts
- translate attribution tags into German counterparts
- replace unavailable sarcasm values with 0
- count modifiers
- remove potential duplicates from collocation lists
- replace unavailable term frequencies and TF-IDF scores with 0
- retrieve definite articles from genus column
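
Two of the steps listed above, counting entities and removing duplicate collocations, can be sketched in isolation with the standard library. The values are made up for illustration and do not come from the corpus.

```python
from collections import Counter

# made-up entity mentions for one compound (illustrative, not corpus data)
mentions = ["Person A", "Person B", "Person A"]

# count how often each entity is mentioned
counts = Counter(mentions)

# most_common(2) yields (name, count) pairs, most frequent first
top_two = [name for name, _ in counts.most_common(2)]

# a round trip through a set removes duplicates;
# sorting is added here only to make the order reproducible
collocations = ["Welt", "Politik", "Welt"]
unique_collocations = sorted(set(collocations))

print(counts["Person A"], top_two, unique_collocations)
```

Note that converting a set back to a list without sorting, as the notebook does, gives no guarantee about the order of the collocations.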

In [27]:
# replace empty values with nan
knowledge_base["PERS_pro"] = knowledge_base.PERS_pro.apply(lambda x: x if x else np.nan)
knowledge_base["PERS_con"] = knowledge_base.PERS_con.apply(lambda x: x if x else np.nan)
knowledge_base["ORG_pro"] = knowledge_base.ORG_pro.apply(lambda x: x if x else np.nan)
knowledge_base["ORG_con"] = knowledge_base.ORG_con.apply(lambda x: x if x else np.nan)

# apply counter to count occurrences of entities
knowledge_base["PERS_pro"] = knowledge_base.PERS_pro.apply(lambda x: Counter(x) if(str(x) != "nan") else x)
knowledge_base["PERS_con"] = knowledge_base.PERS_con.apply(lambda x: Counter(x) if(str(x) != "nan") else x)
knowledge_base["ORG_pro"] = knowledge_base.ORG_pro.apply(lambda x: Counter(x) if(str(x) != "nan") else x)
knowledge_base["ORG_con"] = knowledge_base.ORG_con.apply(lambda x: Counter(x) if(str(x) != "nan") else x)

# replace sentiment with the German counterparts
knowledge_base["manual_sentiment"] = knowledge_base.manual_sentiment.replace("negative", "negativ")
knowledge_base["manual_sentiment"] = knowledge_base.manual_sentiment.replace("positive", "positiv")

# strip white spaces at right end of string (attribution columns)
knowledge_base["pro_attr"] = knowledge_base["pro_attr"].str.rstrip()
knowledge_base["con_attr"] = knowledge_base["con_attr"].str.rstrip()

# replace attribution tag with German counterpart / nan label
knowledge_base["pro_attr"] = knowledge_base.pro_attr.replace("Self", "Selbstzuschreibung")
knowledge_base["pro_attr"] = knowledge_base.pro_attr.replace("External", "Fremdzuschreibung")
knowledge_base["pro_attr"] = knowledge_base.pro_attr.replace("None", "nan")
knowledge_base["con_attr"] = knowledge_base.con_attr.replace("Self", "Selbstzuschreibung")
knowledge_base["con_attr"] = knowledge_base.con_attr.replace("External", "Fremdzuschreibung")
knowledge_base["con_attr"] = knowledge_base.con_attr.replace("None", "nan")

# replace unavailable sarcasm values with 0
knowledge_base["pro_sarcasm"] = knowledge_base.pro_sarcasm.apply(lambda x: 0.0 if(str(x) == 'nan') else x)
knowledge_base["con_sarcasm"] = knowledge_base.con_sarcasm.apply(lambda x: 0.0 if(str(x) == 'nan') else x)

# count modifiers
knowledge_base["pro_mods"] = knowledge_base.pro_mods.apply(lambda x: Counter(x) if(str(x) != 'nan') else x)
knowledge_base["con_mods"] = knowledge_base.con_mods.apply(lambda x: Counter(x) if(str(x) != 'nan') else x)

# remove duplicates from list of collocations
knowledge_base["pro_colls"] = knowledge_base.pro_colls.apply(set)
knowledge_base["pro_colls"] = knowledge_base.pro_colls.apply(list)
knowledge_base["con_colls"] = knowledge_base.con_colls.apply(set)
knowledge_base["con_colls"] = knowledge_base.con_colls.apply(list)

# replace nan values in TF columns with 0
knowledge_base["pro_updated"] = knowledge_base.pro_updated.replace(np.nan, 0)
knowledge_base["con_updated"] = knowledge_base.con_updated.replace(np.nan, 0)
knowledge_base.pro_updated = knowledge_base.pro_updated.astype('Int64') # convert into integers
knowledge_base.con_updated = knowledge_base.con_updated.astype('Int64')

# replace nan values in TF-IDF columns with 0
knowledge_base["tfidf_pro"] = knowledge_base.tfidf_pro.replace(np.nan, 0)
knowledge_base["tfidf_con"] = knowledge_base.tfidf_con.replace(np.nan, 0)

# change genus to article information
# create a list of our conditions
conditions = [(knowledge_base["genus"] == "f"),(knowledge_base["genus"] == "m"), (knowledge_base["genus"] == "n")]

# create a list of the values we want to assign for each condition
values = ["die", "der", "das"]

# create a new column and use np.select to assign values to it using our lists as arguments
# (default="" keeps the column string-typed when the genus is missing or unknown)
knowledge_base["article"] = np.select(conditions, values, default="")
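
The genus-to-article step can also be written as a plain dictionary lookup, which makes the behaviour for unexpected genus values explicit. This is an alternative sketch rather than the notebook's `np.select` call. `ARTICLES` and `article_for` are hypothetical names.

```python
# mapping from grammatical gender tag to German definite article
ARTICLES = {"f": "die", "m": "der", "n": "das"}


def article_for(genus, fallback=""):
    """Return the definite article for a genus tag, or `fallback` if unknown."""
    return ARTICLES.get(genus, fallback)


# the last entry stands for a compound without genus information
genera = ["f", "m", "n", None]
articles = [article_for(g) for g in genera]
print(articles)
```

With pandas, the same dictionary could be passed to `Series.map`, in which case unmapped values become `NaN` instead of an empty string.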

## 3. Generate Definitions
Next, for each compound word, we assemble a definition template from the string fragments that match the information available for that compound, and then fill the placeholders (written in curly braces) with the corresponding values.
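
The template mechanism can be sketched as follows. The contents of `definition_strings` are not shown in this notebook, so the three template strings and the function `build_definition` are hypothetical stand-ins. Only the placeholder names and the frequency values are taken from the notebook.

```python
# hypothetical stand-ins for the templates in `definition_strings`
str_base_info = "{COMPOUND}, {ARTICLE}\n"
str_freq = "The term '{COMPOUND}' occurs {CON_FREQ} times in C2022 and {PRO_FREQ} times in P2022."
str_simwords = "\nSee also: {SIMILAR_WORDS}"


def build_definition(compound, article, con_freq, pro_freq, similar_words):
    """Assemble the template first, then fill all placeholders in one call."""
    text = str_base_info + str_freq

    # optional fragments are appended only when information is available
    if similar_words:
        text += str_simwords

    # str.format ignores keyword arguments that the template does not use,
    # so every filler can be passed regardless of the fragments chosen
    return text.format(
        COMPOUND=compound,
        ARTICLE=article,
        CON_FREQ=con_freq,
        PRO_FREQ=pro_freq,
        SIMILAR_WORDS=", ".join(similar_words),
    )


with_similar = build_definition("Klimalüge", "die", 18, 3, ["Klimaleugner"])
without_similar = build_definition("Klimahype", "der", 15, 0, [])
print(with_similar)
print(without_similar)
```

Because unused keyword arguments are ignored, fillers for missing information only need a defined value, such as an empty string, and never reach the final text.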

In [29]:
# create new text file containing the definition texts to be able to manually check the output
# (UTF-8 is set explicitly because the texts contain German umlauts)
f = open("../evaluation/definition_texts.txt", "w", encoding="utf-8")

In [34]:
# for each compound word 
for word in knowledge_base.original:
    
    # set index to this compound word 
    idx = knowledge_base.index[knowledge_base["original"] == word][0]
        
    ### BASE INFORMATION ####
    
    # initiate base info string (i.e. compound + genus)
    text = ds.str_base_info
        
    # retrieve base information from knowledge base 
    compound = word.capitalize() # capitalized version of compound
    article = knowledge_base["article"].iloc[idx] # definite article
    con_freq = knowledge_base["con_updated"].iloc[idx] # term frequency C2022
    pro_freq = knowledge_base["pro_updated"].iloc[idx] # term frequency P2022
    con_tfidf = round(knowledge_base["tfidf_con"].iloc[idx],3) # TF-IDF C2022
    pro_tfidf = round(knowledge_base["tfidf_pro"].iloc[idx],3) # TF-IDF P2022
    pro_sarc = knowledge_base["pro_sarcasm"].iloc[idx] # sarcasm P2022
    con_sarc = knowledge_base["con_sarcasm"].iloc[idx] # sarcasm C2022
    pro_attr = knowledge_base["pro_attr"].iloc[idx] # attribution P2022
    con_attr = knowledge_base["con_attr"].iloc[idx] # attribution C2022
    
    # if category of compound is "person"
    if knowledge_base["concept"].iloc[idx] == "person":
        text += ds.str_base_pers # add "person" string to base string
            
    # if category of compound is "location"
    elif knowledge_base["concept"].iloc[idx] == "location":
        text += ds.str_base_loc # add "location" string to base string 
    
    # if category of compound is "group"
    elif knowledge_base["concept"].iloc[idx] == "group":
        text += ds.str_base_group # add "group" string to base string 
        
    # if category of compound is "abstraction"        
    elif knowledge_base["concept"].iloc[idx] == "abstraction":
        text += ds.str_base_abstract # add "abstraction" string to base string 
    
    # if category of compound is "action"
    elif knowledge_base["concept"].iloc[idx] == "action":
        text += ds.str_base_action # add "action" string to base string 
    
    ### ATTRIBUTION ###
    
    # if we have attribution info for both corpora
    if str(knowledge_base["con_attr"].iloc[idx]) != "nan" and str(knowledge_base["pro_attr"].iloc[idx]) != "nan":

        text += ds.str_attr # add attribution string
        
        # compose attribution filler
        attr = con_attr + " von Seiten der Skeptiker und als " + pro_attr + " im Vertreter Korpus."
    
    # if we only have attribution info for C2022
    elif str(knowledge_base["con_attr"].iloc[idx]) != "nan" and str(knowledge_base["pro_attr"].iloc[idx]) == "nan":

        text += ds.str_attr # add attribution string
        
        # compose attribution filler
        attr = con_attr + " von Seiten der Skeptiker." 

    # if we only have attribution info for P2022
    elif str(knowledge_base["con_attr"].iloc[idx]) == "nan" and str(knowledge_base["pro_attr"].iloc[idx]) != "nan":
        
        text += ds.str_attr # add attribution string    
            
        # compose attribution filler
        attr = pro_attr + " von Seiten der Vertreter." 
            
    else:        
        attr = "" # attribution not available 
        
        
    ### SARCASM ###
    
    # if we have sarcasm values for P2022 and C2022
    if pro_sarc > 0 and con_sarc> 0:
                
        # add sarcasm string
        text += ds.str_sarcasm
        
        # compose text for sarcasm string
        sarcasm = "in " + str(int(pro_sarc*100)) + "% der Fälle für den Vertreter Diskurs und in " + str(int(con_sarc*100)) + "% der Fälle für den Skeptiker Diskurs"

    # or if we only have sarcasm info for P2022
    elif pro_sarc > 0: 
        
        # add sarcasm string
        text += ds.str_sarcasm

        # compose sarcasm filler
        sarcasm = "in " + str(int(pro_sarc*100)) + "% der Fälle im Vertreter Diskurs"
        
    # or if we only have sarcasm info for C2022
    elif con_sarc > 0:
        
        # add sarcasm string
        text += ds.str_sarcasm

        # compose sarcasm filler
        sarcasm = "in " +str(int(con_sarc*100)) + "% der Fälle im Skeptiker Diskurs"
        
    else:
        # add sarcasm string
        text += ds.str_sarcasm
        sarcasm = "nicht" # sarcasm not available
            
    
    ### CONNOTATION ### 
        
    # add sentiment string
    text += ds.str_sent 
    
    sentiment = knowledge_base["manual_sentiment"].iloc[idx] # retrieve connotation label 
        
    ### MODIFIERS ###    
    
    # if we have at least one P2022 modifier for compound
    if knowledge_base["pro_mods"].isna().iloc[idx] == False:
        
        try:
            # try to retrieve two most common modifiers and connect with conjunction
            pro_mods =  " und ".join(["'"+el[0]+"'" for el in knowledge_base["pro_mods"].iloc[idx].most_common(2) if el[1] > 1])
            
            # if modifier string is not empty 
            if pro_mods != "":
                text += ds.str_mods_pro # add "pro modifier" string to base string 
        
        except:
            # if only one modifier available retrieve this one
            pro_mods = "".join(["'"+el[0]+"'" for el in knowledge_base["pro_mods"].iloc[idx].most_common(1) if el[1] > 1])
            
            # if modifier string is not empty
            if pro_mods != "":
                text += ds.str_mods_pro # add "pro modifier" string to base string 
    else:
        pro_mods = "" # else, no P2022 modifier available 
            
    # if we have at least one C2022 modifier for compound
    if knowledge_base["con_mods"].isna().iloc[idx] == False:
        
        try:
            # try to retrieve two most common modifiers and connect with conjunction
            con_mods =  " und ".join(["'"+el[0]+"'" for el in knowledge_base["con_mods"].iloc[idx].most_common(2) if el[1] > 1])
            
            # if modifier string is not empty
            if con_mods != "":
                text += ds.str_mods_con # add "con modifier" string to base string 
        
        except:
            # if only one modifier available retrieve this one
            con_mods = "".join(["'"+el[0]+"'" for el in knowledge_base["con_mods"].iloc[idx].most_common(1) if el[1] > 1])
            
            # if modifier string is not empty
            if con_mods != "": 
                text += ds.str_mods_con # add "con modifier" string to base string 
    else:
        con_mods = "" # else, no C2022 modifier available 
            
    
    ### PERSON ENTITIES ###
    
    # if we have at least one person in C2022 AND P2022
    if knowledge_base["PERS_con"].isna().iloc[idx] == False and knowledge_base["PERS_pro"].isna().iloc[idx] == False:
        pers_both = True # set value to TRUE
    
    else:
        pers_both = False # set value to FALSE 
            
    # if we have at least one person in C2022
    if knowledge_base["PERS_con"].isna().iloc[idx] == False:
        
        try:
            # try to retrieve the two most common persons and connect with "und"
            con_pers = " und ".join([el[0] for el in knowledge_base["PERS_con"].iloc[idx].most_common(2)])
            text += ds.str_pers # add "person" string to base string 
            text += ds.str_pers_con # add "con person" string to base string 
        
        except:
            # if only one person available retrieve this one
            con_pers = knowledge_base["PERS_con"].iloc[idx].most_common(1)[0][0]  # name only, not the (name, count) pair
            text += ds.str_pers # add "person" string to base string 
            text += ds.str_pers_con # add "con person" string to base string  
           
        # if we only have C2022 "person"
        if pers_both == False:
            text += "." # add full stop to string
    else:
        con_pers = "" # no C2022 person available 
            
    # if we have at least one person in P2022
    if knowledge_base["PERS_pro"].isna().iloc[idx] == False:
        
        # if we have "person" entities for C2022 AND P2022
        if pers_both == True:
            
            try:
                # try to retrieve the two most common persons and connect with "und"
                pro_pers =  " und ".join([el[0] for el in knowledge_base["PERS_pro"].iloc[idx].most_common(2)])
                text += " und" # add "und" to string 
                text += ds.str_pers_pro + "." # add "pro person" string and full stop to base string 

            except:
                # if only one person available retrieve this one
                pro_pers = knowledge_base["PERS_pro"].iloc[idx].most_common(1)[0][0]  # name only, not the (name, count) pair
                text += " und" # add "und" to string 
                text += ds.str_pers_pro + "."# add "pro person" string and full stop to base string
                
        else:
            try:
                # try to retrieve the two most common persons and connect with comma
                pro_pers =  ", ".join([el[0] for el in knowledge_base["PERS_pro"].iloc[idx].most_common(2)])
                text += ds.str_pers # add "person" string to base string 
                text += ds.str_pers_pro + "." # add "pro person" string and full stop to base string 

            except:
                # if only one person available retrieve this one
                pro_pers = knowledge_base["PERS_pro"].iloc[idx].most_common(1)[0][0]  # name only, not the (name, count) pair
                text += ds.str_pers # add "person" string to base string 
                text += ds.str_pers_pro + "." # add "pro person" string and full stop to base string 

    else:
        pro_pers = "" # no P2022 person available 

    ### ORGANISATION ENTITIES ### 
    
    # if we have at least one organisation in C2022 AND P2022
    if knowledge_base["ORG_con"].isna().iloc[idx] == False and knowledge_base["ORG_pro"].isna().iloc[idx] == False:
        org_both = True # set value to TRUE
    
    else:
        org_both = False # set value to FALSE 
            
    # if we have at least one organisation in C2022
    if knowledge_base["ORG_con"].isna().iloc[idx] == False:
            
        try:
            # try to retrieve two most common organisations and connect with comma
            con_org =  ", ".join([el[0] for el in knowledge_base["ORG_con"].iloc[idx].most_common(2)])
            text += ds.str_org # add "organisation" string to base string 
            text += ds.str_org_con # add "con organisation" string to base string 
        
        except:
            # if only one organisation available retrieve this one
            con_org = knowledge_base["ORG_con"].iloc[idx].most_common(1)[0][0]  # name only, not the (name, count) pair
            text += ds.str_org # add "organisation" string to base string 
            text += ds.str_org_con # add "con organisation" string to base string 
        
        # if we only have C2022 "organisation"
        if org_both == False:
            text += "." # add full stop to string 
                
    else:
        con_org = "" # no C2022 "organisation" available 
            
    # if we have at least one organisation in P2022
    if knowledge_base["ORG_pro"].isna().iloc[idx] == False:
        
        # if we have organisations for C2022 AND P2022
        if org_both == True:
            try:
                # try to retrieve two most common organisations and connect with comma
                pro_org =  ", ".join([el[0] for el in knowledge_base["ORG_pro"].iloc[idx].most_common(2)])
                text += " und" # add "und" string to base string 
                text += ds.str_org_pro + "." # add "pro organisation" string to base string 

            except:
                # if only one organisation available retrieve this one
                pro_org = knowledge_base["ORG_pro"].iloc[idx].most_common(1)[0][0]  # name only, not the (name, count) pair
                text += " und" # add "und" string to base string 
                text += ds.str_org_pro + "." # add "pro organisation" string to base string
                
        else:
            try:
                # try to retrieve two most common organisations and connect with comma
                pro_org =  ", ".join([el[0] for el in knowledge_base["ORG_pro"].iloc[idx].most_common(2)])
                text += ds.str_org # add "organisation" string to base string 
                text += ds.str_org_pro + "."# add "pro organisation" string to base string 

            except:
                # if only one organisation available retrieve this one
                pro_org = knowledge_base["ORG_pro"].iloc[idx].most_common(1)[0][0]  # name only, not the (name, count) pair
                text += ds.str_org # add "organisation" string to base string 
                text += ds.str_org_pro + "." # add "pro organisation" string to base string
            
    else:
        pro_org = "" # no P2022 "organisation" available 
        
    ### COLLOCATIONS ###
        
    # if we have collocations for C2022 and P2022
    if len(knowledge_base["con_colls"].iloc[idx]) != 0 and len(knowledge_base["pro_colls"].iloc[idx]) != 0:
        try:
            # try to retrieve two random collocations from C2022 and connect with comma and compose string
            con_colls = ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["con_colls"].iloc[idx], 2)]) + " (Skeptiker Korpus)"
            
            
        except:
            # if only one collocation from C2022 available retrieve this one and compose string
            con_colls = ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["con_colls"].iloc[idx], 1)]) + " (Skeptiker Korpus)"

        try:
            # try to retrieve two random collocations from P2022 and connect with comma and compose string 
            pro_colls = " und " + ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["pro_colls"].iloc[idx], 2)]) + " (Vertreter Korpus)"


        except:
            # if only one collocation from P2022 available retrieve this one
            pro_colls = " und " + ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["pro_colls"].iloc[idx], 1)]) + " (Vertreter Korpus)"



        text += ds.str_colls # add "collocations" string to base string 
        
    # if we only have collocations for C2022
    elif len(knowledge_base["con_colls"].iloc[idx]) != 0:
        try:
            # try to retrieve two random collocations from C2022 and connect with comma and compose string
            con_colls = ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["con_colls"].iloc[idx], 2)]) + " (Skeptiker Korpus)"

        except:
            # if only one collocation from C2022 available retrieve this one and compose string
            con_colls = ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["con_colls"].iloc[idx], 1)]) + " (Skeptiker Korpus)"

        pro_colls = "" # no collocations for P2022 available 
        text += ds.str_colls # add "collocations" string to base string
        
    # if we only have collocations for P2022
    elif len(knowledge_base["pro_colls"].iloc[idx]) != 0:
        try:
            # try to retrieve two random collocations from P2022 and connect with comma and compose string 
            pro_colls = ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["pro_colls"].iloc[idx], 2)]) + " (Vertreter Korpus)"


        except:
            # if only one collocation from P2022 available retrieve this one
            pro_colls = ", ".join(["'"+x+"'" for x in random.sample(knowledge_base["pro_colls"].iloc[idx], 1)]) + " (Vertreter Korpus)"

        con_colls = "" # no collocations for C2022 available
        text += ds.str_colls # add "collocations" string to base string 
        
    else:
        pro_colls = "" # no P2022 collocation available
        con_colls = "" # no C2022 collocation available 
        
        
    ### SIMILAR WORDS ###
    
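    # if we have at least one similar word (missing values are nan, which has no length)
    if isinstance(knowledge_base["similar_words"].iloc[idx], (list, tuple, set)) and len(knowledge_base["similar_words"].iloc[idx]) != 0: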
        
        # retrieve the words and re-append the prefix "Klima" to the words
        similar_words = set(["Klima"+x for x in knowledge_base["similar_words"].iloc[idx] if "Klima"+x != compound])
        
        # connect the words with a comma 
        similar_words = ", ".join(similar_words)
        
        text += ds.str_simwords # add "similar words" string to base string
            
            
    else:
        # if no similar words available, keep empty
        similar_words = ""

    ### FILL PLACE HOLDERS ###
            
    # assign the fillers to the according place holders in the final definition string    
    full_definition = text.format(COMPOUND= compound, ARTICLE = article, CON_FREQ= con_freq, PRO_FREQ = pro_freq, 
                                  CON_TFIDF = con_tfidf, PRO_TFIDF = pro_tfidf, SENTIMENT= sentiment, CON_PERS = con_pers, 
                                  PRO_PERS = pro_pers, CON_ORG = con_org, PRO_ORG = pro_org, SIMILAR_WORDS= similar_words, 
                                  PRO_MODS = pro_mods, CON_MODS = con_mods, PRO_COLLS = pro_colls, CON_COLLS = con_colls, 
                                  ATTRIBUTION = attr, SARCASM = sarcasm)
    

    #print(full_definition)
    #print("\n")
    #print("_"*50)
    
    # write full definition to text file (for evaluation purposes)
    f.write("\n")
    f.write(full_definition)
    f.write("\n")
    f.write("_"*50)
    f.write("\n")
    
    # save to column "full_definition" in knowledge base 
    knowledge_base.at[idx, "full_definition"] = full_definition

# close the text file so that all definitions are flushed to disk
f.close()

Let's have a look at an example definition text for the compound **Klimalüge**:

In [38]:
print(knowledge_base[knowledge_base.original == "klimalüge"].full_definition.values[0])

Klimalüge, die
Der Begriff 'Klimalüge' wird in unserem Korpus 18 Mal von den Klimaforschungsskeptikern und 3 Mal von den Klimaforschungsvertretern verwendet. Auf den gesamten Korpus gesehen entspricht das einer relativen Häufigkeit (TF-IDF) von 0.191 für die Skeptiker und 0.016 für die Vertreter. Die Verwendung wird in 33% der Fälle für den Vertreter Diskurs und in 11% der Fälle für den Skeptiker Diskurs als sarkastisch eingestuft. In unserem Korpus Sample ist der Begriff tendenziell negativ konnotiert. Im Zusammenhang mit dem Begriff erwähnt der Skeptiker Korpus die Person(en) Christian Möser und Hartmut Bachmann und der Vertreter Korpus die Person(en) Hartmut Bachmann und Donald Trump. Im Kontext von 'Klimalüge' erfolgt die Nennung folgender Organisation(en): FFF, EIKE (Skeptiker Korpus) und EU, IPCC (Vertreter Korpus).

Kollokationen: 'aufzuschwätzen', 'Welt' (Skeptiker Korpus)

Siehe auch: Klimaleugner, Klimaerzählung, Klimalügner
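
The person, modifier and collocation fragments in the output above are built with `Counter.most_common` and `random.sample`. The following sketch isolates that logic in three hypothetical helper functions with made-up input values. It relies on the fact that `most_common(n)` returns fewer than `n` pairs for a short counter instead of raising, and it caps the sample size explicitly because `random.sample` raises a `ValueError` when asked for more items than are available.

```python
import random
from collections import Counter


def top_entities(counter, n=2, joiner=" und "):
    """Join the names of the (up to) n most frequent entities."""
    return joiner.join(name for name, _ in counter.most_common(n))


def frequent_modifiers(counter, n=2, min_count=2):
    """Quote and join those of the top-n modifiers that occur at least min_count times."""
    return " und ".join(
        f"'{mod}'" for mod, count in counter.most_common(n) if count >= min_count
    )


def sample_collocations(collocations, k=2, rng=random):
    """Pick up to k collocations at random, quoted and comma-separated."""
    picked = rng.sample(collocations, min(k, len(collocations)))
    return ", ".join(f"'{c}'" for c in picked)


# made-up inputs (illustrative, not corpus data)
persons = Counter(["Person A", "Person A", "Person B"])
modifiers = Counter(["x", "x", "y"])

print(top_entities(persons))
print(top_entities(Counter(["Person A"])))
print(frequent_modifiers(modifiers))
print(sample_collocations(["Welt"]))
```

Passing a seeded `random.Random` instance as `rng` would make the sampled collocations reproducible between runs.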


***
To retrieve a definition for a random glossary term please run the following code:

In [40]:
print(knowledge_base.sample().full_definition.values[0])

Klimahype, der
Der Begriff 'Klimahype' wird in unserem Korpus 15 Mal von den Klimaforschungsskeptikern und 0 Mal von den Klimaforschungsvertretern verwendet. Auf den gesamten Korpus gesehen entspricht das einer relativen Häufigkeit (TF-IDF) von 0.178 für die Skeptiker und 0.0 für die Vertreter. Die Verwendung wird nicht als sarkastisch eingestuft. In unserem Korpus Sample ist der Begriff tendenziell neutral konnotiert. Im Zusammenhang mit dem Begriff erwähnt der Skeptiker Korpus die Person(en) Stefan Rahmstorf und Angela Merkel. Im Kontext von 'Klimahype' erfolgt die Nennung folgender Organisation(en): EIKE, AWI (Skeptiker Korpus).

Kollokationen: 'Wirkmächtigkeit', 'kommen' (Skeptiker Korpus)

Siehe auch: Klimahysterie
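
The sarcasm sentence differs between the two sample outputs because it depends on which corpus has a non-zero sarcasm share. The branching can be isolated in a small function. `sarcasm_phrase` is a hypothetical name, and the German fragments are copied from the loop above.

```python
def sarcasm_phrase(pro_share, con_share):
    """Return the filler for the SARCASM placeholder.

    Shares are fractions between 0 and 1; 0 means no sarcasm information.
    """

    def pct(share):
        # int() truncates towards zero, as in the notebook
        return f"{int(share * 100)}%"

    if pro_share > 0 and con_share > 0:
        return (
            f"in {pct(pro_share)} der Fälle für den Vertreter Diskurs und "
            f"in {pct(con_share)} der Fälle für den Skeptiker Diskurs"
        )
    if pro_share > 0:
        return f"in {pct(pro_share)} der Fälle im Vertreter Diskurs"
    if con_share > 0:
        return f"in {pct(con_share)} der Fälle im Skeptiker Diskurs"
    return "nicht"


print(sarcasm_phrase(0.5, 0.25))
print(sarcasm_phrase(0.0, 0.0))
```

Since `int()` truncates, a share whose floating-point product falls just below a whole number is reported one percentage point lower; `round()` would be an alternative if that matters.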


### Save Definition Texts to Knowledge Base
Finally, we save the definition texts to an updated knowledge base file, `knowledge_base_updated.csv`. We do not overwrite the old knowledge base (the output of `textmining.ipynb`), so that the progress and the documentation of each stage of the project remain traceable.

In [41]:
# save to file 
knowledge_base.to_csv("../output/knowledge_base_updated.csv", index = False)