## Processing

For each entry that matches a search term, we want to extract the following:
 1. Label with explanation: The entry that contains the search term
 2. Argument: The parent of 1)
 3. Claim: The parent of 2)
 4. Type of argument: Is 2) a pro or con of 3)?
 5. Type of fallacy: The search term that matched
 6. Pros: A list containing all the pros of 1)
 7. Cons: A list containing all the cons of 1)
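
The relationships in this list can be illustrated on a toy tree. This is only a sketch: the keys `index`, `stance` and `content` mirror the ones used in this notebook, while the `parent` key, the `NODES` table and the helper functions are assumptions made for illustration and are not part of `DiscussionTree`.

```python
# Toy stand-in for a discussion tree (hypothetical data and helpers).
NODES = {
    0: {"index": 0, "parent": None, "stance": None, "content": "claim"},
    1: {"index": 1, "parent": 0, "stance": "Con", "content": "argument"},
    2: {"index": 2, "parent": 1, "stance": "Con", "content": "label naming a fallacy"},
    3: {"index": 3, "parent": 2, "stance": "Pro", "content": "reply supporting the label"},
    4: {"index": 4, "parent": 2, "stance": "Con", "content": "reply attacking the label"},
}


def get_parent(index):
    """Return the parent node of the node with the given index."""
    return NODES[NODES[index]["parent"]]


def get_children(index):
    """Return all nodes whose parent is the node with the given index."""
    return [node for node in NODES.values() if node["parent"] == index]


def extract(index):
    """Collect items 1) to 7) for the node that contains the search term."""
    label = NODES[index]
    argument = get_parent(index)
    claim = get_parent(argument["index"])
    children = get_children(index)
    return {
        "label": label["content"],
        "argument": argument["content"],
        "claim": claim["content"],
        "stance": argument["stance"],  # stance of 2) towards 3)
        "pros": [c["content"] for c in children if c["stance"] == "Pro"],
        "cons": [c["content"] for c in children if c["stance"] == "Con"],
    }


result = extract(2)
```

Here node 2 plays the role of the label, node 1 is its parent (the argument), and node 0 is the claim at the top.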


In [1]:
from src.discussiontree import DiscussionTree
dt = DiscussionTree("example.csv")

indices = dt.find("fallacy")
for index in indices:
    
    # 1) Label with explanation
    label_node = dt.get_entry(index)
    label = label_node["content"]
    print(f"1) label: {label}")
    
    # 2) Argument
    argument_node = dt.get_parent(index)
    argument = argument_node["content"]
    print(f"2) argument: {argument}")
    
    # 3) Claim
    claim_node = dt.get_parent(argument_node["index"])
    claim = claim_node["content"]
    print(f"3) claim: {claim}")
    
    # 4) Type of argument
    stance = argument_node["stance"]
    print(f"4) stance: {stance}")
    
    # 5) Type of fallacy
    print("5) type of fallacy: \"fallacy\"")
    
    # 6) Pros
    children = dt.get_children(index)
    print("6) Pros")
    for child in children:
        if child["stance"] == "Pro":
            print(child["content"])
    
    # 7) Cons
    print("7) Cons")
    for child in children:
        if child["stance"] == "Con":
            print(child["content"])
    

1) label: [Appeal to ignorance](https://en.wikipedia.org/wiki/Argument_from_ignorance) is a logical fallacy, it should not be asserted that a proposition is true because it has not yet been proven false.
2) argument: Absence of evidence is not evidence of absence.
3) claim: There is no historical evidence of Exodus happening.
4) stance: Con
5) type of fallacy: "fallacy"
6) Pros
7) Cons
This does not preclude the possibility that there is insufficient evidence to determine if a claim is true or false. Regardless of that fact though, the burden of proof remains that those making the claim must produce evidence to provide sufficient warrant for their position, or to abandon the claim as true citing there is insufficient evidence to consider it as true, which is not an admission that it is false, merely that it is unknown.


### Now for a larger dataset

Set the path below to the directory containing the Kialo debate exports:

In [2]:
DEBATES_PATH = "./debates"

Load all the debates into a dictionary of discussion trees:

In [3]:
from src.discussiontree import DiscussionTree
import os

trees = {}

for filename in os.listdir(DEBATES_PATH):
    trees[filename] = DiscussionTree(os.path.join(DEBATES_PATH, filename))
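
Note that `os.listdir` returns every entry in the directory, including hidden files and subdirectories. If the export directory may contain such entries, a filter along the following lines could be applied before constructing the trees. The helper name `list_debate_files` is hypothetical, and the demonstration uses a throwaway directory rather than real exports.

```python
import os
import tempfile


def list_debate_files(path):
    """Return the sorted names of regular, non-hidden files in path."""
    return sorted(
        name
        for name in os.listdir(path)
        if not name.startswith(".") and os.path.isfile(os.path.join(path, name))
    )


# Small demonstration on a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "debate_a.csv"), "w").close()
    open(os.path.join(tmp, ".DS_Store"), "w").close()
    os.mkdir(os.path.join(tmp, ".ipynb_checkpoints"))
    os.mkdir(os.path.join(tmp, "subfolder"))
    demo_files = list_debate_files(tmp)
```

The loop above could then iterate over `list_debate_files(DEBATES_PATH)` instead of `os.listdir(DEBATES_PATH)`.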

Define a function that extracts the information we want into a table row:

In [4]:
def parse_row(dt, index, term, header):
    """Build one table row (a dict keyed by the column names in header) for the entry at index."""
    row = {}
    
    # 1) Label with explanation
    label_node = dt.get_entry(index)
    row[header[0]] = label_node["content"]
    
    # 2) Argument
    argument_node = dt.get_parent(index)
    row[header[1]] = argument_node["content"]

    try:
        # 3) Claim
        claim_node = dt.get_parent(argument_node["index"])
        row[header[2]] = claim_node["content"]
        # 4) Type of argument
        row[header[3]] = argument_node["stance"]
    except KeyError:
        row[header[2]] = row[header[3]] = ""

    # 5) Type of fallacy
    row[header[4]] = term

    # 6) Pros
    children = dt.get_children(index)
    pros = ""   
    for child in children:
        try:
            if child["stance"] == "Pro":
                pros += " " + child["content"]
        except KeyError:
            pass
    row[header[5]] = pros

    # 7) Cons
    cons = ""
    for child in children:
        try:
            if child["stance"] == "Con":
                cons += " " + child["content"]
        except KeyError:
            pass
    row[header[6]] = cons
    
    return row


Load the search terms to use for extraction, one term per line:

In [5]:
with open("fallacy_types.txt") as file:
    terms = file.read().splitlines()
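
Because `splitlines()` keeps blank lines and surrounding whitespace, a stray empty line in the file would become an empty search term. A minimal clean-up sketch follows; the helper name `clean_terms` is hypothetical and not part of the original notebook.

```python
def clean_terms(lines):
    """Strip whitespace, drop empty lines and duplicates, keep the original order."""
    seen = set()
    cleaned = []
    for line in lines:
        term = line.strip()
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)
    return cleaned


example_terms = clean_terms(["fallacy", "", "  ad hominem ", "fallacy"])
```

It could be applied as `terms = clean_terms(terms)` right after reading the file.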

Now we can run the extraction and write one CSV file per search term with the results:

In [16]:
import csv
import os

header = ["Label", "Argument", "Claim", "Type of Argument", "Type of fallacy", "Pros", "Cons"]

n_trees = 0
n_errors = 0
n_results = 0
for term in terms:
    rows = []
    n_term_results = 0
    for dt in trees.values():
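        # Incremented once per (term, tree) pair, so the final value is
        # len(terms) * len(trees), not the number of distinct debates
        n_trees += 1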
        indices = dt.find(term)
        for index in indices:
            try:
                row = parse_row(dt, index, term, header)
                rows.append(row)
                n_term_results += 1
            except KeyError:
                n_errors += 1
    print(f"Extracted {n_term_results} for \"{term}\"")
    n_results += n_term_results
    filename = f"results/{term}.csv"
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    # newline="" lets the csv module control line endings (avoids blank rows on Windows)
    with open(filename, "w", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header)
        writer.writeheader()
        writer.writerows(rows)
print(f"\n{n_trees} discussions processed")
print(f"{n_results} results collected")
print(f"{n_errors} Key Errors detected")

Extracted 495 for "fallacy"
Extracted 0 for "false generalization"
Extracted 39 for "ad hominem"
Extracted 37 for "ad populum"
Extracted 39 for "strawman"
Extracted 23 for "straw man"
Extracted 1 for "false causality"
Extracted 0 for "circular rasoning"
Extracted 35 for "appeal to authority"
Extracted 9 for "appeal to emotion"
Extracted 31 for "from ignorance"
Extracted 0 for "fallacy of relevance"
Extracted 0 for "deductive fallacy"
Extracted 1 for "intentional fallacy"
Extracted 0 for "fallacy of extension"
Extracted 36 for "false dichotomy"
Extracted 0 for "fallacy of credibility"
Extracted 15 for "equivocation"
Extracted 14 for "naturalistic fallacy"

26771 discussions processed
775 results collected
180 Key Errors detected
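
The total of 775 is the sum of the per-term counts. If `dt.find` performs substring matching (an assumption, since its implementation is not shown here), an entry that mentions "naturalistic fallacy" would also be returned for "fallacy", so some entries may be counted more than once. The following sketch shows one way such rows could be merged; the function name `merge_rows` and the sample rows are hypothetical.

```python
def merge_rows(rows_by_term):
    """Collapse rows describing the same entry and collect the terms that matched it."""
    merged = {}
    for term, rows in rows_by_term.items():
        for row in rows:
            # Treat rows with identical label, argument and claim as the same entry
            key = (row["Label"], row["Argument"], row["Claim"])
            if key not in merged:
                entry = dict(row)
                entry["Type of fallacy"] = [term]
                merged[key] = entry
            else:
                merged[key]["Type of fallacy"].append(term)
    return list(merged.values())


# Made-up rows for illustration only
sample = {
    "fallacy": [
        {"Label": "L1", "Argument": "A1", "Claim": "C1", "Type of fallacy": "fallacy"},
        {"Label": "L2", "Argument": "A2", "Claim": "C2", "Type of fallacy": "fallacy"},
    ],
    "naturalistic fallacy": [
        {"Label": "L1", "Argument": "A1", "Claim": "C1", "Type of fallacy": "naturalistic fallacy"},
    ],
}
unique_rows = merge_rows(sample)
```

In this example the three input rows collapse into two entries, and the first one carries both matching terms.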
