# Evaluation of topic modelling
This notebook evaluates how accurately the topics of bills are identified.
We do this by comparing our results with the topics assigned to bills in the dataset: bills that share a topic according to the dataset should also share a topic in our results.

In the dataset, each bill has a set of topics and one main topic. 

We will use two metrics for the evaluation: a strict one and a loose one. The strict metric requires the main topics of two bills, as specified in the dataset, to match exactly. The loose metric considers two bills to be in the same category if their sets of topics share at least $k$ topics.
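
A minimal sketch of the two criteria, assuming each bill's dataset topics are already available as a Python set. The helper names `strict_match` and `loose_match` and the example topics are hypothetical and only serve as illustration; they are not part of the notebook's code.

```python
def strict_match(main_topic_a, main_topic_b):
    """Strict metric: the main topics of the two bills must be identical."""
    return main_topic_a == main_topic_b


def loose_match(topics_a, topics_b, k=1):
    """Loose metric: the two topic sets must share at least k topics."""
    return len(set(topics_a) & set(topics_b)) >= k


# Hypothetical topic sets for three bills (made up for illustration)
bill_a = {"Health", "Taxation", "Education"}
bill_b = {"Health", "Education", "Defense"}
bill_c = {"Agriculture"}

print(strict_match("Health", "Health"))    # True
print(loose_match(bill_a, bill_b, k=2))    # True: two shared topics
print(loose_match(bill_a, bill_b, k=3))    # False: only two shared topics
print(loose_match(bill_a, bill_c, k=1))    # False: no shared topic
```

A larger $k$ makes the loose metric stricter, since more topics have to overlap before two bills count as belonging to the same category.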

### Test set for the evaluation (temporary)

In [43]:
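import csv
import random

# Generate a synthetic test set in which the identified topic usually agrees with the "true" topic
# newline='' prevents the csv module from writing blank rows on some platforms
with open('test_evaluation_data.csv', 'w', newline='') as csvfile:
    filewriter = csv.writer(csvfile, delimiter='|')
    for x in range(500):
        topic = random.randint(1, 5)
        # Keep the identified topic with weight 8, otherwise use a random topic (weight 2)
        true_topic = random.choices([topic, random.randint(1, 5)], weights=[8, 2])[0]
        # Columns: identified topic | dataset main subject | dataset subjects (placeholder)
        filewriter.writerow([topic, true_topic, 'none'])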

### Read results

In [44]:
import csv
from collections import defaultdict, Counter

In [45]:
# Allow for changes in final format of the CSV file by making the relevant column ids variables
dataset_main_subject_column = 1
dataset_subjects_column = 2
topic_column = 0

results = 'test_evaluation_data.csv'

In [55]:
# Create a dictionary of topics found in results with their respective bills
topics = defaultdict(list)

with open(results, newline='') as csvfile:
    rows = csv.reader(csvfile, delimiter='|')
    for row in rows:
        bill = dict()
        bill['topic'] = row[topic_column]
        bill['dataset_subjects'] = row[dataset_subjects_column]
        bill['dataset_main_subject'] = row[dataset_main_subject_column]
        
        topics[bill['topic']].append(bill)

We now go over each identified topic and evaluate whether its bills do in fact have the same subject (according to the dataset).

### Strict metric

Find out how many 'actual topics' (main subjects according to the dataset) each identified topic contains.

In [61]:
true_topics_per_topic = dict()
for topic, bills in topics.items():
    true_topics = Counter()
    for bill in bills:
        true_topic = bill['dataset_main_subject']
        true_topics[true_topic] += 1
    true_topics_per_topic[topic] = true_topics

In [68]:
for topic, true_topics in true_topics_per_topic.items():
    n_bills = sum(true_topics.values())
    print("Topic:", topic)
    print("# bills:", n_bills)
    print("# real topics:", len(true_topics))
    print("Most common real topic frequency:", true_topics.most_common(1)[0][1] / n_bills)
    print("---------------------------------------")

Topic: 1
# bills: 95
# real topics: 5
Most common real topic frequency: 0.8105263157894737
---------------------------------------
Topic: 5
# bills: 107
# real topics: 5
Most common real topic frequency: 0.8785046728971962
---------------------------------------
Topic: 3
# bills: 102
# real topics: 5
Most common real topic frequency: 0.7843137254901961
---------------------------------------
Topic: 4
# bills: 91
# real topics: 5
Most common real topic frequency: 0.8241758241758241
---------------------------------------
Topic: 2
# bills: 105
# real topics: 5
Most common real topic frequency: 0.8380952380952381
---------------------------------------
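
The per-topic frequencies above can be condensed into a single number by weighting each identified topic by its number of bills, a measure commonly called cluster purity. The sketch below is an addition rather than something the notebook computes: `overall_purity` is a hypothetical helper that expects a mapping shaped like `true_topics_per_topic`, and the counts in the example are made up for illustration.

```python
from collections import Counter


def overall_purity(true_topics_per_topic):
    """Fraction of all bills whose main subject is the most common one in their identified topic."""
    total = sum(sum(counts.values()) for counts in true_topics_per_topic.values())
    if total == 0:
        return 0.0
    # For each identified topic, count the bills belonging to its most common real topic
    majority = sum(
        counts.most_common(1)[0][1]
        for counts in true_topics_per_topic.values()
        if counts
    )
    return majority / total


# Made-up counts: identified topic -> Counter of dataset main subjects
example = {
    "1": Counter({"1": 8, "2": 2}),
    "2": Counter({"2": 6, "3": 4}),
}
print(overall_purity(example))  # (8 + 6) / 20 = 0.7
```

A value of 1.0 would mean that every identified topic contains bills of exactly one real topic. Note that purity alone does not penalise splitting one real topic across several identified topics.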
