# Reddit False Information Detection

## Background and current situation

### Goals
In this project, we want to find out whether an algorithm trained on existing fake news datasets is suitable for detecting misinformation in the quite specific type of content posted on Reddit. To assess the predictions made by our machine learning model, we check the online status of the streamed content exactly one week after it was posted. Our four areas of interest and main questions / goals are:

- Content moderation: determine if our method can be effective in filtering out misinformation compared to human content moderation
- Resource use: is our method efficient and scalable for real-time Big Data analysis?
- Reddit insights: analyze posts / comments classified as misinformation
- Model applicability: find out if an algorithm trained on existing misinformation datasets is suitable for Reddit

### Notebook structure
- reddit.ipynb: This notebook contains everything related to getting data from the Reddit API (streaming real-time content, retrieving data 1 week later), applying a trained model for classification, calculating performance metrics and some statistics as well as analyzing and discussing the results.
- ml_model_training[...].ipynb: The code and information related to the training of the ML models can be found in two separate notebooks (one for the model trained on the Truthseeker dataset, one for that trained on the Fakeddit dataset).

### Reddit API changes
Reddit has had a free API for 7 years now, but in April 2023 the company announced changes to it. As a result, third-party apps that access Reddit’s API will have to pay per request going forward: by the end of June, they will either be shut down or need to pay $0.24 per 1,000 API calls. However, Reddit says that apps using fewer than 100 requests per minute through an OAuth client ID will still be able to use the API free of charge. One major open question is what Reddit counts as "one request". With the information available, it is not clear to us whether continuous streaming (as in our project) counts as a single request or whether each submission or comment appearing in the stream counts as one. Since it was critical for us to finish all tasks that access the API before the changes applied, and the exact date was unclear (June 19th or July 1st were the two possible dates), we completed all of these parts before June 19th to be on the safe side.
The official reason given by Reddit for the API changes was that users visiting the site through third-party apps may not see ads that Reddit serves on its website and first-party app, so it was not sustainable any longer for the company. In an “Ask me anything” by Reddit CEO Steve Huffman, it was announced that moderation tools that need API access will remain free of charge.
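
To get a feeling for the announced pricing, the following minimal sketch converts a number of API calls into a cost at $0.24 per 1,000 calls. The function name and the example volume are illustrative assumptions. As noted above, it is unclear how Reddit counts requests for a continuous stream, so the number of calls is an input here, not something we can derive.

```python
# Announced price for paid API access: $0.24 per 1,000 calls
PRICE_PER_1000_CALLS_USD = 0.24


def estimate_api_cost(num_calls):
    """Return the estimated cost in USD for a given number of API calls."""
    return num_calls / 1000 * PRICE_PER_1000_CALLS_USD


# Hypothetical volume of one million API calls
example_calls = 1_000_000
print(f"{example_calls} calls -> ${estimate_api_cost(example_calls):.2f}")
```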

### Protests and blackout
Almost 8,000 subreddits, including some of the biggest on the platform, participated in an event called “Reddark”, in which moderators set their communities to private so that users could no longer access them. The blackout started on June 12th and was meant to last at least 48 hours. The participating subreddits had a combined two and a half billion members (cumulative, not unique users).
Over 6,000 subreddits stayed offline even after the 48 hours:

- Affected subreddits with 40+ million users: r/funny
- Affected subreddits with 30+ million users: r/aww, r/gaming, r/Music, r/Pics, r/science, r/todayilearned
- Affected subreddits with 20+ million users: r/art, r/askscience, r/books, r/DIY, r/EarthPorn, r/explainlikeimfive, r/food, r/gadgets, r/gifs, r/LifeProTips, r/memes, r/mildlyinteresting, r/NotTheOnion, r/Showerthoughts, r/space, r/sports, r/videos

An internal company memo was leaked, telling employees that the blackouts had not significantly affected revenue and that the company intended to wait them out rather than respond to them.

### Subreddit rules and moderation
In our selected subreddits (r/news, r/politics, r/science, r/worldnews), the number of submissions is much lower than the number of comments (roughly by a factor of 100). One possible reason for this extreme difference is the subreddit-specific level of moderation and the sometimes very strict rules for submissions. For example, r/politics maintains a list of approved domains, and a submission from any other domain gets removed. New accounts that have not yet been on the platform for a certain time have restricted posting rights. The submission title must match the headline of the linked news article word for word. Additional text below the title is often prohibited, so adding an opinion leads to removal because only original text from the article is allowed. Finally, the submitted news article must not be more than 1 week old at the time of posting.


### Sources:

https://www.businessinsider.com/biggest-subreddits-affected-by-48-hour-blackout-list-private-2023-6 

https://www.digitaltrends.com/computing/reddit-api-changes-explained/ 

https://www.forbes.com/sites/antoniopequenoiv/2023/06/13/reddit-stands-by-controversial-api-changes-as-subreddit-protest-continues/ 

https://www.reddit.com/r/news/ 

https://www.reddit.com/r/politics/ 

https://www.reddit.com/r/science/  

https://www.reddit.com/r/worldnews/ 

## Streaming and classification

### Importing necessary packages

In [4]:
import praw
from time import time
from datetime import datetime
import spacy_sentence_bert
import pandas as pd
from IPython.display import display, HTML
from sklearn.metrics import accuracy_score, f1_score, recall_score, precision_score

### Connecting to the Reddit API

Reddit's API is used via the Python package "PRAW". The client ID and secret were created after registering a Reddit account and registering an app at https://www.reddit.com/prefs/apps.

In [None]:
reddit = praw.Reddit(client_id='...',
                     client_secret='...',
                     user_agent='...',
                     username='...',
                     password='...')
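
To keep credentials out of the notebook, the keyword arguments can be read from environment variables instead of being typed into the cell. This is only a sketch: the helper name and the `REDDIT_` variable names are our own assumptions, not something PRAW prescribes.

```python
import os


def reddit_credentials_from_env(prefix="REDDIT_"):
    """Collect the keyword arguments for praw.Reddit from environment variables."""
    keys = ["client_id", "client_secret", "user_agent", "username", "password"]
    # Report all missing variables at once instead of failing on the first one
    missing = [prefix + key.upper() for key in keys if prefix + key.upper() not in os.environ]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: os.environ[prefix + key.upper()] for key in keys}


# Usage (requires the variables to be set and PRAW to be installed):
# import praw
# reddit = praw.Reddit(**reddit_credentials_from_env())
```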

### Streaming all new posts on Reddit

In [None]:
stream_start = time()
for submission in reddit.subreddit('all').stream.submissions():
    if submission.created_utc > stream_start:
        print('id: ' + str(submission.id))
        print('unix_timestamp: ' + str(submission.created_utc))
        print('subreddit: ' + str(submission.subreddit.display_name))
        print('author_name: ' + str(submission.author))
        print('author_id: ' + str(submission.author.id))
        print('upvotes: ' + str(submission.score))
        print('title: ' + str(submission.title))
        print('url: ' + str(submission.url))
        print('content: ' + str(submission.selftext))
        print('________________________________________________ \n')

### Streaming all new comments on Reddit

In [None]:
stream_start = time()
for comment in reddit.subreddit('all').stream.comments():
    if comment.created_utc > stream_start:
        print('id: ' + str(comment.id))
        print('unix_timestamp: ' + str(comment.created_utc))
        print('subreddit: ' + str(comment.subreddit.display_name))
        print('submission_id: ' + str(comment.submission.id))
        print('author_name: ' + str(comment.author))
        print('author_id: ' + str(comment.author.id))
        print('upvotes: ' + str(comment.score))
        print('content: ' + str(comment.body))
        print('________________________________________________ \n')

### Classification (applying the ML model)

We are defining a function that loads the trained ML model, applies it to Reddit data to classify it as misinformation or truthful content and saves the streamed data and the classification label to a CSV file.

Note: the "classify_headline" function initially applied the ML model to the wrong column ("selftext" instead of "title"), so the labels for the submissions streamed in real time were wrong. We noticed and corrected this mistake later, repeated the classifications and replaced the affected CSV files.

In [89]:
def classify_headline(data, model = "Twitter"):
    # Load library
    import skops.io as sio

    # Load persisted model
    if model == "Twitter":
        trained_model = sio.load(file = "models/statement_rf_model.skops", trusted = True)
    elif model == "Reddit":
        nb_model = sio.load(file = "models/fakeddit_nb_model.skops", trusted = True)
        nn_model = sio.load(file = "models/fakeddit_nn_model.skops", trusted = True)

    if model == "Twitter":
        # Vectorize content
        nlp = spacy_sentence_bert.load_model('en_stsb_distilbert_base')
        title = nlp(data["title"][0]).vector.reshape(1, -1)
    elif model == "Reddit":
        title = [data["title"]]

    # Classify data
    if model == "Twitter":
        label = trained_model.predict(title)
    elif model == "Reddit":
        reddit_nb_label = nb_model.predict(title)
        reddit_nn_label = nn_model.predict(title)

    # Show message
    # if(label == 1):
    #     print("Nothing detected by the model. \n")
    # else:
    #     print("Misinformation detected! 👎🏼 \n")

    # Add label to dataframe
    if model == "Twitter":
        data["label"] = label
    elif model == "Reddit":
        data["reddit_nb_label"] = reddit_nb_label[0]
        data["reddit_nn_label"] = reddit_nn_label[0]

    # Return dataframe
    return data

In [93]:
# Code used to insert additional labels based on the model trained on Fakeddit data
# We also used a variation of this code to fix the wrong labels in the "submissions_classified_[...].csv" and "submissions_detailed_[...].csv" files
# Export file names are only temporary to not overwrite input files

# data = pd.read_csv("results/submissions_detailed_r_science.csv")

# Reddit code

# data["reddit_nb_label"] = pd.Series(dtype = "int64")
# data["reddit_nn_label"] = pd.Series(dtype = "int64")

# for index, row in data.iterrows():
#     data.loc[index] = classify_headline(row, model = "Reddit")

# data["reddit_nb_label"] = data["reddit_nb_label"].astype(int)
# data["reddit_nn_label"] = data["reddit_nn_label"].astype(int)

# data.to_csv("results/submissions_detailed_FULL_r_science.csv", index = None)

# Code to fix erroneous labels

# for index, row in data.iterrows():
#     data.loc[index] = classify_headline(row, model = "Reddit")

# data.to_csv("results/submissions_detailed_CORRECTED_r_news.csv", index = None)

In [None]:
def classify_text(data):
    # Load library
    import skops.io as sio

    # Load persisted model
    trained_model = sio.load(file = "models/tweet_nn_model.skops", trusted = True)

    # Vectorize content
    nlp = spacy_sentence_bert.load_model('en_stsb_distilbert_base')
    content = nlp(data["content"][0]).vector.reshape(1, -1)

    # Classify data
    label = trained_model.predict(content)

    # Show message
    # if(label == 1):
    #     print("Nothing detected by the model. \n")
    # else:
    #     print("Misinformation detected! 👎🏼 \n")

    # Add label to dataframe
    data["label"] = label

    # Return dataframe
    return data
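
Both classification functions above load the persisted model (and the sentence encoder) again on every call, which adds up when thousands of comments are streamed. A possible improvement is to cache the loaded object so that each file is only read once. The sketch below shows the idea with `functools.lru_cache`. The names `make_cached_loader` and `fake_loader` are hypothetical, and the stand-in loader only mimics what a call such as `sio.load(file = path, trusted = True)` would do.

```python
from functools import lru_cache


def make_cached_loader(loader):
    """Wrap a loader function so that each path is loaded only once."""
    @lru_cache(maxsize=None)
    def load_once(path):
        return loader(path)
    return load_once


# Stand-in loader for illustration; in the notebook this could wrap
# skops' sio.load(file = path, trusted = True) instead.
load_calls = []


def fake_loader(path):
    load_calls.append(path)
    return {"model_path": path}


load_model = make_cached_loader(fake_loader)

first = load_model("models/tweet_nn_model.skops")
second = load_model("models/tweet_nn_model.skops")

# The second call returns the cached object without loading again
print(first is second, len(load_calls))
```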

### Streaming and real-time classification

In [None]:
def stream_and_classify_submissions(subreddit):
    # Load library
    import os

    # Set current time as start time of the stream
    stream_start_unix_timestamp = time()
    stream_start_timestamp = datetime.now().strftime("_%Y_%m_%d_%H_%M_%S")
    
    # Loop through every new submission in the subreddit
    for submission in reddit.subreddit(subreddit).stream.submissions():
        if submission.created_utc > stream_start_unix_timestamp:
            try:
                # Get details
                id = str(submission.id)
                unix_timestamp = submission.created_utc
                subreddit_name = str(submission.subreddit.display_name)
                author_name = str(submission.author)
                author_id = str(submission.author.id)
                upvotes = submission.score
                title = str(submission.title)
                url = str(submission.url)
                content = str(submission.selftext.splitlines())

                # Print details
                # print('id: ' + id)
                # print('unix_timestamp: ' + str(unix_timestamp))
                # print('subreddit_name: ' + subreddit_name)
                # print('author_name: ' + author_name)
                # print('author_id: ' + author_id)
                # print('upvotes: ' + str(upvotes))
                # print('title: ' + title)
                # print('url: ' + url)
                # print('content: ' + content)
                # print('________________________________________________ \n')

                # Create dataframe
                data = pd.DataFrame([[id, unix_timestamp, subreddit_name, author_name, author_id, upvotes, title, url, content]],
                                    columns=["id", "unix_timestamp", "subreddit_name", "author_name", "author_id", "upvotes", "title", "url", "content"])

                # Classify data
                classified_data = classify_headline(data)

                # Save data: if a CSV exists already, append data without header, otherwise create new CSV with header
                file_name = "results/submissions_classified_" + "r_" + subreddit + stream_start_timestamp + ".csv"
                classified_data.to_csv(file_name, mode="a", index=False, header=not os.path.isfile(file_name))
            except Exception:
                # Skip items that cannot be processed, but let KeyboardInterrupt stop the stream
                continue

In [None]:
def stream_and_classify_comments(subreddit):
    # Load library
    import os
    
    # Set current time as start time of the stream
    stream_start_unix_timestamp = time()
    stream_start_timestamp = datetime.now().strftime("_%Y_%m_%d_%H_%M_%S")

    # Loop through every new comment in the subreddit
    for comment in reddit.subreddit(subreddit).stream.comments():
        if comment.created_utc > stream_start_unix_timestamp:
            try:
                # Get details
                id = str(comment.id)
                unix_timestamp = comment.created_utc
                subreddit_name = str(comment.subreddit.display_name)
                submission_id = str(comment.submission.id)
                author_name = str(comment.author)
                author_id = str(comment.author.id)
                upvotes = comment.score
                content = str(comment.body.splitlines())

                # Print details
                # print('id: ' + id)
                # print('unix_timestamp: ' + str(unix_timestamp))
                # print('subreddit_name: ' + subreddit_name)
                # print('submission_id: ' + submission_id)
                # print('author_name: ' + author_name)
                # print('author_id: ' + author_id)
                # print('upvotes: ' + str(upvotes))
                # print('content: ' + content)
                # print('________________________________________________ \n')

                # Create dataframe
                data = pd.DataFrame([[id, unix_timestamp, subreddit_name, submission_id, author_name, author_id, upvotes, content]],
                                    columns=["id", "unix_timestamp", "subreddit_name", "submission_id", "author_name", "author_id", "upvotes", "content"])

                # Classify data
                classified_data = classify_text(data)

                # Save data: if a CSV exists already, append data without header, otherwise create new CSV with header
                file_name = "results/comments_classified_" + "r_" + subreddit + stream_start_timestamp + ".csv"
                classified_data.to_csv(file_name, mode="a", index=False, header=not os.path.isfile(file_name))
            except Exception:
                # Skip items that cannot be processed, but let KeyboardInterrupt stop the stream
                continue

To reduce negative effects of any interruptions, each subreddit is streamed in a different notebook and the data is saved to unique CSVs with timestamps!
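
One failure mode that the `try` block in the functions above silently skips is a missing author: assuming PRAW returns `None` as the author of content whose account was deleted, `submission.author.id` raises an `AttributeError` and the whole item is dropped. The following sketch shows how the author fields could be read defensively instead. `get_author_fields` and `FakeAuthor` are hypothetical names, and the stand-in class only imitates the two properties of a PRAW author object that matter here.

```python
def get_author_fields(author):
    """Return (author_name, author_id) without raising if the author is missing."""
    if author is None:
        # Same string that str(submission.author) produces for a missing author
        return "None", None
    try:
        return str(author), str(author.id)
    except AttributeError:
        # Assumption: for some accounts the id cannot be resolved
        return str(author), None


# Stand-in for a PRAW author object, for illustration only
class FakeAuthor:
    def __init__(self, name, id):
        self.name = name
        self.id = id

    def __str__(self):
        return self.name


print(get_author_fields(FakeAuthor("example_user", "abc123")))
print(get_author_fields(None))
```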

## Analysis
### Getting details about streamed content by ID 
To check if the streamed content had been deleted or not, we had to retrieve all available details for each submission or comment. This was done exactly 1 week after the content had been streamed and classified. Using PRAW's info() function is ideal, since it requests the data in batches of 100 and not one by one. The function requires a list of IDs with the correct prefix (for submission or comment) added.
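
The prefixing step and the effect of batching can be sketched as follows. The helper names and the example IDs are hypothetical. The batch size of 100 is the one mentioned above, and 10,943 is the number of r/news comments in our results.

```python
import math


def to_fullnames(ids, prefix):
    """Add the type prefix ('t3_' for submissions, 't1_' for comments) where it is missing."""
    return [i if i.startswith(prefix) else prefix + i for i in ids]


def num_batches(num_ids, batch_size=100):
    """Number of batches needed when IDs are requested in groups of batch_size."""
    return math.ceil(num_ids / batch_size)


# Hypothetical submission IDs, one of them already prefixed
print(to_fullnames(["abc123", "t3_def456"], "t3_"))

# Batches for 10,943 comments instead of 10,943 single lookups
print(num_batches(10943))
```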

In [None]:
# Pick ONE import and export only from the options below - either submissions or comments from one subreddit!
"""
### Submissions only ###

# Get submission IDs and labels
# streamed_data = pd.read_csv("results/submissions_classified_r_news.csv")
# streamed_data = pd.read_csv("results/submissions_classified_r_politics.csv")
# streamed_data = pd.read_csv("results/submissions_classified_r_science.csv")
streamed_data = pd.read_csv("results/submissions_classified_r_worldnews.csv")
streamed_data = streamed_data[["id", "label"]]

# First submission ID
first_full_id = [i if i.startswith('t3_') else f't3_{i}' for i in streamed_data.loc[:0,"id"]]

# Rest of the submission IDs
rest_full_ids = [i if i.startswith('t3_') else f't3_{i}' for i in streamed_data.loc[1:,"id"]]
"""

### Comments only ###

# Get comment IDs and labels
# streamed_data = pd.read_csv("results/comments_classified_r_news.csv")
# streamed_data = pd.read_csv("results/comments_classified_r_politics.csv")
# streamed_data = pd.read_csv("results/comments_classified_r_science.csv")
streamed_data = pd.read_csv("results/comments_classified_r_worldnews.csv")
streamed_data = streamed_data[["id", "label"]]

# First comment ID
first_full_id = [i if i.startswith('t1_') else f't1_{i}' for i in streamed_data.loc[:0,"id"]]

# Rest of the comment IDs
rest_full_ids = [i if i.startswith('t1_') else f't1_{i}' for i in streamed_data.loc[1:,"id"]]

### Submissions or comments ###

# Get data about the first ID
for submission_or_comment in reddit.info(fullnames = first_full_id):
	data = vars(submission_or_comment)
	df = pd.json_normalize(data)

# Get data about the rest of the IDs
for submission_or_comment in reddit.info(fullnames = rest_full_ids):
	data = vars(submission_or_comment)
	df_row = pd.json_normalize(data)
	df = pd.concat([df, df_row], join = "inner", ignore_index = True)

# Remove linebreaks from text
if "selftext" in df.columns:
	df["selftext"] = df["selftext"].apply(lambda row: str(row.splitlines()).strip("[]") if row != None else row)
	df["selftext_html"] = df["selftext_html"].apply(lambda row: str(row.splitlines()).strip("[]") if row != None else row)
elif "body" in df.columns:
	df["body"] = df["body"].apply(lambda row: str(row.splitlines()).strip("[]") if row != None else row)
	df["body_html"] = df["body_html"].apply(lambda row: str(row.splitlines()).strip("[]") if row != None else row)

# Merge with streamed data to add classification label
df = pd.merge(df, streamed_data, on = "id")

# Move ID column to the first position
df.insert(0, 'id', df.pop('id'))

# Export as CSV
# df.to_csv("results/submissions_detailed_r_news.csv", index = False)
# df.to_csv("results/submissions_detailed_r_politics.csv", index = False)
# df.to_csv("results/submissions_detailed_r_science.csv", index = False)
# df.to_csv("results/submissions_detailed_r_worldnews.csv", index = False)
# df.to_csv("results/comments_detailed_r_news.csv", index = False)
# df.to_csv("results/comments_detailed_r_politics.csv", index = False)
# df.to_csv("results/comments_detailed_r_science.csv", index = False)
# df.to_csv("results/comments_detailed_r_worldnews.csv", index = False)

# Display the entire content of the dataframe
# from IPython.display import display, HTML
# display(HTML(df.to_html(col_space = 300)))

# Print
display(df)
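
A side effect of the linebreak removal in the cell above is worth spelling out, because it explains why later comparisons use `"'[removed]'"` (with extra single quotes) and not `"[removed]"`. The sketch below repeats the same transformation on plain strings. The function name `flatten_text` is our own.

```python
def flatten_text(text):
    """Same transformation as in the cell above: split into lines, then print the list without brackets."""
    return str(text.splitlines()).strip("[]") if text is not None else text


# The list representation keeps the quotes around each line, and strip("[]")
# also removes the brackets that belong to the list itself
print(flatten_text("[removed]"))
print(flatten_text("first line\nsecond line"))
```

The first call yields the string `'[removed]'` including the quotes, which is exactly the value the categorization functions have to match.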

### Removal indicators and general notes

<b>General:</b>

"[removed]" indicates that the submission/comment was removed by a moderator of the subreddit (because it broke the rules). <br>
"[deleted]" indicates that the submission/comment was deleted by its author. <br>

<b>Comments:</b>

If the author is undefined, but the comment is still online, the user account was removed or deleted!

Fields that signal removal / deletion of the comment:
* "body": "[removed]" / "[deleted]"

Fields that signal controversiality of the comment:
* "score": negative integer

<b>Submissions:</b>

If the author is undefined, the user account was deleted or removed!

Fields that signal removal / deletion of the submission:
* "selftext": "[removed]" / "[deleted]"
* "removed_by_category": not "nan"
* "is_robot_indexable": False

Fields that signal controversiality of the submission:
* "upvote_ratio": low value of float

Check if the number of upvotes is preserved even when content is taken down!

<b>Notes:</b>

The absolute number of downvotes is not provided anymore even though the field "downs" still exists. (Confirmation: https://github.com/praw-dev/praw/issues/881). For submissions, the number of upvotes is given in the "score" attribute and the share of upvotes among all votes is given in the "upvote_ratio" column. For comments, the difference between the number of upvotes and the number of downvotes is given in the "score" column.

<b>Important:</b> Some existing columns in Reddit API data are unused, others are used differently (contain different data) for submissions and comments!

It's not possible to retrieve deleted or removed comments via the submission they responded to as only the IDs of comments that are still visible are included in the submission metadata. (Tested on this submission: https://www.reddit.com/r/redditdev/comments/geybhw/praw_is_there_any_way_to_determine_if_a_link_post/)

Unfortunately, we cannot include posts being marked NSFW / "over_18" by moderators as misinformation indicators. While it is possible that moderators mark content as NSFW after it was posted, users can also do this for their own content at the time of posting. As we did not save the "over_18" tag during the data streaming, it's now impossible for us to tell which of the two cases is applicable.

To retrieve content via its ID using the info() function, submission IDs must be prefixed with "t3_" and comment IDs with "t1_". (Source: https://www.reddit.com/dev/api/)

<b>Reddit Blackout Notes:</b>

On 2023-06-17, when fetching the detailed data for the streamed content one week after streaming:
* r/news, r/politics, r/worldnews were available and active.
* r/science was visible again after the protests starting on 2023-06-12, but was still restricted (= no new content allowed).

Below are functions to categorize and analyze the submissions and comments.
Note: the value "truth" for the prediction target below is a simplification, "not_clearly_misinformation" would be more accurate but also more inconvenient.

In [26]:
def categorize_submissions(df):
    # Submission was removed / deleted
    if (df["is_robot_indexable"] == False) or (df["removed_by_category"] != "nan"):
        # Submission was removed by a moderator
        if (df["removed_by_category"] == "moderator") or (df["removed_by_category"] == "reddit") or (df["removed_by_category"] == "deleted" and df["selftext"] == "'[removed]'"):
            return "removed_by_moderator"
        # Submission was deleted by the user
        elif (df["removed_by_category"] == "deleted" and df["selftext"] == "'[deleted]'"):
            return "deleted_by_user"
    # Submission was not removed / deleted
    else:
        return "online"
    

def categorize_comments(df):
    # Comment was removed by a moderator
    if df["body"] == "'[removed]'":
        return "removed_by_moderator"
    # Comment was deleted by the user
    elif df["body"] == "'[deleted]'":
        return "deleted_by_user"
    # Comment was not removed / deleted
    else:
        return "online"


def analyze_submissions(path, prediction_target, model = "Twitter"):
    # Import CSV file
    df = pd.read_csv(path, low_memory = False)

    # Convert column types
    df["author"] = df["author"].apply(str)
    df["removed_by_category"] = df["removed_by_category"].apply(str)

    # Add column on removal status
    df['removal_status'] = df.apply(categorize_submissions, axis = 1)

    # Add column on online status
    df.loc[df["removal_status"] == "removed_by_moderator", "online"] = 0
    df.loc[df["removal_status"] == "deleted_by_user", "online"] = 0
    df.loc[df["removal_status"] == "online", "online"] = 1
    df["online"] = df["online"].astype(int)

    if model == "Twitter":
        y_prediction = df["label"]
    elif model == "Reddit NB":
        y_prediction = df["reddit_nb_label"]
    elif model == "Reddit NN":
        y_prediction = df["reddit_nn_label"]
        
    y_actual = df["online"]

    if prediction_target == "misinformation":
        target_label = 0
    elif prediction_target == "truth":
        target_label = 1

    # Calculate and print scores
    print(f'Accuracy: {round(accuracy_score(y_actual, y_prediction) * 100, 2)}%')
    print(f'Precision: {round(precision_score(y_actual, y_prediction, pos_label = target_label) * 100, 2)}%')
    print(f'Recall: {round(recall_score(y_actual, y_prediction, pos_label = target_label) * 100, 2)}%')
    print(f'F1 Score: {round(f1_score(y_actual, y_prediction, pos_label = target_label) * 100, 2)}%')

    confusion_matrix_lr = pd.crosstab(y_prediction, y_actual)
    display(confusion_matrix_lr)

    # Show the unique values of every column
    # for col in df:
    #     print(col + ": " + str(df[col].unique()))

    # return df[["id", "author", "selftext", "score", "upvote_ratio", "removal_status", "online", "label"]]
     

def analyze_comments(path, prediction_target):
    # Import CSV file
    df = pd.read_csv(path, low_memory = False)

    # Convert column type
    df["author"] = df["author"].apply(str)

    # Add column on removal status
    df['removal_status'] = df.apply(categorize_comments, axis = 1)

    # Add column on online status
    df.loc[df["removal_status"] == "removed_by_moderator", "online"] = 0
    df.loc[df["removal_status"] == "deleted_by_user", "online"] = 0
    df.loc[df["removal_status"] == "online", "online"] = 1
    df["online"] = df["online"].astype(int)

    y_prediction = df["label"]
    y_actual = df["online"]

    if prediction_target == "misinformation":
        target_label = 0
    elif prediction_target == "truth":
        target_label = 1

    # Calculate and print scores
    print(f'Accuracy: {round(accuracy_score(y_actual, y_prediction) * 100, 2)}%')
    print(f'Precision: {round(precision_score(y_actual, y_prediction, pos_label = target_label) * 100, 2)}%')
    print(f'Recall: {round(recall_score(y_actual, y_prediction, pos_label = target_label) * 100, 2)}%')
    print(f'F1 Score: {round(f1_score(y_actual, y_prediction, pos_label = target_label) * 100, 2)}%')

    confusion_matrix_lr = pd.crosstab(y_prediction, y_actual)
    display(confusion_matrix_lr)

    # Show the unique values of every column
    # for col in df:
    #     print(col + ": " + str(df[col].unique()))

    # return df[["id", "author", "body", "score", "removal_status", "online", "label"]]

### Results for submissions
Metrics and the confusion matrix are shown for the submission results of each of the 4 subreddits separately, with three different results for each:
* Results for the Random Forest model trained on Truthseeker that was applied in real time
* Results for the Naive Bayes model trained on Fakeddit that was used afterwards
* Results for the Neural Network model trained on Fakeddit that was used afterwards
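
As a reading aid for the output below: in each confusion matrix, the rows are the predicted labels and the columns are the observed online status, and with "misinformation" as prediction target the positive class is 0. The sketch below recomputes the four metrics by hand from the counts of the first r/news matrix (Truthseeker model). The function name is hypothetical; the formulas are the standard definitions.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# r/news, Truthseeker model, positive class = 0 (predicted misinformation / offline)
# tp: label 0 and online 0, fp: label 0 and online 1
# fn: label 1 and online 0, tn: label 1 and online 1
scores = metrics_from_counts(tp=6, fp=5, fn=13, tn=18)

for name, value in scores.items():
    print(f"{name}: {round(value * 100, 2)}%")
```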

#### r/news:

In [108]:
analyze_submissions("results/submissions_detailed_r_news.csv", "misinformation")
analyze_submissions("results/submissions_detailed_r_news.csv", "misinformation", model = "Reddit NB")
analyze_submissions("results/submissions_detailed_r_news.csv", "misinformation", model = "Reddit NN")

Accuracy: 57.14%
Precision: 54.55%
Recall: 31.58%
F1 Score: 40.0%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 6 | 5 |
| 1 | 13 | 18 |


Accuracy: 57.14%
Precision: 100.0%
Recall: 5.26%
F1 Score: 10.0%


| reddit_nb_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 1 | 0 |
| 1 | 18 | 23 |


Accuracy: 52.38%
Precision: 33.33%
Recall: 5.26%
F1 Score: 9.09%


| reddit_nn_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 1 | 2 |
| 1 | 18 | 21 |


#### r/politics:

In [112]:
analyze_submissions("results/submissions_detailed_r_politics.csv", "misinformation")
analyze_submissions("results/submissions_detailed_r_politics.csv", "misinformation", model = "Reddit NB")
analyze_submissions("results/submissions_detailed_r_politics.csv", "misinformation", model = "Reddit NN")

Accuracy: 49.02%
Precision: 65.17%
Recall: 36.94%
F1 Score: 47.15%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 58 | 31 |
| 1 | 99 | 67 |


Accuracy: 45.1%
Precision: 68.89%
Recall: 19.75%
F1 Score: 30.69%


| reddit_nb_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 31 | 14 |
| 1 | 126 | 84 |


Accuracy: 40.39%
Precision: 64.71%
Recall: 7.01%
F1 Score: 12.64%


| reddit_nn_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 11 | 6 |
| 1 | 146 | 92 |


#### r/science:

In [114]:
analyze_submissions("results/submissions_detailed_r_science.csv", "misinformation")
analyze_submissions("results/submissions_detailed_r_science.csv", "misinformation", model = "Reddit NB")
analyze_submissions("results/submissions_detailed_r_science.csv", "misinformation", model = "Reddit NN")

Accuracy: 62.07%
Precision: 47.06%
Recall: 80.0%
F1 Score: 59.26%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 8 | 9 |
| 1 | 2 | 10 |


Accuracy: 55.17%
Precision: 28.57%
Recall: 20.0%
F1 Score: 23.53%


| reddit_nb_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 2 | 5 |
| 1 | 8 | 14 |


Accuracy: 58.62%
Precision: 0.0%
Recall: 0.0%
F1 Score: 0.0%


| reddit_nn_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 0 | 2 |
| 1 | 10 | 17 |


#### r/worldnews:

In [116]:
analyze_submissions("results/submissions_detailed_r_worldnews.csv", "misinformation")
analyze_submissions("results/submissions_detailed_r_worldnews.csv", "misinformation", model = "Reddit NB")
analyze_submissions("results/submissions_detailed_r_worldnews.csv", "misinformation", model = "Reddit NN")

Accuracy: 48.19%
Precision: 42.47%
Recall: 34.83%
F1 Score: 38.27%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 31 | 42 |
| 1 | 58 | 62 |


Accuracy: 54.92%
Precision: 66.67%
Recall: 4.49%
F1 Score: 8.42%


| reddit_nb_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 4 | 2 |
| 1 | 85 | 102 |


Accuracy: 55.96%
Precision: 83.33%
Recall: 5.62%
F1 Score: 10.53%


| reddit_nn_label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 5 | 1 |
| 1 | 84 | 103 |


### Results for comments
Metrics and the confusion matrix are shown for the comments results of each of the 4 subreddits. The results are based on the Neural Network model trained on Truthseeker that was applied in real time.

In [122]:
analyze_comments("results/comments_detailed_r_news.csv", "misinformation")
analyze_comments("results/comments_detailed_r_politics.csv", "misinformation")
analyze_comments("results/comments_detailed_r_science.csv", "misinformation")
analyze_comments("results/comments_detailed_r_worldnews.csv", "misinformation")

Accuracy: 56.4%
Precision: 5.9%
Recall: 41.6%
F1 Score: 10.34%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 275 | 4385 |
| 1 | 386 | 5897 |


Accuracy: 55.52%
Precision: 5.46%
Recall: 48.47%
F1 Score: 9.82%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 379 | 6558 |
| 1 | 403 | 8310 |


Accuracy: 57.08%
Precision: 31.79%
Recall: 45.14%
F1 Score: 37.3%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 274 | 588 |
| 1 | 333 | 951 |


Accuracy: 56.6%
Precision: 6.72%
Recall: 45.49%
F1 Score: 11.71%


| label (rows) / online (columns) | 0 | 1 |
| --- | --- | --- |
| 0 | 429 | 5956 |
| 1 | 514 | 8008 |


## Discussion

The predictions in the "label" attribute and the facts captured by the "online" attribute do not describe the same thing: they are not predicted and actual values of one and the same clearly defined target variable.

The <b>ML model</b> predicts whether or not a piece of content can be classified as misinformation (and <b>not</b> whether such content, when posted on Reddit, is likely to be removed).

The <b>real data</b> indicates whether or not a specific Reddit submission or comment was removed from the platform (for <b>any</b> reason, not just for being misinformation).

The Truthseeker training dataset was quantitatively very balanced (approximately 50:50), and the predictions made by the trained model on the stream were also fairly balanced (though not 50:50). Combined with the fact that far fewer comments were actually removed than flagged, this means that there is an enormous number of false positives (content falsely classified as misinformation).
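To make the scale of this effect concrete, the following sketch uses the counts from the r/politics comments confusion matrix shown in the results above. It computes the best precision the model could have reached if every removed comment had been among the flagged ones. The variable names are ours and chosen purely for illustration.

```python
# Counts taken from the r/politics comments confusion matrix
# (label 0 = predicted misinformation, online 0 = removed after one week).
flagged_removed = 379    # label 0, online 0
flagged_online = 6558    # label 0, online 1
passed_removed = 403     # label 1, online 0
passed_online = 8310     # label 1, online 1

total = flagged_removed + flagged_online + passed_removed + passed_online
removed = flagged_removed + passed_removed
flagged = flagged_removed + flagged_online

removal_rate = removed / total
flag_rate = flagged / total

# Even a model that caught every removed comment could not exceed this
# precision while flagging the same number of comments.
best_case_precision = min(removed, flagged) / flagged
actual_precision = flagged_removed / flagged

print(f"Removal rate: {removal_rate:.2%}")
print(f"Flag rate: {flag_rate:.2%}")
print(f"Best-case precision: {best_case_precision:.2%}")
print(f"Actual precision: {actual_precision:.2%}")
```

With these counts, roughly 5% of the comments were removed while about 44% were flagged, so the precision for misinformation could not have exceeded about 11% regardless of which comments were flagged.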

Across submissions and comments, accuracy (the share of data classified correctly overall) is roughly between 50% and 60%, with recall (the share of misinformation / deleted content that was predicted correctly) typically below 50%. Considerations that would otherwise be worth making, e.g. discussing the tradeoff between recall and precision and the related question of finding as many target cases as possible vs. "overshooting the target", make little sense here given the overall very low level of the results.

To do a sanity check, we trained an additional model after looking at the initial results, based on "Fakeddit" (see notebook "ml_model_training_fakeddit.ipynb"). The Fakeddit training dataset was more unbalanced (after removing all image-related content) and thus resembled our streamed Reddit data more closely. However, while it showed superior performance (compared to the Truthseeker model) at detecting misinformation in its own testing dataset, the outcomes for our streamed Reddit submissions are much worse. This is a bit surprising, given that models trained on the Fakeddit data should in theory be more suitable for making predictions for our exact "real data" (see above), i.e. whether or not Reddit content gets removed.

There might (must!) be aspects of our pipeline that can be improved to create a better model. Even allowing for the difference in nature between the training data and our real data, the Truthseeker results were disappointing, at least at first; most of the Fakeddit results then made them look good in comparison.

<b>Main learnings:</b>

1. Training an ML model to detect fake news / misinformation is complex. Better training data, better language preprocessing and better model fine-tuning would be needed.
2. Tricking yourself or misleading the audience by not double-checking (or by intentionally changing) the target of the evaluation metrics is really easy. In an unbalanced dataset with far more "true" than "fake" content, which is the case for real data like our Reddit stream data, setting the target to "true" (the default pos_label of sklearn.metrics functions such as precision_score is 1!) can easily yield 90% for some metrics even if hardly any misinformation is detected at all.
3. Published datasets for ML training are not always what they appear to be at first sight. When reading through the associated papers (where available) and working through the dataset with pandas, we quickly realized that there are many assumptions and simplifications involved, that, while understandable, are certainly not helping the outcomes. There is no such thing as a "gold standard", general purpose, do-it-all misinformation dataset, no matter how many authors claim exactly that. The Fakeddit dataset we used mixes data about satire, fake news and manipulated image content, but we were unable to achieve good results using the satire and fake news parts of the dataset only. One of the reasons might be that there are no real "fake" or "removed by moderator" cases at all in the data, since the assumptions are basically: every Reddit post that is online after a year is "true", except for satire-related subreddits, where every post is "fake".
4. Matching the training data more closely to the real data / use case, while necessary, might not be sufficient, as we suspect that the quantity of data would also need to be increased massively (see LLMs).

Somewhat in line with the general topic of fake news and misinformation, here's an example for point 2 above, showing the actual results and a possible summary / headline someone could create out of them:

In [125]:
analyze_comments("results/comments_detailed_r_politics.csv", "truth")

Accuracy: 55.52%
Precision: 95.37%
Recall: 55.89%
F1 Score: 70.48%


| label | online = 0 | online = 1 |
| --- | --- | --- |
| 0 | 379 | 6558 |
| 1 | 403 | 8310 |


<b>"We trained a Fake News detection Machine Learning model that has a precision of 95%"</b>

While technically not a complete lie, the precision refers to the detection of "true" data (or rather, content that did not get deleted after 1 week) and not misinformation or "fake" data (content that got deleted). At the same time, far more "true" data (14,868 instances) was present than "fake" data (782 instances), and of the latter only 379 instances were correctly detected. The high precision is purely based on the low number of false positives (403) and is a given: if there are far fewer of the non-targeted cases, then there can't be many false positives either!

$$ \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} $$
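The effect of switching the target class can be reproduced by hand from the confusion matrix above. The sketch below is not the notebook's `analyze_comments` function; `metrics_for_target` is a hypothetical helper (our name) that recomputes precision, recall and F1 for either target.

```python
# Confusion matrix for the r/politics comments, keyed as (label, online).
# label: model prediction, online: status after one week (1 = still online).
counts = {(0, 0): 379, (0, 1): 6558, (1, 0): 403, (1, 1): 8310}


def metrics_for_target(counts, target):
    """Return (precision, recall, f1) when `target` (0 or 1) is the positive class."""
    other = 1 - target
    tp = counts[(target, target)]   # predicted target, actually target
    fp = counts[(target, other)]    # predicted target, actually the other class
    fn = counts[(other, target)]    # predicted other, actually target
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


for name, target in [("misinformation", 0), ("truth", 1)]:
    precision, recall, f1 = metrics_for_target(counts, target)
    print(f"{name}: precision {precision:.2%}, recall {recall:.2%}, F1 {f1:.2%}")
```

The same four counts give a precision of 5.46% when misinformation is the target and 95.37% when "truth" is the target, which matches the two notebook outputs for r/politics.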


## Statistics
Finally, we want to briefly explore some statistics to see if there are any differences in certain characteristics between removed and non-removed content. Below is the code we used to calculate means and medians for the scores and upvote ratios of content that was still online after a week and content that had been deleted by the user or removed by a moderator.

In [187]:
# Statistics for submissions

# Import and combine datasets
df1 = pd.read_csv("results/submissions_detailed_r_news.csv", low_memory = False)
df2 = pd.read_csv("results/submissions_detailed_r_politics.csv", low_memory = False)
df3 = pd.read_csv("results/submissions_detailed_r_science.csv", low_memory = False)
df4 = pd.read_csv("results/submissions_detailed_r_worldnews.csv", low_memory = False)
df = pd.concat([df1, df2, df3, df4]).reset_index(drop = True)

# Convert column types
df["author"] = df["author"].apply(str)
df["removed_by_category"] = df["removed_by_category"].apply(str)

# Add column on removal status
df['removal_status'] = df.apply(categorize_submissions, axis = 1)

# Keep only relevant columns
submissions_all = df[["id", "author", "selftext", "score", "upvote_ratio", "removal_status", "label"]]

# Calculate means and medians
# submissions_all.loc[submissions_all["removal_status"] == "online", "score"].mean()
# submissions_all.loc[submissions_all["removal_status"] == "online", "score"].median()
# submissions_all.loc[submissions_all["removal_status"] == "online", "upvote_ratio"].mean()
# submissions_all.loc[submissions_all["removal_status"] == "online", "upvote_ratio"].median()
# submissions_all.loc[submissions_all["removal_status"] == "deleted_by_user", "score"].mean()
# submissions_all.loc[submissions_all["removal_status"] == "deleted_by_user", "score"].median()
# submissions_all.loc[submissions_all["removal_status"] == "deleted_by_user", "upvote_ratio"].mean()
# submissions_all.loc[submissions_all["removal_status"] == "deleted_by_user", "upvote_ratio"].median()
# submissions_all.loc[submissions_all["removal_status"] == "removed_by_moderator", "score"].mean()
# submissions_all.loc[submissions_all["removal_status"] == "removed_by_moderator", "score"].median()
# submissions_all.loc[submissions_all["removal_status"] == "removed_by_moderator", "upvote_ratio"].mean()
# submissions_all.loc[submissions_all["removal_status"] == "removed_by_moderator", "upvote_ratio"].median()

In [None]:
# Statistics for comments

# Import and combine datasets
df1 = pd.read_csv("results/comments_detailed_r_news.csv", low_memory = False)
df2 = pd.read_csv("results/comments_detailed_r_politics.csv", low_memory = False)
df3 = pd.read_csv("results/comments_detailed_r_science.csv", low_memory = False)
df4 = pd.read_csv("results/comments_detailed_r_worldnews.csv", low_memory = False)
df = pd.concat([df1, df2, df3, df4]).reset_index(drop = True)

# Convert column type
df["author"] = df["author"].apply(str)

# Add column on removal status
df['removal_status'] = df.apply(categorize_comments, axis = 1)

# Keep only relevant columns
comments_all = df[["id", "author", "body", "score", "removal_status","label"]]

# Calculate means and medians
# comments_all.loc[comments_all["removal_status"] == "online", "score"].mean()
# comments_all.loc[comments_all["removal_status"] == "online", "score"].median()
# comments_all.loc[comments_all["removal_status"] == "deleted_by_user", "score"].mean()
# comments_all.loc[comments_all["removal_status"] == "deleted_by_user", "score"].median()
# comments_all.loc[comments_all["removal_status"] == "removed_by_moderator", "score"].mean()
# comments_all.loc[comments_all["removal_status"] == "removed_by_moderator", "score"].median()
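
The individual, commented-out calls above could also be condensed into a single grouped aggregation. The following is a minimal sketch on a small made-up DataFrame (the values are hypothetical and not taken from our results); on the real data, `toy` would be replaced by `submissions_all`.

```python
import pandas as pd

# Hypothetical toy data with the same column names as submissions_all.
toy = pd.DataFrame({
    "removal_status": ["online", "online", "online",
                       "deleted_by_user", "deleted_by_user",
                       "removed_by_moderator", "removed_by_moderator"],
    "score": [10, 300, 5000, 2, 40, 1, 90],
    "upvote_ratio": [0.95, 0.90, 0.80, 0.85, 0.75, 0.99, 0.70],
})

# Mean and median of both columns for every removal status in one call.
summary = toy.groupby("removal_status")[["score", "upvote_ratio"]].agg(["mean", "median"])
print(summary.round(2))
```

For `comments_all`, which has no `upvote_ratio` column, only `score` would be selected.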

### Submissions

#### Online

Average score: 2520

Median score: 337

Average upvote ratio: 0.87

Median upvote ratio: 0.93

#### Deleted by user

Average score: 517

Median score: 6

Average upvote ratio: 0.85

Median upvote ratio: 0.90

#### Removed by moderator

Average score: 494

Median score: 5

Average upvote ratio: 0.88

Median upvote ratio: 0.97

### Comments
Note: there is no upvote ratio for comments; the score value is already the difference between upvotes and downvotes.

#### Online

Average (net) score: 22

Median (net) score: 2

#### Deleted by user

Average (net) score: 12

Median (net) score: 1

#### Removed by moderator

Average (net) score: 16

Median (net) score: 1

### Insights

The difference in absolute score between deleted/removed submissions and those that remained online is very likely due to the additional time that the latter had to accumulate votes. The upvote ratio is very similar across all three types. For comments, the scores are also surprisingly similar, so content that ends up being deleted / removed cannot easily be predicted by looking for low scores (obviously, doing this would not be possible in real time at the time of posting anyway, since votes accumulate over time).

These statistics indicate that score and upvote ratio, at least in most cases, are likely not suitable to make predictions. However, this is one of the assumptions made by the authors and creators of the "Fakeddit" dataset! In their view, content goes through multiple stages of filtering on Reddit, community voting being one of them.

For submissions (but not so much for comments), the average score is much higher relative to the median for removed/deleted content (roughly 85 to 100 times the median) than for online content (roughly 7.5 times the median), indicating that some very large submissions must have been taken down with quite some delay, allowing enough time to accumulate many votes.
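
Why a few late removals are enough to produce such a gap can be illustrated with a small made-up example. The scores below are hypothetical and not taken from our data: most removed submissions have almost no votes, while a single one was taken down only after collecting thousands.

```python
from statistics import mean, median

# Hypothetical scores of ten removed submissions: nine were taken down
# quickly, one stayed up long enough to collect many votes.
removed_scores = [1, 2, 3, 4, 5, 5, 6, 8, 10, 4900]

avg = mean(removed_scores)
med = median(removed_scores)

print(f"Average score: {avg}")
print(f"Median score: {med}")
print(f"Average / median: {avg / med:.1f}")
```

A single outlier pulls the average up by two orders of magnitude while leaving the median untouched, which is the pattern seen for the deleted and removed submissions.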

### Assessment
Finally, it's time to revisit our initial project goals / research aspects:

#### Content moderation:
Based on the results, it is very clear that this specific model and method are not effective at filtering out misinformation and are not a suitable replacement for human content moderation.

#### Resource use:
We saw no performance issues at all when streaming and classifying submissions and comments of 4 very large subreddits in real time in 8 parallel Jupyter notebooks in Visual Studio Code, running locally on a MacBook Pro with an M1 Pro chip. With more resources, our approach should be feasible on a larger scale without issues, at least for Reddit, which has a lower volume of content than, for example, Twitter (as was evident in a Reddit/Twitter streaming project in the bachelor's studies).

#### Reddit insights:
Comparing content that remained online to content that was removed did not show any major statistical differences regarding score and upvote ratio. Manual inspection of both types of content showed that the model struggled to classify irony / sarcasm and other fine nuances.

#### Model applicability:
While the results based on a model trained on Twitter data were mediocre at best, the quick sanity check and comparison based on a model trained on Reddit data did not yield better results either (though we only applied it to submissions, which were very limited in number).