## Final Project: Politifact Analysis

**GROUP: Forhad Akbar, Adam Douglas, and Soumya Ghosh**

In [1]:
import nltk
import numpy as np
import pandas as pd

### Introduction
Politifact (www.politifact.com) is a non-partisan fact-checking website dedicated to countering false and misleading political information $^1$. Politifact uses human fact-checkers to research statements made by politicians and viral media that express a political viewpoint. The fact-checker (also known as a "curator") assigns each item a rating: True, Mostly True, Half True, Mostly False, False, or the dreaded "Pants On Fire" for the biggest falsehoods. Each fact-check also includes the specifics of the research, giving detail and sources that support the rating.
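Because the ratings form an ordered scale, it can help to encode them as an ordered categorical before analysis. The sketch below is one possible encoding, not part of the original notebook. It assumes the ratings are stored as slugs like those in the `fact` column of the dataset used later (`pants-fire`, `false`, `barely-true`, `half-true`, `true`); `mostly-true` is an assumed label, and `barely-true` is assumed to correspond to "Mostly False". The names `RATING_ORDER` and `to_ordered_rating` are hypothetical.

```python
import pandas as pd

# Assumed rating slugs, ordered from least to most truthful.
# Only "pants-fire", "false", "barely-true", "half-true" and "true" are
# visible in the sample rows; "mostly-true" is an assumption.
RATING_ORDER = ["pants-fire", "false", "barely-true",
                "half-true", "mostly-true", "true"]


def to_ordered_rating(fact: pd.Series) -> pd.Series:
    """Convert rating slugs to an ordered categorical.

    Labels that are not in RATING_ORDER become NaN.
    """
    dtype = pd.CategoricalDtype(categories=RATING_ORDER, ordered=True)
    return fact.astype(dtype)


# Ratings taken from the first five records of the dataset.
sample = pd.Series(["false", "pants-fire", "true", "barely-true", "half-true"])
ratings = to_ordered_rating(sample)

# Ordered categoricals support comparisons against a category label.
mostly_false_or_worse = ratings <= "barely-true"
print(mostly_false_or_worse.tolist())
```

With an ordered type, filters such as "Mostly False or worse" become a single comparison instead of a hand-maintained list of labels.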

### The Data
The data was sourced from the Kaggle web site $^2$ and contains 19,421 records. The items date from  to .
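The record count and date range can be checked directly against the data. The following is a minimal sketch, assuming the `sources_dates` column holds the date of each statement; the helper name `date_range` is hypothetical. The sample rows are copied from the first records of the file and only illustrate the call, so they do not represent the full range of the dataset.

```python
import pandas as pd


def date_range(df: pd.DataFrame, column: str = "sources_dates"):
    """Return (earliest, latest) timestamps in `column`.

    Values that cannot be parsed as dates are coerced to NaT and ignored.
    """
    dates = pd.to_datetime(df[column], errors="coerce")
    return dates.min(), dates.max()


# Illustrative rows only: the first three records of the file.
sample = pd.DataFrame({
    "sources_dates": [
        "2021-03-20 00:00:00",
        "2021-01-24 00:00:00",
        "2021-03-10 00:00:00",
    ]
})

earliest, latest = date_range(sample)
print(len(sample), earliest.date(), latest.date())
```

Running `date_range(pol)` and `len(pol)` on the loaded DataFrame would give the figures for the whole dataset.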

### Goal
The goal of this analysis is to apply NLP techniques to each curator's review in order to categorize the topic of the record (e.g. budget, coronavirus, election). Once a topic has been assigned to each record, a network analysis of sources will be used to look for patterns of false information by topic (e.g. politician X puts out misleading or false information on topic Y more frequently than any other).
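To make the two steps concrete, here is a rough sketch of the pipeline shape: tag each record with a topic, then count false ratings per source and topic. It is not the method used in this project. Simple keyword matching stands in for the NLP step, the keyword lists and the set of "false" ratings are assumptions, the function names are hypothetical, and the toy rows are invented for illustration rather than taken from the dataset.

```python
import pandas as pd

# Hypothetical keyword lists; a real analysis would more likely derive
# topics with a proper NLP method than with hand-picked words.
TOPIC_KEYWORDS = {
    "coronavirus": ["covid", "coronavirus", "vaccin"],
    "budget": ["budget", "spending", "trillion"],
    "election": ["election", "ballot", "voter"],
}

# Assumed set of ratings treated as "false" for this sketch.
FALSE_RATINGS = {"barely-true", "false", "pants-fire"}


def assign_topic(text: str) -> str:
    """Return the topic whose keywords occur most often, or 'other'."""
    lowered = text.lower()
    counts = {
        topic: sum(lowered.count(word) for word in words)
        for topic, words in TOPIC_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "other"


def false_counts_by_source_topic(df: pd.DataFrame) -> pd.DataFrame:
    """Cross-tabulate false-rated records by source and assigned topic."""
    tagged = df.assign(topic=df["curator_complete_article"].map(assign_topic))
    falsy = tagged[tagged["fact"].isin(FALSE_RATINGS)]
    return pd.crosstab(falsy["sources"], falsy["topic"])


# Invented toy rows, NOT records from the dataset.
toy = pd.DataFrame({
    "sources": ["Source A", "Source A", "Source B", "Source B"],
    "fact": ["false", "pants-fire", "true", "barely-true"],
    "curator_complete_article": [
        "The claim about the covid vaccine is wrong.",
        "Ballot counts in the election were not altered.",
        "The budget figure is accurate.",
        "Covid cases were misstated; the covid relief figure differs.",
    ],
})

table = false_counts_by_source_topic(toy)
print(table)
```

The resulting source-by-topic table is the kind of weighted edge list a network analysis could start from, with sources and topics as the two node types.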

### Analysis
#### Data Loading and Cleansing

First we will load the data from the CSV file and then prepare the data for analysis.

In [21]:
# Read the first 11 columns and use the first one as the row index
pol = pd.read_csv("politifact.csv", index_col=0, usecols=range(0, 11))
pol.head()

Unnamed: 0,sources,sources_dates,sources_post_location,sources_quote,curator_name,curated_date,fact,sources_url,curators_article_title,curator_complete_article
0,Viral image,2021-03-20 00:00:00,a Facebook post:\n,\nSays Disneyland is enforcing a “no scream” p...,Ciara O'Rourke,2021-03-22 00:00:00,false,https://www.politifact.com/factchecks/2021/mar...,"\nNo, Disneyland isn’t enforcing a ‘no scream’...",\nBack in July a Japanese amusement park drew ...
1,Viral image,2021-01-24 00:00:00,a Facebook post:\n,\nVideo “proves 100% that President Joe Biden ...,Ciara O'Rourke,2021-03-22 00:00:00,pants-fire,https://www.politifact.com/factchecks/2021/mar...,"\nNo, this video doesn’t prove Biden isn’t hum...",\nA recent Facebook post uses a video clip of ...
2,Terry McAuliffe,2021-03-10 00:00:00,a speech:\n,\n“If you look at the average teacher pay comp...,Warren Fiske,2021-03-22 00:00:00,true,https://www.politifact.com/factchecks/2021/mar...,\nVa. teachers pay ranks last in U.S. compared...,"\n""If you look at the average teacher pay comp..."
3,TikTok posts,2021-03-17 00:00:00,a video caption:\n,\n“You will need a WHO Yellow Vaccination Pass...,Daniel Funke,2021-03-22 00:00:00,barely-true,https://www.politifact.com/factchecks/2021/mar...,"\nNo, you don’t need a WHO vaccination certifi...",\nA popular TikTok video said that if you want...
4,Mike Bost,2021-03-15 00:00:00,a radio,\nThe American Rescue Plan Act “does not” incl...,Kiannah Sepeda-Miller,2021-03-21 00:00:00,half-true,https://www.politifact.com/factchecks/2021/mar...,\nDoes the American Rescue Plan limit how stat...,\nAfter failing to stop the $1.9 trillion COVI...


In [33]:
# Replace embedded newlines with spaces in the free-text columns
text_cols = ["sources_post_location", "sources_quote",
             "curators_article_title", "curator_complete_article"]
for col in text_cols:
    pol[col] = pol[col].str.replace("\n", " ", regex=False)

pol.head()

Unnamed: 0,sources,sources_dates,sources_post_location,sources_quote,curator_name,curated_date,fact,sources_url,curators_article_title,curator_complete_article
0,Viral image,2021-03-20 00:00:00,a Facebook post:,Says Disneyland is enforcing a “no scream” po...,Ciara O'Rourke,2021-03-22 00:00:00,false,https://www.politifact.com/factchecks/2021/mar...,"No, Disneyland isn’t enforcing a ‘no scream’ ...",Back in July a Japanese amusement park drew a...
1,Viral image,2021-01-24 00:00:00,a Facebook post:,Video “proves 100% that President Joe Biden i...,Ciara O'Rourke,2021-03-22 00:00:00,pants-fire,https://www.politifact.com/factchecks/2021/mar...,"No, this video doesn’t prove Biden isn’t human",A recent Facebook post uses a video clip of a...
2,Terry McAuliffe,2021-03-10 00:00:00,a speech:,“If you look at the average teacher pay compa...,Warren Fiske,2021-03-22 00:00:00,true,https://www.politifact.com/factchecks/2021/mar...,Va. teachers pay ranks last in U.S. compared ...,"""If you look at the average teacher pay compa..."
3,TikTok posts,2021-03-17 00:00:00,a video caption:,“You will need a WHO Yellow Vaccination Passp...,Daniel Funke,2021-03-22 00:00:00,barely-true,https://www.politifact.com/factchecks/2021/mar...,"No, you don’t need a WHO vaccination certific...",A popular TikTok video said that if you want ...
4,Mike Bost,2021-03-15 00:00:00,a radio,The American Rescue Plan Act “does not” inclu...,Kiannah Sepeda-Miller,2021-03-21 00:00:00,half-true,https://www.politifact.com/factchecks/2021/mar...,Does the American Rescue Plan limit how state...,After failing to stop the $1.9 trillion COVID...
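Replacing each newline with a space leaves a leading or trailing space wherever the newline sat at the start or end of a field. A possible follow-up, sketched here as an optional extra step rather than something the notebook does, is to collapse all whitespace runs and trim the ends. The helper name `tidy_text` is hypothetical; the sample strings are taken from the first records shown above.

```python
import pandas as pd


def tidy_text(col: pd.Series) -> pd.Series:
    """Collapse runs of whitespace (including newlines) into one space
    and strip leading and trailing whitespace."""
    return col.str.replace(r"\s+", " ", regex=True).str.strip()


# Raw values as they appear before cleaning.
sample = pd.Series([
    "a Facebook post:\n",
    "\nNo, this video doesn’t prove Biden isn’t human",
    "a speech:\n",
])

cleaned = tidy_text(sample)
print(cleaned.tolist())
```

Applied to the same four text columns, this would replace the newline step and the trimming in a single pass.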


### Works Cited
1 - https://www.politifact.com/who-pays-for-politifact/

2 - https://www.kaggle.com/shivkumarganesh/politifact-factcheck-data