# Creating a Baseline ML Model

## Imports

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

## Loading datasets

In [2]:
truth = pd.read_csv('../raw_data/True.csv')
fake = pd.read_csv('../raw_data/Fake.csv')

## Cleaning Data

In [3]:
# Dropping unnecessary columns

truth = truth.drop(columns=['title', 'subject', 'date'])
fake = fake.drop(columns=['title', 'subject', 'date'])

In [4]:
# Creating 'isfake' column (0 = truth, 1 = fake)

truth['isfake'] = 0
fake['isfake'] = 1

In [5]:
# Concatenating both DataFrames

news = pd.concat([truth, fake], axis=0, ignore_index=True)
news

| | text | isfake |
|---|---|---|
| 0 | WASHINGTON (Reuters) - The head of a conservat... | 0 |
| 1 | WASHINGTON (Reuters) - Transgender people will... | 0 |
| 2 | WASHINGTON (Reuters) - The special counsel inv... | 0 |
| 3 | WASHINGTON (Reuters) - Trump campaign adviser ... | 0 |
| 4 | SEATTLE/WASHINGTON (Reuters) - President Donal... | 0 |
| ... | ... | ... |
| 44893 | 21st Century Wire says As 21WIRE reported earl... | 1 |
| 44894 | 21st Century Wire says It s a familiar theme. ... | 1 |
| 44895 | Patrick Henningsen 21st Century WireRemember ... | 1 |
| 44896 | 21st Century Wire says Al Jazeera America will... | 1 |
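
Every `isfake = 0` row visible in the preview above opens with a `CITY (Reuters) -` dateline, while the `isfake = 1` rows do not. A classifier may learn that prefix instead of the article content. The sketch below shows one possible way to strip it. The helper `strip_dateline`, the 100-character limit, and the sample strings are assumptions for illustration and are not part of the original notebook.

```python
import re

# Hypothetical helper: matches an optional location, then "(Reuters) -",
# but only within the first ~100 characters of the text.
_DATELINE = re.compile(r'^.{0,100}?\(Reuters\)\s*-\s*')

def strip_dateline(text: str) -> str:
    """Remove a leading 'CITY (Reuters) - ' dateline if one is present."""
    return _DATELINE.sub('', text, count=1)

# Made-up sample strings modeled on the preview rows
samples = [
    "WASHINGTON (Reuters) - The head of a conservative group spoke.",
    "21st Century Wire says it is a familiar theme.",
]
cleaned = [strip_dateline(s) for s in samples]
print(cleaned)
```

On the real data this could be applied with `news['text'].map(strip_dateline)` before the train/test split. Whether it changes the scores reported later would have to be checked by rerunning the notebook.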


## Preprocessing Data

In [6]:
# Identify features and target

X = news.text
y = news.isfake

In [8]:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit TF-IDF on the training text only, then reuse the fitted vocabulary on the test text
vectorizer = TfidfVectorizer(max_features=5000)
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)
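
The vectorizer and the classifier can also be bundled into a single scikit-learn pipeline, so that raw text goes in and the fit/transform order is handled automatically. This is a minimal sketch on a made-up toy corpus (`toy_texts` and `toy_labels` are assumptions), not the notebook's own workflow.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus for illustration only (0 = truth, 1 = fake, as in the notebook)
toy_texts = [
    "senate passes budget bill after long debate",
    "officials confirm new trade agreement",
    "shocking secret the elites hide from you",
    "you will not believe this outrageous hoax",
]
toy_labels = [0, 0, 1, 1]

# make_pipeline names each step after its lowercased class name
baseline = make_pipeline(TfidfVectorizer(max_features=5000), MultinomialNB())
baseline.fit(toy_texts, toy_labels)

# The pipeline accepts raw strings; vectorization happens internally
preds = baseline.predict(["senate budget agreement"])
print(preds)
```

With the real data, `baseline.fit(X_train, y_train)` followed by `baseline.predict(X_test)` would replace the separate `fit_transform` / `transform` calls.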

## Train MultinomialNB Model

In [9]:
# Train the Multinomial Naive Bayes model
clf = MultinomialNB()
clf.fit(X_train_vectorized, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test_vectorized)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.4f}')

# Print classification report
print(classification_report(y_test, y_pred))

Accuracy: 0.9347
              precision    recall  f1-score   support

           0       0.93      0.93      0.93      4330
           1       0.93      0.94      0.94      4650

    accuracy                           0.93      8980
   macro avg       0.93      0.93      0.93      8980
weighted avg       0.93      0.93      0.93      8980
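
The classification report summarizes precision and recall per class. A confusion matrix shows the underlying counts directly. The sketch below uses made-up labels (`y_true_demo`, `y_pred_demo`) purely to show the call and how to unpack the result. In the notebook, `y_test` and `y_pred` would take their place.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration (0 = truth, 1 = fake)
y_true_demo = [0, 0, 0, 1, 1, 1, 1]
y_pred_demo = [0, 0, 1, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes, in the order given by `labels`
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1])

# For a binary problem, ravel() yields the counts in this order
tn, fp, fn, tp = cm.ravel()
print(f"true negatives={tn}, false positives={fp}, false negatives={fn}, true positives={tp}")
```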



## New Article

In [12]:
# New text to classify
new_text = '''Former President Donald J. Trump, who has been indicted by federal prosecutors for conspiracy to defraud the United States in connection with a plot to overturn the 2020 election, repeatedly claimed to supporters in Iowa on Saturday that it was President Biden who posed a severe threat to American democracy.

While Mr. Trump shattered democratic norms throughout his presidency and has faced voter concerns that he would do so again in a second term, the former president in his speech repeatedly accused Mr. Biden of corrupting politics and waging a repressive “all-out war” on America.

”Joe Biden is not the defender of American democracy,” he said. “Joe Biden is the destroyer of American democracy.”

Mr. Trump has made similar attacks on Mr. Biden a staple of his speeches in Iowa and elsewhere. He frequently accuses the president broadly of corruption and of weaponizing the Justice Department to influence the 2024 election.

But in his second of two Iowa speeches on Saturday, held at a community college gym in Cedar Rapids, Mr. Trump sharpened that line of attack, suggesting a more concerted effort by his campaign to defend against accusations that Mr. Trump has an anti-democratic bent — by going on offense.

Polls have shown that significant percentages of voters in both parties are concerned about threats to democracy. During the midterm elections, candidates who embraced Mr. Trump’s lie that the 2020 election was stolen from him were defeated, even in races in which voters did not rank “democracy” as a top concern.

Mr. Biden’s re-election campaign has frequently attacked Mr. Trump along those lines. In recent weeks, Biden aides and allies have called attention to news reports about plans being made by Mr. Trump and his allies that would undermine central elements of American democracy, governing and the rule of law.

Mr. Trump and his campaign have sought to dismiss such concerns as a concoction to scare voters. But on Saturday, they tried to turn the Biden campaign’s arguments back against the president.

At the Cedar Rapids event, aides and volunteers left placards with bold black-and-white lettering reading “Biden attacks democracy” on the seats and bleachers. At the start of Mr. Trump's speech, that message was broadcast on a screen above the stage.

Mr. Trump has a history of accusing his opponents of behavior that he himself is guilty of, the political equivalent of a “No, you are” playground retort. In a 2016 debate, when Hillary Clinton accused Mr. Trump of being a Russian puppet, Mr. Trump fired back with “You’re the puppet,” a comment he never explained.

Mr. Trump’s accusations against Mr. Biden, which he referenced repeatedly throughout his speech, veered toward the conspiratorial. He claimed the president and his allies were seeking to control Americans’ speech, their behavior on social media and their purchases of cars and dishwashers.

Without evidence, he accused Mr. Biden of being behind a nationwide effort to get Mr. Trump removed from the ballot in several states. And, as he has before, he claimed, again without evidence, that Mr. Biden was the mastermind behind the four criminal cases against him.

Here, too, Mr. Trump conjured a nefarious-sounding presidential conspiracy, one with dark ramifications for ordinary Americans, not just for the former president being prosecuted. Mr. Biden and his allies “think they can do whatever they want,” Mr. Trump said — “break any law, tell any lie, ruin any life, trash any norm, and get away with anything they want. Anything they want.”

Democrats suggested that the former president was projecting again.

“Donald Trump’s America in 2025 is one where the government is his personal weapon to lock up his political enemies,” Ammar Moussa, a spokesman for Mr. Biden’s re-election campaign, said in a statement. “You don’t have to take our word for it — Trump has admitted it himself.”

Even as he was insisting that Mr. Biden threatens democracy, Mr. Trump underscored his most antidemocratic campaign themes.

Having said that he would use the Justice Department to “go after” the Biden family,  on Saturday, he swore that he would “investigate every Marxist prosecutor in America for their illegal, racist-in-reverse enforcement of the law.”

Mr. Trump has frequently decried the cases brought him against by Black prosecutors in New York and Atlanta as racist. (He does not apply that charge to the white special counsel in his two federal criminal cases, who he instead calls “deranged.”)

Yet Mr. Trump himself has a history of racist statements.

At an earlier event on Saturday, where he sought to undermine confidence in election integrity well before the 2024 election, he urged supporters in Ankeny, a predominantly white suburb of Des Moines, to take a closer look at election results next year in Detroit, Philadelphia and Atlanta, three cities with large Black populations in swing states that he lost in 2020.

“You should go into some of these places, and we’ve got to watch those votes when they come in,” Mr. Trump said. “When they’re being, you know, shoved around in wheelbarrows and dumped on the floor and everyone’s saying, ‘What’s going on?’

“We’re like a third-world nation,” he added.

Mr. Trump’s speeches on Saturday reflected how sharply he is focused on the general election rather than the Republican primary contest, in which he holds a commanding lead.

With just over six weeks until the Iowa caucus, Mr. Trump dismissed his Republican rivals, mocking them for polling well behind him and denouncing Gov. Ron DeSantis of Florida as disloyal for deciding to run against him.

He also attacked Iowa’s Republican governor, Kim Reynolds, for endorsing Mr. DeSantis and suggested her popularity had tumbled after she had spurned Mr. Trump.

“You know, with your governor we had an issue,” Mr. Trump said, prompting a chorus of boos.'''

# Vectorize the new text using the same vectorizer
new_text_vectorized = vectorizer.transform([new_text])

# Make predictions using the trained Multinomial Naive Bayes model
prediction = clf.predict(new_text_vectorized)

# Interpret the prediction
if prediction[0] == 1:
    print("The text is classified as fake.")
else:
    print("The text is classified as real.")

The text is classified as fake.
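
A hard label does not show how strongly the model leans toward either class. `MultinomialNB.predict_proba` returns one probability per class, in the order of `classes_`. The sketch below trains on a made-up toy corpus (all `toy_*` names are assumptions) to show the call. In the notebook, `clf.predict_proba(new_text_vectorized)` would be the equivalent.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus for illustration only (0 = truth, 1 = fake)
toy_texts = [
    "senate passes budget bill after long debate",
    "officials confirm new trade agreement",
    "shocking secret the elites hide from you",
    "you will not believe this outrageous hoax",
]
toy_labels = [0, 0, 1, 1]

toy_vectorizer = TfidfVectorizer()
toy_clf = MultinomialNB()
toy_clf.fit(toy_vectorizer.fit_transform(toy_texts), toy_labels)

# One row per input text, one column per class in toy_clf.classes_
proba = toy_clf.predict_proba(toy_vectorizer.transform(["senate budget agreement"]))[0]
p_real, p_fake = proba
print(f"P(real)={p_real:.3f}, P(fake)={p_fake:.3f}")
```

A probability close to 0.5 would suggest a borderline call, while a value near 0 or 1 would suggest the model is confident, correctly or not.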
