## Text Mining the Wikipedia Article on Donald Trump's Tariffs ##
- For this programming assignment, I extract and analyze text from the Wikipedia article on Donald Trump's tariffs, and then apply sentiment analysis to the extracted text.
- In theory, Wikipedia articles are written in a neutral, objective tone. Does that hold when the article covers a controversial political issue? Let's see what the sentiment analysis shows.
- https://github.com/DNGMtSac/cisd43files

In [2]:
## Import requests and BeautifulSoup for text mining the Wikipedia article

import requests
from bs4 import BeautifulSoup

In [3]:
# Get the Wikipedia page
url = "https://en.wikipedia.org/wiki/Tariffs_in_the_second_Trump_administration"
response = requests.get(url)  # send a GET request to the URL and store the response

In [4]:
soup = BeautifulSoup(response.text, 'html.parser')
print(soup)

<!DOCTYPE html>

<html class="client-nojs vector-feature-language-in-header-enabled vector-feature-language-in-main-page-header-disabled vector-feature-page-tools-pinned-disabled vector-feature-toc-pinned-clientpref-1 vector-feature-main-menu-pinned-disabled vector-feature-limited-width-clientpref-1 vector-feature-limited-width-content-enabled vector-feature-custom-font-size-clientpref-1 vector-feature-appearance-pinned-clientpref-1 vector-feature-night-mode-enabled skin-theme-clientpref-day vector-sticky-header-enabled vector-toc-available" dir="ltr" lang="en">
<head>
<meta charset="utf-8"/>
<title>Tariffs in the second Trump administration - Wikipedia</title>
<script>(function(){var className="client-js vector-feature-language-in-header-enabled vector-feature-language-in-main-page-header-disabled vector-feature-page-tools-pinned-disabled vector-feature-toc-pinned-clientpref-1 vector-feature-main-menu-pinned-disabled vector-feature-limited-width-clientpref-1 vector-feature-limited-wid

## Below I extract some of the key sections of the page ##

In [6]:
# Extract the page title
title = soup.find("h1").text
print("Page Title:", title)

Page Title: Tariffs in the second Trump administration


In [7]:
# Extract each section heading and the first paragraph that follows it
sections = soup.find_all(['h2', 'h3', 'h4'])
for section in sections:
    section_title = section.text
    section_content_tag = section.find_next("p")
    if section_content_tag:
        section_content = section_content_tag.text
        print("Section Title:", section_title)
        print("Section Content:", section_content)
        print()

Section Title: Contents
Section Content: 


Section Title: Background
Section Content: Since the 1980s, Trump has advocated for import tariffs as a tool to regulate trade and retaliate against foreign nations that he believes have been disadvantageous to Americans.[16] In his campaigns for the US presidency, Trump promised to use tariffs to achieve a wide range of goals, including preventing war, reducing trade deficits, improving border security, and subsidizing childcare.[17] Although Trump has said foreign countries pay his tariffs, US tariffs are fees paid by US consumers and businesses either directly or in the form of increased prices.[18][17][19][20] He has also claimed that tariff revenues could eventually replace income taxes, however tariffs would only raise $2.4 trillion over the course of a decade, whereas the IRS collects $2 trillion per year;[21] furthermore, tariff revenues would be expected to decrease over time, given a presumed reduction of imports.[22] Shortly after 
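
The loop above prints only the first paragraph found after each heading, which is why a heading such as "Contents" can show empty content. As a minimal sketch of one alternative, the hypothetical helper `paragraphs_by_section` below (not part of the original notebook) walks headings and paragraphs in document order and groups every paragraph under the most recent heading. It assumes section titles are unique; repeated titles would be merged.

```python
from bs4 import BeautifulSoup


def paragraphs_by_section(soup):
    """Group paragraph text under the nearest preceding h2/h3/h4 heading.

    Hypothetical helper for illustration; assumes heading titles are unique.
    """
    sections = {}
    current = None
    # find_all with a list of tag names returns matches in document order
    for tag in soup.find_all(["h2", "h3", "h4", "p"]):
        if tag.name == "p":
            text = tag.get_text().strip()
            # Skip empty paragraphs and any text that precedes the first heading
            if current is not None and text:
                sections[current].append(text)
        else:
            current = tag.get_text().strip()
            sections.setdefault(current, [])
    return sections
```

Calling `paragraphs_by_section(soup)` on the parsed article would return a dictionary mapping each section title to a list of its paragraphs, so whole sections can be inspected or scored separately.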

In [10]:
# Extract the external links
external_links = soup.find_all("a", {"class": "external text"})
print("External Links:")
for link in external_links:
    print(link.get("href"))

External Links:
https://uscode.house.gov/view.xhtml?req=granuleid:USC-2000-title50-section1702&num=0&edition=2000
https://www.whitehouse.gov/fact-sheets/2025/02/fact-sheet-president-donald-j-trump-imposes-tariffs-on-imports-from-canada-mexico-and-china/
https://www.whitehouse.gov/presidential-actions/2025/02/imposing-duties-to-address-the-flow-of-illicit-drugs-across-our-national-border/
https://ustr.gov/issue-areas/reciprocal-tariff-calculations
https://web.archive.org/web/20250411034055/https://www.irishtimes.com/resizer/v2/YBMVQKNWCY6ACX3OJ4M3IZO2EY.jpg?auth=8e384f9ec11fd2c5030220604dd90ac2692aef78e53ae43914b27ba8812affe0
https://www.irishtimes.com/resizer/v2/YBMVQKNWCY6ACX3OJ4M3IZO2EY.jpg?auth=8e384f9ec11fd2c5030220604dd90ac2692aef78e53ae43914b27ba8812affe0
https://web.archive.org/web/20250402223423/https://www.irishtimes.com/photography/2025/04/02/in-pictures-trump-signs-executive-order-on-reciprocal-tariffs/
https://www.irishtimes.com/photography/2025/04/02/in-pictures-trump-sign

## Perform Sentiment Analysis on Extracted Text ##

In [13]:
## Collect every paragraph (<p>) element in the article
paragraphs = soup.find_all('p')

In [32]:
## Join the paragraph text into a single string for sentiment analysis

article_text = " ".join([para.get_text() for para in paragraphs])
article_text
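
The joined text still contains Wikipedia's bracketed citation markers (for example `[16]`) and stray newlines. A minimal cleanup sketch follows, using a hypothetical helper `clean_article_text` that is not part of the original notebook. It removes only numeric markers, so other bracketed notes such as lettered footnotes are left in place.

```python
import re


def clean_article_text(text):
    """Strip numeric citation markers and normalize whitespace.

    Hypothetical helper for illustration; handles only markers like [16].
    """
    # Remove bracketed numeric citation markers such as [16] or [18][17]
    text = re.sub(r"\[\d+\]", "", text)
    # Collapse runs of whitespace (including newlines) into single spaces
    return re.sub(r"\s+", " ", text).strip()
```

One possible use is `article_text = clean_article_text(article_text)` before building the `TextBlob`. Whether this changes the sentiment scores noticeably is not something the original notebook tests.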



In [34]:
## import TextBlob
from textblob import TextBlob

In [36]:
## Create the TextBlob object for analysis

blob = TextBlob(article_text)

In [38]:
## Perform polarity and subjectivity analysis

print("Sentiment Polarity:", blob.sentiment.polarity)
print("Sentiment Subjectivity:", blob.sentiment.subjectivity)

Sentiment Polarity: 0.03987134507380988
Sentiment Subjectivity: 0.34148046528328185
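
A single article-wide score can hide variation between paragraphs. As a rough sketch, the hypothetical helper `summarize_sentiment` below (not part of the original notebook) summarizes a list of per-paragraph `(polarity, subjectivity)` pairs. The `subjectivity_cutoff` of 0.5 is an arbitrary assumption chosen for illustration, not a TextBlob convention.

```python
from statistics import mean


def summarize_sentiment(scores, subjectivity_cutoff=0.5):
    """Summarize per-paragraph (polarity, subjectivity) pairs.

    Hypothetical helper; the 0.5 cutoff is an illustrative assumption.
    """
    polarities = [p for p, _ in scores]
    subjectivities = [s for _, s in scores]
    return {
        "mean_polarity": mean(polarities),
        "mean_subjectivity": mean(subjectivities),
        "min_polarity": min(polarities),
        "max_polarity": max(polarities),
        # Fraction of paragraphs whose subjectivity exceeds the cutoff
        "share_subjective": sum(s > subjectivity_cutoff for s in subjectivities) / len(scores),
    }
```

The pairs could be built from the notebook's own objects, for instance `scores = [tuple(TextBlob(p.get_text()).sentiment) for p in paragraphs if p.get_text().strip()]`, and then passed to `summarize_sentiment(scores)`.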


## The results of the sentiment analysis show that the language in the article is broadly in line with what Wikipedia aims for: objective and neutral reporting. Although the tariffs are a controversial topic, the polarity is about 0.04 on TextBlob's scale from -1 (negative) to 1 (positive), indicating a neutral tone. Interestingly, the subjectivity score is 0.34 on a scale from 0 (objective) to 1 (subjective), which is not very low and suggests that some of the language may be opinionated rather than strictly factual.