## SENTIMENT ANALYSIS
- I will perform sentiment analysis on data obtained from the Reddit API, specifically focusing on discussions related to homelessness in Ireland. The goal is to gain insights into the housing needs in the country.
- For this analysis, I use two libraries, TextBlob and NLTK, so that their sentiment analysis results can be compared. TextBlob provides a simple, user-friendly interface, while NLTK offers greater flexibility and customization.
- The NLTK library provides a comprehensive toolkit for a range of natural language processing (NLP) tasks.
- NLTK provides finer control over tokenization and stemming, while TextBlob offers a straightforward and accessible solution primarily focused on sentiment analysis.

# Install praw if you don't have it in your environment
# !pip install praw

In [1]:
# Install nltk if you don't have it in your environment
# !pip install nltk

In [2]:
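import os

import praw

# Read the credentials from environment variables instead of hard-coding them.
# The variable names REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET are assumed here.
reddit = praw.Reddit(
    client_id=os.environ["REDDIT_CLIENT_ID"],
    client_secret=os.environ["REDDIT_CLIENT_SECRET"],
    user_agent="my_unique_script by u/cctdublin"
)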

# Search the r/europe subreddit for posts related to homelessness in Ireland
search_results = reddit.subreddit('europe').search('homelessness Ireland', limit=100)
for post in search_results:
    print(post.title)




UN refugee agency says it is 'completely unpalatable' that asylum seekers are homeless in Ireland
A Searing, Claustrophobic Portrait of Family Homelessness in Ireland—’Rosie’, dir. Paddy Breathnach, 2018
Ireland is no country if you’re young, creative or homeless
FactCheck: Does Ireland really have a low rate of homelessness by international standards?
Thousands (12,000-20,000 people) march in Dublin over housing & homeless crisis in Ireland
Europe’s hotel occupancy rates drop by -61.6% to 26.3% during March 2020
2020 Irish General Election [Megathread]
Social Protection Expenditure in the EU Countries (% GDP, 2018)
The State of the World’s Toilets
Household expenditure on housing (housing, water, electricity, gas and other fuels) in the EU: In 2019, the share of household expenditure devoted to housing was largest in Finland (28.8%), Slovakia (28.4%) and Denmark (27.9%), lowest in Malta (12.3%), Lithuania (14.9%) and Cyprus (15.6%).
Irish economy grew by 2.5% in second quarter of 2018
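
The titles printed above are copied by hand into the analysis cells that follow. A minimal sketch of collecting them programmatically instead, assuming only that each post object exposes a `.title` attribute, as PRAW submissions do. `collect_titles` is a hypothetical helper name, not part of PRAW, and the sample objects are stand-ins so that the example runs without Reddit credentials.

```python
from types import SimpleNamespace


def collect_titles(posts):
    """Return the unique, non-empty titles of `posts`, in first-seen order."""
    seen = set()
    titles = []
    for post in posts:
        title = post.title.strip()
        if title and title not in seen:
            seen.add(title)
            titles.append(title)
    return titles


# Stand-in objects for illustration only; in the notebook you would pass
# the iterable returned by reddit.subreddit('europe').search(...).
sample_posts = [
    SimpleNamespace(title="2020 Irish General Election [Megathread]"),
    SimpleNamespace(title="Irish economy grew by 2.5% in second quarter of 2018"),
    SimpleNamespace(title="2020 Irish General Election [Megathread]"),
]
sample_titles = collect_titles(sample_posts)
print(sample_titles)
```

Because the search call returns a generator that can only be consumed once, building the list in a single pass and reusing it for both printing and scoring avoids a second API request.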

In [3]:
# Install textblob if you don't have it in your environment
# !pip install textblob

In [4]:
from textblob import TextBlob

# List of the sentences
sentences = [
    "UN refugee agency says it is 'completely unpalatable' that asylum seekers are homeless in Ireland",
    "A Searing, Claustrophobic Portrait of Family Homelessness in Ireland—’Rosie’, dir. Paddy Breathnach, 2018",
    "Ireland is no country if you’re young, creative or homeless",
    "FactCheck: Does Ireland really have a low rate of homelessness by international standards?",
    "Thousands (12,000-20,000 people) march in Dublin over housing & homeless crisis in Ireland",
    "Europe’s hotel occupancy rates drop by -61.6% to 26.3% during March 2020",
    "2020 Irish General Election [Megathread]",
    "Social Protection Expenditure in the EU Countries (% GDP, 2018)",
    "The State of the World’s Toilets",
    "Household expenditure on housing (housing, water, electricity, gas and other fuels) in the EU: In 2019, the share of household expenditure devoted to housing was largest in Finland (28.8%), Slovakia (28.4%) and Denmark (27.9%), lowest in Malta (12.3%), Lithuania (14.9%) and Cyprus (15.6%).",
    "Irish economy grew by 2.5% in second quarter of 2018",
    "Hungary's homeless ban: Campaigners slam 'policy of total evil' with temperatures set to fall",
    "The hidden statistics of Eurostat"
]


# Initialize variables
total_sentiment = 0.0
num_sentences = len(sentences)

# Calculate sentiment for each sentence
for sentence in sentences:
    sentiment = TextBlob(sentence).sentiment.polarity
    total_sentiment += sentiment

# Calculate the overall sentiment score
overall_sentiment = total_sentiment / num_sentences

print(f"Overall sentiment score: {overall_sentiment}")

Overall sentiment score: -0.07820512820512822


- On TextBlob's polarity scale, which runs from -1 (most negative) to 1 (most positive), the overall sentiment score is approximately -0.0782. This suggests a slightly negative sentiment overall.
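
To make a reading such as "slightly negative" reproducible, the mean polarity can be mapped to a label with an explicit cut-off. The sketch below is an illustration rather than a TextBlob feature: `label_polarity`, the 0.05 neutral band, and the example scores are all assumptions chosen for this example.

```python
from statistics import mean


def label_polarity(score, neutral_band=0.05):
    """Map a polarity score in [-1, 1] to a coarse label.

    Scores within +/- neutral_band of zero count as neutral. The 0.05
    default is an assumed cut-off, not a TextBlob setting.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical per-title polarities, for illustration only.
example_scores = [0.0, -0.5, 0.1, 0.0]
average = mean(example_scores)
print(f"{average:.4f} -> {label_polarity(average)}")
```

With this band, the score of -0.0782 reported above falls just outside the neutral range and is labelled negative. A wider band would label it neutral, which is why the cut-off should be stated alongside the result.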

In [6]:
# Install nltk and download its data if you don't have them in your environment
# !pip install nltk
# nltk.download('punkt')
# nltk.download('vader_lexicon')

In [7]:
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from nltk.tokenize import word_tokenize
from nltk.stem import SnowballStemmer

# List of the sentences
sentences = [
    "UN refugee agency says it is 'completely unpalatable' that asylum seekers are homeless in Ireland",
    "A Searing, Claustrophobic Portrait of Family Homelessness in Ireland—’Rosie’, dir. Paddy Breathnach, 2018",
    "Ireland is no country if you’re young, creative or homeless",
    "FactCheck: Does Ireland really have a low rate of homelessness by international standards?",
    "Thousands (12,000-20,000 people) march in Dublin over housing & homeless crisis in Ireland",
    "Europe’s hotel occupancy rates drop by -61.6% to 26.3% during March 2020",
    "2020 Irish General Election [Megathread]",
    "Social Protection Expenditure in the EU Countries (% GDP, 2018)",
    "The State of the World’s Toilets",
    "Household expenditure on housing (housing, water, electricity, gas and other fuels) in the EU: In 2019, the share of household expenditure devoted to housing was largest in Finland (28.8%), Slovakia (28.4%) and Denmark (27.9%), lowest in Malta (12.3%), Lithuania (14.9%) and Cyprus (15.6%).",
    "Irish economy grew by 2.5% in second quarter of 2018",
    "Hungary's homeless ban: Campaigners slam 'policy of total evil' with temperatures set to fall",
    "The hidden statistics of Eurostat"
]

# Initialize variables
total_sentiment = 0.0
num_sentences = len(sentences)

# Initialize NLTK components
stemmer = SnowballStemmer("english")
sid = SentimentIntensityAnalyzer()

# Calculate sentiment for each sentence
for sentence in sentences:
    # Tokenize the sentence
    tokens = word_tokenize(sentence)

    # Apply stemming to the tokens
    stemmed_tokens = [stemmer.stem(token) for token in tokens]

    # Join the stemmed tokens back into a sentence
    stemmed_sentence = " ".join(stemmed_tokens)

    # Calculate sentiment polarity using NLTK's SentimentIntensityAnalyzer
    sentiment = sid.polarity_scores(stemmed_sentence)["compound"]
    total_sentiment += sentiment

# Calculate the overall sentiment score
overall_sentiment = total_sentiment / num_sentences

print(f"Overall sentiment score: {overall_sentiment}")


Overall sentiment score: -0.11186923076923076


- The NLTK analysis, which uses the VADER-based SentimentIntensityAnalyzer, gives an overall compound score of about -0.1119. This also indicates a slightly negative sentiment.
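- The slight difference between the two overall scores can be attributed in part to the limited amount of text: with only 13 titles, each average is sensitive to individual sentences. The two libraries also rely on different lexicons, and the NLTK pipeline here stems each title before scoring, so some disagreement is to be expected.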
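
Comparing the two libraries title by title, rather than only through their means, shows where they disagree. The following is a sketch under stated assumptions: `compare_scorers` is a hypothetical helper, and the two toy scorers are placeholders so that the example runs without TextBlob or NLTK installed.

```python
from statistics import mean


def compare_scorers(sentences, scorers):
    """Score every sentence with every scorer.

    `scorers` maps a display name to a function str -> float.
    Returns (rows, means): one dict of scores per sentence, plus the
    mean score per scorer.
    """
    rows = []
    for sentence in sentences:
        row = {"text": sentence}
        for name, score_fn in scorers.items():
            row[name] = score_fn(sentence)
        rows.append(row)
    means = {name: mean(row[name] for row in rows) for name in scorers}
    return rows, means


# Toy scorers for illustration only; they are NOT real sentiment models.
toy_scorers = {
    "toy_a": lambda text: -1.0 if "crisis" in text.lower() else 0.0,
    "toy_b": lambda text: -0.5 if "homeless" in text.lower() else 0.0,
}
toy_sentences = [
    "Thousands march in Dublin over housing & homeless crisis in Ireland",
    "Irish economy grew by 2.5% in second quarter of 2018",
]
rows, means = compare_scorers(toy_sentences, toy_scorers)
for row in rows:
    print(row)
print(means)
```

In the notebook, the real scorers would be `lambda s: TextBlob(s).sentiment.polarity` and `lambda s: sid.polarity_scores(s)["compound"]`, using `TextBlob` and `sid` as defined in the cells above. Passing the unstemmed titles to both would also show how much of the gap comes from the stemming step.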