<a href="https://www.kaggle.com/code/amirmotefaker/ukraine-russia-war-twitter-sentiment-analysis?scriptVersionId=144635206" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Introduction

- More than 400 days have passed since the war between Russia and Ukraine began. Many countries support Ukraine by imposing economic sanctions on Russia. Many tweets about the war report developments on the ground, describe how people feel about it, and say which side they support.

# Russia-Ukraine war at a glance: what we know on day 429 of the invasion

- Russia on Friday launched a wave of missile attacks across many of Ukraine’s biggest cities, killing a mother and young child in the port city of Dnipro, and three people at a high-rise apartment building in the central city of Uman. Air raid alarms were active across the country in the early hours of Friday morning, while explosions were heard in Kyiv, and southern Mykolaiv was targeted again.

- At least seven civilians were killed and 33 injured between Wednesday and Thursday, Ukraine’s presidential office said, including one person killed and 23 wounded when four Kalibr cruise missiles hit the southern city of Mykolaiv.

- The parliamentary assembly of the Council of Europe has voted that the forced detention and deportation of children from Russian-occupied territories of Ukraine is genocide.

- [The Guardian](https://www.theguardian.com/world/2023/apr/28/russia-ukraine-war-at-a-glance-what-we-know-on-day-429-of-the-invasion)


# Russia-Ukraine war LIVE

- [Live News Al Jazeera](https://www.aljazeera.com/tag/ukraine-russia-crisis/)

# Import Libraries

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
import nltk
import re
from nltk.corpus import stopwords
import string

# Read Data

In [None]:
data = pd.read_csv("/kaggle/input/russia-vs-ukraine-tweets-datasetdaily-updated/filename.csv")

In [None]:
print(data.head())

- Let’s have a quick look at all the column names of the dataset:

In [None]:
print(data.columns)

- We only need three columns for this task (username, tweet, and language), so I will select only these columns:

In [None]:
# .copy() avoids a SettingWithCopyWarning when the tweet column is overwritten later.
data = data[["username", "tweet", "language"]].copy()

- Let’s check whether any of these columns contain null values:


In [None]:
data.isnull().sum()

- None of the columns has null values. Now let’s look at how many tweets are posted in each language:

In [None]:
data["language"].value_counts()
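
- Because the VADER lexicon used later in this notebook is built for English text, one option is to keep only the English tweets before scoring them. The sketch below uses a small hypothetical DataFrame rather than the real dataset, and the language code `"en"` is an assumption that should be checked against the `value_counts()` output:

```python
import pandas as pd

# Hypothetical sample with the same three columns selected earlier.
sample = pd.DataFrame({
    "username": ["a", "b", "c"],
    "tweet": ["Peace now", "Слава Україні", "Stop the war"],
    "language": ["en", "uk", "en"],
})

# Keep only rows whose language code is "en" (assumed label for English).
english_only = sample[sample["language"] == "en"].reset_index(drop=True)
print(english_only["tweet"].tolist())
```

- The notebook itself scores all tweets regardless of language, so this filter is an optional extra step, not part of the original pipeline.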

# Sentiment Analysis

- So most of the tweets are in English. Let’s prepare the data for sentiment analysis. Here I will lowercase the tweets and remove links, HTML tags, punctuation, words containing digits, and English stopwords, then stem the remaining words:

In [None]:
nltk.download('stopwords')
stemmer = nltk.SnowballStemmer("english")
stopword = set(stopwords.words('english'))

def clean(text):
    """Normalize one tweet: lowercase, strip noise, drop stopwords, stem."""
    text = str(text).lower()
    # Raw strings keep the regex escapes from being read as string escapes.
    text = re.sub(r'\[.*?\]', '', text)                # text in square brackets
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # links
    text = re.sub(r'<.*?>+', '', text)                 # HTML tags
    text = re.sub(r'[%s]' % re.escape(string.punctuation), '', text)  # punctuation
    text = re.sub(r'\n', '', text)                     # newlines
    text = re.sub(r'\w*\d\w*', '', text)               # words containing digits
    words = [word for word in text.split(' ') if word not in stopword]
    words = [stemmer.stem(word) for word in words]
    return " ".join(words)

data["tweet"] = data["tweet"].apply(clean)
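
- To make the effect of the cleaning steps concrete, here is a minimal, dependency-free sketch applied to one made-up tweet. `clean_demo` and `DEMO_STOPWORDS` are hypothetical names introduced for illustration. The sketch uses a tiny stand-in stopword set instead of NLTK's list, skips stemming, and splits on any whitespace, so its output will differ from what `clean` produces:

```python
import re
import string

# Hypothetical, tiny stopword set standing in for NLTK's English list.
DEMO_STOPWORDS = {"the", "is", "in", "a", "of"}

def clean_demo(text):
    text = str(text).lower()
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # drop links
    text = re.sub(r"<.*?>+", "", text)                 # drop HTML tags
    text = re.sub(r"[%s]" % re.escape(string.punctuation), "", text)  # drop punctuation
    text = re.sub(r"\w*\d\w*", "", text)               # drop words containing digits
    words = [w for w in text.split() if w not in DEMO_STOPWORDS]
    return " ".join(words)

print(clean_demo("The WAR is in day 429! Read https://example.com #Ukraine"))
# prints: war day read ukraine
```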

# Most Frequently Used Words

- Let’s look at a word cloud of the tweets, which shows the words people use most often when sharing their feelings and updates about the Ukraine-Russia war:

In [None]:
text = " ".join(i for i in data.tweet)
# Use a separate name so the nltk.corpus `stopwords` import is not overwritten.
wc_stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=wc_stopwords, background_color="white").generate(text)
plt.figure(figsize=(15, 10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
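
- A word cloud shows relative frequency only visually. As a complement, the exact counts can be tabulated with `collections.Counter`. This is a minimal sketch on a few hypothetical cleaned tweets, not on the real `data["tweet"]` column:

```python
from collections import Counter

# Hypothetical cleaned tweets standing in for data["tweet"].
tweets = ["ukrain war news", "war russia", "ukrain support"]

# Count whitespace-separated tokens across all tweets.
counts = Counter(word for tweet in tweets for word in tweet.split())
print(counts.most_common(2))
```

- On the real data, the same expression could be run over `data["tweet"]` after cleaning.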

# Positive, Negative, and Neutral

- Now I will add three more columns to the dataset, Positive, Negative, and Neutral, which hold the sentiment scores of each tweet:

In [None]:
nltk.download('vader_lexicon')
sentiments = SentimentIntensityAnalyzer()
# Score each tweet once and reuse the result for all three columns.
scores = [sentiments.polarity_scores(i) for i in data["tweet"]]
data["Positive"] = [s["pos"] for s in scores]
data["Negative"] = [s["neg"] for s in scores]
data["Neutral"] = [s["neu"] for s in scores]
data = data[["tweet", "Positive", "Negative", "Neutral"]]
print(data.head())

# Positive Sentiments

- Let’s look at the most frequent words in tweets whose positive score is higher than their negative score:

In [None]:
positive = ' '.join([i for i in data['tweet'][data['Positive'] > data["Negative"]]])
wc_stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=wc_stopwords, background_color="white").generate(positive)
plt.figure(figsize=(15, 10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

# Negative Sentiments

- Let’s look at the most frequent words in tweets whose negative score is higher than their positive score:

In [None]:
negative = ' '.join([i for i in data['tweet'][data['Negative'] > data["Positive"]]])
wc_stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=wc_stopwords, background_color="white").generate(negative)
plt.figure(figsize=(15, 10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
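
- The two word clouds split the tweets by comparing the Positive and Negative scores. The same comparison can be turned into a per-tweet label and a count of each group. The sketch below uses hypothetical scores, and labeling ties as `"neutral"` is an assumption made here for illustration; it is not the same thing as the Neutral score column:

```python
import numpy as np
import pandas as pd

# Hypothetical scores standing in for the Positive/Negative columns.
scores = pd.DataFrame({
    "Positive": [0.40, 0.00, 0.10, 0.00],
    "Negative": [0.10, 0.35, 0.10, 0.00],
})

# Same comparison as the word-cloud filters; ties are labeled "neutral".
scores["label"] = np.select(
    [scores["Positive"] > scores["Negative"],
     scores["Negative"] > scores["Positive"]],
    ["positive", "negative"],
    default="neutral",
)
print(scores["label"].value_counts())
```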

## I hope this war ends soon and things get back to normal.

# Summary

- Many tweets about the Ukraine-Russia war report developments on the ground, describe how people feel about it, and say which side they support. I used those tweets to perform Twitter sentiment analysis on the Ukraine-Russia war.