# Hate Narratives and the Nagorno-Karabakh Conflict, 2020-2024

This summer, I spent six weeks in the South Caucasus conducting research on civil society-led peacebuilding initiatives in Armenia, in the particular context of the decades-long Nagorno-Karabakh conflict between Armenia and Azerbaijan. I was based in Yerevan, Armenia's capital, but also spent several days in Baku, the capital of Azerbaijan. Throughout my time there, I was perpetually struck by the hateful rhetoric used by both political leaders and regular people on both sides of the conflict. At times it was downright appalling - I recall watching a recent video of an elderly Azerbaijani woman, in the aftermath of an Armenian strike, crying, "We thirst for every last drop of Armenian blood. We thirst for their children's blood."

Such hateful speech has become an integral dimension of the conflict, especially as actual contact between Armenians and Azerbaijanis, which has in many cases facilitated mutual understanding, increasingly vanishes. I was struck when an Armenian colleague at the Yerevan think tank I worked with over the summer told me that she'd met only one Azerbaijani in her life, when she lived in Russia in her youth. She added that if she had an opportunity in the future to speak with someone from Azerbaijan, she wouldn't take it. This was coming from somebody I deeply admired for her intellectual curiosity and open-mindedness, and someone who had been involved to varying degrees in peacebuilding efforts before. If someone with her position and pedigree was completely uninterested in building bridges, what hope was there to promote reconciliation - especially when travel by Armenians to Azerbaijan and by Azerbaijanis to Armenia, and even their presence in the other country, has effectively been banned? My colleague never used bombastic, hateful speech, but there was always a sense of disdain, and a sense that Armenians had been deeply, indelibly wronged by the Azerbaijanis.

One can, though, often hear hateful, dehumanizing rhetoric emanating from the top offices in both countries. In my experience, Ilham Aliyev, the president of Azerbaijan since 2003, is the bigger culprit, but Armenia's Nikol Pashinyan, prime minister since 2018, is guilty of weaponizing such speech as well. I'm interested in investigating how Aliyev and Pashinyan use hate narratives, when and to what end they use them, and their historical roots, among other things. Borrowing from Iwona Jakubowska-Branicka (2016), hate narratives are patterns of speech or communication that promote hostility, discrimination, or violence against a particular group, often based on their ethnicity, religion, nationality, gender, or other traits. Hate narratives often involve stereotyping, dehumanization, or the justification of harm toward the targeted group. **My primary research question is as follows: How do the speeches of Nikol Pashinyan and Ilham Aliyev between 2020 and 2024 reflect and propagate hate narratives, and what differences or similarities exist in their rhetorical strategies?**

While the conflict began in 1988 as the Soviet Union began to crumble, my focus is on the past four years, and especially on four pivotal events: the Second Nagorno-Karabakh War, which lasted from late September to early November 2020; the September 2022 clashes between Armenia and Azerbaijan, which included the largest attacks by Azerbaijan on Armenia proper in the history of the conflict; the Azerbaijani blockade of the Lachin corridor, the only road connecting Armenia and Nagorno-Karabakh, which began in December 2022; and the one-day lightning offensive launched by Azerbaijan in September 2023, which gave it complete control of Nagorno-Karabakh for the first time and led to the exodus of nearly all Armenians from the territory into mainland Armenia.

My analysis includes speeches by Pashinyan and Aliyev before, during, and after these events. In total, it includes 48 speeches by Pashinyan and 36 by Aliyev, scraped from their official websites. All the speeches are available in English on the sites. More precisely, it includes 16 Pashinyan speeches and 11 Aliyev speeches around the Second Nagorno-Karabakh War; six Pashinyan and eight Aliyev speeches around the September 2022 clashes; 12 Pashinyan and seven Aliyev speeches around the Lachin corridor blockade; and 14 Pashinyan and 10 Aliyev speeches around the 2023 lightning offensive.
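
As a quick sanity check on those totals, the per-event counts can be tallied in a few lines. This is only a sketch: the dictionary names are made up for illustration, the event labels are the shorthand I use later for the "Event" column, and the numbers are copied from the paragraph above.

```python
# Speech counts per event, copied from the paragraph above
pashinyan_counts = {
    "2nd NK War": 16,
    "Border clashes": 6,
    "Lachin blockade": 12,
    "Lightning offensive": 14,
}
aliyev_counts = {
    "2nd NK War": 11,
    "Border clashes": 8,
    "Lachin blockade": 7,
    "Lightning offensive": 10,
}

# Summing the per-event counts should reproduce the corpus totals
pashinyan_total = sum(pashinyan_counts.values())
aliyev_total = sum(aliyev_counts.values())
print(pashinyan_total, aliyev_total)  # 48 36
```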

First, it's always helpful to get a mental picture of where something is taking place, so let's generate a map.

***Generating a map of Armenia, Azerbaijan, and Nagorno-Karabakh***

In [15]:
"""I wanted to create a map of Armenia and Azerbaijan and also show Nagorno-Karabakh, the rough borders of which aren't
usually shown on most maps. If you click on the pins on the map, it'll show you which entity is which."""

import folium #For creating interactive maps
import json #To handle GeoJSON data, a file format for geographic data, in this case to add the Nagorno-Karabakh border to the map
from IPython.display import IFrame #To display the map directly in the notebook

#Specifying the path to my local GeoJSON file, which I found for Nagorno-Karabakh on GitHub.
#It's got the blue border on the map below.
geojson_file_path = r"C:\Users\wolyn\Downloads\nagorno-karabakh.geojson"

#Opening and loading the GeoJSON data from the local file
with open(geojson_file_path, 'r') as f:
    geojson_data = json.load(f)

#Initializing the map centered around a latitude and longitude, got these with help from ChatGPT
m = folium.Map(location=[40.2, 45.3], zoom_start=7) #zoom_start determines the starting zoom level

#Displaying the geographic data on the map
folium.GeoJson(geojson_data).add_to(m)

#Adding markers with labels for Armenia, Azerbaijan, and Nagorno-Karabakh
#Coordinates for approximate locations of these regions
folium.Marker([40.2, 44.5], popup="Armenia").add_to(m)
folium.Marker([40.2, 47.5], popup="Azerbaijan").add_to(m)
folium.Marker([39.9, 46.8], popup="Nagorno-Karabakh").add_to(m)

#Saving the map to an HTML file
m.save("nagorno_karabakh_map_with_labels.html")

#Displaying the map in the notebook
IFrame("nagorno_karabakh_map_with_labels.html", width=800, height=600)
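
Instead of hard-coding the map center, one option is to derive it from the GeoJSON itself. The sketch below is an assumption-heavy illustration, not what I actually ran: `iter_positions` and `bbox_center` are hypothetical helpers, the sample polygon is a made-up square, and the code assumes the file is a Feature or FeatureCollection whose geometries have a `coordinates` array. Note that GeoJSON stores positions as [longitude, latitude], while `folium.Map(location=...)` expects [latitude, longitude].

```python
def iter_positions(coords):
    """Yield every [lon, lat] position from an arbitrarily nested GeoJSON coordinate array."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords  # Reached a single position
    else:
        for part in coords:
            yield from iter_positions(part)

def bbox_center(geojson):
    """Return (lat, lon) of the bounding-box center of a Feature or FeatureCollection."""
    if geojson.get("type") == "FeatureCollection":
        features = geojson["features"]
    else:
        features = [geojson]
    lons, lats = [], []
    for feature in features:
        for lon, lat, *_ in iter_positions(feature["geometry"]["coordinates"]):
            lons.append(lon)
            lats.append(lat)
    return ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)

# Made-up square polygon, only to show the input and output shapes
sample = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[46.0, 39.0], [47.0, 39.0], [47.0, 40.0], [46.0, 40.0], [46.0, 39.0]]],
    },
}
center = bbox_center(sample)
print(center)  # (39.5, 46.5)
```

With the real file, `bbox_center(geojson_data)` could be passed as the `location` argument of `folium.Map`, though a bounding-box center is only a rough stand-in for a hand-picked one.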

***Collecting and cleaning the data***

Now that we can easily visualize this part of the world, I need to actually collect the speeches for my analysis. I went to the respective websites for Aliyev and Pashinyan and gathered the URLs for each relevant speech around the four events in question. I'm creating two CSV files, one for Aliyev's speeches and one for Pashinyan's, to keep it simple in my mind/analysis.

In [17]:
import requests #To fetch webpage content
import csv #To write extracted data into a CSV file
from bs4 import BeautifulSoup #To parse and extract data from HTML

#List of web links to Pashinyan's speeches to be scraped
urls = [
    "https://www.primeminister.am/en/statements-and-messages/item/2020/08/21/Nikol-Pashinyan-Security-Council-meeting/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/08/28/Nikol-Pashinyan--message/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/09/27/Nikol-Pashinyan-message/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/09/27/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/09/27/Cabinet-meeting-Speech-27-09/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/10/03/Nikol-Pashinyan-message/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/10/14/Nikol-Pashinyan-message-to-the-nation/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/10/21/Nikol-Pashinyan/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/10/27/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/11/12/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/11/26/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/11/27/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/12/02/Nikol-Pashinyan-Speech-CSTO-session/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/12/05/Nikol-Pashinyan-message/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/12/14/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2020/12/19/Nikol-Pashinyan-message/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/08/04/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/09/02/Nikol-Pashinyan-messages/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/09/13/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/09/19/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/09/22/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/10/14/Nikol-Pashinyan-Speech/#photos[pp_gal_1]/0/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/11/10/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/11/23/Nikol-Pashinyan-CSTO-meeting/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/12/15/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/12/22/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/12/29/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2022/12/31/Nikol-Pashinyan-New-Year-Message/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/02/16/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/02/20/Nikol-Pashinyan-Congratulations/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/02/23/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/03/09/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/03/16/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/03/23/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/08/17/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/08/24/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/02/Nikol-Pashinyan-message/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/07/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/14/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/19/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/20/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/21/Nikol-Pashinyan-21-09-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/22/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/24/Nikol-Pashinyan-messages/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/09/28/Cabinet-meeting-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/10/17/Nikol-Pashinyan-Speech/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/10/26/Nikol-Pashinyan-Speech/#photos[pp_gal_1]/0/",
"https://www.primeminister.am/en/statements-and-messages/item/2023/11/18/Nikol-Pashinyan-Speech/"
]

#Name of the file to store the scraped speeches
csv_file = 'pashinyan_speeches48.csv'

#Opening the CSV once in write mode, so re-running this cell replaces the file instead of appending duplicate rows
with open(csv_file, mode='w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(["Leader", "Date", "Speech Text", "URL"]) #Header row, written once

    #Looping through the URLs to process each speech webpage
    for url in urls:
        try:
            response = requests.get(url, timeout=30) #Retrieving the page, giving up after 30 seconds
            response.raise_for_status() #Raising an error on a 4xx/5xx response
            soup = BeautifulSoup(response.text, 'html.parser') #Parsing the HTML structure of the page

            #Finding the speech text (in all <p> tags)
            paragraphs = soup.find_all('p')
            speech_text = "\n".join(p.get_text() for p in paragraphs) #Combining the text into a single string

            #Extracting the date
            date = soup.find('div', class_='search__date fs12')  #Locates the date using a specific class, had to manually inspect the page to get the right class
            date_text = date.get_text(strip=True) if date else "Date not found"

            #Categorizing the leader for this CSV, not super necessary since this one is only for Pashinyan but helpful in my mind
            leader = "Pashinyan"

            #Writing leader name, date, speech, and URL as a row
            writer.writerow([leader, date_text, speech_text, url])

            print(f"Successfully scraped: {url}")

        except Exception as e:
            print(f"Failed to scrape {url}: {e}")

Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/08/21/Nikol-Pashinyan-Security-Council-meeting/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/08/28/Nikol-Pashinyan--message/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/09/27/Nikol-Pashinyan-message/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/09/27/Nikol-Pashinyan-Speech/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/09/27/Cabinet-meeting-Speech-27-09/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/10/03/Nikol-Pashinyan-message/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/10/14/Nikol-Pashinyan-message-to-the-nation/
Successfully scraped: https://www.primeminister.am/en/statements-and-messages/item/2020/10/21/Nikol-Pashinyan/
Successfully scraped

Now my Pashinyan CSV file has been created, and I want to go in and add a column for "Event" so each speech clearly corresponds to one of the four events I'm looking at. Then I want to see the data I'm working with, using Pandas.
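
I added the "Event" column by hand, but it could also be derived from each speech's date. Here's a rough sketch of that idea. `EVENT_WINDOWS` and `label_event` are hypothetical names, and the cutoff dates are assumptions I picked to separate the four clusters of Pashinyan speech dates visible in the URLs above; they are not official start or end dates of the events, and Aliyev's speeches might need different cutoffs.

```python
from datetime import date

# Assumed cutoffs separating the four clusters of speeches (not official event dates)
EVENT_WINDOWS = [
    (date(2021, 1, 1), "2nd NK War"),
    (date(2022, 11, 1), "Border clashes"),
    (date(2023, 7, 1), "Lachin blockade"),
]

def label_event(speech_date):
    """Return the event label for a speech date, using the assumed cutoffs above."""
    for cutoff, label in EVENT_WINDOWS:
        if speech_date < cutoff:
            return label
    return "Lightning offensive"  # Anything on or after the last cutoff

print(label_event(date(2020, 9, 27)))   # 2nd NK War
print(label_event(date(2022, 9, 13)))   # Border clashes
print(label_event(date(2022, 12, 15)))  # Lachin blockade
print(label_event(date(2023, 9, 19)))   # Lightning offensive
```

Once the "Date" column has been converted to datetime, something like `df_p['Date'].dt.date.map(label_event)` should produce the same column without manual editing.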

In [41]:
#Pinning NumPy below 2.0, since pandas in this environment was compiled against NumPy 1.x.
#My first attempt, numpy==1.23.5, made pip fall back to building from source, and that build failed.
#The kernel needs a restart after this install.
%pip install "numpy<2"

In [29]:
import pandas as pd

df_p = pd.read_csv(r"C:\Users\wolyn\OneDrive\Documents\Yale - Fall 2024\Python for Global Affairs\pashinyan_speeches48.csv")
print(df_p.head())  #Viewing the first few rows



Looks like there are some unnecessary characters like the \n, so I want to take care of those. I also want to make sure the date column is in datetime format for easier analysis.

In [31]:
df_p['Date'] = pd.to_datetime(df_p['Date'], format='%d.%m.%Y')
print(df_p.dtypes) #Date should now show datetime64[ns]


In [33]:
#Applying the text cleaning
df_p['Speech Text'] = df_p['Speech Text'].str.replace(r'\n', ' ', regex=True)  #Removing newline characters here
df_p['Speech Text'] = df_p['Speech Text'].str.replace(r'\s+', ' ', regex=True)  #Replacing multiple spaces with a single space
df_p['Speech Text'] = df_p['Speech Text'].str.strip()  #Removing leading and trailing spaces

#Let's check out what we have now
print(df_p.head())


Now I want to normalize the speech text and do tokenization at the word level.

In [None]:
import re

def normalize_and_tokenize(text):
    """This is a function to normalize and tokenize"""
    #Converting to lowercase
    text = text.lower()
    #Removing punctuation
    text = re.sub(r'[^\w\s]', '', text)
    #Tokenizing into words
    tokens = text.split()
    return tokens

#Applying the function to my DataFrame
df_p['Tokens'] = df_p['Speech Text'].apply(normalize_and_tokenize)

#Displaying updated DataFrame to check
print(df_p[['Speech Text', 'Tokens']].head())
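
To see what this normalization actually does to a sentence, here's a small standalone check. The sample sentence is made up, and `tokenize_keep_apostrophes` is a hypothetical alternative that I did not use in the analysis; it only handles ASCII letters and straight apostrophes, so treat it as a sketch.

```python
import re

def normalize_and_tokenize(text):
    """Lowercase, strip punctuation, and split on whitespace (same steps as the cell above)."""
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)
    return text.split()

def tokenize_keep_apostrophes(text):
    """Alternative sketch: pull out runs of ASCII letters/digits, keeping an internal apostrophe."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

sample = "The adversary's forces advanced on Sept. 27!"
print(normalize_and_tokenize(sample))
# ['the', 'adversarys', 'forces', 'advanced', 'on', 'sept', '27']
print(tokenize_keep_apostrophes(sample))
# ['the', "adversary's", 'forces', 'advanced', 'on', 'sept', '27']
```

The main thing to notice is that stripping punctuation fuses possessives into tokens like "adversarys", which won't match a dictionary word in any later lookup.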

Lastly, I just want to see a full speech to make sure everything looks acceptable.

In [None]:
print(df_p['Speech Text'].iloc[5]) #Picking a random row/speech

I'll take it. Now I've got a workable CSV with all 48 Pashinyan speeches. I'll move on to Aliyev's speeches. I tried the same approach with his, but I ran into a bunch of errors, I think stemming from a slightly different website setup, so I'm taking a bit of a different approach here.

In [None]:
#List of web links to Aliyev's speeches to be scraped
urls = [
    "https://president.az/en/articles/view/39951",
"https://president.az/en/articles/view/40267",
"https://president.az/en/articles/view/42798",
"https://president.az/en/articles/view/42794",
"https://president.az/en/articles/view/42459",
"https://president.az/en/articles/view/42108",
"https://president.az/en/articles/view/44371",
"https://president.az/en/articles/view/48793",
"https://president.az/en/articles/view/49937",
"https://president.az/en/articles/view/49953",
"https://president.az/en/articles/view/50226",
"https://president.az/en/articles/view/56550",
"https://president.az/en/articles/view/56987",
"https://president.az/en/articles/view/57277",
"https://president.az/en/articles/view/57276",
"https://president.az/en/articles/view/58559",
"https://president.az/en/articles/view/58557",
"https://president.az/en/articles/view/57564",
"https://president.az/en/articles/view/57748",
"https://president.az/en/articles/view/57857",
"https://president.az/en/articles/view/57856",
"https://president.az/en/articles/view/58470",
"https://president.az/en/articles/view/59164",
"https://president.az/en/articles/view/59195",
"https://president.az/en/articles/view/63447",
"https://president.az/en/articles/view/60430",
"https://president.az/en/articles/view/60990",
"https://president.az/en/articles/view/61113",
"https://president.az/en/articles/view/61532",
"https://president.az/en/articles/view/61664",
"https://president.az/en/articles/view/62244",
"https://president.az/en/articles/view/62209",
"https://president.az/en/articles/view/62336",
"https://president.az/en/articles/view/62864",
"https://president.az/en/articles/view/64527",
"https://president.az/en/articles/view/64830"
]

def scrape_speech(url):
    """Fetch one speech page and return its date, text, and URL as a dictionary."""
    response = requests.get(url, timeout=30) #Retrieving the page content, giving up after 30 seconds
    response.raise_for_status() #Stopping on a 4xx/5xx response rather than parsing an error page
    soup = BeautifulSoup(response.content, 'html.parser') #This converts the webpage into a navigable BeautifulSoup object

    #Extracting the date
    date = soup.find('span', class_='news_date') #Again, had to inspect the webpage to find the right class
    date = date.text.strip() if date else 'No date found'

    #Extracting the speech text; combining text from <p> tags within the proper news_paragraph-block class
    #(also had to inspect the page here to get the right class for the speech text)
    speech_text = ' '.join(
        p.text.strip()
        for p in soup.find_all('p')
        if p.find_parent(class_='news_paragraph-block')
    )
    if not speech_text:
        speech_text = 'No speech text found'

    #Returning a dictionary with the above elements
    return {
        'Date': date,
        'Speech Text': speech_text,
        'URL': url
    }

#Scraping the speeches
speech_data = [] #Initializing an empty list to store the scraped data
for url in urls: #Looping through the URLs and appending results to the list
    speech_data.append(scrape_speech(url))

#Creating a DataFrame from the scraped data
df = pd.DataFrame(speech_data)

#Saving the data to a CSV file
df.to_csv('aliyev_speeches36.csv', index=False)
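
Since both scrapers fall back to placeholder strings when a page doesn't match the expected layout, it's worth checking for those before analysis. Here's a minimal sketch of such a check. `find_incomplete` is a hypothetical helper, and the sample records and example.org URLs are made up for illustration.

```python
# Placeholder strings the two scrapers write when something is missing
PLACEHOLDERS = {"No date found", "No speech text found", "Date not found"}

def find_incomplete(records):
    """Return the URLs of records whose date or text is a placeholder or empty."""
    bad = []
    for rec in records:
        text = rec["Speech Text"]
        if rec["Date"] in PLACEHOLDERS or text in PLACEHOLDERS or not text.strip():
            bad.append(rec["URL"])
    return bad

# Made-up records in the same shape that scrape_speech() returns
sample_records = [
    {"Date": "27 September 2020, 13:00", "Speech Text": "Dear fellow citizens...", "URL": "https://example.org/speech/1"},
    {"Date": "No date found", "Speech Text": "Some text", "URL": "https://example.org/speech/2"},
    {"Date": "13 September 2022, 15:30", "Speech Text": "No speech text found", "URL": "https://example.org/speech/3"},
]
print(find_incomplete(sample_records))
# ['https://example.org/speech/2', 'https://example.org/speech/3']
```

Calling `find_incomplete(speech_data)` before building the DataFrame would flag any pages that need a second look.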

As with the Pashinyan speeches, I also want to see what I have here. And again, I want to manually add an "Event" column, including (as for Pashinyan) "2nd NK War", "Border clashes", "Lachin blockade", and "Lightning offensive".

In [None]:
df_a = pd.read_csv(r"C:\Users\wolyn\OneDrive\Documents\Yale - Fall 2024\Python for Global Affairs\aliyev_speeches36.csv")
print(df_a.head())  #Viewing the first few rows

In [None]:
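#Convert to datetime and drop the time of day, which Pashinyan's dates didn't have
#(.dt.normalize() resets the time to midnight but keeps the datetime64 dtype, so both DataFrames hold the same type when plotted together)
df_a['Date'] = pd.to_datetime(df_a['Date'], format='%d %B %Y, %H:%M').dt.normalize()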

In [None]:
print(df_a.head()) #Looks good
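
The two sites use different date formats, so here's a small standalone sketch of a parser that accepts either one and returns `None` for the "not found" placeholders. `parse_speech_date` is a hypothetical helper, the sample strings are made up, and month-name parsing with `%B` assumes an English locale.

```python
from datetime import date, datetime

# Format used on primeminister.am, then the one used on president.az
DATE_FORMATS = ("%d.%m.%Y", "%d %B %Y, %H:%M")

def parse_speech_date(text):
    """Try each known format in turn; return a date, or None if none of them fit."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # Wrong format, try the next one
    return None

print(parse_speech_date("13.09.2022"))                # 2022-09-13
print(parse_speech_date("13 September 2022, 15:30"))  # 2022-09-13
print(parse_speech_date("Date not found"))            # None
```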

Normalizing and tokenizing for Aliyev's speeches now.

In [None]:
#Reusing normalize_and_tokenize() from the Pashinyan section above, since the cleaning steps are identical

#Applying the function to my DataFrame
df_a['Tokens'] = df_a['Speech Text'].apply(normalize_and_tokenize)

#Displaying updated DataFrame to check
print(df_a[['Speech Text', 'Tokens']].head())

In [None]:
#Let's see a full speech just to make sure it looks OK
print(df_a['Speech Text'].iloc[17]) #Picking a random row/speech

Everything looks alright to me. Now I have two DataFrames to work with and analyze, one each for Pashinyan and Aliyev.

***Sentiment Analysis***

I'm gonna start with a sentiment analysis to try to quantify and compare the tone of the speeches. I'll use TextBlob.

In [None]:
!pip install textblob nltk

In [None]:
from textblob import TextBlob

def calculate_sentiment(text):
    """This is a function to calculate sentiment"""
    blob = TextBlob(text)
    return blob.sentiment.polarity  #Gives me a polarity score: -1 (negative) to +1 (positive)

#Applying sentiment analysis to both DataFrames
df_p['Sentiment'] = df_p['Speech Text'].apply(calculate_sentiment)
df_a['Sentiment'] = df_a['Speech Text'].apply(calculate_sentiment)

#Displaying sample results
print(df_p[['Speech Text', 'Sentiment']].head())
print(df_a[['Speech Text', 'Sentiment']].head())

Now I want to visualize changes in sentiment over time. I also want to easily see when each important event occurred, so I'm adding labels for each (incl. start and end date for the Second Nagorno-Karabakh War).

In [None]:
import matplotlib.pyplot as plt

#Sorting by date
df_p.sort_values('Date', inplace=True)
df_a.sort_values('Date', inplace=True)

#Defining my key events and their dates
events = {
    "2nd NK War Start": "2020-09-27",
    "2nd NK War End": "2020-11-10",
    "Sept 2022 Attacks": "2022-09-13",
    "Dec 2022 Blockade": "2022-12-12",
    "Sept 2023 Offensive": "2023-09-19"
}

#Converting event dates to datetime
event_dates = {event: pd.to_datetime(date) for event, date in events.items()}

#Plotting sentiment for both leaders
plt.figure(figsize=(12, 6))
plt.plot(df_p['Date'], df_p['Sentiment'], label='Pashinyan', marker='o', color='blue')
plt.plot(df_a['Date'], df_a['Sentiment'], label='Aliyev', marker='x', color='red')

#Adding labels for events
for event, date in event_dates.items():
    plt.axvline(x=date, color='gray', linestyle='--', alpha=0.7)  #Vertical lines for my key events
    plt.text(date, 0.1, event, rotation=90, verticalalignment='bottom', fontsize=9, color='black')  #Text labels

#Adding plot details here
plt.xlabel('Date')
plt.ylabel('Sentiment')
plt.title('Sentiment Over Time: Pashinyan vs. Aliyev with Events')
plt.legend()
plt.tight_layout()

plt.show()

I also want to get some summary stats for sentiment, including mean sentiment for each leader.

In [None]:
#Calculating mean sentiment for Pashinyan
pashinyan_mean_sentiment = df_p['Sentiment'].mean()

#Calculating mean sentiment for Aliyev
aliyev_mean_sentiment = df_a['Sentiment'].mean()

#Summary statistics for both leaders
pashinyan_stats = df_p['Sentiment'].describe()
aliyev_stats = df_a['Sentiment'].describe()

#Printing results
print("Pashinyan Mean Sentiment:", pashinyan_mean_sentiment)
print("Aliyev Mean Sentiment:", aliyev_mean_sentiment)
print("\nPashinyan Sentiment Statistics:")
print(pashinyan_stats)
print("\nAliyev Sentiment Statistics:")
print(aliyev_stats)
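
Overall means hide how tone shifts from one event to the next, so a per-event breakdown is a natural follow-up. The sketch below uses only the standard library. `mean_by_event` is a hypothetical helper and the scores are invented placeholders, not my results.

```python
from collections import defaultdict
from statistics import mean

# Invented (event, sentiment) pairs, only to show the grouping logic
rows = [
    ("2nd NK War", 0.12),
    ("2nd NK War", 0.08),
    ("Border clashes", 0.05),
    ("Lachin blockade", 0.10),
    ("Lachin blockade", 0.06),
]

def mean_by_event(rows):
    """Group sentiment scores by event and return the rounded mean for each."""
    grouped = defaultdict(list)
    for event, score in rows:
        grouped[event].append(score)
    return {event: round(mean(scores), 3) for event, scores in grouped.items()}

event_means = mean_by_event(rows)
print(event_means)
# {'2nd NK War': 0.1, 'Border clashes': 0.05, 'Lachin blockade': 0.08}
```

With the manually added "Event" column in place, `df_p.groupby('Event')['Sentiment'].mean()` should give the same kind of breakdown directly in pandas.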

First, it looks like the average sentiment for Aliyev's speeches over this time period is slightly higher than that of Pashinyan's speeches, which makes sense. Azerbaijan was the clear victor across these stages of the conflict, winning on the battlefield over nearly all the events and eventually taking over Nagorno-Karabakh completely. It's interesting that Pashinyan's mean sentiment is in the positive range, considering Armenia, obviously, was the clear loser and lost control of what it sees as some of its historical homeland. Much of this can be chalked up to Pashinyan's opting to strike a tone of resilience and optimism in many of his speeches. For example, this excerpt from his Nov. 12, 2020 speech, just after the end of the 2nd NK War, illustrates this point: "Many may [wonder] whether we can talk about a good future after such a disastrous war. Yes, because today there are countries that have suffered the most severe capitulations in the 20th century, but today are among the most powerful nations in the world. They did so after a brutal defeat, with an emphasis on the development of education, science, industry and democracy, and this should be our next step. And I urge all of us to focus on what we can do to strengthen our country. This will be our best service to the memory of our martyrs, our wounded and disabled servicemen, their relatives, families, mothers, fathers, wives, and children." Clearly, Pashinyan here is trying to maintain a sense of hope and look forward to a brighter future.

A couple of things jump out to me from the line graph. Pashinyan's most positive speech came right after the start of the 2nd NK War. Here's a bit of it: "Victory and only victory is the ending that we see at the end of this struggle. Today, a few hours ago, the Artsakh Defense Army launched a powerful counteroffensive and recorded significant advances and destroyed several special units of Azerbaijan. No matter how many mercenaries have been deployed over there: the Azerbaijani side cannot compete with the Armenian will to win and live. With joint efforts we will break the adversary's backbone, so that it could never raise its murderous hands on us, so that its bloody gaze could never fall upon our people...This is a new Sardarapat, and we all should be prepared to devote ourselves to a single mission that we call final victory. Each of us must be ready to be on the forefront of that victory. We will win! We are sure to win! Rest assured that victory will be on our side!" At this point in the war, there were indeed a few positive developments for the Armenian side, but more than anything this speech serves as a rallying cry, a way for Pashinyan to inject a sense of purpose and victory into his people. It was looking like it might become a protracted conflict at this point, so Pashinyan was trying to motivate and inspire the Armenian forces and people, and ready them for a potentially long war. After this point, things went clearly south for Armenia on the battlefield, and you can see that reflected in the sentiment of Pashinyan's ensuing speeches.

Not surprisingly, Aliyev's speeches were clearly more positive than Pashinyan's after the conclusion of the 2nd NK War. Azerbaijan won and claimed quite a bit of territory. We then see Aliyev at his most positive after the September 2022 border clashes, which tracks - the Azerbaijani attacks struck across a wide swath of Armenia proper, the largest such offensive on mainland Armenia in the entire history of the conflict. Armenia put up little resistance, and the international community hardly batted an eye, so Aliyev had to be feeling pretty good: "Our Victory march started in this direction as we broke through the enemy's first line of defense. For 44 days, the Azerbaijan Army moved forward every single day. Every day, without stopping, without pausing, tirelessly, we went forward, shedding blood and giving martyrs, but advancing towards Victory. We didn't stop even for a day. We didn't step back even for a day but went forward, chasing the enemy on the battlefield and winning the historic Victory."

Lastly, we can see that Pashinyan was at his most negative in the lead-up to Azerbaijan's 2023 lightning offensive and takeover of NK. You can clearly feel a sense of despair in this Sep. 7, 2023 speech as Azerbaijan built up its forces at the front lines: "In the past week, the military-political situation in our region significantly worsened. The reason is that Azerbaijan has been accumulating troops for days along the Nagorno Karabakh contact line and the Armenia-Azerbaijan border. The rhetoric of anti-Armenian hatred has intensified in the Azerbaijani press and propaganda platforms. The policy of encroachment on the sovereign territory of Armenia continues." Interestingly, Pashinyan's rhetoric grows slightly more positive after the Azerbaijani offensive, again most probably to strike that tone of resilience and optimism, looking forward to (hopefully) less bleak times.

***Rhetorical Strategy Analysis***

Now I want to take a deeper dive into Pashinyan's and Aliyev's rhetorical strategies. I'll tokenize by sentence here, which will be more conducive to such an analysis.

In [None]:
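import nltk
nltk.download('punkt') #Sentence tokenizer models
nltk.download('punkt_tab') #Recent NLTK releases look for this resource instead of 'punkt'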

from nltk.tokenize import sent_tokenize

#Tokenizing the speeches in df_p and df_a by sentence
df_p['Sentences'] = df_p['Speech Text'].apply(sent_tokenize)
df_a['Sentences'] = df_a['Speech Text'].apply(sent_tokenize)

#Viewing the first few sentences of speeches
print(df_p['Sentences'].iloc[0])
print(df_a['Sentences'].iloc[0])

In [None]:
!pip install blis

In [None]:
import pandas as pd
import spacy
from collections import Counter
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Load spaCy model
nlp = spacy.load("en_core_web_sm")

# Example: Processing Pashinyan's speeches
df_p['processed_speech'] = df_p['Speech Text'].apply(lambda x: ' '.join([token.text.lower() for token in nlp(x) if token.is_alpha]))

# List of common hate speech words (expand as needed)
hate_words = [
    'terrorist', 'enemy', 'invader', 'occupier', 'separatist', 'vandal', 
    'criminal', 'barbarian', 'savage', 'threat', 'aggressor', 'murderer', 
    'traitor', 'enemy-of-the-state', 'radical', 'fanatic', 'extremist', 
    'bloodthirsty', 'devil', 'beast', 'oppressor', 'rebel', 'usurper', 
    'invading', 'violator', 'destroyer', 'butcher', 'enemy-combatant', 
    'scum', 'filth', 'despicable', 'terrorizing', 'horrible', 'bloodshed', 
    'executioner', 'hater', 'perpetrator', 'war-criminal', 'unforgivable', 
    'fiend', 'rapist', 'scoundrel', 'monster', 'hate-filled', 'unjust', 
    'infidel', 'disgraceful', 'subhuman', 'abomination', 'cruel', 'evil', 
    'inhuman', 'destruction', 'heinous', 'barbaric', 'traitorous', 'fanatical', 
    'hatemonger', 'radicalized', 'unholy', 'killers', 'vicious', 'unworthy', 
    'vile', 'devastator', 'cold-blooded', 'hostile', 'murderous', 'insurgent', 
    'madman', 'deceiver', 'corrupt', 'fascist', 'enemy-of-humanity', 'scary', 
    'unpatriotic', 'subversive', 'disloyal', 'hateful', 'vengeful', 'destructive', 
    'war-monger', 'intolerant', 'untrustworthy', 'tyrant', 'reprehensible', 
    'beastly', 'monstrous', 'evil-doer', 'blasphemous', 'inhumane', 'backstabber', 
    'vile', 'intolerable', 'warrior-of-evil', 'racist', 'xenophobic', 'unforgiven', 
    'unrepentant', 'sacrilegious', 'unspeakable', 'demonic', 'apocalyptic', 'chauvinist', 
    'miserable', 'hostility', 'unholy', 'unjustifiable', 'unredeemable', 'defiler'
]


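# A set drops the duplicate entries in the list above and makes lookups fast.
# Note: processed_speech keeps only alphabetic tokens, so hyphenated entries
# such as 'war-criminal' can never match and plurals like 'enemies' are missed.
hate_word_set = set(hate_words)

# Function to count hate-word occurrences in a speech
def detect_hate_speech(text, hate_words):
    """Count the whitespace-separated tokens in text that appear in hate_words."""
    return sum(1 for word in text.split() if word in hate_words)

# Apply function to detect hate speech
df_p['hate_speech_count'] = df_p['processed_speech'].apply(lambda x: detect_hate_speech(x, hate_word_set))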

# Visualize frequency of hate speech words
word_freq = Counter(" ".join(df_p['processed_speech']).split())
top_hate_words = {word: freq for word, freq in word_freq.items() if word in hate_words}

# Plot WordCloud
wordcloud = WordCloud(width=800, height=400).generate_from_frequencies(top_hate_words)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

# Compare between leaders
df_a['processed_speech'] = df_a['Speech Text'].apply(lambda x: ' '.join([token.text.lower() for token in nlp(x) if token.is_alpha]))
df_a['hate_speech_count'] = df_a['processed_speech'].apply(lambda x: detect_hate_speech(x, hate_words))

# Plot comparison
plt.plot(df_p['Date'], df_p['hate_speech_count'], label='Pashinyan', marker='o')
plt.plot(df_a['Date'], df_a['hate_speech_count'], label='Aliyev', marker='x')
plt.xlabel('Date')
plt.ylabel('Hate Speech Count')
plt.title('Comparison of Hate Speech Over Time')
plt.legend()
plt.show()
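
One caveat with raw counts is that longer speeches will naturally contain more flagged words. A length-normalized rate is a fairer basis for comparison; here's a minimal sketch of one. `hate_word_rate` is a hypothetical helper, and the lexicon and two "speeches" are toy examples, not drawn from the corpus.

```python
def hate_word_rate(tokens, lexicon, per=1000):
    """Return lexicon hits per `per` tokens (0.0 for an empty speech)."""
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in lexicon)
    return hits * per / len(tokens)

# Toy lexicon; a set gives fast membership tests
lexicon = {"enemy", "aggressor"}

short_speech = "the enemy attacked".split()  # 3 tokens, 1 hit
long_speech = ("the enemy attacked and the aggressor advanced " + "we want peace " * 6).split()  # 25 tokens, 2 hits

short_rate = round(hate_word_rate(short_speech, lexicon), 1)
long_rate = round(hate_word_rate(long_speech, lexicon), 1)
print(short_rate, long_rate)  # 333.3 80.0
```

The longer toy speech has more raw hits but a much lower rate, which is exactly the kind of difference the raw-count plot above can't show.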