### Producing Sentiment variations using LLMs

Author: Raphael Merx
Input: a baseline sentence; Output: variations of this sentence where a target word is more or less intense

In [None]:
#pip install simplemind python-dotenv


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy-transformers 1.2.5 requires transformers<4.31.0,>=3.4.0, but you have transformers 4.40.0 which is incompatible.


In [5]:
from dotenv import load_dotenv
load_dotenv()

from dataclasses import dataclass
import simplemind as sm

TARGET_WORD = 'anxiety'
#TARGET_WORDS = ['abuse', 'anxiety', 'depression', 'mental_health', 'mental_illness', 'trauma']

# can be changed to gemini, see https://pypi.org/project/simplemind/
PROVIDER = "openai"
MODEL = "gpt-4o"

Get neutral sentences from corpus for LLM input for each target

**Aim**: This script processes the text files of sentences for each target term, computes a sentence-level mean arousal score from the NRC-VAD lexicon, selects the sentences whose score falls in the neutral arousal range (determined dynamically from the full NRC-VAD dataset), and saves those sentences with their metadata to the output folder.
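
As a rough illustration of the filtering step described above (not the contents of `step0_get_neutral_baselines_intensity.py`, which is not shown here), the sketch below scores a sentence by the mean arousal of its words and keeps it if the score lies in the neutral range. The lexicon `TOY_AROUSAL`, the regex tokenizer, the inclusive bounds, and the helper names `mean_arousal` and `is_neutral` are all assumptions made for the example; the range itself is the one printed by the script.

```python
import re

# Toy stand-in for the NRC-VAD arousal scores; these values are invented for illustration.
TOY_AROUSAL = {"calm": 0.10, "report": 0.48, "study": 0.47, "panic": 0.95, "sample": 0.49}

# Neutral band as printed by the step-0 script; how it is derived is not reproduced here.
NEUTRAL_RANGE = (0.47, 0.49)


def mean_arousal(sentence, lexicon):
    """Mean arousal over the words of `sentence` found in `lexicon`, or None if none are found."""
    words = re.findall(r"[a-z']+", sentence.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else None


def is_neutral(sentence, lexicon, neutral_range=NEUTRAL_RANGE):
    """True if the sentence has a score and it lies inside the (inclusive) neutral range."""
    score = mean_arousal(sentence, lexicon)
    low, high = neutral_range
    return score is not None and low <= score <= high
```

Sentences with no lexicon words get no score and are treated as not neutral in this sketch; the actual script may handle them differently.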

In [None]:
%run step0_get_neutral_baselines_intensity.py

Dynamic neutral arousal range: (0.47, 0.49)
Processing file: c:\Users\naomi\OneDrive\COMP80004_PhDResearch\RESEARCH\PROJECTS\3_evaluation+validation - ACL 2025\sentiment-breadth-intensity\0.0_corpus_preprocessing\output\natural_lines_targets\abuse.lines.psych
Saved neutral arousal sentences for abuse to c:\Users\naomi\OneDrive\COMP80004_PhDResearch\RESEARCH\PROJECTS\3_evaluation+validation - ACL 2025\sentiment-breadth-intensity\3_intensity\synthetic\input\baselines\abuse_neutral_baselines_intensity.csv.
Total neutral sentences for abuse: 4262
Processing file: c:\Users\naomi\OneDrive\COMP80004_PhDResearch\RESEARCH\PROJECTS\3_evaluation+validation - ACL 2025\sentiment-breadth-intensity\0.0_corpus_preprocessing\output\natural_lines_targets\anxiety.lines.psych
Saved neutral arousal sentences for anxiety to c:\Users\naomi\OneDrive\COMP80004_PhDResearch\RESEARCH\PROJECTS\3_evaluation+validation - ACL 2025\sentiment-breadth-intensity\3_intensity\synthetic\input\baselines\anxiety_neutral_basel

Set up examples to inject into the prompt

In [None]:
# This code sets up examples of baseline sentences and their intensity-modified variations (more and less intense) to provide context and guidance for the LLM. 
# These examples are formatted into a structured prompt to help the model understand how to generate intensity-modified variations for new sentences.

@dataclass
class Example:
    baseline: str
    more_intense: str
    less_intense: str

    def format_for_prompt(self):
        return f"""<baseline>
{self.baseline}
</baseline>
<increased {TARGET_WORD}>
{self.more_intense}
</increased {TARGET_WORD}>
<decreased {TARGET_WORD}>
{self.less_intense}
</decreased {TARGET_WORD}>
"""

EXAMPLES = [
    Example(
        baseline="In a 24-yr-old female patient with a 12-yr history of kleptomania, it appeared that the behavior was maintained because it reduced anxiety in relevant situations.",
        more_intense="In a 24-yr-old female patient with a 12-yr history of kleptomania, it appeared that the behavior was maintained because it alleviated intense and overwhelming anxiety that pervaded her daily life.",
        less_intense="In a 24-yr-old female patient with a 12-yr history of kleptomania, it appeared that the behavior was maintained because it relieved mild, occasional anxiety in certain situations.",
    ),
    Example(
        baseline="Anxiety, insecurity, and lack of skill in establishing appropriate relationships prevented workers from obtaining adequate bases for evaluation and for effectuation of treatment.",
        more_intense="Intense anxiety, profound insecurity, and a lack of skill in establishing appropriate relationships prevented workers from obtaining adequate bases for evaluation and for effectuation of treatment.",
        less_intense="Mild anxiety, occasional insecurity, and a lack of skill in establishing appropriate relationships prevented workers from obtaining adequate bases for evaluation and for effectuation of treatment.",
    ),
    Example(
        baseline="The relationship between self-reported fear and anxiety was examined in a large sample of normal Australian children and adolescents.",
        more_intense="The relationship between deeply rooted fear and severe anxiety was examined in a large sample of Australian children and adolescents, revealing strong emotional undercurrents.",
        less_intense="The relationship between mild fear and occasional anxiety was examined in a large sample of Australian children and adolescents.",
    )
]

PROMPT_INTRO = f"""In psychology lexicographic research, we define "intensity" as the "extent to which a word refers to more emotionally or referentially intense phenomena". Here we study the intensity of the word "{TARGET_WORD}"
You will be given a sentence with the word "{TARGET_WORD}" in it. You will then be asked to write two new sentences: one where the word "{TARGET_WORD}" is more intense, and one where it is less intense.

"""


In [None]:
# This code chunk constructs a structured prompt by combining predefined examples with a new target sentence, queries an LLM to generate intensity-modified variations (more and less intense) 
# of the target sentence, and extracts the results for further analysis.

@dataclass
class SentenceToModify:
    text: str

    def get_prompt(self):
        prompt = PROMPT_INTRO
        for example in EXAMPLES:
            prompt += example.format_for_prompt()
            prompt += "\n\n"
        
        prompt += f"""<baseline>
{self.text}
</baseline>
"""
        return prompt
    
    def parse_response(self, response: str):
        # get the sentences inside <increased {TARGET_WORD}> and <decreased {TARGET_WORD}>
        increased_variation = response.split(f"<increased {TARGET_WORD}>")[1].split(f"</increased {TARGET_WORD}>")[0].strip()
        decreased_variation = response.split(f"<decreased {TARGET_WORD}>")[1].split(f"</decreased {TARGET_WORD}>")[0].strip()
        return increased_variation, decreased_variation

    def get_variations(self) -> tuple[str, str]:
        """ Returns a tuple of two strings: one where the TARGET_WORD is more intense, and one where it is less intense """
        # case-insensitive check, so sentence-initial "Anxiety" also counts
        assert TARGET_WORD.lower() in self.text.lower(), f"TARGET_WORD {TARGET_WORD} not found in text"
        prompt = self.get_prompt()
        res = sm.generate_text(prompt=prompt, llm_provider=PROVIDER, llm_model=MODEL)
        return self.parse_response(res)
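
`parse_response` above raises an `IndexError` if the model leaves out one of the tags. A minimal sketch of a more tolerant parser follows; `extract_tag` and `parse_variations` are hypothetical helpers, not part of the notebook or of simplemind, and they return `None` for a missing tag instead of raising.

```python
import re


def extract_tag(response, tag):
    """Return the stripped text inside <tag>...</tag>, or None if the pair is missing."""
    pattern = re.escape(f"<{tag}>") + r"(.*?)" + re.escape(f"</{tag}>")
    match = re.search(pattern, response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_variations(response, target_word):
    """Return (more_intense, less_intense); an element is None if its tag is absent."""
    return (
        extract_tag(response, f"increased {target_word}"),
        extract_tag(response, f"decreased {target_word}"),
    )
```

Returning `None` lets the caller decide whether to retry the request or skip the sentence, rather than stopping a long run on the first malformed reply.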


In [None]:
# This code initializes a SentenceToModify object with a given sentence and uses the get_variations() method to generate two intensity-modified 
# versions (more intense and less intense) of the target word within that sentence.

sentence = SentenceToModify(text="The anxiety caused by impending surgery left her feeling overwhelmed and vulnerable, affecting her daily life.")
sentence.get_variations()


('The intense anxiety caused by the impending surgery consumed her, leaving her utterly overwhelmed and profoundly vulnerable, disrupting every aspect of her daily life.',
 'The mild anxiety caused by the impending surgery left her feeling a bit uneasy and slightly vulnerable, affecting her daily routine to a small degree.')
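
To apply the same call to many baseline sentences, one option is a small loop that records failures instead of stopping. The sketch below is an assumption about how such a run could look, not code from this notebook: `collect_variations` is a hypothetical helper, and the generator is passed in as a plain callable so the loop can be tried without an API key.

```python
def collect_variations(sentences, get_variations):
    """Apply `get_variations` to each sentence and return (rows, failures).

    `get_variations` takes a sentence and returns (more_intense, less_intense).
    Sentences that fail the target-word check or cannot be parsed are recorded
    in `failures`; other exceptions (for example network errors) are not caught.
    """
    rows, failures = [], []
    for text in sentences:
        try:
            more_intense, less_intense = get_variations(text)
        except (AssertionError, IndexError) as err:
            failures.append((text, repr(err)))
            continue
        rows.append(
            {"baseline": text, "more_intense": more_intense, "less_intense": less_intense}
        )
    return rows, failures
```

With the class defined earlier, the callable could be `lambda t: SentenceToModify(text=t).get_variations()`.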
