# Sentiment analysis experiment
This notebook contains sample EDSL code for exploring what language models "know" about sentiment. It compares how their responses to sentiment-analysis prompts change when they are given different amounts of information about sentiment, and when they are first prompted to consider what they already know about it.

Inspired by Michael Burnham's paper: [What is Sentiment Meant to Mean to Language Models?](https://arxiv.org/pdf/2405.02454)

[EDSL](https://docs.expectedparrot.com/en/latest/) is an open-source library for simulating surveys and experiments with AI agents and language models. Please see our [documentation](https://docs.expectedparrot.com/en/latest/) for instructions on installing the package and [join our Discord](https://discord.com/invite/mxAYkjfy9m) to ask questions and chat with other researchers!

In [1]:
# pip install edsl

In [2]:
from edsl.questions import QuestionMultipleChoice, QuestionFreeText
from edsl import Scenario, Survey, Agent, Model

In [3]:
# Define a dataset and labels (replace with your own data as needed)

texts = [ # POTUS recent tweets
    "Tune in as I deliver the keynote address at the U.S. Holocaust Memorial Museum’s Annual Days of Remembrance ceremony in Washington, D.C.",
    "We’re a nation of immigrants. A nation of dreamers. And as Cinco de Mayo represents, a nation of freedom.",
    "Medicare is stronger and Social Security remains strong. My economic plan has helped extend Medicare solvency by a decade. And I am committed to extending Social Security solvency by making the rich pay their fair share.",
    "Today, the Army Black Knights are taking home West Point’s 10th Commander-in-Chief Trophy. They should be proud. I’m proud of them too – not for the wins, but because after every game they hang up their uniforms and put on another: one representing the United States.",
    "This Holocaust Remembrance Day, we mourn the six million Jews who were killed by the Nazis during one of the darkest chapters in human history. And we recommit to heeding the lessons of the Shoah and realizing the responsibility of 'Never Again.'",
    "The recipients of the Presidential Medal of Freedom haven't just kept faith in freedom. They kept all of America's faith in a better tomorrow.",
    "Like Jill says, 'Teaching isn’t just a job. It’s a calling.' She knows that in her bones, and I know every educator who joined us at the White House for the first-ever Teacher State Dinner lives out that truth every day.",
    "Jill and I send warm wishes to Orthodox Christian communities around the world as they celebrate Easter. May the Lord bless and keep you this Easter Sunday and in the year ahead.",
    "Dreamers are our loved ones, nurses, teachers, and small business owners – they deserve the promise of health care just like all of us. Today, my Administration is making that real by expanding affordable health coverage through the Affordable Care Act to DACA recipients.",
    "With today’s report of 175,000 new jobs, the American comeback continues. Congressional Republicans are fighting to cut taxes for billionaires and let special interests rip folks off, I'm focused on job creation and building an economy that works for the families I grew up with."
]

labels = ["Positive", "Negative", "Neutral"]

In [4]:
# Create questions prompting a model to consider the task of sentiment analysis and to label some data
# See examples of all question types: https://docs.expectedparrot.com/en/latest/questions.html

from edsl.questions import QuestionFreeText, QuestionMultipleChoice

q_consideration = QuestionFreeText(
    question_name = "consideration",
    question_text = "What do you know about sentiment analysis?" # Try variations to see if the model does a better job in the next question without instructions
)

q_classification = QuestionMultipleChoice(
    question_name = "classification",
    question_text = "Identify the {{ term }} of the following text: {{ text }}",
    question_options = labels
)
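
The `{{ term }}` and `{{ text }}` placeholders in `q_classification` are filled in from each scenario when the survey runs. As a rough illustration of what the resulting question text looks like, here is a minimal standalone sketch. The `render` helper is hypothetical and only mimics the substitution; it is not part of EDSL, which handles templating itself.

```python
import re

def render(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder with values[name].

    Hypothetical helper for illustration only; EDSL does its own templating.
    """
    def substitute(match):
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "Identify the {{ term }} of the following text: {{ text }}"

# One term/text combination, using a phrase from the last tweet in `texts`
prompt = render(template, {"term": "Sentiment", "text": "The American comeback continues."})
print(prompt)
```

Swapping the `term` value for one of the longer instruction strings shows how much extra context each variant adds to the question text.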

In [5]:
# Combine questions in a survey to administer them together

from edsl import Survey

survey = Survey(questions = [q_consideration, q_classification])

In [6]:
# Optionally add survey logic/rules
# Questions are administered asynchronously (independently) by default
# We can add a "memory" of the first question when the next question is presented
# When we generate results we can check the prompts that were used to see this additional info

survey = survey.add_targeted_memory(q_classification, q_consideration) # compare to survey results when a memory is not added

In [7]:
# Create scenarios so that the classification question is run for every text with every variant of the term

terms = [
    "Sentiment analysis is.....", # add any instructions to compare to responses where no detailed instructions are given
    "Emotional valence means.....",
    "Sentiment", # no additional info
    "Emotional valence",
    "Stance (opinion)"
]
scenarios = [Scenario({"term":term, "text":text}) for term in terms for text in texts]

# We can check the combinations of parameters that will be used

# scenarios
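
The nested comprehension above pairs every term with every text, so the survey runs once per combination. A minimal sketch of the same pairing with `itertools.product`, using standalone names (`term_variants`, `n_texts`) that are assumptions for illustration and do not touch the notebook's variables:

```python
from itertools import product

# The five term variants used in this notebook
term_variants = [
    "Sentiment analysis is.....",
    "Emotional valence means.....",
    "Sentiment",
    "Emotional valence",
    "Stance (opinion)",
]
n_texts = 10  # the `texts` list defined earlier holds ten tweets

# Same ordering as `for term in terms for text in texts`: term varies slowest
pairs = list(product(term_variants, range(n_texts)))

print(len(pairs))           # number of scenarios: 5 terms x 10 texts
print(pairs[0], pairs[10])  # first pair for the first and second term
```

With five terms and ten texts this gives 50 scenarios, each of which is sent to every agent and model added to the survey.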

In [8]:
# Create AI agents with general instructions and any desired traits 
# We can also create unique agent traits to compare their responses

agents = [Agent(traits = {
    "persona": "You are classifying texts....", # add any general instructions for the task
    # add additional general/agent-specific traits
})]

# We can inspect the individual agents' traits that were created

agents

[Agent(traits = {'persona': 'You are classifying texts....'})]

In [9]:
# Checking available models to use to generate responses

Model.available()

[['01-ai/Yi-34B-Chat', 'deep_infra', 0],
 ['Austism/chronos-hermes-13b-v2', 'deep_infra', 1],
 ['Gryphe/MythoMax-L2-13b', 'deep_infra', 2],
 ['Gryphe/MythoMax-L2-13b-turbo', 'deep_infra', 3],
 ['HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1', 'deep_infra', 4],
 ['Phind/Phind-CodeLlama-34B-v2', 'deep_infra', 5],
 ['bigcode/starcoder2-15b', 'deep_infra', 6],
 ['bigcode/starcoder2-15b-instruct-v0.1', 'deep_infra', 7],
 ['claude-3-haiku-20240307', 'anthropic', 8],
 ['claude-3-opus-20240229', 'anthropic', 9],
 ['claude-3-sonnet-20240229', 'anthropic', 10],
 ['codellama/CodeLlama-34b-Instruct-hf', 'deep_infra', 11],
 ['codellama/CodeLlama-70b-Instruct-hf', 'deep_infra', 12],
 ['cognitivecomputations/dolphin-2.6-mixtral-8x7b', 'deep_infra', 13],
 ['databricks/dbrx-instruct', 'deep_infra', 14],
 ['deepinfra/airoboros-70b', 'deep_infra', 15],
 ['gemini-pro', 'google', 16],
 ['google/codegemma-7b-it', 'deep_infra', 17],
 ['google/gemma-1.1-7b-it', 'deep_infra', 18],
 ['gpt-3.5-turbo', 'openai', 19],


In [10]:
# Selecting models to add to the survey

models = [Model(m) for m in ['gpt-4-1106-preview']] # add any other models to list

# We can inspect and modify model parameters

models

[Model(model_name = 'gpt-4-1106-preview', temperature = 0.5, max_tokens = 1000, top_p = 1, frequency_penalty = 0, presence_penalty = 0, logprobs = False, top_logprobs = 3)]

In [11]:
# Add components to the survey and run it

results = survey.by(scenarios).by(agents).by(models).run()

In [12]:
# Results include information about all the components that we can analyze

results.columns

['agent.agent_name',
 'agent.persona',
 'answer.classification',
 'answer.consideration',
 'comment.classification_comment',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.classification_system_prompt',
 'prompt.classification_user_prompt',
 'prompt.consideration_system_prompt',
 'prompt.consideration_user_prompt',
 'question_options.classification_question_options',
 'question_options.consideration_question_options',
 'question_text.classification_question_text',
 'question_text.consideration_question_text',
 'question_type.classification_question_type',
 'question_type.consideration_question_type',
 'raw_model_response.classification_raw_model_response',
 'raw_model_response.consideration_raw_model_response',
 'scenario.term',
 'scenario.text']

In [13]:
# Inspecting the responses

(results
 .sort_by("term")
 .sort_by("text")
 .select("text", "term", "consideration", "classification")
 .print(format="rich")
)
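
To compare how the wording of the term affects the labels, one option is to tally the classifications for each term. The sketch below is a plain-Python illustration under stated assumptions: `tally_by_term` is a hypothetical helper, and `example_rows` contains made-up placeholder pairs, not actual model output. It expects `(term, classification)` pairs, however they are extracted from `results`.

```python
from collections import Counter, defaultdict

def tally_by_term(rows):
    """Count how often each label was chosen for each term.

    `rows` is an iterable of (term, classification) pairs.
    """
    counts = defaultdict(Counter)
    for term, label in rows:
        counts[term][label] += 1
    return {term: dict(counter) for term, counter in counts.items()}

# Made-up placeholder rows for illustration only (NOT real results)
example_rows = [
    ("Sentiment", "Positive"),
    ("Sentiment", "Neutral"),
    ("Sentiment", "Positive"),
    ("Stance (opinion)", "Positive"),
]

summary = tally_by_term(example_rows)
print(summary)
```

Running the same tally on results generated with and without the targeted memory gives a quick way to see whether the "consideration" step shifts the distribution of labels.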