## Purpose of this Colab

Large language models are here to stay and many people have done interesting experiments exploring them and how they seem to capture aspects of human cognition.  In Psych 209-W23 the readings and lecture on Wed Feb 15 will cover some examples of this.  The purpose of this Colab is to give you tools to explore these models.  We provide it because we see several ways these tools could be extended to perform interesting experiments, including possible class projects!

One tool allows you to present a prompt to one of the variants of GPT-3 and then assess its prediction for the next word following the prompt.  Inspired by a recent talk by Richard Futrell at UC Irvine (formerly a Stanford undergrad and master's student!) and based on some code he provided, we allow you to present a prompt and get back the language model's top 5 choices for the next word, as well as the log of the probability it estimates for each choice.

We use examples from Richard's talk for this.  Given 'The children went outside to' as the prompt, the probability of 'play' is very high, but with a second prompt, 'The children went inside to', the probability of 'play' goes way down and the probabilities of some other alternatives go up.  There are a huge number of questions you can explore, including some of the ones covered in the Dasgupta, Lampinen *et al.* paper listed as a reading for Feb 13, starting from this tool and adapting it in various ways!

The second tool allows you to present a prompt and see how the model continues from there.  In this case, the model *samples* words according to its estimates of their probabilities.  For example, with the first prompt above, 'play' is likely to be chosen by this process.  The model then appends its choice to the prompt as the first word of the completion and repeats the process.  You can set the maximum number of words to sample, and you can tell the model to stop if it ever hits one of a set of specified stop sequences, such as '.' or '!', which are common end-of-sentence markers.  So following the first prompt mentioned before, 'play' would likely be the next word, but the output may vary over the 10 runs from there.
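The sampling loop described above can be sketched with a toy stand-in for the language model. This is only an illustration under stated assumptions, not how the OpenAI endpoint is implemented: `TOY_MODEL`, `sample_completion`, and every probability in the table are hypothetical, and the toy model conditions only on the previous word, whereas a real model conditions on the whole prompt.

```python
import random

# Hypothetical next-word probabilities, keyed by the previous word.
# The numbers are invented for illustration only.
TOY_MODEL = {
    "to": {"play": 0.8, "eat": 0.1, "sleep": 0.1},
    "play": {"soccer": 0.5, "tag": 0.3, ".": 0.2},
    "eat": {"lunch": 0.7, ".": 0.3},
    "sleep": {".": 1.0},
    "soccer": {".": 1.0},
    "tag": {".": 1.0},
    "lunch": {".": 1.0},
}


def sample_completion(prompt, max_tokens=5, stop=(".", "!"), seed=None):
    """Sample words one at a time until max_tokens or a stop sequence is hit."""
    rng = random.Random(seed)
    last = prompt.split()[-1]
    completion = []
    for _ in range(max_tokens):
        choices = TOY_MODEL.get(last)
        if not choices:
            break  # the toy model has no prediction for this word
        words = list(choices)
        weights = [choices[w] for w in words]
        # Draw the next word in proportion to its estimated probability
        word = rng.choices(words, weights=weights, k=1)[0]
        if word in stop:
            break  # the stop sequence itself is not added to the completion
        completion.append(word)
        last = word  # feed the choice back in and repeat
    return completion


# Different seeds stand in for separate runs; outputs may differ between runs
for run in range(3):
    print(sample_completion("The children went outside to", seed=run))
```

Because 'play' has the highest weight after 'to', it should be the most frequent first word across runs, though any individual run may choose something else.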

NOTE: We are using the OpenAI *Completions* endpoint.  There are parameters that you can control that affect the sampling process, and you can read more about them in the [Documentation](https://platform.openai.com/docs/api-reference/completions/create) on the OpenAI website.

To get started, make your own copy of this Colab, and proceed from there!

## Preparation

### Make your own copy of this Colab

Before doing anything else, including running this Colab, please make your own copy of it (*File->Save a Copy in Drive*).

### Uncomment a Crucial Line Below

To prevent accidental running of this Colab, a crucial line in the next codeblock defining the OPENAI_API_KEY has been commented out.  The Colab will not run until you uncomment that line.  After you have uncommented that line, connect the Colab to a server and then run the next code block.

You may see some error messages about dependencies, but they should not prevent you from using the Colab.

In [19]:
!pip install openai==0.28
!pip install pandas
import openai
from getpass import getpass

import sys
import csv
import json
import requests
import numpy as np
import pandas as pd
import glob
import os


# Read the API key from a JSON file
with open('openai_api_key.json') as f:
    data = json.load(f)

openai_api_key = data['api_key']
OPENAI_API_KEY = openai_api_key



## Part 1. Analysis of word at the end of the sequence (Unidirectional model only)

---



### Basic methods for this section.
The method `get_completions_with_logprobs` prints the 5 tokens most likely to immediately follow the prompt. For each token, it prints the log probability and the corresponding probability. After that, it prints the sum of the probabilities of the top 5 tokens.
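As a rough sketch of the output described above (not the course's actual `get_completions_with_logprobs`), the helper below converts log probabilities to probabilities with `math.exp` and sums them. The names `summarize_top_logprobs` and `fetch_top_logprobs` are hypothetical, the model name is an assumption, and the example values are made up rather than real model output. The fetch function assumes `openai==0.28` with `openai.api_key` already set, and it is defined here but not called.

```python
import math


def summarize_top_logprobs(top_logprobs):
    """Print each token's log probability and probability, then their sum.

    `top_logprobs` maps token -> natural-log probability.
    Returns the summed probability.
    """
    ranked = sorted(top_logprobs.items(), key=lambda item: item[1], reverse=True)
    total = 0.0
    for token, logprob in ranked:
        prob = math.exp(logprob)  # probability = e ** logprob
        total += prob
        print(f"{token!r:>12}  logprob={logprob:8.3f}  prob={prob:.4f}")
    print(f"Sum of top-{len(ranked)} probabilities: {total:.4f}")
    return total


def fetch_top_logprobs(prompt, model="gpt-3.5-turbo-instruct"):
    """Request one token with the top 5 logprobs (legacy Completions endpoint).

    Assumption: openai==0.28 is installed and openai.api_key is set.
    """
    import openai  # imported lazily so the helper above also works offline

    response = openai.Completion.create(
        model=model, prompt=prompt, max_tokens=1, logprobs=5, temperature=0
    )
    # top_logprobs is a list with one dict per generated token
    return dict(response["choices"][0]["logprobs"]["top_logprobs"][0])


# Offline example with made-up log probabilities (not real model output)
example = {" play": -0.22, " get": -3.0, " see": -3.5, " go": -4.0, " enjoy": -4.4}
summarize_top_logprobs(example)
```

The sum is usually below 1 because probability mass also falls on tokens outside the top 5.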


In [51]:
def get_emotionality_rating(sentence, max_attempts=5, brief_input=True):
    """Collect up to 5 numeric emotionality ratings for `sentence` from the model.

    Set brief_input=False to print the sentence and each raw response.
    """
    prompt = f"Rate the following sentence on its emotionality from very negative (-1) to very positive (+1):\n\n\"{sentence}\"\n\nAnswer with a number:"
    if not brief_input:
        print("analysing: ",sentence)
    openai.api_key = OPENAI_API_KEY  # Defined in the setup cell above

    ratings = []
    attempts = 0

    while len(ratings) < 5 and attempts < max_attempts:
        response = openai.ChatCompletion.create(
            model="gpt-4-turbo",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=5,
            temperature=0.7,
            n=5  # Ask for 5 completions
        )

        for i, choice in enumerate(response.choices):
            content = choice.message.content.strip()
            if not brief_input:
                print(f"Attempt {attempts+1} - Response {i+1}:", content)
            try:
                rating = float(content)
                ratings.append(rating)
            except ValueError:
                if not brief_input:
                    print(f"Non-numeric response: '{content}', skipping.")

            if len(ratings) >= 5:
                break

        attempts += 1

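    # Use `and`, not `&`: `&` binds tighter than `<` and `==`, so the original test never fired
    if len(ratings) < 5 and not brief_input:
        print("Warning: Could not collect 5 valid numeric ratings after maximum attempts.")

    return ratings[:5]  # Return at most 5 ratings (fewer if parsing kept failing)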


# Example usage
print(get_emotionality_rating(":)",5,False))
print(get_emotionality_rating("Aur Naur"))


analysing:  :)
Attempt 1 - Response 1: +1
Attempt 1 - Response 2: +1
Attempt 1 - Response 3: 0.9
Attempt 1 - Response 4: +1
Attempt 1 - Response 5: +1
[1.0, 1.0, 0.9, 1.0, 1.0]
[0.0, 0.0, 0.0, 0.0, 0.0]
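The function above skips any response that `float()` cannot parse, so a reply such as `0.9.` or `Rating: 0.5` is discarded. One possible alternative, sketched here as an assumption rather than as part of the original notebook, is to extract the first number with a regular expression and reject values outside the requested range. The name `parse_rating` is hypothetical.

```python
import re

# Matches an optionally signed integer or decimal, e.g. "+1", "-0.5", ".7"
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)")


def parse_rating(content, low=-1.0, high=1.0):
    """Return the first number in `content` if it lies in [low, high], else None."""
    match = _NUMBER.search(content)
    if match is None:
        return None  # no number found in the response
    value = float(match.group())
    if value < low or value > high:
        return None  # outside the scale the prompt asked for
    return value


for reply in ["+1", "0.9.", "Rating: -0.5", "neutral", "7"]:
    print(repr(reply), "->", parse_rating(reply))
```

Whether to accept such loosely formatted replies is a design choice: tolerant parsing wastes fewer API calls, while strict parsing avoids picking up a number the model did not intend as its rating.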


In [60]:
tempdf = pd.read_csv('./sentiment_analysis.csv')

# Collect one mean rating per row so the list lines up with the DataFrame
mean_ratings = []

for idx, row in tempdf.iterrows():
    text = row.get('sentence', '')
    if pd.isna(text) or not text.strip():
        mean_rating = np.nan
    else:
        ratings = get_emotionality_rating(text)
        # Mean ignoring NaN; an empty list (no valid ratings) becomes NaN
        mean_rating = np.nanmean(ratings) if ratings else np.nan
        print(f"Row {idx}: Mean GPT rating = {mean_rating}")
    # Append for every row, including empty sentences, so lengths match
    mean_ratings.append(mean_rating)

# Add the new column
tempdf['gpt_rating'] = mean_ratings

# Save back to the same file
tempdf.to_csv('./sentiment_analysis.csv', index=False)

Row 0: Mean GPT rating = -0.78
Row 1: Mean GPT rating = 0.8
Row 2: Mean GPT rating = 0.7699999999999999
Row 3: Mean GPT rating = 0.8400000000000001
Row 4: Mean GPT rating = 0.8
Row 5: Mean GPT rating = -0.52
Row 6: Mean GPT rating = -0.82
Row 7: Mean GPT rating = -0.8
Row 8: Mean GPT rating = -0.56
Row 9: Mean GPT rating = -0.8
Row 10: Mean GPT rating = -0.6
Row 11: Mean GPT rating = 0.0
Row 12: Mean GPT rating = 0.5
Row 13: Mean GPT rating = 0.79
Row 14: Mean GPT rating = 0.0
Row 15: Mean GPT rating = 0.0
Row 16: Mean GPT rating = 0.0
Row 17: Mean GPT rating = 0.5
Row 18: Mean GPT rating = 0.0
Row 19: Mean GPT rating = 0.0
Row 20: Mean GPT rating = -0.8
Row 21: Mean GPT rating = 0.0
Row 22: Mean GPT rating = 0.0
Row 23: Mean GPT rating = 0.0
Row 24: Mean GPT rating = -1.0
Row 25: Mean GPT rating = -0.64
Row 26: Mean GPT rating = -0.6
Row 27: Mean GPT rating = 0.0
Row 28: Mean GPT rating = -0.7
Row 29: Mean GPT rating = 0.25
Row 30: Mean GPT rating = 0.0
Row 31: Mean GPT rating = 0.1

In [58]:
mean_ratings


[np.float64(-0.76)]

In [38]:
tempdf

Unnamed: 0,file_name,start_time,end_time,sentence,syuzhet_sentiment_score,word_scores,compound,pos,neu,neg,but_count
0,user_106_testimonial_105_stimuli_363_.tsv,5.36,17.52,"So I grew up in Del Paso Heights, which is, it...",-2.65,"{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",-0.542,0.000,0.863,0.137,0
1,user_106_testimonial_105_stimuli_363_.tsv,18.60,30.20,"Well, one of the good things is I had a neighb...",3.20,"{1.1, 0, 0, 0, 1.9, 0, 0, 0, 0, 0, 0, 0, 0, 0,...",0.886,0.246,0.754,0.000,0
2,user_106_testimonial_105_stimuli_363_.tsv,30.38,37.36,"We went camping, doing nature stuff and just g...",1.20,"{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.3...",0.511,0.163,0.837,0.000,0
3,user_106_testimonial_105_stimuli_363_.tsv,37.36,40.48,It was a great experience for me.,0.50,"{0, 0, 0, 3.1, 0, 0, 0}",0.625,0.406,0.594,0.000,0
4,user_106_testimonial_105_stimuli_363_.tsv,42.70,54.72,"However, one night we went to Scouts and I rem...",0.50,"{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",0.572,0.121,0.879,0.000,0
...,...,...,...,...,...,...,...,...,...,...,...
302,user_84_testimonial_83_stimuli_285_.tsv,106.46,118.40,And I think they resented that for a long time...,-1.00,"{0, 0, 0, 0, -1.6, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",-0.382,0.000,0.890,0.110,0
303,user_84_testimonial_83_stimuli_285_.tsv,118.40,119.96,He was the all star.,0.60,"{0, 0, 0, 0, 0}",0.000,0.000,1.000,0.000,0
304,user_84_testimonial_83_stimuli_285_.tsv,120.12,123.26,He was the special one.,0.80,"{0, 0, 0, 1.7, 0}",0.402,0.403,0.597,0.000,0
305,user_84_testimonial_83_stimuli_285_.tsv,123.26,134.66,And I was the one who got the chores and did a...,-0.75,"{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1.9, ...",-0.440,0.000,0.873,0.127,0
