#How does temperature affect output?
#How is a text prompt divided into tokens?

Revised 2025_04_15

Tokens are the pieces of text an LLM reads from the prompt and then assembles into its response.  They resemble word roots, prefixes, and suffixes, but the actual splits often differ from what we expect, because they depend on how frequently each piece of text appears in the LLM's training data.

This project uses OpenAI's [gpt-4o-mini](https://platform.openai.com/docs/models/gpt-4o-mini), a relatively cost-effective model that has a context window of 128,000 tokens, which includes all input and output tokens, as well as control tokens used for directions about how to respond to the prompt.  Its maximum output is 16,384 tokens.
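
The budget arithmetic implied by those two limits can be sketched as follows.  The constant and function names are illustrative and not part of any OpenAI library, and the sketch ignores control tokens, so treat the result as an upper bound rather than an exact figure.

```python
# Limits quoted above for gpt-4o-mini
CONTEXT_WINDOW = 128_000   # input + output tokens combined
MAX_OUTPUT = 16_384        # cap on output tokens


def max_completion_tokens(prompt_tokens: int) -> int:
    """Return how many output tokens remain available for a prompt of this size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The smaller of the two limits applies; never report a negative budget
    return max(0, min(MAX_OUTPUT, remaining))


print(max_completion_tokens(11))        # short prompt: limited by the output cap
print(max_completion_tokens(120_000))   # long prompt: limited by the context window
```

For a short prompt the output cap is the binding limit; only very long prompts run into the context window first.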

#Installs
Some of these installs may not be required when running code in Google Colab.  Software versions are updated frequently, so if you are running locally, you may need to research which versions are compatible with each other.

In [None]:
# install OpenAI
# !pip --quiet install openai==1.10.0

# install LangChain, a framework for working with LLMs
# !pip --quiet install langchain==0.1.4
# !pip --quiet install langchain
# !pip --quiet install langchain_community
!pip --quiet install langchain-openai

# install Transformers, a library that provides pre-trained tokenizers
#!pip --quiet install transformers

print("\ndone installs\n")

done installs



#OpenAI API Key

To run this project you'll need to create an OpenAI account, and use it to obtain an OpenAI API key.  Keep your API key secret.

While creating these lessons during the summer of 2024, I loaded my account with a \$10.00 credit and turned OFF Auto Recharge, so that requests using this API key stop working once the balance drops to \$0.  I ended up using only \$0.43 in total.

*   [Create/manage your OpenAI API keys](https://platform.openai.com/settings/organization/api-keys)
*   [Track your OpenAI API key usage/costs](https://platform.openai.com/settings/organization/usage)

This project assumes the API key is saved as a Google Colab Secret.  In the left navbar, click the key icon to access secrets.


In [None]:
# to use Google Colab Secrets
from google.colab import userdata

# if instead using an environment variable
# import os

print("\ndone import\n")


done import



In [None]:
# load API key from Google Colab Secret
api_key = userdata.get('OPENAI_API_KEY')

# if instead using an environment variable to store your API key
# paste without quotation marks
# %env OPENAI_API_KEY=sk-proj-ec_this_is_a_fake_API_key
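
The two options above (Colab Secret or environment variable) can be combined into one helper.  This is a minimal sketch: `load_api_key` is a hypothetical name, and it assumes that any failure to read the Colab Secret should fall back to the environment variable.

```python
import os


def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key from Colab Secrets if available, else from the environment."""
    try:
        # Only importable inside Google Colab
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is missing or not shared with this notebook
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab Secrets or the environment")
    return key


# In the notebook:
#     api_key = load_api_key()
```

Raising an error immediately when no key is found makes the problem visible here, instead of later when the first request fails.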

#Temperature – OpenAI Prompting

In [None]:
# import OpenAI LLM
import openai

print("\ndone import\n")


done import



In [None]:
# instantiate OpenAI
# will not run without first providing your OpenAI API key above
client = openai.OpenAI(api_key=api_key)

In [None]:
# set up function to receive prompt
def llm(prompt):
    response = client.chat.completions.create(
        model='gpt-4o-mini',
        messages=[
          {"role": "user", "content": prompt}
        ],
        temperature=0.9 # temperature optional, range 0-2, default=1
    )
    return response.choices[0].message.content.strip()

In [None]:
# prompt
text = "Tell me a joke."
print(llm(text))


# TINKER:

# 1) Run several times and observe the results

# 2) Change the prompt text

# 3) Change the temperature

Why did the scarecrow win an award?

Because he was outstanding in his field!
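
One common way to picture what the temperature setting does: the model's raw scores (logits) for each candidate next token are divided by the temperature before being converted into probabilities.  The sketch below uses made-up logits for three hypothetical tokens.  That gpt-4o-mini applies temperature in exactly this way is an assumption, since the API does not expose its internals.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                            # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three hypothetical candidate tokens
logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

Low temperatures concentrate probability on the top-scoring token, so repeated runs tend to return the same joke; high temperatures spread probability across more candidates, so results vary more.  The API also accepts a temperature of 0, which this sketch excludes to avoid dividing by zero.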


#Temperature – LangChain Prompting

In [None]:
# import LangChain framework
# from langchain_openai import OpenAI
from langchain_openai import ChatOpenAI
from langchain.callbacks import get_openai_callback

print("\ndone imports\n")


done imports



In [None]:
# instantiate OpenAI
# will not run without first providing your OpenAI API key above
# llm2 = OpenAI(openai_api_key=api_key, model='gpt-4o-mini')
llm2 = ChatOpenAI(openai_api_key=api_key, model='gpt-4o-mini')

In [None]:
# simple prompt using LangChain
with get_openai_callback() as cb:
    result = llm2.invoke("Tell me a joke", temperature=0.9) # temperature optional, range 0-2, default=1
    print(result)

    # use LangChain callback to track token usage
    print()
    print(cb)


    # result = llm2.invoke("Tell me a joke", config={"temperature": 0.9})
    # print(result)


# TINKER:

# Run several times to observe different results, tokens used, and costs.
#      Note that a single word ("cat") might be mistaken for computer code,
#      or even a different language

content='Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 11, 'total_tokens': 29, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_b376dfbbd5', 'id': 'chatcmpl-BK9F98QItV1e8zeeIA7KqmoWfixlG', 'finish_reason': 'stop', 'logprobs': None} id='run-4e6e80d7-e3b1-4cf3-a8a5-c26ccc590f8f-0' usage_metadata={'input_tokens': 11, 'output_tokens': 18, 'total_tokens': 29, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}

Tokens Used: 29
	Prompt Tokens: 11
		Prompt Tokens Cached: 0
	Completion Tokens: 18
		Reasoning Tokens: 0
Successful Requests: 1
Total Cost (USD): $1.2449999999999
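
The reported cost can be roughly checked by hand from the token counts.  The prices below are assumptions (USD 0.15 per million input tokens and USD 0.60 per million output tokens for gpt-4o-mini) and may have changed, so check OpenAI's pricing page before relying on them.  `estimate_cost` is an illustrative name, not a LangChain function.

```python
# Assumed prices in USD per 1 million tokens; verify against current OpenAI pricing
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Token counts taken from the callback output above
cost = estimate_cost(prompt_tokens=11, completion_tokens=18)
print(f"${cost:.8f}")
```

Under these assumed prices a single joke costs roughly a thousandth of a cent, which matches the leading digits of the Total Cost line above if that figure is read in scientific notation.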

In [None]:
# prompt of only 1 short word using LangChain
with get_openai_callback() as cb:
    # use .invoke(); calling llm2("uber") directly raises
    #      "ValueError: Got unsupported message type: u"
    result = llm2.invoke("uber", temperature=0.9)
    print(result)

    # use LangChain callback to track token usage
    print()
    print(cb)


# TINKER:

# Run with longer single-word prompts ("catalog", "caterwauled", "cataclysmic")
#      and observe the different number of prompt tokens used

#Tokenization



In [None]:
# import the Transformers pre-trained tokenizer
from transformers import AutoTokenizer

print("\ndone import\n")


done import



In [None]:
# initialize the tokenizer
# 'gpt-4o-mini' is not a valid Hugging Face model identifier, so this lesson
#      uses the GPT-2 tokenizer as a stand-in; gpt-4o-mini may split text differently
tokenizer = AutoTokenizer.from_pretrained('gpt2')

In [None]:
# output and examine all pretrained tokens
print(tokenizer.vocab)


# TINKER:

# Copy/paste this output to a CSV file, then import it into Microsoft Excel.
# TRANSPOSE the single row into a single column, then examine the contents.
# Note that the output will be ~900 KB and far more than 16,000 rows
#      (the GPT-2 vocabulary has 50,257 tokens)

# All tokens have been processed and are viewable in this Google Sheet:
# https://docs.google.com/spreadsheets/d/1p3yhSFyWlno1TMa-vjMTsh5kECymsCWU2AFTdoFmtBs/edit?usp=sharing
# Notice glitches at rows 921, 945, 2617, etc.
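
As an alternative to copying and transposing by hand, the vocabulary can be written straight to a CSV file with one token per row.  This is a sketch: `write_vocab_csv` and the file names are hypothetical, and a tiny stand-in vocabulary is used here so the block runs without downloading a tokenizer.

```python
import csv


def write_vocab_csv(vocab: dict, path: str) -> int:
    """Write token/ID pairs to a CSV file, one per row, sorted by token ID."""
    rows = sorted(vocab.items(), key=lambda item: item[1])
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["token_id", "token"])
        for token, token_id in rows:
            writer.writerow([token_id, token])
    return len(rows)


# In the notebook:
#     write_vocab_csv(tokenizer.vocab, "gpt2_vocab.csv")

# Stand-in vocabulary built from token IDs that appear in this notebook's output
demo_vocab = {"hello": 31373, "O": 46, "ĠHello": 18435}
print(write_vocab_csv(demo_vocab, "demo_vocab.csv"), "rows written")
```

Sorting by token ID makes it easier to scan the lowest IDs first, where single characters and punctuation tend to appear.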



In [None]:
# examine prompt tokens and token IDs for a short prompt
sample_text = "caterwauled"
token_ids = tokenizer.encode(sample_text)

print("Tokens:", tokenizer.convert_ids_to_tokens(token_ids))
print("Token IDs:", token_ids)


# TINKER:

# 1) Change sample_text to a different string and predict how it will be tokenized

# 2) What might you consider a general rule of thumb regarding
#     total # words –> total # tokens?

# 3) What's the lowest token ID you can discover?

Tokens: ['c', 'ater', 'w', 'aul', 'ed']
Token IDs: [66, 729, 86, 2518, 276]
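
For the rule-of-thumb question above, a small helper can report the average number of tokens per word.  This is a sketch: `tokens_per_word` is a hypothetical name, and it accepts any encode function, so the notebook's `tokenizer.encode` can be passed in directly.

```python
def tokens_per_word(text: str, encode) -> float:
    """Return the average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return len(encode(text)) / len(words)


# In the notebook, pass the tokenizer's own method:
#     tokens_per_word("The cat caterwauled all night.", tokenizer.encode)

# Stand-in encoder that replays the token IDs recorded above for "caterwauled",
# so this block runs without downloading a tokenizer
def recorded_encode(_text: str) -> list:
    return [66, 729, 86, 2518, 276]


print(tokens_per_word("caterwauled", recorded_encode))
```

Rare words such as "caterwauled" give a high ratio; running the helper on a paragraph of ordinary prose gives a more representative figure.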


#Tokenization Peculiarities

In [None]:
# uppercase/lowercase words are tokenized differently
# sometimes short words are further broken down into smaller tokens
#      (" HELLO" = "ĠHELL", "O"; the "Ġ" marks a leading space)
sample_text = "hello HELLO Hello hello"
token_ids = tokenizer.encode(sample_text)

print("Tokens:", tokenizer.convert_ids_to_tokens(token_ids))
print("Token IDs:", token_ids)


# TINKER:

# Change sample_text to words with different capitalizations and predict how
#      they will be tokenized.

Tokens: ['hello', 'ĠHELL', 'O', 'ĠHello', 'Ġhello']
Token IDs: [31373, 47899, 46, 18435, 23748]
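
The "Ġ" at the start of some tokens is how the GPT-2 tokenizer displays a leading space, which is why "hello" at the start of the text and " hello" after a space receive different token IDs.  The sketch below (`readable` is a hypothetical helper) swaps the marker back to a space to make the tokens easier to read.

```python
def readable(tokens):
    """Replace GPT-2's 'Ġ' marker with the space character it stands for."""
    return [t.replace("Ġ", " ") for t in tokens]


# Copied from the output above
tokens = ['hello', 'ĠHELL', 'O', 'ĠHello', 'Ġhello']

print(readable(tokens))
print("".join(readable(tokens)))   # joining the tokens recovers the original text
```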


In [None]:
# multi-digit numbers are not always kept together as a single token
#      (" 7777" = "Ġ7", "777")
sample_text = "7 77 777 7777 ! ? , # @"
token_ids = tokenizer.encode(sample_text)

print("Tokens:", tokenizer.convert_ids_to_tokens(token_ids))
print("Token IDs:", token_ids)


# TINKER:

# Change sample_text to different combinations of numbers and special characters
#      and predict how they will be tokenized.

Tokens: ['7', 'Ġ77', 'Ġ777', 'Ġ7', '777', 'Ġ!', 'Ġ?', 'Ġ,', 'Ġ#', 'Ġ@']
Token IDs: [22, 8541, 35534, 767, 29331, 5145, 5633, 837, 1303, 2488]
