# Setting up

In [None]:
!pip install openai



# First, let's look at how OpenAI models tokenize text.
- This matters because you pay per token.
- Pricing: https://openai.com/pricing
- Context size matters too. OpenAI models can now handle long chats (about 128k tokens for gpt-4o), but a model can still lose track of earlier parts of the conversation when the input gets too long.
- Token-counting guide: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb

In [None]:
!pip install tiktoken



In [None]:
import tiktoken

In [None]:
encoding = tiktoken.encoding_for_model("gpt-4.1")

In [None]:
encoding.encode("tiktoken is great!")

[83, 8251, 2488, 382, 2212, 0]

In [None]:
def num_tokens_from_string(string: str, model_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.encoding_for_model(model_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

In [None]:
num_tokens_from_string("tiktoken is great!", "gpt-4.1")

6
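
Since billing is per token, a token count translates directly into a rough cost estimate. The sketch below uses a hypothetical helper, `estimate_cost`, and a placeholder price of 0.40 USD per million tokens; neither comes from OpenAI, so check the pricing page for real rates.

```python
def estimate_cost(num_tokens: int, usd_per_million_tokens: float) -> float:
    """Return the approximate cost in USD of processing `num_tokens` tokens."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens * usd_per_million_tokens / 1_000_000

# Placeholder price for illustration only; look up the real rate on the pricing page.
HYPOTHETICAL_USD_PER_MILLION = 0.40

# 10,000 short posts of about 6 tokens each, like the example string above.
total_tokens = 10_000 * 6
print(f"${estimate_cost(total_tokens, HYPOTHETICAL_USD_PER_MILLION):.4f}")
```

Input and output tokens are typically priced differently, so in practice you would apply this separately to the prompt and completion token counts.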

In [None]:
[encoding.decode_single_token_bytes(token) for token in encoding.encode("tiktoken is great!")]

# The b prefix marks a byte string (bytes) rather than a regular string. For more info: https://stackoverflow.com/questions/6224052/what-is-the-difference-between-a-string-and-a-byte-string

[b't', b'ikt', b'oken', b' is', b' great', b'!']

# gpt-4o, gpt-4o-mini, etc. are chat models.
- The model reads the previous turns of the dialogue and returns the next reply.

In [None]:
# Create an API key at https://platform.openai.com/api-keys and paste it below.
key = ""

In [None]:
from openai import OpenAI

client = OpenAI(
    api_key=key,
)

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant who cheerfully answers to the user."},
        {"role": "user", "content": "I'm so hungry."},
        {"role": "assistant", "content": "No wonder you are hungry. It's already past one."},
        {"role": "user", "content": "Oh my goodness. Already? What should we eat for lunch?"},
    ],
    model="gpt-4.1-mini",
)

In [None]:
chat_completion

ChatCompletion(id='chatcmpl-CQpa654SWAV38lGvFr6DCIOQgfoRc', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='How about something quick and tasty? Maybe a sandwich with your favorite fillings, a fresh salad, or some pasta? If you’re in the mood to cook, I can help you find a simple recipe too! What are you craving?', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1760510842, model='gpt-4.1-mini-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_c064fdde7c', usage=CompletionUsage(completion_tokens=47, prompt_tokens=60, total_tokens=107, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))

In [None]:
chat_completion.choices[0].message.content

'How about something quick and tasty? Maybe a sandwich with your favorite fillings, a fresh salad, or some pasta? If you’re in the mood to cook, I can help you find a simple recipe too! What are you craving?'
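
The Chat Completions endpoint used here keeps no memory between calls: the model only sees the `messages` list you send. To continue the conversation, you append the assistant's reply and the next user message, and you may need to drop the oldest turns once the history grows. The following is a minimal sketch. `extend_history` and `trim_history` are hypothetical helpers, and the budget counts messages rather than tokens to keep the example self-contained.

```python
def extend_history(messages, assistant_reply, next_user_message):
    """Return a new message list with the reply and the next user turn appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]

def trim_history(messages, max_messages):
    """Keep system messages plus the most recent non-system messages.

    Assumption: the budget is a message count. A token budget computed with
    tiktoken would be closer to how context limits actually work.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    keep = max(max_messages - len(system), 0)
    return system + (rest[-keep:] if keep else [])

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'm so hungry."},
]
history = extend_history(history, "How about a sandwich?", "Sounds good. Any recipe?")
history = trim_history(history, max_messages=3)
print([m["role"] for m in history])
```

The trimmed list can then be passed as `messages` in the next `client.chat.completions.create` call.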

# Chat models are versatile, though: they can also generate convincing text.
- For example, posts that refute false claims: https://arxiv.org/pdf/2310.09223.pdf

In [None]:
def generate_text(system, user, model="gpt-4.1-mini", temperature=0, max_tokens=256, key=key):
    """Send one system prompt and one user message; return the reply text."""
    client = OpenAI(
        api_key=key,
    )

    chat_completion = client.chat.completions.create(
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        model=model,
        temperature=temperature,  # 0 keeps the output as repeatable as possible
        max_tokens=max_tokens,  # replies longer than this are cut off
    )
    return chat_completion.choices[0].message.content

In [None]:
# Claim debunked: https://toolbox.google.com/factcheck/explorer/search/vaccinated%20individuals%20emit%20Bluetooth%20signals;hl=en

system = 'Generate a social media post so that it successfully refutes the claim provided.'
user = 'Claim: Vaccinated people emit Bluetooth signals.'
output = generate_text(system, user)
print(output)

🚨 Myth Buster Alert! 🚨

There’s a bizarre claim going around that vaccinated people emit Bluetooth signals. Let’s set the record straight: **Vaccines do NOT contain any technology that can emit Bluetooth or any other wireless signals.** 

Vaccines are made to protect your health by training your immune system — they don’t have microchips, trackers, or transmitters. Bluetooth signals come from electronic devices like phones and speakers, not from your body.

Stay informed, trust science, and don’t fall for misinformation! 💉✨ #VaccinesWork #ScienceMatters #MythBusting


# Annotation capabilities have been explored as well.
- Chat models might not perform as well as carefully fine-tuned models: https://www.sciencedirect.com/science/article/pii/S156625352300177X
- Performance depends heavily on which model you use and how you write the prompt: https://arxiv.org/pdf/2310.09223.pdf

In [None]:
# Prompt from Kocon et al., 2023, example from Choi & Ferrara, 2024

system = 'Describe the sentiment of the given text. Choose your answer from provided list and map your answer with following negative: 0, neutral: 1, positive: 2 and return an integer.'
user = """Text: omg my dad got vaccinated yesterday and I just connected him to bluetooth
Possible sentiment: negative, neutral, positive"""
output = generate_text(system, user)
print(output)

The sentiment of the given text is positive. The speaker expresses excitement about their dad getting vaccinated and connects it to a humorous or light-hearted action of connecting him to Bluetooth. Therefore, the answer is 2.
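
Note that the model returned an explanation rather than the bare integer the prompt asked for. For annotation at scale you need a label you can parse. One possible approach, sketched here with a hypothetical `extract_label` helper, is to take the last standalone 0, 1, or 2 in the reply and return `None` when there is none, so unparseable replies can be reviewed by hand.

```python
import re

def extract_label(reply, valid_labels=(0, 1, 2)):
    """Return the last standalone integer label in `reply`, or None if absent."""
    # \b\d+\b matches whole numbers only, so the "2" inside "2024" is not a label.
    numbers = [int(n) for n in re.findall(r"\b\d+\b", reply)]
    candidates = [n for n in numbers if n in valid_labels]
    return candidates[-1] if candidates else None

reply = (
    "The sentiment of the given text is positive. "
    "Therefore, the answer is 2."
)
print(extract_label(reply))
```

Taking the last match is a heuristic: it assumes the final answer comes at the end of the reply, which may not hold for every prompt.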


In [None]:
# Stance detection prompt, slightly edited from Choi & Ferrara, 2025

system = """Which of the following best describe the relationship between TWEET and CLAIM: agreement, disagreement, or neutral? Explain why and give me the final answer."""
user = """TWEET: It's so strange to see people believing vaccination makes you connect to your phone #VaccineFacts
CLAIM: Vaccinated people emit Bluetooth signals.
ANSWER: Let's think step by step."""
output = generate_text(system, user)
print(output)

To analyze the relationship between the TWEET and the CLAIM, we can break it down as follows:

1. **Understanding the TWEET**: The TWEET expresses skepticism or disbelief regarding the idea that vaccination has any connection to technology, specifically that it allows people to connect to their phones. The use of the phrase "It's so strange to see people believing" indicates that the author finds this belief to be irrational or unfounded.

2. **Understanding the CLAIM**: The CLAIM states that vaccinated people emit Bluetooth signals, which implies a direct connection between vaccination and the ability to connect to devices wirelessly. This is a specific assertion that suggests a technological effect of vaccination.

3. **Analyzing the Relationship**: The TWEET and the CLAIM are fundamentally at odds. The TWEET dismisses the idea that vaccination has any technological implications, while the CLAIM asserts that there is a direct effect of vaccination on technology (i.e., emitting Blueto
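
The reply above stops mid-word, most likely because it reached the `max_tokens=256` limit set in `generate_text`. The Chat Completions response reports this through `finish_reason`, which is `'length'` when the output was cut off by the token limit and `'stop'` when the model finished on its own (as in the earlier `chat_completion` output). A small sketch of such a check follows. `was_truncated` is a hypothetical helper, and the stand-in object only mimics the response shape so the example runs without an API call.

```python
from types import SimpleNamespace

def was_truncated(chat_completion):
    """Return True if the first choice was cut off by the token limit."""
    return chat_completion.choices[0].finish_reason == "length"

# Stand-in for a real response, with only the field this check reads.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(finish_reason="length")]
)
print(was_truncated(fake_response))
```

To use this with the real client, `generate_text` would have to return the full `chat_completion` object (or its `finish_reason`) instead of only the text. Raising `max_tokens` is the usual remedy when replies are cut off.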

In [None]:
# Assigning a score

system = """From a score of 1 to 5, how much are TWEET and CLAIM related to one another? Explain why and give me the final answer."""
user = """TWEET: It's so strange to see people believing vaccination makes you connect to your phone #VaccineFacts
CLAIM: Vaccinated people emit Bluetooth signals.
ANSWER: Let's think step by step."""
output = generate_text(system, user)
print(output)

To evaluate the relationship between TWEET and CLAIM, we can analyze the content and context of both statements.

1. **Content Analysis**:
   - The TWEET expresses skepticism about a belief that vaccination connects individuals to their phones, using the hashtag #VaccineFacts. It implies that the idea is strange or unfounded.
   - The CLAIM states that vaccinated people emit Bluetooth signals, which is a specific assertion that aligns with the skepticism expressed in the TWEET.

2. **Contextual Relationship**:
   - Both the TWEET and the CLAIM are related to the topic of vaccination and the misconceptions surrounding it. The TWEET critiques a belief that is likely based on misinformation, while the CLAIM presents a specific example of such misinformation.

3. **Degree of Relation**:
   - The TWEET directly addresses the absurdity of the CLAIM, indicating that they are closely related. The TWEET serves as a commentary on the type of claims that circulate regarding vaccinations, includin

# You can be creative,
- but remember not to rely on these models too much, because they tend to err.
- For annotation and classification tasks, I recommend carefully comparing the results against other methods.
- For generative tasks, at least read the results yourself.
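
One simple way to compare LLM annotations against another method, such as human labels, is to compute raw agreement and Cohen's kappa. This is a sketch with made-up labels. `percent_agreement` and `cohen_kappa` are hypothetical helpers written in plain Python; scikit-learn's `cohen_kappa_score` computes the same statistic.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which the two label lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement, given each annotator's own label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Made-up labels for illustration (0 = negative, 1 = neutral, 2 = positive).
human_labels = [2, 0, 1, 2, 0, 1]
llm_labels = [2, 0, 1, 1, 0, 2]
print(percent_agreement(human_labels, llm_labels), cohen_kappa(human_labels, llm_labels))
```

Kappa is lower than raw agreement because some matches would occur by chance alone, which is why it is the more conservative number to report.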

# On Your Own
- Try prompt-engineering a task of your own!