# Connections

We are going to solve the [NYTimes Connections word game](https://www.nytimes.com/games/connections).
- 📚 Weave documentation: https://wandb.me/weave
- 🤝 Getting Started: https://wandb.github.io/weave/quickstart
- 😎 This code: https://github.com/wandb/connections
- 💜 Discord Channel: #capelle_experimentation

In [None]:
!pip install -qqq weave openai

In [None]:
# @title Project to log to
share_results_in_public_project = False # @param {type:"boolean"}
project = "connections"
if share_results_in_public_project:
    project = "llm-finetuning-course/connections"

In [None]:
import weave

print(f"You are logging to: {project}")

weave.init(project)

## Load Data

We have created a dataset with all previous Connections puzzles.

In [None]:
!wget https://raw.githubusercontent.com/wandb/connections/main/connections_prompts.jsonl

In [None]:
import json
import weave


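def load_jsonl(file_path: str) -> list:
    # read one JSON object per line, skipping blank lines, and close the file afterwards
    with open(file_path, "r") as f:
        return [json.loads(line) for line in f if line.strip()]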

# ds = weave.ref('connections_prompts').get()
ds = load_jsonl("connections_prompts.jsonl")

In [None]:
print(ds[0]["solution"])

In [None]:
print(ds[0]["words"])
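
Before prompting a model it can help to sanity-check a record. The sketch below assumes each record has a `words` list and a `solution` dict with a `groups` list whose items carry `words` (the shape used later in this notebook). The `reason` key, the `validate_puzzle` helper, and the toy record are hypothetical and only for illustration.

```python
def validate_puzzle(record):
    """Return a list of problems found in one puzzle record (empty if none)."""
    problems = []
    words = list(record.get("words", []))
    groups = record.get("solution", {}).get("groups", [])
    if len(words) != 16:
        problems.append(f"expected 16 words, got {len(words)}")
    if len(groups) != 4:
        problems.append(f"expected 4 groups, got {len(groups)}")
    # Every puzzle word should appear in exactly one solution group
    grouped = [w for g in groups for w in g.get("words", [])]
    if sorted(grouped) != sorted(words):
        problems.append("solution words do not match the puzzle words")
    return problems


# Hypothetical record built from placeholder words, not real puzzle data
toy_groups = [
    {"reason": f"group {letter}", "words": [f"{letter}{i}" for i in range(1, 5)]}
    for letter in "ABCD"
]
toy_record = {
    "words": [w for g in toy_groups for w in g["words"]],
    "solution": {"groups": toy_groups},
}
print(validate_puzzle(toy_record))
```

On the real data this could be called as `validate_puzzle(ds[0])`, provided the records follow the assumed shape.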

## Naive approach

In [None]:
import os
import openai

# put your OpenAI key in the panel to the left 🗝️
from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

# OPENAI_API_KEY = "sk-..."  # put your key here, the one you got from the credits 😎


client = openai.Client(api_key=OPENAI_API_KEY)

We are using the `json_object` response format to get a structured answer. This mode requires the word "JSON" to appear somewhere in the messages. We could use the `instructor` library here if we wanted more tightly controlled structured output.

In [None]:
@weave.op()
def call_openai(messages, model="gpt-4o", max_tokens=256, temperature=0.7):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        max_tokens=max_tokens,
        temperature=temperature,
        response_format={ "type": "json_object" }  # <- quick win to get a structured answer
        )
    extracted = response.choices[0].message.content
    if extracted is None:
        raise ValueError("No response from model")
    return extracted


Let's try the function call:

In [None]:
# the json_object response format requires the prompt to mention JSON
call_openai([{"role": "user", "content": "What is the capital of France? Answer in JSON."}])

Let's parse the output and get a structured answer using `json`:

In [None]:
import json

@weave.op()
def generate_solution(messages, model="gpt-4o", **kwargs):

    res = call_openai(messages, model=model, **kwargs)
    try:
        generation = json.loads(res)
    except json.JSONDecodeError:
        # fall back to an empty answer when the model returns invalid JSON
        generation = {}
    return generation
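
Valid JSON is not necessarily a valid answer: the model may return fewer groups, repeat a word, or invent one. A minimal sketch of a stricter parser follows, assuming the answer format requested in the prompt (`{"groups": [{"reason": ..., "words": [...]}]}`). The `parse_groups` helper is hypothetical and not part of Weave or the OpenAI client.

```python
import json


def parse_groups(raw, expected_words):
    """Return the parsed groups, or None if the answer is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    groups = data.get("groups") if isinstance(data, dict) else None
    if not isinstance(groups, list) or len(groups) != 4:
        return None
    used = []
    for group in groups:
        words = group.get("words") if isinstance(group, dict) else None
        if not isinstance(words, list) or len(words) != 4:
            return None
        if not all(isinstance(w, str) for w in words):
            return None
        used.extend(words)
    # Each puzzle word must be used exactly once across the four groups
    if sorted(used) != sorted(expected_words):
        return None
    return groups


# Placeholder words, not real puzzle data
toy_words = [f"{letter}{i}" for letter in "ABCD" for i in range(1, 5)]
toy_answer = json.dumps(
    {"groups": [{"reason": "r", "words": toy_words[i:i + 4]} for i in range(0, 16, 4)]}
)
print(parse_groups(toy_answer, toy_words) is not None)
print(parse_groups('{"groups": []}', toy_words))
```

Returning `None` for any malformed answer keeps the caller's logic simple: it can retry or score the puzzle as unsolved.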

## Using the weave.Model class

Let's organize our first model in a class so that we can keep everything versioned and organized. [weave.Model](https://wandb.github.io/weave/guides/core-types/models) is a subclass of Pydantic's `BaseModel` with some extra conventions, such as the `predict` method.

In [None]:
class OneShotModel(weave.Model):
    system_prompt: str
    user_prompt: str
    temperature: float = 0.7
    max_tokens: int = 256
    model: str = "gpt-4o"
    
    @weave.op()
    def predict(self, words):
        messages = [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": self.user_prompt + str(list(words))}
        ]
        return generate_solution(messages, model=self.model, temperature=self.temperature, max_tokens=self.max_tokens)

Let's define some starting prompts to use our model

In [None]:
# OpenAI chat models accept a system prompt that steers the conversation
system_prompt = (
    "You are an expert puzzle solver. You understand literature and you are well versed on word play. "
    "I want you to solve a daily word puzzle that finds commonalities between words.\n"
    )

# a naive prompt to solve the puzzle at once
user_prompt = (
    "Here is the puzzle:\n"
    "- There are 16 words, which form 4 groups of 4 words. Each group has some common theme that links the words.\n"
    "- You must use each of the 16 words, and use each word only once.\n"
    "- Each group of 4 words are linked together in some way. \n"
    "The connection between words can be simple.\n"
    """- An example of a simple connection would be {"reason":'types of fish', "words":["Bass", "Flounder", "Salmon", "Trout"]}. \n"""
    """- Categories can also be more complex, and require abstract or lateral thinking. An example of this type of connection would be {"reason": 'things that start with FIRE', "words": ['Ant', 'Drill', 'Island', 'Opal']}\n"""
    """The results should be in JSON format as follows: {"groups": [{"reason":"reason why words are grouped", "words":["word1", "word2", "word3", "word4"]}, ...]}\n"""
    "Provide a full solution to the puzzle, it should be 4 groups of 4 words.\n"
    "Here are the words for today’s puzzle:\n")

In [None]:
model = OneShotModel(system_prompt=system_prompt, user_prompt=user_prompt)

In [None]:
words = list(ds[0]["words"])
words

In [None]:
output = model.predict(words=words)
output

In [None]:
ds[0]["solution"]

This seems fine. Let's create a function to compare both results:

In [None]:
@weave.op()
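def check_solution(solution, model_output):
    "Count how many predicted groups exactly match a group in the solution"
    solution_set = {frozenset(group["words"]) for group in solution["groups"]}
    # generate_solution returns {} when parsing fails, so default to no groups
    predicted_groups = (model_output or {}).get("groups", [])
    model_output_set = {frozenset(group.get("words", [])) for group in predicted_groups}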
    
    accuracy = len(solution_set.intersection(model_output_set))
    
    return {"match": accuracy == 4, "accuracy": accuracy}

In [None]:
check_solution(ds[0]["solution"], output)
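
To see how the set comparison behaves, here is a self-contained sketch of the same scoring logic without the Weave decorator. The `score_groups` name and the toy groups are made up for illustration. Because each group becomes a `frozenset`, word order inside a group and group order in the answer do not matter, but a single misplaced word makes the whole group count as wrong.

```python
def score_groups(solution, model_output):
    """Count predicted groups that exactly match a solution group."""
    solution_set = {frozenset(g["words"]) for g in solution["groups"]}
    output_set = {frozenset(g["words"]) for g in model_output.get("groups", [])}
    correct = len(solution_set & output_set)
    return {"match": correct == 4, "accuracy": correct}


# Toy data for illustration only
toy_solution = {"groups": [
    {"words": ["Bass", "Flounder", "Salmon", "Trout"]},
    {"words": ["Ant", "Drill", "Island", "Opal"]},
    {"words": ["Red", "Blue", "Green", "Pink"]},
    {"words": ["Oak", "Elm", "Pine", "Fir"]},
]}
toy_prediction = {"groups": [
    {"words": ["Trout", "Salmon", "Bass", "Flounder"]},  # same group, shuffled
    {"words": ["Opal", "Ant", "Island", "Drill"]},       # same group, shuffled
    {"words": ["Red", "Blue", "Green", "Oak"]},          # one word swapped
    {"words": ["Pink", "Elm", "Pine", "Fir"]},           # one word swapped
]}

print(score_groups(toy_solution, toy_prediction))
# prints {'match': False, 'accuracy': 2}
```

Note that `accuracy` here is a count of correct groups (0 to 4), not a fraction.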

## Running an Evaluation

We can automate the process of testing our model by running it on all puzzles and checking the accuracy of the solutions.

In [None]:
NUM_TEST_SAMPLES = 20 # the last 20 puzzles

In [None]:
weave_eval = weave.Evaluation(dataset=ds[-NUM_TEST_SAMPLES:], scorers=[check_solution])
await weave_eval.evaluate(model)
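
Conceptually, the evaluation runs the model on every example, scores each output, and aggregates the scores. The plain-Python loop below is a rough approximation of that idea, not Weave's implementation. `run_evaluation`, `make_example`, `stub_predict`, and the summary keys are hypothetical names, and the stub stands in for a real model call.

```python
def count_correct_groups(solution, output):
    """Count predicted groups that exactly match a solution group."""
    solved = {frozenset(g["words"]) for g in solution["groups"]}
    predicted = {frozenset(g["words"]) for g in output.get("groups", [])}
    return len(solved & predicted)


def run_evaluation(examples, predict):
    """Score every example and aggregate the results."""
    scores = [
        count_correct_groups(ex["solution"], predict(ex["words"]))
        for ex in examples
    ]
    if not scores:
        return {"solved_fraction": 0.0, "mean_correct_groups": 0.0}
    return {
        "solved_fraction": sum(s == 4 for s in scores) / len(scores),
        "mean_correct_groups": sum(scores) / len(scores),
    }


def make_example(prefix):
    """Build a placeholder puzzle (not real data) with 4 groups of 4 words."""
    groups = [
        {"reason": f"{prefix}{g}", "words": [f"{prefix}{g}{i}" for i in range(4)]}
        for g in "abcd"
    ]
    words = [w for grp in groups for w in grp["words"]]
    return {"words": words, "solution": {"groups": groups}}


examples = [make_example("x"), make_example("y")]
# Stub model: knows the answer to the first puzzle and fails on the second
known = {tuple(examples[0]["words"]): examples[0]["solution"]}


def stub_predict(words):
    return known.get(tuple(words), {})


print(run_evaluation(examples, stub_predict))
# prints {'solved_fraction': 0.5, 'mean_correct_groups': 2.0}
```

Reporting both numbers is useful because a model can get many groups right while rarely solving a whole puzzle.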

## Now it's your turn to improve this solution!