ℹ️ **This is the *practice* version of the notebook**, with solutions and outputs omitted. Use it to complete the tutorial with your own solutions.

A version with the complete solutions and outputs is available at [llms-beyond-chat.ipynb](llms-beyond-chat.ipynb).

---

# LLMs: Beyond Chat

Large Language Models (LLMs), like GPT-4o, are well known as capable chatbots through applications like ChatGPT and Copilot. While chatting with a computer is a magical experience, these models can do so much more than just chat. In this tutorial we'll review some of these capabilities and adopt an approach for working with LLMs that treats them more like an execution engine for software - a VM, if you will - than a chatty persona. We'll use structured input and output, relying on typed schemas, to interface between the textual, hard-to-predict and hard-to-control world of LLMs and the realm of software development.

## Setting Up

### Installing Dependencies

We'll need the following Python packages:
- `python-dotenv` for loading the endpoint configuration from the `.env` file
- `openai` for making calls to Azure OpenAI
- `pydantic` and `instructor` for using typed, structured input and output with our LLM calls
- `pandas`, `matplotlib`, `networkx`, and `jinja2` to help us visualise our output

In [None]:
%pip install python-dotenv openai pydantic instructor pandas matplotlib networkx jinja2
# from IPython.display import clear_output ; clear_output()

### Loading the Azure OpenAI Configuration

To configure your Azure OpenAI GPT-4o endpoint:
1. Create a deployment of GPT-4o in one of the [available regions](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#public-cloud-regions). Use a "Global Standard" deployment to get the best performance. If you prefer to use an earlier model like GPT-4-turbo, you can use that too (but it will be slower and more expensive).
2. Copy the file `dot.env` to `.env`. Edit it and update the values for `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `GPT_4_O_MODEL_NAME` (that's your deployment name).
3. The next cell will load these values and configure the OpenAI SDK to use them.
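
A missing value in `.env` tends to surface later as a confusing authentication or connection error. As a minimal sketch, you could check the required variables before creating the client. The helper name `missing_settings` and the example values are hypothetical and not part of the notebook.

```python
import os
from typing import List, Mapping

# Settings that have no default in the loading cell
# (GPT_4_O_MODEL_NAME falls back to "gpt-4o" there).
REQUIRED_SETTINGS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")

def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required settings that are unset or empty in `env`."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]

# Example with an explicit mapping instead of the real environment.
example_env = {
    "AZURE_OPENAI_ENDPOINT": "https://example.invalid/",  # placeholder value
    "AZURE_OPENAI_API_KEY": "",
}
print(missing_settings(example_env))  # ['AZURE_OPENAI_API_KEY']
```

Calling `missing_settings()` with no argument after `load_dotenv()` would inspect the real environment; an empty list means both values were found.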

In [None]:
from dotenv import load_dotenv
import os

load_dotenv()

AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
GPT_4_O_MODEL_NAME = os.getenv("GPT_4_O_MODEL_NAME", default="gpt-4o")

from openai import AzureOpenAI

aoai = AzureOpenAI(
    api_version="2024-05-01-preview",
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    api_key=AZURE_OPENAI_API_KEY,
)

## Using Structured Input/Output with OpenAI

To use typed, structured input and output with our LLM calls, we will be relying on `pydantic` and `instructor`.

**Pydantic** ( [https://docs.pydantic.dev/](https://docs.pydantic.dev/) ) is a popular package for extending Python's typing system with declarative interfaces.

**Instructor** ( [https://python.useinstructor.com/](https://python.useinstructor.com/) ) patches the LLM SDK with the ability to use a Pydantic model for specifying the JSON schema for the LLM output, and parsing the result into the Pydantic model. We will be using it in all of our examples to interact with our LLM.
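
To make the division of labour concrete, here is a minimal sketch of the Pydantic half on its own, with no LLM call. The `City` model and the JSON strings are made up for illustration; they stand in for a schema you would define and for text a model might return.

```python
from pydantic import BaseModel, ValidationError

class City(BaseModel):  # hypothetical model, for illustration only
    name: str
    population: int

# The JSON schema that describes the expected output.
schema = City.model_json_schema()
print(sorted(schema["properties"]))  # ['name', 'population']

# Well-formed JSON text is parsed into a typed Python object.
city = City.model_validate_json('{"name": "Exampleville", "population": 12345}')
print(city.population + 1)  # 12346

# Text that does not match the schema is rejected rather than passed along.
try:
    City.model_validate_json('{"name": "Exampleville", "population": "many"}')
except ValidationError as err:
    print(f"rejected with {err.error_count()} validation error(s)")
```

The same two steps, publishing a schema and validating against it, are what let the rest of the program treat LLM output as ordinary typed data.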

Let's start by patching our LLM client and defining some helper functions...

In [None]:
import instructor
from pydantic import BaseModel, Field
from typing import List
from enum import Enum
import json
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

# `client` is a patched OpenAI SDK client that allows passing a Pydantic
# model for specifying the schema and parsing the output.
client = instructor.from_openai(aoai)

# Let's define a helper function for calling the LLM. We will use this
# function for all our LLM calls.
def llm(response_model: BaseModel = BaseModel, system: str = None,
        user: str = None, temperature: float = 0.0, max_tokens: int = 1000):
    """
    Helper function for calling the LLM (GPT-4o) with a Pydantic BaseModel,
    a system prompt and/or a user prompt, with temperature and max_tokens.
    """
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    if user:
        messages.append({"role": "user", "content": user})
    result = client.chat.completions.create(
        model=GPT_4_O_MODEL_NAME,
        response_model=response_model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return result

def print_schema(model: BaseModel):
    """
    Print the JSON schema corresponding to a Pydantic model.
    """
    print(json.dumps(model.model_json_schema(), indent=2))

def print_result(result: BaseModel):
    """
    Print the Pydantic model result of an LLM call as JSON.
    """
    print(result.model_dump_json(indent=2))

# Configure Pandas to format dataframes for pretty output inside the notebook
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.max_colwidth', None)
pd.DataFrame._repr_html_ = lambda df: df.style.set_properties(**{'text-align': 'left'})._repr_html_()

def visualize_graph(graph):
    """
    Visualize a graph.
    Expects a graph object with `nodes` and `edges` properties,
    where each node has an `id`, `label`, and `color`
    and each edge has `source`, `target`, `label`, and `color`.
    """
    G = nx.DiGraph()
    for node in graph.nodes:
        G.add_node(node.id, label=node.label, color=node.color)
    for edge in graph.edges:
        G.add_edge(edge.source, edge.target, label=edge.label, color=edge.color)
    pos = nx.planar_layout(G)
    node_colors = [node[1]['color'] for node in G.nodes(data=True)]
    edge_colors = [edge[2]['color'] for edge in G.edges(data=True)]
    labels = {node[0]: node[1]['label'] for node in G.nodes(data=True)}
    plt.figure(figsize=(7, 5))
    nx.draw(G, pos, labels=labels, with_labels=True, node_color=node_colors, edge_color=edge_colors, font_size=8)
    plt.title("Knowledge Graph")
    plt.show()

def print_tree(data, indent=0):
    """
    Pretty-print a JSON object as a tree.
    """
    if isinstance(data, dict):
        for key, value in data.items():
            print('   ' * indent + key.upper() + ':')
            print_tree(value, indent + 1)
    elif isinstance(data, list):
        for item in data:
            print_tree(item, indent)
    else:
        print('   ' * indent + str(data))

## Working with Text

LLMs are master text manipulators. If it's reading or writing text, the best LLMs can do amazing things.

### Translation and Normalization

Reading and writing, understanding and translating text between languages, or language styles, is an easy task with LLMs.

Let's start by reading a sentence in one language, detecting which language it is, and translating the sentence to English.

In [None]:
german_text = "Sprachkenntnisse sind ein wichtiger Bestandteil der Kommunikation."

class TranslatedString(BaseModel):
    input_language: str = Field(
        ...,
        description="The language of the original text, as a 2-letter language code."
    )
    translation: str

print("SCHEMA:")
print_schema(TranslatedString)

translation = llm(
    TranslatedString,
    "Detect the language of the original text and translate it into English.",
    german_text,
)

print("RESULT:")
print_result(translation)

Our LLM can cope with a more complex task. Let's get it to translate the same sentence to multiple languages.

A note on token selection: for many processing tasks we want the LLM to be conservative when selecting tokens, to get the most accurate results. When it comes to writing text, however, it is often better to let it sample from a wider distribution. We can control that by passing a higher `temperature` value when making the call.
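
To build intuition for what the temperature does, the sketch below applies a temperature-scaled softmax to three made-up token scores. It is a simplified illustration of the general idea, not a description of how any particular model or API implements sampling. A temperature of 0 is commonly treated as always picking the top-scoring token, which this sketch does not model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; a higher temperature flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all of the probability sits on the top-scoring token; as the temperature rises, the lower-scoring tokens get a larger share.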

In [None]:
english_text = "Large Language Models are a powerful tool for natural language processing."

class TargetLanguage(str, Enum):
    de = "de"
    fr = "fr"
    it = "it"
    es = "es"
    he = "he"

class Translation(BaseModel):
    language: TargetLanguage = Field(
        ...,
        description="The language of the translated text, as a 2-letter language code."
    )
    translation: str

class Translations(BaseModel):
    translations: List[Translation]

print_schema(Translations)

translations = llm(
    Translations,
    ("Translate the user-provided text into the following languages: " +
     json.dumps([language.value for language in TargetLanguage])),
    english_text,
    temperature=0.7,
)

pd.DataFrame(translations.model_dump()["translations"])

Just as we can translate to different languages, we can also use the same approach to rewrite text in a different style or tone within the same language.

In [None]:
input_text = "Large Language Models are a powerful tool for natural language processing."

class TextStyle(str, Enum):
    formal = "formal"
    informal = "informal"
    casual = "casual"
    academic = "academic"
    professional = "professional"
    business = "business"

# PRACTICE: Using a similar technique as above, get the LLM to format the input_text
# in all the styles listed in TextStyle and output the results as a table.

### Unstructured Data

One very powerful task we can use an LLM for is parsing unstructured information into a data structure. Addresses, for example, are often found in documents in inconsistent formats, and parsing them into a consistent data structure can be very useful for using them in a software system.

In [None]:
address_str = (
    "Sherlock Holmes lives in the United Kingdom. "
    "His residence is at 221B Baker Street, London, NW1 6XE."
)

class AddressInfo(BaseModel):
    first_name: str
    last_name: str
    street: str
    house_number: str
    postal_code: str
    city: str
    state: str
    country: str

# Pass the text as the user message; no system prompt is needed here.
address_info = llm(
    AddressInfo,
    user=address_str,
)

print_result(address_info)

That was easy! We didn't even have to write a prompt, just let the LLM know what data structure we expect. How about a more complex input with multiple addresses? We should be able to get the LLM to process that too.

In [None]:
input_text = (
  "During my recent travels, I had the pleasure of visiting several fascinating locations. "
  "My journey began at the office of Dr. Elena Martinez, 142B Elm Street, San Francisco, "
  "CA 94107, USA. Her office, nestled in the bustling heart of the city, was a hub of "
  "innovation and creativity. Next, I made my way to the historic residence of Mr. Hans "
  "Gruber located at 3. Stock, Goethestrasse 22, 8001 Zürich, Switzerland. The old building, "
  "with its classic Swiss architecture, stood as a testament to the city’s rich cultural "
  "heritage. My adventure continued at the tranquil countryside home of Satoshi Nakamoto, "
  "2-15-5, Sakura-cho, Musashino-shi, Tokyo-to 180-0003, Japan. Their home was surrounded by "
  "beautiful cherry blossoms, creating a picturesque scene straight out of a postcard. In "
  "Europe, I visited the charming villa of Mme. Catherine Dubois, 15 Rue de la République, "
  "69002 Lyon, France. The cobblestone streets and historic buildings of Lyon provided a "
  "perfect backdrop to her elegant home. Finally, my journey concluded at the modern apartment "
  "of Mr. David Johnson, Apt 7B, 34 Queen Street, Toronto, ON M5H 2Y4, Canada. The sleek "
  "design of the apartment building mirrored the contemporary vibe of the city itself."
  )

# PRACTICE: Use the LLM to extract all the addresses from input_text
# and output the results as a table.

The intelligence of LLMs allows them to "understand" complex logical and hierarchical structures. Consider the task of converting some information into a knowledge graph. Turning long-form text into a structure we can work with as part of a system can be of great value, and our LLM can help us achieve that.
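
One practical wrinkle when a model emits a graph: an edge may mention a node id that was never declared. With the `visualize_graph` helper defined earlier, such an edge would make networkx create a node without `label` or `color` attributes, and the drawing code would then fail on the missing keys. A minimal sketch of a pre-flight check follows. The `Node` and `Edge` dataclasses and the `dangling_edges` name are hypothetical stand-ins for whatever graph model you define, and the `color` fields are left out for brevity.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Node:  # hypothetical stand-in for a graph node
    id: int
    label: str

@dataclass
class Edge:  # hypothetical stand-in for a graph edge
    source: int
    target: int
    label: str

def dangling_edges(nodes: List[Node], edges: List[Edge]) -> List[Edge]:
    """Return the edges whose source or target is not a declared node id."""
    known_ids = {node.id for node in nodes}
    return [
        edge for edge in edges
        if edge.source not in known_ids or edge.target not in known_ids
    ]

nodes = [Node(1, "product"), Node(2, "edible"), Node(3, "inedible")]
edges = [
    Edge(1, 2, "has kind"),
    Edge(1, 3, "has kind"),
    Edge(2, 4, "includes"),  # node 4 was never declared
]

for edge in dangling_edges(nodes, edges):
    print(f"dangling edge: {edge.source} -> {edge.target} ({edge.label})")
```

If the check returns anything, you could drop those edges or ask the LLM again before drawing the graph.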

In [None]:
input_text = (
    "Some products are edible and others are inedible. Soap, newspapers, and shoes, for example, "
    "are inedible. Of the products that are edible, some are sweet and others are savory. "
    "Chocolate, candy, and ice cream are sweet, while pizza, burgers, and fries are savory. "
    "Chocolate comes in different forms, such as milk chocolate, dark chocolate, and white chocolate. "
    "The New York Times, The Wall Street Journal, and The Washington Post are newspapers."
)

# PRACTICE: Use the LLM to format the information in input_text as a knowledge graph.
# You can visualize the graph using the `visualize_graph` function - it expects a graph object
# with `nodes` and `edges` properties, where each node has an `id`, `label`, and `color`
# and each edge has `source`, `target`, `label`, and `color`.


Some structures are recursive. Consider the task of parsing a linguistic sentence into a grammatical tree structure. This NLP task has kept computational linguists busy for decades, often with limited success. LLMs, however, are quite good at this sort of thing. Let's try to get the LLM to parse a simple sentence into a simplified tree grammar of English.

In [None]:
input_str = "the quick brown fox jumps over the lazy dog"

# PRACTICE: Use the LLM to parse input_str into a simple grammar tree,
# made of Verb Phrases, Noun Phrases, Prepositional Phrases, etc.
# If you visualize the tree using the `print_tree` function, it might
# look something like this:
# NOUN:
#    DET:
#       the
#    ADJ:
#       quick
#       brown
#    NOUN:
#       fox
# VERB:
#    jumps
# PREP:
#    PREP:
#       over
#    NOUN:
#       DET:
#          the
#       ADJ:
#          lazy
#       NOUN:
#          dog


💡 What other cases do you know where parsing unstructured information from text into data structures is a useful application? Could you automate any process that is currently performed manually by giving the task to an LLM? ... Some models (like GPT-4o) can also read images as input ... how about using a photo or scan as the input and parsing it into structured data?

## Synthetic Data Generation

They used to say that "data is the new oil". That's valuable! What if we found an endless supply of data to work with? LLMs are great at generating new texts and pieces of information. That can be very useful in many data science and ML projects, as we can use the LLM to generate synthetic data for us.

Let's try using the LLM to generate some test data for exercising a sentiment analysis system.

In [None]:
class SyntheticSentiment(BaseModel):
    sentiment: str = Field(..., description="A review about food.")
    rating: int

sentiment = llm(
    SyntheticSentiment,
    "Generate a food review with a sentiment drawn from a spectrum of sentiments, with a rating between 1 and 5.",
    temperature=0.5,
)

print_result(sentiment)

That's easy. Now let's generate multiple examples.

In [None]:
class Rating(str, Enum):
    poor = "*"
    average = "**"
    good = "***"
    great = "****"
    outstanding = "*****"

n = 10

# PRACTICE: Use the LLM to generate `n` synthetic food reviews with ratings
# within the spectrum of sentiments, and ratings between 1 and 5.
# Output the results as a table, sorted by rating.

💡 Where can you use an unlimited amount of test data, conforming to strict ranges and structures? What projects or tests can you run if collecting a data set is as trivial as calling an LLM?

## Decision Making

We've looked at LLMs reading and, in a way, "understanding" information, and rewriting it in useful formats. But the best LLMs also exhibit limited, but nevertheless impressive, reasoning and decision-making capabilities. Let's see how we can exploit them.

### Sentiment Analysis

Sentiment analysis, passing a judgement on the tone of a linguistic statement, is a common task used in many systems, especially user-facing ones. Without any additional training, our LLM turns out to be quite good at making these judgements. It can even report its own confidence level.

In [None]:
example_texts = [
    "I am very happy with the service provided by the company.",
    "The food was terrible and the service was slow.",
    "The movie was okay.",
    "The weather is perfect for a day at the beach.",
    "I am mostly satisfied with the product, but there are a few issues.",
    "The experience was not quite what I had expected.",
    "Butterflies are often colourful, and they can fly.",
]

class Sentiment(str, Enum):
    positive = "positive"
    negative = "negative"
    neutral = "neutral"

class SentimentAnalysis(BaseModel):
    sentiment: Sentiment
    confidence: float

# PRACTICE: Use the LLM to analyze the sentiment of the example_texts
# and output the results as a table, with each text, sentiment, and confidence.

### Classification

Classification is another task that requires judgement. We want to take several pieces of content and assign each of them to a class, or to multiple tags. We want the LLM to take our taxonomy into consideration, but also to make a decision as to which tags best fit each item.

In [None]:
items = [
    {"title": "The Great Gatsby", "subtitle": "A novel by F. Scott Fitzgerald"},
    {"title": "The Theory of Relativity", "subtitle": "A scientific theory by Albert Einstein"},
    {"title": "The Technology and Culture of Ancient Rome", "subtitle": "A cross-disciplinary study of ancient Rome"},
    {"title": "Football on Television", "subtitle": "The technology and cultural impact of televising football games"},
    {"title": "The Philosophy of Taylor Swift", "subtitle": "A philosophical analysis of the music and lyrics of Taylor Swift"},
    {"title": "The Spanish Language in popular music", "subtitle": "A review of the use of the Spanish language in popular music"},
    {"title": "The Impact of Artificial Intelligence on Healthcare", "subtitle": "Exploring the role of AI in revolutionizing healthcare"},
    {"title": "The History of Jazz Music", "subtitle": "Tracing the origins and evolution of jazz music"},
    {"title": "The Rise of E-commerce in the Digital Age", "subtitle": "Examining the growth and impact of online shopping"},
    {"title": "The Art of Photography", "subtitle": "Exploring the creative and technical aspects of photography"},
    {"title": "The Psychology of Decision Making", "subtitle": "Understanding the cognitive processes behind decision making"},
    {"title": "The Role of Women in STEM Fields", "subtitle": "Highlighting the contributions of women in science, technology, engineering, and mathematics"},
    {"title": "The Cultural Significance of Tattoos", "subtitle": "Exploring the history and symbolism of tattoos in different cultures"},
]


class Tag(str, Enum):
    literature = "literature"
    science = "science"
    history = "history"
    technology = "technology"
    art = "art"
    music = "music"
    sports = "sports"
    philosophy = "philosophy"
    language = "language"
    feminism = "feminism"
    health = "health"
    media = "media"
    physics = "physics"
    culture = "culture"
    psychology = "psychology"
    artificial_intelligence = "artificial-intelligence"

# PRACTICE: Use the LLM to generate tags for the items in the list above.
# The tags should be selected from the Tag enum, and should correspond to the
# content of the item. Output the results as a table.

### Clustering

Now that we have classified our items and assigned tags to each of them, we might want to cluster them together, based on their content and the tags assigned. One of the advantages of using an LLM to complete this task (rather than a predictive model) is that the LLM can also explain the choices it made, for example by giving each cluster a title.

In [None]:
num_clusters = 5

# PRACTICE: Use the LLM to cluster the items in the list above into `num_clusters`.
# Each cluster should have a title, to help us explain why the LLM made that grouping
# choice. Output the results as a table with cluster, title, subtitle, and tags.

💡 What other use-cases can you think of where simple decision-making can be handed off to an LLM? Can you think of examples where you'd be OK with letting the LLM make some decisions without human inspection? How will you know if the accuracy is good enough?

## Planning and Tool-Use

Complex systems and behaviours often need to plan multiple steps ahead and interact with the "world". LLMs can often do that quite well. Let's look at a couple of examples.

If our LLM knows of a distinct set of actions it can take, we can get it to plan which actions to perform and in what order, based on the relevant situation.
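
Restricting the plan to an enum means any step outside the known set can be detected mechanically. The sketch below shows that check with plain Python enums and no LLM call. `MiniAction` and `parse_plan` are hypothetical names, and the three actions are a small subset chosen for illustration.

```python
from enum import Enum
from typing import List

class MiniAction(str, Enum):  # hypothetical, shortened action set
    WAKE_UP = "Wake up"
    GET_DRESSED = "Get dressed"
    LEAVE_THE_HOUSE = "Leave the house"

def parse_plan(steps: List[str]) -> List[MiniAction]:
    """Convert raw step strings into enum members, rejecting unknown actions."""
    plan = []
    for step in steps:
        try:
            plan.append(MiniAction(step))  # enum lookup by value
        except ValueError:
            raise ValueError(f"unknown action: {step!r}") from None
    return plan

plan = parse_plan(["Wake up", "Get dressed", "Leave the house"])
print([action.name for action in plan])  # ['WAKE_UP', 'GET_DRESSED', 'LEAVE_THE_HOUSE']

try:
    parse_plan(["Wake up", "Teleport to work"])
except ValueError as err:
    print(err)  # unknown action: 'Teleport to work'
```

Pydantic performs an equivalent check when a field is typed with an enum, so a response containing an action outside the set fails validation instead of reaching your code.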

In [None]:
class Action(str, Enum):
  WAKE_UP = "Wake up"
  TURN_OFF_ALARM = "Turn off the alarm"
  STRETCH = "Stretch"
  GET_OUT_OF_BED = "Get out of bed"
  USE_BATHROOM = "Use the bathroom"
  CHECK_FOR_MOVIE_SNACKS = "Check for movie snacks"
  WASH_FACE_EVENING = "Wash face in the evening"
  CHANGE_INTO_PYJAMAS = "Change into pyjamas"
  SET_ALARM_FOR_NEXT_DAY = "Set alarm for the next day"
  CHECK_PHONE_FOR_MESSAGES = "Check phone for messages"
  TURN_OFF_LIGHTS = "Turn off lights"
  WALK_OR_DRIVE_TO_MOVIE_THEATRE = "Walk or drive to the movie theatre"
  USE_BATHROOM_EVENING = "Use the bathroom in the evening"
  DRY_OFF_WITH_TOWEL = "Dry off with a towel"
  BRUSH_TEETH = "Brush teeth"
  WASH_FACE = "Wash face"
  SHOWER = "Take a shower"
  GRAB_WALLET_PURSE = "Grab wallet or purse"
  MAKE_SURE_PHONE_IS_CHARGED = "Make sure phone is charged"
  CALL_A_TAXI_ARRANGE_TRANSPORTATION = "Call a taxi or arrange transportation"
  MEET_FRIENDS_AT_DESIGNATED_PLACE = "Meet friends at designated place"
  LEAVE_THE_HOUSE = "Leave the house"
  PLAN_TO_BUY_AT_THEATRE = "Plan to buy tickets at the theatre"
  DECIDE_ON_MEETING_PLACE_AND_TIME = "Decide on meeting place and time"
  GET_DRESSED = "Get dressed"
  APPLY_DEODORANT = "Apply deodorant"
  COMB_BRUSH_HAIR = "Comb or brush hair"
  STYLE_HAIR = "Style hair"
  SHAVE = "Shave"
  PUT_ON_CLOTHES = "Put on clothes"
  APPLY_MAKEUP = "Apply makeup"
  PREPARE_BREAKFAST = "Prepare breakfast"
  EAT_BREAKFAST = "Eat breakfast"
  MAKE_COFFEE_TEA = "Make coffee or tea"
  CHECK_PHONE_FOR_MESSAGES_EMAILS = "Check phone for messages or emails"
  PACK_LUNCH = "Pack lunch"
  GATHER_WORK_MATERIALS = "Gather work materials"
  PUT_ON_SHOES = "Put on shoes"
  GRAB_KEYS = "Grab keys"
  PURCHASE_TICKETS_AT_THEATRE = "Purchase tickets at the theatre"
  LOCK_THE_DOOR = "Lock the door"
  FINISH_DINNER = "Finish dinner"
  CLEAN_UP_DINNER_DISHES = "Clean up dinner dishes"
  WATCH_TV_READ_BOOK = "Watch TV or read a book"
  BRUSH_TEETH_EVENING = "Brush teeth in the evening"
  GET_INTO_BED = "Get into bed"
  MEDITATE_RELAX = "Meditate or relax"
  WRITE_IN_JOURNAL = "Write in journal"
  LISTEN_TO_CALMING_MUSIC = "Listen to calming music"
  TURN_OFF_ELECTRONIC_DEVICES = "Turn off electronic devices"
  ADJUST_PILLOWS_AND_BLANKETS = "Adjust pillows and blankets"
  READ_BOOK = "Read a book"
  CLOSE_EYES_TRY_TO_SLEEP = "Close eyes and try to sleep"
  DECIDE_ON_MOVIE_TO_WATCH = "Decide on a movie to watch"
  CHECK_MOVIE_TIMES_ONLINE = "Check movie times online"
  PURCHASE_TICKETS_ONLINE = "Purchase tickets online"
  BUY_SNACKS_AT_CONCESSION_STAND = "Buy snacks at the concession stand"
  FIND_CORRECT_THEATRE_SCREEN = "Find the correct theatre screen"
  FIND_SEATS = "Find seats"
  WATCH_THE_MOVIE = "Watch the movie"
  DISCUSS_MOVIE_WITH_FRIENDS = "Discuss the movie with friends"
  SAY_GOODBYE_TO_FRIENDS = "Say goodbye to friends"
  RETURN_HOME = "Return home"

activities = [
  "Waking up and going to work",
  "Winding down and going to sleep",
  "Going to see a movie with friends",
]

# PRACTICE: Use the LLM to generate a sequence of actions for each of the activities,
# based on the actions available in the enum.

For our final example, let's get our LLM to act as a game-playing engine. The game is simple, tic-tac-toe (that's a simple example, but the best LLMs have been shown capable of playing much more complex games). We can get a lot of behaviour with very little programming, just by asking the LLM and restricting the input and output.

In [None]:
class TicTacToeMove(BaseModel):
  row: int
  col: int

class TicTacToeStrategy(str, Enum):
  optimal = "Optimal. Always choose the best move for winning the game or preventing your opponent from winning."
  random = "Random. Choose your next move at random."
  next_free = "Next Free. Always choose the next free spot counting from the top-left."

class TicTacToeWinner(str, Enum):
  X = "X"
  O = "O"
  Tie = "Tie"
  Ongoing = "Ongoing"

class TicTacToeStatus(BaseModel):
  winner: TicTacToeWinner

class TicTacToeBoard:
  def __init__(self):
    self.board = [[' ' for _ in range(3)] for _ in range(3)]
  
  def dumps_board(self):
    return '\n-----\n'.join(['|'.join(row) for row in self.board]) + '\n'

  def print_board(self):
    print(self.dumps_board())
  
  def make_move(self, role, move: TicTacToeMove):
    self.board[move.row][move.col] = role
  
  def check_status(self) -> TicTacToeWinner:
    pass # PRACTICE: Use the LLM to implement the logic to check the status
    # of the game and return the winner.
    # TIP: consider how many tokens you need for the solution. Even if the output
    # is limited to a schema, the LLM may be tempted to use all the token budget
    # to "think out loud" about the solution, but that might not be necessary for
    # such a simple game.

class TicTacToePlayer:
  def __init__(self, role, strategy):
    self.role = role
    self.strategy = strategy

  def turn(self, board):
    move = None  # PRACTICE: Replace None. Use the LLM to generate the next move based on the
    # player's role and strategy and the state of the game board.

    board.make_move(self.role, move)
    board.print_board()

board = TicTacToeBoard()
player_x = TicTacToePlayer('X', TicTacToeStrategy.optimal)
player_o = TicTacToePlayer('O', TicTacToeStrategy.next_free)

next_player = player_x
while board.check_status() == TicTacToeWinner.Ongoing:
  next_player.turn(board)
  next_player = player_x if next_player == player_o else player_o

print(f"Game Over! Winner: {board.check_status()}")

💡 Can you think of more demanding tasks you could automate by connecting an LLM to the "real world"? How can you structure the interface between the LLM and external "tools"? How will you validate that the LLM's decisions are optimal, or at least ensure that you can recover from sub-optimal decisions?
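
For a game as small as tic-tac-toe, one way to approach the validation question is to compare the LLM's judgement against a deterministic reference. Below is a minimal sketch of such a reference. `reference_status` is a hypothetical helper that returns plain strings matching the values of `TicTacToeWinner`, and it assumes the same 3x3 list-of-lists board with `' '` for empty cells.

```python
from typing import List

def reference_status(board: List[List[str]]) -> str:
    """Return 'X', 'O', 'Tie', or 'Ongoing' for a 3x3 board that uses ' ' for empty cells."""
    lines = []
    lines.extend(board)                                                # rows
    lines.extend([[board[r][c] for r in range(3)] for c in range(3)])  # columns
    lines.append([board[i][i] for i in range(3)])                      # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])                  # anti-diagonal
    for line in lines:
        if line[0] != ' ' and line[0] == line[1] == line[2]:
            return line[0]
    if any(cell == ' ' for row in board for cell in row):
        return "Ongoing"
    return "Tie"

example = [
    ['X', 'O', ' '],
    ['O', 'X', ' '],
    [' ', ' ', 'X'],
]
print(reference_status(example))  # X
```

Running both the LLM-based check and this reference over many boards and counting disagreements gives a rough accuracy figure for the LLM's answers.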

## Conclusion - Your LLM is a VM

In this tutorial we explored simple ways of using an LLM as a software device: one that gets programmed with instructions in human language, but also understands and respects software interfaces. These examples may seem trivial, but the best LLMs, like GPT-4o, can handle a lot more complexity, and perform with quite a lot of "intelligence". Now that you're familiar with this way of thinking about LLMs, and with some techniques for interfacing with an LLM from your software, what will you build?