# Building a Chess Move Solver with Hyperbrowser and GPT-4o

In this cookbook, we'll build a smart chess puzzle solver that can analyze a chess position and recommend the best next move. This approach combines:

- **Hyperbrowser** for capturing screenshots of chess positions from websites
- **OpenAI's GPT-4o model** for analyzing the position and determining the best move
- **Tool-calling** to create an agent that can work with visual chess data

By the end of this cookbook, you'll have a reusable agent that can solve chess puzzles from websites like Lichess!

## Prerequisites

Before starting, make sure you have:

1. A Hyperbrowser API key (sign up at [hyperbrowser.ai](https://hyperbrowser.ai) if you don't have one, it's free)
2. An OpenAI API key
3. Python 3.9+ installed

Both API keys should be stored in a `.env` file in the same directory as this notebook with the following format:

```
HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
OPENAI_API_KEY=your_openai_key_here
```

## Step 1: Set up imports and load environment variables

In [1]:
import asyncio
import json
import os

from dotenv import load_dotenv
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.tools import WebsiteScrapeTool
from openai import AsyncOpenAI
from openai.types.chat import (
    ChatCompletionMessageParam,
    ChatCompletionMessageToolCall,
    ChatCompletionToolMessageParam,
)

load_dotenv()

True

## Step 2: Initialize clients

In [2]:
hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
oai = AsyncOpenAI()

## Step 3: Create a custom tool for taking screenshots of chess positions

We'll create a custom tool that can take screenshots of web pages. This is particularly useful for capturing chess positions from websites like Lichess. The tool follows the OpenAI tool definition format and will be used by our agent to visually analyze chess positions.

In [18]:
from hyperbrowser.models.crawl import StartCrawlJobParams
from hyperbrowser.tools.openai import ChatCompletionToolParam

class WebsiteScreenshotTool:
    screenshot_tool_definition:ChatCompletionToolParam ={
        "type": "function",
        "function": {
            "name": "scrape_webpage",
            "description": "Scrape content from a webpage and return the content in screenshot format",
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {
                        "type": "string",
                        "description": "The URL of the website to scrape",
                    },
                    "scrape_options": {
                        "type": "object",
                        "properties": {
                            "format": {
                                "type": "string",
                                "description": "The format of the content to scrape",
                                "enum": ["screenshot"],
                            },
                        },
                        "required": ["format"],
                        "additionalProperties": False,
                    },
                },
                "required": ["url", "scrape_options"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    }

    @staticmethod
    async def async_runnable(hb: AsyncHyperbrowser, params: dict) -> str:
        resp = await hb.crawl.start_and_wait(params=StartCrawlJobParams(**params))
        print(resp)
        screenshot:str = ""
        if resp.data:
            for page in resp.data:
                if page.screenshot:
                    screenshot =f"data:image/png;base64,${page.screenshot}"
        return screenshot


## Step 4: Create helper functions for tool handling

Next, we'll define a function to handle tool calls from the LLM. This function will process the screenshot tool calls and return the results to the agent.

In [20]:
async def handle_tool_call(
    tc: ChatCompletionMessageToolCall,
) -> ChatCompletionToolMessageParam:
    print(f"Handling tool call: {tc.function.name}")

    try:
        if (
            tc.function.name != WebsiteScreenshotTool.screenshot_tool_definition["function"]["name"]
        ):
            raise ValueError(f"Tool not found: {tc.function.name}")

        args = json.loads(tc.function.arguments)
        content = await WebsiteScrapeTool.async_runnable(hb=hb, params=args)

        return {"role": "tool", "tool_call_id": tc.id, "content": content}

    except Exception as e:
        err_msg = f"Error handling tool call: {e}"
        print(err_msg)
        return {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": err_msg,
            "is_error": True, #type: ignore
        }

## Step 5: Implement the agent loop

Now we'll create the main agent loop that orchestrates the conversation between the user, the LLM, and the tools. This function:

1. Takes a list of messages (including system prompt and user query)
2. Sends them to the OpenAI API
3. Processes any tool calls that the LLM makes
4. Continues the conversation until the LLM provides a final answer

This is the core of our chess-solving agent's functionality.

In [21]:
async def agent_loop(messages: list[ChatCompletionMessageParam]) -> str:
    while True:
        response = await oai.chat.completions.create(
            messages=messages,
            model="gpt-4o",
            tools=[
                WebsiteScreenshotTool.screenshot_tool_definition,
            ],
            max_completion_tokens=8000,
        )

        choice = response.choices[0]

        # Append response to messages
        messages.append(choice.message) #type: ignore

        # Handle tool calls
        if choice.finish_reason == "tool_calls" and choice.message.tool_calls is not None:
            tool_result_messages = await asyncio.gather(
                *[handle_tool_call(tc) for tc in choice.message.tool_calls]
            )
            messages.extend(tool_result_messages)

        elif choice.finish_reason == "stop" and choice.message.content is not None:
            return choice.message.content

        else:
            print(choice)
            raise ValueError(f"Unhandled finish reason: {choice.finish_reason}")

## Step 6: Design the system prompt

The system prompt is crucial for guiding the LLM's behavior. Our prompt establishes the LLM as a chess expert and provides instructions on how to analyze chess positions and report the best moves.

In [5]:
SYSTEM_PROMPT = """
You are an expert chess solver. You have access to a 'scrape_webpage' tool which can be used to take a screenshot of the current position. 

This is the link to a chess game {chess_game_url}. You are given a position and you need to find the next move.
The page contains the current position and tells you the color of the piece to move, usually listed as "Find the best move for white" or "Find the best move for black"."

You are required to response with 
1. The best piece to move (one between a pawn, knight, bishop, rook, queen, or king)
2. the current position of the piece to move (usually listed as "a4" or "h8")
3. the next position of the piece to move (usually listed as "a5" or "h7")
""".strip()

## Step 7: Create a factory function for generating chess-solving agents

Now we'll create a factory function that generates a specialized chess-solving agent. This function:

1. Takes a chess game URL as input
2. Formats the system prompt with this URL
3. Returns a function that can analyze and solve chess positions

This approach makes our solution reusable for different chess puzzles from various websites.

In [22]:
from typing import Coroutine, Any,Callable


def make_support_agent(link_to_chess_game: str)->Callable[..., Coroutine[Any, Any, str]]:
    # Popular documentation providers like Gitbook, Mintlify etc automatically generate a llms.txt file
    # for documentation sites hosted on their platforms.
    if not( link_to_chess_game.startswith("http://") or link_to_chess_game.startswith("https://")):
        link_to_chess_game = f"https://{link_to_chess_game}"

    sysprompt = SYSTEM_PROMPT.format(
        chess_game_url=link_to_chess_game,
    )

    async def qna(question: str) -> str:
        return await agent_loop([
            {"role": "system", "content": sysprompt},
            {"role": "user", "content": question},
        ])

    return qna

## Step 8: Test the agent with a real chess puzzle

Let's test our agent by creating an instance for a Lichess chess puzzle and asking it to find the best move. This will demonstrate the full workflow:

1. The agent receives a question about the best move for a chess position
2. It uses the `scrape_webpage` tool to take a screenshot of the position
3. It analyzes the position and determines the best move
4. It returns the answer in the specified format

You'll see the tool calls being made in real-time as the agent works through the puzzle.

In [23]:
link_to_chess_game = "https://lichess.org/training/ntE6Z"

question = "What is the best move for white?"

agent = make_support_agent(link_to_chess_game)

response = await agent(question)

print(response)

Handling tool call: scrape_webpage
1. **Best Piece to Move**: Queen  
2. **Current Position**: a4  
3. **Next Position**: e8

You should move the Queen from a4 to e8.


## Conclusion

In this cookbook, we built a powerful chess puzzle solver using Hyperbrowser and OpenAI. This agent can:

1. Access and capture screenshots of chess positions from websites
2. Analyze the visual representation of a chess board
3. Determine the best next move based on the current position
4. Provide a clear, structured response with the piece, current position, and target position

This pattern can be extended to create more sophisticated chess analysis tools or be adapted for other visual puzzle-solving tasks.

### Next Steps

To take this further, you might consider:
- Adding support for multiple chess puzzle platforms
- Implementing move validation to ensure the suggested moves are legal
- Creating a web interface where users can paste chess puzzle links
- Adding explanations for why a particular move is best
- Extending the agent to recommend multiple good moves with pros and cons

Happy chess solving!

## Relevant Links
- [Hyperbrowser](https://hyperbrowser.ai)
- [Lichess Puzzles](https://lichess.org/training)
- [OpenAI Docs](https://platform.openai.com/docs/introduction)