# Building a Playwright Browser Agent

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-playwright/examples/playwright_browser_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This tutorial walks through the LLM tools built on [Playwright](https://playwright.dev/), which let LLMs navigate and scrape content from the Internet.

## Instantiation

In [None]:
%pip install llama-index-tools-playwright llama-index

In [None]:
# Set up an async Playwright browser.
# Only async Playwright tools are offered at the moment, to support more LlamaIndex use cases.

# install playwright
!playwright install

# This import is required only in Jupyter notebooks, since they run their own event loop
import nest_asyncio

nest_asyncio.apply()

# import the tools
from llama_index.tools.playwright.base import PlaywrightToolSpec

# create the tools
browser = await PlaywrightToolSpec.create_async_playwright_browser(headless=True)
playwright_tool = PlaywrightToolSpec.from_async_browser(browser)

## Testing the Playwright tools

### Listing all tools

In [None]:
playwright_tool_list = playwright_tool.to_tool_list()
for tool in playwright_tool_list:
    print(tool.metadata.name)

click
fill
get_current_page
extract_hyperlinks
extract_text
get_elements
navigate_to
navigate_back


### Navigating to the Playwright documentation website

In [None]:
await playwright_tool.navigate_to("https://playwright.dev/python/docs/intro")

# Print the current page URL
print(await playwright_tool.get_current_page())

https://playwright.dev/python/docs/intro


### Extract all hyperlinks

In [None]:
print(await playwright_tool.extract_hyperlinks())

["/python/docs/actionability", "#introduction", "/python/docs/webview2", "/python/docs/dialogs", "/python/docs/api-testing", "/python/docs/navigations", "/python/docs/pom", "https://www.youtube.com/channel/UC46Zj8pDH5tDosqm1gd7WTg", "/python/docs/aria-snapshots", "#", "/python/docs/trace-viewer", "/python/", "/python/docs/handles", "/python/docs/input", "https://pypi.org/project/pytest-playwright/", "/python/docs/locators", "/dotnet/docs/intro", "/python/docs/codegen-intro", "/python/docs/auth", "/python/docs/browser-contexts", "/python/docs/other-locators", "https://www.linkedin.com/company/playwrightweb", "/python/docs/downloads", "https://twitter.com/playwrightweb", "/python/docs/intro", "/python/docs/intro#running-the-example-test", "/python/docs/frames", "/python/docs/release-notes", "#running-the-example-test", "#system-requirements", "/python/docs/evaluating", "/python/docs/writing-tests", "/python/docs/network", "/python/docs/screenshots", "/docs/intro", "/python/docs/videos", 
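
Most of the hrefs above are relative paths or in-page anchors, so they need to be resolved against the current page URL before they can be passed back to `navigate_to`. The following is a minimal sketch, assuming `extract_hyperlinks` returns a JSON-encoded list of strings (as the printed output suggests). The helper name `resolve_links` and the sample input are hypothetical and not part of the tool spec.

```python
import json
from urllib.parse import urljoin


def resolve_links(raw_links: str, base_url: str) -> list[str]:
    """Return sorted, de-duplicated absolute URLs.

    Assumes raw_links is a JSON-encoded list of href strings.
    """
    hrefs = json.loads(raw_links)
    # urljoin handles absolute URLs, root-relative paths, and "#fragment" links
    absolute = {urljoin(base_url, href) for href in hrefs}
    return sorted(absolute)


# Hypothetical sample shaped like the output above
sample = (
    '["/python/docs/actionability", "#introduction", '
    '"/docs/intro", "https://pypi.org/project/pytest-playwright/"]'
)
for url in resolve_links(sample, "https://playwright.dev/python/docs/intro"):
    print(url)
```

In the notebook, the same helper could be called with `await playwright_tool.extract_hyperlinks()` and `await playwright_tool.get_current_page()` as its two arguments, provided the JSON assumption holds.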

### Extract all text

In [None]:
print(await playwright_tool.extract_text())

Installation | Playwright Python Skip to main content Playwright for Python Docs API Python Python Node.js Java .NET Community Search ⌘ K Getting Started Installation Writing tests Generating tests Running and debugging tests Trace viewer Setting up CI Pytest Plugin Reference Getting started - Library Release notes Guides Actions Auto-waiting API testing Assertions Authentication Browsers Chrome extensions Clock Debugging Tests Dialogs Downloads Emulation Evaluating JavaScript Events Extensibility Frames Handles Isolation Locators Mock APIs Navigations Network Other locators Pages Page object models Screenshots Snapshot testing Test generator Trace viewer Videos WebView2 Integrations Supported languages Getting Started Installation On this page Installation Introduction ​ Playwright was created specifically to accommodate the needs of end-to-end testing. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox. Test on Windows, Linux, and macOS, locally 
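
The extracted text is a single long string, so it can help to normalize and clip it before inspecting it or handing it to a model with a limited context window. This is a rough sketch; the helper `clip_text` and the 4000-character default are illustrative assumptions, not part of the Playwright tools.

```python
def clip_text(text: str, max_chars: int = 4000) -> str:
    """Collapse runs of whitespace and clip to at most max_chars characters."""
    flat = " ".join(text.split())
    if len(flat) <= max_chars:
        return flat
    clipped = flat[:max_chars]
    # Cut back to the last space so the result does not end mid-word
    # (this may also drop a final word that happened to fit exactly).
    head, sep, _ = clipped.rpartition(" ")
    return head if sep else clipped


# Hypothetical sample taken from the page text above
sample = "Playwright supports all modern rendering engines"
print(clip_text(sample, max_chars=15))
```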

### Get element
Get the attributes of the element that links to the next page.
You can copy the selector from Google Chrome DevTools.

In [None]:
element = await playwright_tool.get_elements(
    selector="#__docusaurus_skipToContent_fallback > div > div > main > div > div > div.col.docItemCol_VOVn > div > nav > a",
    attributes=["innerText"],
)
print(element)

[{"innerText": "Next\nWriting tests"}]
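
To use the result programmatically rather than just print it, the returned string can be decoded first. This is a small sketch, assuming `get_elements` returns a JSON-encoded list of attribute dictionaries (as the printed output suggests). The helper name `first_attribute` is hypothetical.

```python
import json
from typing import Optional


def first_attribute(raw_elements: str, attribute: str) -> Optional[str]:
    """Return the named attribute of the first matched element, or None."""
    elements = json.loads(raw_elements)
    if not elements:
        return None  # the selector matched nothing
    return elements[0].get(attribute)


# Hypothetical sample shaped like the output above
sample = '[{"innerText": "Next\\nWriting tests"}]'
label = first_attribute(sample, "innerText")
print(label.splitlines())
```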


### Click
Click on the search bar.

In [None]:
await playwright_tool.click(
    selector="#__docusaurus > nav > div.navbar__inner > div.navbar__items.navbar__items--right > div.navbarSearchContainer_Bca1 > button"
)

"Clicked element '#__docusaurus > nav > div.navbar__inner > div.navbar__items.navbar__items--right > div.navbarSearchContainer_Bca1 > button'"

### Fill
Fill in the search bar with "Mouse click".

In [None]:
await playwright_tool.fill(selector="#docsearch-input", value="Mouse click")

"Filled element '#docsearch-input'"

Click on the first result. We should be redirected to the "Mouse click" section of the docs.

In [None]:
await playwright_tool.click(selector="#docsearch-hits0-item-0")
print(await playwright_tool.get_current_page())

https://playwright.dev/python/docs/input#mouse-click


## Using the Playwright tool with an agent
To get started, you will need an [OpenAI API key](https://platform.openai.com/account/api-keys).

In [None]:
# Set your OpenAI API key, if using OpenAI
import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

In [None]:
from llama_index.core.agent import FunctionCallingAgent
from llama_index.llms.openai import OpenAI

playwright_tool_list = playwright_tool.to_tool_list()

agent = FunctionCallingAgent.from_tools(
    playwright_tool_list,
    llm=OpenAI(model="gpt-4o"),
)

In [None]:
print(
    agent.chat(
        "Navigate to https://blog.samaltman.com/productivity, extract the text on this page and return a summary of the article."
    )
)

Sam Altman's blog post on productivity offers a comprehensive guide to enhancing personal efficiency and effectiveness. Here are the key points:

1. **Compound Growth**: Altman emphasizes the importance of small productivity gains compounded over time, likening it to financial growth.

2. **Choosing the Right Work**: The most crucial aspect of productivity is selecting the right tasks. Independent thought and strong personal beliefs are vital.

3. **Delegation and Enjoyment**: Delegate tasks based on others' strengths and interests. Avoid work that doesn't interest you, as it hampers morale and productivity.

4. **Learning and Collaboration**: Embrace the ability to learn quickly and surround yourself with inspiring, positive people.

5. **Prioritization**: Altman uses lists to manage tasks and prioritizes work that builds momentum. He stresses the importance of saying no to non-critical tasks.

6. **Time Management**: Avoid unnecessary meetings and optimize your schedule for productiv

## Using the Playwright tool with an agent workflow

In [None]:
from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import AgentWorkflow

from llama_index.core.agent.workflow import (
    AgentInput,
    AgentOutput,
    ToolCall,
    ToolCallResult,
    AgentStream,
)

In [None]:
llm = OpenAI(model="gpt-4o")

workflow = AgentWorkflow.from_tools_or_functions(
    playwright_tool_list,
    llm=llm,
    system_prompt="You are a helpful assistant that can do browser automation and data extraction",
)

handler = workflow.run(
    user_msg="Navigate to https://blog.samaltman.com/productivity, extract the text on this page and return a summary of the article."
)

async for event in handler.stream_events():
    if isinstance(event, AgentStream):
        print(event.delta, end="", flush=True)
        print(event.response)  # the current full response
        print(event.raw)  # the raw llm api response
        print(event.current_agent_name)  # the current agent name
    elif isinstance(event, AgentInput):
        print(event.input)  # the current input messages
        print(event.current_agent_name)  # the current agent name
    elif isinstance(event, AgentOutput):
        print(event.response)  # the current full response
        print(event.tool_calls)  # the selected tool calls, if any
        print(event.raw)  # the raw llm api response
    elif isinstance(event, ToolCallResult):
        print(event.tool_name)  # the tool name
        print(event.tool_kwargs)  # the tool kwargs
        print(event.tool_output)  # the tool output
    elif isinstance(event, ToolCall):
        print(event.tool_name)  # the tool name
        print(event.tool_kwargs)  # the tool kwargs

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='You are a helpful assistant that can do browser automation and data extraction')]), ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Navigate to https://blog.samaltman.com/productivity, extract the text on this page and return a summary of the article.')])]
Agent

{'id': 'chatcmpl-B0NqOFzbRA2SlTZ290kespt6Ok9Sx', 'choices': [{'delta': {'content': None, 'function_call': None, 'refusal': None, 'role': 'assistant', 'tool_calls': [{'index': 0, 'id': 'call_MDHqmwmafi2Yb6sU3tp6BpLj', 'function': {'arguments': '', 'name': 'navigate_to'}, 'type': 'function'}]}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1739431356, 'model': 'gpt-4o-2024-08-06', 'object': 'chat.completion.chunk', 'service_tier': 'default', 'system_fingerprint': 'fp_523b9b6e5f', 'usage': None}
Agent

{'id': 'chatcmpl-B0NqOFzbRA2SlTZ29