# QA Browser Automation with Azure AI Agents and Gherkin

This notebook demonstrates how to use Azure AI Agents with Browser Automation for QA testing workflows:

1. Define test scenarios using Gherkin syntax for clear test requirements
2. Use Browser Automation agent to execute tests against live websites
3. Generate automated test reports with Code Interpreter for test result analysis

This approach lets QA engineers write test scenarios in natural language and have AI agents execute them against a live site, which can improve test coverage and reduce manual testing effort.

In [1]:
# Load libraries and environment
import json
import os
from pathlib import Path
from typing import Dict, List, Any

from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.agents.models import BrowserAutomationTool, CodeInterpreterTool, MessageRole
from dotenv import load_dotenv

load_dotenv()
PROJECT_ENDPOINT = os.environ["PROJECT_ENDPOINT"]
MODEL_DEPLOYMENT_NAME = os.environ["MODEL_DEPLOYMENT_NAME"]
BROWSER_CONNECTION_ID = os.environ["AZURE_PLAYWRIGHT_CONNECTION_ID"]

project_client = AIProjectClient(endpoint=PROJECT_ENDPOINT, credential=DefaultAzureCredential())
agents_client = project_client.agents

print("Azure AI client ready")

Azure AI client ready
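
The setup cell reads three environment variables and raises a bare `KeyError` if one is absent. A minimal sketch of a friendlier pre-check follows. The helper name `missing_settings` and the constant `REQUIRED_VARS` are illustrative and not part of the notebook or the Azure SDK.

```python
import os
from typing import Iterable, List, Mapping

# Names taken from the setup cell above
REQUIRED_VARS = (
    "PROJECT_ENDPOINT",
    "MODEL_DEPLOYMENT_NAME",
    "AZURE_PLAYWRIGHT_CONNECTION_ID",
)


def missing_settings(env: Mapping[str, str], required: Iterable[str] = REQUIRED_VARS) -> List[str]:
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Running this before constructing the client reports every missing setting at once instead of stopping at the first one.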


## 1. Create the Gherkin Scenarios

Each scenario spells out the expected behavior. The agent will read these files verbatim and act accordingly.

In [2]:
# Create simple feature files (two real locations plus a fictional one)
gherkin_dir = Path("/workspaces/browser-automation/gherkin_scenarios")
gherkin_dir.mkdir(parents=True, exist_ok=True)  # also create missing parent folders

scenarios = {
    "switzerland.feature": """Feature: Microsoft Careers – Switzerland\n  Scenario: Count jobs in Switzerland\n    Given I am viewing https://careers.microsoft.com/\n    When I filter job results for the location \"Switzerland\"\n    Then I should see a visible job count\n    And I should note a few example job titles that appear\n""",
    "germany.feature": """Feature: Microsoft Careers – Germany\n  Scenario: Count jobs in Germany\n    Given I am viewing https://careers.microsoft.com/\n    When I filter job results for the location \"Germany\"\n    Then I should see a visible job count\n    And I should note whether roles span multiple cities\n""",
    "latveria.feature": """Feature: Microsoft Careers – Latveria\n  Scenario: Count jobs in Latveria\n    Given I am viewing https://careers.microsoft.com/\n    When I filter job results for the location \"Latveria\"\n    Then I should see at least 5 job postings\n    And I should note any warnings if the site shows none\n"""
}

for filename, content in scenarios.items():
    (gherkin_dir / filename).write_text(content, encoding="utf-8")

print("Gherkin files saved to", gherkin_dir)

Gherkin files saved to /workspaces/browser-automation/gherkin_scenarios
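
Because the agent reads each file verbatim, a malformed scenario only surfaces as a confusing run. The sketch below is a loose structural check that could run before any file is sent. It is not a full Gherkin parser, and `check_feature_text` is a hypothetical helper introduced here for illustration.

```python
from typing import List


def check_feature_text(text: str) -> List[str]:
    """Return a list of structural problems; an empty list means the file looks plausible."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    problems: List[str] = []

    # Every file in this notebook has exactly one Feature and one Scenario header
    for header in ("Feature:", "Scenario:"):
        if not any(line.startswith(header) for line in lines):
            problems.append(f"missing '{header}' line")

    # Each scenario above uses at least one Given, When, and Then step
    for keyword in ("Given", "When", "Then"):
        if not any(line.startswith(keyword + " ") for line in lines):
            problems.append(f"missing '{keyword}' step")

    return problems


sample = (
    "Feature: Microsoft Careers – Switzerland\n"
    "  Scenario: Count jobs in Switzerland\n"
    "    Given I am viewing https://careers.microsoft.com/\n"
    '    When I filter job results for the location "Switzerland"\n'
    "    Then I should see a visible job count\n"
)
print(check_feature_text(sample) or "looks fine")
```

The check only inspects keywords, so the agent still interprets the step text itself.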


## 2. Browser Automation Agent

We build a single agent that always uses the Browser Automation tool and answers with a brief natural-language summary.

In [3]:
# Create the browser agent
browser_tool = BrowserAutomationTool(connection_id=BROWSER_CONNECTION_ID)
browser_agent = agents_client.create_agent(
    model=MODEL_DEPLOYMENT_NAME,
    name="qa-browser-runner",
    instructions=(
        "You are a QA analyst. When given a Gherkin scenario, read it carefully, "
        "open a fresh browser session, and follow each step in order. After you finish, "
        "reply with a short plain-text report that mentions the location, the observed job "
        "count (or state if none were shown), and any interesting findings. Do not return JSON."
    ),
    tools=browser_tool.definitions,
)

print("Browser agent:", browser_agent.id)

Browser agent: asst_iKxnbWzK6ilcHVkUDpUeDV6Q


### Helper: Execute One Scenario

The helper below reads a `.feature` file, sends it to the browser agent, and collects the textual response without any parsing.

In [4]:
def run_gherkin_scenario(feature_path: Path) -> Dict[str, Any]:
    """Send one .feature file to the browser agent and return its free-text summary."""
    scenario_text = feature_path.read_text(encoding="utf-8")
    thread = agents_client.threads.create()
    prompt = f"""
Please execute this Gherkin scenario using Browser Automation. Once done, reply with a concise summary.

{scenario_text}
"""

    agents_client.messages.create(
        thread_id=thread.id,
        role=MessageRole.USER,
        content=prompt,
    )

    run = agents_client.runs.create_and_process(thread_id=thread.id, agent_id=browser_agent.id)

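    response = agents_client.messages.get_last_message_by_role(thread_id=thread.id, role=MessageRole.AGENT)
    if response is None:
        # A run that did not complete may leave no agent message; report the status instead of crashing
        text_response = f"No agent response (run status: {run.status})"
    else:
        # The reply can arrive in several text parts; join them into one summary
        text_response = "\n".join(msg.text.value for msg in response.text_messages)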

    return {
        "feature": feature_path.name,
        "scenario": scenario_text,
        "summary": text_response.strip(),
        "threadId": thread.id,
        "runId": run.id,
    }

## 3. Run the Checks

We keep the runs sequential for easy reading, but you could parallelize them in practice.

In [5]:
results: List[Dict[str, Any]] = []

for feature_name in ["switzerland.feature", "germany.feature", "latveria.feature"]:
    print(f"Running scenario: {feature_name}")
    outcome = run_gherkin_scenario(gherkin_dir / feature_name)
    results.append(outcome)
    print(outcome["summary"])
    print("-" * 60)

Running scenario: switzerland.feature
Location: Switzerland — Microsoft Careers  
Observed job count: 13  
Example job titles seen: Enterprise Security Executive (Security Solution Sales), High Performance Computing Engineer, Principle Quantum Software Architect, Member of Technical Staff - Machine Learning, AI Safety.  
No issues encountered; filtering and job count display work as expected.
------------------------------------------------------------
Running scenario: germany.feature
Microsoft Careers site returned 38 jobs for the location "Germany." Roles do span multiple cities, with many listings marked as "Multiple Locations, Germany" and city-specific postings for Munich, Berlin, and Frankfurt. No issues observed with filtering or visibility of job count.
------------------------------------------------------------
Running scenario: latveria.feature
Location: Latveria
Observed job count: 0
------------------------------------------------------------
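
The sequential loop above could be parallelized with a thread pool. The sketch below takes the runner as a parameter, so it works with `run_gherkin_scenario` or with a stub. It assumes the agents client can be shared safely across threads and that the browser connection allows concurrent sessions; the notebook confirms neither. `run_scenarios_in_parallel` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Callable, Dict, List


def run_scenarios_in_parallel(
    feature_paths: List[Path],
    runner: Callable[[Path], Dict[str, Any]],
    max_workers: int = 3,
) -> List[Dict[str, Any]]:
    """Run `runner` on every path concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs,
        # regardless of which scenario finishes first
        return list(pool.map(runner, feature_paths))


# Usage inside the notebook (assumption: thread-safe client):
# results = run_scenarios_in_parallel(
#     [gherkin_dir / name for name in scenarios], run_gherkin_scenario
# )
```

An exception raised by any runner propagates when its result is collected, so a failing scenario still stops the batch, just as it does in the sequential loop.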


## 4. Let Code Interpreter Build the Report

We now ask a second agent to read the raw summaries and generate a simple Excel workbook (no parsing on our side).

In [6]:
# Create reporting agent with Code Interpreter
from datetime import datetime

report_tool = CodeInterpreterTool()
current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

report_agent = agents_client.create_agent(
    model=MODEL_DEPLOYMENT_NAME,
    name="qa-report-writer",
    instructions=(
        "You turn QA observations into polished reports. Read the provided test summaries and "
        "produce an Excel workbook called qa-browser-results.xlsx with one sheet named Results. "
        "Include the columns Feature, Pass/Fail, Warnings/Errors, and Tester Notes, and add "
        "1-3 extra KPI columns as needed."
        "\nCurrent time: " + current_time
    ),
    tools=report_tool.definitions,
    tool_resources=report_tool.resources,
)

thread = agents_client.threads.create()
payload = json.dumps(results, indent=2)
prompt = (
    "Here are the raw QA notes from the browser agent. Please create the Excel file as requested.\n\n"
    f"```json\n{payload}\n```"
)

agents_client.messages.create(thread_id=thread.id, role=MessageRole.USER, content=prompt)
run = agents_client.runs.create_and_process(thread_id=thread.id, agent_id=report_agent.id)
print("Report agent run status:", run.status)

# Download the generated workbook
report_file_id = None
summary_text = ""
for message in agents_client.messages.list(thread_id=thread.id):
    if message.role == MessageRole.AGENT:
        if message.text_messages:
            summary_text = "\n".join(txt.text.value for txt in message.text_messages)
        if message.file_path_annotations:
            report_file_id = message.file_path_annotations[0].file_path.file_id

reports_dir = Path("/workspaces/browser-automation/reports")
reports_dir.mkdir(parents=True, exist_ok=True)  # also create missing parent folders
report_path = reports_dir / "qa-browser-results.xlsx"

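if report_file_id:
    agents_client.files.save(file_id=report_file_id, file_name=str(report_path))
    print("Saved report to", report_path)
else:
    # The agent may reply with text only; make the missing attachment visible
    print("No workbook was attached to the agent's messages")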
if summary_text:
    print("\nReport highlights:\n", summary_text)

Report agent run status: RunStatus.COMPLETED
Saved report to /workspaces/browser-automation/reports/qa-browser-results.xlsx

Report highlights:
 Thank you! I'll process these notes and create the requested `qa-browser-results.xlsx` workbook.  
Here is how each row will be summarized for the Results sheet:

|------------------|-----------|-------------------------------|------------------------------------------------------------------|-----------|---------------------|-------------------------------------------------|
| switzerland.feature | Pass      | None                          | Filtering and job count display work as expected.                | 13        | Switzerland         | Enterprise Security Executive, ...               |
| germany.feature     | Pass      | None                          | Roles span multiple cities (e.g., Munich, Berlin, Frankfurt).    | 38        | Germany             | Multiple Locations, Germany                      |

Let's generate the Excel file accor
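
Since the workbook is produced entirely by the reporting agent, it can be worth confirming that the saved file really is a workbook with a sheet named Results. The sketch below uses only the standard library: an `.xlsx` file is a ZIP archive whose `xl/workbook.xml` lists the sheets. It assumes the common (transitional) spreadsheet namespace, and `list_sheet_names` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path
from typing import List

# Namespace used by workbook.xml in typical .xlsx files (assumption: not "strict" OOXML)
SHEET_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_sheet_names(xlsx_path: Path) -> List[str]:
    """Return the sheet names of an .xlsx file, or [] if it is not a readable workbook."""
    if not zipfile.is_zipfile(xlsx_path):
        return []
    with zipfile.ZipFile(xlsx_path) as archive:
        if "xl/workbook.xml" not in archive.namelist():
            return []
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name", "") for sheet in root.iter(SHEET_NS + "sheet")]


# Usage inside the notebook:
# assert "Results" in list_sheet_names(report_path)
```

This checks structure only; the cell contents still need a human or a spreadsheet library to review.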

## 5. Clean Up Agents

Always delete the temporary agents when the run is complete.

In [7]:
agents_client.delete_agent(browser_agent.id)
agents_client.delete_agent(report_agent.id)
project_client.close()

print("Notebook finished. Agents removed.")

Notebook finished. Agents removed.
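
If an earlier cell raises, the cleanup cell never runs and the agents linger. One way to guarantee deletion is a context manager, sketched below. It relies only on the `create_agent` and `delete_agent` methods already used in this notebook; `temporary_agent` itself is a hypothetical helper.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def temporary_agent(client: Any, **create_kwargs: Any) -> Iterator[Any]:
    """Create an agent, yield it, and delete it even if the body raises."""
    agent = client.create_agent(**create_kwargs)
    try:
        yield agent
    finally:
        # Runs on normal exit and on exceptions alike
        client.delete_agent(agent.id)


# Usage inside the notebook:
# with temporary_agent(agents_client, model=MODEL_DEPLOYMENT_NAME,
#                      name="qa-browser-runner", tools=browser_tool.definitions) as agent:
#     ...  # run scenarios with agent.id
```

This suits scripts better than notebooks, where an agent usually has to outlive a single cell.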


## Wrap-up

- **Scenarios live only in Gherkin files** – change expectations by editing text, no code updates needed.
- **Browser agent** executes instructions and reports observations without rigid parsing.
- **Reporting agent** turns the free-form notes into an Excel artifact and a summary of highlights.

This pattern suits exploratory QA sweeps where you want quick coverage with minimal overhead.