# Maestro Automation Script Generator

## 1. Introduction
Mobile application testing is critical but often becomes a bottleneck in the development process. Many QA teams document test cases in natural language but lack the technical expertise to convert them into automated tests. This project bridges this gap by automatically converting natural language test cases into executable [Maestro](https://docs.maestro.dev/) test scripts.

### Problem Statement
- Manual test execution is time-consuming and error-prone
- QA teams often document test cases well but struggle to automate them
- Technical barriers exist between test case documentation and test automation
- Maestro offers a simpler automation syntax than traditional frameworks but still requires specific knowledge

### Solution Overview
This notebook demonstrates how generative AI can transform natural language test cases into executable Maestro test scripts, thereby:
- Accelerating test automation
- Reducing the technical expertise required for automation
- Improving test coverage and consistency
- Enabling more frequent test execution

## 2. Understanding the Technologies

### What is Maestro?
Maestro is a mobile UI testing framework that uses a declarative YAML syntax. It allows testers to create automated tests for mobile applications without extensive programming knowledge. Here's a simple example of a Maestro flow:

```yaml
appId: com.example.app
---
- launchApp
- tapOn: "Login"
- inputText: "username"
- tapOn: "Password"
- inputText: "password123"
- tapOn: "Submit"
- assertVisible: "Welcome"
```

### The AI-powered Conversion Process
We'll leverage generative AI to:
1. Parse natural language test cases
2. Understand the intended test flow
3. Extract key actions and validations
4. Generate structured Maestro YAML scripts
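
To make the four stages concrete, here is a deliberately naive, rule-based stand-in for the model. The keyword rules and the names `parse_steps`, `step_to_command`, and `steps_to_maestro` are illustrative assumptions, not part of this notebook's pipeline; the real conversion is delegated to Gemini, which handles far more varied phrasing.

```python
import re
from typing import List

def parse_steps(test_case: str) -> List[str]:
    """Stage 1: split a numbered test case into individual step strings."""
    steps = []
    for line in test_case.splitlines():
        match = re.match(r"\s*\d+\.\s*(.+)", line)
        if match:
            steps.append(match.group(1).strip())
    return steps

def step_to_command(step: str) -> str:
    """Stages 2-3: map one step to a Maestro command using toy keyword rules."""
    quoted = re.search(r'"([^"]+)"', step)
    target = quoted.group(1) if quoted else step
    lowered = step.lower()
    if lowered.startswith(("open", "launch")):
        return "- launchApp"
    if lowered.startswith("enter"):
        return f'- inputText: "{target}"'
    if lowered.startswith("verify"):
        return f'- assertVisible: "{target}"'
    # Anything else is treated as a tap on the quoted element
    return f'- tapOn: "{target}"'

def steps_to_maestro(test_case: str, app_id: str) -> str:
    """Stage 4: assemble the appId header and the command list."""
    commands = [step_to_command(step) for step in parse_steps(test_case)]
    return "\n".join([f"appId: {app_id}", "---", *commands])

example = '1. Launch the app\n2. Tap on "Login"\n3. Verify that "Welcome" is shown'
print(steps_to_maestro(example, "com.example.app"))
```

Rules like these break as soon as a step is phrased differently, which is the motivation for using a language model instead.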

## 3. GenAI Capabilities Used in This Project

### 1. Structured Output/Controlled Generation
We'll use this capability to ensure our model generates valid YAML syntax that conforms to Maestro's requirements.

### 2. Few-Shot Prompting
We'll provide examples of natural language to Maestro script conversions so the model can learn the patterns.

### 3. Function Calling
We'll implement functions to validate and enhance the generated scripts.

### 4. Document Understanding
Our model will process test case documents to extract meaningful test steps.

### 5. Grounding
We'll ensure the model generates scripts that reference valid Maestro commands and syntax.

## 4. Setting Up the Environment

In [1]:
# Install required packages (PyYAML is published on PyPI as "pyyaml", not "yaml")
!pip install -q pydantic pyyaml gradio google-generativeai

In [2]:
# Import necessary libraries
import os
import yaml
import json
import google.generativeai as genai
from pydantic import BaseModel, Field
from typing import List, Optional, Dict, Any

## API Setup
For this notebook to work, you need to have a Google API key with access to Gemini models.

In [3]:
# Setup credentials
# In Kaggle, we need to use the secrets framework to access API keys

# Add this code to access your API key from Kaggle secrets
# You'll need to add your API key to Kaggle secrets with the name "GOOGLE_API_KEY"
# Go to "Add-ons" > "Secrets" in your Kaggle notebook to add this

try:
    from kaggle_secrets import UserSecretsClient
    user_secrets = UserSecretsClient()
    # Get the API key from Kaggle secrets - you need to add it there first
    GOOGLE_API_KEY = user_secrets.get_secret("GOOGLE_API_KEY")
except ImportError:
    # Fallback for running locally or if secrets not available
    GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY", "")
    
# If you're not using Kaggle secrets, you can uncomment and use this instead:
# GOOGLE_API_KEY = "YOUR_API_KEY"  # Replace with your actual API key

# Configure the Google Generative AI API
genai.configure(api_key=GOOGLE_API_KEY)

In [4]:
# Load the Gemini model
model = genai.GenerativeModel('gemini-2.0-flash')

## 5. Function Calling Setup

We'll define a Pydantic schema for Maestro test flows, along with a validation function that checks whether a generated script is well formed:

In [5]:
class MaestroAction(BaseModel):
    action_type: str = Field(..., description="Type of action like tapOn, inputText, assertVisible, etc.")
    target: Optional[str] = Field(None, description="Target element or text for the action")
    value: Optional[str] = Field(None, description="Value for actions like inputText")
    
class MaestroFlow(BaseModel):
    appId: str = Field(..., description="The application ID (package name)")
    actions: List[MaestroAction] = Field(..., description="List of actions to perform")
    
def validate_maestro_script(yaml_content: str) -> Dict[str, Any]:
    """
    Validates if the YAML content is a valid Maestro script.
    Returns {"valid": True, "yaml": ...} if valid,
    or {"valid": False, "error": ...} otherwise.
    """
    try:
        # Parse the YAML content (config document, then the command list)
        parsed_yaml = list(yaml.safe_load_all(yaml_content))
        
        # Check for appId
        if not parsed_yaml or not isinstance(parsed_yaml[0], dict) or "appId" not in parsed_yaml[0]:
            return {"valid": False, "error": "Missing appId in the first YAML document."}
        
        # Check that all action types are valid Maestro commands
        valid_actions = [
            "launchApp", "tapOn", "inputText", "assertVisible", "assertNotVisible", 
            "scroll", "swipe", "back", "closeApp", "waitForAnimationToEnd",
            "pressKey", "runFlow", "takeScreenshot", "extendedWaitUntil"
        ]
        
        for doc in parsed_yaml[1:]:
            # The command document is a YAML list; each entry is either a bare
            # string ("launchApp") or a single-key mapping ({"tapOn": "Login"})
            commands = doc if isinstance(doc, list) else [doc]
            for command in commands:
                keys = list(command.keys()) if isinstance(command, dict) else [command]
                for key in keys:
                    if key not in valid_actions:
                        return {"valid": False, "error": f"Invalid action type: {key}"}
            
        return {"valid": True, "yaml": parsed_yaml}
    
    except yaml.YAMLError as e:
        return {"valid": False, "error": f"YAML parsing error: {str(e)}"}
    except Exception as e:
        return {"valid": False, "error": f"Validation error: {str(e)}"}
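
The schema above describes a flow as data. As a minimal sketch of the reverse direction, the snippet below renders such data back into Maestro-style YAML text. It uses plain dataclasses instead of Pydantic so that it runs without third-party packages; `Action` and `render_flow` are hypothetical names introduced only for this illustration.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Action:
    action_type: str              # e.g. "tapOn", "inputText", "launchApp"
    value: Optional[str] = None   # text argument, if the command takes one

def render_flow(app_id: str, actions: List[Action]) -> str:
    """Render an appId header plus a command list as Maestro-style YAML text."""
    lines = [f"appId: {app_id}", "---"]
    for action in actions:
        if action.value is None:
            # Commands without an argument are written as bare list items
            lines.append(f"- {action.action_type}")
        else:
            # json.dumps yields a double-quoted, escaped scalar, which YAML also accepts
            lines.append(f"- {action.action_type}: {json.dumps(action.value)}")
    return "\n".join(lines)

print(render_flow("com.example.app", [
    Action("launchApp"),
    Action("tapOn", "Login"),
    Action("assertVisible", "Welcome"),
]))
```

Generating YAML from structured data, rather than asking the model for raw YAML, is one way to guarantee syntactically valid output; whether it fits depends on how much of Maestro's syntax the schema needs to cover.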

## 6. Few-Shot Examples for Test Case Conversion

We'll create a series of examples to demonstrate the conversion pattern:

In [6]:
few_shot_examples = """
Example 1:
Test Case:
1. Open the application
2. Tap on the login button
3. Enter "testuser" in the username field
4. Enter "password123" in the password field
5. Tap on the submit button
6. Verify that the welcome message is displayed

Maestro Script:
```yaml
appId: com.example.app
---
- launchApp
- tapOn: "Login"
- tapOn: "Username"
- inputText: "testuser"
- tapOn: "Password"
- inputText: "password123"
- tapOn: "Submit"
- assertVisible: "Welcome"
```

Example 2:
Test Case:
1. Launch the app
2. Navigate to the shopping cart
3. Add the "Wireless Headphones" product to cart
4. Proceed to checkout
5. Verify that the total price is displayed

Maestro Script:
```yaml
appId: com.example.shop
---
- launchApp
- tapOn: "Shopping Cart"
- tapOn: "Wireless Headphones"
- tapOn: "Add to Cart"
- tapOn: "Checkout"
- assertVisible: "Total"
```

Example 3:
Test Case:
1. Open the note-taking app
2. Create a new note
3. Enter "Meeting notes" as the title
4. Type "Discuss project timeline" in the body
5. Save the note
6. Verify the note appears in the list

Maestro Script:
```yaml
appId: com.example.notes
---
- launchApp
- tapOn: "New Note"
- tapOn: "Title"
- inputText: "Meeting notes"
- tapOn: "Body"
- inputText: "Discuss project timeline"
- tapOn: "Save"
- assertVisible: "Meeting notes"
```
"""

## 7. Core Conversion Function with Structured Output

Now we'll implement the main conversion function. It sends a few-shot prompt to the Gemini model loaded above and extracts the YAML from the response:

In [7]:
def convert_test_case_to_maestro(test_case, app_id="com.example.app"):
    """
    Convert a natural language test case to a Maestro test script.
    
    Args:
        test_case (str): The natural language test case
        app_id (str): The application ID to use in the script
        
    Returns:
        str: The generated Maestro script
    """
    # Build the prompt with few-shot examples
    prompt = f"""You are an expert test automation engineer. 
    Convert the following test case into a Maestro test script.
    
    {few_shot_examples}
    
    Now, convert this test case:
    App ID: {app_id}
    
    Test Case:
    {test_case}
    
    Generate a complete and valid Maestro script that follows Maestro's syntax and features.
    The output should be a valid YAML with appId first followed by actions.
    """
    
    response = model.generate_content(
        prompt,
        generation_config={"temperature": 0.2}
    )
    
    # Extract the YAML content
    yaml_text = response.text
    
    # Extract the yaml part if it's within code blocks
    if "```yaml" in yaml_text:
        yaml_parts = yaml_text.split("```yaml")
        if len(yaml_parts) > 1:
            yaml_content = yaml_parts[1].split("```")[0].strip()
            return yaml_content
    elif "```" in yaml_text:
        yaml_parts = yaml_text.split("```")
        if len(yaml_parts) > 1:
            yaml_content = yaml_parts[1].strip()
            return yaml_content
    
    # In case no code blocks were used, try to extract content that looks like YAML
    lines = yaml_text.strip().split('\n')
    yaml_lines = []
    in_yaml = False
    
    for line in lines:
        if line.startswith("appId:"):
            in_yaml = True
            yaml_lines.append(line)
        elif in_yaml and (line.startswith("-") or line == "---"):
            yaml_lines.append(line)
        elif in_yaml and not line.strip():
            yaml_lines.append(line)
        elif in_yaml:
            # If we encounter a non-YAML looking line after YAML has started, stop
            if not (line.startswith(" ") or line.startswith("\t")):
                break
            yaml_lines.append(line)
    
    if yaml_lines:
        return "\n".join(yaml_lines)
    
    return yaml_text
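
The string-splitting logic above can also be expressed with a single regular expression. The following is a sketch of that alternative, not the notebook's implementation; `extract_fenced_block` is a hypothetical helper name, and the pattern assumes the model wraps its answer in a standard Markdown fence with an optional language tag.

```python
import re

# Matches the first fenced block, with or without a language tag such as "yaml".
# `{3} stands for the three backticks that open and close a Markdown fence.
FENCE_PATTERN = re.compile(r"`{3}[a-zA-Z]*[ \t]*\n(.*?)`{3}", re.DOTALL)

def extract_fenced_block(text: str) -> str:
    """Return the body of the first fenced code block, or the stripped text if none."""
    match = FENCE_PATTERN.search(text)
    return match.group(1).strip() if match else text.strip()
```

A single pattern handles both the tagged and untagged fence cases, so the two `if`/`elif` branches collapse into one code path.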

## 8. Document Understanding: Extract Test Cases from Documentation

Let's implement a function to extract test cases from more extensive documentation:

In [8]:
def extract_test_cases_from_document(document_text):
    """
    Extract test cases from a larger document.
    
    Args:
        document_text (str): The document text containing test cases
        
    Returns:
        list: List of extracted test cases
    """
    prompt = """
    Extract individual test cases from the following QA documentation.
    For each test case, identify:
    1. A descriptive title
    2. The sequence of steps
    3. The expected results
    
    Format your response as a JSON array where each object has the fields:
    - title: The test case title
    - steps: Array of test steps
    - expected_result: What should happen when the test passes
    
    Here's the QA documentation:
    """
    
    response = model.generate_content(
        prompt + document_text,
        generation_config={"temperature": 0.1}
    )
    
    text_response = response.text
    
    try:
        # Try to parse the response as JSON directly
        test_cases = json.loads(text_response)
        return test_cases
    except json.JSONDecodeError:
        # Try to extract JSON content if it's wrapped in markdown code blocks
        if "```json" in text_response:
            json_parts = text_response.split("```json")
            if len(json_parts) > 1:
                json_content = json_parts[1].split("```")[0].strip()
                try:
                    return json.loads(json_content)
                except json.JSONDecodeError:
                    pass
        elif "```" in text_response:
            json_parts = text_response.split("```")
            if len(json_parts) > 1:
                json_content = json_parts[1].strip()
                try:
                    return json.loads(json_content)
                except json.JSONDecodeError:
                    pass
                
        # If we still don't have valid JSON, use a simplified approach
        # Parse the text to extract test cases
        lines = text_response.strip().split('\n')
        test_cases = []
        current_case = None
        
        for line in lines:
            if line.startswith("Test Case:") or line.startswith("Title:"):
                if current_case:
                    test_cases.append(current_case)
                current_case = {"title": line.split(":", 1)[1].strip(), "steps": [], "expected_result": ""}
            elif line.startswith("Step ") and current_case:
                step = line.split(":", 1)[1].strip() if ":" in line else line
                current_case["steps"].append(step)
            elif line.startswith("Expected:") and current_case:
                current_case["expected_result"] = line.split(":", 1)[1].strip()
                
        if current_case:
            test_cases.append(current_case)
            
        return test_cases
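
Model replies often surround the JSON with prose or code fences. As a sketch of a fence-independent fallback, the standard library's `json.JSONDecoder.raw_decode` can parse a JSON value that starts at a given offset and ignore whatever follows it. `extract_first_json_array` is a hypothetical helper name, not something the notebook defines.

```python
import json

def extract_first_json_array(text: str):
    """Return the first JSON array embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start`
            value, _end = decoder.raw_decode(text, start)
            if isinstance(value, list):
                return value
        except json.JSONDecodeError:
            pass  # not valid JSON at this bracket; try the next one
        start = text.find("[", start + 1)
    return None
```

Because the search starts at each opening bracket in turn, the helper does not care whether the array is wrapped in a fence, preceded by a sentence, or followed by commentary.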

## 9. Batched Conversion: Processing Multiple Test Cases

In [9]:
def batch_convert_test_cases(test_cases, app_id="com.example.app"):
    """
    Convert multiple test cases to Maestro scripts.
    
    Args:
        test_cases (list): List of test case objects with title and steps
        app_id (str): The application ID
        
    Returns:
        dict: Dictionary mapping test case titles to Maestro scripts
    """
    results = {}
    
    for test in test_cases:
        title = test["title"]
        
        # Combine steps into a single test case description
        if isinstance(test["steps"], list):
            steps_text = "\n".join([f"{i+1}. {step}" for i, step in enumerate(test["steps"])])
        else:
            steps_text = test["steps"]
            
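        # Add expected result if available
        if "expected_result" in test and test["expected_result"]:
            if isinstance(test["steps"], list):
                steps_text += f"\n{len(test['steps'])+1}. Verify that {test['expected_result']}"
            else:
                # Steps supplied as free text have no reliable step count
                steps_text += f"\nVerify that {test['expected_result']}"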
            
        # Convert to Maestro script
        maestro_script = convert_test_case_to_maestro(steps_text, app_id)
        
        # Store the result
        results[title] = maestro_script
        
    return results
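
The step-numbering logic in this function is repeated in the demo and main-execution cells later in the notebook. One way to keep the three copies consistent is to factor it into a helper. This is a sketch under that assumption; `format_test_case` is a hypothetical name and is not called anywhere in the notebook as written.

```python
import re

def format_test_case(test: dict) -> str:
    """Render a test case dict as a numbered step list ending with a verification step."""
    steps = test.get("steps", [])
    if isinstance(steps, str):
        # Free-text steps: one step per non-empty line, with any existing numbering removed
        steps = [re.sub(r"^\d+[.)]\s*", "", line.strip())
                 for line in steps.splitlines() if line.strip()]
    steps = list(steps)
    expected = test.get("expected_result")
    if expected:
        steps.append(f"Verify that {expected}")
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, 1))

print(format_test_case({
    "title": "Login",
    "steps": ["Open the app", "Tap Login"],
    "expected_result": "the dashboard is shown",
}))
```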

## 10. Grounding: Validating Against Maestro Documentation


In [10]:
def validate_against_maestro_docs(maestro_script):
    """
    Validate a generated script against Maestro's documented capabilities.
    
    Args:
        maestro_script (str): The Maestro script to validate
        
    Returns:
        dict: Validation results with suggestions if issues are found
    """
    # Simplified Maestro command documentation
    maestro_commands = {
        "launchApp": "Launches the application specified by appId",
        "tapOn": "Taps on a UI element identified by text or accessibility ID",
        "inputText": "Types text into the currently focused field",
        "assertVisible": "Verifies that an element with the specified text is visible",
        "assertNotVisible": "Verifies that an element with the specified text is not visible",
        "scroll": "Scrolls in a specified direction",
        "swipe": "Swipes from one point to another",
        "back": "Simulates pressing the back button",
        "pressKey": "Simulates pressing a specific key",
        "waitForAnimationToEnd": "Waits for animations to complete",
        "takeScreenshot": "Captures a screenshot during test execution",
        "runFlow": "Runs another Maestro flow",
        "extendedWaitUntil": "Waits until a condition is met with a timeout"
    }
    
    # Parse the script
    try:
        yaml_content = yaml.safe_load_all(maestro_script)
        yaml_docs = list(yaml_content)
        
        if len(yaml_docs) < 2:
            return {
                "valid": False,
                "error": "Missing content: Maestro script should have at least appId and one action",
                "suggestions": ["Add actions after the appId declaration"]
            }
            
        # Verify appId is present
        if not isinstance(yaml_docs[0], dict) or "appId" not in yaml_docs[0]:
            return {
                "valid": False,
                "error": "Missing appId declaration",
                "suggestions": ["Add 'appId: com.your.app' as the first section"]
            }
            
        issues = []
        unsupported_commands = []
        
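        # Check each action. The command document is a YAML list whose entries are
        # bare strings ("launchApp") or single-key mappings ({"tapOn": "Login"}).
        commands = []
        for doc in yaml_docs[1:]:
            commands.extend(doc if isinstance(doc, list) else [doc])
        
        for i, action in enumerate(commands, 1):
            keys = list(action.keys()) if isinstance(action, dict) else [action]
            for cmd in keys:
                if cmd not in maestro_commands:
                    unsupported_commands.append(str(cmd))
                    issues.append(f"Command {i}: Unsupported command '{cmd}'")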
        
        if unsupported_commands:
            suggestions = [
                f"Replace '{cmd}' with one of these supported commands: {', '.join(list(maestro_commands.keys())[:5])}..."
                for cmd in unsupported_commands[:3]
            ]
            
            return {
                "valid": False,
                "error": f"Unsupported commands found: {', '.join(unsupported_commands)}",
                "suggestions": suggestions
            }
            
        return {"valid": True, "message": "Script appears to be valid Maestro syntax"}
        
    except yaml.YAMLError as e:
        return {
            "valid": False,
            "error": f"YAML parsing error: {str(e)}",
            "suggestions": ["Check indentation", "Ensure proper formatting of values"]
        }
    except Exception as e:
        return {
            "valid": False, 
            "error": f"Error validating script: {str(e)}",
            "suggestions": ["Check script format"]
        }
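
A validator becomes more useful when its error message is fed back into the next generation attempt. The loop below is a sketch of that idea. The generator and validator are passed in as callables so the example runs offline; `generate_with_validation` and both parameter names are assumptions for illustration, and wiring them to `convert_test_case_to_maestro` and `validate_against_maestro_docs` is left to the reader.

```python
from typing import Callable, Dict, Optional

def generate_with_validation(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Dict],
    max_attempts: int = 3,
) -> Dict:
    """Call generate until validate accepts the script or attempts run out.

    generate receives the previous validation error (None on the first call),
    so it can include that feedback in the next prompt.
    """
    feedback = None
    script = ""
    result = {"valid": False, "error": "no attempts made"}
    for attempt in range(1, max_attempts + 1):
        script = generate(feedback)
        result = validate(script)
        if result.get("valid"):
            return {"script": script, "attempts": attempt, "valid": True}
        feedback = result.get("error")
    return {"script": script, "attempts": max_attempts,
            "valid": False, "error": result.get("error")}
```

Capping the number of attempts keeps API usage bounded when the model repeatedly produces an unsupported command.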

## 11. Evaluation Metrics

In [11]:
def evaluate_conversion_quality(original_test_case, generated_script):
    """
    Evaluate the quality of the conversion from test case to Maestro script.
    
    Args:
        original_test_case (dict): The original test case
        generated_script (str): The generated Maestro script
        
    Returns:
        dict: Evaluation metrics
    """
    try:
        # Parse the generated script
        script_docs = list(yaml.safe_load_all(generated_script))
        
        # Flatten the command documents (skip appId). Each command is either a
        # bare string ("launchApp") or a single-key mapping ({"tapOn": "Login"}).
        commands = []
        for doc in script_docs[1:]:
            commands.extend(doc if isinstance(doc, list) else [doc])
        
        # Extract action names from the commands
        actions = []
        for command in commands:
            if isinstance(command, dict):
                actions.extend(command.keys())
            elif command is not None:
                actions.append(command)
                
        # Count the number of steps in the original test case
        original_steps = len(original_test_case["steps"])
        
        # Count the number of actions in the generated script
        generated_actions = len(actions)
        
        # Calculate coverage score
        coverage_score = min(generated_actions / max(1, original_steps), 1.0)
        
        # Check if expected result was included as an assertion
        has_assertion = False
        if "expected_result" in original_test_case and original_test_case["expected_result"]:
            expected_text = original_test_case["expected_result"].lower()
            
            for command in commands:
                if isinstance(command, dict) and "assertVisible" in command:
                    assertion_text = str(command["assertVisible"]).lower()
                    if any(word in assertion_text for word in expected_text.split()):
                        has_assertion = True
                        break
        
        return {
            "coverage_score": coverage_score,
            "includes_assertion": has_assertion,
            "original_steps": original_steps,
            "generated_actions": generated_actions,
            "completeness": "Good" if coverage_score > 0.8 else "Fair" if coverage_score > 0.5 else "Poor"
        }
    except Exception as e:
        return {
            "coverage_score": 0,
            "includes_assertion": False,
            "original_steps": len(original_test_case.get("steps", [])),
            "generated_actions": 0,
            "completeness": "Error",
            "error": str(e)
        }
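
The assertion check above matches substrings, so a short word such as "is" in the expected result matches any assertion text containing "this". A stricter variant compares whole words and ignores common stop words. The sketch below shows that variant next to the coverage formula; the stop-word list and the helper names are illustrative assumptions.

```python
import re

# Small illustrative stop-word list; not exhaustive
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "to", "of", "in", "that"}

def content_words(text: str) -> set:
    """Lowercase alphanumeric tokens with stop words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}

def assertion_matches(expected_result: str, assertion_text: str) -> bool:
    """True if the assertion shares at least one content word with the expectation."""
    return bool(content_words(expected_result) & content_words(assertion_text))

def coverage_score(original_steps: int, generated_actions: int) -> float:
    """Generated actions per original step, capped at 1.0."""
    return min(generated_actions / max(1, original_steps), 1.0)

print(assertion_matches("the welcome message is displayed", "Welcome"))  # whole-word match
print(assertion_matches("the order is confirmed", "This page"))          # no shared content word
print(coverage_score(5, 4))
```

Note that the coverage score only compares counts. A script with the right number of wrong commands still scores 1.0, so it is best read as a rough completeness signal rather than a correctness measure.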

## 12. Interactive Demo

Let's create an interactive demo to showcase our tool:

In [12]:
print("=" * 50)
print("Test Case to Maestro Script Converter")
print("=" * 50)

def run_demo():
    # Single test case conversion
    print("\n1. Single Test Case Conversion")
    print("-" * 30)
    
    # Use sample test case
    sample_test = test_cases[0]
    print(f"Test case: {sample_test['title']}")
    
    # Format steps
    steps_text = "\n".join([f"{i+1}. {step}" for i, step in enumerate(sample_test["steps"])])
    if sample_test["expected_result"]:
        steps_text += f"\n{len(sample_test['steps'])+1}. Verify that {sample_test['expected_result']}"
    
    print("\nTest steps:")
    print(steps_text)
    
    # Convert to Maestro script
    print("\nConverting...")
    maestro_script = convert_test_case_to_maestro(steps_text, "com.example.app")
    
    print("\nGenerated Maestro Script:")
    print("```yaml")
    print(maestro_script)
    print("```")
    
    # Batch conversion from the QA document
    print("\n\n2. Batch Conversion from QA Document")
    print("-" * 30)
    print("Processing QA document...")
    
    # Extract test cases
    extracted_tests = extract_test_cases_from_document(qa_document)
    print(f"Extracted {len(extracted_tests)} test cases")
    
    # Convert all extracted test cases
    maestro_scripts = batch_convert_test_cases(extracted_tests, "com.example.bankapp")
    
    # Display a sample of the results
    sample_index = 0
    print(f"\nSample result for: {extracted_tests[sample_index]['title']}")
    print("\nMaestro Script:")
    print("```yaml")
    print(maestro_scripts[extracted_tests[sample_index]['title']])
    print("```")
    
    # Save results to files (optional)
    print("\nWould you like to save the results to files? (Enter yes/no)")
    save_option = input().strip().lower()
    
    if save_option == 'yes':
        # Save single test case
        with open("single_test_maestro.yaml", "w") as f:
            f.write(maestro_script)
        print("Single test case saved to 'single_test_maestro.yaml'")
        
        # Save all batch results
        with open("batch_conversion_results.yaml", "w") as f:
            for title, script in maestro_scripts.items():
                f.write(f"# {title}\n")
                f.write(script)
                f.write("\n\n" + "-"*50 + "\n\n")
        print("Batch conversion results saved to 'batch_conversion_results.yaml'")
    
    # Interactive mode
    print("\n\n3. Interactive Mode")
    print("-" * 30)
    print("Enter a test case to convert (or type 'exit' to quit):")
    print("Start by entering the app ID:")
    
    app_id = input().strip()
    if app_id.lower() == 'exit':
        return
    
    while True:
        print("\nEnter test steps (one per line, end with an empty line):")
        lines = []
        while True:
            line = input()
            if not line:
                break
            lines.append(line)
        
        if not lines:
            break
            
        test_case = "\n".join(lines)
        
        print("\nConverting to Maestro script...")
        maestro_script = convert_test_case_to_maestro(test_case, app_id)
        
        print("\nGenerated Maestro Script:")
        print("```yaml")
        print(maestro_script)
        print("```")
        
        print("\nEnter 'continue' to convert another test case or 'exit' to quit:")
        choice = input().strip().lower()
        if choice != 'continue':
            break

Test Case to Maestro Script Converter


## 13. Test Case Examples and Demonstrations

Let's test our converter with different types of test cases:

In [13]:
test_cases = [
    {
        "title": "Login Flow Test",
        "steps": [
            "Open the application",
            "Tap on the login button",
            "Enter 'testuser' in the username field",
            "Enter 'password123' in the password field",
            "Tap on the submit button"
        ],
        "expected_result": "the welcome message is displayed"
    },
    {
        "title": "Product Search Test",
        "steps": [
            "Launch the app",
            "Navigate to the search page",
            "Enter 'headphones' in the search bar",
            "Tap the search button",
            "Scroll down to browse results"
        ],
        "expected_result": "product results are displayed"
    },
    {
        "title": "Checkout Flow Test",
        "steps": [
            "Open the shopping app",
            "Add 'Wireless Earbuds' to cart",
            "Navigate to shopping cart",
            "Proceed to checkout",
            "Enter shipping information",
            "Select payment method",
            "Confirm order"
        ],
        "expected_result": "order confirmation is shown"
    }
]

### Example QA Documentation for Testing

In [14]:
qa_document = """
# Mobile Banking App Test Cases

## User Authentication Module

Test Case: New User Registration
Steps:
1. Launch the banking app
2. Tap on "Register" button
3. Enter email address "test@example.com"
4. Enter password "SecurePass123!"
5. Confirm password "SecurePass123!"
6. Enter phone number "555-123-4567"
7. Tap on "Create Account" button
Expected: Registration success message appears

Test Case: Existing User Login
Steps:
1. Open the banking app
2. Enter username "existinguser"
3. Enter password "Password123"
4. Tap on "Login" button
Expected: User is logged in and sees the dashboard

Test Case: Forgot Password Flow
Steps:
1. Launch the app
2. Tap on "Login"
3. Tap on "Forgot Password" link
4. Enter registered email "user@example.com"
5. Tap on "Send Reset Link"
Expected: Password reset confirmation message is shown

## Transaction Module

Test Case: Check Account Balance
Steps:
1. Login to the banking app
2. Navigate to "Accounts" section
3. Tap on "Checking Account"
Expected: Account balance is displayed

Test Case: Transfer Money Between Accounts
Steps:
1. Login to the banking app
2. Tap on "Transfer" option
3. Select "From Account" as "Checking"
4. Select "To Account" as "Savings"
5. Enter amount "100.00"
6. Tap on "Transfer" button
7. Confirm the transfer details
8. Enter PIN "1234" for verification
Expected: Success message indicates transfer completed

Test Case: Bill Payment
Steps:
1. Login to the app
2. Navigate to "Bill Pay" section
3. Select "Electric Company" from payee list
4. Enter amount "75.50"
5. Select payment date as tomorrow
6. Tap on "Pay" button
7. Confirm payment details
Expected: Bill payment confirmation number is shown
"""

## 14. Main Execution

In [15]:
# Check if we have a valid API key before running
if not GOOGLE_API_KEY:
    print("⚠️ No API key found! Please add a Google API key to Kaggle secrets or set it in the code.")
    print("Go to 'Add-ons' > 'Secrets' in your Kaggle notebook and add GOOGLE_API_KEY")
else:
    print("✅ API key found, ready to proceed")

    # Test the conversion with one example
    print("\nTesting single test case conversion:")
    sample_test = test_cases[0]
    print(f"Converting test case: {sample_test['title']}")
    
    # Format the steps as a numbered list
    steps_text = "\n".join([f"{i+1}. {step}" for i, step in enumerate(sample_test["steps"])])
    if sample_test["expected_result"]:
        steps_text += f"\n{len(sample_test['steps'])+1}. Verify that {sample_test['expected_result']}"
    
    # Convert to Maestro script
    maestro_script = convert_test_case_to_maestro(steps_text, "com.example.app")
    
    print("\nGenerated Maestro Script:")
    print(maestro_script)
    
    # Validate the script
    validation = validate_against_maestro_docs(maestro_script)
    print("\nValidation result:", "Valid" if validation.get("valid", False) else "Invalid")
    if not validation.get("valid", False):
        print("Error:", validation.get("error", "Unknown error"))
        if "suggestions" in validation:
            print("Suggestions:", validation["suggestions"])
    
    # Launch the interactive demo
    print("\nLaunching interactive demo...")
    run_demo()

# Optional: test batch extraction on the example document (requires an API key)
if GOOGLE_API_KEY:
    print("\nTesting batch conversion with example document...")
    extracted_tests = extract_test_cases_from_document(qa_document)
    print(f"Extracted {len(extracted_tests)} test cases from the document")

    # Display titles of extracted test cases
    print("\nExtracted Test Cases:")
    for test in extracted_tests:
        print(f"- {test['title']}")

✅ API key found, ready to proceed

Testing single test case conversion:
Converting test case: Login Flow Test

Generated Maestro Script:
appId: com.example.app
---
- launchApp
- tapOn: "Login"
- tapOn: "Username"
- inputText: "testuser"
- tapOn: "Password"
- inputText: "password123"
- tapOn: "Submit"
- assertVisible: "Welcome"

Validation result: Valid

Launching interactive demo...

1. Single Test Case Conversion
------------------------------
Test case: Login Flow Test

Test steps:
1. Open the application
2. Tap on the login button
3. Enter 'testuser' in the username field
4. Enter 'password123' in the password field
5. Tap on the submit button
6. Verify that the welcome message is displayed

Converting...

Generated Maestro Script:
```yaml
appId: com.example.app
---
- launchApp
- tapOn: "Login"
- tapOn: "Username"
- inputText: "testuser"
- tapOn: "Password"
- inputText: "password123"
- tapOn: "Submit"
- assertVisible: "Welcome"
```


2. Batch Conversion from QA Document
------------

StdinNotImplementedError: raw_input was called, but this frontend does not support input requests.
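
The traceback above appears because `input()` is unavailable when the notebook runs in a non-interactive frontend. One way to keep the demo running in that setting is to fall back to a default answer. The helper below is a sketch: `prompt_or_default` is a hypothetical name, and the set of exceptions it catches is an assumption about how the frontend reports missing stdin, so adjust it to the error class you actually observe.

```python
from typing import Callable

def prompt_or_default(message: str, default: str,
                      reader: Callable[[str], str] = input) -> str:
    """Ask the user for input, falling back to default when stdin is unavailable."""
    try:
        answer = reader(message).strip()
    except (EOFError, OSError, NotImplementedError):
        # Assumed failure modes of a non-interactive frontend: use the default
        return default
    # An empty answer also selects the default
    return answer or default
```

With a helper like this, the save prompt in `run_demo` could default to "no" and the interactive mode could be skipped when no input is possible.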

## 15. Implementation Challenges and Solutions

### Challenge 1: Ambiguous Test Steps

Natural language test steps can be ambiguous. For example, "Enter password" doesn't specify which field to tap before entering text.

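**Solution:** The model uses context from previous steps and the few-shot examples to infer the correct sequence. As a complement, the keyword-based helper below flags steps that are likely to be ambiguous so they can be clarified before conversion: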

In [None]:
def analyze_step_ambiguity(test_steps):
    """
    Analyze test steps for potential ambiguities.
    
    Args:
        test_steps (list): List of test steps
        
    Returns:
        list: List of potential ambiguities and resolutions
    """
    ambiguities = []
    
    # Common patterns that might be ambiguous
    patterns = {
        "enter": "May need an explicit tap action before entering text",
        "select": "Could be implemented as tapOn or a more complex interaction",
        "verify": "Should be translated to an appropriate assertion",
        "check": "Could be assertVisible or a more complex validation"
    }
    
    for i, step in enumerate(test_steps):
        step_lower = step.lower()
        
        for pattern, issue in patterns.items():
            if pattern in step_lower:
                # Check if the previous step provides context
                context = ""
                if i > 0:
                    context = f"Previous step: '{test_steps[i-1]}'"
                
                ambiguities.append({
                    "step": step,
                    "potential_issue": issue,
                    "context": context,
                    "suggested_resolution": recommend_step_clarification(step, pattern)
                })
                break
                
    return ambiguities

def recommend_step_clarification(step, pattern):
    """Generate a recommended clarification for an ambiguous step"""
    words = step.split()
    # Heuristic: the word after the leading verb usually names the field or element
    subject = words[1] if len(words) > 1 else "relevant"
    if pattern == "enter":
        return f"Add a step before this to tap on the relevant field: 'Tap on the {subject} field'"
    elif pattern == "select":
        return f"Clarify if this is a tap action: 'Tap on the {subject} option'"
    elif pattern == "verify" or pattern == "check":
        return f"Specify what element should be visible: 'Verify that the {subject} is displayed'"
    else:
        return "Consider breaking this step into more specific actions"

# Example usage
ambiguity_results = analyze_step_ambiguity([
    "Open the banking app",
    "Enter username 'testuser'",
    "Enter password 'pass123'",
    "Select Checking Account",
    "Verify balance is displayed"
])

print("\nAmbiguity Analysis:")
for item in ambiguity_results:
    print(f"Step: {item['step']}")
    print(f"Issue: {item['potential_issue']}")
    print(f"Suggestion: {item['suggested_resolution']}\n")
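
Once an "enter" ambiguity has been flagged, one way to resolve it is to post-process the generated commands so that every text entry is preceded by a tap on its field. The sketch below assumes a hypothetical intermediate representation (a list of dicts with an `action` key) and hypothetical helper names `ensure_tap_before_input` and `to_maestro_lines`; none of these appear elsewhere in this notebook.

```python
def ensure_tap_before_input(commands):
    """Insert a tapOn before each inputText that is not already preceded by one.

    `commands` is an assumed intermediate form: dicts with an "action" key,
    plus "target" (for tapOn) or "field" and "value" (for inputText).
    """
    resolved = []
    for command in commands:
        if command["action"] == "inputText":
            previous = resolved[-1] if resolved else None
            already_focused = (
                previous is not None
                and previous["action"] == "tapOn"
                and previous["target"] == command["field"]
            )
            if not already_focused:
                resolved.append({"action": "tapOn", "target": command["field"]})
        resolved.append(command)
    return resolved


def to_maestro_lines(commands):
    """Render the intermediate commands as Maestro YAML list items."""
    lines = []
    for command in commands:
        if command["action"] == "tapOn":
            lines.append(f'- tapOn: "{command["target"]}"')
        elif command["action"] == "inputText":
            lines.append(f'- inputText: "{command["value"]}"')
        else:
            # Commands without arguments, such as launchApp
            lines.append(f'- {command["action"]}')
    return lines


example_commands = [
    {"action": "launchApp"},
    {"action": "inputText", "field": "Username", "value": "testuser"},
    {"action": "tapOn", "target": "Password"},
    {"action": "inputText", "field": "Password", "value": "pass123"},
]
resolved_lines = to_maestro_lines(ensure_tap_before_input(example_commands))
print("\n".join(resolved_lines))
```

Only the username entry gains a tap here, because the password entry is already preceded by a tap on its own field.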

### Challenge 2: Handling Complex UI Interactions

Some mobile interactions are complex and difficult to express in Maestro's YAML format.

**Solution:** We implement a library of common mobile interaction patterns that can be inserted into the Maestro scripts:

In [None]:
def get_complex_interaction_template(interaction_type, **params):
    """
    Generate a template for complex mobile interactions.
    
    Args:
        interaction_type (str): Type of interaction (swipe, long_press, etc.)
        params: Parameters specific to the interaction type
        
    Returns:
        str: YAML snippet for the interaction
    """
    templates = {
        "swipe_down": """- swipe:
    start: "50%,20%"
    end: "50%,80%"
    duration: 500""",

        "swipe_up": """- swipe:
    start: "50%,80%"
    end: "50%,20%"
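    duration: 500""",

        # Maestro's long-press command is `longPressOn`, which takes a selector.
        # A press duration is not a documented option as far as we know, so a
        # 'duration' parameter is ignored here.
        "long_press": f"- longPressOn: \"{params.get('target', 'Button')}\"",

        # NOTE: `multiTouch` does not appear in Maestro's documented command list
        # as far as we can tell; treat this template as a placeholder and verify
        # it against the Maestro docs before use.
        "pinch_to_zoom": """- multiTouch: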
    - startPositions:
        - "40%,40%"
        - "60%,60%"
      endPositions: 
        - "30%,30%"
        - "70%,70%"
      duration: 500""",

        "item_selection": f"""- tapOn: "{params.get('list_name', 'List')}"
- scroll
- tapOn: "{params.get('item_name', 'Item')}" """
    }
    
    return templates.get(interaction_type, "- tapOn: 'Unsupported Interaction'")

# Example of complex interaction templates
print("\nComplex Interaction Templates:")
print("Swipe Down Template:")
print(get_complex_interaction_template("swipe_down"))
print("\nItem Selection Template:")
print(get_complex_interaction_template("item_selection", list_name="Products", item_name="Headphones"))
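
The templates above are fragments, so they still have to be combined with the `appId` header and the `---` separator before Maestro can run them. A minimal sketch of that assembly step follows; `build_flow` is a hypothetical helper name, and the snippets are written inline so the example stands on its own.

```python
def build_flow(app_id, snippets):
    """Join an appId header and YAML command snippets into one flow string."""
    # Drop empty snippets and surrounding whitespace. Indentation inside each
    # snippet is preserved because only the outer edges are stripped.
    body = "\n".join(snippet.strip() for snippet in snippets if snippet.strip())
    return f"appId: {app_id}\n---\n{body}\n"


flow = build_flow("com.example.app", [
    "- launchApp",
    '- swipe:\n    start: "50%,80%"\n    end: "50%,20%"\n    duration: 500',
    "",
    '- assertVisible: "Welcome"',
])
print(flow)
```

Keeping assembly separate from the templates means each template stays a plain string that can be reviewed or replaced without touching the flow header.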

### Challenge 3: Element Selection Strategies

Maestro primarily uses text-based selectors, but many elements might not have visible text.

**Solution:** We implement a preprocessing step to identify and handle non-text elements:

In [None]:
def improve_element_selection(test_step):
    """
    Enhance test steps to deal with elements that might not have text.
    
    Args:
        test_step (str): The original test step
        
    Returns:
        dict: Enhanced step with selection alternatives
    """
    # Common UI elements that might not have text
    non_text_elements = {
        "hamburger menu": {
            "description": "Menu button usually in top corner",
            "maestro_alternatives": [
                "- tapOn: 'Menu'",
                "- tapOn: '☰'",
                "- tapOn: 'nav_menu'"
            ]
        },
        "back button": {
            "description": "Navigation back button",
            "maestro_alternatives": [
                "- back",
                "- tapOn: 'Back'",
                "- tapOn: '←'",
                "- tapOn: 'toolbar_back'"
            ]
        },
        "search icon": {
            "description": "Search magnifying glass icon",
            "maestro_alternatives": [
                "- tapOn: 'Search'",
                "- tapOn: '🔍'",
                "- tapOn: 'search_icon'"
            ]
        }
    }
    
    step_lower = test_step.lower()
    
    for element, info in non_text_elements.items():
        if element in step_lower:
            return {
                "original_step": test_step,
                "element_type": element,
                "description": info["description"],
                "maestro_alternatives": info["maestro_alternatives"],
                "recommendation": "Try each alternative as element identification may vary by app"
            }
    
    return {"original_step": test_step, "element_type": "text-based", "recommendation": "Use standard tapOn with element text"}

# Test element selection enhancement
test_steps = [
    "Tap on the hamburger menu",
    "Click the search icon",
    "Press the back button",
    "Tap on Login button"
]

print("\nElement Selection Strategies:")
for step in test_steps:
    result = improve_element_selection(step)
    print(f"\nStep: {result['original_step']}")
    print(f"Element Type: {result['element_type']}")
    if "description" in result:
        print(f"Description: {result['description']}")
        print("Maestro Alternatives:")
        for alt in result["maestro_alternatives"]:
            print(f"  {alt}")
    print(f"Recommendation: {result['recommendation']}")
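
For elements with no visible text, Maestro selectors can also match on an element `id` instead of text. The sketch below shows one way to emit either form; `build_tap_command` is a hypothetical helper, and the id `nav_menu` is only an illustrative value that will differ from app to app.

```python
def build_tap_command(text=None, element_id=None):
    """Return a tapOn YAML snippet using an id selector or a text selector."""
    if element_id is not None:
        # Prefer the id when one is known, since it does not depend on
        # visible text.
        return f'- tapOn:\n    id: "{element_id}"'
    if text is not None:
        return f'- tapOn: "{text}"'
    raise ValueError("Provide either text or element_id")


print(build_tap_command(element_id="nav_menu"))
print(build_tap_command(text="Login"))
```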

## 16. Best Practices for Test Case Writing

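Based on our analysis, the function below scores a test case against five practices that help it convert cleanly to a Maestro script: numbered steps, a leading action verb, explicit element identification, a stated expected result, and a focused step count: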

In [None]:
def analyze_test_case_quality(test_case):
    """
    Analyze a test case for best practices and provide recommendations.
    
    Args:
        test_case (dict): Test case with title, steps, and expected_result
        
    Returns:
        dict: Analysis results with recommendations
    """
    analysis = {
        "score": 0,
        "recommendations": []
    }
    
    # Check for clear step numbering
    if all(isinstance(step, str) and (step.startswith(str(i+1)) or step.lower().startswith("step")) 
           for i, step in enumerate(test_case["steps"])):
        analysis["score"] += 1
    else:
        analysis["recommendations"].append("Use clear step numbering (1., 2., etc.)")
    
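    # Check for action verbs at the start of steps, ignoring a leading
    # "1." or "Step 1:" prefix so that numbered steps are not penalized
    action_verbs = ["tap", "click", "enter", "select", "verify", "check", "swipe", "scroll", "launch", "open"]

    def strip_numbering(step):
        text = step.lower().strip()
        if text.startswith("step"):
            text = text[len("step"):]
        return text.lstrip("0123456789.):- ")

    if all(strip_numbering(step).startswith(tuple(action_verbs)) for step in test_case["steps"]):
        analysis["score"] += 1
    else:
        analysis["recommendations"].append("Start each step with a clear action verb (tap, enter, verify, etc.)")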
    
    # Check for specific element identification
    if all("\"" in step or "'" in step or "button" in step.lower() or "field" in step.lower() for step in test_case["steps"]):
        analysis["score"] += 1
    else:
        analysis["recommendations"].append("Identify specific UI elements in quotes or by type (button, field)")
    
    # Check for expected results
    if "expected_result" in test_case and test_case["expected_result"]:
        analysis["score"] += 1
    else:
        analysis["recommendations"].append("Include clear expected results for verification steps")
    
    # Check for reasonable step count
    if 3 <= len(test_case["steps"]) <= 15:
        analysis["score"] += 1
    else:
        analysis["recommendations"].append("Keep test cases focused with 3-15 steps")
    
    # Calculate percentage score
    analysis["percentage"] = (analysis["score"] / 5) * 100
    
    # Overall assessment
    if analysis["percentage"] >= 80:
        analysis["assessment"] = "Good - Test case is well-structured for automation"
    elif analysis["percentage"] >= 60:
        analysis["assessment"] = "Fair - Test case needs minor improvements"
    else:
        analysis["assessment"] = "Needs Improvement - Test case structure may cause automation issues"
    
    return analysis


# Example test cases with varying quality
test_cases_for_analysis = [
    {
        "title": "Well-Structured Test",
        "steps": [
            "1. Launch the banking app",
            "2. Tap on 'Login' button",
            "3. Enter 'testuser' in the Username field",
            "4. Enter 'password123' in the Password field",
            "5. Tap on 'Submit' button"
        ],
        "expected_result": "Dashboard is displayed with account summary"
    },
    {
        "title": "Poorly Structured Test",
        "steps": [
            "Login screen",
            "Username and password",
            "Submit and wait",
            "Should see accounts"
        ],
        "expected_result": ""
    }
]

print("\nTest Case Writing Best Practices:")
for test_case in test_cases_for_analysis:
    analysis = analyze_test_case_quality(test_case)
    print(f"\nTest: {test_case['title']}")
    print(f"Score: {analysis['percentage']}%")
    print(f"Assessment: {analysis['assessment']}")
    if analysis["recommendations"]:
        print("Recommendations:")
        for rec in analysis["recommendations"]:
            print(f"- {rec}")
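
The numbering recommendation can be applied automatically before conversion. The following is a minimal sketch, assuming steps arrive as a list of strings; `number_steps` is a hypothetical helper and is not part of the analysis function above.

```python
import re


def number_steps(steps):
    """Return steps renumbered as '1. ...', '2. ...', dropping any old prefix."""
    numbered = []
    for index, step in enumerate(steps, start=1):
        # Remove an existing "3.", "3)" or "Step 3:" prefix. A separator is
        # required so that steps such as "3 items are shown" stay intact.
        text = re.sub(r"^\s*(?:step\s*)?\d+\s*[.):-]\s*", "", step, flags=re.IGNORECASE)
        numbered.append(f"{index}. {text.strip()}")
    return numbered


normalized_steps = number_steps([
    "Launch the app",
    "2) Tap on 'Login' button",
    "Step 7: Verify 'Welcome' is displayed",
])
for step in normalized_steps:
    print(step)
```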


## 17. Limitations and Future Improvements

In [None]:
def display_limitations_and_improvements():
    """
    Display current limitations and potential future improvements.
    """
    limitations = [
        "Limited handling of complex gestures like multi-touch",
        "No support for visual element identification",
        "Cannot infer element hierarchy when not explicit in test steps",
        "May generate incorrect sequences for ambiguous steps",
        "Limited handling of conditional flows (if-then scenarios)"
    ]
    
    future_improvements = [
        "Integration with screenshot analysis for visual element matching",
        "Support for parameterized test cases with data-driven testing",
        "Addition of wait strategies for dynamic content",
        "Support for network condition simulation",
        "Generation of test data based on test case requirements",
        "Support for more complex verification beyond simple assertions"
    ]
    
    print("\nCurrent Limitations:")
    for i, limitation in enumerate(limitations, 1):
        print(f"{i}. {limitation}")
        
    print("\nFuture Improvements:")
    for i, improvement in enumerate(future_improvements, 1):
        print(f"{i}. {improvement}")
        
    print("\nNote: This project demonstrates the power of GenAI for automation, but human review is still essential for complex test cases.")

# Display limitations and improvements
display_limitations_and_improvements()
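
As an illustration of the wait-strategy improvement listed above, Maestro documents an `extendedWaitUntil` command that waits for an element to become visible within a timeout. The sketch below only builds the YAML snippet; the helper name `wait_for_visible` and the 10-second default are our assumptions, not part of the current generator.

```python
def wait_for_visible(text, timeout_ms=10000):
    """Return an extendedWaitUntil snippet that waits for `text` to appear."""
    return (
        "- extendedWaitUntil:\n"
        f'    visible: "{text}"\n'
        f"    timeout: {timeout_ms}"
    )


print(wait_for_visible("Account Summary"))
```

A generator could insert such a snippet before assertions on screens that load data from the network.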

## 18. Conclusion and Impact Assessment

This project demonstrates how generative AI can be applied to test automation, using several capabilities:

1. **Structured output generation**: Converting natural language to valid Maestro YAML
2. **Few-shot prompting**: Teaching the model test case conversion patterns
3. **Function calling**: Validating and enhancing generated scripts
4. **Document understanding**: Extracting test cases from larger documentation
5. **Grounding**: Ensuring generated scripts use valid Maestro commands

The impact of this solution includes:

- **Reduced automation barriers**: Enabling QA teams to automate without deep technical knowledge
- **Accelerated testing cycles**: Converting test cases to executable scripts in seconds
- **Improved test coverage**: Making it easier to maintain comprehensive automated test suites
- **Knowledge democratization**: Allowing more team members to contribute to test automation

Overall, the project is a practical application of GenAI to a real QA bottleneck. As noted under Limitations, generated scripts still need human review for complex test cases.