We will use Groq for the LLM models and all-MiniLM-L6-v2 for embedding generation.

Initializing the packages

In [None]:
%pip install groq
%pip install beautifulsoup4
%pip install sentence-transformers
%pip install llama-index-core llama-index-vector-stores-postgres
%pip install pymupdf beautifulsoup4
%pip install psycopg2-binary sqlalchemy asyncpg pgvector

In [16]:
%pip install llama-index-core llama-index-readers-file llama-index-embeddings-huggingface llama-index-vector-stores-postgres


Collecting llama-index-core
  Downloading llama_index_core-0.14.8-py3-none-any.whl.metadata (2.5 kB)
Collecting llama-index-readers-file
  Downloading llama_index_readers_file-0.5.4-py3-none-any.whl.metadata (5.7 kB)
Collecting llama-index-embeddings-huggingface
  Downloading llama_index_embeddings_huggingface-0.6.1-py3-none-any.whl.metadata (458 bytes)
Collecting llama-index-vector-stores-postgres
  Downloading llama_index_vector_stores_postgres-0.7.1-py3-none-any.whl.metadata (555 bytes)
Collecting aiohttp<4,>=3.8.6 (from llama-index-core)
  Using cached aiohttp-3.13.2-cp311-cp311-win_amd64.whl.metadata (8.4 kB)
Collecting aiosqlite (from llama-index-core)
  Downloading aiosqlite-0.21.0-py3-none-any.whl.metadata (4.3 kB)
Collecting banks<3,>=2.2.0 (from llama-index-core)
  Downloading banks-2.2.0-py3-none-any.whl.metadata (12 kB)
Collecting dataclasses-json (from llama-index-core)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting deprecated>=1.2.9.3 (f

Importing


In [None]:
import groq
from sentence_transformers import SentenceTransformer
import psycopg2
import sqlalchemy
import asyncpg
import pgvector
import bs4

print("All correct.")

In [2]:
%pip install python-dotenv

Collecting python-dotenv
  Downloading python_dotenv-1.2.1-py3-none-any.whl.metadata (25 kB)
Downloading python_dotenv-1.2.1-py3-none-any.whl (21 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.2.1
Note: you may need to restart the kernel to use updated packages.


Loading the env and Groq client

In [None]:
import os
from groq import Groq
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)
print("Groq client initialized.")

Groq client initialized.
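
A small sketch for failing with an explicit message when the API key is missing from the environment. `require_env` is a hypothetical helper written for this notebook; it is not part of the Groq SDK or python-dotenv.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file."
        )
    return value

# Usage (assumes load_dotenv() has already run):
# client = Groq(api_key=require_env("GROQ_API_KEY"))
```

This keeps the "key not found" case separate from any later API error, which makes setup problems easier to spot.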


We will use "llama-3.3-70b-versatile" for test case generation and "qwen/qwen3-32b" for code generation.

Also we will use all-MiniLM-L6-v2 for embedding generation.

In [18]:
from sentence_transformers import SentenceTransformer

embed_model = SentenceTransformer("all-MiniLM-L6-v2")
embed_dim = 384




2. Groq wrapper helpers


2.1 Non stream helper


In [None]:
MODEL_TC = "llama-3.3-70b-versatile"  # test case generation model

def groq_generate(prompt: str, model: str = MODEL_TC, max_tokens: int = 800, temperature: float = 0.1) -> str:
    # Non-streaming chat completion; returns the assistant message text.
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_completion_tokens=max_tokens,
        stream=False,
    )

    # The SDK returns message objects, so read .content as an attribute.
    if response.choices and response.choices[0].message:
        return response.choices[0].message.content or ""

    # Fallback: return the raw response for debugging
    return str(response)


2.2 Stream helper


In [17]:
MODEL_CODE = "qwen/qwen3-32b"  # code generation model

def groq_generate_stream(prompt: str, model: str = MODEL_CODE, temperature: float = 0.2, max_tokens: int = 2048) -> str:
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_completion_tokens=max_tokens,
        reasoning_effort="default",
        stream=True,
    )
    # completion is an iterator of chunks; print each piece as it arrives
    full = ""
    for chunk in completion:
        # Some chunks may carry no choices or no text, so guard both cases.
        if not chunk.choices:
            continue
        # chunk.choices[0].delta.content contains the incremental content
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:
            print(content, end="", flush=True)
            full += content
    print()  # newline after streaming
    return full

3. Embeddings

In [15]:
from sentence_transformers import SentenceTransformer
embed_model = SentenceTransformer("all-MiniLM-L6-v2")
EMBED_DIM = 384



Checking the Postgres connection (Docker)


In [1]:
import psycopg2

try:
    conn = psycopg2.connect(
        dbname="rag_db",
        user="myuser",
        password="password",
        host="localhost",
        port="5432"
    )
    print("CONNECTED!")
    conn.close()
except Exception as e:
    print("FAILED →", e)


CONNECTED!


4. Postgres + PGVector vector store

In [10]:
from sqlalchemy import create_engine
from llama_index.vector_stores.postgres import PGVectorStore
import os
import psycopg2

# Configure via env or defaults
DB_USER = os.getenv("PG_USER", "myuser")
DB_PASS = os.getenv("PG_PASS", "password")
DB_NAME = os.getenv("PG_DB", "rag_db")
DB_HOST = os.getenv("PG_HOST", "localhost")
DB_PORT = os.getenv("PG_PORT", "5432")
DB_TABLE = os.getenv("PG_TABLE", "rag_nodes")   # actual table = data_rag_nodes

EMBED_DIM = 384 

try:
    conn = psycopg2.connect(
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASS,
        host=DB_HOST,
        port=DB_PORT
    )
    print("CONNECTED TO POSTGRES SUCCESSFULLY!")
    conn.close()
except Exception as e:
    print(e)

#--SQLAlchemy engine string
engine = create_engine(
    f"postgresql://{DB_USER}:{DB_PASS}@{DB_HOST}:{DB_PORT}/{DB_NAME}"
)

#--PGVectorStore - auto-creates table: data_rag_nodes--
VECTOR_TABLE = os.getenv("VECTOR_TABLE","rag_nodes")
vector_store = PGVectorStore.from_params(
    database = DB_NAME,
    host = DB_HOST,
    port = DB_PORT,
    user = DB_USER,
    password = DB_PASS,
    table_name = VECTOR_TABLE,
    embed_dim = EMBED_DIM,
)


print("PGVectorStore Initialized")

CONNECTED TO POSTGRES SUCCESSFULLY!
PGVectorStore Initialized
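
The f-string connection URL in the cell above breaks if the password contains characters such as `@`, `:` or `/`. A minimal sketch of escaping the credentials first; `build_pg_url` is a hypothetical helper, and the sample password is made up for illustration.

```python
from urllib.parse import quote_plus

def build_pg_url(user: str, password: str, host: str, port: str, database: str) -> str:
    """Build a postgresql:// URL with URL-escaped credentials."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )

# The special characters in the password are percent-encoded.
print(build_pg_url("myuser", "p@ss:word/1", "localhost", "5432", "rag_db"))
# → postgresql://myuser:p%40ss%3Aword%2F1@localhost:5432/rag_db
```

The resulting string can be passed to `create_engine` in place of the hand-built f-string.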


5. Load + Preprocess Documents


In [14]:
from pathlib import Path
import re

def clean_text_block(text: str):
    text = " ".join(text.split())
    if len(text) < 5:
        return None
    return text

documents = []

# Example: load your preprocessed .txt file
INPUT_PATH = Path("C:/Users/subha/Desktop/assignment/sample_example/Software-Test-RAG/processed_html.txt")

raw = INPUT_PATH.read_text(encoding="utf-8")

for block in raw.split("\n\n"):
    cleaned = clean_text_block(block)
    if cleaned:
        documents.append(cleaned)

print("Loaded", len(documents), "clean doc blocks.")


Loaded 5 clean doc blocks.
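
To see what the preprocessing does, here is the same `clean_text_block` applied to a small made-up string (the sample text is an assumption, not taken from `processed_html.txt`). Whitespace runs collapse to single spaces, and blocks shorter than 5 characters are dropped.

```python
def clean_text_block(text: str):
    """Collapse whitespace; return None for blocks shorter than 5 characters."""
    text = " ".join(text.split())
    if len(text) < 5:
        return None
    return text

raw = "Login   page\nwith  two fields\n\nok\n\n  Submit button  signs the user in  "

# Split on blank lines, clean each block, and keep only the non-empty results.
blocks = [b for b in (clean_text_block(x) for x in raw.split("\n\n")) if b]
print(blocks)
# → ['Login page with two fields', 'Submit button signs the user in']
```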


6. Chunk Documents into Nodes


In [15]:
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import TextNode

splitter = SentenceSplitter(chunk_size = 512)
nodes = []

for doc in documents:
    chunks = splitter.split_text(doc)
    for ch in chunks:
        nodes.append(TextNode(text=ch))

print("Total chunks", len(nodes))  # count all nodes, not only the last document's chunks
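
For intuition about what chunking does, here is a deliberately simplified sketch that splits on words with a fixed window and overlap. This is not how `SentenceSplitter` works internally (it measures chunk size in tokens and tries to respect sentence boundaries); `chunk_words` and its parameters are hypothetical.

```python
def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> list:
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

print(chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1))
# → ['a b c d', 'd e f g', 'g h i j']
```

The overlap means neighbouring chunks share some context, which helps retrieval when an answer straddles a chunk boundary.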


7. Generate Embeddings for Nodes

In [19]:
for node in nodes:
    node.embedding = embed_model.encode(node.text).tolist()

print("Embeddings assigned to nodes.")

Embeddings assigned to nodes.


8. Insert into PGVector

In [20]:
vector_store.add(nodes)
print("Nodes added to PGVectorStore.")

Nodes added to PGVectorStore.


9. Create Retriever

In [21]:
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.vector_stores import VectorStoreQuery
from llama_index.core import QueryBundle
from llama_index.core.schema import NodeWithScore

class PGVectorRetriever(BaseRetriever):

    def __init__(self, vector_store, embed_model, k=3):
        super().__init__()
        self.vector_store = vector_store
        self.embed_model = embed_model
        self.k = k

    def _retrieve(self, query_bundle: QueryBundle):
        q_emb = self.embed_model.encode(query_bundle.query_str).tolist()
        q = VectorStoreQuery(query_embedding=q_emb, similarity_top_k=self.k)
        result = self.vector_store.query(q)

        out = []
        for node, score in zip(result.nodes, result.similarities):
            out.append(NodeWithScore(node=node, score=score))
        return out

retriever = PGVectorRetriever(vector_store, embed_model)
print("Retriever ready.")


Retriever ready.
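
Conceptually, the retriever returns the `k` stored vectors most similar to the query embedding. The pure-Python sketch below illustrates that ranking, assuming cosine similarity as the metric; the real search runs inside Postgres via pgvector, and `cosine_similarity` and `top_k` here are hypothetical helpers operating on toy 2-dimensional vectors.

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return (index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], docs, k=2))
```

With this toy data, document 0 ranks first (same direction as the query) and document 2 second.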


10. Configure Groq LLM for Response Generation

In [22]:
from groq import Groq
client = Groq()  # reads GROQ_API_KEY from the environment

MODEL_CODE = "qwen/qwen3-32b"

def groq_complete(prompt, model=MODEL_CODE):
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.4,
        max_completion_tokens=1024,
    )

    # message is an object, not a dict, so use attribute access
    return completion.choices[0].message.content

10. Alternative: Corrected Code

In [36]:
from groq import Groq
from dotenv import load_dotenv
import os, re
import json

load_dotenv()
API_KEY = os.getenv("GROQ_API_KEY")
client = Groq(api_key=API_KEY)

MODEL_TC = "llama-3.3-70b-versatile"   # test case generation
MODEL_CODE = "qwen/qwen3-32b"          # code generation


def groq_smart(prompt, model=MODEL_TC, temperature=0.1):
    system_prompt = """
    You are a Senior QA lead.
    Output ONLY a valid JSON object.
    Do not include any markdown formatting or conversational text.
    """

    try:
        response = client.chat.completions.create(
            model = model,
            messages=[
                {"role":"system","content":system_prompt},
                {"role":"user","content":prompt}
            ],
            temperature=temperature,
            max_completion_tokens=4096,
            response_format = {"type":"json_object"}
        )
        
        return response.choices[0].message.content
    except Exception as e:
        return json.dumps({"error":str(e)})

In [5]:
test = groq_smart(
    "Explain the entire inner working of a web browser (network stack, rendering engine, JS engine, GPU pipeline, process model, memory model, scheduling, event loop, IPC, sandboxing). Write it as a full textbook chapter with deep technical detail."
)
print(test)


The inner working of a web browser is a complex system consisting of the network stack, rendering engine, JavaScript engine, GPU pipeline, process model, memory model, scheduling, event loop, IPC, and sandboxing, all working together to provide a fast, secure, and feature-rich browsing experience.


11. Build Query Engine


In [37]:
class RAGQueryEngine:

    def __init__(self, retriever):
        self.retriever = retriever

    def query(self, q):
        bundle = QueryBundle(q)
        retrieved = self.retriever.retrieve(bundle)
        
        context = "\n\n".join([n.node.text for n in retrieved])
        
        # The prompt asks for a specific JSON structure so the output can be parsed
        final_prompt = f"""
        You are a Test Scenario Generation LLM.
        Based on the Context below, generate a detailed Test Plan in JSON.
        
        CONTEXT:
        {context}

        QUESTION:
        {q}

        OUTPUT REQUIREMENTS:
        Return a JSON object with a key "test_cases" containing a list.
        Each item must have:
        - "id": "TC001"
        - "title": "Short title"
        - "steps": ["Step 1", "Step 2", ...]
        - "expected_result": "Final verification step"
        """
        
        return groq_smart(final_prompt)

rag_engine = RAGQueryEngine(retriever)
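
Since `groq_smart` returns a string that is either the model's JSON or an `{"error": ...}` payload, the caller has to parse and check it before use. A minimal sketch of such a check against the structure requested in the prompt; `parse_test_plan` and `REQUIRED_KEYS` are hypothetical names, and the sample string is made up.

```python
import json

REQUIRED_KEYS = {"id", "title", "steps", "expected_result"}

def parse_test_plan(raw: str) -> list:
    """Return the list of test cases from a JSON string, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    if "error" in data:
        raise ValueError(f"LLM call failed: {data['error']}")
    cases = data.get("test_cases")
    if not isinstance(cases, list):
        raise ValueError('Expected a "test_cases" list')
    for case in cases:
        if not isinstance(case, dict):
            raise ValueError("Each test case must be a JSON object")
        missing = REQUIRED_KEYS - set(case)
        if missing:
            raise ValueError(f"Test case is missing keys: {sorted(missing)}")
    return cases

sample = (
    '{"test_cases": [{"id": "TC001", "title": "Valid Login", '
    '"steps": ["Open page"], "expected_result": "Logged in"}]}'
)
print(parse_test_plan(sample)[0]["id"])
# → TC001
```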

Testing


In [38]:
print(rag_engine.query("Generate test scenarios for user login."))


{
  "test_cases": [
       {
           "id": "TC001",
           "title": "Valid Login",
           "steps": [
               "Launch the IMDb website",
               "Click on the Sign In button",
               "Enter valid username and password",
               "Click on the Login button"
           ],
           "expected_result": "User is logged in successfully and redirected to the home page"
       },
       {
           "id": "TC002",
           "title": "Invalid Login",
           "steps": [
               "Launch the IMDb website",
               "Click on the Sign In button",
               "Enter invalid username and password",
               "Click on the Login button"
           ],
           "expected_result": "Error message is displayed indicating invalid username or password"
       },
       {
           "id": "TC003",
           "title": "Empty Fields Login",
           "steps": [
               "Launch the IMDb website",
               "Click on the Sign In button

GPT-d => Code Generator: Qwen (needs to change)

In [42]:
import re  # used to strip the <think> block from the model output

def generate_selenium_script(test_case_json):
    """
    Takes the JSON output from groq_smart and feeds it to Qwen.
    """
    prompt = f"""
    You are a Python Selenium Expert.
    Generate a script for this specific test case.
    
    TEST CASE:
    {test_case_json}
    
    RULES:
    - Use webdriver_manager.
    - Return ONLY code.
    """
    
    response = client.chat.completions.create(
        model=MODEL_CODE, # Uses qwen/qwen3-32b
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2
    )
    
    raw_content = response.choices[0].message.content
    
    # 1. Remove the <think> block (The internal monologue)
    clean_content = re.sub(r'<think>.*?</think>', '', raw_content, flags=re.DOTALL).strip()
    
    # 2. Remove Markdown Fences (The ```python wrappers)
    clean_content = clean_content.replace("```python", "").replace("```", "").strip()
    
    return clean_content
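
Removing the Markdown fences with plain string replacement keeps any prose the model wrote around the code. The sketch below is a variant of the cleaning step, not the version used in the cell: it keeps only the body of the first fenced block when one is present. `strip_reasoning_and_fences` is a hypothetical name, and the sample response is made up.

```python
import re

# Built programmatically so the fence marker never appears literally in this example.
FENCE = "`" * 3

def strip_reasoning_and_fences(raw: str) -> str:
    """Drop <think>...</think> blocks and return the first fenced code body, if any."""
    without_think = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    pattern = FENCE + r"(?:python)?\s*\n(.*?)" + FENCE
    match = re.search(pattern, without_think, flags=re.DOTALL)
    if match:
        return match.group(1).strip()
    # No fences found: fall back to the text with the reasoning removed.
    return without_think.strip()

sample = (
    "<think>plan the steps</think>\nHere is the script:\n"
    + FENCE + "python\nprint('hi')\n" + FENCE
)
print(strip_reasoning_and_fences(sample))
# → print('hi')
```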

In [45]:
# Enhanced code generator: validates the generated script and retries on syntax errors

import re
import ast

def validate_python_code(code: str) -> tuple[bool, str]:
    """Validate if the code is syntactically correct Python."""
    try:
        ast.parse(code)
        return True, "Valid"
    except SyntaxError as e:
        return False, f"SyntaxError: {e.msg} at line {e.lineno}"
    except Exception as e:
        return False, f"Error: {str(e)}"

def generate_selenium_script(test_case_json, max_retries=3):
    """
    Generate Selenium code with self-correction loop.
    """
    prompt_base = f"""
You are a Python Selenium Expert.
Generate a COMPLETE, EXECUTABLE Selenium script for this test case.

TEST CASE:
{test_case_json}

REQUIREMENTS:
- Use webdriver_manager for ChromeDriver
- Include all necessary imports
- Add proper error handling
- Use explicit waits (WebDriverWait)
- Add comments explaining each step
- Close the driver at the end

Return ONLY executable Python code, no markdown formatting.
"""

    for attempt in range(max_retries):
        print(f"\n🔄 Attempt {attempt + 1}/{max_retries}")
        
        response = client.chat.completions.create(
            model=MODEL_CODE,
            messages=[{"role": "user", "content": prompt_base}],
            temperature=0.2,
            max_completion_tokens=2048
        )
        
        raw_content = response.choices[0].message.content
        
        # Clean the response
        clean_content = re.sub(r'<think>.*?</think>', '', raw_content, flags=re.DOTALL).strip()
        clean_content = clean_content.replace("```python", "").replace("```", "").strip()
        
        # Validate the code
        is_valid, error_msg = validate_python_code(clean_content)
        
        if is_valid:
            print("✅ Code validation passed!")
            return clean_content
        else:
            print(f"❌ Validation failed: {error_msg}")
            
            if attempt < max_retries - 1:
                # Add error feedback to the next prompt
                prompt_base = f"""
The previous code had this error:
{error_msg}

Please fix it and generate correct code.

TEST CASE:
{test_case_json}

REQUIREMENTS:
- Use webdriver_manager for ChromeDriver
- Include all necessary imports
- Add proper error handling
- Use explicit waits
- Close the driver at the end

Return ONLY executable Python code.
"""
    
    # If all retries failed
    print("⚠️ Max retries reached. Returning last attempt.")
    return clean_content
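
`validate_python_code` only checks syntax, so a script can pass without importing Selenium at all. One possible extra check, sketched with the standard `ast` module, is to collect the imported top-level modules and confirm that `selenium` is among them. `imported_modules` is a hypothetical helper, and the sample script is made up.

```python
import ast

def imported_modules(code: str) -> set:
    """Collect the top-level module names imported anywhere in the code."""
    modules = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules

script = "from selenium import webdriver\nimport time\n"
print(sorted(imported_modules(script)))
# → ['selenium', 'time']
```

A check such as `"selenium" in imported_modules(code)` could be added as a second validation condition before accepting a generated script.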

In [46]:
# Orchestrator that ties together test plan generation, code generation, and artifact saving

import json
from pathlib import Path
from datetime import datetime

class AutonomousQAAgent:
    """
    Main orchestrator for autonomous test case and code generation.
    """
    
    def __init__(self, rag_engine, output_dir="generated_tests"):
        self.rag_engine = rag_engine
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(exist_ok=True)
        
    def generate_test_plan(self, requirement: str) -> dict:
        """
        Step 1: Generate test cases using RAG + LLM
        """
        print("\n🔍 Step 1: Generating Test Plan...")
        response_json = self.rag_engine.query(f"Generate test scenarios for: {requirement}")
        
        try:
            test_plan = json.loads(response_json)
            print(f"✅ Generated {len(test_plan.get('test_cases', []))} test cases")
            return test_plan
        except json.JSONDecodeError as e:
            print(f"❌ JSON parsing error: {e}")
            return {"error": "Failed to parse test plan", "raw": response_json}
    
    def generate_selenium_code(self, test_case: dict) -> str:
        """
        Step 2: Generate Selenium code for a single test case
        """
        print(f"\n🤖 Step 2: Generating Selenium code for TC: {test_case.get('id', 'Unknown')}")
        code = generate_selenium_script(json.dumps(test_case, indent=2))
        return code
    
    def save_artifacts(self, test_plan: dict, codes: dict, requirement: str):
        """
        Step 3: Save all generated artifacts
        """
        print("\n💾 Step 3: Saving artifacts...")
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        
        # Save test plan JSON
        plan_file = self.output_dir / f"test_plan_{timestamp}.json"
        with open(plan_file, "w") as f:
            json.dump(test_plan, f, indent=2)
        print(f"📄 Saved test plan: {plan_file}")
        
        # Save each Selenium script
        for tc_id, code in codes.items():
            code_file = self.output_dir / f"test_{tc_id}_{timestamp}.py"
            with open(code_file, "w") as f:
                f.write(code)
            print(f"📄 Saved script: {code_file}")
        
        # Save summary
        summary = {
            "requirement": requirement,
            "timestamp": timestamp,
            "test_count": len(codes),
            "test_plan_file": str(plan_file),
            "code_files": [str(self.output_dir / f"test_{tc_id}_{timestamp}.py") for tc_id in codes.keys()]
        }
        
        summary_file = self.output_dir / f"summary_{timestamp}.json"
        with open(summary_file, "w") as f:
            json.dump(summary, f, indent=2)
        print(f"📄 Saved summary: {summary_file}")
        
        return summary
    
    def run(self, requirement: str, generate_all_tests=True):
        """
        Main execution flow - orchestrates the entire process
        """
        print("="*60)
        print("🚀 AUTONOMOUS QA AGENT - STARTING")
        print("="*60)
        print(f"📝 Requirement: {requirement}\n")
        
        # Step 1: Generate test plan
        test_plan = self.generate_test_plan(requirement)
        
        if "error" in test_plan:
            print("❌ Failed to generate test plan")
            return test_plan
        
        # Step 2: Generate code for each test case
        codes = {}
        test_cases = test_plan.get("test_cases", [])
        
        if not generate_all_tests:
            test_cases = test_cases[:1]  # Only first test case
        
        for tc in test_cases:
            tc_id = tc.get("id", "unknown")
            try:
                code = self.generate_selenium_code(tc)
                codes[tc_id] = code
            except Exception as e:
                print(f"❌ Failed to generate code for {tc_id}: {e}")
                codes[tc_id] = f"# ERROR: {e}"
        
        # Step 3: Save everything
        summary = self.save_artifacts(test_plan, codes, requirement)
        
        print("\n" + "="*60)
        print("✅ AUTONOMOUS QA AGENT - COMPLETED")
        print("="*60)
        
        return {
            "test_plan": test_plan,
            "codes": codes,
            "summary": summary
        }

# Initialize the agent
agent = AutonomousQAAgent(rag_engine, output_dir="generated_tests")
print("✅ Autonomous QA Agent initialized!")

✅ Autonomous QA Agent initialized!
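
`save_artifacts` puts each `tc_id` straight into a file name, and that id comes from the LLM's JSON. A minimal sketch of sanitising it first; `safe_file_stem` is a hypothetical helper, not part of the agent above.

```python
import re

def safe_file_stem(tc_id, fallback: str = "unknown") -> str:
    """Replace anything outside letters, digits, '_' and '-' with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", str(tc_id)).strip("_")
    return cleaned or fallback

print(safe_file_stem("TC001"))         # → TC001
print(safe_file_stem("TC 002/login"))  # → TC_002_login
print(safe_file_stem("../"))           # → unknown
```

Using `safe_file_stem(tc_id)` when building `code_file` would keep unexpected ids from producing invalid paths or paths outside `output_dir`.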


In [47]:
# Run the agent end to end

# Test with a sample requirement
requirement = "User login functionality with email and password"

result = agent.run(requirement, generate_all_tests=False)  # Only first test for now

# Display results
print("\n📊 RESULTS SUMMARY:")
print(f"Test Cases Generated: {len(result['test_plan'].get('test_cases', []))}")
print(f"Code Files Generated: {len(result['codes'])}")
print(f"\nFiles saved in: {agent.output_dir}")

🚀 AUTONOMOUS QA AGENT - STARTING
📝 Requirement: User login functionality with email and password


🔍 Step 1: Generating Test Plan...
✅ Generated 7 test cases

🤖 Step 2: Generating Selenium code for TC: TC001

🔄 Attempt 1/3
✅ Code validation passed!

💾 Step 3: Saving artifacts...
📄 Saved test plan: generated_tests\test_plan_20251120_143702.json
📄 Saved script: generated_tests\test_TC001_20251120_143702.py
📄 Saved summary: generated_tests\summary_20251120_143702.json

✅ AUTONOMOUS QA AGENT - COMPLETED

📊 RESULTS SUMMARY:
Test Cases Generated: 7
Code Files Generated: 1

Files saved in: generated_tests

In [None]:
Phase2