**Notebook 03 is a "run-once" setup**

- 📝 NOTEBOOK 3 - SETUP ONLY
- ✅ LLM client configured
- ✅ Prompt templates defined
- ✅ Answer generator ready

No files are saved; this notebook only needs to run once per session.

# LLM Response Generation

**Why we're doing this:**
 Take retrieved document chunks and generate coherent answers using a language model.

**What we're doing:**

- Setting up first prototype - done
- Setting up the LLM client (Groq/Llama) - done
- Creating prompt templates for TRL questions - done
- Generating answers from retrieved context - done 

In [None]:
# PERMANENT WORKING IMPORT - USE THIS EVERYWHERE
import sys
import os
import importlib.util

def _load_module(name, path):
    """Load a module from an explicit file path"""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

def import_rag_components():
    """Import RAG components"""
    components_dir = os.path.join(os.getcwd(), 'rag_components')
    
    # Load each component module from its file in rag_components/
    retriever_module = _load_module(
        "retriever", os.path.join(components_dir, 'retriever.py'))
    query_interface_module = _load_module(
        "query_interface", os.path.join(components_dir, 'query_interface.py'))
    answer_generator_module = _load_module(
        "answer_generator", os.path.join(components_dir, 'answer_generator.py'))
    
    return (retriever_module.DocumentAwareRetriever, 
            query_interface_module.SimpleQueryInterface,
            answer_generator_module.RAGAnswerGenerator)

# Import the components
DocumentAwareRetriever, SimpleQueryInterface, RAGAnswerGenerator = import_rag_components()
print("🎉 COMPONENTS IMPORTED SUCCESSFULLY!")

# Continue with code
VECTOR_INDEX_PATH = "../../04_models/vector_index"
retriever = DocumentAwareRetriever(VECTOR_INDEX_PATH)
query_interface = SimpleQueryInterface(retriever)
answer_generator = RAGAnswerGenerator(query_interface)
print("✅ Generation pipeline ready!")

🎉 COMPONENTS IMPORTED SUCCESSFULLY!
✓ TF-IDF retriever loaded successfully
✓ Template-based RAG answer generator initialized
✅ Generation pipeline ready!
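
The import cell loads each component with `importlib.util.spec_from_file_location`, which gives a fairly opaque error if the notebook is started from a directory that has no `rag_components` folder. A minimal sketch of a more defensive loader follows. `load_module_from_path` is a hypothetical helper name, not part of `rag_components`, and the demo module is a throwaway file written to a temporary directory.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(name, path):
    """Load a Python module from an explicit file path (hypothetical helper)."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Module file not found: {path}")
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for: {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demonstration with a throwaway module written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_component.py")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("class DemoRetriever:\n    name = 'demo'\n")
    demo_module = load_module_from_path("demo_component", demo_path)
    print(demo_module.DemoRetriever.name)
```

Checking the path first means a wrong working directory is reported with the full path that was tried, which is usually the quickest thing to fix.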


In [3]:
%pip install groq

Collecting groq
  Downloading groq-0.36.0-py3-none-any.whl.metadata (16 kB)
Collecting distro<2,>=1.7.0 (from groq)
  Using cached distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Downloading groq-0.36.0-py3-none-any.whl (137 kB)
Using cached distro-1.9.0-py3-none-any.whl (20 kB)
Installing collected packages: distro, groq
Successfully installed distro-1.9.0 groq-0.36.0
Note: you may need to restart the kernel to use updated packages.


In [4]:
# CELL: LLM Client Setup
import os
from groq import Groq
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Initialize Groq client
def setup_groq_client():
    """Set up and return Groq client with error handling"""
    api_key = os.getenv('GROQ_API_KEY')
    
    if not api_key:
        raise ValueError("❌ GROQ_API_KEY not found in environment variables")
    
    client = Groq(api_key=api_key)
    print("✅ Groq client initialized successfully")
    return client

# Test the client
try:
    groq_client = setup_groq_client()
    print("🎉 LLM client ready for integration!")
except Exception as e:
    print(f"❌ Failed to initialize LLM client: {e}")

✅ Groq client initialized successfully
🎉 LLM client ready for integration!


In [5]:
# CELL: Test LLM Connection
# Why: Verify Groq API works and model responds correctly
# What: Send simple test query to confirm setup is functional
def test_llm_connection():
    try:
        response = groq_client.chat.completions.create(
            model="llama-3.1-8b-instant",  # Fast, free model for testing
            messages=[{"role": "user", "content": "Reply only with 'API connected'"}],
            max_tokens=10,
            temperature=0.1
        )
        print(f"✅ LLM Connected: {response.choices[0].message.content}")
        return True
    except Exception as e:
        print(f"❌ LLM Failed: {e}")
        return False

test_llm_connection()

✅ LLM Connected: API connected


True
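
The connection test makes a single attempt, so one transient network error is reported as a failure. A minimal sketch of a retry wrapper with exponential backoff follows. `call_with_retry` is a hypothetical helper, and `flaky_call` is a stand-in for the real `groq_client.chat.completions.create(...)` call so the example runs without an API key.

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            sleep(delay)


# Stand-in for an API call that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "API connected"


result_text = call_with_retry(flaky_call, base_delay=0.01)
print(result_text)
```

In the notebook, the wrapped function would be a `lambda` around the chat completion call. Catching every `Exception` is a simplification; narrowing it to the client's transient error types would avoid retrying on problems such as an invalid API key.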

In [10]:
# CELL: Integrate with Your Generator
def generate_with_llm(query, context):
    """Generate answer using Groq/Llama"""
    prompt = f"""
    Based on the following context, answer the user's question.
    
    Context: {context}
    
    Question: {query}
    
    Answer:
    """
    
    response = groq_client.chat.completions.create(
        model="llama-3.1-8b-instant",  # Same model as the connection test
        messages=[{"role": "user", "content": prompt}],
        max_tokens=500,
        temperature=0.3
    )
    
    return response.choices[0].message.content

print("🚀 LLM integration code ready!")

🚀 LLM integration code ready!
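
Because the f-string in `generate_with_llm` is indented inside the function body, every prompt line is sent to the model with four leading spaces. One way to avoid that, sketched here under the assumption that the wording of the prompt stays the same, is to dedent the template once with `textwrap.dedent` and fill it afterwards. `BASIC_PROMPT_TEMPLATE` and `build_basic_prompt` are hypothetical names.

```python
import textwrap

# Dedent the template once, before substitution, so multi-line context
# values cannot change the common indentation that dedent removes.
BASIC_PROMPT_TEMPLATE = textwrap.dedent("""\
    Based on the following context, answer the user's question.

    Context: {context}

    Question: {query}

    Answer:
    """)


def build_basic_prompt(query, context):
    """Fill the dedented template (hypothetical helper name)."""
    return BASIC_PROMPT_TEMPLATE.format(context=context, query=query)


prompt = build_basic_prompt(
    query="Which startups work on AI for automotive?",
    context="Line one of a chunk.\nLine two of a chunk.",
)
print(prompt)
```

Dedenting after formatting would not work reliably: retrieved chunks usually contain unindented lines, which leaves no common indentation to strip.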


In [9]:
# CELL: Universal Prompt Template
# Why: Single template that adapts to both regular and TRL queries automatically
# What: Smart template that detects when to include maturity analysis

UNIVERSAL_PROMPT_TEMPLATE = """
CONTEXT:
{context}

USER QUESTION:
{question}

ANALYSIS INSTRUCTIONS:
1. Provide a comprehensive answer based strictly on the context provided
2. Cite specific sources for each key point using [Source: filename]
3. If the context is insufficient, acknowledge what cannot be answered

{trl_section}

ADDITIONAL GUIDELINES:
- For technology maturity questions: assess development stage and transition evidence
- For trend questions: identify velocity, drivers, and key players  
- For forecasting: distinguish near-term vs long-term developments
- For descriptive questions: provide specific examples and entities

ANSWER:
"""

def build_smart_prompt(question, context):
    """Build adaptive prompt that includes TRL guidance only when needed"""
    
    # Detect if this is a technology maturity question
    maturity_keywords = ['trl', 'mature', 'transition', 'academy to application', 
                        'commercial', 'moving from academy', 'readiness', 'development stage']
    
    question_lower = question.lower()
    is_maturity_question = any(keyword in question_lower for keyword in maturity_keywords)
    
    # Include TRL section only for maturity questions
    if is_maturity_question:
        trl_section = """
TECHNOLOGY MATURITY ASSESSMENT:
- When discussing technology readiness, reference these stages:
  * Research Phase (TRL 1-4): Basic research, lab validation
  * Development Phase (TRL 5-6): Prototyping, testing  
  * Commercialization Phase (TRL 7-9): Deployment, scaling
- Assess current stage based on evidence in context
- Identify transition indicators and timelines
"""
    else:
        trl_section = ""
    
    prompt = UNIVERSAL_PROMPT_TEMPLATE.format(
        context=context,
        question=question,
        trl_section=trl_section
    )
    
    return prompt

# Test the universal template
def test_universal_prompt():
    """Test that the template adapts to different question types"""
    
    test_context = "Sample context about technology development..."
    
    # Test regular question
    regular_question = "Which startups work on AI for automotive?"
    regular_prompt = build_smart_prompt(regular_question, test_context)
    print("🔹 REGULAR QUESTION PROMPT:")
    print("Includes TRL section:", "TECHNOLOGY MATURITY ASSESSMENT" in regular_prompt)
    print("---")
    
    # Test TRL question  
    trl_question = "Which quantum computing research is moving from academy to application?"
    trl_prompt = build_smart_prompt(trl_question, test_context)
    print("🔹 TRL QUESTION PROMPT:")
    print("Includes TRL section:", "TECHNOLOGY MATURITY ASSESSMENT" in trl_prompt)
    
    return regular_prompt, trl_prompt

# Run test
regular_prompt, trl_prompt = test_universal_prompt()

print("\n✅ Universal prompt template ready!")
print("✅ Automatically includes TRL guidance for maturity questions")
print("✅ Single template for all query types")

🔹 REGULAR QUESTION PROMPT:
Includes TRL section: False
---
🔹 TRL QUESTION PROMPT:
Includes TRL section: True

✅ Universal prompt template ready!
✅ Automatically includes TRL guidance for maturity questions
✅ Single template for all query types
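
`build_smart_prompt` detects maturity questions with plain substring tests, so a keyword such as `'mature'` also fires inside an unrelated word like "premature". A possible refinement, sketched below, anchors each keyword at the start of a word with a regular expression. `MATURITY_PATTERN` and `is_maturity_question` are hypothetical names; the keyword list is copied from the cell above.

```python
import re

MATURITY_KEYWORDS = ['trl', 'mature', 'transition', 'academy to application',
                     'commercial', 'moving from academy', 'readiness',
                     'development stage']

# \b anchors each keyword at the start of a word; the end is left open so
# inflected forms such as "commercialization" or "matured" still match.
MATURITY_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in MATURITY_KEYWORDS) + r")",
    re.IGNORECASE,
)


def is_maturity_question(question):
    """Return True if a maturity keyword appears at the start of a word."""
    return MATURITY_PATTERN.search(question) is not None


print(is_maturity_question("Is this TRL 6 technology close to commercialization?"))
print(is_maturity_question("Which startups work on AI for automotive?"))
print(is_maturity_question("Was the announcement premature?"))
```

The three calls print `True`, `False`, and `False`. The trade-off is that words such as "immature" no longer trigger the TRL section, which may or may not be desirable for this corpus.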


# Response Quality Setup

**Why we're doing this:** 
Ensure answers are relevant and properly cite sources.

**What we're doing:**

- Checking that the pipeline runs end to end and that the LLM integration and prompt template return a relevant answer with source citations.
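
The prompt template asks the model to cite each key point as `[Source: filename]`. A minimal sketch of how those citations could be checked against the retrieved chunks follows. `check_citations` is a hypothetical helper; it assumes each retrieved item is a dictionary with a `source_file` key, and the sample data is made up for illustration.

```python
import re

CITATION_PATTERN = re.compile(r"\[Source:\s*([^\]]+)\]")


def check_citations(answer, retrieved_data):
    """Split cited filenames into retrieved ones and unknown ones."""
    known_sources = {item['source_file'] for item in retrieved_data}
    cited = {match.strip() for match in CITATION_PATTERN.findall(answer)}
    return {
        'cited': sorted(cited & known_sources),
        'unknown': sorted(cited - known_sources),
        'has_citation': bool(cited),
    }


# Made-up sample data, for illustration only.
sample_chunks = [{'source_file': 'report_a.txt'}, {'source_file': 'report_b.txt'}]
sample_answer = (
    "Generative models can support design work [Source: report_a.txt]. "
    "Adoption is accelerating [Source: report_c.txt]."
)
print(check_citations(sample_answer, sample_chunks))
```

A non-empty `unknown` list suggests the model cited a file that was never retrieved, and `has_citation` being `False` flags answers that ignore the citation instruction altogether.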


In [None]:
# CELL: Test Complete RAG Pipeline (CORRECTED)
# Why: Use the actual dictionary structure from your retriever
# What: Complete pipeline that works with your custom retriever output

def test_complete_pipeline(question):
    """Test the full RAG pipeline"""
    print(f"🧪 TESTING PIPELINE: '{question}'")
    print("=" * 50)
    
    try:
        # Step 1: Retrieve documents
        print("1. 🔍 Retrieving documents...")
        retrieved_data = retriever.retrieve_with_sources(question, k=3)
        print(f"   ✅ Found {len(retrieved_data)} relevant chunks")
        
        # Step 2: Format context from the dictionaries
        context = "\n\n".join([
            f"Source: {item['source_file']} | Type: {item['doc_type']}\nContent: {item['content']}"
            for item in retrieved_data
        ])
        
        # Step 3: Build smart prompt
        print("2. 📝 Building prompt...")
        prompt = build_smart_prompt(question, context)
        
        # Step 4: Generate answer using LLM
        print("3. 🤖 Generating answer with LLM...")
        response = groq_client.chat.completions.create(
            model="llama-3.1-8b-instant",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=500,
            temperature=0.3
        )
        
        answer = response.choices[0].message.content
        
        # Step 5: Display results
        print("4. 📊 RESULTS:")
        print(f"QUESTION: {question}")
        print(f"ANSWER: {answer}")
        print("\n📚 SOURCES:")
        for i, item in enumerate(retrieved_data):
            print(f"  {i+1}. {item['source_file']} (Score: {item['similarity_score']:.3f})")
        
        return {
            'question': question,
            'answer': answer,
            'sources': retrieved_data,
            'retrieved_chunks': len(retrieved_data)
        }
        
    except Exception as e:
        print(f"❌ Pipeline error: {e}")
        import traceback
        traceback.print_exc()
        return None

# Test the pipeline
print("🚀 TESTING COMPLETE RAG PIPELINE")
test_question = "Which startups work on AI for automotive?"
result = test_complete_pipeline(test_question)

if result:
    print("\n🎉 PIPELINE SUCCESS!")
    print(f"✅ Question: {result['question']}")
    print(f"✅ Answer generated: {len(result['answer'])} characters")
    print(f"✅ Sources used: {len(result['sources'])} documents")
    
    # Show a preview of the answer
    print(f"\n📝 Answer preview: {result['answer'][:200]}...")
else:
    print("\n💥 Pipeline failed")

🚀 TESTING COMPLETE RAG PIPELINE
🧪 TESTING PIPELINE: 'Which startups work on AI for automotive?'
1. 🔍 Retrieving documents...
   ✅ Found 3 relevant chunks
2. 📝 Building prompt...
3. 🤖 Generating answer with LLM...
4. 📊 RESULTS:
QUESTION: Which startups work on AI for automotive?
ANSWER: Based on the provided context, it is not possible to directly answer which startups work on AI for automotive. However, we can infer some information about the current state of AI in the automotive industry and potential future developments.

The context suggests that generative AI technologies like GANs and VAEs have the potential to innovate and enhance various aspects of automotive design, manufacturing, and autonomous driving [Source: Gen_AI_in_automotive_applications_challenges_and_opportunities_with_a_case_study_on_in-vehicle_experience.txt]. Additionally, the development of domain-specific synthetic dialog datasets that incorporate disfluencies is crucial for enhancing the natu