## **Nugen Intelligence**
<img src="https://nugen.in/logo.png" alt="Nugen Logo" width="200"/>

Domain-aligned foundational models with industry-leading speeds and zero data retention. To learn more, visit [Nugen](https://docs.nugen.in/introduction).


Welcome to the Nugen Intelligence Cookbook, a guide to help you harness domain-aligned large language models (LLMs) with industry-leading speeds and zero data retention. At Nugen, we specialize in building highly tailored AI models that adapt to your specific needs, enabling scalable, intelligent solutions for specialized domains.

## **Evaluator-Optimizer System Cookbook**

### **Introduction**

The evaluator-optimizer system generates and improves code solutions through an iterative process. Think of it as a pair programming setup where one model call generates code and another reviews it. This cookbook walks you through each component, explaining how it works and how to use it effectively.

**Imports and Installations**

In [1]:
from pydantic import BaseModel
from typing import Literal
import requests
import json
import re

### **Part 1: Defining Our Evaluation Structure**

Next, we create a structure for evaluations, like a grading rubric:

In [2]:
class Evaluation(BaseModel):
    evaluation: Literal["PASS", "NEEDS_IMPROVEMENT", "FAIL"]
    feedback: str

This is like creating a standardized grading system where:

- We have three possible grades (PASS, NEEDS_IMPROVEMENT, FAIL)
- Each grade comes with detailed feedback explaining why
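
To see what the `Literal` constraint enforces without installing anything, here is a minimal stdlib-only sketch. It swaps pydantic for a plain dataclass with a manual check, so it only imitates the behaviour; `EvaluationSketch` and `Grade` are names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# The same three grades as the pydantic model above.
Grade = Literal["PASS", "NEEDS_IMPROVEMENT", "FAIL"]


@dataclass
class EvaluationSketch:
    """Stdlib stand-in for the pydantic model (illustration only)."""
    evaluation: str
    feedback: str

    def __post_init__(self):
        # Reject any grade outside the three allowed values.
        if self.evaluation not in get_args(Grade):
            raise ValueError(f"invalid grade: {self.evaluation!r}")


ok = EvaluationSketch("PASS", "Meets all criteria.")

try:
    EvaluationSketch("MAYBE", "Unclear.")
    rejected = False
except ValueError:
    rejected = True  # an unknown grade never makes it into the system
```

The pydantic model does this check for us (and more), which is why the rest of the cookbook can trust the `evaluation` field.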

### **Part 2: Building Our Communication System**

**The Nugen Client**

Think of this as our dedicated assistant who knows how to talk to the AI:

In [3]:
class NugenClient:
    def __init__(self, token: str):
        self.base_url = "https://api.nugen.in/inference/completions"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

So far, the client handles:

- Storing the endpoint URL (like knowing the phone number)
- Building the authorization headers sent with every request (the access code)

Error handling comes with the `complete` method, added next.



The `complete` method is like having a conversation. The cell below repeats the class with the method added:

In [4]:
class NugenClient:
    def __init__(self, token: str):
        self.base_url = "https://api.nugen.in/inference/completions"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }
    
    def complete(self, prompt: str, model: str = "nugen-flash-instruct", 
                max_tokens: int = 1000, temperature: float = 0.1) -> str:
        """Makes a completion request to the Nugen API with enhanced error handling"""
        payload = {
            "max_tokens": str(max_tokens),
            "model": model,
            "prompt": prompt,
            "temperature": temperature
        }
        
        try:
            # Timeout value is an arbitrary choice; without one the call can hang indefinitely
            response = requests.post(self.base_url, json=payload, headers=self.headers, timeout=30)
            response.raise_for_status()
            return response.json()["choices"][0]["text"]
        except requests.exceptions.RequestException as e:
            print(f"API request failed: {str(e)}")
            raise

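This method:

- Takes in our question (`prompt`)
- Caps how long the answer can be (`max_tokens`)
- Controls how creative vs. focused the response should be (`temperature`; lower values give more focused output)
- Returns the text of the first completion, and re-raises any request error after printing it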
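
Because `complete` re-raises request errors, a single network hiccup ends the whole run. One possible way to soften that is a retry wrapper with exponential backoff. The sketch below is not part of the Nugen API: `call_with_retries` and its parameters are hypothetical, and a fake flaky function stands in for the real network call so it runs offline.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**n seconds and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo: a fake call that fails twice, then succeeds.
calls = {"count": 0}


def flaky_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_call, sleep=delays.append)
```

With the real client this could be used as `call_with_retries(lambda: client.complete(prompt))`. In practice you would probably catch only `requests.exceptions.RequestException` rather than every exception.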

### **Part 3: Understanding Communication**

**Extracting Meaningful Information**

Sometimes the AI's responses need interpretation. Our JSON extraction function is like a skilled interpreter:

In [5]:
def extract_json_from_text(text: str) -> dict:
    """
    Extracts JSON content from text using various fallback methods.
    Includes sophisticated parsing to handle common LLM response patterns.
    """
    # First try to find content between curly braces
    json_pattern = r'\{[^{}]*\}'
    matches = re.findall(json_pattern, text, re.DOTALL)
    
    for potential_json in matches:
        try:
            return json.loads(potential_json)
        except json.JSONDecodeError:
            continue
    
    # If no valid JSON found, try to construct it from the response
    try:
        # Look for evaluation status
        status_pattern = r'(PASS|NEEDS_IMPROVEMENT|FAIL)'
        status_match = re.search(status_pattern, text)
        
        # Look for feedback content
        feedback_pattern = r'feedback"?\s*:?\s*"?([^"}]*)'
        feedback_match = re.search(feedback_pattern, text, re.IGNORECASE | re.DOTALL)
        
        if status_match and feedback_match:
            return {
                "evaluation": status_match.group(1),
                "feedback": feedback_match.group(1).strip()
            }
    except Exception:
        pass
    
    # Default response if no valid JSON or patterns found
    return {
        "evaluation": "FAIL",
        "feedback": "Could not parse a valid evaluation response"
    }

In [6]:
def run_llm(user_prompt: str, model: str = "nugen-flash-instruct", 
           system_prompt: str = None, client: NugenClient = None) -> str:
    """Executes LLM requests with enhanced error handling"""
    if client is None:
        raise ValueError("NugenClient instance must be provided")
    
    full_prompt = f"{system_prompt}\n{user_prompt}" if system_prompt else user_prompt
    return client.complete(full_prompt, model=model)

def JSON_llm(user_prompt: str, schema, system_prompt: str = None, 
             client: NugenClient = None) -> dict:
    """
    Executes LLM requests with enhanced JSON handling and response parsing.
    Includes multiple fallback methods to ensure valid JSON output.
    """
    try:
        # Create a more explicit JSON instruction with examples
        json_instruction = f"""
        Respond ONLY with a valid JSON object following exactly this format:
        {{
            "evaluation": "PASS" or "NEEDS_IMPROVEMENT" or "FAIL",
            "feedback": "Your detailed feedback here"
        }}
        No other text should be included in your response.
        Schema: {json.dumps(schema.model_json_schema())}
        """
        
        full_prompt = f"{json_instruction}\n{system_prompt}\n{user_prompt}" if system_prompt \
                     else f"{json_instruction}\n{user_prompt}"
        
        response_text = client.complete(full_prompt)
        print(f"\nRaw evaluation response:\n{response_text}\n")
        
        # Extract and validate JSON from the response
        parsed_response = extract_json_from_text(response_text)
        
        # Validate against our schema
        return Evaluation(**parsed_response).model_dump()
        
    except Exception as e:
        print(f"Error processing evaluation: {str(e)}")
        return {
            "evaluation": "FAIL",
            "feedback": f"Error in evaluation process: {str(e)}"
        }

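`JSON_llm` hands the raw response to `extract_json_from_text`, then validates the result against the `Evaluation` schema. The extraction function works like a detective:

- First, it looks for properly formatted JSON
- If that fails, it looks for a status keyword and a feedback field, and builds a response from those pieces
- As a last resort, it returns a default `FAIL` result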
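
These three cases can be tried offline. The block below repeats the same strategy in condensed form so it runs on its own; `extract_json_sketch` and the sample strings are made up for illustration.

```python
import json
import re

DEFAULT = {"evaluation": "FAIL",
           "feedback": "Could not parse a valid evaluation response"}


def extract_json_sketch(text: str) -> dict:
    # 1. Try every flat {...} span as JSON.
    for candidate in re.findall(r'\{[^{}]*\}', text):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    # 2. Rebuild a result from a status keyword plus a feedback field.
    status = re.search(r'(PASS|NEEDS_IMPROVEMENT|FAIL)', text)
    feedback = re.search(r'feedback"?\s*:?\s*"?([^"}]*)', text,
                         re.IGNORECASE | re.DOTALL)
    if status and feedback:
        return {"evaluation": status.group(1),
                "feedback": feedback.group(1).strip()}
    # 3. Nothing usable: fall back to the default FAIL result.
    return dict(DEFAULT)


clean = extract_json_sketch(
    'Sure! {"evaluation": "PASS", "feedback": "Looks good"}')
loose = extract_json_sketch(
    'Evaluation: NEEDS_IMPROVEMENT\nFeedback: handle the empty stack')
junk = extract_json_sketch('I cannot help with that.')
```

Here `clean` comes from step 1, `loose` from step 2, and `junk` from step 3. One limitation worth knowing: the flat-brace pattern cannot match an object whose feedback text itself contains `{` or `}`, so such responses end up being handled by the later steps or rejected during validation.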

### **Part 4: The Learning Loop**

**Generation and Evaluation**

The system works through a cycle of generation and evaluation, like a student practicing problems:

In [7]:
EVALUATOR_PROMPT = """
Evaluate this code implementation and respond ONLY with a JSON object in this exact format:
{
    "evaluation": "PASS" or "NEEDS_IMPROVEMENT" or "FAIL",
    "feedback": "Your detailed feedback here"
}

Evaluate based on these criteria:
1. code correctness
2. time complexity
3. style and best practices

Use "PASS" only if all criteria are met perfectly.
Use "NEEDS_IMPROVEMENT" if there are minor issues.
Use "FAIL" if there are major problems.

Provide specific, actionable feedback explaining what needs improvement and why.
DO NOT include any other text or explanation outside the JSON object.
"""

GENERATOR_PROMPT = """
Your goal is to complete the task based on <user input>. If there is feedback
from your previous generations, you should reflect on it to improve your solution.
Output your answer concisely in the following format:
Thoughts:
[Your understanding of the task and feedback and how you plan to improve]
Response:
[Your code implementation here]
"""


**Understanding the Evaluator's Role**

The evaluator prompt acts like a thorough teacher who:

1. Uses consistent grading criteria
2. Provides structured feedback
3. Makes clear improvement suggestions

**Understanding the Generator's Role**

The generator prompt is designed like a thoughtful student who:

1. Reads the task carefully
2. Considers past feedback
3. Plans improvements
4. Implements a solution
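
The generator is asked to answer in a `Thoughts:` / `Response:` layout. The cookbook passes the whole text to the evaluator, but if you want just the code part, a small splitter along these lines might help. `split_generation` is a hypothetical helper, not something the workflow calls.

```python
import re


def split_generation(text: str) -> tuple[str, str]:
    """Split generator output into (thoughts, response).

    If the expected markers are missing, treat everything as the response.
    """
    match = re.search(r'Thoughts:\s*(.*?)\s*Response:\s*(.*)', text, re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), match.group(2).strip()


sample = "Thoughts:\nUse two stacks.\nResponse:\nclass MinStack: ..."
thoughts, response = split_generation(sample)

# Output that ignores the format still comes back usable.
no_markers = split_generation("print('hi')")
```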

In [8]:
def generate(task: str, generator_prompt: str, context: str = "", 
            client: NugenClient = None) -> str:
    """Generate and improve a solution based on feedback using Nugen API"""
    full_prompt = f"{generator_prompt}\n{context}\nTask: {task}" if context \
                 else f"{generator_prompt}\nTask: {task}"
    
    response = run_llm(full_prompt, client=client)
    print("\n## Generation start")
    print(f"Output:\n{response}\n")
    return response


**The generation phase:**

1. Takes in a task and any previous context
2. Creates a solution
3. Returns the attempt for evaluation

In [9]:

def evaluate(task: str, evaluator_prompt: str, generated_content: str, 
            client: NugenClient = None) -> tuple[str, str]:
    """Evaluate if a solution meets requirements using Nugen API"""
    full_prompt = f"{evaluator_prompt}\nOriginal task: {task}\n" + \
                 f"Content to evaluate: {generated_content}"
    
    response = JSON_llm(full_prompt, Evaluation, client=client)
    evaluation = response["evaluation"]
    feedback = response["feedback"]
    
    print("## Evaluation start")
    print(f"Status: {evaluation}")
    print(f"Feedback: {feedback}")
    return evaluation, feedback

**The evaluation phase:**

1. Reviews the generated solution
2. Checks it against requirements
3. Provides specific feedback

### **Part 5: The Complete Workflow**

The main workflow ties everything together:

In [10]:
def loop_workflow(task: str, evaluator_prompt: str, generator_prompt: str, 
                 token: str, max_iterations: int = 5) -> str:
    """Execute the main workflow loop with enhanced error handling"""
    client = NugenClient(token)
    memory = []
    
    try:
        # Generate initial response
        response = generate(task, generator_prompt, client=client)
        memory.append(response)
        
        current_iteration = 0
        while current_iteration < max_iterations:
            evaluation, feedback = evaluate(task, evaluator_prompt, response, client=client)
            
            if evaluation == "PASS":
                return response
                
            context = "\n".join([
                "Previous attempts:",
                *[f"- {m}" for m in memory],
                f"\nFeedback: {feedback}"
            ])
            
            response = generate(task, generator_prompt, context, client=client)
            memory.append(response)
            current_iteration += 1
        
        return response
        
    except Exception as e:
        print(f"Workflow error: {str(e)}")
        return f"Error in workflow: {str(e)}"

**This process works like a tutoring session:**

1. The system attempts a solution
2. Evaluates its work
3. If it's not perfect, it tries again with the feedback in mind
4. This continues until either:
    - A perfect solution is found
    - The maximum number of attempts is reached
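
To watch the two stopping conditions without calling the API, the loop can be simulated with stand-in functions. This is a sketch that mirrors the control flow of `loop_workflow`; `run_loop`, `make_generator`, and the fake evaluators are hypothetical names used only here.

```python
def run_loop(generate_fn, evaluate_fn, max_iterations=5):
    """Same control flow as loop_workflow, with pluggable stand-ins."""
    attempts = [generate_fn(None)]  # initial attempt, no feedback yet
    for _ in range(max_iterations):
        status, feedback = evaluate_fn(attempts[-1])
        if status == "PASS":
            break
        attempts.append(generate_fn(feedback))
    return attempts[-1], len(attempts)


def make_generator():
    """Fake generator that returns draft-1, draft-2, ... on each call."""
    count = 0

    def generate_fn(feedback):
        nonlocal count
        count += 1
        return f"draft-{count}"

    return generate_fn


def passes_third(draft):
    if draft == "draft-3":
        return "PASS", "ok"
    return "NEEDS_IMPROVEMENT", "try again"


def never_passes(draft):
    return "FAIL", "still wrong"


# Stops early: the third draft passes.
passed = run_loop(make_generator(), passes_third)

# Hits the cap: one evaluation, then one more generation.
capped = run_loop(make_generator(), never_passes, max_iterations=1)
```

Note what the capped run shows: when the limit is reached, the draft that gets returned is the one generated after the last evaluation, so it has not itself been evaluated.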

### **Part 6: Using the System**

To get started with Nugen LLM models, you'll need an API key. Sign up at [Nugen](https://nugen-platform-frontend.azurewebsites.net/dashboard) and obtain your API key from the dashboard. Your API key should be in the format: nugen-xxxxxxxxxxxxxxxxxxxxxx
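
Rather than pasting the key into the notebook, you may prefer to read it from an environment variable. A minimal sketch follows; the variable name `NUGEN_API_KEY` and the helper `load_token` are assumptions made for this example, not names defined by Nugen.

```python
import os


def load_token(env_var: str = "NUGEN_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your API key")
    return token
```

You could then write `token = load_token()` in the usage cell, which keeps the key out of anything you share or commit.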

Here's how to use the complete system:

In [11]:
if __name__ == "__main__":
    task = """
    Implement a Stack with:
    1. push(x)
    2. pop()
    3. getMin()
    All operations should be O(1).
    """
    
    token = "<your api key>"  # Replace with your actual token
    
    try:
        result = loop_workflow(task, EVALUATOR_PROMPT, GENERATOR_PROMPT, token)
        print("\nFinal Result:", result)
    except Exception as e:
        print(f"An error occurred: {str(e)}")


## Generation start
Output:




Thoughts:
    To implement a stack with O(1) push, pop, and getMin operations, we can use two stacks: one for storing the actual elements and another for storing the minimum elements seen so far. The second stack will be used to keep track of the minimum element at each step.

Response:
```python
class MinStack:

    def __init__(self):
        """
        initialize your data structure here.
        """
        self.stack = []
        self.min_stack = []

    def push(self, x: int) -> None:
        self.stack.append(x)
        if not self.min_stack or x <= self.min_stack[-1]:
            self.min_stack.append(x)

    def pop(self) -> None:
        if self.stack:
            if self.stack[-1] == self.min_stack[-1]:
                self.min_stack.pop()
            self.stack.pop()

    def top(self) -> int:
        if self.stack:
            return self.stack[-1]

    def getMin(self) -> int:
        if self.min_stack:
            return self.min_stack[-
```


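The recorded output above stops partway through `getMin`. For reference, here is an independently written sketch of the same two-stack idea. It is not a reconstruction of the model's response; the class name `MinStackSketch` and the choice to raise `IndexError` on an empty stack are assumptions made for this example.

```python
class MinStackSketch:
    """Stack with O(1) push, pop, and getMin using an auxiliary min stack."""

    def __init__(self):
        self.stack = []      # all pushed values
        self.min_stack = []  # running minimums (duplicates kept)

    def push(self, x):
        self.stack.append(x)
        # Record x if it is a new minimum or ties the current one.
        if not self.min_stack or x <= self.min_stack[-1]:
            self.min_stack.append(x)

    def pop(self):
        if not self.stack:
            raise IndexError("pop from empty stack")
        x = self.stack.pop()
        # If the popped value was the current minimum, retire it too.
        if x == self.min_stack[-1]:
            self.min_stack.pop()
        return x

    def getMin(self):
        if not self.min_stack:
            raise IndexError("getMin from empty stack")
        return self.min_stack[-1]


s = MinStackSketch()
for value in (3, 1, 1, 2):
    s.push(value)

min_before = s.getMin()   # minimum of 3, 1, 1, 2
s.pop()                   # removes 2
s.pop()                   # removes one of the 1s
min_with_dup = s.getMin() # the other 1 is still there
s.pop()                   # removes the last 1
min_after = s.getMin()    # only 3 remains
```

Using `<=` rather than `<` in `push` is what keeps duplicate minimums correct: each copy gets its own entry on `min_stack`, so popping one copy does not lose track of the other.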

### **Conclusion**

This self-improving LLM system demonstrates how we can create AI systems that:

- Generate solutions to problems
- Evaluate their own work
- Learn from feedback
- Progressively improve their responses

The system combines several sophisticated concepts:

- Structured data validation
- Error handling
- Pattern matching
- Iterative improvement

By understanding each component and how they work together, you can create robust systems that not only solve problems but also refine their answers in response to feedback.