<a href="https://colab.research.google.com/github/frank-morales2020/MLxDL/blob/main/Qwen3_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Qwen3-8B

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Qwen/Qwen3-8B" # Or any other Qwen3 model like "Qwen/Qwen3-30B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto", # or torch.bfloat16 for bfloat16 models
    device_map="auto"
)

prompt = "Explain the concept of quantum entanglement."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

In [2]:
input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))

user
Explain the concept of quantum entanglement.
assistant
<think>
Okay, I need to explain quantum entanglement. Let me start by recalling what I know. Entanglement is a phenomenon in quantum mechanics where particles become interconnected, right? But I should make sure I get the details right.

First, maybe I should mention that entangled particles are in a superposition of states. So, when you measure one, the other's state is instantly determined, no matter the distance. Wait, but how does that work? Einstein called it "spooky action at a distance," which he didn't like. But I think that's been proven through experiments, like Bell's theorem. 

I should explain the basics. Let's say you have two particles, like electrons, that are entangled. Their quantum states are linked. If you measure the spin of one, the other's spin is instantly known, even if they're light-years apart. But it's not that information is transmitted faster than light, right? Because the outcome is random, so no

## AGENT

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import json
import os
import re # Import regex module

class Qwen3ExplanationAgent:
    """
    An AI agent designed to explain concepts using the Qwen3 model,
    emulating a structured thought process (analyze, plan, generate, refine).
    This agent directly integrates the Qwen3 model for content generation.
    """

    def __init__(self, tokenizer, model, name="Qwen3Agent"):
        """
        Initializes the Qwen3ExplanationAgent with pre-loaded tokenizer and model.
        Args:
            tokenizer: The pre-loaded Qwen3 tokenizer.
            model: The pre-loaded Qwen3 model.
            name (str): The name of the agent.
        """
        self.tokenizer = tokenizer
        self.model = model
        self.name = name
        # Updated knowledge_base_keywords for flight planning domain
        self.knowledge_base_keywords = {
            "flight planning": ["route optimization", "fuel calculation", "weather considerations", "air traffic control", "regulations"],
            "route optimization": ["great circle route", "wind impact", "restricted airspace", "waypoints"],
            "fuel calculation": ["payload", "distance", "altitude", "reserve fuel", "fuel consumption rate"],
            "weather considerations": ["wind", "turbulence", "icing", "thunderstorms", "visibility"],
            "air traffic control": ["airspace classes", "flight rules", "communication procedures"],
            "aviation regulations": ["flight rules", "licensing", "aircraft maintenance"]
        }


        print(f"[{self.name} - Init] Agent initialized.")
        if self.model and self.model.device:
            print(f"[{self.name} - Init] Model available on device: {self.model.device}")
        else:
            print(f"[{self.name} - Init] Model device not determined or model not loaded properly.")


    def _analyze_query(self, query):
        """
        Simulates analyzing the user's query to identify the core concept.
        Args:
            query (str): The user's input query.
        Returns:
            str: The identified concept key (e.g., "flight planning") or None.
        """
        query_lower = query.lower()
        for concept in self.knowledge_base_keywords:
            if concept in query_lower:
                return concept
        return None

    def _plan_explanation(self, concept_key):
        """
        Simulates planning the structure of the explanation based on the concept.
        Args:
            concept_key (str): The identified concept.
        Returns:
            list: A list of sub-prompts for each section of the explanation.
        """
        if concept_key and concept_key in self.knowledge_base_keywords:
            sections = self.knowledge_base_keywords[concept_key]
            plan = [f"Explain the {section.replace('_', ' ')} of {concept_key}." for section in sections]
            return plan
        else:
            return [f"Explain '{concept_key}' in detail."] if concept_key else ["Provide a general explanation for the query."]

    def _generate_section_content(self, section_prompt):
        """
        Uses the Qwen3 model to generate content for a specific section.
        Args:
            section_prompt (str): The specific prompt for generating a section (e.g., "Define quantum entanglement.").
        Returns:
            str: The generated text for that section.
        """
        if not self.model or not self.tokenizer:
            return "Error: Qwen3 model or tokenizer not provided. Cannot generate content."

        messages = [{"role": "user", "content": section_prompt}]
        text = self.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

        encoded_input = self.tokenizer(text, return_tensors="pt")
        input_ids = encoded_input.input_ids.to(self.model.device)
        attention_mask = encoded_input.attention_mask.to(self.model.device)

        try:
            output = self.model.generate(
                input_ids,
                attention_mask=attention_mask,
                max_new_tokens=4096,
                temperature=0.7,
                do_sample=True
            )
            raw_generated_text = self.tokenizer.decode(output[0], skip_special_tokens=True)

            assistant_response = ""
            match = re.search(r'(?:assistant|ASSISTANT):?\s*(.*)', raw_generated_text, re.DOTALL | re.IGNORECASE)
            if match:
                assistant_response = match.group(1).strip()
            else:
                user_prompt_part = text
                user_prompt_pattern = r"^" + re.escape(user_prompt_part.strip()) + r"\s*"
                assistant_response = re.sub(user_prompt_pattern, "", raw_generated_text, flags=re.DOTALL).strip()
                if not assistant_response:
                    assistant_response = raw_generated_text.strip()

            cleaned_text = re.sub(r'<think>.*?</think>', '', assistant_response, flags=re.DOTALL)
            cleaned_text = re.sub(r'<think>.*', '', cleaned_text, flags=re.DOTALL)

            final_text = cleaned_text.strip()
            return final_text
        except Exception as e:
            print(f"[{self.name} - Generation Error] Failed to generate content: {e}")
            return f"Failed to generate content for '{section_prompt}' due to an error."

    def _assemble_and_refine_response(self, concept_key, generated_sections):
        """
        Simulates assembling and refining the generated content into a final response.
        Args:
            concept_key (str): The identified concept.
            generated_sections (dict): Dictionary of generated content for each section.
        Returns:
            str: The final, polished explanation.
        """
        final_response_parts = []

        if concept_key:
            final_response_parts.append(f"Here's an explanation of **{concept_key.replace('_', ' ').title()}**:")

            ordered_sections = self.knowledge_base_keywords.get(concept_key, [])
            for section_type in ordered_sections:
                if section_type in generated_sections and generated_sections[section_type]:
                    title = section_type.replace('_', ' ').title()
                    final_response_parts.append(f"\n### {title}\n{generated_sections[section_type]}")
        else:
            if "general_explanation" in generated_sections:
                final_response_parts.append(generated_sections["general_explanation"])
            else:
                final_response_parts.append("I couldn't generate a specific explanation for your query.")

        final_response_parts.append("\n\nI hope this detailed explanation is helpful!")

        final_response = "\n".join(final_response_parts)
        return final_response

    def explain_concept(self, query):
        """
        Orchestrates the agent's full thought process to explain a concept.
        Args:
            query (str): The user's query.
        Returns:
            str: The agent's final explanation.
        """
        #print("\n--- Agentic Process Start ---")
        if not self.model or not self.tokenizer:
            print("[Agent - Error] Model not ready. Cannot proceed with explanation.")
            return "I am unable to process your request as the underlying model could not be loaded. Please check your environment."

        concept_key = self._analyze_query(query)
        plan_sections = self._plan_explanation(concept_key)

        generated_content = {}
        if concept_key:
            for section_prompt_template in plan_sections:
                section_type = section_prompt_template.split("Explain the ")[1].split(" of")[0].replace(' ', '_')
                generated_content[section_type] = self._generate_section_content(section_prompt_template)
        else:
            general_prompt = f"Explain '{query}' in detail."
            generated_content["general_explanation"] = self._generate_section_content(general_prompt)

        final_explanation = self._assemble_and_refine_response(concept_key, generated_content)

        #print("--- Agentic Process End ---\n")
        return final_explanation

# --- Model loading moved outside the class ---
def load_qwen3_model(model_name="Qwen/Qwen3-8B"):
    """
    Loads the Qwen3 model and tokenizer. This can be resource-intensive.
    Returns:
        tuple: (tokenizer, model) or (None, None) if loading fails.
    """
    print(f"[Global Model Loading] Loading Qwen3 model '{model_name}'...")
    try:
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(
            model_name,
            torch_dtype="auto",
            device_map="auto"
        )
        print(f"[Global Model Loading] Model loaded successfully on device: {model.device}")
        return tokenizer, model
    except Exception as e:
        print(f"[Global Model Loading Error] Failed to load model: {e}")
        print("Please ensure 'transformers' and 'torch' are installed and you have sufficient resources (e.g., GPU memory).")
        return None, None

In [None]:
# Load the Qwen3 model and tokenizer globally
tokenizer, model = load_qwen3_model()

if not model or not tokenizer:
    print("Global model loading failed. Exiting.")

In [3]:
# Initialize the agent with the pre-loaded tokenizer and model
agent = Qwen3ExplanationAgent(tokenizer=tokenizer, model=model)

print("\nStarting Qwen3 Agentic Explanation System in fully automatic mode.")

automatic_query = "Explain flight planning."

response = agent.explain_concept(automatic_query)
print(f"\nAgent's Automatic Explanation for '{automatic_query}':\n{response}")

[Qwen3Agent - Init] Agent initialized.
[Qwen3Agent - Init] Model available on device: cuda:0

Starting Qwen3 Agentic Explanation System in fully automatic mode.

Agent's Automatic Explanation for 'Explain flight planning.':
Here's an explanation of **Flight Planning**:

### Regulations
Flight planning is a critical process in aviation that ensures the safety, efficiency, and compliance of flights. It involves adhering to regulations set by national aviation authorities and international standards. Below is an overview of the key regulations and components of flight planning:

---

### **1. Regulatory Bodies and Frameworks**
Flight planning is governed by:
- **FAA (U.S.)**: Under 14 CFR Part 91 (General Operating and Flight Rules) and Part 121 (Air Carrier Operations).
- **EASA (Europe)**: Regulations such as EASA Part-OPS (Operations) and EASA Part-M (Maintenance).
- **ICAO (International Civil Aviation Organization)**: Sets global standards (e.g., ICAO Doc 4444, "Manual of Internation

## flight plan

In [None]:
import datetime
import json
import re
import torch
from functools import partial
from transformers import AutoModelForCausalLM, AutoTokenizer

# --- YOUR SPECIFIED MODEL LOADING CODE - EXACTLY AS PROVIDED AND CONFIRMED WORKING ---
# This block will attempt to load the model as per your instruction.
model_name = "Qwen/Qwen3-8B"

# No print statements or try-except blocks added here, as per your strict instruction.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto", # or torch.bfloat16 for bfloat16 models
    device_map="auto",
    trust_remote_code=True # This is often crucial for Qwen models
)
# The 'prompt', 'messages', 'text' from your example are not used in the agent loop
# but are included here as they were part of your exact code snippet.
prompt = "Explain the concept of quantum entanglement."
messages_example = [{"role": "user", "content": prompt}]
text_example = tokenizer.apply_chat_template(messages_example, tokenize=False, add_generation_prompt=True)

In [12]:
print(f"Model device: {model.device}")
print(f"Model dtype: {model.dtype}")
print(f"Model num_parameters: {model.num_parameters()}")

Model device: cuda:0
Model dtype: torch.bfloat16
Model num_parameters: 8190735360


In [16]:
# --- 1. Define Mock Tools for Flight Planning ---
# These functions simulate external systems (weather APIs, flight calculators, etc.)

def get_current_datetime(timezone="EST", format_string="%Y-%m-%d %H:%M:%S") -> str:
    """
    Returns the current date and time in the specified timezone and format.
    Use this tool to get the current date or time.
    Args:
        timezone (str): The desired timezone (e.g., "EST", "UTC"). Defaults to "EST".
        format_string (str): The Python strftime format string (e.g., "%Y-%m-%d %H:%M:%S").
                             Defaults to "%Y-%m-%d %H:%M:%S".
    """
    # Using the remembered preference for EST
    tz_offset = datetime.timezone(datetime.timedelta(hours=-5)) # EST is UTC-5
    return datetime.datetime.now(tz_offset).strftime(format_string)

def get_weather_forecast(location: str, date_time: str) -> str:
    """
    Fetches a simplified weather forecast for a given location (airport code or city name) and date/time.
    Location can be an airport code (e.g., "CYUL", "KJFK") or city name.
    Args:
        location (str): The airport code (e.g., "CYUL") or city name (e.g., "Montreal").
        date_time (str): The date and time for the forecast (e.g., "2025-07-22 08:00:00 EST").
    """
    weather_data = {
        "CYUL": {"2025-07-22 08:00:00 EST": {"temperature": "20C", "wind": "10 knots, 270 degrees", "visibility": "10+ miles", "conditions": "Clear"}},
        "KJFK": {"2025-07-22 08:00:00 EST": {"temperature": "22C", "wind": "5 knots, 180 degrees", "visibility": "10+ miles", "conditions": "Partly Cloudy"}},
    }

    try:
        dt_obj = datetime.datetime.strptime(date_time, "%Y-%m-%d %H:%M:%S %Z")
        date_time_str_lookup = dt_obj.strftime("%Y-%m-%d %H:%M:%S EST") # Normalize to mock data format
    except ValueError:
        try: # Try without timezone if first attempt fails
            dt_obj = datetime.datetime.strptime(date_time, "%Y-%m-%d %H:%M:%S")
            date_time_str_lookup = dt_obj.strftime("%Y-%m-%d %H:%M:%S EST") # Assume EST
        except ValueError:
            return f"Error: Invalid date_time format for weather tool: '{date_time}'. Please use 'YYYY-MM-DD HH:MM:SS EST'."

    normalized_location = location.upper()
    if normalized_location in weather_data and date_time_str_lookup in weather_data[normalized_location]:
        return json.dumps(weather_data[normalized_location][date_time_str_lookup])
    else:
        return f"Weather data not available for {location} at {date_time}."

def calculate_route_distance(origin_airport_code: str, destination_airport_code: str) -> str:
    """
    Calculates the approximate straight-line distance in nautical miles between two ICAO airport codes.
    Requires valid ICAO airport codes (e.g., "CYUL", "KJFK").
    Args:
        origin_airport_code (str): The ICAO code of the origin airport (e.g., "CYUL").
        destination_airport_code (str): The ICAO code of the destination airport (e.g., "KJFK").
    """
    distances = {
        ("CYUL", "KJFK"): 300,
        ("KJFK", "CYUL"): 300,
    }
    distance = distances.get((origin_airport_code.upper(), destination_airport_code.upper()))
    if distance:
        return f"{distance} nautical miles"
    else:
        return "Distance calculation not available for these airports."

def estimate_fuel_burn_cessna172(distance_nm: float, average_wind_kts: float = 0) -> str:
    """
    Estimates fuel burn for a Cessna 172 based on distance in nautical miles and optional average wind speed.
    Assumes a cruise speed of 100 knots and fuel consumption of 8 gallons per hour.
    Args:
        distance_nm (float): The distance in nautical miles.
        average_wind_kts (float): Optional: Average headwind component in knots. Defaults to 0 (no wind).
                                 A positive value means headwind, negative means tailwind.
    """
    cruise_speed_kts = 100.0
    fuel_burn_gph = 8.0

    effective_ground_speed_kts = cruise_speed_kts - average_wind_kts

    if effective_ground_speed_kts <= 10:
        return "Cannot accurately estimate fuel burn due to very low or negative effective speed (strong headwind/tailwind)."

    estimated_flight_time_hours = distance_nm / effective_ground_speed_kts
    estimated_fuel_gallons = estimated_flight_time_hours * fuel_burn_gph
    return f"{estimated_fuel_gallons:.1f} gallons"

# Map tool names to their functions
tool_map = {
    "get_current_datetime": partial(get_current_datetime, timezone="EST"),
    "get_weather_forecast": get_weather_forecast,
    "calculate_route_distance": calculate_route_distance,
    "estimate_fuel_burn_cessna172": estimate_fuel_burn_cessna172,
}

# --- 3. Agentic Loop (Manual ReAct Implementation) ---

def run_agent(user_query: str, max_iterations: int = 7) -> str:
    # `model` and `tokenizer` are assumed to be loaded by the code provided at the top.

    messages = [{"role": "user", "content": user_query}]

    # Define available tools for the model to understand in the prompt
    tool_definitions = [
        {"name": "get_current_datetime", "description": "Returns the current date and time in EST, formatted according to the provided Python strftime format string. Use this tool whenever the user asks for the current date, time, or both.", "parameters": {"type": "object", "properties": {"format_string": {"type": "string", "description": "The Python strftime format string."}}, "required": []}},
        {"name": "get_weather_forecast", "description": "Fetches a simplified weather forecast for a given location (airport code or city name) and date/time. Use to get weather conditions.", "parameters": {"type": "object", "properties": {"location": {"type": "string"}, "date_time": {"type": "string", "description": "e.g., 'YYYY-MM-DD HH:MM:SS EST'"}}, "required": ["location", "date_time"]}},
        {"name": "calculate_route_distance", "description": "Calculates the approximate straight-line distance in nautical miles between two ICAO airport codes. Requires valid ICAO airport codes (e.g., 'CYUL', 'KJFK').", "parameters": {"type": "object", "properties": {"origin_airport_code": {"type": "string"}, "destination_airport_code": {"type": "string"}}, "required": ["origin_airport_code", "destination_airport_code"]}},
        {"name": "estimate_fuel_burn_cessna172", "description": "Estimates fuel burn for a Cessna 172 based on distance in nautical miles and optional average headwind component in knots. A positive value means headwind, negative means tailwind. Use this only for Cessna 172 aircraft.", "parameters": {"type": "object", "properties": {"distance_nm": {"type": "number"}, "average_wind_kts": {"type": "number"}}, "required": ["distance_nm"]}},
    ]

    for i in range(max_iterations):
        #print(f"\n--- Agent Iteration {i+1} ---")

        full_input_text = tokenizer.apply_chat_template(
            messages,
            tools=tool_definitions, # Pass the tool definitions
            tokenize=False,
            add_generation_prompt=True,
        )

        # Generate response
        input_ids = tokenizer(full_input_text, return_tensors="pt").input_ids.to(model.device)

        with torch.no_grad():
            output_ids = model.generate(
                input_ids,
                max_new_tokens=4096,
                temperature=0.6,
                top_p=0.9,
                do_sample=True,
                pad_token_id=tokenizer.eos_token_id,
            )

        generated_response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True).strip()
        #print(f"Model Raw Output:\n{generated_response}")

        # Add model's response to messages for next turn
        messages.append({"role": "assistant", "content": generated_response})

        # --- Tool Call Parsing ---
        tool_call_match = re.search(r'<tool_code>(.*?)</tool_code>', generated_response, re.DOTALL)

        if tool_call_match:
            tool_code_str = tool_call_match.group(1).strip()
            print(f"Detected Tool Code:\n{tool_code_str}")

            try:
                tool_call_data = json.loads(tool_code_str)
                tool_name = tool_call_data.get("api_name")
                tool_args = tool_call_data.get("parameters", {})

                if tool_name and tool_name in tool_map:
                    print(f"Executing tool: {tool_name} with args: {tool_args}")
                    tool_output = tool_map[tool_name](**tool_args)
                    print(f"Tool Output: {tool_output}")

                    messages.append({"role": "tool", "name": tool_name, "content": tool_output})
                else:
                    error_msg = f"Tool '{tool_name}' not found or invalid tool call."
                    print(error_msg)
                    messages.append({"role": "tool", "name": "error", "content": error_msg})
            except json.JSONDecodeError:
                error_msg = f"Failed to parse tool code as JSON: {tool_code_str}"
                print(error_msg)
                messages.append({"role": "tool", "name": "error", "content": error_msg})
            except Exception as e:
                error_msg = f"Error executing tool: {e}. Tool code: {tool_code_str}"
                print(error_msg)
                messages.append({"role": "tool", "name": "error", "content": error_msg})
        else:
            is_final_response = True
            if "<thought>" in generated_response:
                is_final_response = False
            if "<tool_code>" in generated_response:
                is_final_response = False

            if is_final_response:
                #print("No tool code detected. Assuming final answer.")
                final_answer = generated_response.replace("<finish>", "").strip()
                final_answer = re.sub(r'', '', final_answer, flags=re.DOTALL).strip()
                return final_answer
            else:
                print("Model output did not contain a tool call or a final answer. Continuing loop.")


    return "Agent reached max iterations without a final answer. Please refine the query or increase max_iterations."

# --- 4. Run the Agent ---
user_query = "Plan a flight from Montreal (CYUL) to New York (KJFK) for tomorrow morning in a Cessna 172. How much fuel will it need? Assume tomorrow morning is July 22, 2025 at 08:00 EST."

print(f"User Query: {user_query}\n")
final_response = run_agent(user_query)
print("\n--- Final Agent Response ---")
print(final_response)

User Query: Plan a flight from Montreal (CYUL) to New York (KJFK) for tomorrow morning in a Cessna 172. How much fuel will it need? Assume tomorrow morning is July 22, 2025 at 08:00 EST.


--- Final Agent Response ---
<think>
Okay, let's tackle this user query step by step. The user wants to plan a flight from Montreal (CYUL) to New York (KJFK) for tomorrow morning, which they specified as July 22, 2025 at 08:00 EST. They're using a Cessna 172 and need to know the fuel required.

First, I need to check if the tools provided can handle this. The user mentioned the departure and arrival airports, so I should use the calculate_route_distance tool to get the distance between CYUL and KJFK. Once I have the distance, I can use the estimate_fuel_burn_cessna172 tool to calculate the fuel needed. 

Wait, the user also mentioned the date and time. Do I need to check the weather for that specific time? The get_weather_forecast tool is available, but the user hasn't asked for weather conditions. T