# Class 10: LLM Function Tools and Retrieval Augmented Generation
## Objective: Transition from simple chat prompts to structured, tool-using agents

An **AI Agent** is a software program that operates autonomously to achieve specific goals. Agents can solve problems and make decisions without continual human oversight. They often use LLMs to understand context and can connect to external applications (including APIs). In some cases they can also retain information from previous interactions and maintain knowledge about their environment.

A **Schema** for a function describes what a function does, what parameters it needs, and when to use it. Schemas are important for LLMs to know when and how to use functions. 

**Instructions:** Work with one or more students at your table. Discuss the key concepts and the code logic with one another. 

## Setup

Check that you have downloaded the `.env` and `.gitignore` files from Carmen and put them in the same directory as your class notebooks. If you have not done so, go through the Class 9 notebook first.

In [None]:
import os
import litellm
import base64
import json
import warnings
from dotenv import load_dotenv
import numpy as np

# Load environment variables
load_dotenv()

# Suppress specific Pydantic warnings
warnings.filterwarnings("ignore", category=UserWarning, module="pydantic")

custom_api_base = "https://litellmproxy.osu-ai.org" 
astro1221_key = os.getenv("ASTRO1221_API_KEY")

def prompt_llm(messages, model="openai/GPT-4.1-mini", temperature=0.2, max_tokens=1000, tools=None, verbose=True):
    """
    Send a prompt or conversation to an LLM using LiteLLM and return the response.
    Exact function from Class 9 notebook.
    """
    if isinstance(messages, str):
        messages = [{"role": "user", "content": messages}]
    if not (isinstance(temperature, (int, float)) and 0 <= temperature <= 2):
        raise ValueError("temperature must be a float between 0 and 2 (inclusive).")
    if not (isinstance(max_tokens, int) and max_tokens > 0):
        raise ValueError("max_tokens must be a positive integer.")

    try: 
        print("Contacting LLM via University Server...")
        response = litellm.completion(
            model=model,
            messages=messages,
            tools=tools,
            api_base=custom_api_base,
            api_key=astro1221_key,
            temperature=temperature,
            max_tokens=max_tokens
        )
        answer = response['choices'][0]['message']['content']
        if verbose: 
            print(f"\nSUCCESS! Here is the answer from {model}:\n")
            print(answer)
    except Exception as e:
        print(f"\nERROR: Could not connect. Details:\n{e}")    
        response = None
    return response

## Section 1: Unified LLM Interfacing

**Purpose:** Write unified code that interacts with different LLMs

To avoid rewriting code for different providers (OpenAI, Anthropic, Google), we use LiteLLM. This provides a streamlined way to interact with various models using a single syntax. The key to this is **Message Roles:**

**System:** Sets the "persona" or rules for the AI.

**User:** Your specific question or command.

**Assistant:** The model's previous responses (used to maintain context).
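
The Assistant role matters most in multi-turn conversations. The sketch below shows one way to keep a running history so that a follow-up question can refer to an earlier answer. The helper `add_turn` and the assistant reply are illustrative and not part of the class code; in practice the assistant text would come from `prompt_llm`.

```python
def add_turn(history, role, content):
    """Return a new message list with one more turn appended (hypothetical helper)."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"Unknown role: {role}")
    return history + [{"role": role, "content": content}]

history = [{"role": "system", "content": "You are a precise astronomical data validator. Be brief."}]
history = add_turn(history, "user", "Is it possible for a star to have a surface temperature of 50,000K?")
# Placeholder reply written by hand for illustration; normally this is the model's output
history = add_turn(history, "assistant", "Yes. The hottest O-type stars reach roughly that temperature.")
history = add_turn(history, "user", "What color would such a star appear?")

# Print the conversation in the order the model would receive it
for turn in history:
    print(f"{turn['role']:>9}: {turn['content']}")
```

Because the earlier assistant reply is included in the list, the model can work out that "such a star" refers to the 50,000 K star.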

In [None]:
# Example: Creating a specific persona using System roles
messages = [
    {"role": "system", "content": "You are a precise astronomical data validator. Be brief."},
    {"role": "user", "content": "Is it possible for a star to have a surface temperature of 50,000K?"}
]

response = prompt_llm(messages, model="gemini/gemini-2.5-flash")

**Test your understanding:** Modify the message list below so the System role instructs the AI to respond like a "17th-century astronomer using archaic language," and ask the User question: "What is the nature of the Milky Way?"

In [None]:
# Your instructions here:
archaic_messages = [
    {"role": "system", "content": "Response instructions"},
    {"role": "user", "content": "Question"}
]
# uncomment when ready
# response = prompt_llm(archaic_messages)

## Section 2: Structured Outputs and JSON Mode

**Purpose:** We can request JSON output to make responses easier to parse.

Astronomers often need data that Python can parse programmatically (e.g., a list of coordinates) rather than a conversational paragraph. By instructing the model to return only JSON, we make it much more likely that the output can be loaded into a dictionary and passed directly into a data pipeline. The model may still wrap the JSON in Markdown or add extra text, so the code cleans the output before parsing it.

In [None]:
# Using prompt engineering to get structured JSON data
prompt = "Return a JSON object containing the 'constellation', 'brightest_star', and 'approx_distance_ly' for the constellation Lyra. Return ONLY the JSON."

response = prompt_llm(prompt, temperature=0.1)

# Parsing the string output into a Python dictionary
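if response:
    raw_text = response['choices'][0]['message']['content']
    # Clean up Markdown code fences if the model included them
    clean_json = raw_text.replace('```json', '').replace('```', '').strip()
    try:
        data = json.loads(clean_json)
        print(f"The brightest star is: {data['brightest_star']}")
    except (json.JSONDecodeError, KeyError) as e:
        # The model may return invalid JSON or leave out a requested key
        print(f"Could not read the expected JSON from the model output: {e}")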
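
Stripping Markdown fences with `replace` works when the reply is otherwise clean, but models sometimes add a sentence before or after the JSON. The sketch below is one possible way to handle that, not the method used in the class code: it takes everything between the first `{` and the last `}`. The helper name `extract_json` and the sample reply are made up for illustration.

```python
import json
import re

def extract_json(text):
    """Parse the first JSON object found in a model reply (hypothetical helper).

    Returns a dict, or None if no valid JSON object can be parsed.
    """
    # Take everything from the first '{' to the last '}' so that Markdown
    # fences or sentences around the object are ignored.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

# Hand-written stand-in for a model reply, not real model output
fence = "`" * 3
sample_reply = f'Here you go:\n{fence}json\n{{"constellation": "Lyra", "brightest_star": "Vega"}}\n{fence}'
print(extract_json(sample_reply))
print(extract_json("Sorry, I cannot help with that."))
```

Returning `None` instead of raising an exception lets the calling code decide whether to retry the prompt or skip the entry.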

**Test your understanding:** Ask the model for a JSON object containing the name, type (e.g., Spiral, Elliptical), and redshift of the Sombrero Galaxy.

In [None]:
# Your code here:
json_prompt = ""
# response = prompt_llm(json_prompt)

## Section 3: LLMs in functions 

The LLM API is especially useful when a task must be repeated many times, which would be very time consuming to copy and paste into a chatbot in a web browser.

Below is a set of astronomical descriptions. The loop sends each description to the LLM, which extracts the object's name, type, and distance from the text. The results are stored in a dictionary keyed by object name.

In [None]:
import json

# 1. Your raw data (this could also be loaded from a CSV or Text file)
astronomical_data = [
    "The star Betelgeuse is a red supergiant located roughly 640 light-years away.",
    "Sirius A is the brightest star in the night sky and is a main-sequence star.",
    "M31, also known as the Andromeda Galaxy, is a spiral galaxy 2.5 million light-years from Earth.",
    "Proxima Centauri is a red dwarf star and the closest star to the Sun at 4.24 light-years.",
    "Rigel is a blue supergiant in the constellation Orion, situated about 860 light-years from our solar system.",
    "The Crab Nebula (M1) is a supernova remnant in the constellation Taurus, approximately 6,500 light-years away.",
    "Sagittarius A* is the supermassive black hole at the center of the Milky Way, located 26,000 light-years from Earth.",
    "Vega is a bright blue-white main-sequence star in the Lyra constellation, about 25 light-years distant.",
    "The Sombrero Galaxy (M104) is an unbarred spiral galaxy found 29 million light-years away in Virgo.",
    "Canopus is a white giant and the second-brightest star in the sky, located 310 light-years from Earth.",
    "Polaris, the North Star, is a yellow supergiant star system roughly 430 light-years away.",
    "Alpha Centauri A is a yellow main-sequence star, similar to the Sun, part of a system 4.37 light-years away.",
    "The Pleiades (M45) is an open star cluster in Taurus containing middle-aged, hot B-type stars about 444 light-years distant.",
    "Arcturus is a red giant star in the constellation Boötes and is the fourth-brightest star, 37 light-years away.",
    "The Whirlpool Galaxy (M51) is a classic spiral galaxy located 23 million light-years from Earth in Canes Venatici."
]
# 2. A dictionary to store the processed results
processed_catalog = {}

print("Starting data processing...")

for entry in astronomical_data:
    # Create a prompt that enforces JSON output for easy parsing
    prompt = f"""
    Analyze the following text and extract the object's name, its classification, 
    and its approximate distance.
    
    TEXT: {entry}
    
    Return ONLY a JSON object with the keys: 'name', 'type', and 'distance'.
    """
    
    # Call the prompt_llm() function
    # We use a low temperature for more consistent, factual data
    response = prompt_llm(prompt, temperature=0.1, verbose=False)
    
    if response:
        try:
            # Extract the raw string from the response
            raw_content = response['choices'][0]['message']['content']
            
            # Clean Markdown formatting if the model included it
            clean_json = raw_content.replace('```json', '').replace('```', '').strip()
            
            # Convert the JSON string into a Python dictionary
            object_info = json.loads(clean_json)
            
            # Use the object name as the key in our main catalog dictionary
            name = object_info['name']
            processed_catalog[name] = {
                "type": object_info['type'],
                "distance": object_info['distance']
            }
            print(f"Successfully processed: {name}")
            
        except Exception as e:
            print(f"Error parsing data for entry: {entry}. Error: {e}")

# 3. View the final structured results
print("\nFinal Processed Catalog (stored in processed_catalog):")
print(json.dumps(processed_catalog, indent=4))
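
The prompt above does not say how to format `distance`, so the model may return a number or a string such as "2.5 million light-years". If you want to sort or plot the distances, they first need to be numeric. The sketch below is one possible approach, assuming every distance is in light-years as in the source sentences; the helper `parse_distance_ly` is hypothetical and the inputs are example strings, not real model output.

```python
import re

# Multipliers for scale words that can appear in a distance string
SCALE = {"thousand": 1e3, "million": 1e6, "billion": 1e9}

def parse_distance_ly(distance):
    """Convert a distance given as a number or a string to a float in light-years.

    Hypothetical helper: assumes the unit is light-years and returns None
    if no number can be found.
    """
    if isinstance(distance, (int, float)):
        return float(distance)
    # Match a number such as 640, 6,500 or 2.5, optionally followed by a scale word
    match = re.search(r"(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?", str(distance).lower())
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    return number * SCALE.get(match.group(2), 1.0)

for example in ["640 light-years", "6,500 light-years", "2.5 million light-years", 4.24, "unknown"]:
    print(example, "->", parse_distance_ly(example))
```

A simpler alternative is to ask for the format you want in the prompt itself, for example by requesting `distance` as a number in light-years.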

**Test your understanding:** Write a loop to print the distance and type for each object in `processed_catalog`

In [None]:
# Enter your code here

## Section 4: Function Calling (Tool Use)

Function tools allow LLMs to call your functions. This is valuable because you control how a calculation or other action is performed, which can be both more accurate and more efficient than asking the LLM to do it.

You provide a function and a `JSON Schema` to the LLM that describes what the function does and what arguments it needs. The model then decides if it needs to call that function to answer a question. You can provide many functions, with a schema for each.

Here is an example with a simple function to compute distance from parallax, along with its schema.

In [None]:
def parallax_to_distance(parallax_arcsec):
    """
    Convert stellar parallax to distance in parsecs.
    
    The fundamental equation: d = 1/p
    where d is distance in parsecs and p is parallax in arcseconds.
    """
    # Input validation - always check for invalid inputs!
    if parallax_arcsec <= 0:
        return {"error": "Parallax must be positive"}
    
    # Calculate distance using the parallax formula
    distance_pc = 1.0 / parallax_arcsec
    
    # Return as a dictionary for structured data
    # We round to 2 decimal places for readability
    return {"distance_parsecs": round(distance_pc, 2)}

# Create a tools list with our function schema
tools = [
    {
        "type": "function",
        "function": {
            "name": "parallax_to_distance",
            "description": "Calculate the distance to a star in parsecs given its parallax in arcseconds.",
            "parameters": {
                "type": "object",
                "properties": {
                    "parallax_arcsec": {
                        "type": "number", 
                        "description": "The parallax value in arcseconds (e.g., 0.742 for Alpha Centauri)"
                    }
                },
                "required": ["parallax_arcsec"]
            }
        }
    }
]

print("Function schema defined")
print(f"The LLM knows about {len(tools)} function(s)")
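
Before handing a tool to an LLM, it helps to check the underlying formula by hand. The sketch below applies d = 1/p to Proxima Centauri's parallax of 0.768 arcseconds and converts the result to light-years. The conversion factor of about 3.2616 light-years per parsec is not defined elsewhere in this notebook.

```python
PC_TO_LY = 3.2616  # light-years per parsec (rounded)

parallax_arcsec = 0.768                 # Proxima Centauri
distance_pc = 1.0 / parallax_arcsec     # d = 1/p, in parsecs
distance_ly = distance_pc * PC_TO_LY    # convert parsecs to light-years

print(f"{distance_pc:.3f} pc = {distance_ly:.2f} ly")
```

The result is about 1.30 parsecs, or roughly 4.25 light-years, which is consistent with the 4.24 light-years quoted for Proxima Centauri in Section 3.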

### Illustration:

Here is an example interaction in which the LLM is told about the tool and uses it.

In [None]:
# Step 1: User asks a question requiring a calculation
messages = [{"role": "user", "content": "The star Proxima Centauri has a parallax of 0.768 arcseconds. How far away is it in parsecs?"}]

# Step 2: Call the LLM with the tools schema defined
response = prompt_llm(messages, tools=tools) 

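# prompt_llm returns None if the request failed, so stop here in that case
if response is None:
    raise RuntimeError("No response from the LLM. Check your API key and connection.")

message = response.choices[0].message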

# Step 3: Check if the model wants to call a tool
if message.tool_calls:
    tool_call = message.tool_calls[0]
    function_name = tool_call.function.name
    
    if function_name == "parallax_to_distance":
        # Extract the arguments the LLM decided to use
        args = json.loads(tool_call.function.arguments)
        p_val = args.get("parallax_arcsec")
        
        print(f"Agent Thought: I need to calculate distance for p={p_val}")
        
        # Step 4: Execute the actual Python function locally
        observation = parallax_to_distance(p_val)
        print(f"Observation from Tool: {observation}")
        
        # Step 5: Feed the observation back to the LLM to get the final answer
        messages.append(message) # Add the model's tool call to history
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "name": function_name,
            "content": json.dumps(observation)
        })
        
        final_response = prompt_llm(messages, tools=tools)
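
The cell above checks for one function name with an `if` statement. With several tools, a lookup table from tool name to Python function avoids a growing chain of `if` statements. The sketch below is one possible pattern; `AVAILABLE_FUNCTIONS` and `run_tool` are hypothetical names, and `parallax_to_distance` is repeated only so the example runs on its own.

```python
import json

def parallax_to_distance(parallax_arcsec):
    """Same calculation as the earlier cell, repeated so this sketch is self-contained."""
    if parallax_arcsec <= 0:
        return {"error": "Parallax must be positive"}
    return {"distance_parsecs": round(1.0 / parallax_arcsec, 2)}

# Map each tool name used in the schema to the Python function that implements it
AVAILABLE_FUNCTIONS = {"parallax_to_distance": parallax_to_distance}

def run_tool(function_name, arguments_json):
    """Run the named tool with JSON-encoded arguments (hypothetical helper)."""
    function = AVAILABLE_FUNCTIONS.get(function_name)
    if function is None:
        return {"error": f"Unknown function: {function_name}"}
    try:
        arguments = json.loads(arguments_json)
        return function(**arguments)
    except (json.JSONDecodeError, TypeError) as e:
        # Covers malformed JSON as well as missing, extra, or wrongly typed arguments
        return {"error": f"Bad arguments for {function_name}: {e}"}

print(run_tool("parallax_to_distance", '{"parallax_arcsec": 0.768}'))
print(run_tool("not_a_tool", "{}"))
```

In the loop above, `run_tool(tool_call.function.name, tool_call.function.arguments)` could take the place of the `if function_name == ...` branch. Returning errors as dictionaries lets them be sent back to the model in the same way as a normal observation.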

### Add another function and schema 

The next cell has a function `stellar_luminosity()` to compute a star's luminosity based on its size and temperature and adds the schema for this function to `tools`.

In [None]:
# Example from
# https://tingyuansen.github.io/coding_essential_for_astronomers/lectures/lecture08-llm-function-tools-and-rag.html

def stellar_luminosity(radius_solar, temperature_k):
    """
    Calculate stellar luminosity using the Stefan-Boltzmann law.
    
    The energy radiated by a star depends on its surface area (4πR²)
    and how much energy each square meter emits (σT⁴).
    """
    # Physical constants
    stefan_boltzmann = 5.67e-8  # W m^-2 K^-4 (Stefan-Boltzmann constant)
    solar_radius = 6.96e8  # meters (Sun's radius)
    solar_luminosity = 3.83e26  # watts (Sun's total energy output)
    
    # Always validate inputs
    if radius_solar <= 0 or temperature_k <= 0:
        return {"error": "Radius and temperature must be positive"}
    
    # Convert stellar radius from solar units to meters
    radius_meters = radius_solar * solar_radius
    
    # Apply Stefan-Boltzmann law: L = 4πR²σT⁴
    luminosity_watts = 4 * np.pi * radius_meters**2 * stefan_boltzmann * temperature_k**4
    
    # Convert to solar luminosities for easier interpretation
    luminosity_solar = luminosity_watts / solar_luminosity
    
    return {
        "luminosity_solar": round(luminosity_solar, 3),
        "luminosity_watts": f"{luminosity_watts:.2e}"  # Scientific notation
    }

# Define the schema for the luminosity function
luminosity_schema = {
    "type": "function",
    "function": {
        "name": "stellar_luminosity",
        # Raw string (r"...") so the backslashes in the LaTeX are not treated as escape sequences
        "description": r"Calculate stellar luminosity using the Stefan-Boltzmann law ($L = 4\pi R^2 \sigma T^4$).",
        "parameters": {
            "type": "object",
            "properties": {
                "radius_solar": {
                    "type": "number",
                    "description": "The radius of the star in units of solar radii (R_sun)."
                },
                "temperature_k": {
                    "type": "number",
                    "description": "The effective surface temperature of the star in Kelvin (K)."
                }
            },
            "required": ["radius_solar", "temperature_k"]
        }
    }
}

# Append it to your existing tools list
tools.append(luminosity_schema)
print(f"The LLM knows about {len(tools)} function(s)")
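
Dividing the Stefan-Boltzmann law for a star by the same law for the Sun cancels the constants, which gives a quick way to check the function's output by hand. This assumes a solar effective temperature of about 5772 K, a value that is not defined in the code above:

$$
\frac{L}{L_\odot} = \left(\frac{R}{R_\odot}\right)^2 \left(\frac{T}{T_\odot}\right)^4, \qquad T_\odot \approx 5772\ \mathrm{K}
$$

For example, a star with twice the Sun's radius and twice its surface temperature should come out near $2^2 \times 2^4 = 64\,L_\odot$. Small differences from this value are expected because the constants in the code are rounded.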

**Test your understanding:** Add code to the following code cell to test the stellar_luminosity() function.

In [None]:
# Enter your code here


## Section 5: Context Windows & The Knowledge Gap

LLMs have a **knowledge cut-off** (the date their training ended). They are also limited by their **Context Window** (how much text they can "read" at once). 

In [None]:
# Example of a 'Knowledge Gap' question
prompt = "What were the main findings of the Astro-Deep-Search paper published in December 2025?"
print(prompt) 
print("Without RAG, the model will likely say it doesn't know or will hallucinate a guess.")
response = prompt_llm(prompt)
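
To get a feel for the context window limit, you can estimate how many tokens a piece of text uses. The sketch below relies on the rough rule of thumb of about 4 characters per token for English text, which is only an approximation because real tokenizers differ by model. The function names and the window sizes are hypothetical, chosen for illustration.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text; real tokenizers differ

def estimate_tokens(text):
    """Crude token estimate from the character count (not a real tokenizer)."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(text, context_window_tokens, reserved_for_answer=1000):
    """Check whether the text, plus room for the answer, fits in the window."""
    return estimate_tokens(text) + reserved_for_answer <= context_window_tokens

paper_text = "word " * 50_000   # stand-in for a long document (250,000 characters)
print(estimate_tokens(paper_text))
print(fits_in_context(paper_text, context_window_tokens=8_000))
print(fits_in_context(paper_text, context_window_tokens=128_000))
```

The default `reserved_for_answer` matches the `max_tokens=1000` default in `prompt_llm`. When a document does not fit, sending only the relevant parts is one motivation for the retrieval approach in the next section.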

**Test your understanding:** Ask this question by copying it into the prompt in the next cell: "If an LLM was trained in early 2024, can it correctly identify the current orbital position of a comet discovered in 2025 without using RAG or a Tool?"

In [None]:
prompt = ""
response = prompt_llm(prompt)

## Section 6: Retrieval Augmented Generation (RAG)

RAG addresses the "Knowledge Gap" and reduces hallucinations. Instead of relying only on the model's internal memory (which might be outdated), we provide a "cheat sheet" of relevant text snippets (the Context) for the model to read before it answers.

RAG is essential when you are working with data or papers released after the model was trained.

In [None]:
# The 'Retrieval' part: We find a relevant snippet from a recent paper
print("\n** Include context in the query")
context_snippet = "The 2025 survey of the G-79 cluster found 12 new blue stragglers."

query = "How many blue stragglers were found in the G-79 cluster survey?"

# The 'Generation' part: We combine context + query
rag_prompt = f"Use this context to answer: {context_snippet}\n\nQuestion: {query}"

response = prompt_llm(rag_prompt)

print('=' * 60)
print("\n** Compare to the result without the context")

response = prompt_llm(query)
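
In the example above the context snippet is hard-coded, so the "Retrieval" step is done by hand. The sketch below shows a minimal automated version that picks the snippet sharing the most words with the query. Word overlap is a simple stand-in used here for illustration; it is not the embedding-based search used with the lecture database in Section 7. The helpers `tokenize` and `retrieve` are hypothetical, and the snippets are sentences that already appear in this notebook.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, snippets, n_results=1):
    """Rank snippets by the number of words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(snippets, key=lambda s: len(query_words & tokenize(s)), reverse=True)
    return ranked[:n_results]

snippets = [
    "The 2025 survey of the G-79 cluster found 12 new blue stragglers.",
    "Vega is a bright blue-white main-sequence star in the Lyra constellation, about 25 light-years distant.",
    "The Pleiades (M45) is an open star cluster in Taurus containing middle-aged, hot B-type stars about 444 light-years distant.",
]
query = "How many blue stragglers were found in the G-79 cluster survey?"

# Retrieval: pick the best-matching snippet. Generation: build the prompt around it.
context_snippet = retrieve(query, snippets)[0]
rag_prompt = f"Use this context to answer: {context_snippet}\n\nQuestion: {query}"
print(rag_prompt)
```

Word overlap fails when the query and the snippet use different words for the same idea, which is the gap that embedding-based search is meant to close.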

## Section 7: RAG with a pre-existing database

I have created a database with all of the lecture content from the online textbook. You can download this database and query it with the code below. Each query also returns the lecture that contains the material.

### Instructions to download the database

1. Download the database file `Astro1221-LectureDB.zip` from Carmen
2. Unzip this file on your computer. I recommend you do this in the same directory structure you use for the class notebooks. For example, if you have your notebooks in a directory like `Astro1221/Notebooks`, I recommend you put the database file in `Astro1221/` and unzip it there. This will create a directory called `Astro1221-LectureDB/`. 
3. Check that the `Astro1221-LectureDB/` directory contains a `chroma.sqlite3` file and a subdirectory with binary files.

### Environmental Setup

Make sure you have the necessary libraries to use the database. Run this command in your terminal:

`python -m pip install chromadb sentence-transformers`

The next code block connects to the database. Edit `DB_PATH` to point to the database on your computer, then run the cells that follow to test the agent.

In [None]:
import chromadb
from sentence_transformers import SentenceTransformer

# --- SETUP ---
# 1. Change DB_PATH to point to the folder where you unzipped Astro1221-LectureDB.zip
DB_PATH = "/Users/martini.10/OneDrive - The Ohio State University/Teaching/Astro1221/Sp26/Astro1221-LectureDB"
COLLECTION_NAME = "Astro1221"

# 2. Load the same embedding model used to create the database
# This is required so the 'math' of the search matches the database
print("Loading brain (embedding model)...")
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

# 3. Connect to the shared database
client = chromadb.PersistentClient(path=DB_PATH)
collection = client.get_collection(name=COLLECTION_NAME)
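
An embedding model turns each piece of text into a vector of numbers, and a search compares the query's vector with the vectors stored in the database. Cosine similarity is one common way to compare two vectors; whether this database uses cosine or another distance measure is not stated here. The sketch below uses made-up 3-dimensional vectors purely for illustration, whereas real embeddings from `all-MiniLM-L6-v2` have 384 dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors invented for this example; they are NOT real embeddings
query_vec = [0.9, 0.1, 0.0]            # pretend: a question about parallax
parallax_chunk_vec = [0.8, 0.2, 0.1]   # pretend: a lecture chunk about parallax
galaxy_chunk_vec = [0.1, 0.2, 0.9]     # pretend: a lecture chunk about galaxies

print(f"query vs parallax chunk: {cosine_similarity(query_vec, parallax_chunk_vec):.2f}")
print(f"query vs galaxy chunk:   {cosine_similarity(query_vec, galaxy_chunk_vec):.2f}")
```

The chunk whose vector points in nearly the same direction as the query scores close to 1 and would be returned first.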

In [None]:
def ask_astronomy_ai(query):
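    # 1. Embed the query with the same model used to build the database,
    #    then retrieve the top 3 most relevant snippets
    # The 'results' object contains 'documents' and 'metadatas'
    query_embedding = embedding_model.encode([query]).tolist()
    results = collection.query(query_embeddings=query_embedding, n_results=3)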
    
    # 2. Extract the text and the sources
    chunks = results['documents'][0]
    sources = results['metadatas'][0] # This contains [{"source": "Lecture08.html"}, ...]
    
    # 3. Format the context so the LLM knows which chunk came from which lecture
    context_with_citations = ""
    for i in range(len(chunks)):
        lecture_name = sources[i]['source']
        context_with_citations += f"\n[From {lecture_name}]:\n{chunks[i]}\n"

    # 4. Build the prompt with instructions to cite sources
    system_prompt = (
        "You are an Astronomy 1221 Tutor. Use the provided lecture snippets to answer. "
        "You should always mention which lecture number you found the information in."
    )
    
    user_message = f"CONTEXT:\n{context_with_citations}\n\nQUESTION: {query}"
    
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]
    
    # 5. Call the LLM
    return prompt_llm(messages)
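
The `chunks` returned by the query are short pieces of the lecture text that were stored in the database. How this particular database was split into chunks is not described here, so the sketch below shows only one common approach: fixed-size chunks of words with some overlap, so that a sentence cut at a boundary still appears whole in at least one chunk. The helper `chunk_text` and its default sizes are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of words (hypothetical helper)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # how far the window advances each time
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks

sample = "one two three four five six"
for chunk in chunk_text(sample, chunk_size=4, overlap=2):
    print(chunk)
```

Smaller chunks give more precise matches but less surrounding context, so the chunk size is a trade-off that depends on the material.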

In [None]:
response = ask_astronomy_ai("How do we measure the distance to stars using parallax?")

In [None]:
response = ask_astronomy_ai("What does a chunk represent in RAG?")

In [None]:
response = ask_astronomy_ai("What is a sentence transformer?")

In [None]:
response = ask_astronomy_ai("What does semantic similarity mean?")