# Lesson 3: The LLM as a Translation Engine NOT COMPLETED YET

Welcome to the grand finale! Let's recap what we've built so far:
* **Lesson 1 (RAG):** We taught our LLM how to understand fuzzy human words ("defrosted dunes") using NASA's landform dictionary.
* **Lesson 2 (APIs):** We wrote a Python script to send a strict Lucene query to NASA's server and get back a clickable link to view an image.

Right now, *we* are still the ones doing the heavy lifting. We have to manually figure out the exact Lucene syntax and type it into our Python script. 

In this lesson, we are going to use the LLM to bridge that gap. We will turn the LLM into a **Translation Engine**. Its only job will be to listen to a user's natural language request and translate it into a perfectly formatted Lucene query that our API script can understand.

### Guardrails: The Strict System Prompt
By default, LLMs love to chat. If we ask one for a query, it might say: *"Sure! Here is your query: `...` Let me know if you need anything else!"* Our Python script can't use that extra text, because everything the LLM returns gets sent to the API as the query. We need a **System Prompt** that strictly forbids the LLM from being conversational.

In [7]:
import os
import requests
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.getenv("OPENROUTER_API_KEY")
)

# 1. The Strict System Prompt
# We must explicitly tell the LLM not to chat. We ONLY want the raw query!
LUCENE_PROMPT = """
You are a NASA PDS search translator. 
Your job is to translate a user's natural language request into a strict Lucene query.

Use the following fields that apply to the user's query. If a field is not relevant, simply omit it from the query:
- gather.common.mission (e.g., "mro", "mars_2020")
- pds3_label.SOLAR_LATITUDE (e.g., [0 TO 80])

CRITICAL INSTRUCTION: Return ONLY the Lucene query string. Do not include markdown formatting, quotes, or conversational text.
"""

# 2. The Translation Function
def generate_lucene_query(user_request):
    print(f"User asked: '{user_request}'\n")
    print("🧠 LLM is translating to Lucene...")
    
    response = client.chat.completions.create(
        model="allenai/olmo-3.1-32b-instruct", # A great model for following strict instructions
        messages=[
            {"role": "system", "content": LUCENE_PROMPT},
            {"role": "user", "content": user_request}
        ],
        temperature=0.1 # We keep the temperature very low so it doesn't get "creative"
    )
    
    # .strip() removes leading/trailing whitespace and newlines the LLM might have added.
    # `content` can be None for an empty reply, so fall back to "" before stripping.
    clean_query = (response.choices[0].message.content or "").strip()
    return clean_query

# Let's test the brain!
user_idea = "Find me MRO images of Mars where the solar latitude is between 0 and 50."
generated_query = generate_lucene_query(user_idea)

print(f"\n✅ Generated Query: {generated_query}")

User asked: 'Find me MRO images of Mars where the solar latitude is between 0 and 50.'

🧠 LLM is translating to Lucene...

✅ Generated Query: gather.common.mission:mro AND pds3_label.SOLAR_LATITUDE:[0 TO 50]


### Hooking the Brain to the Hands

Look at that! The LLM successfully parsed the user's intent, remembered the exact database fields we provided in the system prompt, and output *only* the string we need. 
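
A strict system prompt makes clean output likely, but not guaranteed: a model may still wrap its answer in a code fence or in quotes now and then. Below is a minimal sketch of a defensive check you could run on the LLM's reply before sending it anywhere. The helper names `clean_llm_query` and `unknown_fields` are hypothetical (they are not part of any library), and the field check is a rough regex heuristic, not a real Lucene parser.

```python
import re

# Assumption: these are the only fields we listed in LUCENE_PROMPT.
ALLOWED_FIELDS = {"gather.common.mission", "pds3_label.SOLAR_LATITUDE"}

# Rough heuristic: a field name is a run of word characters/dots followed by a colon.
FIELD_PATTERN = re.compile(r"([A-Za-z_][\w.]*):")


def clean_llm_query(raw):
    """Strip a surrounding code fence, wrapping quotes, and extra whitespace."""
    text = raw.strip()
    # Remove a surrounding ```...``` fence, with an optional language tag.
    fence = re.fullmatch(r"```[\w-]*[ \t]*\n(.*?)\n?[ \t]*```", text, flags=re.DOTALL)
    if fence:
        text = fence.group(1).strip()
    # Remove one layer of wrapping quotes/backticks, but only when the same
    # character does not also appear inside (so '"a" OR "b"' is left alone).
    if (
        len(text) >= 2
        and text[0] == text[-1]
        and text[0] in "\"'`"
        and text[0] not in text[1:-1]
    ):
        text = text[1:-1].strip()
    return text


def unknown_fields(query):
    """Return a sorted list of field-like names that are not in ALLOWED_FIELDS."""
    return sorted(set(FIELD_PATTERN.findall(query)) - ALLOWED_FIELDS)


reply = "```lucene\ngather.common.mission:mro AND pds3_label.SOLAR_LATITUDE:[0 TO 50]\n```"
query = clean_llm_query(reply)
print(query)
print("Unknown fields:", unknown_fields(query))
```

If `unknown_fields` returns anything, the reply probably contains chatty text or a field the prompt never mentioned, and it is safer to stop than to send it to the API.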

Now, we just take that generated string and plug it directly into the API Search code we wrote back in Lesson 2!

In [8]:
# 3. Plug the LLM's query directly into the NASA API
if generated_query:
    print("🚀 Sending the LLM's generated query to the NASA API...")
    
    search_url = "https://pds-imaging.jpl.nasa.gov/api/search/atlas/_search"
    payload = {
        "query": {"query_string": {"query": generated_query}},
        "size": 1 
    }
    headers = {"Content-Type": "application/json"}
    
    # A timeout keeps the notebook from hanging forever if the server is slow to respond
    api_response = requests.post(search_url, json=payload, headers=headers, timeout=30)
    
    if api_response.status_code == 200:
        hits = api_response.json().get("hits", {}).get("hits", [])
        if hits:
            # Grab the URI of the first matching image
            uri = hits[0].get("_source", {}).get("uri")
            
            # Format our beautiful Atlas Viewer link!
            print("\n🎉 Success! The LLM found your image. View it here:")
            print(f"https://pds-imaging.jpl.nasa.gov/tools/atlas/record?uri={uri}")
        else:
            print("\nThe query was valid, but no images matched those specific filters.")
    else:
        print(f"\nAPI Error: {api_response.status_code} - {api_response.text}")

🚀 Sending the LLM's generated query to the NASA API...

🎉 Success! The LLM found your image. View it here:
https://pds-imaging.jpl.nasa.gov/tools/atlas/record?uri=atlas:pds3:mro:mars_reconnaissance_orbiter:/HiRISE/EDR/ESP/ORB_058400_058499/ESP_058410_2210/ESP_058410_2210_BG12_0.IMG
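
The cell above mixes three things: building the request body, calling the server, and digging the URI out of the response. One way to make this easier to test is to pull the first and last steps into small functions that need no network access. The sketch below assumes the same payload and response shapes used in the cell above; the function names `build_search_payload` and `first_viewer_link` are hypothetical, and the sample response is a hand-written stand-in, not real NASA data.

```python
from typing import Optional

ATLAS_VIEWER = "https://pds-imaging.jpl.nasa.gov/tools/atlas/record?uri="


def build_search_payload(lucene_query: str, size: int = 1) -> dict:
    """Wrap a Lucene query string in the request body used in the cell above."""
    return {"query": {"query_string": {"query": lucene_query}}, "size": size}


def first_viewer_link(response_json: dict) -> Optional[str]:
    """Return a viewer link for the first hit, or None if nothing usable came back."""
    hits = response_json.get("hits", {}).get("hits", [])
    if not hits:
        return None
    uri = hits[0].get("_source", {}).get("uri")
    # Guard against a hit with no "uri" so we never print a link ending in "None".
    return f"{ATLAS_VIEWER}{uri}" if uri else None


# A made-up response with the same nesting as the real one (hypothetical URI).
fake_response = {"hits": {"hits": [{"_source": {"uri": "atlas:example:/path/IMAGE.IMG"}}]}}

print(build_search_payload("gather.common.mission:mro"))
print(first_viewer_link(fake_response))
print(first_viewer_link({"hits": {"hits": []}}))
```

With these in place, the network call shrinks to a single `requests.post(search_url, json=build_search_payload(generated_query), timeout=30)`, and the parsing logic can be checked without contacting the server.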


### Congratulations! 
You have successfully built an AI-powered search pipeline! 

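By separating the "brain" (the LLM translating the query) from the "hands" (Python executing the API request), you've created a pipeline that is easy to debug and cheap to run: the LLM makes one short call per request, and plain Python handles everything else. The result is a way to search complex scientific databases using simple human language.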

**What's Next?** In advanced AI engineering, developers use protocols like **MCP (Model Context Protocol)** to allow the LLM to write the query, run the API, and read the results all on its own in a continuous loop. But the core logic, translating intent into API-readable syntax, is exactly what you just mastered here!