# Eclipse Chatbot: LLM-Powered Viewing Advisor
Ask where and when to view an eclipse from your location, or query a specific eclipse at a specific place. The chatbot uses your eclipse database and an LLM to generate viewing advice.

## Setup and API connection

Import dependencies, load the API key from `.env`, and run a quick connection test against the OSU LiteLLM proxy to confirm authentication works before the chatbot is used.

- **`litellm`**: lightweight wrapper that routes LLM calls through the university proxy.
- **`ipywidgets`**: provides the in-notebook chat UI (text input, send button, output area).
- **`dotenv`**: loads `ASTRO1221_API_KEY` from a local `.env` file so credentials stay out of the code.

In [3]:
import warnings
warnings.filterwarnings("ignore", category=UserWarning, module="pydantic")

import os
import json
import re
import numpy as np
from datetime import datetime
from dotenv import load_dotenv
import litellm
from litellm import completion
import ipywidgets as widgets
from IPython.display import display, HTML

# Enable debug mode only if environment variable is set
if os.environ.get("LITELLM_DEBUG", "false").lower() == "true":
    litellm._turn_on_debug()

# Load API key from .env
load_dotenv()
api_key = os.environ.get("ASTRO1221_API_KEY")

# Ohio State's LiteLLM proxy server
API_BASE = "https://litellmproxy.osu-ai.org"

if not api_key:
    print("⚠  ASTRO1221_API_KEY not found in .env; check your file.")
else:
    print(f"✓ API key loaded (ends with ...{api_key[-4:]})")

# Quick connection test via OSU proxy
try:
    test_resp = completion(
        model="openai/GPT-4.1-mini",
        messages=[{"role": "user", "content": "Say hi in 3 words."}],
        api_base=API_BASE,
        api_key=api_key,
    )
    print(f"✓ Connected to OSU proxy! Reply: {test_resp.choices[0].message.content}")
except Exception as e:
    print(f"✗ LiteLLM error: {e}")

✓ API key loaded (ends with ...faow)
✓ Connected to OSU proxy! Reply: Hello there, friend!


## Load eclipse data and create catalog

Reads the pre-built `eclipse_data.json` (generated by `EclipseData.ipynb`) and wraps it in an `EclipseCatalog` instance. The catalog provides all search, visibility, and summary methods used by the chatbot.
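The source of `EclipseCatalog` is not reproduced in this notebook. As a rough mental model only, a container exposing the same surface (`len()`, indexing, date parsing, date lookup) could look like the sketch below. The class name `MiniCatalog`, its method bodies, and the three sample records are illustrative assumptions; only the `date_raw` and `type` keys and the `"YYYY Mon DD"` date format are taken from the notebook.

```python
from datetime import datetime


class MiniCatalog:
    """Hypothetical stand-in for EclipseCatalog (not the real implementation)."""

    def __init__(self, eclipses):
        self.eclipses = list(eclipses)

    def __len__(self):
        return len(self.eclipses)

    def __getitem__(self, index):
        return self.eclipses[index]

    @staticmethod
    def parse_date(date_str):
        """Parse 'YYYY Mon DD' into a datetime, or return None if malformed."""
        try:
            return datetime.strptime(date_str, "%Y %b %d")
        except ValueError:
            return None

    def find_by_date(self, query):
        """Return records whose date_raw starts with the query (e.g. a year)."""
        return [e for e in self.eclipses if e["date_raw"].startswith(query)]


# Small hand-written sample, not read from eclipse_data.json
sample = [
    {"date_raw": "2017 Aug 21", "type": "Total"},
    {"date_raw": "2023 Oct 14", "type": "Annular"},
    {"date_raw": "2024 Apr 08", "type": "Total"},
]
mini = MiniCatalog(sample)
```

With this shape, `len(mini)`, `mini[0]["date_raw"]`, and `mini.find_by_date("2024")` behave like the calls used in the cells of this notebook, which is all the chatbot code relies on.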

In [4]:
# ============================================================
# Load eclipse database + create catalog
# ============================================================
from eclipse_catalog import EclipseCatalog

with open("eclipse_data.json") as f:
    data = json.load(f)

eclipse_list = data["eclipse_list"]
catalog = EclipseCatalog(eclipse_list)
print(f"✓ Loaded {len(catalog)} eclipses  ({catalog[0]['date_raw']} → {catalog[-1]['date_raw']})")

✓ Loaded 224 eclipses  (2001 Jun 21 → 2100 Sep 04)


## Verify catalog methods

A quick sanity check that the `EclipseCatalog` methods work. All search, visibility, summary, and local-view classification logic now lives in `eclipse_catalog.py` instead of being defined inline in this notebook.
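The quick test in the next cell prints a distance in kilometres for each result. How `EclipseCatalog` computes that value is not shown in this notebook. One common way to measure the distance between two latitude/longitude points is the haversine great-circle formula, sketched here as an assumption rather than as the catalog's actual method. The function name `haversine_km` and the mean Earth radius of 6371 km are choices made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# A quarter of the way around the equator
quarter = haversine_km(0.0, 0.0, 0.0, 90.0)
```

For a spherical Earth of this radius, `quarter` is one quarter of the circumference, about 10,008 km.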

In [5]:
# ============================================================
# ECLIPSE SEARCH & GEOMETRY - now via EclipseCatalog
# ============================================================
# All search, visibility, and summary logic lives in the
# EclipseCatalog class (eclipse_catalog.py).  Usage:
#   catalog.find_next_visible(lat, lon)
#   catalog.find_by_date("2017 Aug 21")
#   catalog.summary(ecl, obs_lat, obs_lon)
#   catalog.local_view(ecl, lat, lon)
#   catalog.is_visible_from(ecl, lat, lon)
#   catalog.parse_date(date_str)

# Quick test
print("✓ Catalog methods ready.")
test = catalog.find_next_visible(30.0, -97.0, n=2)  # Austin, TX
for ecl, dt, dist in test:
    print(f"  Next from Austin: {ecl['date_raw']}  {ecl['type']}  ({dist:,.0f} km away)")

✓ Search functions loaded.
  Next from Austin: 2045 Aug 12  Total  (1,820 km away)
  Next from Austin: 2052 Mar 30  Total  (1,072 km away)


## LLM integration: prompt builder and chat function

This cell wires up the chatbot logic:

- **`SYSTEM_PROMPT`**: instructs the LLM to act as an eclipse advisor and use only the provided database context.
- **`build_eclipse_context()`**: analyzes the user's message for coordinates, city names, dates, and eclipse types, then pulls matching records from the `EclipseCatalog` to inject as context for the LLM.
- **`chat()`**: appends the augmented message to the conversation history, calls the LLM via the OSU LiteLLM proxy, and returns the reply along with any detected observer coordinates.

The context-building step is what makes this a **Retrieval-Augmented Generation (RAG)** pattern: the LLM never sees the full database, only the relevant subset for each question.
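The retrieve-then-augment idea can be illustrated in isolation, without the catalog or an LLM call. The sketch below filters a few toy records by eclipse type and attaches the hits to the user message using the same `[ECLIPSE DATABASE CONTEXT]` marker as the notebook. The helper names `retrieve` and `build_messages` and the toy records are assumptions made for this example; the notebook's real logic is in `build_eclipse_context()` and `chat()`.

```python
def retrieve(records, message, limit=3):
    """Return up to `limit` records whose type is named in the message."""
    msg = message.lower()
    hits = [r for r in records if r["type"].lower() in msg]
    return hits[:limit]


def build_messages(system_prompt, message, hits):
    """Attach the retrieved records to the user message as plain-text context."""
    context = "\n".join(f"{r['date_raw']}: {r['type']}" for r in hits)
    augmented = f"{message}\n\n[ECLIPSE DATABASE CONTEXT]\n{context}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": augmented},
    ]


# Toy records for illustration only
toy_records = [
    {"date_raw": "2017 Aug 21", "type": "Total"},
    {"date_raw": "2023 Oct 14", "type": "Annular"},
    {"date_raw": "2024 Apr 08", "type": "Total"},
]
question = "Tell me about total eclipses"
messages = build_messages(
    "You are an eclipse advisor.", question, retrieve(toy_records, question)
)
```

Only the two matching records end up in `messages`, so the model receives a small, relevant slice of the data rather than the whole list.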

In [6]:
# ============================================================
# LLM INTEGRATION - prompt builder + chat function
# ============================================================

SYSTEM_PROMPT = """You are an expert solar eclipse advisor. You help people find
and plan for solar eclipses based on a NASA catalog of 224 eclipses from 2001–2100.

When the user asks about eclipses, you will receive ECLIPSE DATA pulled from the
database as context. Use that data to give accurate, specific answers.

Your capabilities:
• Tell users the next visible eclipse(s) from their location
• Describe what a specific eclipse will look like from a given place
• Provide viewing advice (safety, best locations along the path, weather tips)
• Explain eclipse types (Total, Annular, Hybrid, Partial) and what they look like
• Suggest the best lat/lon coordinates for viewing a given eclipse

When recommending a viewing location, always include the latitude and longitude
so the user can plug them into the visualization tool. Format coordinates as:
  **Recommended viewing: XX.X°N, XX.X°E**

Keep answers concise but informative. Use the eclipse data provided; don't invent
eclipse dates or magnitudes."""


# Conversation history (persists across messages)
chat_history = []


def build_eclipse_context(user_message):
    """
    Analyze the user's message and pull relevant eclipse data to
    inject as context for the LLM.
    """
    msg = user_message.lower()
    context_parts = []

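    # --- Try to extract coordinates from the message ---
    obs_lat, obs_lon = None, None

    # Match patterns like "lat 30 lon -97", "30N 97W", "30.5, -97.2".
    # The labeled form is tried first. The bare form requires a separator
    # between the two numbers so a year like "2024" is not split in two.
    num = r'-?\d+(?:\.\d+)?'
    coord_patterns = [
        rf'\blat(?:itude)?\s*[:=]?\s*(?P<lat>{num})\s*,?\s*lon(?:gitude)?\s*[:=]?\s*(?P<lon>{num})',
        rf'(?<![\d.])(?P<lat>{num})\s*°?(?:\s*(?P<ns>[NS])\b)?(?:\s*,\s*|\s+)'
        rf'(?P<lon>{num})\s*°?(?:\s*(?P<ew>[EW])\b)?(?!\d)',
    ]
    for pat in coord_patterns:
        m = re.search(pat, user_message, re.IGNORECASE)
        if not m:
            continue
        lat, lon = float(m["lat"]), float(m["lon"])
        hemi = m.groupdict()
        # Hemisphere letters set the sign: S and W are negative
        if (hemi.get("ns") or "").upper() == "S":
            lat = -abs(lat)
        if (hemi.get("ew") or "").upper() == "W":
            lon = -abs(lon)
        # Reject out-of-range pairs such as the "8, 2024" in "April 8, 2024"
        if abs(lat) <= 90 and abs(lon) <= 180:
            obs_lat, obs_lon = lat, lon
            break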

    # --- Check for well-known city names → approximate coords ---
    city_coords = {
        "new york": (40.7, -74.0), "los angeles": (34.1, -118.2),
        "chicago": (41.9, -87.6), "houston": (29.8, -95.4),
        "austin": (30.3, -97.7), "dallas": (32.8, -96.8),
        "denver": (39.7, -105.0), "seattle": (47.6, -122.3),
        "miami": (25.8, -80.2), "atlanta": (33.7, -84.4),
        "london": (51.5, -0.1), "paris": (48.9, 2.3),
        "tokyo": (35.7, 139.7), "sydney": (-33.9, 151.2),
        "cairo": (30.0, 31.2), "mumbai": (19.1, 72.9),
        "beijing": (39.9, 116.4), "mexico city": (19.4, -99.1),
        "toronto": (43.7, -79.4), "berlin": (52.5, 13.4),
        "rome": (41.9, 12.5), "madrid": (40.4, -3.7),
        "san francisco": (37.8, -122.4), "phoenix": (33.4, -112.0),
        "boston": (42.4, -71.1), "washington": (38.9, -77.0),
        "nashville": (36.2, -86.8), "portland": (45.5, -122.7),
        "indianapolis": (39.8, -86.2), "cleveland": (41.5, -81.7),
    }
    for city, (clat, clon) in city_coords.items():
        if city in msg:
            obs_lat, obs_lon = clat, clon
            context_parts.append(f"[Detected city: {city.title()} → {clat}°N, {clon}°E]")
            break

    # --- "Next eclipse" query ---
    if any(kw in msg for kw in ["next", "upcoming", "when", "soonest", "future"]):
        if obs_lat is not None:
            results = catalog.find_next_visible(obs_lat, obs_lon, n=3)
            if results:
                context_parts.append(f"NEXT ECLIPSES VISIBLE FROM ({obs_lat}°N, {obs_lon}°E):")
                for ecl, dt, dist in results:
                    context_parts.append(catalog.summary(ecl, obs_lat, obs_lon))
                    context_parts.append("---")
            else:
                context_parts.append(f"No upcoming eclipses found visible from ({obs_lat}, {obs_lon}) in the database.")

    # --- Specific date query ---
    date_patterns = [
        r'(\d{4}\s+\w{3}\s+\d{1,2})',        # "2024 Apr 08"
        r'(\w+\s+\d{1,2},?\s+\d{4})',          # "April 8, 2024"
        r'(\d{4})',                              # just a year
    ]
    for pat in date_patterns:
        m = re.search(pat, user_message)
        if m:
            date_str = m.group(1)
            matches = catalog.find_by_date(date_str)
            if matches:
                context_parts.append(f"ECLIPSES MATCHING '{date_str}':")
                for ecl in matches[:5]:
                    context_parts.append(catalog.summary(ecl, obs_lat, obs_lon))
                    context_parts.append("---")
            break

    # --- Eclipse type query ---
    for etype in ["total", "annular", "hybrid", "partial"]:
        if etype in msg:
            type_eclipses = [e for e in catalog.eclipses if e["type"].lower() == etype]
            context_parts.append(f"DATABASE: {len(type_eclipses)} {etype} eclipses in catalog.")
            now = datetime.now()
            upcoming = [(e, catalog.parse_date(e["date_raw"]))
                        for e in type_eclipses
                        if catalog.parse_date(e["date_raw"]) and
                           catalog.parse_date(e["date_raw"]) > now][:3]
            for e, dt in upcoming:
                context_parts.append(catalog.summary(e, obs_lat, obs_lon))
                context_parts.append("---")
            break

    # --- General stats if no specific context was found ---
    if not context_parts:
        context_parts.append(
            f"DATABASE: {len(catalog)} solar eclipses from 2001–2100. "
            f"Types: Total (68), Annular (72), Hybrid (7), Partial (77). "
            f"Ask about a specific date, location, or eclipse type for detailed info."
        )

    return "\n".join(context_parts), obs_lat, obs_lon


def chat(user_message):
    """
    Send a message to the chatbot. Returns the assistant's reply
    and any extracted coordinates (lat, lon) or (None, None).
    """
    # Build eclipse context
    context, obs_lat, obs_lon = build_eclipse_context(user_message)

    # Append the retrieved context to the user message (the UI shows only the original text)
    augmented_msg = f"{user_message}\n\n[ECLIPSE DATABASE CONTEXT]\n{context}"

    chat_history.append({"role": "user", "content": augmented_msg})

    messages = [{"role": "system", "content": SYSTEM_PROMPT}] + chat_history

    try:
        response = completion(
            model="openai/GPT-4.1-mini",
            messages=messages,
            api_base=API_BASE,
            api_key=api_key,
        )
        reply = response.choices[0].message.content
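    except Exception as e:
        # Drop the failed turn so the error text is not replayed to the LLM later
        chat_history.pop()
        return f"⚠ LLM error: {e}", obs_lat, obs_lon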

    chat_history.append({"role": "assistant", "content": reply})

    return reply, obs_lat, obs_lon


print("✓ Chat function ready.")

✓ Chat function ready.


## Interactive chat interface

This cell builds a widget-based chat UI inside the notebook using `ipywidgets`:

- **`chat_output`**: scrollable output area that displays the conversation history.
- **`text_input` + `send_btn`**: text field and button for submitting questions.
- **`coord_display`**: shows the last latitude/longitude the chatbot detected, so you can copy them into the visualization notebook.

The `on_send` callback calls `chat()`, renders the reply, and updates the coordinate display if a location was detected. Type a question and press Enter (or click Send) to try it.
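The coordinate display prints signed values with fixed `°N` / `°E` suffixes, so Austin appears as `30.3°N, -97.7°E`. If hemisphere letters are preferred, a small helper along the lines below could format the pair instead. `format_coords` is a hypothetical name and is not part of the notebook.

```python
def format_coords(lat, lon):
    """Format signed decimal degrees with hemisphere letters, one decimal place."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return f"{abs(lat):.1f}°{ns}, {abs(lon):.1f}°{ew}"


austin = format_coords(30.3, -97.7)
sydney = format_coords(-33.9, 151.2)
```

The signed values are still what the visualization notebook needs as input, so a helper like this would only change what is shown to the reader.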

In [7]:
# ============================================================
# CHAT INTERFACE
# ============================================================
# Type a question and press Enter (or click Send).
# The chatbot will search the eclipse database, inject relevant
# data as context, and use the LLM to answer.
#
# Try:
#   "When is the next eclipse visible from Austin?"
#   "Tell me about the 2026 total eclipse"
#   "Where should I go to see the 2027 Aug total eclipse?"
#   "What eclipses can I see from Tokyo in the next 20 years?"
# ============================================================

# --- Widgets ---
chat_output = widgets.Output(layout={"border": "1px solid #444", "width": "100%",
                                      "min_height": "300px", "max_height": "500px",
                                      "overflow_y": "auto"})
text_input = widgets.Text(
    placeholder="Ask about an eclipse… (press Enter)",
    layout={"width": "80%"},
)
send_btn = widgets.Button(description="Send", button_style="primary",
                          layout={"width": "18%"})
coord_display = widgets.HTML(value="<i>No coordinates detected yet.</i>")

# Store last detected coords so user can copy them to the visualization
last_coords = {"lat": None, "lon": None}


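import html  # stdlib: escapes user text before it is rendered as HTML


def append_chat(role, text):
    """Append a message to the chat display."""
    with chat_output:
        if role == "user":
            # Escape user input so characters like < and & display literally
            display(HTML(
                f'<div style="margin:6px 0;padding:8px 12px;background:#1a3a5c;'
                f'color:#ddd;border-radius:10px;text-align:right;">'
                f'<b>You:</b> {html.escape(text)}</div>'
            ))
        else:
            # Assistant text arrives pre-formatted (newlines already turned into <br>)
            display(HTML(
                f'<div style="margin:6px 0;padding:8px 12px;background:#2a2a2a;'
                f'color:#eee;border-radius:10px;">'
                f'<b>🌒 Eclipse Bot:</b><br>{text}</div>'
            ))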


def on_send(_=None):
    msg = text_input.value.strip()
    if not msg:
        return
    text_input.value = ""

    append_chat("user", msg)

    # Show "thinking…"
    with chat_output:
        thinking = display(HTML('<i style="color:#888;">Thinking…</i>'), display_id=True)

    reply, lat, lon = chat(msg)

    # Replace "thinking" with actual reply
    with chat_output:
        if thinking:
            thinking.update(HTML(""))  # clear thinking indicator
        append_chat("assistant", reply.replace("\n", "<br>"))

    # Update coord display
    if lat is not None and lon is not None:
        last_coords["lat"] = lat
        last_coords["lon"] = lon
        coord_display.value = (
            f'<b>Last detected location:</b> {lat:.1f}°N, {lon:.1f}°E  '
            f'<span style="color:#888;">(use these in the Visualization notebook)</span>'
        )

text_input.continuous_update = False
text_input.observe(on_send, names='value')
send_btn.on_click(on_send)

# --- Layout ---
display(widgets.VBox([
    chat_output,
    widgets.HBox([text_input, send_btn]),
    coord_display,
]))

VBox(children=(Output(layout=Layout(border_bottom='1px solid #444', border_left='1px solid #444', border_right…