GitHub - Team-ASHTOJ/Tethys: Conversational AI model for ARGO data

Tethys: Conversational ARGO Float Data Explorer 🌊💬

Tethys is a conversational AI system designed to make exploring ARGO float oceanographic data simple, intuitive, and interactive. Instead of wrestling with NetCDF files or SQL queries, users can ask questions in plain English and receive answers, summaries, and visualizations.

This project bridges the gap between vast ocean datasets and the researchers, students, and enthusiasts who want to understand them quickly and effectively.

⸻

🚀 Features • Natural Language Queries Ask: “Show me active floats in the Indian Ocean with high salinity readings last month.” Get: A direct, readable answer. • Interactive Map Visualizations View float locations, trajectories, and parameters (temperature, salinity, pressure, etc.) on a world map. • Data Summarization Generate quick summaries of float data, missions, or regional trends. • Semantic Search (Vector DB) Retrieve relevant information using embeddings, even if your query doesn’t match exact keywords.

🏗️ System Architecture

Tethys is powered by a Retrieval-Augmented Generation (RAG) pipeline: 1. Data Ingestion • Download and preprocess ARGO NetCDF files. • Convert them into Parquet format and store them in PostgreSQL. • Create embeddings and push them into a Chroma/FAISS vector database. 2. User Query • User interacts through the frontend dashboard (chat + map). 3. Backend Engine • Query is embedded → relevant chunks retrieved from vector DB + PostgreSQL. 4. LLM Augmentation • Retrieved data + query are sent to a language model. • Generates a coherent, human-readable response. 5. Response & Visualization • Final answer and visual data (maps/graphs) are sent back to the frontend.

⸻

📊 Example Queries • “Show me floats near the Bay of Bengal with high surface temperature.” • “Summarize salinity trends in the Arabian Sea in 2023.” • “List active floats with depth > 1000m in the Indian Ocean.”

⚙️ Tech Stack • Data Handling: Python, xarray, pandas, parquet • Databases: PostgreSQL (relational), Chroma/FAISS (vector) • Embeddings/LLM: Sentence Transformers / HuggingFace models • Frontend: Streamlit, Plotly, Leaflet • Backend: Retrieval-Augmented Generation (RAG)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
Tethys		Tethys
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages