Skip to content

Team-ASHTOJ/Tethys

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Tethys: Conversational ARGO Float Data Explorer 🌊💬

Tethys is a conversational AI system designed to make exploring ARGO float oceanographic data simple, intuitive, and interactive. Instead of wrestling with NetCDF files or SQL queries, users can ask questions in plain English and receive answers, summaries, and visualizations.

This project bridges the gap between vast ocean datasets and the researchers, students, and enthusiasts who want to understand them quickly and effectively.

🚀 Features • Natural Language Queries Ask: “Show me active floats in the Indian Ocean with high salinity readings last month.” Get: A direct, readable answer. • Interactive Map Visualizations View float locations, trajectories, and parameters (temperature, salinity, pressure, etc.) on a world map. • Data Summarization Generate quick summaries of float data, missions, or regional trends. • Semantic Search (Vector DB) Retrieve relevant information using embeddings, even if your query doesn’t match exact keywords.

🏗️ System Architecture

Tethys is powered by a Retrieval-Augmented Generation (RAG) pipeline: 1. Data Ingestion • Download and preprocess ARGO NetCDF files. • Convert them into Parquet format and store them in PostgreSQL. • Create embeddings and push them into a Chroma/FAISS vector database. 2. User Query • User interacts through the frontend dashboard (chat + map). 3. Backend Engine • Query is embedded → relevant chunks retrieved from vector DB + PostgreSQL. 4. LLM Augmentation • Retrieved data + query are sent to a language model. • Generates a coherent, human-readable response. 5. Response & Visualization • Final answer and visual data (maps/graphs) are sent back to the frontend.

📊 Example Queries • “Show me floats near the Bay of Bengal with high surface temperature.” • “Summarize salinity trends in the Arabian Sea in 2023.” • “List active floats with depth > 1000m in the Indian Ocean.”

⚙️ Tech Stack • Data Handling: Python, xarray, pandas, parquet • Databases: PostgreSQL (relational), Chroma/FAISS (vector) • Embeddings/LLM: Sentence Transformers / HuggingFace models • Frontend: Streamlit, Plotly, Leaflet • Backend: Retrieval-Augmented Generation (RAG)

About

Conversational AI model for ARGO data

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages