beallio/wherewolf

Wherewolf

License: MIT

A production-grade, local SQL workbench for querying files (CSV, Parquet, JSON, Excel) using DuckDB or Spark.

Features

  • Multi-Engine Support: Execute SQL via DuckDB (local) or Spark (local[*]). Native support for CSV, Parquet, JSON, and Excel (.xlsx, .xls).
  • 📁 Dataset Catalog: File browser with directory-first sorting, folder icons, and extension filtering.
  • 🔗 Multi-Table Queries: Perform JOINs, unions, and subqueries across different file formats in a single session.
  • 📊 Schema & Metadata HUD: Instant visibility of column names and data types for any dataset in your catalog.
  • SQL Translation: Real-time translation between DuckDB and SparkSQL dialects using SQLGlot.
  • Modern UI: Distraction-free interface with a hidden toolbar, reduced whitespace, and clear visual hierarchy.
  • Safe Preview: Scrollable results limited to 1000 rows.
  • Query History: Persists past queries in ~/.wherewolf/history.json.
  • Export: Download query results as CSV, Excel, or Parquet.
  • Execution Metrics: Tracks row count and execution time.
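
The Safe Preview, Execution Metrics, and Query History features can be pictured with a small standard-library sketch. This is not Wherewolf's actual implementation: the function names `run_with_metrics` and `append_history`, the `execute` callable, and the layout of the history file (a plain JSON list of query strings) are assumptions made for illustration. Only the 1000-row preview limit and the `~/.wherewolf/history.json` path come from the list above.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable

PREVIEW_LIMIT = 1000  # preview cap stated in the feature list
HISTORY_PATH = Path.home() / ".wherewolf" / "history.json"  # path from the feature list


def run_with_metrics(execute: Callable[[str], Iterable[tuple]], sql: str) -> dict:
    """Run `sql` through a caller-supplied `execute` (hypothetical) and time it."""
    start = time.perf_counter()
    rows = list(execute(sql))
    elapsed = time.perf_counter() - start
    return {
        "preview": rows[:PREVIEW_LIMIT],  # only the first 1000 rows are shown
        "row_count": len(rows),           # full row count is still reported
        "seconds": elapsed,
    }


def append_history(sql: str, path: Path = HISTORY_PATH) -> list[str]:
    """Append `sql` to a JSON list on disk (assumed file layout) and return the list."""
    path.parent.mkdir(parents=True, exist_ok=True)
    history = json.loads(path.read_text()) if path.exists() else []
    history.append(sql)
    path.write_text(json.dumps(history, indent=2))
    return history
```

Passing the engine in as a callable keeps the sketch independent of DuckDB or Spark; any function that takes a SQL string and yields rows would fit.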

Wherewolf Screenshot

Installation

Ensure you have uv installed.

From PyPI (Recommended)

uv tool install wherewolf
wherewolf

From Source

git clone https://github.com/beallio/wherewolf.git
cd wherewolf
uv sync

Usage

If running from source:

uv run streamlit run src/wherewolf/app.py

  1. Use the Manage Dataset Catalog section in the sidebar to browse and add files.
  2. Each file is assigned an alias (e.g., users, orders).
  3. Write your SQL query using these aliases in the editor.
  4. Click Run to execute.
  5. View results, execution metrics, or switch the Metadata Focus to inspect other schemas.
  6. Export or view the translated SQL if needed.
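
To make steps 2 and 3 concrete, here is a rough sketch of how a file might be mapped to an alias and exposed to DuckDB as a view. The alias rule, the helper names `alias_for` and `view_statement`, and the column names in the sample query are hypothetical; Wherewolf may assign aliases differently. The DuckDB table functions `read_csv_auto`, `read_parquet`, and `read_json_auto` are real, and Excel files are left out of this sketch.

```python
import re
from pathlib import Path

# DuckDB table functions per file extension (Excel intentionally omitted here).
READERS = {
    ".csv": "read_csv_auto",
    ".parquet": "read_parquet",
    ".json": "read_json_auto",
}


def alias_for(path: str) -> str:
    """Derive a SQL-safe alias from a file name (hypothetical naming rule)."""
    stem = Path(path).stem.lower()
    alias = re.sub(r"\W+", "_", stem).strip("_")
    # Identifiers should not be empty or start with a digit, so add a prefix.
    if not alias or alias[0].isdigit():
        alias = f"t_{alias}"
    return alias


def view_statement(path: str) -> str:
    """Build a DuckDB CREATE VIEW statement exposing the file under its alias."""
    reader = READERS[Path(path).suffix.lower()]
    escaped = path.replace("'", "''")  # escape single quotes inside the SQL literal
    return (
        f"CREATE OR REPLACE VIEW {alias_for(path)} "
        f"AS SELECT * FROM {reader}('{escaped}')"
    )


if __name__ == "__main__":
    for f in ["data/users.csv", "data/orders.parquet"]:
        print(view_statement(f))
    # With both views registered, a cross-format JOIN uses the aliases directly
    # (column names below are made up for the example).
    print("SELECT u.name, o.total FROM users u JOIN orders o ON o.user_id = u.id")
```

The generated statements could be executed on a DuckDB connection before running the user's query, so the editor only ever sees the short aliases.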

Development

Run tests:

uv run pytest

Lint/Format:

ruff check . --fix
ruff format .

For information on how to release new versions, see RELEASING.md.

Dependencies

  • streamlit
  • duckdb
  • pyspark
  • ibis-framework
  • sqlglot
  • pandas
  • pyarrow
  • openpyxl

About

Wherewolf is a production-grade, local SQL workbench designed for data engineers and analysts to query local files (CSV, Parquet, JSON) with ease. Built with Streamlit, it provides a unified interface to execute SQL against either DuckDB or PySpark engines without requiring complex setup.
