Release v0.1.0: feat: lakesense — sketch-based ML data observability for the lakehouse · ramannanda9/lakesense

ramannanda9 tagged this 30 Mar 19:45

Core architecture with provider-based sketch profiling and LLM interpretation:

- Sketch engine: MinHash, HLL, KLL via datasketches with mergeable baselines
- Sketch providers: pandas, spark (distributed), streaming
- Column profiling: deterministic scalar metrics per column per run
- Signal detection: Jaccard decay, cardinality shifts, null rate, schema drift
- Two-tier LLM interpretation: Anthropic and OpenAI providers
- Storage backends: Parquet (zero-infra) and DuckDB (optional)
- Heuristic rules engine for fast severity classification
- CI pipeline with per-provider test matrix and ruff/mypy checks
- Tag-triggered PyPI publish workflow with OIDC trusted publishing
- Pre-commit hooks with ruff linting and formatting (line-length 120)
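
To make the sketch engine, signal detection, and heuristic rules items above more concrete, here is a minimal standard-library sketch of MinHash-based Jaccard estimation with a mergeable baseline and a threshold rule. It is an illustration only, not lakesense's API: the function names, the 128-slot signature length, and the 0.8 / 0.5 thresholds are all assumptions, and the release itself uses the datasketches library rather than this hand-rolled hashing.

```python
import hashlib

NUM_PERM = 128  # hypothetical signature length, not taken from lakesense


def _hash(value: str, seed: int) -> int:
    """64-bit hash of `value`, salted with `seed` to simulate one permutation."""
    digest = hashlib.blake2b(
        value.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(values, num_perm=NUM_PERM):
    """Build a MinHash signature: the minimum hash per seed over distinct values."""
    distinct = {str(v) for v in values}
    if not distinct:
        raise ValueError("cannot sketch an empty column")
    return [min(_hash(v, seed) for v in distinct) for seed in range(num_perm)]


def merge_signatures(sig_a, sig_b):
    """Merge two signatures; the result equals the signature of the union."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return [min(a, b) for a, b in zip(sig_a, sig_b)]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def classify_severity(jaccard, warn_below=0.8, critical_below=0.5):
    """Hypothetical heuristic rule: map a similarity score to a severity label."""
    if jaccard < critical_below:
        return "critical"
    if jaccard < warn_below:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Baseline built from two earlier runs, merged without rescanning the data.
    baseline = merge_signatures(
        minhash_signature(range(0, 600)),
        minhash_signature(range(400, 1000)),
    )
    current = minhash_signature(range(500, 1500))
    score = estimate_jaccard(baseline, current)
    print(f"estimated Jaccard: {score:.2f}, severity: {classify_severity(score)}")
```

Because merging takes the slot-wise minimum, a baseline can be accumulated run by run and still compared against a new run in constant space, which is the property the "mergeable baselines" item relies on. A cheap threshold rule like `classify_severity` can then decide whether a run is worth escalating to a slower interpretation step.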
