Add chdb skills: DataStore (pandas API) and SQL #14
Add two new skills for chdb, the in-process ClickHouse engine for Python:
- chdb-datastore: Pandas-compatible DataStore API. Drop-in pandas replacement with ClickHouse performance, supporting 16+ data sources and cross-source joins.
- chdb-sql: Raw ClickHouse SQL API. Covers chdb.query(), Session, DB-API 2.0, parametrized queries, UDFs, streaming, and all ClickHouse table functions.
Each skill includes SKILL.md, API references, runnable examples, metadata.json, README.md, and a verify_install.py script.
evellasques
left a comment
Really nice addition!
I just added two minor comments (feel free to ignore).
- Fix .where() description: it follows pandas semantics (masks non-matching values with NaN) rather than being an alias for .filter()
- Add .where() usage example showing correct behavior
- Add sorted values assertion to verify_install.py check_sort()
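The difference between masking and filtering described in the .where() fix can be illustrated without chdb installed. The sketch below is a pure-Python stand-in, not the DataStore implementation: `where_mask` and `filter_rows` are hypothetical helpers, and the sample values are made up for illustration. It only shows the pandas-style contract, where `.where()` keeps the original length and replaces non-matching values with NaN, while filtering drops them.

```python
import math


def where_mask(values, predicate):
    """Keep the input length; replace values failing the predicate with NaN."""
    return [v if predicate(v) else math.nan for v in values]


def filter_rows(values, predicate):
    """Drop values failing the predicate, so the result may be shorter."""
    return [v for v in values if predicate(v)]


# Illustrative data only (not taken from the skill's examples).
ages = [25, 41, 33, 19]


def is_over_30(age):
    return age > 30


masked = where_mask(ages, is_over_30)
filtered = filter_rows(ages, is_over_30)

print(masked)    # same length as the input, NaN where the condition is false
print(filtered)  # only the matching values remain
```

A check that only compares row counts would not tell these two behaviors apart on an all-matching input, which is presumably why a usage example was added alongside the corrected description.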
Great suggestions, fixed.
Pull request overview
Adds two new agent skills—chdb-datastore (pandas-compatible DataStore API) and chdb-sql (in-process ClickHouse SQL for Python)—and updates repository docs to advertise and describe these skills alongside existing ClickHouse best-practices guidance.
Changes:
- Introduces skills/chdb-datastore/ with SKILL definition, API/connectors references, runnable examples, and an install verification script.
- Introduces skills/chdb-sql/ with SKILL definition, SQL/table-function/API references, runnable examples, and an install verification script.
- Updates root documentation (README.md, AGENTS.md) to include both new skills in the repo overview/structure.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| skills/chdb-sql/scripts/verify_install.py | Adds an environment verification script for chdb SQL usage. |
| skills/chdb-sql/references/table-functions.md | Documents ClickHouse table functions relevant to chdb SQL usage. |
| skills/chdb-sql/references/sql-functions.md | Provides a quick reference for common ClickHouse SQL functions. |
| skills/chdb-sql/references/api-reference.md | Documents chdb SQL API surface (query/session/dbapi/etc.). |
| skills/chdb-sql/metadata.json | Adds skill metadata (version/org/date/abstract/references). |
| skills/chdb-sql/examples/examples.md | Adds runnable usage examples and expected outputs for SQL workflows. |
| skills/chdb-sql/SKILL.md | Defines activation guidance + quick-start for the SQL skill. |
| skills/chdb-sql/README.md | Maintainer/trigger-phrase overview for the SQL skill. |
| skills/chdb-datastore/scripts/verify_install.py | Adds an environment verification script for DataStore usage. |
| skills/chdb-datastore/references/connectors.md | Documents DataStore connectors across files/cloud/databases/lakes. |
| skills/chdb-datastore/references/api-reference.md | Documents DataStore API surface and pandas-like operations. |
| skills/chdb-datastore/metadata.json | Adds skill metadata (version/org/date/abstract/references). |
| skills/chdb-datastore/examples/examples.md | Adds runnable usage examples and expected outputs for DataStore workflows. |
| skills/chdb-datastore/SKILL.md | Defines activation guidance + quick-start for the DataStore skill. |
| skills/chdb-datastore/README.md | Maintainer/trigger-phrase overview for the DataStore skill. |
| README.md | Updates repo overview and skill list to include chdb skills. |
| AGENTS.md | Updates repository structure documentation to include both new skills. |
pjhampton
left a comment
Generally, this is a good addition. You should know that we can't support it in our code product yet, but we expect to soon. Another thing we have in the CI of this project is executing the code samples, so we catch it if the skills go stale. I recommend looking into that as a follow-up PR.
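A check of the kind mentioned in this comment could look roughly like the sketch below. It is an assumption about the approach, not this repository's actual CI: `extract_python_blocks` and `run_blocks` are hypothetical names, and the sample document is built inline for illustration. The idea is to pull fenced Python blocks out of a Markdown file and execute each one in a fresh namespace, collecting failures.

```python
import re

# Build the fence marker programmatically so this sample can itself
# live inside a fenced block.
FENCE = "`" * 3
BLOCK_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(markdown_text):
    """Return the source of every fenced python block, in document order."""
    return BLOCK_RE.findall(markdown_text)


def run_blocks(markdown_text):
    """Execute each block in its own namespace; return (index, error) pairs."""
    failures = []
    for index, source in enumerate(extract_python_blocks(markdown_text)):
        try:
            code = compile(source, f"<block {index}>", "exec")
            exec(code, {"__name__": "__sample__"})
        except Exception as exc:  # report any failure instead of stopping
            failures.append((index, repr(exc)))
    return failures


# Illustrative document: the first sample passes, the second is "stale".
sample_doc = "\n".join([
    "Intro text.",
    FENCE + "python",
    "total = 1 + 1",
    "assert total == 2",
    FENCE,
    "More text.",
    FENCE + "python",
    "raise ValueError('stale example')",
    FENCE,
])

failures = run_blocks(sample_doc)
print(failures)
```

Because `exec` runs whatever the document contains, a job like this should only be pointed at files from the repository itself, ideally in an isolated CI runner.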
@auxten should we merge?
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- Resolve README.md conflict: include both architecture-advisor (from upstream) and chdb skills sections
- Fix expected output order in chdb-sql examples (Bob=178 before Carol=177 to match ORDER BY total DESC)
- Fix Session docs inconsistency: use Session() as canonical in-memory form (not Session(path=":memory:"))
- Add try/finally to verify_install.py check_session() for cleanup
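Two of these fixes, the descending output order and the try/finally cleanup, can be sketched together. The example below uses Python's built-in sqlite3 as a stand-in so it runs without chdb; it is not the code in verify_install.py, and the `scores` table and `ordered_totals` function are hypothetical. Only the two totals (Bob=178, Carol=177) come from the commit message.

```python
import sqlite3


def ordered_totals():
    """Query totals in descending order, closing the connection even on error."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE scores (name TEXT, total INTEGER)")
        # Insert in the "wrong" order on purpose; ORDER BY decides the output.
        conn.executemany(
            "INSERT INTO scores VALUES (?, ?)",
            [("Carol", 177), ("Bob", 178)],
        )
        cursor = conn.execute(
            "SELECT name, total FROM scores ORDER BY total DESC"
        )
        return cursor.fetchall()
    finally:
        # Runs whether the queries succeed or raise, so nothing is leaked.
        conn.close()


rows = ordered_totals()
print(rows)  # [('Bob', 178), ('Carol', 177)]
```

The same shape applies to any session-like object that needs explicit cleanup: acquire it before `try`, and release it in `finally`.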
@pjhampton Please merge; most suggestions from Copilot are fixed.
Summary
- Add chdb-datastore skill: Pandas-compatible DataStore API for chdb. Drop-in pandas replacement backed by ClickHouse (import chdb.datastore as pd), supporting 16+ data sources (MySQL, PostgreSQL, S3, MongoDB, ClickHouse, Iceberg, Delta Lake, etc.) and 10+ file formats with cross-source joins.
- Add chdb-sql skill: In-process ClickHouse SQL API for Python. Covers chdb.query(), Session, DB-API 2.0, parametrized queries, UDFs, streaming, and all ClickHouse table functions.
- Update root README.md and AGENTS.md to include the new skills.
Skill Structure
Each skill follows the agent-skills format with:
- SKILL.md
- metadata.json
- README.md
- references/*.md
- examples/examples.md
- scripts/verify_install.py
Why Two Skills?
The skills are split by usage pattern so agents load only what's relevant:
Both cross-reference each other in their SKILL.md so the agent knows when to switch.
Test Plan
- SKILL.md frontmatter parses correctly (name, description, license, metadata fields)
- python scripts/verify_install.py in both skill directories (requires pip install chdb)
- npx skills add and verify agent activation on trigger phrases
- npx skills add auxten/agent-skills --list correctly detected all 3 skills (chdb-datastore, chdb-sql, clickhouse-best-practices)
- npx skills add --all installed 3 skills to 42 agent directories via symlinks
- .claude/skills/, .windsurf/skills/ etc. contain correct symlinks
- skills-lock.json generated with correct source and hashes
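The frontmatter item in the test plan could be automated with a small check like the one below. This is a minimal sketch under stated assumptions: it handles only flat `key: value` lines between `---` markers and is not a full YAML parser, `parse_frontmatter` and `missing_fields` are hypothetical helpers, and the sample values are placeholders rather than the real SKILL.md contents. Only the four field names come from the test plan.

```python
REQUIRED_FIELDS = ("name", "description", "license", "metadata")


def parse_frontmatter(text):
    """Return top-level keys from a '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Skip nested (indented) lines and lines without a key separator.
        if line.startswith((" ", "\t")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    raise ValueError("missing closing '---'")


def missing_fields(text):
    """List the required top-level fields absent from the frontmatter."""
    fields = parse_frontmatter(text)
    return [name for name in REQUIRED_FIELDS if name not in fields]


# Placeholder document; the values are hypothetical.
sample = """---
name: chdb-sql
description: hypothetical placeholder description
license: hypothetical-license
metadata:
  version: "0.0.0"
---
# Body
"""

print(missing_fields(sample))  # []
```

For real use, a YAML library would be a safer choice than this line-based parser, since frontmatter values can span multiple lines or contain colons.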