feat: add dataset list command #53
Merged
Lists the datasets ingested into the cluster (tables in training_test_datasets), reusing the dataset rm exec seam (SPDYExecutor + findRunningPod) to query the mysql pod via information_schema — so a never-pushed cluster lists empty instead of erroring. Bare 'tracebloc dataset list' uses the current kubeconfig context; --kubeconfig/--context/--namespace override it, and --output-json emits {namespace,release,count,datasets[]} on stdout (human output → stderr). Wired into the dataset subtree, the home screen, and the parent doc comment.
Tests are cluster-free: parseDatasetList (raw mysql output → []string), renderDatasetList (empty + populated), writeDatasetListJSON (shape + nil→[]).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
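The commit message describes parseDatasetList as a pure function from raw mysql output to []string. A minimal sketch of what such a parser might look like follows. It assumes the query prints one table name per line with no header row or table borders; the PR does not show the real implementation, so the body here is illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDatasetList turns raw mysql client output into a list of names.
// Assumption: one table name per line, no header row, no ASCII borders.
// Blank lines and surrounding whitespace (including a trailing \r) are dropped.
func parseDatasetList(raw string) []string {
	var names []string // stays nil for empty output; the JSON writer handles nil→[]
	for _, line := range strings.Split(raw, "\n") {
		name := strings.TrimSpace(line)
		if name == "" {
			continue
		}
		names = append(names, name)
	}
	return names
}

func main() {
	// The dataset names below are made up for the example.
	fmt.Println(parseDatasetList("cats_vs_dogs\r\n\nmnist_train\n"))
	fmt.Println(len(parseDatasetList("")))
}
```

Because the function takes a string and returns a slice, it can be tested without a cluster, which matches the "cluster-free" claim above.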
👋 Heads-up: the Code review queue is at 18 / 8, above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 860c271. Configure here.
saadqbal approved these changes on Jun 4, 2026
dataset list --output-json now emits a JSON error object on early-failure paths (kubeconfig, no parent release, cluster query), not just on success — mirroring the dataset push fix from #49. runDatasetList uses a named return + jsonEmitted flag + a defer; adds writeDatasetListErrorJSON. Covered by TestRunDatasetList_OutputJSONEarlyFailureEmitsJSON. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
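The named return + jsonEmitted flag + defer combination mentioned above can be sketched roughly as follows. This is an illustration of the pattern, not the PR's runDatasetList: the function signature, the payload types listResult and listError, and the helper names are all hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
)

// Hypothetical payloads; the real JSON fields are not shown in the PR.
type listResult struct {
	Datasets []string `json:"datasets"`
}
type listError struct {
	Error string `json:"error"`
}

// runList uses a named return so the deferred function sees the final error.
// jsonEmitted stops it from writing a second JSON document after success.
func runList(stdout, stderr io.Writer, outputJSON bool, query func() ([]string, error)) (err error) {
	jsonEmitted := false
	defer func() {
		if err != nil && outputJSON && !jsonEmitted {
			// Best effort: the caller still receives the original error.
			_ = json.NewEncoder(stdout).Encode(listError{Error: err.Error()})
		}
	}()

	datasets, err := query()
	if err != nil {
		return fmt.Errorf("cluster query failed: %w", err)
	}
	if !outputJSON {
		for _, name := range datasets {
			fmt.Fprintln(stderr, name) // human output goes to stderr
		}
		return nil
	}
	if err = json.NewEncoder(stdout).Encode(listResult{Datasets: datasets}); err != nil {
		return err
	}
	jsonEmitted = true
	return nil
}

// Stand-in queries for the example (hypothetical data).
func failingQuery() ([]string, error) { return nil, errors.New("mysql pod not found") }
func okQuery() ([]string, error)      { return []string{"mnist_train"}, nil }

// runToString runs runList against an in-memory stdout and returns its contents.
func runToString(outputJSON bool, query func() ([]string, error)) (string, error) {
	var buf bytes.Buffer
	err := runList(&buf, io.Discard, outputJSON, query)
	return buf.String(), err
}

func main() {
	out, err := runToString(true, failingQuery)
	fmt.Print(out)
	fmt.Fprintln(os.Stderr, "error:", err)
}
```

The point of the defer is that every early return is covered without repeating the JSON-error write at each failure site.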

Adds tracebloc dataset list, a read-only listing of the datasets ingested into the cluster (the tables in training_test_datasets). Scope A: names only.

Behaviour
- tracebloc dataset list runs against your current kubeconfig context + its namespace.
- --kubeconfig / --context / --namespace are optional overrides (zero-value-safe, same as cluster info); --output-json emits {namespace, release, count, datasets[]} on stdout (human output → stderr).
- The empty state points toward dataset push; --help links the dashboard (https://ai.tracebloc.io/metadata) for the full catalog.

Mechanism
Reuses the dataset rm exec seam (push.SPDYExecutor + findRunningPod + IngestionDatabase) to run one query in the mysql pod. Querying information_schema (not SHOW TABLES) means a never-pushed cluster returns an empty list, not an error. Raw output → []string via a pure, unit-tested parser.

Exit codes
0 listed (incl. empty) · 3 kubeconfig · 4 no parent release in the namespace · 7 cluster query failed.

Verification
make ci green. Cluster-free tests: parseDatasetList, renderDatasetList (empty + populated), writeDatasetListJSON (shape + nil→[]). --output-json produced clean JSON on stdout (banner on stderr).

🤖 Generated with Claude Code
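The "nil→[]" case in the verification notes matters because Go's encoding/json renders a nil slice as null, not []. A small sketch of the {namespace, release, count, datasets[]} payload with that normalisation follows. The struct and function names are hypothetical; only the four JSON keys come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// datasetListJSON mirrors the four keys named in the PR description.
// The Go-side names are assumptions made for this sketch.
type datasetListJSON struct {
	Namespace string   `json:"namespace"`
	Release   string   `json:"release"`
	Count     int      `json:"count"`
	Datasets  []string `json:"datasets"`
}

// buildDatasetListJSON returns the payload as a string, replacing a nil
// slice with an empty one so consumers always see an array.
func buildDatasetListJSON(namespace, release string, datasets []string) (string, error) {
	if datasets == nil {
		datasets = []string{} // a nil slice would marshal as null
	}
	b, err := json.Marshal(datasetListJSON{
		Namespace: namespace,
		Release:   release,
		Count:     len(datasets),
		Datasets:  datasets,
	})
	return string(b), err
}

func main() {
	// Namespace and release values are placeholders.
	out, err := buildDatasetListJSON("ns", "rel", nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"namespace":"ns","release":"rel","count":0,"datasets":[]}
}
```

Keeping datasets an array in every case means a script can iterate over it without a null check.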
Note
Low Risk
Read-only cluster query reusing the existing mysql exec seam; no data mutations or auth changes.
Overview
Adds tracebloc dataset list, a read-only way to see ingested dataset table names in the cluster. It is registered under dataset, mentioned on the root home screen, and follows the same kubeconfig / namespace flags and exit codes (3, 4, 7) as dataset push and cluster info.

The command loads the parent release, then calls the new push.ListDatasets, which execs into the mysql pod (same SPDYExecutor / findRunningPod path as teardown) and runs an information_schema query so a never-pushed cluster returns an empty list instead of failing. Human output lists names (with an empty-state hint toward dataset push); --output-json writes {namespace, release, count, datasets[]} on stdout with the banner on stderr, including JSON error objects on early failures, like dataset push.

Unit tests cover mysql output parsing, rendering, JSON shape, and the JSON-on-failure contract.
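One way the exit codes (0, 3, 4, 7) could be tied to failure classes is through sentinel errors checked with errors.Is. This is only a sketch: the sentinel names, the constant names, and the fallback code 1 are assumptions, and the PR does not say how the CLI actually maps errors to codes.

```go
package main

import (
	"errors"
	"fmt"
)

// Exit codes as listed in the PR description.
const (
	exitOK           = 0
	exitKubeconfig   = 3
	exitNoRelease    = 4
	exitClusterQuery = 7
)

// Hypothetical sentinel errors, one per failure class.
var (
	errKubeconfig   = errors.New("kubeconfig could not be loaded")
	errNoRelease    = errors.New("no parent release in the namespace")
	errClusterQuery = errors.New("cluster query failed")
)

// exitCodeFor maps an error (possibly wrapped) to a process exit code.
func exitCodeFor(err error) int {
	switch {
	case err == nil:
		return exitOK
	case errors.Is(err, errKubeconfig):
		return exitKubeconfig
	case errors.Is(err, errNoRelease):
		return exitNoRelease
	case errors.Is(err, errClusterQuery):
		return exitClusterQuery
	default:
		return 1 // assumption: generic failure code, not stated in the PR
	}
}

func main() {
	wrapped := fmt.Errorf("load parent release: %w", errNoRelease)
	fmt.Println(exitCodeFor(wrapped)) // 4
	fmt.Println(exitCodeFor(nil))     // 0
}
```

An empty listing is still a success in this scheme, consistent with "0 listed (incl. empty)".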
Reviewed by Cursor Bugbot for commit 13f004c. Bugbot is set up for automated code reviews on this repo. Configure here.