feat: dot notation for indexes create and databases load by eddietejeda · Pull Request #92 · hotdata-dev/hotdata-cli

eddietejeda · 2026-05-20T00:54:44Z

Summary

hotdata indexes create airbnb.listings[description] --type bm25 — replaces --connection-id / --schema / --table / --columns flags with a single positional arg using bracket notation for columns
hotdata databases load airbnb.listings --url <url> — new top-level shorthand replacing hotdata databases tables load airbnb listings --url ...
--name on indexes create is now optional; auto-derived as {table}_{cols}_{type} when omitted
Schema defaults to public in both commands when only db.table is given (no schema segment)
Add connections::resolve_connection_id(api, name_or_id) to look up a connection by name or raw ID
Remove the load: ... hint from databases create success output

Examples

# Create a BM25 index — name auto-derived as "listings_description_bm25"
hotdata indexes create airbnb.listings[description] --type bm25

# Load a parquet file (shorthand)
hotdata databases load airbnb.listings --url https://www.hotdata.dev/data/sf-airbnb-listings.parquet

Test plan

hotdata indexes create airbnb.listings[description] --type bm25 resolves connection by name and creates index
hotdata indexes create airbnb.public.listings[name,description] --type bm25 --name my_idx uses explicit name and schema
hotdata indexes create --dataset-id ds123 --columns col --type bm25 still works for dataset scope
hotdata databases load airbnb.listings --url <url> loads the table
hotdata databases load airbnb.public.listings --file ./data.parquet uses explicit schema
hotdata databases create no longer prints the load: hint

🤖 Generated with Claude Code

… notation for databases load - `hotdata indexes create airbnb.listings[col1,col2] --type bm25` replaces verbose --connection-id/--schema/--table/--columns flags - `hotdata databases load airbnb.listings --url ...` replaces `hotdata databases tables load airbnb listings --url ...` - `--name` on indexes create is now optional (auto-derived from table+cols+type) - Add `connections::resolve_connection_id` to look up connection by name or ID - Remove load hint from `databases create` success output Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

sentry · 2026-05-20T00:55:45Z

Codecov Report

❌ Patch coverage is 41.95804% with 83 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/main.rs	47.61%	66 Missing ⚠️
src/connections.rs	0.00%	17 Missing ⚠️

📢 Thoughts on this report? Let us know!

claude · 2026-05-20T00:57:54Z

+                                            );
+                                            std::process::exit(1);
+                                        });
+                                    let auto = format!("dataset_{type}", type = r#type);


nit: auto-name for the dataset branch is dataset_{type}, which collides for every additional index on the same dataset (e.g. creating BM25 indexes over two different columns both auto-name to dataset_bm25 and the second create will fail). The columns are already captured in cols here — mirroring the connection branch with something like format!("dataset_{cols}_{type}", cols = cols.replace(',', "_")) would make the default useful instead of guaranteed-collision. (not blocking)

claude · 2026-05-20T00:57:58Z

+    let table_part = &target[..bracket_pos];
+    let cols_raw = target[bracket_pos + 1..].trim_end_matches(']');


nit: this silently accepts unclosed/garbled brackets. airbnb.listings[col (missing ]) parses successfully as cols_raw = "col" because trim_end_matches(']') does nothing when there's no ]. Likewise airbnb.listings[col]extra parses as a single column "col]extra" and is sent to the server. Worth requiring target.ends_with(']') (or checking that bracket_pos is followed by a matching ] at the end) and rejecting otherwise, so users get a clear local error rather than a confusing API error. (not blocking)

claude · 2026-05-20T00:58:02Z

+fn parse_db_target(target: &str) -> (String, String, String) {
+    let parts: Vec<&str> = target.splitn(4, '.').collect();
+    match parts.as_slice() {
+        [db, tbl] => (db.to_string(), "public".to_string(), tbl.to_string()),
+        [db, schema, tbl] => (db.to_string(), schema.to_string(), tbl.to_string()),
+        _ => {
+            eprintln!(
+                "error: target must be 'database.table' or 'database.schema.table'"
+            );
+            std::process::exit(1);
+        }
+    }
+}
+
+/// Parse an index target like `airbnb.listings[col1,col2]` or
+/// `airbnb.public.listings[col1,col2]` into `(conn_name, schema, table, columns)`.
+/// Schema defaults to `public` when only two dot-parts are given.
+fn parse_index_target(target: &str) -> (String, String, String, Vec<String>) {
+    let Some(bracket_pos) = target.find('[') else {
+        eprintln!(
+            "error: target must include columns in brackets, e.g. airbnb.listings[col1,col2]"
+        );
+        std::process::exit(1);
+    };
+    let table_part = &target[..bracket_pos];
+    let cols_raw = target[bracket_pos + 1..].trim_end_matches(']');
+
+    let parts: Vec<&str> = table_part.splitn(4, '.').collect();
+    let (conn, schema, table) = match parts.as_slice() {
+        [c, t] => (c.to_string(), "public".to_string(), t.to_string()),
+        [c, s, t] => (c.to_string(), s.to_string(), t.to_string()),
+        _ => {
+            eprintln!(
+                "error: target must be 'connection.table[cols]' or 'connection.schema.table[cols]'"
+            );
+            std::process::exit(1);
+        }
+    };
+
+    let columns: Vec<String> = cols_raw
+        .split(',')
+        .map(|s| s.trim().to_string())
+        .filter(|s| !s.is_empty())
+        .collect();
+
+    if columns.is_empty() {
+        eprintln!("error: no columns specified in brackets");
+        std::process::exit(1);
+    }
+
+    (conn, schema, table, columns)


nit: these two parsers (parse_db_target, parse_index_target) are pure, have several edge cases (2- vs 3-segment, default-public, bracket placement, empty cols, malformed dot counts), and are easy to regress. A small #[cfg(test)] mod tests covering happy paths + each error branch would lock the behavior down — especially since the existing databases_cli.rs tests don't exercise the new shorthand. (not blocking)

claude · 2026-05-20T00:58:06Z

+        /// Table and columns to index: `connection.table[col1,col2]`
+        /// or `connection.schema.table[col1,col2]`. Schema defaults to `public`.
+        #[arg(conflicts_with = "dataset_id")]
+        target: Option<String>,


super nit: [ and ] are glob metacharacters in zsh (and in bash with extglob/failglob), so hotdata indexes create airbnb.listings[description] --type bm25 will fail with no matches found for many users out of the box. Worth a line in the doc comment telling users to quote the target ('airbnb.listings[description]'). Same note would help on DatabasesCommands::Load even though it has no brackets, since the README example uses brackets in nearby commands. (not blocking)

claude · 2026-05-20T00:58:10Z

+pub fn resolve_connection_id(api: &ApiClient, name_or_id: &str) -> String {
+    let body: ListResponse = api.get("/connections");
+    match body
+        .connections
+        .iter()
+        .find(|c| c.id == name_or_id || c.name == name_or_id)
+    {
+        Some(conn) => conn.id.clone(),
+        None => {
+            use crossterm::style::Stylize;
+            eprintln!(
+                "{}",
+                format!("error: no connection named or with id '{name_or_id}'").red()
+            );
+            std::process::exit(1);
+        }
+    }
+}


super nit: this lists every connection in the workspace even when the caller already has a raw ID. For workspaces with many connections (or once /connections paginates) this adds a needless round trip and could miss a valid ID past the first page. A cheap optimization: if name_or_id matches a known ID shape (e.g. starts with a known prefix or looks like a UUID), try GET /connections/{id} first and fall back to listing on 404. Fine to defer — the existing connections::list has the same limitation. (not blocking)

- Dataset auto-name now includes columns (dataset_{cols}_{type}) to avoid collision - Validate bracket closure in parse_index_target; reject unclosed/garbled brackets - Add quoting note to indexes create doc comment for shell glob safety - resolve_connection_id tries GET /connections/{id} directly for conn* prefixed IDs - Add unit tests for parse_db_target and parse_index_target (6 tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude · 2026-05-20T01:05:28Z

+        /// Quote the argument to prevent shell glob expansion:
+        /// `hotdata indexes create 'airbnb.listings[description]' --type bm25`
+        #[arg(conflicts_with = "dataset_id")]
+        target: Option<String>,


nit: README.md is now out of sync with this command — lines 252–261 still document hotdata indexes create --connection-id <id> --schema <schema> --table <table> --name <name> --columns <cols> ..., but this PR removes those flags from Create. A user copying that example will hit a clap error. Worth updating the README block to show the new connection.table[cols] form (and the dataset-scope path that now requires --columns), plus a one-line mention of the hotdata databases load shorthand alongside tables load. (not blocking)

claude Bot reviewed May 20, 2026

View reviewed changes

claude Bot previously approved these changes May 20, 2026

View reviewed changes

eddietejeda dismissed claude[bot]’s stale review via 52642e2 May 20, 2026 01:01

claude Bot reviewed May 20, 2026

View reviewed changes

claude Bot approved these changes May 20, 2026

View reviewed changes

eddietejeda merged commit 6eb5854 into main May 20, 2026
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: dot notation for indexes create and databases load#92

feat: dot notation for indexes create and databases load#92
eddietejeda merged 2 commits into
mainfrom
eddietejeda-dot-notation

eddietejeda commented May 20, 2026 •

edited

Loading

Uh oh!

sentry Bot commented May 20, 2026 •

edited

Loading

Uh oh!

claude Bot May 20, 2026

Uh oh!

claude Bot May 20, 2026

Uh oh!

claude Bot May 20, 2026

Uh oh!

claude Bot May 20, 2026

Uh oh!

claude Bot May 20, 2026

Uh oh!

claude Bot May 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		let table_part = &target[..bracket_pos];
		let cols_raw = target[bracket_pos + 1..].trim_end_matches(']');

Conversation

eddietejeda commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Examples

Test plan

Uh oh!

sentry Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

claude Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

eddietejeda commented May 20, 2026 •

edited

Loading

sentry Bot commented May 20, 2026 •

edited

Loading