feat(api-server): Add private endpoint to retrieve timestamp column names for a dataset. #2058

Merged
hoophalab merged 10 commits into y-scope:main from hoophalab:api-get-timestamp
Mar 7, 2026
Conversation

@hoophalab
Contributor

@hoophalab hoophalab commented Mar 2, 2026

Description

This PR adds a private endpoint to get the timestamp column name(s).

The route is subject to change and is only intended to be used by the Grafana plugin, so we didn't add OpenAPI docs.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

GET http://localhost:3001/column_metadata/{dataset_name}/timestamp returns a list of column names of Timestamp type.

Summary by CodeRabbit

  • New Features

    • Added a new API endpoint to retrieve timestamp column metadata for datasets, enabling better data discovery and querying capabilities.
  • Improvements

    • Enhanced error handling with clearer, more descriptive error messages for invalid dataset names and missing datasets.

@hoophalab hoophalab requested a review from a team as a code owner March 2, 2026 22:24
@coderabbitai
Contributor

coderabbitai Bot commented Mar 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

The pull request introduces a new HTTP endpoint to retrieve timestamp column names from a dataset. It includes dataset name validation using a compiled regex, new error types for invalid and missing datasets, a database query method, and HTTP route integration with proper error mapping.

Changes

Cohort / File(s) Summary
Dataset Validation Infrastructure
components/clp-rust-utils/Cargo.toml, components/clp-rust-utils/src/dataset.rs, components/clp-rust-utils/src/lib.rs
Added regex dependency and a new public static VALID_DATASET_NAME_REGEX using LazyLock for reusable dataset name validation across the codebase.
API Error Handling
components/api-server/src/error.rs
Added two new ClientError variants: InvalidDatasetName and DatasetNotFound(String) to support error cases for the new timestamp column retrieval feature.
Client Logic
components/api-server/src/client.rs
Implemented get_timestamp_column_names async method that validates dataset names, queries the clp_{dataset_name}_column_metadata table for type 14 (timestamp), and returns column names or appropriate errors.
HTTP Route Integration
components/api-server/src/routes.rs
Added "/column_metadata/{dataset_name}/timestamp" endpoint with handler that calls the client method. Extended HandlerError with BadRequest variant and updated error mapping logic for the new error types.

Sequence Diagram

sequenceDiagram
    participant Client as HTTP Client
    participant Handler as Route Handler
    participant Service as Client Service
    participant DB as Database
    
    Client->>Handler: GET /column_metadata/{dataset_name}/timestamp
    Handler->>Service: get_timestamp_column_names(dataset_name)
    Service->>Service: Validate dataset_name with VALID_DATASET_NAME_REGEX
    alt Invalid Format
        Service-->>Handler: Err(InvalidDatasetName)
        Handler-->>Client: 400 Bad Request
    else Valid Format
        Service->>DB: Query clp_{dataset_name}_column_metadata WHERE type = 14
        alt Table Not Found
            DB-->>Service: Error (table not found)
            Service-->>Handler: Err(DatasetNotFound)
            Handler-->>Client: 404 Not Found
        else Success
            DB-->>Service: Vec<column_name>
            Service-->>Handler: Ok(Vec<String>)
            Handler-->>Client: 200 OK + JSON array
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks: ✅ 3 passed

  • Description check: ✅ Passed — check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed — the pull request title accurately describes the main change: adding a private endpoint to retrieve timestamp column names for datasets.
  • Docstring coverage: ✅ Passed — docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


@hoophalab hoophalab requested a review from junhaoliao March 2, 2026 22:31
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/api-server/src/client.rs`:
- Around line 511-515: The query is using a hard-coded magic number 14 for
NodeType::Timestamp which is brittle; replace the literal with a named constant
(e.g., TIMESTAMP_NODE_TYPE: i32) and use that in the sqlx::query(...).bind(...)
call (or, preferably, import a shared constant from a common crate that mirrors
NodeType::Timestamp) so that the value is defined in one place and reused across
call sites instead of using the literal in the rows/sqlx::query code path.
- Around line 505-509: The dataset name regex in VALID_DATASET_NAME_REGEX
currently allows hyphens but must match the canonical rule (alphanumeric and
underscores only); update the Regex in the LazyLock initialization used to
validate dataset_name in the validation block so it uses the pattern
^[a-zA-Z0-9_]+$ instead of ^[a-zA-Z0-9_-]+$, keeping the same error return
(ClientError::InvalidDatasetName) on non-match.

In `@components/api-server/src/routes.rs`:
- Around line 235-242: The current map_err closure logs all failures as error
even for client-side validation (4xx); change it to convert the incoming err
into a HandlerError first (via HandlerError::from(err)) and then log at a lower
level for BadRequest (e.g., tracing::warn! or debug!) and at error for other
variants. Concretely, in the closure used where timestamp keys are fetched (the
map_err around HandlerError::from), produce the HandlerError value, match on it
(e.g., HandlerError::BadRequest(_)) and call tracing::warn!(...) for that case
and tracing::error!(...) for the rest, then return the HandlerError.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0f0933 and 56b8095.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • components/api-server/Cargo.toml
  • components/api-server/src/client.rs
  • components/api-server/src/error.rs
  • components/api-server/src/routes.rs

Contributor

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (2)
components/api-server/src/client.rs (1)

511-515: 🛠️ Refactor suggestion | 🟠 Major

Replace 14i32 with a named/shared constant.

The hard-coded discriminator is a brittle cross-language coupling point. Promote it to a constant (or shared source) so drift is easier to detect and prevent.

♻️ Minimal containment fix
+const TIMESTAMP_NODE_TYPE: i32 = 14;
+
         let rows = sqlx::query(&format!("SELECT name FROM `{table_name}` WHERE type = ?"))
-            .bind(14i32)
+            .bind(TIMESTAMP_NODE_TYPE)
             .fetch_all(&self.sql_pool)
             .await?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/api-server/src/client.rs` around lines 511 - 515, Replace the
hard-coded 14i32 in the SQL query bind with a named constant to avoid brittle
cross-language coupling: add a module-level or pub const such as
NODE_TYPE_TIMESTAMP (type i32, value 14) near this client module (or a shared
constants module) and use bind(NODE_TYPE_TIMESTAMP) in the rows creation in
client.rs (the SQL query that currently calls sqlx::query(...).bind(14i32)).
Keep the existing comment about syncing with SchemaTree.hpp and reference that
constant in any other places that used the literal so drift is easier to detect.
components/api-server/src/routes.rs (1)

235-242: ⚠️ Potential issue | 🟡 Minor

Use severity-aware logging in the error mapper.

The map_err closure currently logs every failure at warn, including internal failures. Keep client validation failures at lower severity, but log server-side failures at error to avoid masking operational issues.

♻️ Suggested adjustment
     let keys = client
         .get_timestamp_column_name(&dataset_name)
         .await
         .map_err(|err| {
-            tracing::warn!(
-                "Failed to get timestamp keys for dataset '{}': {:?}",
-                dataset_name,
-                err
-            );
-            HandlerError::from(err)
+            let handler_err = HandlerError::from(err);
+            match &handler_err {
+                HandlerError::BadRequest(_) => tracing::warn!(
+                    "Invalid dataset name '{}'",
+                    dataset_name
+                ),
+                _ => tracing::error!(
+                    "Failed to get timestamp keys for dataset '{}': {:?}",
+                    dataset_name,
+                    handler_err
+                ),
+            }
+            handler_err
         })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/api-server/src/routes.rs` around lines 235 - 242, The map_err
closure that logs failures when fetching timestamp keys should distinguish
client validation errors from server-side failures: inside the closure (the one
that currently calls tracing::warn! and returns HandlerError::from(err)), match
or inspect the incoming error to decide severity—use tracing::warn! for
client/validation errors and tracing::error! for internal/server errors—while
still returning HandlerError::from(err) and including dataset_name and the error
details in the log message; update the closure around dataset_name, err, and
HandlerError::from to implement this conditional logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@components/api-server/src/client.rs`:
- Around line 511-515: Replace the hard-coded 14i32 in the SQL query bind with a
named constant to avoid brittle cross-language coupling: add a module-level or
pub const such as NODE_TYPE_TIMESTAMP (type i32, value 14) near this client
module (or a shared constants module) and use bind(NODE_TYPE_TIMESTAMP) in the
rows creation in client.rs (the SQL query that currently calls
sqlx::query(...).bind(14i32)). Keep the existing comment about syncing with
SchemaTree.hpp and reference that constant in any other places that used the
literal so drift is easier to detect.

In `@components/api-server/src/routes.rs`:
- Around line 235-242: The map_err closure that logs failures when fetching
timestamp keys should distinguish client validation errors from server-side
failures: inside the closure (the one that currently calls tracing::warn! and
returns HandlerError::from(err)), match or inspect the incoming error to decide
severity—use tracing::warn! for client/validation errors and tracing::error! for
internal/server errors—while still returning HandlerError::from(err) and
including dataset_name and the error details in the log message; update the
closure around dataset_name, err, and HandlerError::from to implement this
conditional logging.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56b8095 and 7ea1354.

📒 Files selected for processing (2)
  • components/api-server/src/client.rs
  • components/api-server/src/routes.rs

Member

@junhaoliao junhaoliao left a comment


Since this returns a Vec<String>, the name should be plural. Note that a user may compress different inputs with different --timestamp-key values, or with no timestamp key, so practically there can be multiple or zero timestamp columns.

Member

@junhaoliao junhaoliao left a comment


code lgtm and matches my understanding. i feel clp-s / Rust experts might wanna take another look. cc @gibber9809 @LinZhihao-723

@hoophalab hoophalab requested a review from junhaoliao March 3, 2026 17:15
Contributor

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
components/api-server/src/routes.rs (1)

228-242: ⚠️ Potential issue | 🟡 Minor

Downgrade logging severity for expected 4xx outcomes in this handler.

At Line 236, invalid dataset names and unknown datasets are expected request outcomes; logging them as error will inflate noise and alert fatigue. Convert to HandlerError first, then log warn/debug for BadRequest/NotFound and keep error for 5xx paths.

♻️ Suggested adjustment
     let names = client
         .get_timestamp_column_names(&dataset_name)
         .await
         .map_err(|err| {
-            tracing::error!(
-                "Failed to get timestamp column names for dataset '{}': {:?}",
-                dataset_name,
-                err
-            );
-            HandlerError::from(err)
+            let handler_err = HandlerError::from(err);
+            match handler_err {
+                HandlerError::BadRequest(_) | HandlerError::NotFound => tracing::warn!(
+                    "Rejected timestamp-column lookup for dataset '{}': {:?}",
+                    dataset_name,
+                    handler_err
+                ),
+                HandlerError::InternalServer => tracing::error!(
+                    "Failed to get timestamp column names for dataset '{}': {:?}",
+                    dataset_name,
+                    handler_err
+                ),
+            }
+            handler_err
         })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/api-server/src/routes.rs` around lines 228 - 242, In
get_timestamp_column_names, stop unconditionally logging the raw error as
tracing::error when client.get_timestamp_column_names fails; instead convert the
error to a HandlerError first (using HandlerError::from(err) or similar), match
on the resulting HandlerError variants (e.g., BadRequest, NotFound) and emit a
lower-severity log (warn or debug) for those 4xx cases and only use
tracing::error for other/5xx variants before returning the HandlerError; update
the closure in the map_err for client.get_timestamp_column_names to perform this
conversion+match and then return the HandlerError.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@components/api-server/src/routes.rs`:
- Around line 228-242: In get_timestamp_column_names, stop unconditionally
logging the raw error as tracing::error when client.get_timestamp_column_names
fails; instead convert the error to a HandlerError first (using
HandlerError::from(err) or similar), match on the resulting HandlerError
variants (e.g., BadRequest, NotFound) and emit a lower-severity log (warn or
debug) for those 4xx cases and only use tracing::error for other/5xx variants
before returning the HandlerError; update the closure in the map_err for
client.get_timestamp_column_names to perform this conversion+match and then
return the HandlerError.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ea1354 and 8cf12cb.

📒 Files selected for processing (3)
  • components/api-server/src/client.rs
  • components/api-server/src/error.rs
  • components/api-server/src/routes.rs

junhaoliao
junhaoliao previously approved these changes Mar 3, 2026
Member

@junhaoliao junhaoliao left a comment


lgtm. let's see if @LinZhihao-723 has more comments before merging

Member

@LinZhihao-723 LinZhihao-723 left a comment


In general, no big problem.
I'm thinking maybe we should have a plan for documenting internal APIs (but this doesn't need to be in this PR).

match err {
ClientError::SearchJobNotFound(_) => Self::NotFound,
ClientError::SearchJobNotFound(_) | ClientError::DatasetNotFound(_) => Self::NotFound,
ClientError::InvalidDatasetName => Self::BadRequest("Invalid dataset name."),
Member


How about:

  • Make BadRequest(&'static str) -> BadRequest(String)
  • Change this line to:
Suggested change
ClientError::InvalidDatasetName => Self::BadRequest("Invalid dataset name."),
ClientError::InvalidDatasetName => Self::BadRequest(format!("{err}")),

In this way, you don't need to repeat the same error string twice across the code base ("Invalid dataset name."). Also, if later we have other bad request errors, we can simply do:

            ClientError::InvalidDatasetName | ClientError::SomeOtherError => Self::BadRequest(format!("{err}")),

Contributor Author


I changed &str to String, but I don't fully agree with exposing Display's result in the HTTP response.

For example, grpc considers exception messages as sensitive data and doesn't return them to clients by default.
https://learn.microsoft.com/en-us/aspnet/core/grpc/security?view=aspnetcore-10.0#exceptions

Member


  • In general, I agree we shouldn't return errors/exceptions unconditionally.
  • However, in this case, and in other cases I could think of for returning the error for a bad request with invalid parameters, the error message should be descriptive for users to understand what exactly is rejected. That said, I'm not sure if it's a good idea to always repeat the error message twice. A more elegant way for this is probably to have another enum error for all bad request errors that can be directly displayed to users.

Member


Actually, I just realized we already have this layer of mapping to convert internal errors to HandlerError::InternalError. I think it should be OK to not consider this bad request error as an internal error and directly print its display result then. 'cuz otherwise we'd consider it as an internal error.
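The mapping discussed in this thread can be sketched with simplified stand-ins for the real types (these enums are illustrative, not the actual api-server definitions): the error string lives once on the ClientError Display impl, and the BadRequest(String) variant reuses it via format!("{err}").

```rust
use std::fmt;

// Hypothetical, simplified versions of the ClientError and HandlerError
// types discussed above.
#[derive(Debug)]
enum ClientError {
    InvalidDatasetName,
    DatasetNotFound(String),
}

impl fmt::Display for ClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ClientError::InvalidDatasetName => write!(f, "Invalid dataset name."),
            ClientError::DatasetNotFound(name) => write!(f, "Dataset '{name}' not found."),
        }
    }
}

#[derive(Debug, PartialEq)]
enum HandlerError {
    BadRequest(String),
    NotFound,
}

impl From<ClientError> for HandlerError {
    fn from(err: ClientError) -> Self {
        match err {
            // The user-facing text is defined once, on the Display impl,
            // rather than repeating the literal here.
            ClientError::InvalidDatasetName => Self::BadRequest(format!("{err}")),
            ClientError::DatasetNotFound(_) => Self::NotFound,
        }
    }
}

fn main() {
    assert_eq!(
        HandlerError::from(ClientError::InvalidDatasetName),
        HandlerError::BadRequest("Invalid dataset name.".to_string())
    );
    assert_eq!(
        HandlerError::from(ClientError::DatasetNotFound("ds".to_string())),
        HandlerError::NotFound
    );
}
```

Adding another client-side validation error later only requires extending the match arm, e.g. `ClientError::InvalidDatasetName | ClientError::SomeOtherError => Self::BadRequest(format!("{err}"))`.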

Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/api-server/src/client.rs`:
- Around line 516-519: The query built with sqlx::query_scalar(&format!("SELECT
name FROM `{table_name}` WHERE type = ?")) (used with TIMESTAMP_NODE_TYPE,
fetch_all and self.sql_pool) lacks an ORDER BY and can return non-deterministic
results; modify the SQL string to append a stable ordering (for example "ORDER
BY name ASC") so the call via
sqlx::query_scalar(...).bind(TIMESTAMP_NODE_TYPE).fetch_all(&self.sql_pool).await
returns a deterministic, consistently ordered list.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c85c3509-5a82-441c-9906-76b50bf85871

📥 Commits

Reviewing files that changed from the base of the PR and between 8cf12cb and 6a6fb72.

📒 Files selected for processing (1)
  • components/api-server/src/client.rs

Comment on lines +516 to +519
sqlx::query_scalar(&format!("SELECT name FROM `{table_name}` WHERE type = ?"))
.bind(TIMESTAMP_NODE_TYPE)
.fetch_all(&self.sql_pool)
.await
Contributor


⚠️ Potential issue | 🟡 Minor


Add deterministic ordering to the timestamp column query.

The query at line 516 lacks an ORDER BY clause, so results can be returned in different orders across executions. For a Grafana consumer and tests that depend on consistent responses, this introduces brittleness. Add explicit stable ordering.

Suggested fix
-        let names: Vec<String> =
-            sqlx::query_scalar(&format!("SELECT name FROM `{table_name}` WHERE type = ?"))
+        let names: Vec<String> =
+            sqlx::query_scalar(&format!(
+                "SELECT name FROM `{table_name}` WHERE type = ? ORDER BY name"
+            ))
                 .bind(TIMESTAMP_NODE_TYPE)
                 .fetch_all(&self.sql_pool)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/api-server/src/client.rs` around lines 516 - 519, The query built
with sqlx::query_scalar(&format!("SELECT name FROM `{table_name}` WHERE type =
?")) (used with TIMESTAMP_NODE_TYPE, fetch_all and self.sql_pool) lacks an ORDER
BY and can return non-deterministic results; modify the SQL string to append a
stable ordering (for example "ORDER BY name ASC") so the call via
sqlx::query_scalar(...).bind(TIMESTAMP_NODE_TYPE).fetch_all(&self.sql_pool).await
returns a deterministic, consistently ordered list.

Contributor

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (2)
components/api-server/src/client.rs (1)

513-517: ⚠️ Potential issue | 🟡 Minor

Add a stable ORDER BY to the timestamp-column query.

Line 514 still leaves row order up to MySQL, so the same dataset can produce different response ordering across runs. That makes the Grafana consumer and any response-based tests unnecessarily brittle.

Suggested fix
-        let names: Vec<String> =
-            sqlx::query_scalar(&format!("SELECT name FROM `{table_name}` WHERE type = ?"))
+        let names: Vec<String> = sqlx::query_scalar(&format!(
+            "SELECT name FROM `{table_name}` WHERE type = ? ORDER BY name ASC"
+        ))
                 .bind(TIMESTAMP_NODE_TYPE)
                 .fetch_all(&self.sql_pool)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/api-server/src/client.rs` around lines 513 - 517, The SELECT for
the timestamp nodes is missing a deterministic ORDER BY; update the
sqlx::query_scalar call in client.rs (the block that builds names with
table_name and TIMESTAMP_NODE_TYPE) to include an explicit ORDER BY on the
timestamp column (e.g., "ORDER BY `timestamp` ASC" or DESC as required) so
fetch_all(&self.sql_pool).await always returns rows in a stable order; ensure
the formatted SQL includes the ORDER BY clause around the same table_name
interpolation.
components/api-server/src/routes.rs (1)

235-242: ⚠️ Potential issue | 🟡 Minor

Do not log 4xx validation failures at error.

This closure converts InvalidDatasetName into a 400, but it still emits an error-level log first. That will create avoidable noise for bad client input.

Suggested fix
         .await
         .map_err(|err| {
-            tracing::error!(
-                "Failed to get timestamp column names for dataset '{}': {:?}",
-                dataset_name,
-                err
-            );
-            HandlerError::from(err)
+            let handler_err = HandlerError::from(err);
+            match &handler_err {
+                HandlerError::BadRequest(_) => tracing::warn!(
+                    "Invalid dataset name '{}'",
+                    dataset_name
+                ),
+                _ => tracing::error!(
+                    "Failed to get timestamp column names for dataset '{}': {:?}",
+                    dataset_name,
+                    handler_err
+                ),
+            }
+            handler_err
         })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/api-server/src/routes.rs` around lines 235 - 242, The closure
mapping errors for get timestamp column names currently always logs at error
level; change it to only log at error for server/internal errors and log
client/validation errors (e.g., InvalidDatasetName) at a lower level
(debug/info) or omit logging. Inside the map_err closure that returns
HandlerError::from(err), match or inspect the incoming err (the result converted
into HandlerError) and call tracing::debug!(...) or skip logging when the error
is InvalidDatasetName, but keep tracing::error!(...) for other error kinds,
preserving the dataset_name context in the message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@components/api-server/src/client.rs`:
- Around line 513-517: The SELECT for the timestamp nodes is missing a
deterministic ORDER BY; update the sqlx::query_scalar call in client.rs (the
block that builds names with table_name and TIMESTAMP_NODE_TYPE) to include an
explicit ORDER BY on the timestamp column (e.g., "ORDER BY `timestamp` ASC" or
DESC as required) so fetch_all(&self.sql_pool).await always returns rows in a
stable order; ensure the formatted SQL includes the ORDER BY clause around the
same table_name interpolation.

In `@components/api-server/src/routes.rs`:
- Around line 235-242: The closure mapping errors for get timestamp column names
currently always logs at error level; change it to only log at error for
server/internal errors and log client/validation errors (e.g.,
InvalidDatasetName) at a lower level (debug/info) or omit logging. Inside the
map_err closure that returns HandlerError::from(err), match or inspect the
incoming err (the result converted into HandlerError) and call
tracing::debug!(...) or skip logging when the error is InvalidDatasetName, but
keep tracing::error!(...) for other error kinds, preserving the dataset_name
context in the message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: aec3549c-3852-4c7c-93f4-ebc9da9899f8

📥 Commits

Reviewing files that changed from the base of the PR and between 6a6fb72 and 09ff98a.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • components/api-server/src/client.rs
  • components/api-server/src/routes.rs
  • components/clp-rust-utils/Cargo.toml
  • components/clp-rust-utils/src/dataset.rs
  • components/clp-rust-utils/src/lib.rs

@hoophalab hoophalab requested a review from LinZhihao-723 March 6, 2026 16:51
Member

@LinZhihao-723 LinZhihao-723 left a comment


lgtm.

@hoophalab hoophalab merged commit e08bb93 into y-scope:main Mar 7, 2026
22 checks passed
@junhaoliao junhaoliao added this to the March 2026 milestone Mar 7, 2026