feat(webui): Add dataset name validation and timestamp key docs link in compression form (resolves #1703).#1707
Conversation
WalkthroughRemoved a local SqlTableSuffix and centralized it in common config; added CLP_DEFAULT_TABLE_PREFIX, dataset-name schema/constants and client-side validator; integrated the validator into the compression form and updated helper text and import sites. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related issues
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
components/webui/client/src/config/sql-table-suffix.ts(1 hunks)components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx(2 hunks)components/webui/client/src/pages/IngestPage/Compress/validation.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/client/src/pages/IngestPage/Compress/validation.tscomponents/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsxcomponents/webui/client/src/config/sql-table-suffix.ts
🧠 Learnings (1)
📓 Common learnings
Learnt from: haiqi96
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-07-29T14:04:13.769Z
Learning: User haiqi96 requested creating a GitHub issue to document a bug fix from PR #1136, which addressed MySQL compatibility issues with invalid SQL CAST operations in the WebUI component.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.638Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
🔇 Additional comments (2)
components/webui/client/src/config/sql-table-suffix.ts (1)
1-12: Enum extension for metadata table suffixes looks consistentThe additional suffixes (ARCHIVE_TAGS, DATASETS, FILES, TAGS) and the doc comment tying this to
clp_metadata_db_utils.pykeep the WebUI in sync with the backend naming. This works well with the later use ofObject.values(SqlTableSuffix)in the validation utilities.components/webui/client/src/pages/IngestPage/Compress/validation.ts (1)
43-59: AlignvalidateDatasetNamewith code style guidelines; review type safety needsTwo points here:
Coding style guideline compliance
The file matches the TypeScript pattern requiringfalse == <expression>instead of!<expression>. Current code uses!datasetNameand!(/^\w+$/).test(datasetName). Replace these with the preferred pattern:
- if (!datasetName) {
- if (false == datasetName || 0 === datasetName.length) {
return null;
}```diff - if (!(/^\w+$/).test(datasetName)) { + const isAlphanumeric = (/^\w+$/).test(datasetName); + if (false == isAlphanumeric) { return "Dataset name can only contain alphanumeric characters and underscores."; }
- Type signature review
Verify whether callers pass potentially undefined/null values. If the form validation passesunknowntypes, consider acceptingstring | null | undefinedto make the function safer and more self-documenting. Otherwise, the currentstringtype is appropriate.
components/webui/client/src/pages/IngestPage/Compress/validation.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
instead of doing this only in the client, can we update the Typebox schema directly so validations can be performed in the backend as well? e.g.,
/**
* Matching the `MYSQL_TABLE_NAME_MAX_LEN` in `clp_py_utils.clp_metadata_db_utils`.
*/
const MYSQL_TABLE_NAME_MAX_LEN = 64;
/**
* Maximum length among all table suffixes.
*/
const TABLE_SUFFIX_MAX_LEN = Math.max(
...Object.values(SqlTableSuffix).map((suffix) => suffix.length)
);
/**
* Matching the default `clp_table_prefix` in `clp_py_utils.clp_config`.
*/
const CLP_DEFAULT_TABLE_PREFIX = "clp_";
/**
* Dataset name validation constants matching `clp_package_utils.general.validate_dataset_name`.
* - Pattern: only alphanumeric characters and underscores.
* - Max length: computed using the default table prefix.
*/
const DATASET_NAME_PATTERN = "^\\w+$";
const DATASET_NAME_SEPARATOR_LEN = 1;
const DATASET_NAME_MAX_LEN =
MYSQL_TABLE_NAME_MAX_LEN -
CLP_DEFAULT_TABLE_PREFIX.length -
DATASET_NAME_SEPARATOR_LEN -
TABLE_SUFFIX_MAX_LEN;
/**
* Schema for request to create a new compression job.
*/
const CompressionJobCreationSchema = Type.Object({
paths: Type.Array(Type.String()),
dataset: Type.Optional(Type.String({
pattern: DATASET_NAME_PATTERN,
maxLength: DATASET_NAME_MAX_LEN,
})),
timestampKey: Type.Optional(Type.String()),
});There was a problem hiding this comment.
i thought of this, but the CLP_DEFAULT_TABLE_PREFIX is actually variable, like it is configurable. which could complicate implementation.
There was a problem hiding this comment.
maybe just the pattern and not max length?
There was a problem hiding this comment.
okay i made the change
components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx
Outdated
Show resolved
Hide resolved
components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx
Outdated
Show resolved
Hide resolved
…Items.tsx Co-authored-by: Junhao Liao <junhao@junhao.ca>
…Items.tsx Co-authored-by: Junhao Liao <junhao@junhao.ca>
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx
🧠 Learnings (2)
📓 Common learnings
Learnt from: haiqi96
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-07-29T14:04:13.769Z
Learning: User haiqi96 requested creating a GitHub issue to document a bug fix from PR #1136, which addressed MySQL compatibility issues with invalid SQL CAST operations in the WebUI component.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.638Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.
Applied to files:
components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (2)
components/webui/client/src/pages/IngestPage/Compress/ClpSFormItems.tsx (2)
1-9: Imports are correct and necessary.The added
Typographyimport supports the enhanced helper text with an external documentation link, andvalidateDatasetNameenables inline form validation.
42-53: Validator logic is sound.The validation correctly delegates to
validateDatasetNameand surfaces errors to the form. The existing review comment about simplifying the async pattern and improving type safety remains valid.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
components/webui/client/src/config/sql-table-suffix.ts(0 hunks)components/webui/client/src/pages/IngestPage/Compress/validation.ts(1 hunks)components/webui/client/src/pages/IngestPage/Details/sql.ts(1 hunks)components/webui/client/src/pages/IngestPage/SpaceSavings/sql.ts(1 hunks)components/webui/client/src/pages/SearchPage/SearchControls/TimeRangeInput/sql.ts(1 hunks)components/webui/client/src/pages/SearchPage/SearchState/Presto/useTimestampKeyInit/sql.ts(1 hunks)components/webui/common/src/config.ts(1 hunks)components/webui/common/src/schemas/compression.ts(2 hunks)
💤 Files with no reviewable changes (1)
- components/webui/client/src/config/sql-table-suffix.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/webui/client/src/pages/IngestPage/Compress/validation.tscomponents/webui/client/src/pages/IngestPage/Details/sql.tscomponents/webui/common/src/config.tscomponents/webui/client/src/pages/SearchPage/SearchControls/TimeRangeInput/sql.tscomponents/webui/common/src/schemas/compression.tscomponents/webui/client/src/pages/IngestPage/SpaceSavings/sql.tscomponents/webui/client/src/pages/SearchPage/SearchState/Presto/useTimestampKeyInit/sql.ts
🧠 Learnings (3)
📓 Common learnings
Learnt from: haiqi96
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-07-29T14:04:13.769Z
Learning: User haiqi96 requested creating a GitHub issue to document a bug fix from PR #1136, which addressed MySQL compatibility issues with invalid SQL CAST operations in the WebUI component.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:188-212
Timestamp: 2025-07-03T12:59:46.638Z
Learning: In CLP schedulers (compression and query), runtime dataset validation is required; the helper validate_and_cache_dataset keeps a local cache and only queries the DB on cache misses because dataset additions are rare.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.407Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.
Learnt from: haiqi96
Repo: y-scope/clp PR: 594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:90-96
Timestamp: 2024-11-15T16:22:33.635Z
Learning: In the CLP codebase, `table_prefix` is a trusted configuration value and can be safely embedded into SQL queries.
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.
Applied to files:
components/webui/client/src/pages/IngestPage/Compress/validation.ts
📚 Learning: 2024-11-15T16:22:33.635Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:90-96
Timestamp: 2024-11-15T16:22:33.635Z
Learning: In the CLP codebase, `table_prefix` is a trusted configuration value and can be safely embedded into SQL queries.
Applied to files:
components/webui/client/src/pages/IngestPage/Compress/validation.tscomponents/webui/client/src/pages/IngestPage/Details/sql.tscomponents/webui/common/src/config.tscomponents/webui/client/src/pages/IngestPage/SpaceSavings/sql.tscomponents/webui/client/src/pages/SearchPage/SearchState/Presto/useTimestampKeyInit/sql.ts
🧬 Code graph analysis (1)
components/webui/client/src/pages/IngestPage/Compress/validation.ts (1)
components/webui/common/src/schemas/compression.ts (2)
DatasetNameSchema(70-70)DATASET_NAME_MAX_LEN(68-68)
🔇 Additional comments (6)
components/webui/common/src/config.ts (1)
31-47: SqlTableSuffix and table-prefix centralisation looks soundEnum values and the default prefix are clearly documented to mirror the Python config/metadata utilities, and exporting them here gives a single source of truth for downstream SQL builders. No issues from a correctness or maintainability perspective.
Also applies to: 50-56
components/webui/client/src/pages/IngestPage/Details/sql.ts (1)
1-1: Using shared SqlTableSuffix from common config is appropriateSwitching to the central
@webui/common/configexport keeps table-suffix usage consistent across the UI without changing the query logic.components/webui/client/src/pages/SearchPage/SearchState/Presto/useTimestampKeyInit/sql.ts (1)
1-1: Central SqlTableSuffix import keeps Presto timestamp-key SQL alignedRefactoring to import
SqlTableSuffixfrom the common config removes the local dependency while preserving existing query behaviour.components/webui/client/src/pages/SearchPage/SearchControls/TimeRangeInput/sql.ts (1)
1-4: Shared config imports for CLP storage engine and suffixes look correctPulling
CLP_STORAGE_ENGINESandSqlTableSuffixfrom@webui/common/configmakes this time-range query builder consistent with other modules and keeps engine/suffix definitions centralised.components/webui/client/src/pages/IngestPage/SpaceSavings/sql.ts (1)
1-1: SqlTableSuffix import refactor maintains existing space-savings behaviourUsing
SqlTableSuffixfrom the shared config keeps the archive-table naming consistent with other parts of the UI while leaving the query logic unchanged.components/webui/common/src/schemas/compression.ts (1)
6-43: Verify dataset-name schema alignment with backend validators for non-default table prefixesDefining
DATASET_NAME_PATTERN,DATASET_NAME_MAX_LEN, andDatasetNameSchemahere provides a clear single source for dataset-name rules. UsingSqlTableSuffixandCLP_DEFAULT_TABLE_PREFIXto derive max length from MySQL's table-name limit is sensible.However, this computation uses the hardcoded default table prefix, while
clp_table_prefixis configurable at runtime in Python. Backend validators (clp_package_utils.general.validate_dataset_nameand schedulers'validate_and_cache_dataset) may validate based on the effective runtime prefix. If non-default prefixes are deployed, a dataset name could pass this schema but fail backend validation or cause table-name issues.Confirm that either:
- Deployed environments always use the default prefix, or
- Backend validators use the same default-prefix-based calculation, or
- The effective prefix will be configurable in this module
Backend schedulers already perform runtime validation (learnings confirm), so keeping rules aligned will prevent UI/scheduler validation mismatches.
| validator: async (_, value: unknown) => { | ||
| const error = validateDatasetName(value as string); | ||
| if (error) { | ||
| return Promise.reject(new Error(error)); | ||
| } | ||
|
|
||
| return Promise.resolve(); | ||
| }, |
There was a problem hiding this comment.
as discussed in the other PR, i think the explicit usages of Promise here are "redundantly correct" - they work, but the async keyword already handles the Promise wrapping
maybe it's better to remove?
Description
Implement suggestions in #1703 - Adds validation to dataset to match package + link for escaped characters.
Checklist
breaking change.
Validation performed
validation works as expected. Link opens in new browser
Summary by CodeRabbit
New Features
Bug Fixes / UX
✏️ Tip: You can customize this high-level summary in your review settings.