Harden DuckDB extension persistence#185
Conversation
cd7a08e to
4b9c638
Compare
66c6541 to
c712342
Compare
9a6d2d2 to
67dd8ce
Compare
|
@codex review |
b3ee67c to
1b91738
Compare
67dd8ce to
6cdc4af
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6cdc4afeab
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| impl Drop for DefinitionsFileLock { | ||
| fn drop(&mut self) { | ||
| self.file.take(); | ||
| let _ = fs::remove_file(&self.path); |
There was a problem hiding this comment.
Guard lock cleanup against stolen locks
If a writer is paused or slow for longer than DEFINITIONS_STALE_LOCK_AFTER, another process can classify its lock as stale, remove and recreate the same .lock file, and then this original guard will unconditionally remove that newer lock when it drops. That opens a window for two writers to enter the critical section and lose or corrupt persisted definitions; write an owner token into the lock and only unlink when it still matches, or avoid stealing locks that may still be live.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2baf6794d9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| idx += 1; | ||
| } | ||
|
|
||
| if let Some(statement_start) = start { |
There was a problem hiding this comment.
Split persisted model blocks without semicolons
When definitions are created through clients that omit trailing semicolons (for example con.execute("SEMANTIC CREATE MODEL ...")), append_definition_to_content preserves the semicolonless text. This fallback then makes statement_ranges return one statement spanning the rest of the sidecar, so model_definition_ranges sees all subsequent MODEL definitions as part of the first block; replacing the first model can drop later models, and adding a metric to a later model reports that it is not present in the persisted file. Split on top-level definition keywords as well as semicolons, or normalize persisted definitions to include semicolons.
Useful? React with 👍 / 👎.
e598752 to
9a62d99
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9a62d99e9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if is_stale_definitions_lock(&lock_path) { | ||
| let _ = fs::remove_file(&lock_path); |
There was a problem hiding this comment.
Verify stale lock before unlinking
When two processes concurrently encounter the same stale sidecar lock, both can pass is_stale_definitions_lock before either recreates it; one process can then remove the old file and create its lock, only for the other process to execute this unconditional remove_file and delete that new lock before acquiring its own. That leaves two writers in the critical section despite the owner-token cleanup on drop; the fresh evidence in the current code is that the stale-removal path still unlinks by path without verifying the token or metadata it classified as stale.
Useful? React with 👍 / 👎.
9a62d99 to
17f7749
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 17f77491f3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } else if (ch == ',' && depth == 0) { | ||
| break; |
There was a problem hiding this comment.
Track braces when scanning model properties
When the new CREATE MODEL name (...) shorthand is used with a structured property before the top-level name, this scanner treats commas inside {} or [] values as top-level separators because it only tracks parentheses. For example SEMANTIC CREATE MODEL orders (metadata {owner analytics, name internal}, table orders) is a valid body shape for the Rust parser's metadata object support, but this code resumes at the nested name key and reports a spurious name mismatch against orders, blocking the model from being created. Track bracket/brace depth here (as the Rust property parser does) before breaking on commas.
Useful? React with 👍 / 👎.
3791e32 to
9c48e6c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9c48e6cbdd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return false; | ||
| } | ||
| } | ||
| return std::isspace(value[pos + keyword.size()]); |
There was a problem hiding this comment.
Recognize all supported name separators
When a SEMANTIC CREATE MODEL outer (...) body uses the supported name: or name= property syntax, this check returns false because it only accepts whitespace after name. The Rust SQL parser accepts : and = separators, so the later injection path adds a second name instead of validating the existing one; for example SEMANTIC CREATE MODEL orders (name: customers, table orders, primary_key order_id) is accepted and parsed/persisted as customers rather than being rejected as a mismatch.
Useful? React with 👍 / 👎.
9c48e6c to
ce1456b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab3938df65
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if !result.is_empty() { | ||
| result.push_str("\n\n"); | ||
| } | ||
| result.push_str(trimmed_definition); |
There was a problem hiding this comment.
Terminate persisted definitions before appending
When a client creates a simple AS definition without a trailing semicolon (for example SEMANTIC CREATE METRIC revenue AS SUM(amount)) and a later model/metric is appended, this preserves the semicolonless text. The Rust SQL parser's simple AS parsers read the expression until a semicolon or EOF, so on autoload the next MODEL line is consumed as part of the metric SQL instead of being loaded as a separate definition; normalize persisted definitions with a semicolon before adding the newline.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14aeffbb61
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| bytes | ||
| .get(idx + keyword.len()) | ||
| .map(|byte| byte.is_ascii_whitespace()) | ||
| .unwrap_or(true) |
There was a problem hiding this comment.
Accept compact definition keywords
When definitions use the parser-supported compact form without a space after the keyword, such as MODEL(name orders, ...) or METRIC(name revenue, ...), this boundary check returns false because it only accepts whitespace. The previous scanner accepted ( and parse_sql_model still accepts these forms, so a model created through the FFI as MODEL(name orders, ...) is persisted but later sidemantic_add_definition reports the model is not present in the sidecar; compact item replacements also fail to remove the old item and then fail validation as duplicates. Treat ( as a valid keyword boundary here as well.
Useful? React with 👍 / 👎.
Base: #184