Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 46f404bc84
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
sidemantic/api_server.py
Outdated
| if ";" in normalized: | ||
| raise ValueError("Multiple SQL statements are not supported") |
There was a problem hiding this comment.
Allow semicolons inside SQL string literals
Treating any remaining ; as a multi-statement separator rejects valid single-statement SQL such as SELECT ';' AS sep or predicates with semicolons inside string literals. This makes /sql and /sql/compile fail on legitimate inputs even though they are not multiple statements; statement-count validation should be parse-based instead of a raw substring check.
Useful? React with 👍 / 👎.
sidemantic/api_server.py
Outdated
| table = record_batch_reader_to_table(reader) | ||
| response_format = _resolve_response_format(request, format_override) |
There was a problem hiding this comment.
Avoid materializing full result before Arrow response
The response path reads the entire RecordBatchReader into a table before deciding the output format, so Arrow requests still fully materialize and then re-serialize all rows in memory. For large query results this can cause high memory usage or OOMs and defeats the expected scalability of Arrow transport.
Useful? React with 👍 / 👎.
- Fix unused duckdb import and unsorted imports in build_demo_duckdb.py - Use sqlglot.parse() for multi-statement detection instead of raw semicolon check, so semicolons inside string literals are accepted - Stream Arrow responses directly from RecordBatchReader to IPC bytes without materializing the full table first - Add tests for semicolon-in-literal and multi-statement rejection
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 49fec6607c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
sidemantic/api_server.py
Outdated
| body = await request.body() | ||
| if len(body) > self.max_body_bytes: |
There was a problem hiding this comment.
Enforce body limit before reading full request into memory
The size guard still buffers the entire payload with await request.body() before rejecting, so requests without a trustworthy Content-Length (e.g., chunked uploads) can consume unbounded memory and only fail after allocation. That defeats max_request_body_bytes as a hard safety limit and can OOM the API process under large or malicious inputs.
Useful? React with 👍 / 👎.
sidemantic/sql/query_rewriter.py
Outdated
| # mistaken for statement separators. | ||
| if ";" in sql: | ||
| statements = [s.strip() for s in sql.split(";") if s.strip()] | ||
| statements = sqlglot.parse(sql, dialect=self.dialect) |
There was a problem hiding this comment.
Keep strict=False fallback when semicolon parse fails
In non-strict mode this rewriter is expected to pass through unsupported SQL, but the new semicolon handling calls sqlglot.parse(...) outside the parse fallback block. If sqlglot cannot parse a semicolon-terminated client/system statement, it now raises before the strict=False passthrough can run, which can break PostgreSQL server flows that rely on non-strict rewriting.
Useful? React with 👍 / 👎.
sidemantic/server/common.py
Outdated
| if ";" in filter_str: | ||
| raise ValueError("Filter contains disallowed SQL: multi-statement input") |
There was a problem hiding this comment.
Allow semicolons in literal filter values
This rejects any filter containing ; before parsing, so valid single-expression filters like status = ';' (or other text values containing semicolons) are treated as disallowed SQL. That causes legitimate /query and /compile structured requests to fail even though no multi-statement execution is happening.
Useful? React with 👍 / 👎.
- Stream request body incrementally in MaxRequestBodyMiddleware so chunked uploads without Content-Length are rejected before full buffering (prevents OOM under large/malicious inputs) - Wrap sqlglot.parse() in query_rewriter semicolon check with try/except so non-strict mode still passes through unparseable SQL - Remove raw semicolon substring check from validate_filter_expression; use sqlglot.parse() after successful parse_one so semicolons inside string literals in filters are accepted
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: af912b65f9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
sidemantic/api_server.py
Outdated
| # literals (e.g. SELECT ';') are not rejected. | ||
| import sqlglot | ||
|
|
||
| statements = sqlglot.parse(normalized) |
There was a problem hiding this comment.
Handle parse failures as 4xx in SQL normalization
/sql and /sql/compile both call _normalize_sql_query, but sqlglot.parse(normalized) can throw non-ValueError parse exceptions on malformed input; those bypass the app’s ValueError handler and surface as 500 responses. That turns ordinary client-side bad SQL into an internal server error path, so parse errors should be caught here and re-raised as ValueError (or HTTPException) to return a stable 4xx.
Useful? React with 👍 / 👎.
Prevents malformed SQL from surfacing as 500 instead of 400.
No description provided.