@cofin cofin commented Nov 18, 2025

Summary

Fixes COUNT query generation so that it validates the presence of a FROM clause before attempting to build a COUNT(*) query. Addresses an upstream bug report in which select_with_total raised a confusing error for SQL that has an ORDER BY clause but no FROM clause.

The Problem

select_with_total() raised "Cannot create COUNT query from empty SQL expression" when a SELECT statement lacked a FROM clause (e.g., "SELECT * ORDER BY id"). The root cause was that the code used the result of expr.args.get("from") without checking that it was not None, so None could be passed to sqlglot's .from_() method.

The Solution

Added explicit validation using sqlglot AST inspection: the code now checks that expr.args.get("from") exists before attempting COUNT generation. When it does not, it raises a clear, actionable error message: "SELECT statement missing FROM clause. COUNT queries require a FROM clause to determine which table to count rows from."
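
The guard described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `SelectNode` is a hypothetical stand-in for a parsed sqlglot SELECT expression (only its `args` mapping is modeled), `require_from_clause` is an invented helper name, and `ImproperConfigurationError` is redefined locally so the snippet runs without sqlglot or this library installed. The `"from"` key is taken from the description; the key name in a given sqlglot release may differ.

```python
from dataclasses import dataclass, field
from typing import Any


class ImproperConfigurationError(Exception):
    """Local stand-in for the library's exception of the same name."""


@dataclass
class SelectNode:
    """Hypothetical stand-in for a parsed SELECT node; only ``args`` is modeled."""

    args: dict[str, Any] = field(default_factory=dict)


def require_from_clause(expr: SelectNode) -> Any:
    """Return the FROM clause, or raise a clear error when it is absent."""
    from_clause = expr.args.get("from")
    if not from_clause:
        # Fail early instead of handing None to the COUNT query builder.
        msg = (
            "Cannot create COUNT query: SELECT statement missing FROM clause. "
            "COUNT queries require a FROM clause to determine which table "
            "to count rows from."
        )
        raise ImproperConfigurationError(msg)
    return from_clause


# Mimics "SELECT * ORDER BY id": no "from" entry in args.
no_from = SelectNode(args={"expressions": ["*"], "order": "id"})
# Mimics "SELECT * FROM users".
with_from = SelectNode(args={"expressions": ["*"], "from": "users"})
```

Calling `require_from_clause(with_from)` returns the stored FROM value, while `require_from_clause(no_from)` raises `ImproperConfigurationError` with the message quoted above.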

Key Changes

  • Validation: Check if not expr.args.get("from") before building COUNT query
  • Clear error message: Explains why FROM clause is required for COUNT queries
  • Comprehensive tests: 12 unit tests covering edge cases (ORDER BY only, WHERE only, empty SELECT, etc.)
  • Backwards compatible: No behavior change for valid SELECT...FROM queries

Test Coverage

  • ✅ 12 new unit tests in test_count_query_edge_cases.py
  • ✅ Tests missing FROM clause scenarios (ORDER BY only, WHERE only, SELECT *)
  • ✅ Tests valid SELECT...FROM cases still work correctly
  • ✅ Tests error message clarity and actionability
  • ✅ All existing tests pass (4100+ tests with no regressions)
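
The scenarios listed above lend themselves to a table-driven test. The sketch below is only an illustration of that shape and is not the content of test_count_query_edge_cases.py: the case names, the `has_from_clause` helper, and the dicts that mimic a parsed SELECT's `args` mapping are all hypothetical, and it uses the standard library's `unittest` rather than pytest so that it runs on its own.

```python
import unittest

# Hypothetical stand-ins: each dict mimics the ``args`` mapping of a parsed
# SELECT; the boolean is whether a COUNT query could be built from it.
CASES = {
    "order_by_only": ({"expressions": ["*"], "order": "id"}, False),
    "where_only": ({"expressions": ["*"], "where": "active = 1"}, False),
    "select_star": ({"expressions": ["*"]}, False),
    "valid_select_from": ({"expressions": ["*"], "from": "users"}, True),
}


def has_from_clause(args: dict) -> bool:
    """Illustrative check: a truthy "from" entry means a FROM clause exists."""
    return bool(args.get("from"))


class CountQueryEdgeCases(unittest.TestCase):
    def test_from_clause_detection(self) -> None:
        # subTest reports each scenario separately if one of them fails.
        for name, (args, expected) in CASES.items():
            with self.subTest(case=name):
                self.assertEqual(has_from_clause(args), expected)
```

Keeping the scenarios in one mapping makes it cheap to add a further edge case without writing another test function.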

Example Error Messages

Before (confusing):

ImproperConfigurationError: Cannot create COUNT query from empty SQL expression

After (clear):

ImproperConfigurationError: Cannot create COUNT query: SELECT statement missing FROM clause. 
COUNT queries require a FROM clause to determine which table to count rows from.

Testing

# Run new tests
uv run pytest tests/unit/test_driver/test_count_query_edge_cases.py -v

# Run full suite
uv run pytest -n 2 --dist=loadgroup

Adds validation to _create_count_query to ensure SELECT statements
have a FROM clause before attempting to generate COUNT(*) queries.

Previously, malformed SQL like "SELECT * ORDER BY id" would raise
a confusing "empty SQL expression" error. It now raises a clear error:
"SELECT statement missing FROM clause. COUNT queries require a FROM
clause to determine which table to count rows from."

Uses sqlglot AST inspection (expr.args.get("from")) to detect missing
FROM clauses before attempting to build COUNT queries. This prevents
None from being passed to sqlglot's .from_() method.

Test Plan:
- Added 12 comprehensive unit tests for edge cases
- Verified error messages are clear and actionable
- Confirmed no regression in valid SELECT...FROM queries
- All existing tests pass (4100+ tests)

@cofin cofin merged commit ef9c21b into main Nov 18, 2025
18 of 19 checks passed
@cofin cofin deleted the fix/select-with-total-from-validation branch November 18, 2025 15:08