fix: embedded benchmark timing fallback + Python wrapper resilience by genezhang · Pull Request #250 · genezhang/clickgraph

genezhang · 2026-03-28T01:57:40Z

Summary

Benchmark runner falls back to wall-clock timing when FFI bindings are stale
Python wrapper handles missing FFI methods gracefully (no crash on stale _ffi.py)

First LDBC Embedded Results (chdb + sf0.003 Parquet)

26/36 passed, 5 skipped, 10 failed (72% pass rate)
Database init: 0.22s

Category	Count	Details
Passed	26	short-1..7, complex-1..6,8,9,11,13, bi-1..3,5,9,11..14
chdb timeout	5	complex-7,10, bi-7,17,18 (session expired)
Parse gap	2	bi-4 (CALL subquery), bi-6 (syntax)
CTE resolution	2	complex-12, bi-8 (column not found)
Type mismatch	1	complex-14 (UInt64 vs Int64)
Skipped	5	bi-10,15,16,19,20 (known feature gaps)

🤖 Generated with Claude Code

- embedded_benchmark.py: Fall back to wall-clock timing when FFI bindings lack get_compiling_time/get_execution_time (stale bindings). Use len(list(qr)) for row count instead of qr.num_rows. - clickgraph/__init__.py: Gracefully handle missing FFI methods in QueryResult.__init__ with try/except (prevents crash when bindings are not regenerated after new FFI methods are added). First end-to-end LDBC results on embedded chdb: 26/36 passed, 5 skipped, 10 failed (72% pass rate) Failures: 5 chdb session timeout, 2 parse gaps, 2 CTE resolution, 1 type mismatch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

genezhang

Code Review: Timing fallback + Python wrapper resilience

Small PR, two changes. The intent is good — don't crash on stale FFI bindings — but I have concerns about the approach.

Observations

1. try/except AttributeError in the wrapper is a band-aid

The three separate try/except blocks in QueryResult.__init__ mask a real problem: stale _ffi.py bindings that are out of sync with libclickgraph_ffi.so. Silent fallback (0.0 timing, empty types) means users won't know their bindings are stale — they'll just get wrong/missing data with no warning.

Better approach: a single try/except with a logged warning:

try:
    self._compiling_time = ffi_result.get_compiling_time()
    self._execution_time = ffi_result.get_execution_time()
    self._column_data_types = ffi_result.get_column_data_types()
except AttributeError:
    import warnings
    warnings.warn(
        "Stale FFI bindings: timing/type methods not available. "
        "Regenerate with: uniffi-bindgen generate ...",
        stacklevel=2,
    )
    self._compiling_time = 0.0
    self._execution_time = 0.0
    self._column_data_types = []

This way the user sees a clear warning once, not silent zeros.

2. result.row_count = len(list(qr)) consumes the iterator

The original code used qr.num_rows (a property). The new code does len(list(qr)) which materializes all rows into a list and then discards them. This:

Wastes memory for large result sets
Consumes the iterator — the qr object is now exhausted
Means the verbose first-row print was removed (the qr[0] line) likely because the iterator was already consumed

qr.num_rows should still work since it's a property on the wrapper, not a timing method. Was this change intentional? If num_rows also fails on stale bindings, wrap it in the same try/except rather than consuming the iterator.

3. Benchmark timing fallback is fine

The try/except AttributeError with wall-clock fallback in embedded_benchmark.py is reasonable for a benchmark runner — it's a development tool, not library code. The wall-clock fallback preserves the benchmark's ability to report something even with older bindings.

4. Verbose output removed

The --verbose first-row print was deleted:

-                if args.verbose and qr.num_rows > 0:
-                    print(f"\n--- {query_id} (first row) ---")
-                    print(qr[0])

This was useful for debugging query results. If it was removed because qr[0] fails after list(qr) consumes the iterator, fix the root cause (use num_rows) rather than removing the feature.

Verdict

The benchmark runner change is fine. The Python wrapper change needs refinement — use a single warning instead of silent fallback, and keep num_rows instead of len(list(qr)).

- Python wrapper: consolidated 3 try/except into single block with warnings.warn() including regeneration instructions - Benchmark: restored qr.num_rows (doesn't consume iterator) and --verbose first-row output Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

genezhang commented Mar 28, 2026

View reviewed changes

genezhang merged commit e3a9465 into main Mar 28, 2026
4 checks passed

genezhang deleted the test/embedded-benchmark-fixes branch March 28, 2026 03:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: embedded benchmark timing fallback + Python wrapper resilience#250

fix: embedded benchmark timing fallback + Python wrapper resilience#250
genezhang merged 2 commits intomainfrom
test/embedded-benchmark-fixes

genezhang commented Mar 28, 2026

Uh oh!

genezhang left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

genezhang commented Mar 28, 2026

Summary

First LDBC Embedded Results (chdb + sf0.003 Parquet)

Uh oh!

genezhang left a comment

Choose a reason for hiding this comment

Code Review: Timing fallback + Python wrapper resilience

Observations

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant