_operation_result re-fires itself via self.operations = result[3], doubling stats work on every quick_command_args change

## Problem

`_operation_result` in `buckaroo/dataflow/dataflow.py` is registered as an `@observe` callback on `('sampled_df', 'cleaning_method', 'quick_command_args', 'operations')`. The body of the method writes to a trait it observes:

```python
self.cleaned = result
self.operations = result[3]   # <-- re-fires the same observer
```

This causes the observer to fire **twice** per `quick_command_args` change: once for the external change, once for the inner reassignment. Every downstream observer (`cleaned → processed_result → _summary_sd → populate_df_meta`) runs twice. The existing `(id(df), id(klasses))` dedupe in `_summary_sd` (added per #709 for widget construction) can't help because the second pass legitimately produces a fresh `processed_df` object id.

The docstring already flags the hazard:

> Obviously this can lead to cycles so this code must be approached carefully.

…but the guard never landed.

## Reproduction

Test data: 500k-row citibike parquet (~34MB, 13 columns). Sending `buckaroo_state_change` with `quick_command_args = {"search": ["elect"]}` from a WS client, measuring round-trip:

| Backend | state_change | post-search page | total |
|---|---|---|---|
| pandas (`POST /load mode="buckaroo"`) | 1369ms | 4ms | 1374ms |
| xorq (`POST /load_expr`, #776) | 3970ms | 139ms | 4118ms |
| polars (`POST /load mode="lazy"`) | (timed out — see "out of scope" below) | — | — |

For comparison, the equivalent naive `df[c].str.contains('elect')` across all 13 string columns standalone (no widget) is **505ms**. So the widget overhead is ~2.7× even for pandas; this issue accounts for half of that.

Per-method timing during the xorq state_change (instrumentation reverted):

```
xorq backend, search "elect":
  _get_summary_sd:                    1837ms  processed_df_id=...578960
  populate_df_meta _expr_count(proc): 117ms
  populate_df_meta _expr_count(orig): 2ms
  [re-enters via self.operations = result[3]]
  _get_summary_sd:                    1846ms  processed_df_id=...066896  ← DUPLICATE
  populate_df_meta _expr_count(proc): 117ms                              ← DUPLICATE
  populate_df_meta _expr_count(orig): 2ms
  total:                              3970ms
```

Both passes produce the same logical result; the second is pure duplicate work.

Surfaced by the pydata-app companion (https://github.com/paddymul/pydata-app) once #776 wired up `/load_expr` and a 500k-row entry hit the search box.

## Proposed fix

Reentry guard in `_operation_result`:

```diff
     def _operation_result(self, _change:Any) -> None:
         \"\"\"...
+        Concretely: this method sets ``self.operations = result[3]`` while
+        also observing the ``operations`` trait, so the observer re-fires
+        inside its own execution. Without the reentry guard below, every
+        downstream observer (cleaned → processed_result → _summary_sd →
+        populate_df_meta) runs twice per single ``quick_command_args``
+        change — measured as ~2× search latency for a 500k-row xorq frame.
         \"\"\"
+        if getattr(self, '_in_operation_result', False):
+            return
+        self._in_operation_result = True
+        try:
             result = self.ac_obj.handle_ops_and_clean(
                 self.sampled_df, self.cleaning_method, self.quick_command_args, self.operations)

             if result is None:
                 return
             else:
                 self.cleaned = result
                 self.operations = result[3]
             self.operation_results = {'transformed_df':None,
                 'generated_py_code': self.generated_code}
+        finally:
+            self._in_operation_result = False
```

**Measured after fix:**

| Backend | state_change | total |
|---|---|---|
| pandas | 704ms (was 1369ms) | 708ms (was 1374ms) |
| xorq | 1919ms (was 3970ms) | 2061ms (was 4118ms) |

Roughly 50% reduction across the board. 943 of 944 buckaroo unit tests pass (the one failure is `tests/unit/server/test_mcp_uvx_install.py::test_uvx_no_stdout_pollution` — unrelated to the dataflow change).

## Why this matters

- Halves search latency on every catalog-entry view with a non-empty filter input — the workload the audience hits the moment they type into the search box.
- Removes pure duplicate work; no behavioural change. Both passes today produce equivalent state.
- Necessary precondition for any per-column-stats optimization downstream (otherwise improvements get halved by the duplicate fire).

## Out of scope

- **polars/lazy search isn't implemented server-side at all.** `_handle_buckaroo_state_change` early-returns for `mode != "buckaroo"` (`websocket_handler.py:54`) and `handle_infinite_request_lazy` doesn't read `search` from `payload_args` (`data_loading.py:164`). My probe above timed out for that reason, not because of this bug. Worth a separate issue.
- Remaining 1.8s in xorq is one `_get_summary_sd` call iterating per-column. Probably wants a future ticket: batch all column stats into one xorq query, or skip the per-column histogram refresh when only `quick_command_args.search` changed (schema hasn't actually moved).

Happy to PR this as a fixup on #776 or as a standalone PR against main — whichever you prefer.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

_operation_result re-fires itself via self.operations = result[3], doubling stats work on every quick_command_args change #780

Problem

Reproduction

Proposed fix

Why this matters

Out of scope

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Backend	state_change	post-search page	total
pandas (`POST /load mode="buckaroo"`)	1369ms	4ms	1374ms
xorq (`POST /load_expr`, #776)	3970ms	139ms	4118ms
polars (`POST /load mode="lazy"`)	(timed out — see "out of scope" below)	—	—

Backend	state_change	total
pandas	704ms (was 1369ms)	708ms (was 1374ms)
xorq	1919ms (was 3970ms)	2061ms (was 4118ms)

_operation_result re-fires itself via self.operations = result[3], doubling stats work on every quick_command_args change #780

Description

Problem

Reproduction

Proposed fix

Why this matters

Out of scope

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions