fix(mixed-timeseries): preserve all-NaN metric columns after pivot when Jinja evaluates to NULL by aminghadersohi · Pull Request #40004 · apache/superset

aminghadersohi · 2026-05-11T00:24:05Z

SUMMARY

When a Jinja conditional metric expression (e.g. {% if filter_values('x') %}SUM(col){% else %}null{% endif %}) evaluates to null because no dashboard filter is selected, the SQL query returns an all-NULL column for that metric. pandas pivot_table with dropna=True (the default, controlled by drop_missing_columns: !show_empty_columns in pivotOperator.ts) then silently drops that column from the pivot result.

Downstream post-processing steps — rename and rolling — use validate_column_args to assert that all referenced columns exist before executing. Because the metric column was dropped, they raise:

InvalidPostProcessingError("Referenced columns not available in DataFrame.")

This surfaces as an error on mixed-timeseries charts whenever rename (triggered by groupby + truncate_metric) or rolling is configured.
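To make the failure mode concrete, here is a minimal sketch of the kind of column check described above. It is a simplified stand-in, not Superset's actual validate_column_args decorator: check_columns, rename_metrics, and the locally defined InvalidPostProcessingError class are all hypothetical names used for illustration only.

```python
import pandas as pd


class InvalidPostProcessingError(Exception):
    """Local stand-in for Superset's exception of the same name."""


def check_columns(df: pd.DataFrame, referenced: list[str]) -> None:
    # Hypothetical simplification: fail if any referenced column is absent.
    if not set(referenced).issubset(df.columns):
        raise InvalidPostProcessingError(
            "Referenced columns not available in DataFrame."
        )


def rename_metrics(df: pd.DataFrame, mapping: dict[str, str]) -> pd.DataFrame:
    # Validate first, then rename, mirroring the order described in the summary.
    check_columns(df, list(mapping))
    return df.rename(columns=mapping)


# A post-pivot frame in which the all-NaN metric has already been dropped.
pivoted = pd.DataFrame({"sum_value": [10.0, 20.0]})

try:
    rename_metrics(pivoted, {"sum_value": "Total", "jinja_metric": "Filtered"})
    error_message = None
except InvalidPostProcessingError as exc:
    error_message = str(exc)
```

The rename itself never runs: the check fails first because jinja_metric is no longer a column.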

Root cause chain:

1. Jinja template with filter_values() evaluates to null when no filter is selected
2. SQL returns an all-NULL column for the metric
3. pivot_table(dropna=True) drops the all-NaN column
4. rename / rolling post-processing fails because the column is missing
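The pivot step in this chain can be reproduced with plain pandas. The sketch below assumes a flat pivot (no groupby columns); the column names dttm, sum_value, and jinja_metric and the pivot_metrics helper are made up for illustration.

```python
import numpy as np
import pandas as pd

# jinja_metric stands in for a metric whose SQL expression evaluated to NULL.
raw = pd.DataFrame(
    {
        "dttm": pd.to_datetime(["2024-01-01", "2024-01-02"]),
        "sum_value": [10.0, 20.0],
        "jinja_metric": [np.nan, np.nan],
    }
)


def pivot_metrics(df: pd.DataFrame, dropna: bool) -> pd.DataFrame:
    # Same pivot, varying only the dropna flag.
    return df.pivot_table(
        index="dttm",
        values=["sum_value", "jinja_metric"],
        aggfunc="mean",
        dropna=dropna,
    )


dropped = pivot_metrics(raw, dropna=True)   # all-NaN column is removed
kept = pivot_metrics(raw, dropna=False)     # all-NaN column survives

print(list(dropped.columns))
print(list(kept.columns))
```

With dropna=True the result only contains sum_value, which is exactly the schema change that later trips the column validation.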

Fix: Introduce _restore_dropped_metric_columns() in pivot.py which re-adds any expected metric columns that pivot_table dropped due to all-NaN values. This preserves the expected DataFrame schema for downstream post-processing, while still dropping sparse category combinations (the intended behavior of drop_missing_columns).
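As a rough illustration of the idea for the flat (no groupby) case, the sketch below re-adds missing metrics as all-NaN columns. The function name restore_flat_metrics and its signature are assumptions made for this example; the PR's _restore_dropped_metric_columns() may differ in both shape and behaviour.

```python
import numpy as np
import pandas as pd


def restore_flat_metrics(
    pivoted: pd.DataFrame, expected_metrics: list[str]
) -> pd.DataFrame:
    """Hypothetical helper: re-add metrics missing from a flat pivot result."""
    restored = pivoted.copy()
    for metric in expected_metrics:
        if metric not in restored.columns:
            # The metric was dropped as all-NaN, so restore it as all-NaN.
            restored[metric] = np.nan
    return restored


# A post-pivot frame where the all-NaN metric has already been dropped.
pivoted = pd.DataFrame(
    {"sum_value": [10.0, 20.0]},
    index=pd.Index(pd.to_datetime(["2024-01-01", "2024-01-02"]), name="dttm"),
)

restored = restore_flat_metrics(pivoted, ["sum_value", "jinja_metric"])
```

Only columns named in expected_metrics are touched, so anything else that dropna removed stays removed.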

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before: Mixed-timeseries chart with Jinja SQL metric and no active dashboard filter shows error "Referenced columns not available in DataFrame."

After: Chart renders correctly (empty series since metric evaluates to NULL, but no error).

TESTING INSTRUCTIONS

Create a mixed-timeseries chart with a SQL-type custom metric using a Jinja conditional:

{% if filter_values('my_column') %}
  SUM(CASE WHEN my_column IN {{ filter_values('my_column') | where_in }} THEN value ELSE 0 END)
{% else %}
  null
{% endif %}

View the chart without any dashboard filter selected
Before fix: chart shows "Referenced columns not available in DataFrame" error
After fix: chart renders without error (no data series shown for the NULL metric)

ADDITIONAL INFORMATION

Regression tests added in tests/unit_tests/pandas_postprocessing/test_pivot.py:

test_pivot_preserves_all_nan_metric_flat: flat pivot (no groupby) — metric column restored as all-NaN
test_pivot_preserves_all_nan_metric_with_columns: MultiIndex pivot (with groupby) — metric level preserved at level 0

…en Jinja evaluates to NULL

When a Jinja conditional metric expression (e.g. using filter_values()) evaluates to null/NULL because no dashboard filter is selected, the SQL query returns an all-NULL column. pandas pivot_table with dropna=True (controlled by drop_missing_columns, default True) then silently drops that column.

Downstream post-processing steps like rename and rolling use validate_column_args to assert referenced columns exist before executing. Because the metric column was dropped, they raise InvalidPostProcessingError("Referenced columns not available in DataFrame"), which surfaces as an error on mixed-timeseries charts.

Fix: introduce _restore_dropped_metric_columns() which re-adds any expected metric columns that pivot_table dropped due to all-NaN values. This keeps the DataFrame schema consistent with what the frontend and subsequent post-processing expect, while still allowing sparse category combinations to be dropped.

Fixes SC-100398

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

bito-code-review · 2026-05-11T00:24:14Z

Code Review Agent Run #fd63b5

Actionable Suggestions - 0

Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

tests/unit_tests/pandas_postprocessing/test_pivot.py - 2
- Missing return type hint · Line 209-209
- Missing return type hint · Line 234-234

Review Details

Files reviewed - 2 · Commit Range: 4da9c03..4da9c03
- superset/utils/pandas_postprocessing/pivot.py
- tests/unit_tests/pandas_postprocessing/test_pivot.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful


codecov · 2026-05-11T00:28:20Z

Codecov Report

❌ Patch coverage is 37.50000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.83%. Comparing base (f67dd4a) to head (4da9c03).
⚠️ Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
superset/utils/pandas_postprocessing/pivot.py	37.50%	7 Missing and 3 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #40004      +/-   ##
==========================================
- Coverage   63.83%   63.83%   -0.01%     
==========================================
  Files        2589     2589              
  Lines      137821   137837      +16     
  Branches    31928    31935       +7     
==========================================
+ Hits        87978    87984       +6     
- Misses      48327    48334       +7     
- Partials     1516     1519       +3

Flag	Coverage Δ
hive	`39.36% <12.50%> (-0.01%)`	⬇️
mysql	`59.00% <37.50%> (-0.01%)`	⬇️
postgres	`59.08% <37.50%> (-0.01%)`	⬇️
presto	`41.05% <12.50%> (-0.01%)`	⬇️
python	`60.52% <37.50%> (-0.01%)`	⬇️
sqlite	`58.72% <37.50%> (-0.01%)`	⬇️
unit	`100.00% <ø> (ø)`


aminghadersohi · 2026-05-11T00:29:35Z

Closing in favour of PR from correct fork.

codeant-ai-for-open-source · 2026-05-11T00:30:38Z

+            category_values = (
+                df.columns.get_level_values(-1).unique()
+                if len(df.columns) > 0
+                else [None]
+            )
+            for metric in missing:
+                for cat in category_values:
+                    df[(metric, cat)] = float("nan")


Suggestion: The restoration logic for MultiIndex pivots only uses the last column level (get_level_values(-1)) and then writes 2-tuples ((metric, cat)), which is incorrect when the pivot has more than one columns dimension. For pivots like columns=["a","b"], this creates malformed/incomplete column keys (or can raise when tuple depth does not match), so all-NaN metrics are not restored correctly and downstream operations can still fail. Build restored keys using the full non-metric tuple for every existing column combination (all levels after metric), not just the last level. [logic error]

Severity Level: Major ⚠️

- ❌ Multi-column pivot post-processing can crash during pivot().
- ❌ All-NaN metrics not restored for multi-level pivots.
- ⚠️ Mixed-timeseries charts using multi-column pivots may fail.
- ⚠️ Rolling/rename post-processing fails for affected metrics.

Steps of Reproduction ✅

1. Configure a query with post-processing in `superset/common/query_object.py:500-534` so that `exec_post_processing()` includes a pivot operation with `operation: "pivot"` and `options` containing `columns=["a", "b"]`, `drop_missing_columns=True` (default), and multiple metrics (e.g. `"metric": {"operator": "mean"}` and `"metric2": {"operator": "mean"}`).

2. Ensure the database result feeding this query contains multiple grouping columns `a` and `b` and that, for at least one metric (e.g. `"metric"`), all values are NULL/NaN for every row; this matches the all-NaN metric scenario already covered for single-column pivots in `tests/unit_tests/pandas_postprocessing/test_pivot.py:test_pivot_preserves_all_nan_metric_flat` (lines 70-92), but extended here to `columns=["a", "b"]` as in `test_pivot_eliminate_cartesian_product_columns` (lines 16-66).

3. When `exec_post_processing()` calls `pandas_postprocessing.pivot()` (implemented in `superset/utils/pandas_postprocessing/pivot.py:60-142`) with `columns=["a", "b"]`, `df.pivot_table(...)` (lines 122-131) produces a MultiIndex on `df.columns` where level 0 is the metric name and subsequent levels correspond to each `columns` entry (`a`, then `b`). Because one metric is entirely NaN and `dropna=drop_missing_columns=True`, pandas drops that metric's columns completely.

4. `_restore_dropped_metric_columns()` (lines 31-57 in `pivot.py`) then runs for this MultiIndex DataFrame. In the MultiIndex branch (lines 41-52), it collects only the last column level via `df.columns.get_level_values(-1).unique()` (line 45) and, for each missing metric, assigns new columns using 2-tuples `df[(metric, cat)] = float("nan")` (line 52). For a MultiIndex with more than two levels (metric + `a` + `b`), this key has the wrong depth: on current pandas versions this either (a) raises a pandas error when attempting to insert a column with a tuple shorter than `df.columns.nlevels`, or (b) creates malformed/incomplete column labels that don't match the original `(metric, a, b)` shape. In either case, the all-NaN metric is not properly restored for every `(a, b)` combination, so subsequent post-processing such as `rename` and `rolling` (both decorated with `validate_column_args` in `superset/utils/pandas_postprocessing/utils.py:114-131`) can still fail with `InvalidPostProcessingError("Referenced columns not available in DataFrame.")` when they reference that metric.


bito-code-review · 2026-05-11T00:34:03Z

The flagged issue is correct: the restoration logic incorrectly uses only the last column level for MultiIndex pivots, creating tuples with wrong depth for multi-column pivots. To resolve, modify the code to collect all unique combinations of non-metric levels and build full tuples. No other comments found in the PR.

superset/utils/pandas_postprocessing/pivot.py

if isinstance(df.columns, pd.MultiIndex):
    existing_metrics = set(df.columns.get_level_values(0))
    missing = [m for m in expected_metrics if m not in existing_metrics]
    if missing and df.columns.nlevels > 1:
        # Every surviving category combination, i.e. all levels after the metric
        unique_combos = df.columns.droplevel(0).unique()
        for metric in missing:
            for combo in unique_combos:
                # droplevel(0) yields scalars, not tuples, when one level remains
                suffix = combo if isinstance(combo, tuple) else (combo,)
                df[(metric, *suffix)] = float("nan")
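The full-depth key construction suggested in this review can be exercised on a small three-level frame (metric, a, b). This is a sketch under assumptions: restore_multilevel is a hypothetical name, the frame is hand-built rather than produced by Superset's pivot(), and the scalar-versus-tuple handling is added here because droplevel(0) returns a plain Index when only one level remains.

```python
import numpy as np
import pandas as pd


def restore_multilevel(
    df: pd.DataFrame, expected_metrics: list[str]
) -> pd.DataFrame:
    """Hypothetical helper: re-add dropped metrics for every category combo."""
    present = set(df.columns.get_level_values(0))
    # Category combinations that survived the pivot (all levels after the metric).
    combos = df.columns.droplevel(0).unique()
    for metric in expected_metrics:
        if metric in present:
            continue
        for combo in combos:
            # Normalise to a tuple so the key always matches df.columns.nlevels.
            suffix = combo if isinstance(combo, tuple) else (combo,)
            df[(metric, *suffix)] = np.nan
    return df


# Stand-in for a pivot with columns=["a", "b"] where "metric2" was dropped.
columns = pd.MultiIndex.from_tuples(
    [("metric", "a1", "b1"), ("metric", "a2", "b1")], names=[None, "a", "b"]
)
pivoted = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], columns=columns)

restored = restore_multilevel(pivoted, ["metric", "metric2"])
```

Each restored key has the same depth as the existing columns, so the restored metric lines up with every (a, b) combination that the surviving metric has.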

pull-request-size Bot added the size/M label May 11, 2026

dosubot Bot added the change:backend (Requires changing the backend), global:jinja (Related to Jinja templating), and viz:charts:timeseries (Related to Timeseries) labels May 11, 2026

aminghadersohi closed this May 11, 2026

codeant-ai-for-open-source Bot reviewed May 11, 2026

aminghadersohi wants to merge 1 commit into apache:master from aminghadersohi:bug-100398
