Skip to content

Add stale query warning with revert button for query executions#1679

Closed
wbh123456 wants to merge 11 commits intopinterest:masterfrom
wbh123456:bohaowang/stale-warning
Closed

Add stale query warning with revert button for query executions#1679
wbh123456 wants to merge 11 commits intopinterest:masterfrom
wbh123456:bohaowang/stale-warning

Conversation

@wbh123456
Copy link
Copy Markdown
Contributor

@wbh123456 wbh123456 commented Mar 26, 2026

Summary

Add execution snapshots and stale query warning

When a user edits a query after running it, the displayed execution results may no longer match the current query text. This PR adds a "stale query" warning banner that appears when the editor content diverges from what was actually executed. The warning includes a Revert button that restores the editor to the exact query that produced the selected execution.

Key changes:

  • useExecutionSnapshots hook to capture the query text at execution time
  • useStaleQueryWarning hook with debounced diff detection
  • StaleQueryWarning component with revert action
  • Integrated into both DataDoc query cells and the Query Composer
  • Unit tests for snapshot storage and stale detection logic

Minor Changes

  • Consolidate the content on AGENTS.md and make the CLAUDE.md to point to a single source.

Jira

https://pinterest.atlassian.net/browse/AP-5669

Demo

Screen.Recording.2026-04-02.at.4.15.39.PM.mov

- Implemented `addRunInputSnapshot` and `useExecutionSnapshots` for managing query execution inputs.
- Added `shouldComputeStaleWarning` and `useStaleQueryWarning` to handle stale query notifications.
- Integrated snapshot recording in `DataDocQueryCell` and `QueryComposer` components.
- Displayed stale query warnings in `DataDocQueryExecutions` and `QueryComposerExecution` components.

Made-with: Cursor
- Created AGENTS.md to provide an overview of Querybook, including its tech stack, directory layout, plugin system, and local development instructions.
- Added CLAUDE.md as a reference to AGENTS.md for internal documentation purposes.

Made-with: Cursor
@wbh123456 wbh123456 force-pushed the bohaowang/stale-warning branch from ef80698 to 3e51487 Compare March 26, 2026 21:23
- Added `getSnapshotForExecution` utility to retrieve execution snapshots based on execution ID.
- Updated `useStaleQueryWarning` to include `onRevert` functionality for reverting to previous query states.
- Modified `StaleQueryWarning` component to display a revert button when applicable.
- Integrated snapshot handling in `DataDocQueryExecutions` and `QueryComposerExecution` components to improve user experience with stale queries.
- Updated AGENTS.md to clarify CI checks for backend and frontend changes, including Docker and non-Docker testing commands.
- Added details on formatting checks and pre-commit hook setup to prevent common CI failures.
- Revised README.md to link to the full contribution guide, emphasizing local test execution.
- Expanded contributing.mdx with comprehensive testing instructions for both backend and frontend, including CI-equivalent commands and linting setup.
@wbh123456 wbh123456 changed the title Add execution snapshots and stale query warning Add stale query warning with revert button for query executions Mar 27, 2026
Comment thread docs_website/docs/developer_guide/contributing.mdx Outdated
Comment thread docs_website/docs/developer_guide/contributing.mdx Outdated
Comment thread querybook/webapp/components/DataDocQueryExecutions/StaleQueryWarning.tsx Outdated
}

export const StaleQueryWarning: React.FC<IProps> = ({ onRevert }) => (
<Message
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the warning a bit less intrusive; I imagine the majority of users are intentionally making change.

Can we hook into the execution state indicator below the cell and add a new state (with tooltip) there? This is more inline with the patterns that other notebook tools use

Copy link
Copy Markdown
Contributor Author

@wbh123456 wbh123456 Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that it is a bit intrusive. I made it into a banner on purpose because we want to warn the user the query has been edited without an execution to remind them to perform the execution. This will also avoid the confusion of sending AI the staled query.

While making the warning too large makes it too intrusive, making it too small defeats the purpose of the warning.

I will play around with a few different UX and let's sync offline on this UI design and find a middle ground.

Comment thread AGENTS.md Outdated
Comment on lines +83 to +86
## Pinterest Internal Deployment

Pinterest deploys Querybook internally via a separate repo (`datahub-pinterest`) that uses this repo's Docker image as a base and layers Pinterest-specific plugins on top. Changes to core Querybook go here; Pinterest-specific features belong in that plugin repo. Do not add Pinterest-internal details to this file.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not include this in the oss repo

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be helpful to AI agents to cross referencing the codes if they are both in the local. The section here only specifies a minimal of info with the repo name. I believe it is not too sensitive to share this info in this repo.

WDYT on this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree that this isn't too sensitive! My concern is more that this is going to pollute the context of any users outside of Pinterest that use a coding agent; since this is a top level agents file it'll likely always get picked up and could cause issues

If we have issues with agents not knowing where certain code should live, we should iterate on the datahub agents

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pollute the context of any users outside of Pinterest that use a coding agent
I agree with you. Let me remove this section.

Comment thread AGENTS.md
Comment thread AGENTS.md
Comment thread AGENTS.md
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
@wbh123456 wbh123456 force-pushed the bohaowang/stale-warning branch from 9cecad8 to 8989468 Compare April 2, 2026 20:15
Comment thread AGENTS.md Outdated
Comment on lines +83 to +86
## Pinterest Internal Deployment

Pinterest deploys Querybook internally via a separate repo (`datahub-pinterest`) that uses this repo's Docker image as a base and layers Pinterest-specific plugins on top. Changes to core Querybook go here; Pinterest-specific features belong in that plugin repo. Do not add Pinterest-internal details to this file.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree that this isn't too sensitive! My concern is more that this is going to pollute the context of any users outside of Pinterest that use a coding agent; since this is a top level agents file it'll likely always get picked up and could cause issues

If we have issues with agents not knowing where certain code should live, we should iterate on the datahub agents

Comment thread querybook/webapp/hooks/queryEditor/useStaleQueryWarning.ts
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just directly access execution state, rather than use snapshotting?

It seems like the current implementation doesn't bootstrap well (if the saved query doesn't match execution on opening a datadoc), and doesn't respect variables changes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good point and is it something we have to make tradeoff upon. I think this is something worth discussing:

  • Snapshot approach (current): Compares raw editor text at run time vs raw editor text now -- apples-to-apples. But this creates the bootstrapping problem you mentioned above.
  • Execution state approach (suggested): Always available from Redux on load, no bootstrapping gap. But execution.query is the fully transformed SQL (post-Jinja, post-sampling, post-row-limit). Comparing it against raw editor text would cause false-positive stale warnings on every query using templates or sampling.

I think neither approach is not perfect unless we make some more significant changes into the system to perfectly track the query executions. Not mentioning that there will also be "selected query" vs. "full query in the editor" problem.

I decided to use the Snapshot approach approach in the end because it is the most accurate - only with the tradeoff on some false negatives (bootstrapping problem). I believe false negatives are better then false positives in this case.

So to put in simple words, I dont think there is a perfect solution here but the current approach is the one that works the best. We can discuss more on this offline.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does having the separate saved stale check add anything other than allowing us to display different tooltips? May be worth removing that extra logic to simplify this hook. In the end, we are only trying to indicate the current SQL does not match the viewed execution

Comment on lines +74 to +76
{warningState && (
<StaleQueryWarning variant={warningState} />
)}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we render this in the same position as in the datadoc implementation

peerReviewParams
)
);
recordRunInputSnapshot(data.id, query);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we want to snapshot the query instead of the query (q) that the execution is created with?

@wbh123456
Copy link
Copy Markdown
Contributor Author

wbh123456 commented Apr 15, 2026

Risks with stale-warning badge

Approach 1: In-Memory Snapshot (in the PR)

Compare the live editor text against a snapshot of the raw editor text captured at run time, stored in React component state.

  • Bootstrapping / page refresh (false negative): Snapshots live only in browser memory. After a refresh, the system falls back to the current saved cell content, so edits made before the last run are invisible — no warning appears even though results are stale.
  • Cross-session and collaborator edits (false negative): If another user (or the same user in a previous session) edited the cell after the last run, the fallback baseline is the new text, masking the mismatch.
  • No server-side raw-input record: QueryExecution.query stores rendered SQL, not what was in the editor. Persisting snapshots server-side would require a schema migration and ongoing storage cost for every execution.

Approach 2: Compare Against Executed Query in DB

Compare the live editor text against QueryExecution.query (the rendered, actually-executed SQL).

  • Templated queries (false positive): The editor contains {{ today }} but the DB stores the resolved date. These never match, so the banner fires permanently on any templated cell.
  • Macros (false positive): Inline {% macro %} blocks and macros.ds_add(...) calls expand during rendering. The raw-vs-rendered mismatch guarantees a perpetual warning.
  • Multi-statement mismatch (false positive): The comparison is cell-level, but statement splitting, comment stripping, and whitespace normalization mean the stored SQL rarely matches the raw editor text character-for-character, even immediately after running.

Decision: Not Implementing

While a theoretically perfect solution may exist (e.g., persisting raw input server-side, tracking variable values, doing per-statement diffing), the engineering effort is disproportionate to the benefit. Comparable platforms such as Google Colab and Jupyter do not provide stale-result warnings either. We will leave staleness management to the user/author and remove the prototyped StaleQueryWarning code.

@wbh123456 wbh123456 closed this Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants