sqldb: avoid materializing non-terminal payments#10851
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly optimizes the Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the FetchNonTerminalPayments SQL query in both the source SQL file and the generated Go code. The common table expression (CTE) non_terminal_ids and its UNION operation have been removed. Instead, the non-terminal payment filtering logic has been inline-expanded directly into the main query's WHERE clause using OR and EXISTS/NOT EXISTS subqueries. I have no feedback to provide as there are no review comments.
Postgres validationWanted to add some context on why the SQLite plan was problematic and confirm the rewrite is safe on Postgres, since the PR description only covers SQLite. What was happening on SQLiteThe original query expressed "non-terminal payments" as a CTE that unioned two ID sets — payments without a settled attempt, and payments with any unresolved attempt — and then joined that CTE back to SQLite's planner could not push the cursor into the CTE. It produced: In other words, every call to The rewrite turns the selector payment-driven: scan First page on the affected DB drops from multi-second to ~0.4 s. Postgres — does this regress?Ran
Master plan excerpt — note the CTE is inlined into a Hash Join, not materialized as a separate node: This PR's plan excerpt — Why Postgres doesn't have the SQLite problemPostgres 12+ inlines non-recursive CTEs by default, so on master the For this PR's shape, Postgres chooses Conclusion
|
Motivation
FetchInFlightPaymentsusesFetchNonTerminalPaymentsduring SQL payment recovery, including router startup payment resume andTrackPaymentssubscription setup.The current query first builds a
non_terminal_idsCTE usingUNION, then applies pagination outside the CTE:On SQLite this can force materialization of the full candidate set before the outer
WHERE/ORDER BY/LIMITcan be applied. On a node with a large payment history,EXPLAIN QUERY PLANshowed:That means the query can effectively scan historical payment/attempt/resolution tables before returning a single page. In one observed DB,
payments WHERE fail_reason IS NULLalone contained ~250k rows, causing startup recovery to spend a long time insideFetchNonTerminalPayments.What Changed
This rewrites
FetchNonTerminalPaymentsto be payment-driven:The non-terminal checks are kept as correlated
EXISTS/NOT EXISTSpredicates on each candidate payment.This lets SQLite use the primary-key payment cursor first and stop once the requested page is found, instead of materializing all non-terminal IDs up front.
The observed query plan for the rewritten shape is:
On the affected DB, the rewritten query returned the first page in ~0.4s.
Semantics
The returned payment set is unchanged:
The change only alters the selector shape so pagination can be applied before scanning historical attempts/resolutions.
Testing
make sqlc