Summary:
Original commit: 4818b7136e8308035a7a4c96c83b3401a350c54d / D40404
A bug was discovered in INSERT with ON CONFLICT DO UPDATE where some
secondary indexes failed to be updated and ended up corrupted. This bug has
multiple causes.
The main cause is duplicated logic that builds the list of indexes unaffected by
the update. Historically this logic was part of the function that detects whether
the update statement is a single line update. One of the single line update
criteria is that all indexes are unaffected by the update, so the list of unaffected
indexes was a side effect of this function. Even if the statement is found not to be
a single line update, this list is used further on to skip updates of those indexes.
However, the list was not always built. It wasn't built in the case of INSERT with
ON CONFLICT DO UPDATE, and in some cases the function determined that the
statement is not a single line update and returned before building the list. Hence,
as part of the more advanced index update optimization, functionality was added
to build the list of unaffected indexes, along with the list of indexes that may be
unaffected per row, depending on whether old and new values differ. This second
implementation added entries to the unaffected list even if primary key columns
were affected. That bug is fixed in this diff; however, the proper fix would be to
deduplicate the logic. That needs a follow-up change;
https://github.com/yugabyte/yugabyte-db/issues/25200 was filed to track it.
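The rule this diff enforces when building the list can be sketched as follows.
This is a minimal Python illustration, not the actual implementation: `Index`,
`unaffected_indexes`, and the column-set representation are hypothetical names
chosen for the example. It assumes that an update of a primary key column
affects every secondary index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    """Hypothetical stand-in for a secondary index: a name and its columns."""
    name: str
    columns: frozenset


def unaffected_indexes(indexes, pk_columns, updated_columns):
    """Return the names of indexes the update may always skip."""
    updated = set(updated_columns)
    # Assumption: if any primary key column is updated, no index can be
    # treated as unaffected, so the list must stay empty.
    if updated & set(pk_columns):
        return []
    # Otherwise an index is unaffected when none of its columns is updated.
    return [ix.name for ix in indexes if not (ix.columns & updated)]
```

With indexes on `a` and `b` and a primary key `id`, updating only `a` leaves the
index on `b` in the list, while updating `id` leaves the list empty.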
The bug was mitigated by the per-row logic, where the unaffected indexes list is
cleared if a PK column is found to be affected. Clearing the list of unaffected
indexes being built for that row is the right thing to do, but it also means the list
of always unaffected indexes should be empty to begin with. This diff adds an
assertion to validate that.
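The per-row logic and the assertion can be pictured with a short sketch. It is
written in Python for brevity and is only an illustration under assumptions: the
function name `skip_list_for_row`, its parameters, and the dict-based rows are
hypothetical and do not correspond to names in the code base.

```python
def skip_list_for_row(always_unaffected, maybe_unaffected, pk_columns,
                      old_row, new_row):
    """Build the list of indexes whose update can be skipped for one row.

    maybe_unaffected maps an index name to the columns it covers
    (hypothetical representation).
    """
    skip = list(always_unaffected)
    for name, columns in maybe_unaffected.items():
        # An index is unaffected for this row if none of its columns changed.
        if all(old_row[c] == new_row[c] for c in columns):
            skip.append(name)
    if any(old_row[c] != new_row[c] for c in pk_columns):
        # A PK change means no index may be skipped for this row. The list of
        # always unaffected indexes must therefore have been empty already.
        assert not always_unaffected, "always-unaffected list must be empty"
        skip.clear()
    return skip
```

If the primary key value changes while the always unaffected list is not empty,
the assertion fails instead of silently skipping an index update.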
The third cause was a discrepancy in the logic that decides whether the advanced
index optimization should be applied. The planner allowed the advanced index
optimization for INSERT with ON CONFLICT DO UPDATE, but the executor did not,
which is why the mitigation explained in the previous paragraph did not work. The
actual logic in the executor can be simpler: apply the optimization if it is (still)
allowed in the configuration and the planner prepared a data structure for it. The
latter basically makes the criteria in the planner the source of truth.
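The simplified executor-side check could look roughly like the sketch below. It
is Python pseudocode made runnable, and the names `should_apply_optimization`,
`config_enabled`, and `planner_info` are hypothetical placeholders for the
configuration setting and the planner-prepared data structure.

```python
from typing import Optional


def should_apply_optimization(config_enabled: bool,
                              planner_info: Optional[object]) -> bool:
    """Decide in the executor whether to apply the index update optimization.

    The executor does not re-evaluate statement-level criteria; the presence
    of planner_info means the planner already decided the statement qualifies.
    """
    return config_enabled and planner_info is not None
```

Keeping the statement-level criteria only in the planner avoids a second copy of
the decision that could drift out of sync.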
Jira: DB-14205
Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressUpdateOptimized#schedule'
Reviewers: kramanathan, smishra
Reviewed By: kramanathan
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D40674