FIX: constraint check exception with snapshot memory leak in AO/AOCS #136

Merged 1 commit on Nov 3, 2023

Contributor

@lss602726449 lss602726449 commented Aug 10, 2023

closes: #93


Change logs

This bug comes from AppendOnlyVisimapStore_Init calling RegisterSnapshot (on the appendonly_fetch_init path), which creates a deep copy of the current snapshot. When a constraint is violated (see the function check_exclusion_or_unique_constraint for details), an ERROR is raised immediately, so the UnregisterSnapshot in AppendOnlyVisimapStore_Finish is never called. As a result, an assertion failure occurs when the resource is finally released.

To solve this problem, we remove the RegisterSnapshot call in AppendOnlyVisimapStore_Init, which is unnecessary, together with the matching UnregisterSnapshot call.
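The failure mode described above can be sketched in plain C. This is a toy model under stated assumptions, not the Cloudberry code: `toy_owner_refs`, `toy_visimap_init`, `toy_visimap_finish`, and `toy_unique_check` are hypothetical names, and `setjmp`/`longjmp` only stands in for the way an ERROR unwinds past cleanup code.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical counter: registrations the toy "resource owner" still holds. */
static int toy_owner_refs;

/* Stand-in for the jump target that an ERROR unwinds to. */
static jmp_buf toy_error_ctx;

/* Init registers a snapshot with the owner. */
static void toy_visimap_init(void)
{
    toy_owner_refs++;
}

/* Finish releases the registration taken by init. */
static void toy_visimap_finish(void)
{
    assert(toy_owner_refs > 0);
    toy_owner_refs--;
}

/*
 * Runs a toy uniqueness check and returns how many registrations are
 * still outstanding afterwards. When 'violate' is true, the error path
 * jumps out before toy_visimap_finish() can run.
 */
int toy_unique_check(bool violate)
{
    toy_owner_refs = 0;

    if (setjmp(toy_error_ctx) == 0)
    {
        toy_visimap_init();
        if (violate)
            longjmp(toy_error_ctx, 1);   /* "ERROR": cleanup below is skipped */
        toy_visimap_finish();
    }
    return toy_owner_refs;
}
```

In this sketch `toy_unique_check(false)` leaves nothing registered, while `toy_unique_check(true)` leaves one registration outstanding, which corresponds to the leftover snapshot that is detected when resources are released.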

Why are the changes needed?

FIX: constraint check exception with snapshot memory leak in AO/AOCS

Does this PR introduce any user-facing change?

No

How was this patch tested?

No need

Contributor's Checklist

Here are some reminders and checklists before/when submitting your pull request, please check them:

  • Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
  • Sign the Contributor License Agreement as prompted for your first-time contribution.
  • List your communication in the GitHub Issues or Discussions (if any).
  • Document changes.
  • Add tests for the change
  • Pass make installcheck
  • Pass make -C src/test installcheck-cbdb-parallel
  • Feel free to @cloudberrydb/dev team for review and approval when your PR is ready🥳

@lss602726449 lss602726449 force-pushed the snapshot_leak2 branch 3 times, most recently from 66659dc to 0e60b4c Compare August 16, 2023 05:11
@lss602726449 lss602726449 changed the title delete register and unregister in AppendOnlyVisimapStore_Init FIX: constraint check exception with snapshot memory leak in AO/AOCS Aug 21, 2023
@avamingli
Collaborator

Need a rebase.

_bt_check_unique(DirtySnapshot is created here)
   -> table_index_fetch_tuple_check
      -> ...
         -> AppendOnlyVisimapStore_Init
            -> RegisterSnapshot(snapshot)
UnregisterSnapshot is called in AppendOnlyVisimapStore_Finish. If the
transaction aborts, AppendOnlyVisimapStore_Finish is never called.

To solve this problem, we remove the RegisterSnapshot call in
AppendOnlyVisimapStore_Init, which is unnecessary, together with the
matching UnregisterSnapshot call.
@my-ship-it my-ship-it merged commit f544f1d into cloudberrydb:main Nov 3, 2023
9 of 10 checks passed
@liming01
liming01 commented Nov 21, 2023

@lss602726449 , @avamingli , @my-ship-it

It seems that this fix has 2 problems:

  • AppendOnlyVisimapStore_Init() and AppendOnlyVisimapStore_Finish() can be called in other scenarios, i.e. _bt_check_unique() is not the only function that calls these two functions internally. So we cannot simply remove the call to UnregisterSnapshot().
  • My guess at the root cause of the crash in this scenario: the stack variable DirtySnapshot has a random initial value ( ref to link ), and when snapshot->copied is true, RegisterSnapshotOnOwner() registers this snapshot directly instead of copying it. So we just need to memset DirtySnapshot to '\0' before using it.
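The hypothesis in the second point can be illustrated with a small C sketch. It is a toy model only: `ToySnap`, `toy_register_on_owner`, and `toy_registered_in_place` are hypothetical names, and stack garbage is simulated with an explicit 0xFF fill because reading a truly uninitialized variable is undefined behavior. It shows how a nonzero `copied` flag would make a register routine keep the caller's stack variable instead of making its own copy, and how zeroing the struct first avoids that.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced snapshot: only the fields the sketch needs. */
typedef struct ToySnap
{
    int copied;      /* nonzero means "already a private copy" */
    int regd_count;  /* number of registrations */
} ToySnap;

/*
 * Toy register routine: trusts 'copied'. If the flag is set, the caller's
 * struct is registered in place; otherwise a heap copy is made first.
 */
ToySnap *toy_register_on_owner(ToySnap *snap)
{
    ToySnap *target = snap;

    if (!snap->copied)
    {
        target = malloc(sizeof *target);
        assert(target != NULL);
        *target = *snap;
        target->copied = 1;
        target->regd_count = 0;
    }
    target->regd_count++;
    return target;
}

/*
 * Returns 1 if the stack variable itself was registered, 0 if a copy was.
 * 'zero_first' selects whether the struct is memset to zero before use.
 */
int toy_registered_in_place(int zero_first)
{
    ToySnap local;
    ToySnap *registered;
    int in_place;

    memset(&local, 0xFF, sizeof local);      /* simulated stack garbage */
    if (zero_first)
        memset(&local, 0, sizeof local);     /* the proposed fix */

    registered = toy_register_on_owner(&local);
    in_place = (registered == &local);
    if (!in_place)
        free(registered);
    return in_place;
}
```

Without the zeroing step the sketch registers the stack variable in place; with it, a separate copy is registered. Whether this is what actually happens in the real code path is exactly what the comment above leaves as a guess.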

@avamingli
Collaborator

Hi, thanks! Reasonable, +1.

@lss602726449
Contributor Author

Thanks, that was my mistake. I'll fix this problem and test it as you suggest.

@lss602726449
Contributor Author

@avamingli, @liming01
I'm sorry for replying so late. I tried to fix this problem by setting snapshot->copied. However, it doesn't work. I also found that DirtySnapshot does not cause a deep copy in CopySnapshot (see the check below). I'm thinking about other solutions.

	if (!IsMVCCSnapshot(snapshot))
		return snapshot;
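The effect of that early return can be sketched as follows. This is a minimal illustration under assumptions, not the project's CopySnapshot: `SketchSnapshot`, `sketch_copy_snapshot`, and `sketch_copy_aliases_original` are hypothetical names, and the `is_mvcc` field stands in for the `IsMVCCSnapshot()` test.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical snapshot type; is_mvcc stands in for IsMVCCSnapshot(). */
typedef struct SketchSnapshot
{
    int is_mvcc;
    int regd_count;
} SketchSnapshot;

/* Mirrors the quoted check: non-MVCC snapshots are handed back unchanged. */
SketchSnapshot *sketch_copy_snapshot(SketchSnapshot *snapshot)
{
    SketchSnapshot *copy;

    if (!snapshot->is_mvcc)
        return snapshot;            /* no deep copy is made */

    copy = malloc(sizeof *copy);
    assert(copy != NULL);
    *copy = *snapshot;
    return copy;
}

/* Returns 1 when the "copy" is really the caller's stack variable. */
int sketch_copy_aliases_original(int is_mvcc)
{
    SketchSnapshot local = { is_mvcc, 0 };
    SketchSnapshot *result = sketch_copy_snapshot(&local);
    int aliases = (result == &local);

    if (!aliases)
        free(result);
    return aliases;
}
```

For a non-MVCC input (such as a dirty snapshot in this toy model) the returned pointer aliases the caller's variable, so anything registered afterwards refers to stack memory rather than to a separate copy.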

baotingfang pushed a commit that referenced this pull request Dec 1, 2023
…#136)

Successfully merging this pull request may close these issues.

CHECK index unique throw exception