Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: race condition on deletes #25170

Merged
merged 6 commits into from Feb 29, 2024
Merged

fix: race condition on deletes #25170

merged 6 commits into from Feb 29, 2024

Conversation

ankush
Copy link
Member

@ankush ankush commented Feb 29, 2024

Problem is best understood in form of a timeline:

Timestamp User A action User B action
1 Submits a draft document ---
2 (submission ongoing) deletes the same draft document
3 submission completed ---
4 --- gets lock, deletes the document assuming nothing changed

So a document with docstatus 1 in DB gets deleted.

If you look at binary logs MySQL is thinking:

  1. Draft created
  2. Draft submitted
  3. Submitted document deleted, even though in deleted document it will appear as draft document because of REPEATABLE READ isolation.

This is reliably reproducible using two consoles (to slow down the transaction commits)

Fix:

  • Just locking doc before delete is enough for preventing submission but not for preventing normal save + delete.
  • Lock the document with NOWAIT mode to fail immediately if document is locked.

Co-Authored-By: @rohitwaghchaure

TODO:

  • add test utils for simulating parallel execution. (either thread or using switching connections)
  • add tests for this

Locking only prevents this kinda race conditions:
- User A deletes doc
- User B modifies doc so that it's not deletable anymore.
@ankush ankush requested a review from a team as a code owner February 29, 2024 11:05
@ankush ankush requested review from akhilnarang and removed request for a team February 29, 2024 11:05
@ankush ankush marked this pull request as draft February 29, 2024 11:05
@github-actions github-actions bot added the add-test-cases Add test case to validate fix or enhancement label Feb 29, 2024
@ankush ankush marked this pull request as ready for review February 29, 2024 12:16
@ankush ankush added backport version-14-hotfix backport to version 14 backport version-15-hotfix Backport the PR to v15 and removed add-test-cases Add test case to validate fix or enhancement labels Feb 29, 2024
@ankush
Copy link
Member Author

ankush commented Feb 29, 2024

postgres at it again 🤡

@ankush ankush removed the backport version-14-hotfix backport to version 14 label Feb 29, 2024
@ankush ankush merged commit 39c511a into frappe:develop Feb 29, 2024
23 checks passed
@ankush ankush deleted the delete_conflict branch February 29, 2024 12:47
@ankush
Copy link
Member Author

ankush commented Feb 29, 2024

V14 merged on #25169

ankush added a commit that referenced this pull request Feb 29, 2024
* fix: lock the doc before deleting

Locking only prevents this kinda race conditions:
- User A deletes doc
- User B modifies doc so that it's not deletable anymore.

(cherry picked from commit 8f00aae)

* feat: nowait to skip blocking locks

(cherry picked from commit e810fb7)

* fix: prevent deletion if document is locked

(cherry picked from commit fc5ce04)

* test: utils for simulating two connections

(cherry picked from commit 5116768)

* test: NOWAIT functionality

(cherry picked from commit 0c9cc2e)

* fix(postgres): treat LockNotAvailable as timeout

It's a lock timeout in a way.

(cherry picked from commit b4fe722)

* test: separate out risky tests

---------

Co-authored-by: Ankush Menat <ankush@frappe.io>
ankush added a commit that referenced this pull request Feb 29, 2024
…ackport #24298 and #25170) (#25169)

* feat: Skip locked rows while selecting (#24298)

(cherry picked from commit e45e313)

* fix: lock the doc before deleting

Locking only prevents this kinda race conditions:
- User A deletes doc
- User B modifies doc so that it's not deletable anymore.

* feat: nowait to skip blocking locks

* fix: prevent deletion if document is locked

* test: utils for simulating two connections

* test: NOWAIT functionality

* fix(postgres): treat LockNotAvailable as timeout

It's a lock timeout in a way.

---------

Co-authored-by: Ankush Menat <ankush@frappe.io>
frappe-pr-bot pushed a commit that referenced this pull request Mar 5, 2024
# [14.67.0](v14.66.3...v14.67.0) (2024-03-05)

### Bug Fixes

* add "If Owner" column to roles viewer ([#25218](#25218)) ([#25219](#25219)) ([51db516](51db516))
* always show is_standard on web form ([#25144](#25144)) ([#25147](#25147)) ([92ea2e7](92ea2e7))
* dont get message attribute in error string in  has_login_limit_exceeded ([#25165](#25165)) ([ae07a32](ae07a32))
* dont translate numbers ([#25208](#25208)) ([#25209](#25209)) ([9a7b217](9a7b217))
* escape single quotes ([#25104](#25104)) ([#25153](#25153)) ([37bd2ad](37bd2ad)), closes [/github.com//pull/25078#discussion_r1504084483](https://github.com//github.com/frappe/frappe/pull/25078/issues/discussion_r1504084483)
* restrict method for security critical endpoints (backport [#25105](#25105)) ([#25107](#25107)) ([3b34ca6](3b34ca6))
* **search:** Don't break when query doesn't return title ([#25168](#25168)) ([#25196](#25196)) ([2718373](2718373))
* update file attached_to details in submitted doc ([#25140](#25140)) ([f116cdc](f116cdc))
* Use current language in attachment prints ([0e32b42](0e32b42))
* use name for RQ worker instead of PID (backport [#25175](#25175)) ([#25176](#25176)) ([359b103](359b103))
* **UX:** correctly disable standard web form form ([#25143](#25143)) ([#25145](#25145)) ([95f5dde](95f5dde))
* **UX:** list filter take zero as null ([#25156](#25156)) ([#25221](#25221)) ([d04607a](d04607a))
* **UX:** reload form after renaming field (backport [#25159](#25159)) ([#25160](#25160)) ([6c7b4cd](6c7b4cd))

### Features

* basic tracing using monitor trace id (backport [#22126](#22126)) ([#25216](#25216)) ([59791dc](59791dc))
* Skip locked rows while selecting + fix delete race condition (backport [#24298](#24298) and [#25170](#25170)) ([#25169](#25169)) ([1119bfd](1119bfd))
* track skipped patch with traceback (backport [#20931](#20931)) ([#25214](#25214)) ([f50b53d](f50b53d))
frappe-pr-bot pushed a commit that referenced this pull request Mar 5, 2024
# [15.17.0](v15.16.1...v15.17.0) (2024-03-05)

### Bug Fixes

* add "If Owner" column to roles viewer ([#25218](#25218)) ([#25220](#25220)) ([adf6a2a](adf6a2a))
* always show is_standard on web form ([#25144](#25144)) ([#25148](#25148)) ([196483b](196483b))
* better error message ([b94e978](b94e978))
* dont translate numbers ([#25208](#25208)) ([#25210](#25210)) ([c659eca](c659eca))
* escalate print failures ([a987c2d](a987c2d))
* escape single quotes ([#25104](#25104)) ([#25154](#25154)) ([adb7e38](adb7e38)), closes [/github.com//pull/25078#discussion_r1504084483](https://github.com//github.com/frappe/frappe/pull/25078/issues/discussion_r1504084483)
* Export `None` as type if select as no options ([#25211](#25211)) ([#25212](#25212)) ([79a8afd](79a8afd))
* filter Implementation is set operator ([#25182](#25182)) ([#25195](#25195)) ([20a7879](20a7879)), closes [#25180](#25180)
* No need to sort keys while saving JSON to DB ([#25205](#25205)) ([#25206](#25206)) ([a471499](a471499))
* Only validate fetch from when user modifies it ([72da497](72da497))
* race condition on deletes (backport [#25170](#25170)) ([#25171](#25171)) ([1657d55](1657d55))
* **search:** Don't break when query doesn't return title ([#25168](#25168)) ([#25197](#25197)) ([3dafbf9](3dafbf9))
* **setup_module_map:** fix caching ([338c895](338c895))
* specify print_language in communication attachments ([4e53b12](4e53b12))
* task_id parameter for publish_progress ([30a2f27](30a2f27))
* update file attached_to details in submitted doc ([#25141](#25141)) ([73f1a73](73f1a73))
* Use current language in attachment prints ([ea4c11b](ea4c11b))
* use name for RQ worker instead of PID ([#25175](#25175)) ([#25177](#25177)) ([7264734](7264734))
* **UX:** correctly disable standard web form form ([#25143](#25143)) ([2d907e7](2d907e7))
* **UX:** list filter take zero as null ([#25156](#25156)) ([#25222](#25222)) ([963ff89](963ff89))
* **UX:** reload form after renaming field ([#25159](#25159)) ([#25161](#25161)) ([7754e4a](7754e4a))
* **UX:** set default print language from print format ([c11f140](c11f140))

### Features

* Add doc rename hook in server script ([#25085](#25085)) ([#25110](#25110)) ([b729c03](b729c03))
* let users unlock stuck documents (backport [#24782](#24782)) ([#25225](#25225)) ([b0db641](b0db641))
* task_id for submit_cancel_or_update_docs ([bf8d102](bf8d102))
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant