Fix various deadlocks in Dgraph #2548

Merged: 7 commits into master from mrjn/deadlocks on Aug 24, 2018
Conversation

manishrjain (Contributor) commented Aug 23, 2018

Fixed a bunch of long-standing deadlock issues:

  1. Deadlock caused by recursive locking in an internal function in posting/list.go. This caused applyCh to block when a mutation was being applied to a posting list while a query was reading from it.
  2. Deadlock caused by losing the Raft ConfState when a node restarted. We were not picking up the previous ConfState, so it defaulted to nil in the next CreateSnapshot. As a result, the node saw an empty group and never participated in elections. Now we pick up the previous state and ensure that the snapshot has a valid ConfState.
  3. Fixes #2541 (A tick missed to fire. Node blocks too long). The missed tick was caused by repeated calls to the Raft Storage.FirstIndex(). Each call made Badger create an iterator, which was expensive. Now we cache the first index to avoid looking it up repeatedly.
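To illustrate the first item, here is a minimal sketch of the usual way to avoid recursive locking with Go's sync.RWMutex. It is not the actual code in posting/list.go: the List type and its methods are hypothetical names chosen for illustration. The idea is that exported methods take the lock exactly once, and internal helpers assume the lock is already held. Re-acquiring a read lock inside a helper can deadlock as soon as a writer is waiting between the two RLock calls.

```go
package main

import (
	"fmt"
	"sync"
)

// List is a hypothetical stand-in for a posting list, not Dgraph's real type.
type List struct {
	mu       sync.RWMutex
	postings []uint64
}

// Length is an exported entry point: it takes the read lock exactly once.
func (l *List) Length() int {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.length()
}

// length assumes the caller already holds l.mu and never locks again.
func (l *List) length() int {
	return len(l.postings)
}

// Summary already holds the read lock, so it calls length(), not Length().
// Calling Length() here would take RLock recursively and could block
// forever if Add is waiting for the write lock in between.
func (l *List) Summary() string {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return fmt.Sprintf("%d postings", l.length())
}

// Add takes the write lock, as applying a mutation would.
func (l *List) Add(uid uint64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.postings = append(l.postings, uid)
}

func main() {
	l := &List{}
	var wg sync.WaitGroup
	// Writers and readers run concurrently without any nested locking.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(uid uint64) {
			defer wg.Done()
			l.Add(uid)
			_ = l.Summary()
		}(uint64(i))
	}
	wg.Wait()
	fmt.Println(l.Summary()) // 4 postings
}
```

The lowercase helper makes the locking contract visible at the call site: anything that already holds the lock calls the unexported variant.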

Also introduced the golang/glog library for better logging.
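For the third item in the list above, the caching idea can be sketched roughly as follows. This is an assumption-laden illustration, not the code from this change: indexStore, scanFirst, and DeleteUntil are hypothetical names, a slice stands in for the Badger-backed log, and the signature is simplified (the real raft Storage.FirstIndex() also returns an error). The point is that the expensive lookup runs once, and the cached value is invalidated only when old entries are dropped.

```go
package main

import (
	"fmt"
	"sync"
)

// indexStore is a hypothetical store used only for illustration.
type indexStore struct {
	mu      sync.Mutex
	entries []uint64 // sorted log indices; stands in for keys on disk
	cached  uint64   // 0 means "nothing cached"
	lookups int      // counts how often the expensive path runs
}

// scanFirst stands in for the expensive path (creating an iterator).
// The caller must hold s.mu.
func (s *indexStore) scanFirst() uint64 {
	s.lookups++
	if len(s.entries) == 0 {
		return 0
	}
	return s.entries[0]
}

// FirstIndex returns the cached value when there is one and falls back
// to the expensive scan otherwise. An empty store is never cached.
func (s *indexStore) FirstIndex() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.cached == 0 {
		s.cached = s.scanFirst()
	}
	return s.cached
}

// DeleteUntil drops entries below idx and invalidates the cache,
// since the first index may have changed.
func (s *indexStore) DeleteUntil(idx uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	i := 0
	for i < len(s.entries) && s.entries[i] < idx {
		i++
	}
	s.entries = s.entries[i:]
	s.cached = 0
}

func main() {
	s := &indexStore{entries: []uint64{5, 6, 7, 8}}
	for i := 0; i < 1000; i++ {
		s.FirstIndex() // only the first call scans
	}
	first := s.FirstIndex()
	fmt.Println(first, s.lookups) // 5 1

	s.DeleteUntil(7)
	first = s.FirstIndex()
	fmt.Println(first, s.lookups) // 7 2
}
```

Invalidating on deletion keeps the cache correct without having to track every write. Appends do not change the first index, so they do not need to touch it.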


manishrjain changed the title from "[WIP] Mrjn/deadlocks" to "Fix various deadlocks in Dgraph" on Aug 24, 2018
manishrjain merged commit 8779066 into master on Aug 24, 2018
manishrjain deleted the mrjn/deadlocks branch on August 24, 2018 01:29
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019