
Fix reapTime logic in NetworkDB #1944

Merged
merged 3 commits into docker:master from fcrisciani:netdb-fix-reap on Sep 22, 2017

Conversation

@fcrisciani

fcrisciani commented Sep 19, 2017

  • Added remainingReapTime field in the table event (a sketch of the
    receive-side handling follows the sign-off below).
    Without it, a node that did not have state for the element
    was marking the element for deletion with the max reapTime.
    This made it possible for the entry to keep being resynced
    between nodes forever, defeating the purpose of the reap time
    itself.

  • On broadcast of the table event the node owner was rewritten
    with the local node name. This was not correct because the owner
    should remain the original one of the message.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
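To make the first point concrete, here is a minimal, self-contained sketch of the receive-side behavior described above. It assumes a residual reap time carried as whole seconds on the table event (the discussion below calls it ResidualReapTime); the type and function names are illustrative, not the actual networkdb code.

```go
package main

import (
	"fmt"
	"time"
)

// Minimal illustrative types; the real networkdb structures differ.
type tableEvent struct {
	ResidualReapTime int32 // seconds left before the deleted entry is reaped
}

type entry struct {
	deleting bool
	reapTime time.Duration
}

const reapInterval = 30 * time.Minute // fallback, the old behavior's fixed maximum

// applyReapTime adopts the sender's residual reap time instead of restarting
// the clock at the maximum, which is what kept entries alive forever before.
func applyReapTime(e *entry, ev tableEvent) {
	if !e.deleting {
		return
	}
	e.reapTime = time.Duration(ev.ResidualReapTime) * time.Second
	if e.reapTime == 0 {
		// Old engine that does not set the field, or a bug: fall back.
		e.reapTime = reapInterval
	}
}

func main() {
	e := &entry{deleting: true}
	applyReapTime(e, tableEvent{ResidualReapTime: 120})
	fmt.Println(e.reapTime) // 2m0s instead of the full 30m
}
```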

@fcrisciani fcrisciani force-pushed the fcrisciani:netdb-fix-reap branch 3 times, most recently from f3a3aca to d2c57f4 Sep 20, 2017
@codecov-io

codecov-io commented Sep 20, 2017

Codecov Report

Merging #1944 into master will decrease coverage by 0.06%.
The diff coverage is 48.42%.


@@            Coverage Diff             @@
##           master    #1944      +/-   ##
==========================================
- Coverage      38%   37.94%   -0.07%     
==========================================
  Files         137      137              
  Lines       27149    27308     +159     
==========================================
+ Hits        10317    10361      +44     
- Misses      15562    15674     +112     
- Partials     1270     1273       +3
Impacted Files Coverage Δ
networkdb/broadcast.go 83.63% <100%> (+0.3%) ⬆️
networkdb/cluster.go 45.69% <4.76%> (-0.44%) ⬇️
networkdb/networkdb.pb.go 37.31% <48.67%> (-1.54%) ⬇️
networkdb/delegate.go 70.17% <55.55%> (-0.67%) ⬇️
networkdb/networkdb.go 60% <89.18%> (+2.98%) ⬆️
endpoint_cnt.go 86.08% <0%> (+2.6%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c34e58a...fbba555. Read the comment docs.

@fcrisciani fcrisciani force-pushed the fcrisciani:netdb-fix-reap branch 3 times, most recently from b388729 to 78f80e7 Sep 20, 2017
return false
}

params := strings.Split(path[1:], "/")

@vieux

vieux Sep 20, 2017

Contributor

are we 100% sure the path will have 4 / in it ?

@fcrisciani

fcrisciani Sep 20, 2017

Author

why 4? It is actually 3, like: fmt.Sprintf("/%s/%s/%s", nid, tname, key)
Also the insertion and deletion are now contained in single functions: createOrUpdateEntry and deleteEntry

@vieux

vieux Sep 20, 2017

Contributor

sorry I meant 3

@mavenugo

mavenugo Sep 21, 2017

Contributor

Yes, this is the current expectation in the code and, as @fcrisciani mentioned, it is all organized now under the same functions, so this assumption is valid.
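As a quick sanity check of the 3-parameter assumption discussed here, the path format quoted above splits like this (standalone illustration, not the actual libnetwork code):

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	nid, tname, key := "net1", "table1", "key1"
	path := fmt.Sprintf("/%s/%s/%s", nid, tname, key)
	// Dropping the leading "/" before splitting yields exactly 3 params.
	params := strings.Split(path[1:], "/")
	fmt.Println(len(params), params) // 3 [net1 table1 key1]
}
```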


okTable, okNetwork := nDB.deleteEntry(nid, tname, key)
if !okTable {
logrus.Errorf("Could not delete entry in table %s with network id %s and key %s as it does not exist", tname, nid, key)

@vieux

vieux Sep 20, 2017

Contributor

I'm having some trouble reading the message; could we replace it with the table here and the network a couple of lines down?

@fcrisciani

fcrisciani Sep 20, 2017

Author

I actually left the original messages; I just had to change the bool on top.
Anyway, here it refers to the entry itself, not the table.
Maybe a better rephrasing would be: Table delete failed, entry with key:%s does not exists in the table:%s network:%s
WDYT?

@vieux

vieux Sep 20, 2017

Contributor

@fcrisciani yeah, sounds good

@fcrisciani

fcrisciani Sep 20, 2017

Author

OK, changing it

// This case can happen if the cluster is running different versions of the engine where the old version does not have the
// field. If that is not the case, this can be a BUG
if e.deleting && e.reapTime == 0 {
logrus.Warnf("handleTableEvent object %+v has a 0 reapTime, is the cluster running the same docker engine version?", tEvent)
e.reapTime = reapInterval

@vieux

vieux Sep 20, 2017

Contributor

is this change of behavior expected here ?
shouldn't we log when e.reapTime == 0 but still do e.reapTime = reapInterval if it is not ?

@fcrisciani

fcrisciani Sep 20, 2017

Author

Nope, this is actually the point of this patch: the reapTime should be the one passed in the proto (line 237).
If that is 0 and the entry is marked for deletion, either we are in front of a bug (a node does not set the value properly) or we are dealing with an old client that does not set it because it does not have the new proto.

The old behavior was that no matter when you got the notification, the clock was reset to the max value of 30 min

@mavenugo

mavenugo Sep 21, 2017

Contributor

Except in that very unfortunate circumstance where the ResidualReapTime populated by the peer has in fact gone down to exactly 0 seconds at the time of populating the message (https://github.com/docker/libnetwork/pull/1944/files#diff-ced1ed36c3141cb15bdf5acafed78d67R583). If that happens in such a rare situation, we will reset this back to 30 seconds and the event will get cleaned up in 30 seconds (or, in the rarest of occasions, we get another genuine 0-second ResidualReapTime).

@fcrisciani is that correct?

@mavenugo

mavenugo Sep 21, 2017

Contributor

BTW, yes. This is the actual point of this patch and I think it is a good idea to get this done.

@fcrisciani

fcrisciani Sep 21, 2017

Author

@mavenugo your point is correct only for the case of truncation of the number. I will add an extra check that the truncated number is still strictly greater than 0.
The improvement here is that the lock is taken only once, instead of following the previous pattern of saving the keys to delete and then racing for the lock again to delete them. So right now the only case in this patch that turns out to be 0 is when the residual time in seconds is 0.xxxx, which truncated becomes 0.
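A minimal sketch of the sender-side guard being described, assuming the residual time is carried as whole seconds in the proto; names and the exact fallback value are illustrative, not the final code:

```go
package main

import (
	"fmt"
	"time"
)

// residualSeconds truncates the remaining reap time to whole seconds for the
// wire format and bumps anything that truncates to 0 up to 1 second, so a
// receiver cannot mistake a genuine small residual for "field not set".
func residualSeconds(reapTime time.Duration) int32 {
	secs := int32(reapTime / time.Second) // truncation: 0.9s -> 0
	if secs <= 0 {
		secs = 1
	}
	return secs
}

func main() {
	fmt.Println(residualSeconds(900 * time.Millisecond)) // 1, not 0
	fmt.Println(residualSeconds(25 * time.Minute))       // 1500
}
```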

@vieux

Contributor

vieux commented Sep 20, 2017

LGTM (with the log format update)

@fcrisciani fcrisciani force-pushed the fcrisciani:netdb-fix-reap branch from 78f80e7 to 3350cb8 Sep 21, 2017
msg: buf,
id: tEvent.NetworkID,
tname: tEvent.TableName,
key: tEvent.Key,
node: nDB.config.NodeName,
node: tEvent.NodeName,

@mavenugo

mavenugo Sep 21, 2017

Contributor

Being a rebroadcast event, and based on the current logic of depending on the residual reap timer, what is the purpose of this change?

@vieux

vieux Sep 21, 2017

Contributor

If I understood flavio's explanation correctly, it's not strictly related to this PR; it's an issue he saw while he was looking at the code.

Without this change, when the event was rebroadcast, the original node emitting the event was replaced with the one rebroadcasting it. So you could receive an event saying it originated from one node when it actually came from another one.

@fcrisciani

fcrisciani Sep 21, 2017

Author

The node is simply rebroadcasting the message of another node; at the top of the function there is a check that skips messages whose owner matches the current node.
Without this change the ownership of the entry gets messed up and can cause issues on deletion.

@mavenugo

mavenugo Sep 21, 2017

Contributor

OK, this is a bit subtle and I think good test coverage will help mitigate the risk.
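For reference, a tiny self-contained sketch of the skip-own-messages guard and owner-preserving relay described in this thread (type and function names are illustrative, not the actual networkdb code):

```go
package main

import "fmt"

type tableEvent struct {
	NodeName string // original owner of the entry, carried in the event
}

type networkDB struct {
	localNodeName string
}

// handleAndRelay skips events that originated locally and, when relaying,
// keeps the original owner rather than substituting the local node name.
func (nDB *networkDB) handleAndRelay(ev tableEvent) (owner string, relay bool) {
	if ev.NodeName == nDB.localNodeName {
		return "", false // our own event, nothing to rebroadcast
	}
	return ev.NodeName, true // relay with the original owner intact
}

func main() {
	nDB := &networkDB{localNodeName: "node-b"}
	owner, relay := nDB.handleAndRelay(tableEvent{NodeName: "node-a"})
	fmt.Println(owner, relay) // node-a true
}
```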

defer nDB.RUnlock()
return len(nDB.networkNodes[nid])
},
RetransmitMult: 4,

@mavenugo

mavenugo Sep 21, 2017

Contributor

Nit: please reorganize this code so that not-so-important changes don't pop up. The only real change in this blob is the addition of entriesNumber; the rest of it is purely code reorganization. It is very distracting to see such changes in a PR with other sensitive changes.

@@ -141,6 +141,9 @@ type network struct {

// Number of gossip messages sent related to this network during the last stats collection period
qMessagesSent int

// Number of entries on the network
entriesNumber int

@mavenugo

mavenugo Sep 21, 2017

Contributor

I guess this is used only for debugging purposes... It would be good to write a comment reflecting the purpose and we should not be tempted to use this as a reference counter for any logic in the future.

@fcrisciani

fcrisciani Sep 21, 2017

Author

will add extra comments
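The kind of comment being asked for would look roughly like this (wording is illustrative, extending the struct fragment quoted above):

```go
type network struct {
	// Number of gossip messages sent related to this network during the last
	// stats collection period
	qMessagesSent int

	// Number of entries on the network. Kept for debugging/stats output only;
	// it must not be used as a reference counter for any reaping logic.
	entriesNumber int
}
```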

@mavenugo

Contributor

mavenugo commented Sep 21, 2017

@fcrisciani overall the changes LGTM. Only a few minor questions/comments.

Flavio Crisciani added 2 commits Sep 19, 2017
Flavio Crisciani
- Added remainingReapTime field in the table event.
  Without it, a node that did not have state for the element
  was marking the element for deletion with the max reapTime.
  This made it possible for the entry to keep being resynced
  between nodes forever, defeating the purpose of the reap time
  itself.

- On broadcast of the table event the node owner was rewritten
  with the local node name. This was not correct because the owner
  should remain the original one of the message.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Flavio Crisciani
- Changed the loop to be per network. The previous implementation took a
  ReadLock to update the reapTime, but now, with the residualReapTime,
  the bulkSync also uses the same ReadLock, creating possible
  issues with concurrent reads and updates of the value.
  The new logic fetches the list of networks and proceeds with the
  cleanup network by network, locking the database and releasing it
  after each network. This should ensure fair locking and avoid
  keeping the database blocked for too long (a sketch of this loop
  follows the sign-off below).

  Note: the ticker does not guarantee that the reap logic runs
  precisely every reapTimePeriod; the documentation says that
  if the routine takes too long it will skip ticks. If the process
  itself slows down, it is possible that the lifetime of the
  deleted entries increases. This should still not be a huge problem
  because the residual reap time is now propagated among all the nodes;
  a slower node will let the deleted entry be repropagated multiple
  times, but the state will still remain consistent.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
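A minimal, self-contained sketch of the per-network locking pattern described in the commit message above; the types and the 5-second period are illustrative assumptions, not the actual networkdb code:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Illustrative structure only; the real networkdb types differ.
type networkDB struct {
	sync.Mutex
	// nid -> entry key -> residual reap time of entries marked for deletion
	deleted map[string]map[string]time.Duration
}

const reapPeriod = 5 * time.Second // assumed tick period for the sketch

// reapTableEntries snapshots the network list first, then locks and unlocks
// once per network, so the database is never held for the whole sweep.
func (nDB *networkDB) reapTableEntries() {
	nDB.Lock()
	nids := make([]string, 0, len(nDB.deleted))
	for nid := range nDB.deleted {
		nids = append(nids, nid)
	}
	nDB.Unlock()

	for _, nid := range nids {
		nDB.Lock()
		for key, residual := range nDB.deleted[nid] {
			residual -= reapPeriod
			if residual <= 0 {
				delete(nDB.deleted[nid], key) // entry finally reaped
				continue
			}
			nDB.deleted[nid][key] = residual
		}
		nDB.Unlock()
	}
}

func main() {
	nDB := &networkDB{deleted: map[string]map[string]time.Duration{
		"net1": {"k1": 3 * time.Second, "k2": 40 * time.Second},
	}}
	nDB.reapTableEntries()
	fmt.Println(nDB.deleted["net1"]) // k1 reaped, k2 keeps its reduced residual
}
```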
@fcrisciani fcrisciani force-pushed the fcrisciani:netdb-fix-reap branch from 3350cb8 to 3feb3aa Sep 21, 2017
@fcrisciani

Author

fcrisciani commented Sep 21, 2017

@mavenugo @vieux addressed all the comments, can you guys take another pass at it?
In the meantime I will start another round of tests on it

Contributor

mavenugo left a comment

Thanks for addressing the comments. LGTM

@fcrisciani fcrisciani mentioned this pull request Sep 21, 2017
@vieux
vieux approved these changes Sep 21, 2017
Contributor

vieux left a comment

LGTM

Flavio Crisciani
Make sure that the network is garbage collected after
the entries. Entries to be deleted require that the network
is present.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
@fcrisciani fcrisciani force-pushed the fcrisciani:netdb-fix-reap branch from 3537443 to fbba555 Sep 22, 2017
@vieux

Contributor

vieux commented Sep 22, 2017

New commit LGTM

@mavenugo mavenugo merged commit 0f08d31 into docker:master Sep 22, 2017
2 checks passed
ci/circleci Your tests passed on CircleCI!
Details
dco-signed All commits are signed
@fcrisciani fcrisciani deleted the fcrisciani:netdb-fix-reap branch Sep 22, 2017
fcrisciani pushed a commit to fcrisciani/docker that referenced this pull request Sep 24, 2017
Flavio Crisciani
Fix for networkDB garbage collection (PR: docker/libnetwork#1944)
Added extra logs to monitor the networkDB status and number of entries per network

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
andrewhsu pushed a commit to docker/docker-ce that referenced this pull request Sep 26, 2017
Fix for networkDB garbage collection (PR: docker/libnetwork#1944)
Added extra logs to monitor the networkDB status and number of entries per network

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Upstream-commit: 0404342
Component: engine
salah-khan added a commit to salah-khan/moby that referenced this pull request Nov 15, 2017
Fix for networkDB garbage collection (PR: docker/libnetwork#1944)
Added extra logs to monitor the networkDB status and number of entries per network

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>