Service discovery hardening #1796

Merged
merged 2 commits into docker:master from fcrisciani:name-resolution-race on Jun 12, 2017

@fcrisciani
Member

fcrisciani commented Jun 8, 2017

This patch addresses several issues found in the Service Discovery feature.

Create SetMatrix data structure
SetMatrix is a simple matrix of sets.
Added tests

This data structure will be used in the following commit to handle
transient states where the same key can momentarily be associated
with more than one value

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
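
As an illustration of the idea described in this commit message (not the code from the PR), a matrix of sets can be sketched in Go as a map from a key to a set of values. The method names and the restriction to string values are assumptions made for this example; the SetMatrix in the PR is described later in the thread as storing values of any type.

```go
package main

import (
	"fmt"
	"sort"
)

// SetMatrix maps each key to a set of values.
// Hypothetical simplification: values are strings here.
type SetMatrix map[string]map[string]struct{}

// Insert adds value to the set for key and reports whether it was new.
func (m SetMatrix) Insert(key, value string) bool {
	set, ok := m[key]
	if !ok {
		set = make(map[string]struct{})
		m[key] = set
	}
	if _, dup := set[value]; dup {
		return false
	}
	set[value] = struct{}{}
	return true
}

// Remove deletes value from the set for key and drops the key
// once its set becomes empty. It reports whether value was present.
func (m SetMatrix) Remove(key, value string) bool {
	set, ok := m[key]
	if !ok {
		return false
	}
	if _, found := set[value]; !found {
		return false
	}
	delete(set, value)
	if len(set) == 0 {
		delete(m, key)
	}
	return true
}

// Get returns the values stored for key in sorted order.
func (m SetMatrix) Get(key string) []string {
	set := m[key]
	out := make([]string, 0, len(set))
	for v := range set {
		out = append(out, v)
	}
	sort.Strings(out)
	return out
}

func main() {
	m := SetMatrix{}
	// Transient state: the same name briefly maps to two addresses.
	m.Insert("web", "10.0.0.2")
	m.Insert("web", "10.0.0.3")
	fmt.Println(m.Get("web"))
	// Once the stale address is removed, a single value remains.
	m.Remove("web", "10.0.0.2")
	fmt.Println(m.Get("web"))
}
```

With a plain map from key to value, the second insert would overwrite the first and a later delete of the stale value could remove the live one; keeping a set per key lets both coexist until the delete arrives.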

@fcrisciani fcrisciani changed the title from Service discovery rewrite to Service discovery hardening Jun 8, 2017

@mavenugo

Just pushing the initial set of review comments. I have a few more areas to cover.

// but in any case the deleteServiceInfoToCluster will follow doing the cleanup if needed.
// In case the deleteServiceInfoToCluster arrives first, this one is happening after the endpoint is
// removed from the list, in this situation the delete will bail out not finding any data to cleanup
// and the add will bail out not finding the endpoint on the sandbox.

@mavenugo

mavenugo Jun 10, 2017

Contributor

Thanks for the excellent summary of the problem.

@@ -285,7 +285,6 @@ func (nDB *NetworkDB) CreateEntry(tname, nid, key string, value []byte) error {
	nDB.indexes[byNetwork].Insert(fmt.Sprintf("/%s/%s/%s", nid, tname, key), entry)
	nDB.Unlock()
	nDB.broadcaster.Write(makeEvent(opCreate, tname, nid, key, value))

@mavenugo

mavenugo Jun 10, 2017

Contributor

Can you please explain why it is okay to remove the broadcast event?

@fcrisciani

fcrisciani Jun 10, 2017

Member

This triggers a handleEpTableEvent notification for local endpoints. That is unnecessary, because local endpoints have already been configured by the preceding logic. It is also risky: remote events take longer to arrive than locally dispatched ones, so the two can now be processed out of sync.

@mavenugo

mavenugo Jun 10, 2017

Contributor

I am not against this change, but I would like to understand it a bit more.

broadcaster.Write ultimately calls the appropriate Watch routines, such as handleEpTableEvent or handleNodeTableEvent, or those for tables managed by other components (such as the overlay driver).

CreateEntry in networkdb is a generic API to create an event of any type. Changing this generic function for the specific case of endpoint management seems risky without considering how it impacts other events.
What if there is another table (not endpoint) with an event created by one component and consumed not just by remote peers, but also by another local component that uses networkdb as its communication mechanism?

Also, if we remove this call, can you point me to another code path through which handleEpTableEvent is called for local endpoints?

@mavenugo

mavenugo Jun 11, 2017

Contributor

Based on a review of all the current usages of CreateEntry and Watch, there is no such case that will break, so I am fine with not broadcasting this event from networkDB.
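
The outcome of this thread can be pictured with a toy model. This is a sketch under stated assumptions, not NetworkDB code: every type and function name below is hypothetical, and it assumes that entries learned from peers still notify local watchers while locally created entries are only stored and queued for peers.

```go
package main

import "fmt"

// event describes a table change. All names in this sketch are hypothetical.
type event struct {
	table, key string
}

// toyDB is a toy stand-in for a gossip-backed table store.
type toyDB struct {
	entries  map[string][]byte
	watchers []func(event)
	outbox   []event // events queued for remote peers
}

func newToyDB() *toyDB {
	return &toyDB{entries: make(map[string][]byte)}
}

// Watch registers a local callback for table events.
func (d *toyDB) Watch(fn func(event)) {
	d.watchers = append(d.watchers, fn)
}

// CreateEntry stores a locally created entry and queues it for peers.
// It deliberately does not notify local watchers: the caller has
// already configured the local state before calling it.
func (d *toyDB) CreateEntry(table, key string, value []byte) {
	d.entries[table+"/"+key] = value
	d.outbox = append(d.outbox, event{table, key})
}

// ApplyRemote stores an entry learned from a peer and notifies
// local watchers (an assumption of this sketch).
func (d *toyDB) ApplyRemote(table, key string, value []byte) {
	d.entries[table+"/"+key] = value
	for _, fn := range d.watchers {
		fn(event{table, key})
	}
}

func main() {
	db := newToyDB()
	seen := 0
	db.Watch(func(event) { seen++ })

	db.CreateEntry("endpoints", "ep1", []byte("local"))
	db.ApplyRemote("endpoints", "ep2", []byte("remote"))

	fmt.Println("watcher calls:", seen, "queued for peers:", len(db.outbox))
}
```

In this model a local component that relied on the watcher for its own writes would stop receiving them, which is the scenario the review checked for before accepting the change.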

// In such cases the resolution will be based on the first element of the set, and can vary
// during the system stabilization
elem, ok := elemSet[0].(ipInfo)
if !ok {

@mavenugo

mavenugo Jun 10, 2017

Contributor

What is the purpose of adding this defensive check?
ipMap is completely controlled by libnetwork core, so why check for the ipInfo type?

@fcrisciani

fcrisciani Jun 10, 2017

Member

SetMatrix is a generic data structure that can store values of potentially any type. This type assertion is, as far as I know, the only way to convert back to the original type.

@mavenugo

mavenugo Jun 11, 2017

Contributor

OK. This is a defensive check and I am okay with keeping it in. I am hoping never to see the error message :)
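
The check discussed in this thread is Go's comma-ok type assertion. A minimal sketch, assuming the set is exposed as a slice of interface{} values; the ipInfo field and the helper name firstIPInfo are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ipInfo is a hypothetical stand-in for the record type stored in the set.
type ipInfo struct {
	name string
}

// firstIPInfo returns the first element of elems as an ipInfo.
// The comma-ok form reports a type mismatch as an error instead of
// panicking, which is what a bare elems[0].(ipInfo) would do.
func firstIPInfo(elems []interface{}) (ipInfo, error) {
	if len(elems) == 0 {
		return ipInfo{}, errors.New("empty set")
	}
	info, ok := elems[0].(ipInfo)
	if !ok {
		return ipInfo{}, fmt.Errorf("unexpected element type %T", elems[0])
	}
	return info, nil
}

func main() {
	info, err := firstIPInfo([]interface{}{ipInfo{name: "web"}})
	fmt.Println(info.name, err)

	_, err = firstIPInfo([]interface{}{"not an ipInfo"})
	fmt.Println(err)
}
```

Because a container of interface{} values accepts anything at compile time, the assertion is the point where a wrong insertion elsewhere in the code would surface.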

for _, ep := range sb.getConnectedEndpoints() {
	if ep.enableService(false) {
		if err := ep.deleteServiceInfoFromCluster(); err != nil {

@mavenugo

mavenugo Jun 10, 2017

Contributor

Thinking more about this, isn't it better to have deleteServiceInfoFromCluster called from DisableService and remove it from sbLeave, to be consistent with the way EnableService and DisableService are handled? It also brings consistency between Join and Leave. WDYT?

@fcrisciani

fcrisciani Jun 11, 2017

Member

That would be great as a future addition, but we have to investigate some corner cases first. For example, I verified that docker kill <container> does not call DisableService, which would of course leave stale entries behind.

@mavenugo

mavenugo Jun 11, 2017

Contributor

👍

type loadBalancerBackend struct {
	ip            net.IP
	containerName string
	taskAliases   []string

@mavenugo

mavenugo Jun 10, 2017

Contributor

It is odd to see taskAliases and containerName carried in loadBalancerBackend, since this structure has so far been used for VIP-based IPVS programming. Are we now changing it for other purposes (such as DNS-RR)?

@mavenugo

mavenugo Jun 11, 2017

Contributor

I can see why we have to carry these additional details. They are required so that cleanupServiceBindings can call the lower-level SD functions, which expect containerName and taskAliases. But still, this seems out of place.

@mavenugo

mavenugo Jun 11, 2017

Contributor

@fcrisciani Please consider picking up this change mavenugo@92820b9

This is a minor rework of your changes that avoids the need to change this structure. It also helps scope this PR to fixing the race issues and leaves the code reorganization for a later activity, if there is a need.

}
}
if addService && len(vip) != 0 {

@mavenugo

mavenugo Jun 11, 2017

Contributor

It's hard for me to judge whether the addService variable is introduced here just as an optimization, to avoid calling addSvcRecords multiple times for the same svcName <-> vip combination.
Or is there a real problem if we call addSvcRecords multiple times for the same svcName <-> vip combination?

@fcrisciani

fcrisciani Jun 11, 2017

Member

No real problem. It is only an optimization, and it is also more symmetric with delSvcRecords, which has rmService (in the removal case the logic is of course mandatory).
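
The asymmetry described in this answer can be sketched as follows. This is a toy model, not the libnetwork implementation: serviceTable and its methods are hypothetical, and only the flag names addService and rmService come from the thread. Adding the service record on every backend would be harmless but redundant, whereas removing it before the last backend leaves would break resolution for the remaining ones.

```go
package main

import "fmt"

// serviceTable is a hypothetical model: service name -> set of backend IPs,
// plus the set of service names that currently have a record.
type serviceTable struct {
	backends map[string]map[string]bool
	records  map[string]bool
}

func newServiceTable() *serviceTable {
	return &serviceTable{
		backends: make(map[string]map[string]bool),
		records:  make(map[string]bool),
	}
}

// addBackend registers ip for svc. addService is true only when this is
// the first backend, so the service record is written once (an optimization).
func (t *serviceTable) addBackend(svc, ip string) (addService bool) {
	if t.backends[svc] == nil {
		t.backends[svc] = make(map[string]bool)
	}
	addService = len(t.backends[svc]) == 0
	t.backends[svc][ip] = true
	if addService {
		t.records[svc] = true
	}
	return addService
}

// rmBackend unregisters ip for svc. rmService is true only when the last
// backend is gone; removing the record earlier would be a bug (mandatory).
func (t *serviceTable) rmBackend(svc, ip string) (rmService bool) {
	set := t.backends[svc]
	if !set[ip] {
		return false // unknown backend: nothing to clean up
	}
	delete(set, ip)
	rmService = len(set) == 0
	if rmService {
		delete(t.backends, svc)
		delete(t.records, svc)
	}
	return rmService
}

func main() {
	tbl := newServiceTable()
	fmt.Println(tbl.addBackend("web", "10.0.0.2")) // first backend
	fmt.Println(tbl.addBackend("web", "10.0.0.3")) // record already present
	fmt.Println(tbl.rmBackend("web", "10.0.0.2"))  // one backend still left
	fmt.Println(tbl.rmBackend("web", "10.0.0.3"))  // last backend removed
}
```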

c := n.getController()
agent := c.getAgent()
name := ep.Name()

@fcrisciani

fcrisciani Jun 11, 2017

Member

@mavenugo I guess this has to be brought up to properly handle the case of anonymous containers.

if n.ingress {
	ingressPorts = ep.ingressPorts
}
name := ep.Name()

@fcrisciani

fcrisciani Jun 11, 2017

Member

@mavenugo Same here: the delete needs this name.

Service discovery logic rework
Changed the ipMap to SetMatrix to allow transient states
Merged addSvc and deleteSvc into a single method
Updated the data structure for backends to store all the information needed
to clean up properly during cleanupServiceBindings
Removed the enable/disable Service logic that was racing with the sbLeave/sbJoin logic
Added some debug logs to track further race conditions

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
@mavenugo

@fcrisciani Thanks for addressing all the comments and fixing the hard-to-reproduce race test cases.

LGTM

@mavenugo mavenugo merged commit 86ae3cf into docker:master Jun 12, 2017

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
dco-signed All commits are signed

@fcrisciani fcrisciani deleted the fcrisciani:name-resolution-race branch Jun 12, 2017

mavenugo added a commit to mavenugo/docker that referenced this pull request Jun 12, 2017

Vendoring libnetwork f4a15a0890383619ad797b3bd2481cc6f46a978d
Contains Service Discovery hardening fixes via
docker/libnetwork#1796

Fixes multiple issues such as moby#32830

Signed-off-by: Madhu Venugopal <madhu@docker.com>

andrewhsu pushed a commit to docker/docker-ce that referenced this pull request Jun 13, 2017

Vendoring libnetwork 4f5310be349d9299f6cab6d5822312f00cfa965c
This is a cherry-pick of moby/moby#33634
that brings in docker/libnetwork#1796.

Signed-off-by: Madhu Venugopal <madhu@docker.com>

andrewhsu pushed a commit to docker/docker-ce that referenced this pull request Jun 24, 2017

Vendoring libnetwork f4a15a0890383619ad797b3bd2481cc6f46a978d
Contains Service Discovery hardening fixes via
docker/libnetwork#1796

Fixes multiple issues such as #32830

Signed-off-by: Madhu Venugopal <madhu@docker.com>
Upstream-commit: 6868b8e
Component: engine