Fix IP overlap with empty EndpointSpec #2505

Merged
merged 1 commit into from Feb 8, 2018

@fcrisciani
Member

fcrisciani commented Feb 6, 2018

Passing an empty EndpointSpec in the service spec
was correctly triggering the VIP allocation, but
the leader election was erroneously handling the IPAM
state restore.
This fix ensures that the EndpointSpec, if not specified, is
actually added to the ServiceSpec, selecting endpoint mode VIP.
allocateService now also takes a restart flag that skips
the deallocation logic that was erroneously triggered.

Fix: moby/moby#33795

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
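
The restart flag described above can be illustrated with a minimal sketch. This is not the swarmkit implementation: the `Service` struct, its `HasEndpoint` and `VIPs` fields, and the body of `allocateService` are hypothetical stand-ins, assuming only what the description states, namely that a restart must skip the deallocation path.

```go
package main

import "fmt"

// Service is a hypothetical, trimmed-down stand-in for api.Service.
type Service struct {
	ID          string
	HasEndpoint bool     // false when the spec omits the EndpointSpec
	VIPs        []string // VIPs already recorded for the service
}

// allocateService keeps the recorded VIPs when restart is true (state
// restore after a leader election) and releases them on the normal path
// when the spec requests no endpoint.
func allocateService(s *Service, restart bool) {
	if !s.HasEndpoint && !restart {
		s.VIPs = nil // normal path: nothing requested, release the VIPs
	}
}

func main() {
	restored := &Service{ID: "svc1", VIPs: []string{"10.0.0.2"}}
	allocateService(restored, true)
	fmt.Println("after restart:", restored.VIPs)

	updated := &Service{ID: "svc2", VIPs: []string{"10.0.0.3"}}
	allocateService(updated, false)
	fmt.Println("after update:", updated.VIPs)
}
```

In this sketch the same service state gives different outcomes depending only on the flag, which is the behavior the description attributes to the fix.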

@fcrisciani fcrisciani force-pushed the fcrisciani:fixendpointspec branch from 060844d to 687a9ae Feb 6, 2018

@codecov

codecov bot commented Feb 6, 2018

Codecov Report

Merging #2505 into master will increase coverage by 0.36%.
The diff coverage is 50%.

@@            Coverage Diff             @@
##           master    #2505      +/-   ##
==========================================
+ Coverage   61.32%   61.68%   +0.36%     
==========================================
  Files         131      129       -2     
  Lines       21451    21293     -158     
==========================================
- Hits        13154    13135      -19     
+ Misses       6878     6753     -125     
+ Partials     1419     1405      -14
@marcusmartins

Member

marcusmartins commented Feb 6, 2018

@fcrisciani fcrisciani force-pushed the fcrisciani:fixendpointspec branch from 687a9ae to 944e9a6 Feb 6, 2018

@@ -940,7 +940,7 @@ func updatePortsInHostPublishMode(s *api.Service) {
 	s.Endpoint.Spec = s.Spec.Endpoint.Copy()
 }
-func (a *Allocator) allocateService(ctx context.Context, s *api.Service) error {
+func (a *Allocator) allocateService(ctx context.Context, s *api.Service, restart bool) error {

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

Please add a comment for this function along with a description for each of the params.

@@ -306,6 +306,7 @@ func validateTaskSpec(taskSpec api.TaskSpec) error {
 func validateEndpointSpec(epSpec *api.EndpointSpec) error {
 	// Endpoint spec is optional
 	if epSpec == nil {
+		epSpec = &api.EndpointSpec{Mode: api.ResolutionModeVirtualIP}

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

I would suggest doing this outside this function. This function is for validation purposes only and should not have any side effects.

@dperny

dperny Feb 6, 2018

Member

I strongly agree. We should not modify the user's spec after it's been sent to Swarmkit. There are Good Reasons for that.

@fcrisciani

fcrisciani Feb 7, 2018

Member

moved it outside

@anshulpundir

anshulpundir Feb 7, 2018

Contributor

also probably return an InvalidArgument here.

@fcrisciani

fcrisciani Feb 7, 2018

Member

then we don't need to set any default if we bail out with an error

@anshulpundir

anshulpundir Feb 7, 2018

Contributor

The change essentially makes EndpointSpec mandatory, correct? So, to avoid breaking old clients, we're populating it in swarm.

It's OK to return an error from validateEndpointSpec and handle it wherever it is called by filling in an empty EndpointSpec.
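
One way to read this suggestion is sketched below. It is an illustration under assumptions, not the code that was merged: `EndpointSpec`, `ModeVIP`, `errMissingEndpointSpec`, and `withDefaultEndpoint` are hypothetical names standing in for the swarmkit types and for whatever call site would handle the error. The validator only inspects its argument; the caller decides to substitute a default and never mutates the spec it was given.

```go
package main

import (
	"errors"
	"fmt"
)

// EndpointSpec is a hypothetical stand-in for api.EndpointSpec.
type EndpointSpec struct{ Mode string }

// ModeVIP is a hypothetical stand-in for api.ResolutionModeVirtualIP.
const ModeVIP = "vip"

// errMissingEndpointSpec is a hypothetical sentinel error for a nil spec.
var errMissingEndpointSpec = errors.New("endpoint spec is missing")

// validateEndpointSpec has no side effects: it reports problems and
// leaves the spec untouched.
func validateEndpointSpec(ep *EndpointSpec) error {
	if ep == nil {
		return errMissingEndpointSpec
	}
	if ep.Mode == "" {
		return errors.New("endpoint spec has no resolution mode")
	}
	return nil
}

// withDefaultEndpoint is the caller-side handling: a missing spec is
// replaced by a fresh VIP-mode default, other errors are passed through.
func withDefaultEndpoint(ep *EndpointSpec) (*EndpointSpec, error) {
	err := validateEndpointSpec(ep)
	if errors.Is(err, errMissingEndpointSpec) {
		return &EndpointSpec{Mode: ModeVIP}, nil
	}
	return ep, err
}

func main() {
	ep, err := withDefaultEndpoint(nil)
	fmt.Println(ep.Mode, err)
}
```

Keeping the default in the caller means the validation function can be reused wherever a pure check is needed.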

@@ -1188,7 +1190,7 @@ func (a *Allocator) procUnallocatedServices(ctx context.Context) {
 	var allocatedServices []*api.Service
 	for _, s := range nc.unallocatedServices {
 		if !nc.nwkAllocator.IsServiceAllocated(s) {
-			if err := a.allocateService(ctx, s); err != nil {
+			if err := a.allocateService(ctx, s, false); err != nil {

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

nit: add a comment on why restart flag is false here.

@@ -587,8 +587,8 @@ func (a *Allocator) allocateServices(ctx context.Context, existingAddressesOnly
 		continue
 	}
-	if err := a.allocateService(ctx, s); err != nil {
-		log.G(ctx).WithError(err).Errorf("failed allocating service %s during init", s.ID)
+	if err := a.allocateService(ctx, s, existingAddressesOnly); err != nil {

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

nit: Please add a comment on how existingAddressesOnly relates to the restart flag which is an arg for allocateService()

@fcrisciani

fcrisciani Feb 7, 2018

Member

Changed the name of the variable to match, to show that they are the same thing.

@anshulpundir

anshulpundir Feb 7, 2018

Contributor

Did you mean to make this change? I don't see it in the patch, @fcrisciani.

@fcrisciani

fcrisciani Feb 7, 2018

Member

The name of the variable is changed inside allocateService, not here; I kept the original name existingAddressesOnly.

@@ -274,7 +274,7 @@ func (a *Allocator) doNetworkAlloc(ctx context.Context, ev events.Event) {
 	}
 	updatePortsInHostPublishMode(s)
 } else {
-	if err := a.allocateService(ctx, s); err != nil {
+	if err := a.allocateService(ctx, s, false); err != nil {

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

nit: add a comment on why restart flag is false here.

@@ -244,7 +244,7 @@ func (a *Allocator) doNetworkAlloc(ctx context.Context, ev events.Event) {
 	break
 }
-if err := a.allocateService(ctx, s); err != nil {
+if err := a.allocateService(ctx, s, false); err != nil {

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

nit: add a comment on why restart flag is false here.

@fcrisciani

fcrisciani Feb 7, 2018

Member

I put a comment on allocateService itself, instead of spreading a one-line comment everywhere saying that it is not a restart case.

@@ -244,6 +245,7 @@ vipLoop:
 }
 for _, nAttach := range specNetworks {
 	if nAttach.Target == eAttach.NetworkID {
+		log.L.WithFields(logrus.Fields{"service_id": s.ID, "vip": eAttach.Addr}).Infof("allocate vip")

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

super nit: use Info instead of Infof?

@abhi

abhi Feb 6, 2018

Member

I think this should be Debug? Info will print it for every IP.

@fcrisciani

fcrisciani Feb 6, 2018

Member

I actually wanted this printed. Concerns?

@nishanttotla

nishanttotla Feb 6, 2018

Contributor

@fcrisciani only concern is that it might be too frequent to print in the logs. We normally would like to avoid that.

@dperny

dperny Feb 6, 2018

Member

Definitely change from Info to Debug. It'll be too verbose for Info.

@fcrisciani

fcrisciani Feb 6, 2018

Member

@nishanttotla today we have zero visibility and debugging is very difficult. Do you have any other idea?

@fcrisciani

fcrisciani Feb 7, 2018

Member

Guys, I'm changing it to Debug, but we will continue to have no clue about what is going on and what is assigned, especially when allocations are not correct.

@@ -796,6 +796,145 @@ func TestAllocatorRestoreForDuplicateIPs(t *testing.T) {
 	}
 }
+func TestAllocatorRestartNoEndpointSpec(t *testing.T) {

@anshulpundir

anshulpundir Feb 6, 2018

Contributor

Please add a summary of this test case.

@abhi

abhi approved these changes Feb 6, 2018

LGTM. One nit

-if err := a.allocateService(ctx, s); err != nil {
-	log.G(ctx).WithError(err).Errorf("failed allocating service %s during init", s.ID)
+if err := a.allocateService(ctx, s, existingAddressesOnly); err != nil {
+	log.G(ctx).WithField("existingAddressesOnly", existingAddressesOnly).WithError(err).Errorf("failed allocating service %s during init", s.ID)

@nishanttotla

nishanttotla Feb 6, 2018

Contributor

nit: can this message be made clearer? Should it be something like "failed to allocate network..." or something of that nature, perhaps with more info?

@dperny

dperny Feb 6, 2018

Member

The error field will give that information.

@fcrisciani fcrisciani force-pushed the fcrisciani:fixendpointspec branch 2 times, most recently from 9be2619 to 03eda38 Feb 7, 2018

@dperny

Member

dperny commented Feb 7, 2018

LGTM

@anshulpundir

Looks good. Minor nits. I'll merge as soon as you address these.

@@ -796,6 +796,149 @@ func TestAllocatorRestoreForDuplicateIPs(t *testing.T) {
 	}
 }
+// TestAllocatorRestartNoEndpointSpec covers the leader election case when the service Spec
+// does not contains the EndpointSpec.

@anshulpundir

anshulpundir Feb 7, 2018

Contributor

nit: does not contains => does not contain

@@ -796,6 +796,149 @@ func TestAllocatorRestoreForDuplicateIPs(t *testing.T) {
 	}
 }
+// TestAllocatorRestartNoEndpointSpec covers the leader election case when the service Spec
+// does not contains the EndpointSpec.
+// The expected behavior iis that the VIP(s) are still correctly populated inside

@anshulpundir

anshulpundir Feb 7, 2018

Contributor

iis => is

@@ -940,7 +940,10 @@ func updatePortsInHostPublishMode(s *api.Service) {
 	s.Endpoint.Spec = s.Spec.Endpoint.Copy()
 }
-func (a *Allocator) allocateService(ctx context.Context, s *api.Service) error {
+// allocateService takes care to align the desired state with the spec passed
+// the last parameter is true only during restart when the data are read from raft

@anshulpundir

anshulpundir Feb 7, 2018

Contributor

nit: data are => data is

@fcrisciani
Fix IP overlap with empty EndpointSpec
Passing an empty EndpointSpec in the service spec
was correctly triggering the VIP allocation, but
the leader election was erroneously handling the IPAM
state restore, trying to release the VIP.
The fix focuses on proper handling of the restart case.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

@fcrisciani fcrisciani force-pushed the fcrisciani:fixendpointspec branch from 03eda38 to bd4e923 Feb 7, 2018

@anshulpundir anshulpundir merged commit 608c4dd into docker:master Feb 8, 2018

3 checks passed

ci/circleci Your tests passed on CircleCI!
codecov/project 61.68% (target 0%)
dco-signed All commits are signed

@fcrisciani fcrisciani deleted the fcrisciani:fixendpointspec branch Feb 8, 2018