Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

e2e app remove spec can race and leak services #3243

Closed
SpComb opened this issue Feb 5, 2018 · 3 comments
Closed

e2e app remove spec can race and leak services #3243

SpComb opened this issue Feb 5, 2018 · 3 comments

Comments

@SpComb
Copy link
Contributor

SpComb commented Feb 5, 2018

There's probably a conflict on the simple-lb-1 container hanging around from the earlier kontena app deploy spec... the sleep 1 is not enough to avoid the race entirely, it can take up to 30s for the agent to remove the containers.

  1) app subcommand with the app-command plugin app remove removes a deployed app
     Failure/Error: raise Error.new(cmd, k.code, k.out) unless k.code.zero?
     Shell::Error:
       command failed with code 1: kontena app deploy
 [done] Creating redis      
 [done] Creating lb      
 [done] Creating nginx      
 [done] Deploying redis      
       ⊛ Deployed instance e2e/null/simple-redis-1 to node travis-job-e1d123be-f9e8-4f03-9e5d-6610b8f63798
 [fail] Deploying lb      
        [error] 500 : halting deploy of e2e/null/simple-lb, one or more instances failed:
       	⊗ Failed to deploy instance e2e/null/simple-lb-1 to node travis-job-e1d123be-f9e8-4f03-9e5d-6610b8f63798: GridServiceInstanceDeployer::ServiceError: Docker::Error::ClientError: Conflict. The container name "/simple-lb-1" is already in use by container "d510ae99add6f37643375ac476522bdc45f97b42f49ee84ca098572b097a406f". You have to remove (or rename) that container to be able to reuse that name.
     # ./spec/support/shell.rb:28:in `block in run!'
     # ./spec/support/shell.rb:27:in `tap'
     # ./spec/support/shell.rb:27:in `run!'
     # ./spec/features/plugin/app-command/remove_spec.rb:15:in `block (5 levels) in <top (required)>'
     # ./spec/support/shell.rb:54:in `block in with_fixture_dir'
     # ./spec/support/shell.rb:53:in `chdir'
     # ./spec/support/shell.rb:53:in `with_fixture_dir'
     # ./spec/features/plugin/app-command/remove_spec.rb:14:in `block (4 levels) in <top (required)>'
  2) service list shows empty list by default
     Failure/Error: expect(k.out.split("\r\n").size).to eq(1)
       expected: 1
            got: 4
       (compared using ==)
     # ./spec/features/service/list_spec.rb:6:in `block (2 levels) in <top (required)>'
@SpComb
Copy link
Contributor Author

SpComb commented Feb 5, 2018

Timing from the kontena-agent logs for a failed run. Is there maybe a 10s stop timeout on the simple-lb-1 container?

kontena-agent | D, [2018-02-05T11:08:10.923531 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: running
kontena-agent | I, [2018-02-05T11:08:10.925592 #1]  INFO -- Kontena::Workers::ServicePodWorker: creating simple-lb-1
kontena-agent | I, [2018-02-05T11:08:10.925643 #1]  INFO -- Kontena::ServicePods::Creator: creating service: simple-lb-1
kontena-agent | I, [2018-02-05T11:08:12.065031 #1]  INFO -- Kontena::NetworkAdapters::Weave: Create container=simple-lb-1 in network=kontena with overlay_cidr=10.81.128.75/16
kontena-agent | D, [2018-02-05T11:08:12.065121 #1] DEBUG -- Kontena::ServicePods::Creator: creating container: simple-lb-1
kontena-agent | D, [2018-02-05T11:08:12.123668 #1] DEBUG -- Kontena::ServicePods::Creator: container created: simple-lb-1
kontena-agent | I, [2018-02-05T11:08:12.387335 #1]  INFO -- Kontena::ServicePods::Creator: service started: simple-lb-1
kontena-agent | D, [2018-02-05T11:08:12.412362 #1] DEBUG -- Kontena::Workers::LogWorker: /simple-lb-1/{"io.kontena.container.deploy_rev"=>"2018-02-05 11:08:10 UTC", "io.kontena.container.name"=>"simple-lb-1", "io.kontena.container.overlay_cidr"=>"10.81.128.75/16", "io.kontena.container.overlay_network"=>"kontena", "io.kontena.container.pod"=>"simple-lb-1", "io.kontena.container.service_revision"=>"1", "io.kontena.container.stop_grace_period"=>"10", "io.kontena.container.type"=>"container", "io.kontena.grid.name"=>"e2e", "io.kontena.platform.name"=>"e2e", "io.kontena.service.id"=>"5a783b16aeb4d60007d516cc", "io.kontena.service.instance_number"=>"1", "io.kontena.service.name"=>"simple-lb", "io.kontena.stack.name"=>"null"}
kontena-agent | D, [2018-02-05T11:08:12.419163 #1] DEBUG -- Kontena::Workers::ContainerLogWorker: starting to stream logs from /simple-lb-1
kontena-agent | D, [2018-02-05T11:08:12.436653 #1] DEBUG -- Kontena::Workers::ServicePodWorker: sync simple-lb-1 state at 2018-02-05 11:08:10 UTC: {:service_id=>"5a783b16aeb4d60007d516cc", :instance_number=>1, :rev=>"2018-02-05 11:08:10 UTC", :state=>"running", :error=>nil}
kontena-agent | D, [2018-02-05T11:08:12.437016 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: running
kontena-agent | D, [2018-02-05T11:08:12.523493 #1] DEBUG -- Kontena::Workers::ServicePodWorker: skip simple-lb-1 state sync at 2018-02-05 11:08:10 UTC: unchanged
kontena-agent | D, [2018-02-05T11:08:14.207537 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: running
kontena-agent | D, [2018-02-05T11:08:14.232406 #1] DEBUG -- Kontena::Workers::ServicePodWorker: skip simple-lb-1 state sync at 2018-02-05 11:08:10 UTC: unchanged
kontena-agent | D, [2018-02-05T11:08:19.269892 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: terminated
kontena-agent | I, [2018-02-05T11:08:19.294267 #1]  INFO -- Kontena::Workers::ServicePodWorker: terminating simple-lb-1
kontena-agent | I, [2018-02-05T11:08:19.306221 #1]  INFO -- Kontena::ServicePods::Terminator: terminating service: /simple-lb-1
kontena-agent | D, [2018-02-05T11:08:28.017910 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: running
kontena-agent | I, [2018-02-05T11:08:28.025898 #1]  INFO -- Kontena::Workers::ServicePodWorker: creating simple-lb-1
kontena-agent | I, [2018-02-05T11:08:28.025945 #1]  INFO -- Kontena::ServicePods::Creator: creating service: simple-lb-1
kontena-agent | I, [2018-02-05T11:08:29.275187 #1]  INFO -- Kontena::NetworkAdapters::Weave: Create container=simple-lb-1 in network=kontena with overlay_cidr=10.81.128.39/16
kontena-agent | D, [2018-02-05T11:08:29.275899 #1] DEBUG -- Kontena::ServicePods::Creator: creating container: simple-lb-1
kontena-agent | W, [2018-02-05T11:08:29.280305 #1]  WARN -- Kontena::Workers::ServicePodWorker: failed to sync simple-lb-1 at 2018-02-05 11:08:27 UTC: Conflict. The container name "/simple-lb-1" is already in use by container "dc0959258aafc4547e63c7512fbc05e977df6b1727bfbc4bfb7dd8fdaf8869c3". You have to remove (or rename) that container to be able to reuse that name.
kontena-agent | W, [2018-02-05T11:08:29.280860 #1]  WARN -- Kontena::Workers::ServicePodWorker: Conflict. The container name "/simple-lb-1" is already in use by container "dc0959258aafc4547e63c7512fbc05e977df6b1727bfbc4bfb7dd8fdaf8869c3". You have to remove (or rename) that container to be able to reuse that name.
kontena-agent | D, [2018-02-05T11:08:29.289466 #1] DEBUG -- Kontena::Workers::ServicePodWorker: sync simple-lb-1 state at 2018-02-05 11:08:27 UTC: {:service_id=>"5a783b27aeb4d60007d519b8", :instance_number=>1, :rev=>"2018-02-05 11:08:27 UTC", :state=>"missing", :error=>"Docker::Error::ClientError: Conflict. The container name \"/simple-lb-1\" is already in use by container \"dc0959258aafc4547e63c7512fbc05e977df6b1727bfbc4bfb7dd8fdaf8869c3\". You have to remove (or rename) that container to be able to reuse that name.\n"}
kontena-agent | D, [2018-02-05T11:08:42.420577 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: running
kontena-agent | I, [2018-02-05T11:08:42.422656 #1]  INFO -- Kontena::Workers::ServicePodWorker: creating simple-lb-1
kontena-agent | I, [2018-02-05T11:08:42.422702 #1]  INFO -- Kontena::ServicePods::Creator: creating service: simple-lb-1
kontena-agent | I, [2018-02-05T11:08:42.460349 #1]  INFO -- Kontena::NetworkAdapters::Weave: Create container=simple-lb-1 in network=kontena with overlay_cidr=10.81.128.3/16
kontena-agent | D, [2018-02-05T11:08:42.461379 #1] DEBUG -- Kontena::ServicePods::Creator: creating container: simple-lb-1
kontena-agent | D, [2018-02-05T11:08:42.503853 #1] DEBUG -- Kontena::ServicePods::Creator: container created: simple-lb-1
kontena-agent | I, [2018-02-05T11:08:42.745919 #1]  INFO -- Kontena::ServicePods::Creator: service started: simple-lb-1
kontena-agent | D, [2018-02-05T11:08:42.760356 #1] DEBUG -- Kontena::Workers::LogWorker: /simple-lb-1/{"io.kontena.container.deploy_rev"=>"2018-02-05 11:08:27 UTC", "io.kontena.container.name"=>"simple-lb-1", "io.kontena.container.overlay_cidr"=>"10.81.128.3/16", "io.kontena.container.overlay_network"=>"kontena", "io.kontena.container.pod"=>"simple-lb-1", "io.kontena.container.service_revision"=>"1", "io.kontena.container.stop_grace_period"=>"10", "io.kontena.container.type"=>"container", "io.kontena.grid.name"=>"e2e", "io.kontena.platform.name"=>"e2e", "io.kontena.service.id"=>"5a783b27aeb4d60007d519b8", "io.kontena.service.instance_number"=>"1", "io.kontena.service.name"=>"simple-lb", "io.kontena.stack.name"=>"null"}
kontena-agent | D, [2018-02-05T11:08:42.776402 #1] DEBUG -- Kontena::Workers::ContainerLogWorker: starting to stream logs from /simple-lb-1
kontena-agent | D, [2018-02-05T11:08:42.838286 #1] DEBUG -- Kontena::Workers::ServicePodWorker: sync simple-lb-1 state at 2018-02-05 11:08:27 UTC: {:service_id=>"5a783b27aeb4d60007d519b8", :instance_number=>1, :rev=>"2018-02-05 11:08:27 UTC", :state=>"running", :error=>nil}
kontena-agent | D, [2018-02-05T11:09:12.530313 #1] DEBUG -- Kontena::Workers::ServicePodWorker: state of simple-lb-1: running
kontena-agent | D, [2018-02-05T11:09:12.563431 #1] DEBUG -- Kontena::Workers::ServicePodWorker: skip simple-lb-1 state sync at 2018-02-05 11:08:27 UTC: unchanged

@SpComb
Copy link
Contributor Author

SpComb commented Feb 5, 2018

Yes, looking at docker events, the container terminate can occasionally take up to 10s:

2018-02-05T11:23:23.109672903Z container create e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1)
2018-02-05T11:23:23.344975035Z container start e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1)


2018-02-05T11:23:30.475102226Z container kill e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1, signal=15)

2018-02-05T11:23:40.476305690Z container kill e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1, signal=9)
2018-02-05T11:23:40.524217426Z container die e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (exitCode=137, image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1)
2018-02-05T11:23:40.663111554Z container stop e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1)
2018-02-05T11:23:40.685169571Z container destroy e89e54466b577691b96bd94b112daabfe4f4419d1c489f36d6415192e5ed3d24 (image=kontena/lb:latest, io.kontena.container.deploy_rev=2018-02-05 11:23:21 UTC, io.kontena.container.name=simple-lb-1, io.kontena.container.overlay_cidr=10.81.128.70/16, io.kontena.container.overlay_network=kontena, io.kontena.container.pod=simple-lb-1, io.kontena.container.service_revision=1, io.kontena.container.stop_grace_period=10, io.kontena.container.type=container, io.kontena.grid.name=e2e, io.kontena.platform.name=e2e, io.kontena.service.id=5a783ea1aeb4d60007d544a6, io.kontena.service.instance_number=1, io.kontena.service.name=simple-lb, io.kontena.stack.name=null, name=simple-lb-1)

@SpComb
Copy link
Contributor Author

SpComb commented Feb 6, 2018

Fixed in #3245 as part of #3244

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant