Deleting a host causes Rancher to retry purge requests forever #5047

Closed
sheng-liang opened this issue Jun 8, 2016 · 2 comments
Labels
kind/bug Issues that are defects reported by users or that we know have reached a real release

Comments

sheng-liang commented Jun 8, 2016

Rancher Version: v1.1.0-dev2-rc5

Docker Version: 1.10.3

OS and where are the hosts located? (cloud, bare metal, etc): Ubuntu 14.04, DigitalOcean

Setup Details: (single node rancher vs. HA rancher, internal DB vs. external DB) single node Rancher

Environment Type: (Cattle/Kubernetes/Swarm/Mesos) Cattle

Steps to Reproduce: Delete the hosts of a Cattle environment directly from DigitalOcean (not from Rancher); Rancher then shows the hosts in the reconnecting state. Deactivate the hosts in Rancher and delete them. Later on I also deleted the stack that was running.

Results: Rancher retries purge requests forever

Expected: Rancher should give up after a while
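To make the expected behavior concrete, here is a minimal sketch of a retry loop that gives up after a fixed number of attempts. It is illustrative only and is not Rancher's code: `run_with_retry_limit`, `purge_on_deleted_host`, and the attempt limit are hypothetical names and values.

```python
import time


def run_with_retry_limit(action, max_attempts=5, delay_seconds=0.0):
    """Call action() until it succeeds or max_attempts is reached.

    Returns a (succeeded, attempts_made) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            action()
            return True, attempt
        except TimeoutError:
            # Wait before the next try, but not after the final attempt.
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    return False, max_attempts


def purge_on_deleted_host():
    # Hypothetical stand-in for a purge request whose agent never answers.
    raise TimeoutError("agent did not respond")


result = run_with_retry_limit(purge_on_deleted_host, max_attempts=3)
print(result)
```

With a limit in place, a request to a host that no longer exists fails after a bounded number of tries instead of looping indefinitely.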

will-chan added the kind/bug label Jun 8, 2016
will-chan added this to the Release 1.1.0 milestone Jun 8, 2016

alena1108 commented Jun 9, 2016

Debugged the issue: volume.purge gets stuck on the volumestoragepoolmap.remove process. volumestoragepoolmap.remove triggers an agent event that keeps timing out in this case, because the host's agent is not removed along with the host. So while the host is in the removed state, its agent is still active, which causes this bug.

It seems the state transition from reconnecting to removed is missing for the agent.remove process; enabling it should fix the problem.
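The diagnosis above can be illustrated with a toy state-transition table. This is a sketch under assumptions, not Cattle's actual implementation: the state names come from the comment, while `AGENT_REMOVE_FROM`, `remove_agent`, and `purge_can_finish` are hypothetical, and the rule that the purge completes only once the agent is removed is a simplification of the described timeout.

```python
# Toy model only: state names come from the comment above; the table layout
# and function names are hypothetical and do not mirror Cattle's code.

# States from which agent.remove is allowed (assumed set). Note that
# "reconnecting" is absent, which is the gap described in the comment.
AGENT_REMOVE_FROM = {"active", "inactive"}


def remove_agent(state, allowed_from):
    """Return the agent's state after an agent.remove request."""
    if state in allowed_from:
        return "removed"
    # No transition defined for this state: the agent stays as it was.
    return state


def purge_can_finish(agent_state):
    """Simplified rule: the purge step completes only once the agent is removed."""
    return agent_state == "removed"


# Without the transition, the agent lingers and the purge keeps waiting.
before = remove_agent("reconnecting", AGENT_REMOVE_FROM)
# With reconnecting -> removed enabled, the agent goes away with the host.
after = remove_agent("reconnecting", AGENT_REMOVE_FROM | {"reconnecting"})

print(before, purge_can_finish(before))
print(after, purge_can_finish(after))
```

In this model, adding the single missing entry to the table is enough to unblock the purge, which matches the proposed fix.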

@soumyalj
Tested on master with the above steps. Did not see volumes stuck in purge state.

5 participants