When setting up the host, there is a delay before go-machine-service is ready. If user adds host before it's ready, the vm will not get created and immediately go into an "Active" state. #203

Closed
deniseschannon opened this issue Mar 13, 2015 · 20 comments
Assignees
Labels
area/container · area/machine (Issues that deal with rancher-machine) · kind/bug (Issues that are defects reported by users or that we know have reached a real release)

Comments

@deniseschannon

Version: v0.11.1

When adding a DigitalOcean machine, it sometimes never gets started. Other times, it works.

When it never gets started, the UI eventually shows my host like this:

[Screenshot: screen shot 2015-03-13 at 4 06 13 pm]

When I check the hosts section of the API, I can't find this host to check its status, so I'm not sure why it's showing up in my UI.

@cjellick looked into this. The bug is most likely related to the fact that after "Host Setup", there is a delay before Machine Service comes online. Therefore, if you try to create a machine before Machine Service is online, it will go to this state.

@vincent99
Contributor

This screen shows a combination of /v1/machines and /v1/hosts, so you're seeing that the machine is active (which should mean the DigitalOcean VM is created) but the agent is never checking in, so the host record we expect to eventually get created isn't.
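To illustrate the combination described above, here is a minimal sketch of joining machine and host records so that an "active" machine with no host record stands out. The record shapes, the `physicalHostId` link field, the IDs, and the function name `classify_machines` are assumptions for illustration, not the UI's actual code.

```python
def classify_machines(machines, hosts):
    """Pair each machine with its host record, if one exists."""
    # Index hosts by the machine they belong to (link field is an assumption).
    hosts_by_machine = {
        h["physicalHostId"]: h for h in hosts if h.get("physicalHostId")
    }
    rows = []
    for m in machines:
        host = hosts_by_machine.get(m["id"])
        if host is not None:
            status = "host checked in"
        elif m["state"] == "active":
            # The case reported in this issue: machine says active,
            # but no agent ever registered a host.
            status = "machine active, no host record"
        else:
            status = "machine " + m["state"]
        rows.append((m["name"], status))
    return rows


# Hypothetical sample data standing in for /v1/machines and /v1/hosts.
machines = [
    {"id": "1ph1", "name": "do-1", "state": "active"},
    {"id": "1ph2", "name": "do-2", "state": "active"},
    {"id": "1ph3", "name": "do-3", "state": "creating"},
]
hosts = [{"id": "1h1", "physicalHostId": "1ph1"}]

for name, status in classify_machines(machines, hosts):
    print(name, "->", status)
```

In this sample, `do-2` is the interesting row: its machine is active but nothing has checked in as a host.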

@deniseschannon
Author

The DigitalOcean VM was never created. When this has happened before, I've seen it just sit in the "Creating" state, but this was the first time it went to an "Active" state.

@vincent99
Contributor

Can you find the corresponding /v1/machines entry? What state is it in?

@deniseschannon
Author

State="Active"

@vincent99 vincent99 changed the title UI issue with adding new Digital Ocean machine that doesn't actual create machine [machine] DigitalOcean machine becomes active without creating VM Mar 14, 2015
@vincent99
Contributor

OK, so this is an issue in Machine and/or the Rancher integration with it; the UI is showing what it's told. Ping @cjellick @will-chan

@cjellick

Looking. If you didn't delete the machine, let me know the name and/or ID so I can look it up in the API and logs.

@cjellick

I looked at this one a little bit over the weekend. Seems like there could be a legitimate concurrency issue. Something to keep an eye on if nothing else.

It could also be that the machine got into a funky state and that the error handling changes could alleviate this.

@deniseschannon deniseschannon added the kind/bug Issues that are defects reported by users or that we know have reached a real release label Mar 17, 2015
@vincent99
Contributor

This happened to me twice today against a new 10acre with the :beta tag. Anything you want to look at, @cjellick?

@cjellick

Will touch base with @will-chan on it today. My hunch is that it is related to the update we did to push more detailed status changes.

@cjellick

@vincent99 Did you have multiple machine create requests going at the same time or just a single one when this happened?

@vincent99
Contributor

They were separated by several minutes, so it should've been a single request at a time.

@deniseschannon
Author

Updated the main issue with this comment:

@cjellick looked into this. The bug is most likely related to the fact that after "Host Setup", there is a delay before Machine Service comes online. Therefore, if you try to create a machine before Machine Service is online, it will go to this state.

@deniseschannon deniseschannon changed the title [machine] DigitalOcean machine becomes active without creating VM [machine] When setting up the host, there is a delay before go-machine-service is ready. If user adds host before it's ready, the vm will not get created and immediately go into an "Active" state. Mar 18, 2015
@deniseschannon
Author

@cjellick @will-chan Any update on when this might be fixed? Just ran into it again today.

If nothing else, is there a way to throw an error message if we try to add a host before go-machine-service is ready?

Alternatively, could we show the "host setup" page the first time we launch Rancher? This would hopefully give go-machine-service time to start before we start adding hosts.

@deniseschannon
Author

At a minimum, this should be in an error state and not an active state.

@vincent99
Contributor

I tried that initially, but both @ibuildthecloud and I dislike setup questions being the first thing you see...

@vincent99 vincent99 changed the title [machine] When setting up the host, there is a delay before go-machine-service is ready. If user adds host before it's ready, the vm will not get created and immediately go into an "Active" state. When setting up the host, there is a delay before go-machine-service is ready. If user adds host before it's ready, the vm will not get created and immediately go into an "Active" state. Apr 23, 2015
@vincent99 vincent99 added area/machine Issues that deal with rancher-machine area/container and removed area/none labels Apr 23, 2015
@cjellick

This seems to be happening more lately. Are more people using the feature, or has something changed?
Either way, it seems like something we need to fix.

@cjellick

Possible fix: add a validation filter to the machine API that errors out if the appropriate external handler is not active for the physical host.
We could also expose that as a link or action on the physical host so that Vince could block the API if it is not active. Possibly, we could dynamically remove POST as a resource method on machine if an external handler is not configured.
What say you @vincent99 and @ibuildthecloud?
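A rough sketch of the two ideas above, under stated assumptions: the names `HandlerNotActive`, `validate_machine_create`, and `allowed_methods`, the handler key `"go-machine-service"`, and the plain set of active handlers are all hypothetical and are not Rancher's actual API.

```python
class HandlerNotActive(Exception):
    """Raised when no active external handler can service the request."""


def validate_machine_create(request, active_handlers,
                            required="go-machine-service"):
    # Reject early so the machine never reaches a misleading "Active" state.
    if required not in active_handlers:
        raise HandlerNotActive(
            f"{required} is not active yet; try again shortly"
        )
    return request


def allowed_methods(active_handlers, required="go-machine-service"):
    # Variant: advertise POST on machine only once the handler is registered.
    methods = ["GET"]
    if required in active_handlers:
        methods.append("POST")
    return methods


# Before the handler comes online, creation is refused with a clear message.
try:
    validate_machine_create({"name": "do-1"}, active_handlers=set())
except HandlerNotActive as exc:
    print("rejected:", exc)
```

Either variant gives the UI something concrete to act on: an explicit error on create, or the absence of POST in the advertised resource methods.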

@vincent99
Contributor

Knowing that it's not active is useful, but doesn't really help much. Roughly the first thing the user has to do is add a host, so if it's not active yet, all I can really do is show them:

[Image]

Can we fix the reason it takes a while to start up in the first place? And have machines whose creation fails go to error instead of active?
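The second request above (failed creates should end in error, not active) could look roughly like the sketch below. It only illustrates the desired state transition; `finish_create`, the `provision` callback, and the dict-based machine record are hypothetical and do not describe how the fix was actually implemented.

```python
def finish_create(machine, provision):
    """Run the provisioning step and set the machine's final state."""
    try:
        provision(machine)
    except Exception as exc:
        # Surface the failure instead of silently reporting success.
        machine["state"] = "error"
        machine["message"] = str(exc)
    else:
        machine["state"] = "active"
    return machine


def broken_provision(machine):
    # Stand-in for a create that nothing was ready to handle.
    raise RuntimeError("no handler picked up the create event")


result = finish_create({"name": "do-1", "state": "creating"}, broken_provision)
print(result["state"], "-", result["message"])
```

The key point is that "active" is only assigned on the success path, so a create that never ran cannot be reported as active.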

@cjellick

Fixed by @hibooboo2. Ping him with follow-up questions :-)

@deniseschannon deniseschannon added this to the Milestone 2/17/2016 milestone Feb 23, 2016
@deniseschannon
Author

Unable to reproduce with v0.59.1, so it seems to be fixed. If I see it again, I'll re-open.

rmweir pushed a commit to rmweir/rancher that referenced this issue Jan 11, 2023
KevinJoiner pushed a commit to KevinJoiner/rancher that referenced this issue Jan 23, 2023
KevinJoiner pushed a commit to KevinJoiner/rancher that referenced this issue Jan 23, 2023
KevinJoiner pushed a commit to KevinJoiner/rancher that referenced this issue Jan 23, 2023