Intermittent failures in Rackspace #1507

Closed
auser opened this issue Jan 25, 2013 · 27 comments

@auser

auser commented Jan 25, 2013

Working with OpenStack and Rackspace, I get nondeterministic failures to get the public IP when bootstrapping with fog.

After calling compute.servers.bootstrap, sometimes bootstrap completes and other times it hangs indefinitely. To look into the hang, I edited def setup(credentials) in rackspace/models/compute_v2/servers.rb to add a debugging print statement (bad, I know):

    def setup(credentials = {})
      pp [:setup, credentials, public_ip_address, username]
      requires :public_ip_address, :identity, :public_key, :username
      Fog::SSH.new(public_ip_address, username, credentials).run([
        %{mkdir .ssh},
         ....
    end

This sometimes prints the following to STDOUT (note the empty public IP address):

[:setup, {:password=>"VZ6qXX4wXYpA"}, "", "ubuntu"]
@geemus
Member

geemus commented Jan 25, 2013

@bradgignac @brianhartsock @krames - thoughts?

@auser
Author

auser commented Jan 25, 2013

It gets the "password," but not the ip address... which to me seems odd...

@krames
Member

krames commented Jan 25, 2013

I had experienced that as well. I created a pull request to address it here:

#1475

@bradgignac thought this issue might be with the Cloud Servers REST API. Let me follow up with that team and see if I can get some answers.

I am not quite sure what you are trying to do, but you might be able to use the Cloud Servers personality functionality to get around the issue in the interim.

Here is an example:

https://github.com/rackspace/fog/blob/server_docs/lib/fog/rackspace/examples/cloud_servers/create_server.rb

You would need to tweak the code to upload your keys instead of fog.txt.

@auser
Author

auser commented Jan 25, 2013

I tried the personalization_api and I couldn't get it to work… I'll check your script with mine though to see if perhaps I was doing it "incorrectly"

Thanks!


@krames
Member

krames commented Jan 25, 2013

I just sent an email to the owner of the Cloud Servers API. I will see what they say. In the meantime, let me know if I can help with anything else.

@auser
Author

auser commented Jan 26, 2013

@krames thanks... I'll let you know (can try it in an hour-ish)

@krames
Member

krames commented Jan 28, 2013

@auser Any luck with personalities?

@auser
Author

auser commented Jan 28, 2013

@krames thanks for checking in... sadly not fixed... it's somewhat unrelated anyway being that the IP is not fetched...

@krames
Member

krames commented Jan 28, 2013

I have managed to get hold of the Cloud Servers team, and I believe we know what is happening.

Fog is using the accessIPv4 field for the public IP address, which is not the real public IP address of the server. It is the address that the public should use to access the server (it could be a load balancer or firewall, for example). This field is not guaranteed to be populated when the server transitions from 'BUILDING' to 'ACTIVE'.

The real IP address can actually be found in the addresses attribute and is guaranteed to be populated when a server becomes active.

This issue is going to require a bug fix. I am currently trying to figure out whether I should update the public_ip_address method to return the real public IP address from the addresses attribute, or update bootstrap to wait for the accessIPv4 address to be populated. What are your thoughts @bradgignac and @brianhartsock?

In the interim, I would suggest either the personality functionality or writing your own version of bootstrap that waits for the public_ip_address to be populated. Here is a commit where I do just that: rackspace@575d3ab
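
The two options discussed above can be sketched roughly as follows. This is a minimal illustration, not fog's implementation: FakeServer, public_ipv4_from and wait_for_public_ip are hypothetical names, and the shape of the addresses hash (a "public" list whose entries carry "version" and "addr" keys) is an assumption about the Cloud Servers response rather than something stated in this thread.

```ruby
require 'timeout'

# Hypothetical stand-in for a fog server model. A real server would be
# refreshed from the API on each reload; here the IP appears on the 3rd poll.
FakeServer = Struct.new(:public_ip_address, :addresses, :polls) do
  def reload
    self.polls += 1
    self.public_ip_address = '203.0.113.10' if polls >= 3
    self
  end
end

# Option 1: read the public IPv4 address from the addresses attribute.
# The hash layout is an assumption (see the note above).
def public_ipv4_from(addresses)
  entry = (addresses['public'] || []).find { |a| a['version'] == 4 }
  entry && entry['addr']
end

# Option 2: poll until public_ip_address is populated.
# Raises Timeout::Error if it is still blank after `timeout` seconds.
def wait_for_public_ip(server, timeout: 10, interval: 0.01)
  Timeout.timeout(timeout) do
    sleep(interval) while server.reload.public_ip_address.to_s.empty?
  end
  server.public_ip_address
end

server = FakeServer.new(
  '',
  { 'public' => [{ 'version' => 6, 'addr' => '2001:db8::1' },
                 { 'version' => 4, 'addr' => '203.0.113.10' }] },
  0
)
puts public_ipv4_from(server.addresses)
puts wait_for_public_ip(server)
```

With a real server object, the polling interval would likely be a few seconds rather than the 0.01 used here to keep the sketch fast.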

@auser
Author

auser commented Jan 28, 2013

Alrighty, thanks... I'll go with your suggestion until the patch gets in place. Thanks

@brianhartsock
Member

This is an annoying problem. I see a few things we should fix:

  • Figure out how to get the SSH to not hang and throw an error when this happens
  • Wait for public_ipv4_address in bootstrap - I like this better than just using addresses since you may not always have a public IP address assigned. It feels more "right".

OpenStack has the same issue, so we should try to fix both places. @dprince, do you have any thoughts?
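
As a rough sketch of the first point, a guard like the one below could fail fast instead of letting SSH hang on an empty address. MissingAddressError and ssh_target! are hypothetical names that do not exist in fog, and the assumption that a :timeout option reaches Net::SSH through Fog::SSH should be checked against the fog version in use.

```ruby
# Hypothetical error class for illustration; fog does not define it.
class MissingAddressError < StandardError; end

# Returns [address, username] if the address is usable, and raises
# otherwise so the caller never opens an SSH session to "".
def ssh_target!(address, username)
  if address.nil? || address.strip.empty?
    raise MissingAddressError,
          "no public IP address available for user #{username}"
  end
  [address, username]
end

# Assumption: options like this are forwarded to Net::SSH.start, whose
# :timeout option bounds how long the initial connection may take.
SSH_OPTIONS = { timeout: 30 }.freeze

begin
  ssh_target!('', 'root')
rescue MissingAddressError => e
  puts "bootstrap aborted: #{e.message}"
end
```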

@krames
Member

krames commented Jan 29, 2013

@brianhartsock I have already created #1475 to address your second point.

I will look into adding/setting an SSH timeout later today.

@auser
Author

auser commented Jan 31, 2013

Another issue... I get a BadRequest when using personality in the launch profile. I "BELIEVE" it's because there are some funky characters (it's a pubkey) in the contents. How do you suggest we avoid that?

@krames
Member

krames commented Jan 31, 2013

Hmmm....Did you Base64 encode your key?
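
For illustration, encoding a key for a personality entry might look like the sketch below. The key is a placeholder, the target path is an assumption, and the path/contents keys follow the usual personality layout; whether your fog version expects pre-encoded contents should be checked against the create_server example linked earlier.

```ruby
require 'base64'

# Placeholder key for illustration only, not a real public key.
public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQ== user@example\n"

# strict_encode64 emits no line breaks, unlike encode64, which inserts
# newlines into its output.
encoded = Base64.strict_encode64(public_key)

# Assumed personality layout: a list of path/contents pairs.
personality = [
  { 'path' => '/root/.ssh/authorized_keys', 'contents' => encoded }
]

# Decoding restores the original key byte for byte, which is what should
# end up in the file on the server.
decoded = Base64.decode64(personality.first['contents'])
puts decoded == public_key
```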

@auser
Author

auser commented Feb 1, 2013

@krames... that doesn't quite work when trying to "login" to the server though... the key is Base64 and then obviously it doesn't match the key...

@auser
Author

auser commented Feb 1, 2013

Well... now I almost have it working... it appears Fog is getting the password back incorrectly... I can't ssh with the password, nor with my personality keys... I'm using personality to set the authorized_keys for the ubuntu user. That doesn't allow me to login, nor does using the password Fog gets back... am I missing something entirely?

@krames
Member

krames commented Feb 1, 2013

It could be that the image is not using the ubuntu user for the root account.

What value do you get when you call server.username? Can you do an ssh from the terminal using that user and the password supplied in server.password?

If that does not work, can you try changing the admin password and attempt to do the ssh from the terminal? You can reset the password by calling

server.change_admin_password "myAwesomePassword"

If you are able to successfully login, can you verify that the personality file was able to save your key?

@brianhartsock
Member

Rackspace images don't use an ubuntu user by default; they use root. Does that help? Feel free to post the complete code snippet and we might be able to provide more feedback.

@krames
Member

krames commented Feb 5, 2013

@auser A fix for this issue has just been merged into master and should be in the next release of fog which should hopefully be at the end of February.

In the meantime, if you would like to work against the master branch, you can add the following line to your Gemfile:

gem "fog", :git => "git://github.com/fog/fog.git"

@auser
Author

auser commented Feb 6, 2013

@krames, thanks... do you know what commit it is? I'd like to see what the 'fix' is.

Thanks

@krames
Member

krames commented Feb 6, 2013

Sure, the crux of the fix is in this commit 575d3ab#lib/fog/rackspace/models/compute_v2/servers.rb

After this commit, we removed timeout from the bootstrap method signature in keeping with the bootstrap methods for other providers.

We also decided to wait for public_ip_address instead of accessIPv4. (public_ip_address actually calls accessIPv4, but the setup method on the server object uses public_ip_address. I figured the consistency might prevent issues in the future).

@brianseeders

I just happened to run across this and have some hopefully useful information. I'm very new to Fog, and I apologize if it isn't relevant here.

Users who are using Rackspace's RackConnect (hybrid cloud) offering will potentially run into another problem: After the server is built and has come online, Rackspace's RackConnect Automation scripts will run, and the public-facing IP address will actually get re-assigned. There is a piece of metadata that gets populated related to the status of the process. See: http://www.rackspace.com/knowledge_center/article/how-to-programmatically-determine-the-rackconnect-automation-status-of-your-cloud-servers.

This was a problem for the knife-rackspace (Chef) plugin, and the current workaround for it there is (unfortunately) to simply use the private IP address.

@krames
Member

krames commented Feb 13, 2013

@brianseeders Thanks! I didn't even think about that case.

@brianhartsock @bradgignac I am not too familiar with RackConnect. What is the best way to address this? Could we add a rack_connect attribute to server to indicate that bootstrap should use the private IP address?

At the very least we should add a note to the API documentation.

@brianseeders

Keep in mind, that will work only if the machine running Fog is inside the RackConnected network (or VPN of course) and has a RackConnect rule that allows access.

Seems like the best way to handle would be to have both an option to use the private IP, and an option to wait for rackconnect_automation_status to be "DEPLOYED"

@brianhartsock
Member

At some point, we should fix this in fog. The logic required for RackConnect needs information from both the Auth and Servers APIs. First, you must check if the user has the rackconnect auth role. If so, you then have to look at the server metadata to determine whether RackConnect provisioning has completed. It might be simpler to just add a rackconnect option to the bootstrap method for the Rackspace provider that waits on both the server status and the RackConnect status in the metadata. A little crazy, but that's how it currently works.
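
A readiness check along those lines could be sketched as follows. rackconnect_ready? is a hypothetical helper and the rackconnect flag is an illustrative option, not an existing fog parameter; the metadata key and the "DEPLOYED" value are the ones mentioned earlier in this thread.

```ruby
# Hypothetical helper: decides whether bootstrap may proceed.
# state    - the server status string, e.g. 'BUILDING' or 'ACTIVE'
# metadata - the server metadata as a plain Hash
def rackconnect_ready?(state, metadata, rackconnect: false)
  return false unless state == 'ACTIVE'
  # Without RackConnect, an ACTIVE server is enough.
  return true unless rackconnect
  # With RackConnect, also wait for the automation scripts to finish.
  metadata['rackconnect_automation_status'] == 'DEPLOYED'
end

puts rackconnect_ready?('ACTIVE', {}, rackconnect: true)
puts rackconnect_ready?('ACTIVE',
                        { 'rackconnect_automation_status' => 'DEPLOYED' },
                        rackconnect: true)
```

A bootstrap loop would keep reloading the server and re-evaluating this predicate until it returns true or a timeout expires.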

@krames
Member

krames commented Oct 23, 2013

@auser I believe we successfully addressed this issue and I am going to close the issue. If you are still having problems please reopen it and I will take another look. Thanks!

@krames krames closed this as completed Oct 23, 2013
@geemus
Member

geemus commented Oct 25, 2013

Thanks!
