agentlib.c causing Seg Fault when running nova-agent #44

Closed
michaelsmit opened this issue Apr 26, 2014 · 2 comments

@michaelsmit michaelsmit commented Apr 26, 2014

On line 76, ifa_addr is dereferenced without checking the value first. If it is null, we see the joyous SIGSEGV.

76: if (ifa->ifa_addr->sa_family != PF_PACKET)

One case where it is null is a network interface created by OpenVPN (e.g. tun0).

This is arguably two issues, but I suspect both can be addressed by checking to ensure ifa_addr is not null, and merely continuing to the next interface if it is.
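A minimal sketch of the guard I have in mind, assuming the surrounding code walks a getifaddrs() list (the getifaddrs(3) man page notes that ifa_addr may be a null pointer). The function name count_packet_interfaces is hypothetical and this is not the actual agentlib.c code; it only illustrates skipping entries with a null ifa_addr before the sa_family test from line 76.

```c
#include <ifaddrs.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from agentlib.c.
 * Counts the entries in a getifaddrs()-style list that carry a
 * PF_PACKET (link-layer) address, skipping entries with no address. */
int count_packet_interfaces(const struct ifaddrs *list)
{
    int count = 0;
    const struct ifaddrs *ifa;

    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        /* The missing guard: tun0 was seen with ifa_addr == NULL. */
        if (ifa->ifa_addr == NULL)
            continue;

        /* Same test as line 76, now safe to evaluate. */
        if (ifa->ifa_addr->sa_family != PF_PACKET)
            continue;

        count++;
    }
    return count;
}
```

With the guard in place, an entry like tun0 is passed over and the loop moves on to the next interface.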

I experienced this firsthand when trying to launch a new Rackspace public cloud server from an image of a Rackspace public cloud server that had openvpn installed. nova-agent segfaulted on boot when it received "resetnetwork", and the correct hostname and IP were never written to the new cloud server.

-Mike

Steps to verify, on a machine with a tun0 interface:

  1. Stop nova-agent if it is still running.
  2. Write a message to xenstore telling nova-agent to reset the network. This lets you trigger the segfault on demand, which is ideal for debugging:
    $ uuid=$(uuidgen)
    $ xenstore-write data/host/$uuid '{"name":"resetnetwork","value":""}'
  3. Run attached to debugger:
    $ gdb /usr/sbin/nova-agent
    (gdb) r -n -l debug /usr/share/nova-agent/nova-agent.py

Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 11780]
0xf7900f9f in _agentlib_get_interfaces (self=0x0, args=0x0) at agentlib.c:76
76 agentlib.c: No such file or directory.
in agentlib.c

// Of course I don't have the source at hand, but line 76 is:
// 76: if (ifa->ifa_addr->sa_family != PF_PACKET)

  4. Explore the ifa struct:

(gdb) print ifa->ifa_addr->sa_family
Cannot access memory at address 0x0

(gdb) print ifa->ifa_addr
$2 = (struct sockaddr *) 0x0

(gdb) print ifa->ifa_name
$4 = 0x808eb04 "tun0"

  5. Remove OpenVPN, and note that nova-agent then runs without segfaulting.

I was using Ubuntu 10.04.4 on a next-generation server on Rackspace at the time.
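To check whether a machine is affected without going through xenstore and gdb, something like the following should work. It is a rough sketch, not part of nova-agent: the names report_null_addrs and check_system_interfaces are made up for illustration, and it assumes a Linux system that provides getifaddrs().

```c
#include <ifaddrs.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical diagnostic, not part of nova-agent.
 * Prints the name of every entry whose ifa_addr is NULL and
 * returns how many such entries were found. */
int report_null_addrs(const struct ifaddrs *list, FILE *out)
{
    int found = 0;
    const struct ifaddrs *ifa;

    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL) {
            fprintf(out, "%s: ifa_addr is NULL\n",
                    ifa->ifa_name ? ifa->ifa_name : "(unnamed)");
            found++;
        }
    }
    return found;
}

/* Runs the check against the live interface list.
 * Returns the number of NULL-address entries, or -1 if getifaddrs() fails. */
int check_system_interfaces(void)
{
    struct ifaddrs *list = NULL;
    int found;

    if (getifaddrs(&list) != 0) {
        perror("getifaddrs");
        return -1;
    }
    found = report_null_addrs(list, stdout);
    freeifaddrs(list);
    return found;
}
```

Calling check_system_interfaces() from a small main() and getting a non-zero count would indicate that the machine has an interface of the kind that triggers the crash at line 76.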

@ssamppa ssamppa commented Jun 6, 2014

Hi,
When will this fix be released? This issue is causing nova-agent to crash very often on our server.
This should be released ASAP.

@naterh naterh commented Nov 7, 2014

This issue has been resolved per the above PR. If there are still issues with images in your cloud provider's infrastructure, contact their image maintainers to include these fixes. Closing this issue.

@naterh naterh closed this Nov 7, 2014