Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auth-zone section in config may lead to segfault #220

Closed
sibeream opened this issue Apr 15, 2020 · 9 comments
Closed

auth-zone section in config may lead to segfault #220

sibeream opened this issue Apr 15, 2020 · 9 comments
Assignees

Comments

@sibeream
Copy link
Contributor

Hello,
I tried the following with an unbound build from the current master.
If I have auth-zone configured and I start and stop the service in a short period of time, segfault happens.
The reason is that env variable is NULL on services/authzone.c:6394.
if(env->outnet->want_to_quit) {.
It happens because of the daemon_cleanup function in daemon/daemon.c calls auth_zones_cleanup function, which calls xfr_probe_disown, which has the following code xfr->task_probe->env = NULL;
After that, worker_delete is called, which does a chain of calls ending up in mesh_state_cleanup function (services/mesh.c:861), which iterates through callback list and calls callbacks themselves. One of those callbacks is auth_xfer_probe_lookup_callback (services/authzone.c:6385) which doesn't expect env to be NULL
The possible solution would be to change 'if' statement on services/authzone.c:6394 to look something like if(env && env->outnet->want_to_quit) {.
However I'm not sure if suggested solution satisfies application logic.

@landgraf
Copy link

Addition:
FQDN of nameservers as masters in auth-zone section leads to segfault on cleanup due to mesh.count not equal to 0. Using master's servers ip addresses instead of fqdn fixes the issue.
Even if usage of fqdn's is not allowed this situation should be handled without null pointer derefence.

@gthess
Copy link
Member

gthess commented Apr 15, 2020

Hi all, thanks for reporting!
@sibeream: a fix is in the works.
@landgraf: Hostname instead of IP address is allowed but you need to make sure you don't find yourself in a circular dependency. Apart from that, I am not able to reproduce. Could you share an example configuration?

@gthess gthess self-assigned this Apr 15, 2020
@landgraf
Copy link

Hi all, thanks for reporting!
@sibeream: a fix is in the works.
@landgraf: Hostname instead of IP address is allowed but you need to make sure you don't find yourself in a circular dependency. Apart from that, I am not able to reproduce. Could you share an example configuration?

I was able to reproduce with config as simple as:

auth-zone:
	name: "."
	for-downstream: no
	for-upstream: yes
	fallback-enabled: yes
	## SIGSEGV
	## master: b.root-servers.net
	## OK
	master: 199.9.14.201
	## OK 
	master: 2001:500:200::b

@landgraf
Copy link

@gthess
Segfaults with master: b.root-servers.net and works fine with 199.9.14.201 and/or 2001:500:200::b

@gthess
Copy link
Member

gthess commented Apr 15, 2020

Which version of unbound are you running?

@landgraf
Copy link

landgraf commented Apr 15, 2020

Which version of unbound are you running?

unbound-1.10.0-1.fc33
unbound-1.9.6-1.fc31
current master

steps to reproduce:
unbound -c ./config_mentioned_earlier -d
Ctrl-C (or signal 15 in gdb prompt)

@gthess
Copy link
Member

gthess commented Apr 15, 2020

I believe you are referring to the same issue that sibeream reported (env==NULL) as the only reproducible SEGV here is of the same nature.
Using IP and not hostname skips the resolution phase and is faster to complete the callback before SIGTERM is received (thus avoiding the SEGV condition).

@landgraf
Copy link

I believe you are referring to the same issue that sibeream reported (env==NULL) as the only reproducible SEGV here is of the same nature.
Using IP and not hostname skips the resolution phase and is faster to complete the callback before SIGTERM is received (thus avoiding the SEGV condition).

Yes, We're working on the same issue.

@gthess gthess closed this as completed in 8a87fc6 Apr 15, 2020
@gthess
Copy link
Member

gthess commented Apr 15, 2020

It should be fixed now, if not (or something similar happens) feel free to reopen.

jedisct1 added a commit to jedisct1/unbound that referenced this issue Apr 20, 2020
* nlnet/master:
  - Fix for count of reply states in the mesh.
  Fix that it is --enable-rpath, for NLnetLabs#222.
  - Fix NLnetLabs#222: --with-rpath, fails to rpath python lib.
  - Document SNI support in unbound-anchor.8.in.
  - Update Changelog for PR NLnetLabs#221.
  - Enable SNI by default in unbound-anchor.
  Revert "- Remove SNI support from unbound-anchor; TLS is used only for"
  - Remove SNI support from unbound-anchor; TLS is used only for   encryption and not validation.
  - Add SNI support on more TLS connections (fixes NLnetLabs#193). - Add SNI support to unbound-anchor.
  - Add doxygen documentation for DSCP.
  - Fix for posix shell syntax for trap in run_msg.sh test script.
  - Fix for posix shell syntax for trap in nsd-control-setup.
  - Fix help return code in unbound-control-setup script.
  - Fix NLnetLabs#220: auth-zone section in config may lead to segfault.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants