Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reporting - Unbound DNS - Stopped after two hours #6284

Closed
2 tasks done
wtremmel opened this issue Jan 30, 2023 · 14 comments
Closed
2 tasks done

Reporting - Unbound DNS - Stopped after two hours #6284

wtremmel opened this issue Jan 30, 2023 · 14 comments
Assignees
Labels
bug Production bug
Milestone

Comments

@wtremmel
Copy link

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug

After enabling Unbound DNS reporting it stopped after about two hours. Clicking "Reset DNS Data" on the Reporting/Settings page restarted reporting again. Unbound itself worked fine the whole time

Logging showed:
2023-01-30T13:33:55 | Informational | unbound | [9208:0] info: dnsbl_module: attempting to open pipe |  
2023-01-30T13:33:44 | Informational | unbound | [9208:2] info: dnsbl_module: no logging backend found.

To Reproduce

  1. I rsyned a file with local-data: and local-data-ptr: into /var/unbound/etc
  2. Restart unbound using unbound-control -c /var/unbound/unbound.conf reload

Relevant log files

2023-01-30T13:33:55 | Informational | unbound | [9208:0] info: dnsbl_module: attempting to open pipe |  
2023-01-30T13:33:44 | Informational | unbound | [9208:2] info: dnsbl_module: no logging backend found.

every 10 seconds

This is the log immediately after the restart:

2023-01-30T14:06:07 | Informational | unbound | [11560:2] info: dnsbl_module: no logging backend found. |  
2023-01-30T14:06:07 | Informational | unbound | [11560:2] info: dnsbl_module: attempting to open pipe |  
2023-01-30T14:05:57 | Informational | unbound | [11560:0] info: generate keytag query _ta-4f66. NULL IN |  
2023-01-30T14:05:56 | Informational | unbound | [11560:2] info: dnsbl_module: no logging backend found. |  
2023-01-30T14:05:56 | Informational | unbound | [11560:2] info: dnsbl_module: attempting to open pipe |  
2023-01-30T14:05:56 | Informational | unbound | [11560:0] info: start of service (unbound 1.17.1).

@fichtner fichtner added the bug Production bug label Jan 30, 2023
@fichtner fichtner added this to the 23.7 milestone Jan 30, 2023
@swhite2
Copy link
Member

swhite2 commented Jan 30, 2023

What happens if you restart Unbound from the GUI? Some critical elements are bootstrapped within this sequence of events, those elements are skipped with unbound-control.

If the error persists it might be useful to go through the logs, e.g.

  • search the log grid directly for Error or Exception
  • /var/log/resolver/xxx.log and grep on Error or Exception to pinpoint the moment of failure.

@wtremmel
Copy link
Author

This is the full log from the moment of restart from the command line. No errors or exceptions visible:

<29>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="79"] [11560:0] notice: Restart of unbound 1.17.1.
<165>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 66193 - [meta sequenceId="80"] Closing logger
<29>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="81"] [11560:0] notice: init module 0: python
<30>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="82"] [11560:0] info: dnsbl_module: updating blocklist.
<30>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="83"] [11560:0] info: dnsbl_module: blocklist loaded. length is 0
<29>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="84"] [11560:0] notice: init module 1: dns64
<29>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="85"] [11560:0] notice: init module 2: validator
<29>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="86"] [11560:0] notice: init module 3: iterator
<30>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="87"] [11560:0] info: start of service (unbound 1.17.1).
<30>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="88"] [11560:2] info: dnsbl_module: attempting to open pipe
<30>1 2023-01-30T14:05:56+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="89"] [11560:2] info: dnsbl_module: no logging backend found.
<30>1 2023-01-30T14:05:57+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="90"] [11560:0] info: generate keytag query _ta-4f66. NULL IN
<30>1 2023-01-30T14:06:07+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="91"] [11560:2] info: dnsbl_module: attempting to open pipe
<30>1 2023-01-30T14:06:07+01:00 fw01.ch5.garf.de unbound 11560 - [meta sequenceId="92"] [11560:2] info: dnsbl_module: no logging backend found.

@wtremmel
Copy link
Author

wtremmel commented Jan 30, 2023

Same log when restarted from the GUI:

<29>1 2023-01-30T14:31:28+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="64"] [1167:0] notice: init module 0: python
<30>1 2023-01-30T14:31:28+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="65"] [1167:0] info: dnsbl_module: updating blocklist.
<30>1 2023-01-30T14:31:28+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="66"] [1167:0] info: dnsbl_module: blocklist loaded. length is 0
<29>1 2023-01-30T14:31:28+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="67"] [1167:0] notice: init module 1: dns64
<29>1 2023-01-30T14:31:28+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="68"] [1167:0] notice: init module 2: validator
<29>1 2023-01-30T14:31:28+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="69"] [1167:0] notice: init module 3: iterator
<30>1 2023-01-30T14:31:29+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="70"] [1167:0] info: start of service (unbound 1.17.1).
<30>1 2023-01-30T14:31:30+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="71"] [1167:1] info: generate keytag query _ta-4f66. NULL IN
<30>1 2023-01-30T14:31:30+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="72"] [1167:1] info: dnsbl_module: attempting to open pipe
<30>1 2023-01-30T14:31:30+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="73"] [1167:1] info: dnsbl_module: no logging backend found.
<165>1 2023-01-30T14:31:30+01:00 fw01.ch5.garf.de unbound 5021 - [meta sequenceId="74"] Backgrounding unbound logging backend.
<30>1 2023-01-30T14:31:40+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="75"] [1167:0] info: dnsbl_module: attempting to open pipe
<30>1 2023-01-30T14:31:40+01:00 fw01.ch5.garf.de unbound 1167 - [meta sequenceId="76"] [1167:0] info: dnsbl_module: successfully opened pipe

Here I get an error in the "Details" tab of Reporting:
/usr/local/opnsense/mvc/app/controllers/OPNsense/Base/ApiControllerBase.php:60: array_keys(): Argument #1 ($array) must be of type array, null given
This resolved itself somehow after about 5 minutes.

@swhite2
Copy link
Member

swhite2 commented Jan 30, 2023

What is the output of configctl unbound qstats details 10 and ps -fax | grep unbound?

@wtremmel
Copy link
Author

More on the Error after restarting from the GUI:

[30-Jan-2023 14:34:16 Europe/Berlin] TypeError: array_keys(): Argument #1 ($array) must be of type array, null given in /usr/local/opnsense/mvc/app/controllers/OPNsense/Base/ApiControllerBase.php:60
Stack trace:
#0 /usr/local/opnsense/mvc/app/controllers/OPNsense/Base/ApiControllerBase.php(60): array_keys(NULL)
#1 /usr/local/opnsense/mvc/app/controllers/OPNsense/Unbound/Api/OverviewController.php(125): OPNsense\Base\ApiControllerBase->searchRecordsetBase(NULL)
#2 [internal function]: OPNsense\Unbound\Api\OverviewController->searchQueriesAction()
#3 [internal function]: Phalcon\Dispatcher\AbstractDispatcher->callActionMethod(Object(OPNsense\Unbound\Api\OverviewController), 'searchQueriesAc...', Array)
#4 [internal function]: Phalcon\Dispatcher\AbstractDispatcher->dispatch()
#5 /usr/local/opnsense/www/api.php(24): Phalcon\Mvc\Application->handle('/api/unbound/ov...')
#6 {main}

@wtremmel
Copy link
Author

configctl unbound qstats details 10:

[{"uuid":"57a7f0cb-30ad-46ca-8206-aed06a614341","time":1675086930,"client":"tapocam01.ch5.garf.de","family":"ip4","type":"A","domain":"euw1-relay-i-010a8679e77649bc1.dcipc.i.tplinknbu.com.","action":"Pass","source":"Cache","blocklist":"None","rcode":"NOERROR","resolve_time_ms":0,"dnssec_status":"Insecure","ttl":270},{"uuid":"5e6b7d8a-010b-4f82-8e49-792ed74ed66e","time":1675086924,"client":"fdd1:9494:7978:15:182a:b0af:d8ed:1234","family":"ip6","type":"AAAA","domain":"imap.mail.me.com.akadns.net.","action":"Pass","source":"Recursion","blocklist":"None","rcode":"NOERROR","resolve_time_ms":13,"dnssec_status":"Insecure","ttl":274},{"uuid":"e7a95afa-f241-43a0-815b-05463ffdaf19","time":1675086924,"client":"tapocam01.ch5.garf.de","family":"ip4","type":"A","domain":"euw1-relay-i-010a8679e77649bc1.dcipc.i.tplinknbu.com.","action":"Pass","source":"Cache","blocklist":"None","rcode":"NOERROR","resolve_time_ms":0,"dnssec_status":"Insecure","ttl":276},{"uuid":"d63e7eff-ad69-4bbd-8565-6537a3951895","time":1675086924,"client":"kellercam.ch5.garf.de","family":"ip4","type":"AAAA","domain":"lancre.wtremmel.dedyn.io.","action":"Pass","source":"Cache","blocklist":"None","rcode":"NOERROR","resolve_time_ms":0,"dnssec_status":"Secure","ttl":59},{"uuid":"afcbac2c-f47c-4069-a828-43155cc5cb2e","time":1675086924,"client":"kellercam.ch5.garf.de","family":"ip4","type":"A","domain":"lancre.wtremmel.dedyn.io.","action":"Pass","source":"Recursion","blocklist":"None","rcode":"NOERROR","resolve_time_ms":20,"dnssec_status":"Secure","ttl":60},{"uuid":"4b9d8d56-ae15-46a0-a413-f6e5e29c4337","time":1675086923,"client":"fdd1:9494:7978:15:182a:b0af:d8ed:1234","family":"ip6","type":"AAAA","domain":"lancre.wtremmel.dedyn.io.","action":"Pass","source":"Recursion","blocklist":"None","rcode":"NOERROR","resolve_time_ms":31,"dnssec_status":"Secure","ttl":60},{"uuid":"6b86b373-bad7-4c95-b6e3-a4ff44a7bcc5","time":1675086920,"client":"192.168.24.188","family":"ip4","type":"A","domain":"network.satnogs.org.","action":"Pass","source":"Cache","blocklist":"None","rcode":"NOERROR","resolve_time_ms":0,"dnssec_status":"Insecure","ttl":180},{"uuid":"d6cd57c8-f343-4184-bc0c-eee92d3e44ee","time":1675086920,"client":"192.168.24.188","family":"ip4","type":"AAAA","domain":"network.satnogs.org.","action":"Pass","source":"Cache","blocklist":"None","rcode":"NOERROR","resolve_time_ms":0,"dnssec_status":"Insecure","ttl":180},{"uuid":"5218c750-ca5f-46f9-bd3b-38d56656265d","time":1675086919,"client":"mqtt.ch5.garf.de","family":"ip4","type":"A","domain":"vocapi.wirelesscar.net.","action":"Pass","source":"Recursion","blocklist":"None","rcode":"NOERROR","resolve_time_ms":46,"dnssec_status":"Insecure","ttl":30},{"uuid":"c549b199-9b00-4032-a80f-d8c347dbf36a","time":1675086919,"client":"mqtt.ch5.garf.de","family":"ip4","type":"AAAA","domain":"vocapi.wirelesscar.net.","action":"Pass","source":"Recursion","blocklist":"None","rcode":"NOERROR","resolve_time_ms":56,"dnssec_status":"Insecure","ttl":30}]

ps -fax | grep unbound:

 1167  -  Ss       0:03.19 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
 4984  -  Is       0:00.00 daemon: /usr/local/opnsense/scripts/unbound/logger.py[5021] (daemon)
 5021  -  S        0:29.00 /usr/local/bin/python3 /usr/local/opnsense/scripts/unbound/logger.py (python3.9)
37762  0  S+       0:00.00 grep unbound

@swhite2
Copy link
Member

swhite2 commented Jan 30, 2023

no strange things happening here.. You could try restarting unbound, killing the logger process (in your case kill 5021) and running it manually from the command line until it fails: /usr/local/opnsense/scripts/unbound/logger.py.

@deku-m
Copy link

deku-m commented Feb 3, 2023

I must say this morning i woke up and my unbound DNS was stoppend, had to restart it.
Yes i also have DNS reporting on.

Tried to check gui log file but there is nothing in it. Only time and date.
Any tips for more debugging?

@swhite2
Copy link
Member

swhite2 commented Feb 3, 2023

@deku-m Can you try # opnsense-patch 02f44d902 (02f44d9)?

There might've been a race related to work queuing. Let's see if this fixes a bailout on the part of the logging mechanism.

@deku-m
Copy link

deku-m commented Feb 8, 2023

opnsense-patch 02f44d9

I applied the patch today lets see how it goes.
Also implemented monit for monitoring and restarting unbound when crashing.

@wtremmel
Copy link
Author

wtremmel commented Feb 9, 2023

Also applied the patch, reporting run fine for 18 hours, now I get an error when opening the details tab:

/usr/local/opnsense/mvc/app/controllers/OPNsense/Base/ApiControllerBase.php:60: array_keys(): Argument #1 ($array) must be of type array, null given

Update: Looks like the problem fixed itself. Report is running (again?)

@AdSchellevis
Copy link
Member

Just added a patch 63bac3c to prevent a crash in these cases (opnsense-patch 63bac3c). Although the caller should try to pass a null, it's likely better to prevent any output in that case (intention was likely to offer an empty set here).

@wtremmel
Copy link
Author

Since the last update the problem seems fixed.

@fichtner
Copy link
Member

fichtner commented Mar 6, 2023

Confirmed fixed.

@fichtner fichtner closed this as completed Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Production bug
Development

No branches or pull requests

5 participants