IIAB 6.7/master fails to install on latest Ubuntu 18.04: "Unable to start service dnsmasq" #1306
Comments
@mrdavidhaag dhcpd_install: False dnsmasq_install: True |
Thank you so much for your quick response and good advice.
I will try this on Monday.
…On Sat, Nov 24, 2018 at 10:45 PM Tim Moody ***@***.***> wrote:
@mrdavidhaag <https://github.com/mrdavidhaag>
Some settings in /etc/iiab/local_vars.yml may conflict as there has been
confusion between default_vars.yml and various local_vars.yml files in the
1 line installer. Check /etc/iiab/local_vars.yml and see if you have the
following settings. When both named and dnsmasq are installed and enabled
the conflict you describe will occur. Then rerun the install.
dhcpd_install: False
dhcpd_enabled: False
named_install: False
named_enabled: False
block_DNS: False
dnsmasq_install: True
dnsmasq_enabled: True
|
@jvonau can you confirm? And suggest alternatives for IIAB 6.7/master if dnsmasq is suddenly conflicting on Ubuntu 18.04, as is claimed above? (Personally I've never had any such problems installing IIAB 6.7/master on Ubuntu 18.04 on VirtualBox, so I'd also ask whether @mrdavidhaag's use of Ubuntu 18.04 on ProxMox is perhaps somehow different from others' use of Ubuntu 18.04 on VirtualBox?) |
After the failure on the ProxMox VM, I tried the Build from Scratch procedure on
a bare-bones NUC DC32171YE machine with an Ubuntu minimal desktop installation
loaded. The result was the same dnsmasq error (port 53 already in use) at the
TASK 4 stage. On this exact same NUC machine about 2 weeks ago I successfully
installed iiab with the 1 line installer script, so I do not believe that it is
a hardware issue. Perhaps it is something with Ubuntu 18.04, or something changed
in the iiab scripts.
I have already looked at /etc/iiab/local_vars.yml on both of the failed
installation machines and they only have:
dnsmasq_install: True
dnsmasq_enabled: True
The other mentioned settings are not listed or commented out in either
file:
dhcpd_install: False
dhcpd_enabled: False
named_install: False
named_enabled: False
block_DNS: False
The same is true for /etc/iiab/local_vars.yml on the successful machine
from two weeks ago: no mention of named or dhcpd.
The co-installation of named and dnsmasq was my first suspect, and I
have already searched /etc/iiab/local_vars.yml for conflicting
settings, but found no setting for installation of named.
However, when I watch the installation, I can see the named
installation taking place, but so far I am unable to find out where this is
coming from.
I will try adding the suggested False lines to /etc/iiab/local_vars.yml
and try the installation again. I will report results here.
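For anyone repeating this check, here is a minimal sketch of how the settings discussed above could be listed from a vars file. The function name `show_dns_vars` is made up for illustration; only the variable names and the file path come from this thread. Commented-out lines are deliberately ignored, and a variable missing from the output would presumably fall back to whatever default_vars.yml says, which is an assumption worth verifying on your own install.

```shell
#!/bin/sh
# Hypothetical helper (not part of IIAB): print the uncommented
# DHCP/DNS-related settings found in an IIAB vars file.
# Usage: show_dns_vars /etc/iiab/local_vars.yml
show_dns_vars() {
    vars_file="$1"
    # Only lines that start with the key are matched, so "# named_install: ..."
    # (commented out) is not reported.
    grep -E '^(dhcpd|named|dnsmasq)_(install|enabled):|^block_DNS:' "$vars_file"
}
```

If the command prints nothing for named or dhcpd, the file is not setting them either way.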
|
I added the suggested "False" lines just above the dnsmasq True statements
in the /etc/iiab/local_vars.yml file, as in the screenshot below, on a fresh
ProxMox VM with an Ubuntu 18.04 mini installation.
[image: image.png]
This time bind (named) did not install. However, the result was the same: a
failure in task 4, where starting the dnsmasq service failed because port 53
was already in use.
TASK [4-server-options : Restart apache2] **************************************
changed: [127.0.0.1]
TASK [4-server-options : Restart dnsmasq] **************************************
fatal: [127.0.0.1]: FAILED! => {"changed": false, "msg": "Unable to start service dnsmasq: Job for dnsmasq.service failed because the control process exited with error code.\nSee \"systemctl status dnsmasq.service\" and \"journalctl -xe\" for details.\n"}
to retry, use: --limit @/opt/iiab/iiab/iiab-stages.retry
PLAY RECAP *********************************************************************
127.0.0.1 : ok=119 changed=80 unreachable=0 failed=1
root@awm:/opt/iiab/iiab# systemctl status dnsmasq.service
● dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2018-11-25 16:34:42 WIT; 46s ago
Process: 2509 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=2)
Process: 2508 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
Nov 25 16:34:42 box.lan systemd[1]: Starting dnsmasq - A lightweight DHCP and caching DNS server...
Nov 25 16:34:42 box.lan dnsmasq[2508]: dnsmasq: syntax check OK.
Nov 25 16:34:42 box.lan dnsmasq[2509]: dnsmasq: failed to create listening socket for port 53: Address already in use
Nov 25 16:34:42 box.lan systemd[1]: dnsmasq.service: Control process exited, code=exited status=2
Nov 25 16:34:42 box.lan systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
Nov 25 16:34:42 box.lan systemd[1]: Failed to start dnsmasq - A lightweight DHCP and caching DNS server.
root@awm:/opt/iiab/iiab# netstat -nlpt | grep 53
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 320/systemd-resolve
root@awm:/opt/iiab/iiab#
So the conflict does not seem to be with bind (named), but perhaps with
systemd-resolved (shown by netstat as systemd-resolve), which is the only
service listening on port 53, on 127.0.0.53:53 tcp.
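A side note on the `grep 53` filter used in this thread: it also matches port 953, and any PID that happens to contain "53". Below is a rough sketch of a stricter filter. The function name `port53_listeners` is invented here, and it assumes the usual `netstat -nlpt` column layout, with the local address in the fourth column and PID/program in the last.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -nlpt` output on stdin and print the
# local address and PID/program of sockets bound to exactly port 53.
# Usage: netstat -nlpt | port53_listeners
port53_listeners() {
    # $4 is the local address column; ":53" must be the very end of it,
    # so 127.0.0.1:953 (the named control port) is skipped.
    awk '$4 ~ /:53$/ { print $4, $NF }'
}
```

Each line it prints is a process that dnsmasq may collide with when it tries to bind port 53.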
On this VM I can stop systemd-resolved.service and then start
dnsmasq.service without error, but then I cannot resolve any DNS names in the
outside world:
root@awm:/opt/iiab/iiab# systemctl stop systemd-resolved.service
root@awm:/opt/iiab/iiab# systemctl start dnsmasq.service
root@awm:/opt/iiab/iiab# ping google.com
ping: google.com: Temporary failure in name resolution
root@awm:/opt/iiab/iiab# systemctl status dnsmasq.service
● dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2018-11-25 16:38:09 WIT; 43s ago
Process: 2542 ExecStartPost=/etc/init.d/dnsmasq systemd-start-resolvconf (code=exited, status=0/SUCCESS)
Process: 2533 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=0/SUCCESS)
Process: 2532 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
Main PID: 2541 (dnsmasq)
Tasks: 1 (limit: 4666)
CGroup: /system.slice/dnsmasq.service
└─2541 /usr/sbin/dnsmasq -x /run/dnsmasq/dnsmasq.pid -u dnsmasq -7 /etc/dnsmasq.d,.dpkg-dist,.dpkg-old,.dpkg-new --local-service --trust-anchor=.,19036,8,2,49aac11d7b6f6446702e54a1607371607a1a41855200fd2ce1cdde32f24e8fb5 --tru
Nov 25 16:38:09 box.lan systemd[1]: Starting dnsmasq - A lightweight DHCP and caching DNS server...
Nov 25 16:38:09 box.lan dnsmasq[2532]: dnsmasq: syntax check OK.
Nov 25 16:38:09 box.lan systemd[1]: Started dnsmasq - A lightweight DHCP and caching DNS server.
root@awm:/opt/iiab/iiab#
Question: Can IIAB run without dnsmasq.service? I also tried setting the
dnsmasq variables to False, but the script checks dnsmasq anyway
and fails too.
|
Likely 18.04 and/or its latest version of dnsmasq changed? Let's try to investigate on Raspbian Lite on Raspberry Pi 3 too, to shed light & solve this. Profound thanks @mrdavidhaag for the detailed report. |
Related suggestion: PR #1303 ("bind9 and dnsmasq fighting over port 53") |
I agree with Adam that I have not seen this problem on VirtualBox, so your environment should be considered.
My experience is that containers, certainly docker, come with some networking already configured that can conflict with IIAB installation. |
@jvonau writes:
Is it possible Captive Portal changes over the last 2 weeks have caused this problem? In any case, @georgejhunt suggests trying: (#1303 (comment))
|
@jvonau writes:
|
Thank you. I have tried this before, however the installation fails when
trying to access the internet.
If I stop systemd-resolved.service then the VM cannot resolve DNS, as shown
below before and after systemd-resolved.service is stopped (the long ping
times to google.com are the result of our Vsat service here in Papua, Indonesia).
*BEFORE*
root@awm:/home/awm#
root@awm:/home/awm# ping google.com
PING google.com (216.58.223.14) 56(84) bytes of data.
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=1 ttl=41 time=979 ms
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=2 ttl=41 time=976 ms
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=3 ttl=41 time=1052 ms
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=4 ttl=41 time=1067 ms
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=5 ttl=41 time=960 ms
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=6 ttl=41 time=1236 ms
64 bytes from jnb01s07-in-f14.1e100.net (216.58.223.14): icmp_seq=7 ttl=41 time=1050 ms
^C
--- google.com ping statistics ---
8 packets transmitted, 7 received, 12% packet loss, time 11306ms
rtt min/avg/max/mdev = 960.186/1046.257/1236.852/87.406 ms, pipe 2
root@awm:/home/awm# systemctl status systemd-resolved.service
● systemd-resolved.service - Network Name Resolution
Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2018-11-24 08:10:32 WIT; 1 day 7h ago
Docs: man:systemd-resolved.service(8)
https://www.freedesktop.org/wiki/Software/systemd/resolved
https://www.freedesktop.org/wiki/Software/systemd/writing-network-configuration-managers
https://www.freedesktop.org/wiki/Software/systemd/writing-resolver-clients
Main PID: 310 (systemd-resolve)
Status: "Processing requests..."
Tasks: 1 (limit: 4666)
CGroup: /system.slice/systemd-resolved.service
└─310 /lib/systemd/systemd-resolved
Nov 24 08:10:32 awm systemd-resolved[310]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr
Nov 24 08:10:32 awm systemd-resolved[310]: Using system hostname 'awm'.
Nov 24 08:10:32 awm systemd[1]: Started Network Name Resolution.
Nov 24 08:35:17 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 24 11:15:07 awm systemd-resolved[310]: Grace period over, resuming full feature set (UDP+EDNS0) for DNS server 10.10.35.1.
Nov 24 11:15:07 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 25 15:37:55 awm systemd-resolved[310]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Nov 25 15:37:55 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 25 15:44:35 awm systemd-resolved[310]: Grace period over, resuming full feature set (UDP+EDNS0) for DNS server 10.10.35.1.
Nov 25 15:44:35 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
root@awm:/home/awm# netstat -lnpt | grep 53
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 310/systemd-resolve
*AFTER*
root@awm:/home/awm# systemctl stop systemd-resolved.service
root@awm:/home/awm# netstat -lnpt | grep 53
root@awm:/home/awm# ping google.com
ping: google.com: Temporary failure in name resolution
root@awm:/home/awm# systemctl status systemd-resolved.service
● systemd-resolved.service - Network Name Resolution
Loaded: loaded (/lib/systemd/system/systemd-resolved.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Sun 2018-11-25 15:46:11 WIT; 32s ago
Docs: man:systemd-resolved.service(8)
https://www.freedesktop.org/wiki/Software/systemd/resolved
https://www.freedesktop.org/wiki/Software/systemd/writing-network-configuration-managers
https://www.freedesktop.org/wiki/Software/systemd/writing-resolver-clients
Process: 310 ExecStart=/lib/systemd/systemd-resolved (code=exited, status=0/SUCCESS)
Main PID: 310 (code=exited, status=0/SUCCESS)
Status: "Shutting down..."
Nov 24 08:10:32 awm systemd[1]: Started Network Name Resolution.
Nov 24 08:35:17 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 24 11:15:07 awm systemd-resolved[310]: Grace period over, resuming full feature set (UDP+EDNS0) for DNS server 10.10.35.1.
Nov 24 11:15:07 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 25 15:37:55 awm systemd-resolved[310]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Nov 25 15:37:55 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 25 15:44:35 awm systemd-resolved[310]: Grace period over, resuming full feature set (UDP+EDNS0) for DNS server 10.10.35.1.
Nov 25 15:44:35 awm systemd-resolved[310]: Using degraded feature set (UDP) for DNS server 10.10.35.1.
Nov 25 15:46:11 awm systemd[1]: Stopping Network Name Resolution...
Nov 25 15:46:11 awm systemd[1]: Stopped Network Name Resolution.
root@awm:/home/awm#
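One likely reason lookups die as soon as systemd-resolved is stopped: on a stock Ubuntu 18.04 install, /etc/resolv.conf normally points at the local stub resolver on 127.0.0.53, so with the service stopped nothing answers there. That is not confirmed for this particular VM. Here is a small sketch for checking it; the function name `uses_resolved_stub` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given resolv.conf sends
# lookups to the systemd-resolved stub listener on 127.0.0.53.
# Usage: uses_resolved_stub && echo "DNS depends on systemd-resolved"
uses_resolved_stub() {
    resolv_conf="${1:-/etc/resolv.conf}"
    # Match an uncommented "nameserver 127.0.0.53" line.
    grep -Eq '^nameserver[[:space:]]+127\.0\.0\.53([[:space:]]|$)' "$resolv_conf"
}
```

If it succeeds, stopping systemd-resolved without also changing resolv.conf would be expected to give exactly the "Temporary failure in name resolution" shown above.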
…On Mon, Nov 26, 2018 at 6:49 AM A Holt ***@***.***> wrote:
@jvonau <https://github.com/jvonau> writes:
Disable and stop is what appears to be needed but that may not be the
complete solution..
|
@jvonau writes:
|
@georgejhunt & all, do you have a moment to respond to the ideas/questions above prior to Thursday's http://minutes.iiab.io 10AM NYC Time call? Can someone with access to RPi 3 and Ubuntu 18.04 validate the situation there, in a known environment on known hardware or on a known VM environment? |
Thank you for the follow up. Much appreciate your time and efforts.
Our goal is to test iiab as educational content server on Community
Cellular LTE
<https://www.internetsociety.org/blog/2018/09/building-a-community-lte-network-in-bokondini-indonesia/>
deployment in Papua, Indonesia. The idea is for users of the 4G signal to
have unlimited, non-revenue access to iiab educational content to make use
of available bandwidth. We have limited internet backhaul due to Vsat, but
have an abundance of on-site bandwidth. Hopefully integration of iiab will
allow us to put the excess bandwidth to good use.
|
I agree with Jerry. We should be running in appliance mode (only one
adapter). This information seems important:
"Our goal is to test iiab as educational content server on Community
Cellular LTE" -- on a preexisting network structure with ip addresses and
dns handled by someone else
You can verify the automatic config decisions our code made by reading
/etc/iiab/iiab.ini -- search for "iiab_network_mode"
At the same location, verify that named, dnsmasq_enabled, dhcpd are all
False. Jerry is more able than I to figure out the logic in
/opt/iiab/iiab/roles/network/tasks/computed_services.yml. It looks to me
that there may need to be some tweaking.
In the meantime, I think the VM's name resolution on port 53 must already
be in place. So I'd try these in local_vars.yml:
captive_portal_install: True
captive_portal_enabled: False
dnsmasq_install: True
dnsmasq_enabled: False
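In case it helps anyone applying these four lines by script rather than by hand, here is a minimal sketch. `set_local_var` is a made-up helper, not something IIAB ships. It assumes simple one-line `key: value` entries and does not touch commented-out lines.

```shell
#!/bin/sh
# Hypothetical helper: set "key: value" in a YAML vars file.
# Replaces an existing uncommented line for the key, otherwise appends one.
# Usage:
#   set_local_var /etc/iiab/local_vars.yml captive_portal_install True
#   set_local_var /etc/iiab/local_vars.yml captive_portal_enabled False
#   set_local_var /etc/iiab/local_vars.yml dnsmasq_install True
#   set_local_var /etc/iiab/local_vars.yml dnsmasq_enabled False
set_local_var() {
    vars_file="$1"; key="$2"; value="$3"
    if grep -q "^${key}:" "$vars_file"; then
        # Rewrite via a temporary file so this works with any POSIX sed.
        sed "s|^${key}:.*|${key}: ${value}|" "$vars_file" > "$vars_file.tmp" &&
            mv "$vars_file.tmp" "$vars_file"
    else
        printf '%s: %s\n' "$key" "$value" >> "$vars_file"
    fi
}
```

Running it twice with the same arguments leaves the file unchanged, so it should be safe to repeat before re-running the install.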
|
IIAB 6.7/master is definitely broken on Ubuntu 18.04 as tested on multiple NUCs: (unlike on Raspbian 2018-11-13 where IIAB 6.7/master still works)
Further detail:
Thanks all for helping to narrow down what broke IIAB 6.7/master — which worked 2 weeks ago. |
Thank you
Made the following changes in local_vars.yml:
captive_portal_install: True
captive_portal_enabled: False
dnsmasq_install: True
dnsmasq_enabled: False
and happy to report successful installation of iiab-install
this is on VM running Ubuntu 18.04-mini in ProxMox VME using the Do
everything from Scratch method
I have not tried 1 line installer-script (with edits)
Thank you all for super support and clear concise good advice
…On Wed, Nov 28, 2018 at 9:20 AM A Holt ***@***.***> wrote:
IIAB 6.7/master is definitely broken on Ubuntu 18.04 as tested on multiple
NUCs: (unlike on Raspbian 2018-11-13 where IIAB 6.7/master still works)
TASK [4-server-options : Restart dnsmasq] **************************************
fatal: [127.0.0.1]: FAILED! => {"changed": false, "msg": "Unable to start service dnsmasq: Job for dnsmasq.service failed because the control process exited with error code.\nSee \"systemctl status dnsmasq.service\" and \"journalctl -xe\" for details.\n"}
Further detail:
***@***.***:~# systemctl status dnsmasq
● dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2018-11-27 18:57:11 EST; 12min ago
Process: 14669 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=2)
Process: 14662 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
Nov 27 18:57:11 box.lan systemd[1]: Starting dnsmasq - A lightweight DHCP and caching DNS server...
Nov 27 18:57:11 box.lan dnsmasq[14662]: dnsmasq: syntax check OK.
Nov 27 18:57:11 box.lan dnsmasq[14669]: dnsmasq: failed to create listening socket for 172.18.96.1: Address already in use
Nov 27 18:57:11 box.lan systemd[1]: dnsmasq.service: Control process exited, code=exited status=2
Nov 27 18:57:11 box.lan systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
Nov 27 18:57:11 box.lan systemd[1]: Failed to start dnsmasq - A lightweight DHCP and caching DNS server.
Thanks all for helping to narrow down what broke IIAB 6.7/master — which
worked 2 weeks ago.
|
Clean install of IIAB 6.7/master failed on a 100% fresh Ubuntu Server 18.04.1 VM:
George / Tim indicated yesterday (during our voice call) that the above would work. It did not. Seems we have some kind of common/intermittent failure? FYI, all updates were applied to the 18.04.1 VM prior to beginning the IIAB install, using @jvonau writes:
|
@jvonau requested output of
|
"dnsmasq & systemd Causing Intermittent CPU Spikes" @jvonau wonders if the above might be related? |
FYI I get the exact same/above failure on 2 other Ubuntu Server 18.04.1 machines, that happen to be NUC PC's (10.8.0.6 & 10.8.0.34) rather than VM's. FWIW these NUC PC's were updated using |
The primary issue is that dnsmasq fails to start when it is installed by the system package manager, so this is an upstream problem. Search Ubuntu's bug tracker for others who have run across this issue, or file a bug. I'll bet this could be reproduced by installing dnsmasq with just apt on a new VM, without using iiab. The observation above on real hardware suggests that dnsmasq may break on machines already configured out in the wild, which will most likely receive an update and suddenly break. The above link points to a possible solution for the issue between systemd-resolved and dnsmasq. |
@jvonau & All, there are many concrete suggestions here: How to avoid conflicts between dnsmasq and systemd-resolved? Which do you suggest we try? |
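One option that comes up often for this kind of conflict is turning off the systemd-resolved stub listener with `DNSStubListener=no` in /etc/systemd/resolved.conf, which is a documented resolved.conf setting. This is only a sketch of that idea, not something this thread confirms as the fix IIAB adopted. The helper name `disable_stub_listener` is made up. After such a change, systemd-resolved would need a restart, and /etc/resolv.conf would likely need to point somewhere other than 127.0.0.53.

```shell
#!/bin/sh
# Hypothetical helper: set DNSStubListener=no in a resolved.conf-style file.
# Assumes the file holds only a [Resolve] section, as the stock file does,
# so appending a line at the end still lands in that section.
# Usage: disable_stub_listener /etc/systemd/resolved.conf
disable_stub_listener() {
    conf="${1:-/etc/systemd/resolved.conf}"
    if grep -Eq '^#?DNSStubListener=' "$conf"; then
        # Replace the existing (possibly commented-out) setting.
        sed -E 's/^#?DNSStubListener=.*/DNSStubListener=no/' "$conf" > "$conf.tmp" &&
            mv "$conf.tmp" "$conf"
    else
        printf 'DNSStubListener=no\n' >> "$conf"
    fi
}
```

Try it on a copy of the file first, and keep a working resolver configured before restarting services on a remote machine.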
Seems unrelated to: #1569 dnsmasq sometimes fails to start on Raspbian Desktop (possibly also Lite?) |
Yes thanks. I was able to get it working with Debian. I plan on building a
few more and shipping to Haiti. Really appreciate all your help.
Stephen
On Mon, Mar 11, 2019 at 13:17 A Holt ***@***.***> wrote:
@holta <https://github.com/holta> correct those logs were from Desktop
version.
@MrSteve2 <https://github.com/MrSteve2> did you make progress using
Ubuntu 18.04.2 Desktop?
(Or using Debian 9.8 Server...or Desktop possibly?)
If so, definitely check out http://FAQ.IIAB.IO #15
<#15> and let us know:
"How do I customize my Internet-in-a-Box home page?"
|
Okay, this is failing for me too: Setup as follows:
Network:
Latest commit hash: 23286e1 OS: |
Running |
Did you happen to try repeated runs of (We need to publish a workaround if possible, as this error is proving to be quite common.) |
Here are the main Networking issues others are facing at this time — these might or might not be related — but are good to keep in mind as we seek a more bulletproof IIAB installation process:
|
No. This was the very first run. After that, I tried starting dnsmasq manually and it worked. After that, I tried |
Manually restarting dnsmasq with success suggests that br0 was not fully initialized when dnsmasq was restarted... |
Should the networking playbook enforce this prereq, and if so pause for 30sec or whatever until br0 appears before checking again? (Or should we start with diagnostic/error messaging at that point in the playbook?) |
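To make the "pause until br0 appears" idea concrete, here is a rough shell sketch of such a wait loop. `wait_for_iface` is a made-up name, and the IIAB playbook may well handle this differently. It only checks that the interface exists under /sys/class/net, not that it already has its IP address, so it may not cover every timing case.

```shell
#!/bin/sh
# Hypothetical helper: wait until a network interface exists, polling once
# per second, and give up after a timeout.
# Usage: wait_for_iface br0 30 && systemctl restart dnsmasq
wait_for_iface() {
    iface="$1"
    timeout="${2:-30}"              # seconds to wait before giving up
    sysfs="${3:-/sys/class/net}"    # overridable, mainly for testing
    waited=0
    while [ ! -e "$sysfs/$iface" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            # Tell the implementer why this failed instead of failing silently.
            echo "interface $iface did not appear within ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

The error message on timeout is there so the implementer at least learns which prerequisite was missing.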
@jvonau suggests:
Great. Let's make this fault-tolerant across different OS's & HW that each have their own wacky timings. e.g. enforcing the prereqs we want and/or telling the implementer why it's failing if manual intervention is absolutely necessary. |
Working with @jvonau and Tony Anderson (PR #1636) on his Acer XC-885 desktop machine, there appears to be a similar timing issue, with dnsmasq starting prior to the creation of br0 (bridge for LAN-side, Wi-Fi, hostapd):
By default there's a 1-second built-in delay, that we may now/soon want to change Line 43 of roles/network/defaults/main.yml -- from 1 second:
...up to something like 5 seconds?
(And then of course run...)
Context: this glitch only happens during IIAB's initial install, when IIAB's network role is run. |
@jvonau can you confirm this is now solved by your recent dnsmasq PRs? |
Trying to install iiab on VM in Proxmox using 1 line installer script 6.7 and installation failed at TASK 4
TASK [4-server-options : Restart dnsmasq] ******************************************************
fatal: [127.0.0.1]: FAILED! => {"changed": false, "msg": "Unable to start service dnsmasq: Job for dnsmasq.service failed because the control process exited with error code.\nSee "systemctl status dnsmasq.service" and "journalctl -xe" for details.\n"}
to retry, use: --limit @/opt/iiab/iiab/iiab-stages.retry
PLAY RECAP *************************************************************************************
127.0.0.1 : ok=176 changed=122 unreachable=0 failed=1
Tried using Do everything from Scratch method - same result dnsmasq: failed to create listening socket for port 53: Address already in use
netstat tells me that bind9 and systemd-resolve are both listening on port 53
root@box:/home/awm# netstat -nlpt | grep 53
tcp 0 0 10.10.35.20:53 0.0.0.0:* LISTEN 417/named
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 417/named
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 404/systemd-resolve
tcp 0 0 127.0.0.1:953 0.0.0.0:* LISTEN 417/named
So I tried to stop bind9 service and run iiab install again - same error
root@box:/opt/iiab/iiab# systemctl status dnsmasq.service
● dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2018-11-24 10:58:06 WIT; 24s ago
Process: 5557 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=2)
Process: 5556 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
Nov 24 10:58:06 box.lan systemd[1]: Starting dnsmasq - A lightweight DHCP and caching DNS server...
Nov 24 10:58:06 box.lan dnsmasq[5556]: dnsmasq: syntax check OK.
Nov 24 10:58:06 box.lan dnsmasq[5557]: dnsmasq: failed to create listening socket for port 53: Address already in use
Nov 24 10:58:06 box.lan systemd[1]: dnsmasq.service: Control process exited, code=exited status=2
Nov 24 10:58:06 box.lan systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
Nov 24 10:58:06 box.lan systemd[1]: Failed to start dnsmasq - A lightweight DHCP and caching DNS server.
If I stop systemd-resolved.service then the tasks fail at the check for internet.
Any ideas or suggestions greatly appreciated. My Linux skills are thin.