[Bug]: Could not connect to broker.netmaker.domain.tld #1186

Closed
1 task done
rowhit opened this issue Jun 5, 2022 · 14 comments

@rowhit

rowhit commented Jun 5, 2022

What happened?

Installation works fine, and I can create a network and an access key, but I cannot add nodes to the network. Netclient keeps complaining that it cannot connect to the broker, even though *.netmaker.{{domain}} is forwarded correctly (which is required for the ACME certificate anyway). I can ping the master node where Netmaker is installed, but not the peers. I also cannot SSH to the master node, even though I can ping it and ufw shows the port as open.

The ufw ports are open on the server and client as well:
To                         Action      From
443/tcp                    ALLOW       Anywhere
53/udp                     ALLOW       Anywhere
53/tcp                     ALLOW       Anywhere
51821:51830/udp            ALLOW       Anywhere
8883/tcp                   ALLOW       Anywhere
22/tcp                     ALLOW       Anywhere
443/tcp (v6)               ALLOW       Anywhere (v6)
53/udp (v6)                ALLOW       Anywhere (v6)
53/tcp (v6)                ALLOW       Anywhere (v6)
51821:51830/udp (v6)       ALLOW       Anywhere (v6)
8883/tcp (v6)              ALLOW       Anywhere (v6)
22/tcp (v6)                ALLOW       Anywhere (v6)

Here is the dashboard image. It seems to recognize the devices and get the right ip addresses but cannot ping the
image

Version

v0.14.2

What OS are you using?

Linux

Relevant log output

[netclient] 2022-06-05 16:15:24 joining default-net at api.netmaker.{{domain}}:443 
[netclient] 2022-06-05 16:15:24 starting wireguard 
[netclient] 2022-06-05 16:15:27 certificates/key saved  
[netclient] 2022-06-05 16:15:57 unable to connect to broker, retrying ... 
Ping tcp://broker.netmaker.{{domain}}:8883({{ip_address}}:8883) - Connected - time=131.885665ms
Ping tcp://broker.netmaker.{{domain}}:8883({{ip_address}}:8883) - Connected - time=190.40443ms
Ping tcp://broker.netmaker.{{domain}}:8883({{ip_address}}:8883) - Connected - time=130.113114ms
[netclient] 2022-06-05 16:16:01 could not connect to broker broker.netmaker.{{domain}} connect timeout 
[netclient] 2022-06-05 16:16:01 connection issue detected.. attempt connection with new certs and broker information 
[netclient] 2022-06-05 16:16:01 certificates/key saved  
[netclient] 2022-06-05 16:16:33 could not connect to broker at broker.netmaker.{{domain}}:8883 
[netclient] 2022-06-05 16:16:33 failed to publish update for join connection timeout
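
The log above shows the TCP ping to port 8883 succeeding while the broker connection still times out, which suggests the failure may sit above the TCP layer, for example in the TLS handshake. The following is a minimal Python sketch for telling the two cases apart. The function name `probe_broker` and its result strings are hypothetical and exist only for this illustration. The CA, certificate, and key paths are whatever your own netclient installation uses; none are assumed here.

```python
import socket
import ssl


def probe_broker(host, port, cafile=None, certfile=None, keyfile=None, timeout=5.0):
    """Return 'tcp_failed', 'tls_failed', or 'tls_ok' for host:port.

    Hypothetical helper for illustration; not part of netclient.
    """
    # Step 1: plain TCP connect (what the "Ping tcp://..." log lines check).
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"

    # Step 2: TLS handshake on top of the open TCP connection.
    try:
        ctx = ssl.create_default_context(cafile=cafile)
        if certfile:
            # Present a client certificate, as a broker with
            # require_certificate true would expect.
            ctx.load_cert_chain(certfile, keyfile)
        with ctx.wrap_socket(sock, server_hostname=host):
            # Caveat: with TLS 1.3 a rejected client certificate may only
            # surface on the first read or write after the handshake.
            return "tls_ok"
    except (ssl.SSLError, OSError):
        return "tls_failed"
    finally:
        sock.close()
```

Call it with the broker host and port from the log, plus the CA, client certificate, and key paths from your own setup. A result of `tls_failed` combined with a successful TCP ping points at certificates or the TLS configuration rather than at the firewall.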

Contributing guidelines

  • Yes, I did.
@rowhit rowhit added the bug Something isn't working label Jun 5, 2022
@afeiszli
Contributor

afeiszli commented Jun 6, 2022

Are you using Traefik as the proxy? What install instructions are you following?

@rowhit
Author

rowhit commented Jun 6, 2022

I followed the instructions provided here: https://docs.netmaker.org/quick-start.html, it uses Caddyfile.

@atomlab

atomlab commented Jun 7, 2022

Same problem here. My Mosquitto server listens on port 8883 on a public IP, with no proxy (Traefik or Nginx) in front of it.

[netclient] 2022-06-07 20:31:29 register at https://nm-api.mydomain.com/api/server/register
[netclient] 2022-06-07 20:31:30 certificates/key saved
[netclient] 2022-06-07 20:32:00 unable to connect to broker, retrying ...
Ping tcp://nm-mq.mydomain.com:8883(x.x.x.x:8883) - Connected - time=70.198881ms
Ping tcp://nm-mq.mydomain.com:8883(x.x.x.x:8883) - Connected - time=71.9163ms
Ping tcp://nm-mq.mydomain.com:8883(x.x.x.x:8883) - Connected - time=71.848659ms
[netclient] 2022-06-07 20:32:04 could not connect to broker nm-mq.mydomain.com connect timeout
[netclient] 2022-06-07 20:32:04 connection issue detected.. attempt connection with new certs and broker information
[netclient] 2022-06-07 20:32:04 register at https://nm-api.mydomain.com/api/server/register
[netclient] 2022-06-07 20:32:04 certificates/key saved
[netclient] 2022-06-07 20:32:05 restarting netclient.service
[netclient] 2022-06-07 20:32:36 could not connect to broker at nm-mq.mydomain.com:8883
[netclient] 2022-06-07 20:32:36 failed to publish update for join connection timeout
[netclient] 2022-06-07 20:32:37 restarting netclient.service
[netclient] 2022-06-07 20:32:38 joined  mynet

The port check from the client is OK:

# telnet nm-mq.mydomain.com 8883
Trying x.x.x.x...
Connected to nm-mq.mydomain.com.
Escape character is '^]'.

/etc/mosquitto/conf.d/custom.conf

user root
per_listener_settings true

listener 8883
allow_anonymous false
require_certificate true
use_identity_as_username true

cafile /etc/netmaker/root.pem
certfile /etc/netmaker/server.pem
keyfile /etc/netmaker/server.key

listener 1883
allow_anonymous true

netmaker server

# netstat -pln|grep mos
tcp        0      0 0.0.0.0:8883            0.0.0.0:*               LISTEN      191063/mosquitto
tcp        0      0 0.0.0.0:1883            0.0.0.0:*               LISTEN      191063/mosquitto

Netmaker 0.14.2
Netclient 0.14.2
Mosquitto 2.0.11
Run method: binary + systemd

Netmaker server OS: Debian 11
Netclient OS: Ubuntu 20.04 LTS

@Nexxus-LMT

Nexxus-LMT commented Jun 8, 2022

I went through similar setup issues. In case you're still having this issue or using the same setup: I never used Caddy, but I found its setup to be lacking some configuration listed in the rest of the setup guides, so to be cautious I ditched the Caddy setup.

Try the following:

FYI, I am using my own nginx server, providing SSL and a reverse proxy in front of the entire Netmaker docker-compose setup, and it works with a little tweaking (and coffee).

  1. Double-check that all necessary ports are open on your firewall.

  2. Ensure you have set up a wildcard *.netmaker.your.domain.

  3. I'm pretty sure broker.netmaker.your.domain is not in the default Caddy config, so if you know how, try adding it, load the new config, and reverse proxy it to your MQ. I'm not sure it's necessary because I don't use Caddy, but for my nginx setup it was.

  4. Just in case: if you get the "broker.netmaker.your.domain port blank error", you can add mqport: "8883" or
    apply broker.netmaker.your.domain:443 to etc/netclient/config/netconfig-(yournetwork)

    version: v0.14.2
    mqport: "8883"
    server: broker.netmaker.domain.tld

When it works, Netmaker is awesome... but I guess it's not easy to keep so many guides up to date and address all the tweaks users employ per use case.

@atomlab

atomlab commented Jun 8, 2022

In my case, I updated OpenSSL from 1.1.1n to 3.0.2, and the problem is gone! I think the OpenSSL version matters.
More precisely, I moved Netmaker from Debian 11 to Ubuntu 22.04 LTS.

Debian 11

# openssl version
OpenSSL 1.1.1n  15 Mar 2022

Broker logs

1654636549: New connection from x.x.x.x:49406 on port 8883.
1654636549: OpenSSL Error[0]: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
1654636549: Client <unknown> disconnected: Protocol error.

Ubuntu 22.04 LTS. The broker works well. The Netmaker certs were also recreated, though that may not have been necessary.

# openssl version
OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)

Broker logs

1654701658: New connection from x.x.x.x:59130 on port 8883.
1654701658: New client connected from x.x.x.x:59130 as DjAd2xOlEZjdpP5nqbwDdM0 (p2, c1, k30, u'<my_node_name>').
1654701658: Client DjAd2xOlEZjdpP5nqbwDdM0 disconnected.
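
The broker logs above are what distinguish a certificate rejection from a network problem. As a small illustration, this Python sketch picks out OpenSSL error lines from Mosquitto log lines in the `epoch: message` form shown above and converts the timestamp to UTC. The function name `tls_errors` is made up for this example, and the sketch assumes the log uses Mosquitto's default epoch-seconds timestamp prefix.

```python
import re
from datetime import datetime, timezone

# Matches lines such as "1654636549: New connection from ...".
LINE_RE = re.compile(r"^(\d+): (.*)$")


def tls_errors(log_lines):
    """Yield (utc_datetime, message) for log lines reporting OpenSSL errors.

    Hypothetical helper for illustration only.
    """
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match and "OpenSSL Error" in match.group(2):
            when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
            yield when, match.group(2)


if __name__ == "__main__":
    sample = [
        "1654636549: New connection from x.x.x.x:49406 on port 8883.",
        "1654636549: OpenSSL Error[0]: error:14094412:SSL routines:"
        "ssl3_read_bytes:sslv3 alert bad certificate",
        "1654636549: Client <unknown> disconnected: Protocol error.",
    ]
    for when, message in tls_errors(sample):
        print(when.isoformat(), message)
```

A "bad certificate" alert in the output means the TCP connection reached the broker and the TLS handshake was rejected, which matches the symptom of a successful port check followed by a connect timeout in netclient.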

@rowhit
Author

rowhit commented Jun 15, 2022

A Netmaker installation based on the quick-start guide (https://docs.netmaker.org/quick-start.html), which now uses Traefik instead of Caddy by default, seems to work nicely with no issues. I have tested a reasonably complex combination of devices, machines, and containers behind NAT across multiple networks, and they all seem to work fine. I am using Netmaker v0.14.2. I believe this issue can be closed unless there are outstanding problems. Thanks again for promptly looking into it.
image

@rowhit rowhit changed the title from "[Bug]: Could not connect to broker.netmaker.{{domain}}" to "[Bug]: Could not connect to broker.netmaker.domain.tld" Jun 15, 2022
@jr200

jr200 commented Jun 19, 2022

I'm using 0.14.3. I think I had this same issue, which I think might be the same as #1100, and can be debugged with this gist.

I am using a combination of traefik and contained docker-compose files.

My steps were:

  1. docker-compose up -d on netmaker server
  2. netclient join -t TOKEN on node
  3. the mq container reported the error OpenSSL Error[0]: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate.
  4. I ran docker-compose down on the netmaker server
  5. I removed the shared_certs docker volume: docker volume rm mynetmaker_shared_certs
  6. started the netmaker server: docker-compose up -d

On the node, I saw subscribed to node updates for node mynode..., which confirmed the issue was resolved.
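
The server-side part of the steps above (steps 4 to 6) can be sketched as a small Python script. This is only an illustration: `reset_cert_steps` and `run_steps` are hypothetical helper names, the volume name is the one quoted in the comment and will differ per deployment, and removing the volume is destructive since it presumably discards the stored certificates. The script defaults to a dry run that only prints the commands.

```python
import subprocess


def reset_cert_steps(volume="mynetmaker_shared_certs"):
    """Return the server-side commands from the steps above, in order.

    The default volume name is taken from the comment above; check
    `docker volume ls` for the name used by your own deployment.
    """
    return [
        ["docker-compose", "down"],
        ["docker", "volume", "rm", volume],
        ["docker-compose", "up", "-d"],
    ]


def run_steps(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv in steps:
        print("+", " ".join(argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_steps(reset_cert_steps(), dry_run=True)
```

Run it from the directory containing the Netmaker docker-compose file, and pass `dry_run=False` only once the printed commands look right for your setup.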

@afeiszli
Contributor

Based on the responses of @jr200 and @rowhit I am closing this issue

@c0da

c0da commented Aug 29, 2022

@Nexxus-LMT, would you mind sharing your nginx configuration? My netclient doesn't connect to "broker.netmaker.mydomain.com:443" (it shows the "unable to connect to broker, retrying ..." error), but what I find strange is that I don't even see the requests in the nginx-proxy log.

Thanks in advance!

@c0da

c0da commented Aug 29, 2022

I changed the MQ_PORT variable from "443" to "8883" and opened port 8883 to any and now it works. I thought this was changed so that mq can use 443 as well. 🤷‍♂️​

@mattkasun
Contributor

MQ always uses port 8883 (and internally 1883). The recommended setup with traefik proxies mqtts traffic from 443 to 8883. In this case clients connect to the broker.:443 but traefik proxies the connection to :8883.
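
To illustrate what proxying the connection from 443 to 8883 means at the TCP level, here is a minimal Python sketch of a byte-for-byte relay. It is not Traefik's implementation and not a replacement for it; `start_relay` is a hypothetical helper, and the hosts and ports are whatever you pass in. The relay never parses the traffic, so TLS stays end to end between netclient and the broker. A proxy that only speaks HTTP would generally not be able to forward MQTT over TLS this way.

```python
import asyncio


async def _pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def start_relay(listen_host, listen_port, target_host, target_port):
    """Start a TCP relay and return the asyncio server object.

    Hypothetical helper: every accepted connection is forwarded
    unmodified to target_host:target_port (for example the broker on 8883).
    """

    async def handle(client_reader, client_writer):
        try:
            upstream_reader, upstream_writer = await asyncio.open_connection(
                target_host, target_port
            )
        except OSError:
            # Upstream unreachable: drop the client connection.
            client_writer.close()
            return
        # Shuttle bytes in both directions until either side closes.
        await asyncio.gather(
            _pipe(client_reader, upstream_writer),
            _pipe(upstream_reader, client_writer),
            return_exceptions=True,
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

In a real deployment the proxy also has to decide which backend a connection belongs to when several hostnames share port 443; this sketch skips that and forwards everything to a single target.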

@c0da

c0da commented Aug 29, 2022

Thanks @mattkasun, that was what I was trying to achieve, but for some reason it seems netclient doesn't even try to connect to port 443 (I don't see anything going to the broker in the nginx-proxy logs, although I see traffic to "api...").

If I change my MQ_PORT environment variable back to 443, and proxy broker:443 to :8883, I see this on the netclient side:

[netclient] 2022-08-29 21:56:27 joining net01 at api.netmaker.mydomain.com:443 
[netclient] 2022-08-29 21:56:27 network: net01 node tomato is using port 51821 
[netclient] 2022-08-29 21:56:27 starting wireguard 
[netclient] 2022-08-29 21:56:59 unable to connect to broker, retrying ... 
Ping tcp://broker.netmaker.mydomain.com:443(123.123.123.123:443) - Connected - time=7.755796ms
Ping tcp://broker.netmaker.mydomain.com:443(123.123.123.123:443) - Connected - time=7.060485ms
Ping tcp://broker.netmaker.mydomain.com:443(123.123.123.123:443) - Connected - time=7.05654ms
[netclient] 2022-08-29 21:57:33 could not connect to broker at broker.netmaker.mydomain.com:443 
[netclient] 2022-08-29 21:57:33 network: net01 failed to publish update for join connection timeout 

Would you know what might be happening?

@mattkasun
Contributor

Netclient is trying to reach the broker at port 443 but is unable to connect. Check the nginx and MQ logs on the server.

@c0da

c0da commented Aug 30, 2022

I haven't seen any entries in the nginx logs (neither 200, nor 404, nor 503), which is very strange. After spending a lot of time trying to make it work, I ended up using a new IP dedicated to Netmaker so that Traefik could listen on port 443 without going through nginx-proxy. Now it works without problems. Thank you very much for your help, @mattkasun!
