Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broker ssl/disable_ssl mismatch doesn't complain #837

Closed
sethhall opened this issue Mar 2, 2020 · 10 comments · Fixed by #1070
Closed

Broker ssl/disable_ssl mismatch doesn't complain #837

sethhall opened this issue Mar 2, 2020 · 10 comments · Fixed by #1070
Assignees
Labels
Area: Broker Type: Bug 🐛 Unexpected behavior or output.
Milestone

Comments

@sethhall
Copy link
Member

sethhall commented Mar 2, 2020

If Zeek has defined to not used ssl with broker (Broker::disable_ssl) and it connects to a remote broker instance expecting SSL, there doesn't seem to be any error message output. I would expect some Broker error to be output but none of the Broker events were generated or anything.

@jsiwek jsiwek added Area: Broker Type: Bug 🐛 Unexpected behavior or output. labels Mar 3, 2020
@jsiwek jsiwek added this to the 3.2.0 milestone Mar 3, 2020
@jsiwek jsiwek added this to Unassigned / Todo in Release 3.2.0 via automation Mar 3, 2020
@rsmmr
Copy link
Member

rsmmr commented Mar 3, 2020

@Neverlord isn't this better already with current Broker?

@Neverlord
Copy link
Member

isn't this better already with current Broker?

Apparently not. If I enable INFO logs, I see " initiating connection to ..." and nothing else. I'll look into it.

@Neverlord
Copy link
Member

Ok, turns out the way TLS does its handshake is the exact opposite of CAF's way. In TLS, the client sends its ClientHello before the server sends anything, while the server sends first in the CAF handshake. This is why nothing happens: the server sends nothing, because the TLS handshake is waiting for data from the client and the non-SSL client sends nothing because it waits for the CAF handshake.

If I reverse the roles (connecting from SSL-enabled Broker to non-SSL Broker), then CAF complains about an invalid handshake (as it should) and the connecting Broker emits the error remote endpoint unavailable: system_error(cannot_connect_to_node).

It's a CAF issue, so I've opened an issue in CAF's tracker.

@timwoj
Copy link
Contributor

timwoj commented Jun 30, 2020

The CAF side of this was merged. We're just waiting on a CAF update to close this.

@Neverlord
Copy link
Member

As far as I can tell, Zeek master bundles a Broker version that already includes CAF 0.17.5 (i.e., a fixed version).

jsiwek added a commit that referenced this issue Jul 3, 2020
Instead of only writing them in broker.log, which may be easy to
overlook.
@jsiwek
Copy link
Contributor

jsiwek commented Jul 3, 2020

In Zeek master, still seeing that sometimes a non-SSL client trying to connect to SSL server gives back no error/status indication of a failed attempt.

test.zeek:

option connect = T;

event zeek_init()
	{
	if ( connect )
		Broker::peer("127.0.0.1", 9999/tcp);
	else
		Broker::listen("127.0.0.1", 9999/tcp);
	}

event Broker::peer_added(endpoint: Broker::EndpointInfo, msg: string)
	{
	print "peer added";

	if ( connect )
		terminate();
	}

event Broker::peer_lost(endpoint: Broker::EndpointInfo, msg: string)
	{
	print "peer lost";
	}

event Broker::error(code: Broker::ErrorCode, msg: string)
	{
	print "error", code, msg;
	terminate();
	}

Start server:

$ zeek ./test.zeek connect=F Broker::disable_ssl=F

Then expected client behavior happens sometimes:

$ BROKER_CONSOLE_VERBOSITY=info zeek ./test.zeek connect=T Broker::disable_ssl=T
[broker/INFO] 2020-07-03T11:38:10.312 creating endpoint
[broker/INFO] 2020-07-03T11:38:10.312 creating subscriber for topic(s) []
[broker/INFO] 2020-07-03T11:38:10.312 creating subscriber for topic(s) [<$>/local/data/errors, <$>/local/data/statuses]
[broker/INFO] 2020-07-03T11:38:10.318 adding topic zeek/supervisor to subscriber
[broker/INFO] 2020-07-03T11:38:10.318 adding topic bro/supervisor to subscriber
[broker/INFO] 2020-07-03T11:38:10.325 starting to peer with 127.0.0.1:9999 retry: 30s [asynchronous]
[broker/INFO] 2020-07-03T11:38:10.326 initiating connection to 127.0.0.1:9999 (no SSL)
[broker/INFO] 2020-07-03T11:38:10.326 emit: peer_unavailable 127.0.0.1:9999
error, Broker::PEER_UNAVAILABLE, (invalid-node, *127.0.0.1:9999, "unable to connect to remote peer")
<params>, line 1: received termination signal

But other times, goes until ctrl-c:

$ BROKER_CONSOLE_VERBOSITY=info zeek ./test.zeek connect=T Broker::disable_ssl=T
[broker/INFO] 2020-07-03T11:38:08.964 creating endpoint
[broker/INFO] 2020-07-03T11:38:08.964 creating subscriber for topic(s) []
[broker/INFO] 2020-07-03T11:38:08.964 creating subscriber for topic(s) [<$>/local/data/errors, <$>/local/data/statuses]
[broker/INFO] 2020-07-03T11:38:08.969 adding topic zeek/supervisor to subscriber
[broker/INFO] 2020-07-03T11:38:08.969 adding topic bro/supervisor to subscriber
[broker/INFO] 2020-07-03T11:38:08.976 starting to peer with 127.0.0.1:9999 retry: 30s [asynchronous]
[broker/INFO] 2020-07-03T11:38:08.976 initiating connection to 127.0.0.1:9999 (no SSL)
^C<params>, line 1: received termination signal

Flipping the disable_ssl option (SSL-client / non-SSL-server) seems to work as expected all the time (or at least haven't seen a case where it hangs).

Also, the above examples are being done in a release build: ./configure --build-type=release

In a debug build, --build-type=debug, I wasn't hitting the hanging-client behavior at all.

@rsmmr
Copy link
Member

rsmmr commented Jul 6, 2020

@Neverlord can you take a look? I would think this is probably Broker-side, not Zeek.

@Neverlord
Copy link
Member

@jsiwek thanks for double-checking and reproducing. :)
@rsmmr it's definitely on the Broker/CAF side. I'll look into it.

@Neverlord
Copy link
Member

@jsiwek thanks for providing the script! Saved me a lot of time. This boils down to CAF immediately dropping a socket if initializing the SSL session fails. In release mode, it seems like the client sends data fast enough so that there's enough data available to determine whether or not the SSL handshake succeeds right away. However, that path did not actually close the socket. Hence the hanging. I have a patch ready: actor-framework/actor-framework#1120. With this change, I can no longer reproduce any hanging in debug or release mode and always get:

error, Broker::PEER_UNAVAILABLE, (invalid-node, *127.0.0.1:9999, "unable to connect to remote peer")

@jsiwek jsiwek assigned jsiwek and unassigned Neverlord Jul 15, 2020
@jsiwek
Copy link
Contributor

jsiwek commented Jul 15, 2020

@Neverlord thanks, I'll take care of adding the Zeek-side test case to wrap this issue up once that CAF-side patch lands.

jsiwek added a commit that referenced this issue Jul 17, 2020
Instead of only writing them in broker.log, which may be easy to
overlook.
rsmmr added a commit that referenced this issue Jul 17, 2020
…er-mismatch-errors'

* origin/topic/jsiwek/gh-837-improve-broker-mismatch-errors:
  GH-837: Add test cases for mismatched Broker SSL configs
  GH-837: emit Reporter errors for Broker errors
  Fix incorrect/missing Broker error status code numbers
Release 3.2.0 automation moved this from Unassigned / Todo to Done Jul 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Broker Type: Bug 🐛 Unexpected behavior or output.
Projects
No open projects
Release 3.2.0
  
Done
Development

Successfully merging a pull request may close this issue.

5 participants