Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There is no verification on packets in reliable transfer mode #39176

Closed
omegachysis opened this issue May 30, 2020 · 23 comments
Closed

There is no verification on packets in reliable transfer mode #39176

omegachysis opened this issue May 30, 2020 · 23 comments

Comments

@omegachysis
Copy link
Contributor

omegachysis commented May 30, 2020

Godot version:
v3.2.1.stable

OS/device including version:
Windows 10

Issue description:
Using reliable transfer mode with the high-level networking API there is no checksum capability exposed to the engine. Corrupted packets are always accepted happily on reliable mode either resulting in crashes due to marshalling errors:

or corrupted data that silently fails (without compalint), for example this world data:

Steps to reproduce:

  • Create a simple app with a server and a client using the high-level API:
func _on_Power_toggled(button_pressed: bool) -> void:
	if started: return
	if $Server.pressed:
		var peer = NetworkedMultiplayerENet.new()
		peer.create_server(8080, 100)
		get_tree().network_peer = peer
		started = true
		print("Created server")
	else:
		var peer = NetworkedMultiplayerENet.new()
		peer.create_client("localhost", 8080)
		get_tree().network_peer = peer
		started = true
		print("Created client")
  • Send some data using RPCs in reliable mode by calling rpc(...)
  • Use a network scrambler like clumsy (https://jagt.github.io/clumsy/)
  • Enable packet tampering at any probability and with checksum recomputation on or off in the scrambling program
  • Watch corrupted packets get accepted

Minimal reproduction project:
NetworkTest.zip

@Calinou
Copy link
Member

Calinou commented May 30, 2020

cc @Faless

@Calinou Calinou added this to the 4.0 milestone May 30, 2020
@omegachysis
Copy link
Contributor Author

I'm currently working on implementing a checksum into ENet in 3.2 stable for my own project using the already included XXHash algorithm. This can be easily injected into networked_multiplayer_enet.cpp just by setting the host->checksum callback after the host is initialized with enet_host_create(...).

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

If it crash the engine, that's a bug. We solved few decode crash in the past while auditing the marshalls code, but you find new ones please post the stack trace.

Regarding packet tampering. If the packet is received and dropped with an error in console, that is expected. UDP has checksum (for malformed packets), on top of the data link one. If any of those fails, the OS drop the packet.

If a packet get tampered by e.g. a succesful MITM attack (similar to the case of you packet scrambler), there is nothing we can really do.
If you add an extra checksum, the attacker can fake that too. What you could add is some form of MAC/HMAC, but if you are worried at that level, since you need key exchange etc, you can just use DTLS.

@Meriipu
Copy link
Contributor

Meriipu commented May 31, 2020

Regarding packet tampering. If the packet is received and dropped with an error in console, that is expected. UDP has checksum (for malformed packets), on top of the data link one. If any of those fails, the OS drop the packet.

Is not the checksum optional for UDP?

@omegachysis
Copy link
Contributor Author

omegachysis commented May 31, 2020

Okay so you are saying the theoretically if a UDP packet is malformed due to not matching a checksum, the packet should be dropped? Because I'm reporting that what I am seeing is that none of the packets are dropped at all, even though they are definitely getting malformed (and the checksum is not getting recomputed, meaning it should fail). I think what is happening is that there is actually no checksum on the UDP layer Godot is using because it is disabled at the ENet level since a null ptr is passed in for the checksum function.

I should note that what started this investigation is that I was seeing actual malformed packets in real multiplayer testing getting through, and since it is really improbable those real world packet errors were somehow also messing with the checksum in just the right way, I think the checksum feature is disabled for our UDP connection from Godot.

@omegachysis
Copy link
Contributor Author

I found this in the ENet changelog.. could be relevant:
https://github.com/lsalzman/enet/blob/master/ChangeLog#L124

@omegachysis
Copy link
Contributor Author

omegachysis commented May 31, 2020

@Faless You gave me an idea to check the UDP traffic of Godot with Wireshark and it looks like you are correct, there is definitely a CRC checksum being sent and it looks like the OS is responsible for filtering it. Unfortunately it looks like the 16-bit checksum that UDP uses is not good enough for my needs since I had real issues in actual multiplayer. I might see if I can implement a 64-bit checksum using xxHash on top of the networking system or maybe DTLS will be enough.
image

Maybe it would still be worth explosing the checksum function pointer of ENet to users of the engine so they could enable better validation? I'll leave that up to you. Thanks for your input and help!

@Meriipu
Copy link
Contributor

Meriipu commented May 31, 2020

I am getting that [UDK CHECKSUM INCORRECT]-status on absolutely all of the packets, not just some. (I have not been able to test for the case where the packet leaves and reenters my network card though).

@omegachysis
Copy link
Contributor Author

I am getting that [UDK CHECKSUM INCORRECT]-status on absolutely all of the packets, not just some. (I have not been able to test for the case where the packet leaves and reenters my network card though).

Are you getting that with a packet scrambler on or just with normal networking? Also are you using the loopback sniffer of Wireshark or are you inspecting something else?

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

Is not the checksum optional for UDP?

Yeah, UDP checksum in IPv4 is indeed optional, so if the sent packet has a 0-valued checksum field, no check will happen on the receiving machine.

On the other end, if the checksum value is non-0 the receiver will enforce the checksum and drop the packet if invalid (checksum can be disabled on most OS even on a per-socket basis, but AFAIK all OS have that enabled by default, which is also the sensible choice, see for example RFC1122 section 4.1.3.4).

I guess clumsy sets it to 0 when not checking redo checksum this causing the receiving end to not do the check.

@Meriipu
Copy link
Contributor

Meriipu commented May 31, 2020

I am getting that [UDK CHECKSUM INCORRECT]-status on absolutely all of the packets, not just some. (I have not been able to test for the case where the packet leaves and reenters my network card though).

Are you getting that with a packet scrambler on or just with normal networking? Also are you using the loopback sniffer of Wireshark or are you inspecting something else?

No scrambler. Loopback, yes -- so it may be that there was never any hardware to compute the checksum.

@omegachysis
Copy link
Contributor Author

I guess clumsy sets it to 0 when not checking redo checksum this causing the receiving end to not do the check.

I just tested this with Wireshark and clumsy with redo checksum unchecked with tampering on and it looks like it's working as expected on my system:
image

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

Unfortunately it looks like the 16-bit checksum that UDP uses is not good enough for my needs

Of course all checksums can fail, although it should be quite rare (again, data link has error correction too, so it must be an undetectable error in the intersection of data link/UDP error correction failure).

@omegachysis
Copy link
Contributor Author

omegachysis commented May 31, 2020

I never made the connection between Pigeonhole and that, pretty neat. It's possible that the issues I was seeing was due to something else then, maybe multithreading issues. If not though, even if it should be rare to have corrupted data pass through, apparently it's not rare enough. If I can't continue causing any of these issues with normal play I'll probably just write my own small wrapper around ENet and pass in a bigger hash function for the ENet checksum callback.

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

I just tested this with Wireshark and clumsy with redo checksum unchecked with tampering on and it looks like it's working as expected on my system:

I'm not sure I understand, you are tampering the packet without changing the checksum and wireshark still says checksum is correct? Maybe without the redo checksum option clumsy does, on purpose, generate data that has the same checksum then? I couldn't find any proper description of how that function works in their doc/manual :(.

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

I'll probably just write my own small wrapper around ENet and pass in a bigger hash function for the ENet checksum callback.

Given ENet has that option for extra checksum (probably to apply it on fragmented packets as a whole) maybe we can expose that, without touching the higher level multiplayer API. Will need some investigation on how it works.

@omegachysis
Copy link
Contributor Author

I looked at the source code for clumsy a bit and it appears that without redo checksum checked, all it does it randomly mess wtih the packet's data on the bit level and doesn't try to recompute the checksum (like a MITM attack would).

I was saying that I can see from Wireshark that the UDP protocal we are using with Godot is sending checksums just like you said it should, but that clumsy isn't zero-ing it out like you guessed. From the results in Wireshark it looks like a vast majority of the time, with packet tampering on but with redo checksum off, the checksum is correctly being dropped by UDP (I was wrong then about Godot accepting all the packets, since I didn't realize this happens at the OS level), and what I see in Wireshark backs that up. Also what I notice though is apparently even with redo checksum off, I still occasionally see packets make it through because the 16-bit checksum UDP does is apparently not quite enough.

@omegachysis
Copy link
Contributor Author

omegachysis commented May 31, 2020

Given ENet has that option for extra checksum (probably to apply it on fragmented packets as a whole) maybe we can expose that, without touching the higher level multiplayer API. Will need some investigation on how it works.

I'm pretty new to the actual engine, but unfortunately when I tried to pass a value in for the checksum callback on ENet it broke the networking so I think you are right that it applies to entire packets and that breaks some of the other logic in the high-level networking.

I think also some of my early results I just got confused by redo checksum. If that's enabled it breaks everything but like you said there's nothing we could reasonably do about that. I noticed that if you don't check redo checksum, it appears like everything is okay, but I think the reason it still breaks in my actual game is because I am sending such large objects for my terrain data and I need a stronger hash function to guard against problems.

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

By a quick look at the enet source code, it seems that length computation is done automatically for us, so this simple patch seems to work (and should add the extra ENet checksum = 42 to the packet).

I am still baffled by the fact that you are actually hitting such a problem, which I've never experienced myself.
I'm wondering if the problem you are hitting is in fact UDP/network related or not.

But maybe you can try implementing this extra hash function and see if that solves it completely.

diff --git a/modules/enet/networked_multiplayer_enet.cpp b/modules/enet/networked_multiplayer_enet.cpp
index 406eb467f0..6e8589862b 100644
--- a/modules/enet/networked_multiplayer_enet.cpp
+++ b/modules/enet/networked_multiplayer_enet.cpp
@@ -33,6 +33,10 @@
 #include "core/io/marshalls.h"
 #include "core/os/os.h"
 
+enet_uint32 enet_checksum_compute(const ENetBuffer *buffers, size_t bufferCount) {
+       return 42;
+}
+
 void NetworkedMultiplayerENet::set_transfer_mode(TransferMode p_mode) {
 
        transfer_mode = p_mode;
@@ -106,6 +110,7 @@ Error NetworkedMultiplayerENet::create_server(int p_port, int p_max_clients, int
                        p_out_bandwidth /* limit outgoing bandwidth if > 0 */);
 
        ERR_FAIL_COND_V_MSG(!host, ERR_CANT_CREATE, "Couldn't create an ENet multiplayer server.");
+       host->checksum = enet_checksum_compute;
 #ifdef GODOT_ENET
        if (dtls_enabled) {
                enet_host_dtls_server_setup(host, dtls_key.ptr(), dtls_cert.ptr());
@@ -162,6 +167,7 @@ Error NetworkedMultiplayerENet::create_client(const String &p_address, int p_por
        }
 
        ERR_FAIL_COND_V_MSG(!host, ERR_CANT_CREATE, "Couldn't create the ENet client host.");
+       host->checksum = enet_checksum_compute;
 #ifdef GODOT_ENET
        if (dtls_enabled) {
                enet_host_dtls_client_setup(host, dtls_cert.ptr(), dtls_verify, p_address.utf8().get_data());

@omegachysis
Copy link
Contributor Author

omegachysis commented May 31, 2020

@Faless Wow you are amazing. when I get a chance I will try that. Honestly I'm starting to think the whole issue I experienced was just something else the whole time. I'm basically unable to see the original issue anymore (unless redo checksum is on, but like we both agree, that should break everything). I think I probably just saw an extremely unlikely fluke or two and misunderstood the core issue (which honestly is probably just bad synchronization on my part or some other stupid thing). In the end I guess this isn't a bug at all but a feature request to expose a custom checksum function?

By returnin 42 from the function you passed into the checksum function pointer, wouldn't you just always be computing the checksum as 42? I tried using xxHash inside a function like enet_checksum_compute and returned a 32-bit hash from that.

Also, for my own sanity, the high-level multiplayer features are not thread-safe right?

Thanks again for all your help. If I find myself with the money sometime, I'd like to support Godot's development. I'm really impressed both with the engine and with the community.

@Faless
Copy link
Collaborator

Faless commented May 31, 2020

By returnin 42 from the function you passed into the checksum function pointer, wouldn't you just always be computing the checksum as 42? I tried using xxHash inside a function like enet_checksum_compute and returned a 32-bit hash from that.

Yeah, I didn't implement a real hashing function.
If you want to try it, you need to implement that (with whatever hash algorithm you choose) and the result must be a 32 bit integer.

Also, for my own sanity, the high-level multiplayer features are not thread-safe right?

No, it is not thread safe, you should avoid calling RPCs in threads unless you know what you are doing.
You could potentially disable automatic polling, but you will need to be very careful about protecting code with mutexes, both when calling RPCs/RSETs, and when calling poll itself, and being aware that signals will be emitted during the poll call, inside whatever thread is actually making the call.

@omegachysis
Copy link
Contributor Author

omegachysis commented May 31, 2020

Okay, awesome. I'll keep looking and open a conversation up if I notice anything off, but I'd say with all the investigation here I think I can comfortably say that the bug I thought I was reporting is just something else (probably an issue with bad synchronization). I'll probably implement a concurrent queue messaging service and pass all of my RPCs into that instead. You can close this if you like, or if you want to expose the callback in the future, change it to that. Either way I think it's safe to say this is resolved though.

@Calinou
Copy link
Member

Calinou commented May 2, 2021

You can close this if you like, or if you want to expose the callback in the future, change it to that. Either way I think it's safe to say this is resolved though.

Now that DTLS support is available for multiplayer encryption, this can be closed.

@Calinou Calinou closed this as completed May 2, 2021
@Calinou Calinou modified the milestones: 4.0, 3.2 May 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants