Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid UDP cipher search if a NAT mapping exists #83

Merged
merged 2 commits into from Sep 15, 2020
Merged

Conversation

bemasc
Copy link

@bemasc bemasc commented Sep 12, 2020

This change limits the UDP cipher search to the first packet on a NAT
mapping. Subsequent packets use the cipher associated with the NAT
mapping. This assumes that multiple keys do not share a client port,
but that assumption was already in place for the downstream connection,
which uses a fixed cipher for the duration of a NAT mapping.

This makes UDP streams perform similarly to TCP.

The BenchmarkUDPManyKeys test was reusing the same client port with
multiple ciphers (and not attempting to decrypt the returned packets),
so it had to be modified to avoid client port reuse. The new version of
that test shows the effect of this optimization:

Before:
BenchmarkUDPManyKeys-4 1660 752446 ns/op 130564 B/op 1740 allocs/op

After:
BenchmarkUDPManyKeys-4 12822 90546 ns/op 6845 B/op 112 allocs/op

This test is something of a worst-case scenario (100 keys on the same
IP, interleaved), so real usage won't show such a large improvement.

In real use, the main benefit is in avoiding copying of the cipher
list on every packet. The new BenchmarkSnapshot test shows that this
takes ~5 ns per key on my laptop. Decryption takes 7000 ns, so copying
the list dominates the upstream CPU usage of UDP streams on servers with
1500+ keys.

This change limits the UDP cipher search to the first packet on a NAT
mapping.  Subsequent packets use the cipher associated with the NAT
mapping.  This assumes that multiple keys do not share a client port,
but that assumption was already in place for the downstream connection,
which uses a fixed cipher for the duration of a NAT mapping.

This makes UDP streams perform similarly to TCP.

The BenchmarkUDPManyKeys test was reusing the same client port with
multiple ciphers (and not attempting to decrypt the returned packets),
so it had to be modified to avoid client port reuse.  The new version of
that test shows the effect of this optimization:

Before:
BenchmarkUDPManyKeys-4       1660    752446 ns/op  130564 B/op    1740 allocs/op

After:
BenchmarkUDPManyKeys-4      12822     90546 ns/op    6845 B/op     112 allocs/op

This test is something of a worst-case scenario (100 keys on the same
IP, interleaved), so real usage won't show such a large improvement.

In real use, the main benefit is in avoiding copying of the cipher
list on every packet.  The new BenchmarkSnapshot test shows that this
takes ~5 ns per key on my laptop.  Decryption takes 7000 ns, so copying
the list dominates the upstream CPU usage of UDP streams on servers with
1500+ keys.
@bemasc bemasc requested a review from fortuna September 12, 2020 01:24
@bemasc
Copy link
Author

bemasc commented Sep 12, 2020

@JonathanDCohen FYI, this corresponds to some discussions we've had.

Copy link

@fortuna fortuna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

shadowsocks/udp.go Show resolved Hide resolved
shadowsocks/udp.go Show resolved Hide resolved
ip := clientAddr.(*net.UDPAddr).IP
var cipher shadowaead.Cipher
unpackStart := time.Now()
textData, keyID, cipher, err = unpack(ip, textBuf, cipherData, s.ciphers)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename this unpack so it's clearer it's doing the trial decryption?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, renamed to findAccessKeyUDP to match findAccessKey in tcp.go.

@bemasc bemasc merged commit ac58257 into master Sep 15, 2020
bemasc pushed a commit that referenced this pull request Nov 2, 2020
Currently, each valid shadowsocks packet causes a NAT mapping to be
created (if there isn't a matching one already), even if that packet is
rejected due to a malformed or disallowed destination address. This is
a regression due to #83.  The regression results in a significant memory
leak, because unused mappings are never cleaned up.
@sbruens sbruens deleted the bemasc-udp-buffer branch March 5, 2024 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants