[SIP] Shadowsocks v2 #157

Open
riobard opened this issue Feb 20, 2020 · 51 comments

Comments

@riobard
Contributor

riobard commented Feb 20, 2020

This issue is to discuss the changes we want in the next major revision of Shadowsocks protocol. Right now I've done some preliminary research based on the SOCKS6 RFC draft and I have a prototype security layer that provides forward secrecy (except for early data) and 1-RTT latency (or 0-RTT if used with TCP Fast Open).

So here're the things I have in mind (no particular order of importance, and most are optional):

  1. v2 protocol roughly based on SOCKS6 (which is still a moving target).
  2. New security layer with PFS and 0/1-RTT (with or without TFO). (related issue Secure single-port multi-user authentication #54)
  3. Basic auth so we can officially support single-port multi-users without hacks.
  4. Native solution for DNS. (related issue [SIP009] Handling remote DNS in ss-server in a backwards compatible way #156)
  5. Better-defined semantics of proxy and VPN regarding errors and ICMP packets (related issue [SIP] Add SOCKS Reply Field #144)
  6. Multiplexing over single TCP connection (similar to HTTP/2) to reduce latency when TFO is not possible.

Please feel free to discuss the changes.

@ghost

ghost commented Feb 20, 2020

Can v2 server also provide v1 service?

Multiplexing over single TCP connection

HTTP/3 multiplexes over a single UDP "connection" so that TCP flow control cannot stall every substream sharing one TCP connection. SCTP is an option too; it's designed for multiplexing, but it's too rare...

And just to mention it here: the first proposal for shadowsocks (in clowwindy's original post) was public key encryption. ("If others are interested in joining, perhaps this could be taken further and turned into public-key encryption.")

@riobard
Contributor Author

riobard commented Feb 20, 2020

@studentmain For co-existence of v2 and v1: this is up to the implementation to decide, as long as a downgrade attack is not possible.

Multiplexing over TCP (and HTTP/2 in particular) is a compromise because of UDP throttling. If HTTP/3 becomes popular and works well enough in practice we could switch to UDP as well. Meanwhile I have to deal with TCP and broken middle boxes killing TFO packets.

I've almost finished the public key encryption part (without RTT penalty).

@ghost

ghost commented Feb 20, 2020

Then the problem becomes how the client chooses the correct protocol. A change in the URL format may be required.

HTTP/3 is a modified QUIC, so I think it's at least good enough inside Google's data centers. But I'm not sure it's also good enough under the Pacific Ocean. v2ray has QUIC support, so maybe we can take a look at it.
https://blog.apnic.net/2018/05/15/how-much-of-the-internet-is-using-quic/
https://w3techs.com/technologies/details/ce-quic

We should use a quantum-safe cipher for public key encryption. https://github.com/open-quantum-safe/liboqs/tree/master#supported-algorithms

@riobard
Contributor Author

riobard commented Feb 20, 2020

A ss2:// scheme should work fine.

I need to see hard evidence that UDP works without significant throttling before investing time on it.

Quantum security is beyond the scope. Public key encryption is mostly to support multi-users without too much security downsides.

@Mygod
Contributor

Mygod commented Feb 20, 2020

Regarding the proposal, this is definitely too much. I prefer a minimalistic approach and offload features to plugins whenever possible. In fact, we could even make the default AEAD encryption as a (default) plugin and always run in plain (I guess that reduces v2 to simply socks6 over *, but KISS).

Detailed comments:

  • Regarding 2: I still stand by the point that public-key crypto/key-exchange is expensive and unnecessary and I am strongly against it. Furthermore, you can offload it to plugins like TLS.
  • Regarding 3: We can simply employ basic auth from socks5/6 for this.
  • Regarding 6: I disdain the idea of multiplexing. I can only see it useful in case of server push in HTTP/2.
  • Regarding post-quantumness: Again, just let the plugins do their job.

@riobard
Contributor Author

riobard commented Feb 21, 2020

@Mygod At minimum I'd like ss2-server to work like a regular SOCKS6 server with some special behaviors regarding authentication so as not to leak its existence. But right now there are no other SOCKS6 clients to test with.

It's 2020 and public key crypto is easy and efficient with modern primitives. I've considered using just TLS but there are several major blocking issues: only TLS 1.3 technically supports 0/1-RTT mode, but many implementations (like the one in Go's stdlib) do not support 0-RTT at all, and there's no plan to add it any time soon. Additionally, TLS brings in a host of other issues regarding certificate management and domain verification that I don't want to force on people. And the complexity of TLS is… well, just read the RFC and judge for yourself. The new security layer aims to be very simple, efficient, and secure. If you do not care about any of those nice things, you can always run SOCKS-over-TLS (many existing clients support it) and there's no need to bother with Shadowsocks at all.

I'm still considering multiplexing. It has significant benefits in the Shadowsocks use case, namely 1) reliable 0-RTT connection establishment even when TFO does not work, 2) better utilization of network bandwidth due to TCP congestion control, and 3) it cuts the number of connections/open files in half on the server side. However, it does come with obvious drawbacks as well, like complexity and head-of-line blocking, both of which cannot be avoided. I'm experimenting with an HTTP/2 CONNECT proxy now, and it does work better than I expected. But it's difficult to integrate and to provide reasonable proxy semantics (mostly communicating errors between the local and remote sides).

@Mygod
Contributor

Mygod commented Feb 21, 2020

Authentication without leaking existence can be achieved simply via socks6 over blank. It seems like socks6 draft specifies that the client can send payload immediately after authentication header so I don't see why this is an issue.

0-RTT cannot work. TLS 1.3 does 0-RTT by using a session ID. You need a handshake to establish a key exchange/session ID. You are not going to want to encrypt each packet using public-key encryption.

TLS has the added advantages for traffic hiding that a new protocol cannot provide. The advantage of TLS is that it makes traffic indistinguishable from other TLS traffic, say HTTPS, except for inspecting packet length distribution (see sssniff, etc), which I believe is somewhat reliable at best.

Also, the reason I oppose you building protocols from public-key crypto directly is exactly the reason I opposed OTA. We are living in the sad world where security is not as composable as you would like it to be, and you are not going to get world's best security experts to audit your protocol (despite how complicated TLS is, it is of popular interest and gets audited by everyone).

Multiplexing is useless, except for connection reuse, which should be implemented by plugins. I agree connection reuse could be useful but again this should be implemented by a plugin where the mimicked traffic does use such feature, say HTTP/2. We should make hiding traffic our priority instead of performance (especially when it's as minor as number of RTTs), etc. The plain protocol should not have long idle connections.

@Mygod
Contributor

Mygod commented Feb 21, 2020

In conclusion, we should just do socks6 over [blank]. @madeye What do you think?

@riobard
Contributor Author

riobard commented Feb 21, 2020

@Mygod Unfortunately you are wrong on many levels…

  1. Multi-user authentication won't work securely without public key crypto, see Secure single-port multi-user authentication #54
  2. 0-RTT works fine, except for early data (TLS 1.3 also shares this caveat), and client can choose how much early data to send. The trick is to pre-share server public key. We need this for server authentication anyway so no extra problem either. Also see issue Secure single-port multi-user authentication #54 (I need to update it with forward secrecy tho).
  3. Sure, I completely understand the benefit of TLS. But for obfuscation we all agree it should be done by plugin so there's no disagreement. We just need to provide a default when people don't want to bother with TLS. Current default is insufficient.
  4. The new security layer is basically vastly simplified TLS 1.3 so I'm confident. You might not agree and it's perfectly fine to use TLS instead (and accept your chosen TLS lib's limitations).
  5. Fewer RTTs are important for user experience. Long idle connections are the norm. Just check how many connections your phone keeps to various clouds. And it's strange for a client to keep dozens of TCP connections to a single server in an increasingly HTTP/2 world.

@Mygod
Contributor

Mygod commented Feb 21, 2020

I am not going to argue with your opinions so just some technical comments.

  1. "securely" depends on your security model. I mentioned in Secure single-port multi-user authentication #54 (comment) that your proposal actually does not achieve what you want (in particular I constructed an attacker in your model). However, I would argue that TLS does it pretty decently.
  2. You can make a plugin to do what you describe but I do not feel comfortable making an ad-hoc protocol a default choice.

My opinion: TLS isn't too hard to set up actually.

@riobard
Contributor Author

riobard commented Feb 21, 2020

We can discuss the technical issues separately in #54.

I'm not against TLS. It's just that the TLS in Go stdlib does not provide what I want (0-RTT) and I don't want to bother with certificates and domain verification. Like I said before, you can always run SOCKS-over-TLS so there's no disagreement here.

@madeye
Contributor

madeye commented Feb 21, 2020

Recently I've been thinking about a side channel key exchange approach.

For example, do a WireGuard-like key exchange (https://www.wireguard.com/protocol/) in a side channel (a standard 443 port, a random port, or even a different host server), then communicate using the current shadowsocks protocol.

@riobard
Contributor Author

riobard commented Feb 21, 2020

@madeye The benefit is?

I think some of the commercial operators offer HTTPS-based subscription to do similar things. But I don't fully understand the reasoning behind that.

@ghost

ghost commented Feb 21, 2020

@riobard So the ss server itself can know which user will connect before any packet is received. That will make active probing useless.

@riobard
Contributor Author

riobard commented Feb 21, 2020

@studentmain Could you please explain in a bit more detail under what scenario this would make it immune to probing?

@ghost

ghost commented Feb 21, 2020

The client connects to the side channel and finishes the handshake there. The server then learns the client's IP address (maybe along with the IV the user will use) before the client connects to it. An attacker can't pass the side channel handshake, so when a packet from the attacker comes in, the server has no information about it and can reset the connection or do whatever it likes.

@riobard
Contributor Author

riobard commented Feb 21, 2020

If it's IP-based firewalling, it seems very fragile given the mass deployment of Carrier-Grade NAT (CGNAT), under which you cannot guarantee that the client's public IP when connecting to the authorization server is the same as the one it uses when connecting to the relay server.

So the safest bet is for the client to get some kind of auth token from the authorization server and use that token to connect to the relay server. At this point I'm confused as to how it will be different than sending just a PSK?

@ghost

ghost commented Feb 21, 2020

So the safest bet is for the client to get some kind of auth token from the authorization server and use that token to connect to the relay server.

Yes, a token obtained over a safe side channel is OK.

@riobard
Contributor Author

riobard commented Feb 21, 2020

How is it different from the current approach with respect to active probing? I still don't understand the advantage of the split approach.

@ghost

ghost commented Feb 21, 2020

In the split approach, the server can detect probes more easily and more accurately. It works similarly to port knocking for SSH.

@riobard
Contributor Author

riobard commented Feb 21, 2020

So in the normal setup, we have to detect replay attacks on the server in a blacklist style (previously used nonces will be rejected). But in the split approach, at least on the relay server, I assume it's more like a whitelist style (only authorized tokens will be accepted).

Then we're moving the attack surface from the relay server to the auth server. Also, because the auth server and the relay server are now separate, we have to consider the additional synchronization issue (the client gets an auth token from the auth server, but the auth server has not yet delivered that token to the relay server when the client connects to the relay server).

It does not look too promising either. Or am I missing something here?
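
The blacklist versus whitelist distinction described above can be sketched roughly as follows. This is only an illustration: the class names, the in-memory sets, and the single-use token policy are assumptions made for the example and are not part of any Shadowsocks implementation.

```python
class NonceDenyList:
    """Normal setup: accept anything not seen before, reject replays."""

    def __init__(self):
        self._seen = set()

    def accept(self, nonce):
        if nonce in self._seen:
            return False  # previously used nonce: treat as a replay
        self._seen.add(nonce)
        return True


class TokenAllowList:
    """Split setup: accept only tokens announced by the auth server."""

    def __init__(self):
        self._valid = set()

    def authorize(self, token):
        # Called when the auth server delivers a token to the relay.
        self._valid.add(token)

    def accept(self, token):
        if token not in self._valid:
            return False  # unknown token, for example an active probe
        self._valid.remove(token)  # assumption: tokens are single-use
        return True
```

In the first class the server has to remember everything it has already seen; in the second it only has to remember what it was told to expect, which is where the synchronization concern comes from.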

@ghost

ghost commented Feb 21, 2020

The auth server can hide behind a normal website (that's why it uses 443) and is used less frequently than the relay server. So I think its attack surface is much smaller; you need to find it among many TLS websites first.

additional synchronization issue

That's the problem that needs to be resolved. My solution is to send keys to the client only after receiving the relay server's confirmation. That will introduce more RTTs for the first packet. The auth server can send a few dozen keys to the client for use in other connections, so it only affects the first connection.
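
A minimal sketch of that ordering, assuming the auth server can call the relay directly: the batch is registered with the relay first, and only handed to the client once the relay has confirmed it. All names here (`RelayServer`, `AuthServer`, `issue_tokens`, the 16-byte token size, the default batch size) are hypothetical, and the side channel handshake that authenticates the client is left out.

```python
import secrets


class RelayServer:
    def __init__(self):
        self._tokens = set()

    def register(self, tokens):
        """Store tokens pushed by the auth server and confirm receipt."""
        self._tokens.update(tokens)
        return True

    def accept(self, token):
        """Accept a connection only if it presents a registered token."""
        if token in self._tokens:
            self._tokens.remove(token)  # assumption: one token per connection
            return True
        return False


class AuthServer:
    def __init__(self, relay, batch_size=24):
        self._relay = relay
        self._batch_size = batch_size

    def issue_tokens(self):
        """Create a batch, wait for the relay's confirmation, then return it."""
        tokens = [secrets.token_bytes(16) for _ in range(self._batch_size)]
        if not self._relay.register(tokens):
            raise RuntimeError("relay did not confirm the tokens")
        return tokens  # only now are the tokens sent to the client
```

Because the whole batch is confirmed up front, only the first connection pays the extra round trip; later connections reuse tokens from the same batch.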

@riobard
Contributor Author

riobard commented Feb 21, 2020

I see. My concern is that the split approach introduces many moving parts (it's officially a distributed system now) and the benefits are not very clear cut.

@Mygod
Contributor

Mygod commented Feb 21, 2020

This is too complicated a solution for a problem that TLS can solve.

EDIT: Also you are making it easier to fingerprint the server.

@ghost

ghost commented Feb 22, 2020

So there are three solutions, right?

  • Modified SOCKSv6 + new security layer
  • Extended SOCKSv6 + TLS
  • Side channel handshake

@ghost

ghost commented Feb 22, 2020

Modified SOCKSv5 + TLS (via simple-obfs and v2ray-plugin) has already been tested for a long time.

As for the side channel handshake, since it doesn't require redesigning the packet format, we can test it on the current code base.

@riobard
Contributor Author

riobard commented Feb 22, 2020

More likely modified/simplified SOCKS6 + interchangeable security layer (custom/tls/plugin)?

Only issue is that SOCKS6 is still a draft and it's not clear if it will be widely adopted.

@ghost

ghost commented Feb 22, 2020

The only problem with TLS is that it needs a domain name.

@riobard
Contributor Author

riobard commented Feb 22, 2020

@studentmain And certificates and renewal handling (acme most likely), which is a hassle for many.

@ghost

ghost commented Feb 22, 2020

Let's Encrypt ACME renewal can be automated, so the domain name is the only problem. Otherwise they can only use a self-signed cert, and that's another fingerprint...

@riobard
Contributor Author

riobard commented Feb 22, 2020

Yeah, that’s what ACME is for. Still something to set up. Also, TLS doesn’t work well in some corporate networks with MitM decryption and a company-issued CA. But it does come with the benefit that it will pass most firewalls and looks pretty innocent. There’s no one-size-fits-all solution here.

@ghost

ghost commented Feb 22, 2020

So we may have multiple security layers and let the user choose one. Is a choice of multiple ciphers still necessary here? How do external plugins operate in this model (I don't want to see SOCKSv6 over TLS over WebSocket over TLS)?

@riobard
Contributor Author

riobard commented Feb 22, 2020

I’m afraid SOCKS-over-WebSocket-over-TLS will still be necessary to work with CDN. Anyway the choice is relatively simple:

  • If you are not in a MitM corp network with a company CA and want to look as innocent as possible, and don’t mind paying for a domain name every year and configuring ACME, run SOCKS-over-TLS with legitimate CA-issued certificates.
  • If you want the easiest to use and your network is relatively friendly, run SOCKS-over-the-new-secure-layer-which-does-not-have-a-name-yet.
  • If you have any special plugin, run naked SOCKS-over-whatever-plugin-you-have.

We should definitely make a flowchart to pick the right combo. 😂

@ghost

ghost commented Feb 22, 2020

My suggestion:

Make the new security layer as tiny as possible while providing forward secrecy. An optional built-in TLS layer (maybe with WebSocket) can be enabled by the user when there's no plugin.

Why I think FS is necessary: shadowsocks/shadowsocks-windows#2162 (comment)

@riobard
Contributor Author

riobard commented Feb 22, 2020

Check #54 for the proposal. It’s as minimal as possible now.

@Mygod
Contributor

Mygod commented Feb 23, 2020

@studentmain Not sure if you have heard of Tor.

@Dreamacro

Why not use https://noiseprotocol.org/noise.html ?

@riobard
Contributor Author

riobard commented Feb 25, 2020

@Dreamacro A few reasons:

  1. Complexity is through the roof (unless we use third-party libs).
  2. Handshakes mean either additional RTTs or keeping state on both ends (we want neither).
  3. Might as well use TLS (more common and innocent-looking at least).

@Dreamacro

@riobard The Noise protocol is simpler and more lightweight than TLS, and provides a lot of flexible handshake patterns. In the real world, WireGuard and WhatsApp use it.

As for RTTs, the Noise protocol will need fewer than TLS.

@riobard
Contributor Author

riobard commented Feb 25, 2020

@Dreamacro So are you suggesting we use a specific Noise key exchange, or the whole suite? AFAIC we only need ECDH to set up ephemeral session keys. The rest of Noise does not bring much benefit.

My worry is that both WireGuard and WhatsApp are assumed to be blocked in the future (if not already), and there's nothing stopping censors from blocking all Noise-like protocols (if not disguised). Compared to the case of TLS, I guess most censors cannot afford to kill TLS due to its importance.

@made-by-love

made-by-love commented Mar 2, 2020

Can v2 server also provide v1 service?

Multiplexing over single TCP connection

HTTP/3 multiplexes over a single UDP "connection" so that TCP flow control cannot stall every substream sharing one TCP connection. SCTP is an option too; it's designed for multiplexing, but it's too rare...

And just to mention it here: the first proposal for shadowsocks (in clowwindy's original post) was public key encryption. ("If others are interested in joining, perhaps this could be taken further and turned into public-key encryption.")

I implemented encryption with a pre-shared server public key (X25519 key exchange in libsodium) in 2017, with zero overhead except for the client's 32-byte public key, placed before the nonce at the beginning of the first TCP packet. I implemented it in shadowsocks-python and the libev version when I ported AEAD to the Python version in 2017.

As it's not compatible with the original shadowsocks, I didn't push the code.

Key exchange details in libsodium: https://download.libsodium.org/doc/key_exchange

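Server side
pk: public key, generated in advance; the client obtains the server's pk when it fetches the server information through the API
sk: private key, paired with pk, generated in advance; the <pk, sk> pair can be generated with shadowsocks/psk.py
rpk: remote (client) public key, sent together with the nonce when the client opens a new connection
rx: the key the remote side (client) encrypts with, i.e. the receive key, used for decryption
tx: the send key, used to encrypt data sent to the client

Client side
pk: public key
sk: private key, paired with pk; a one-time <pk, sk> pair is generated when a TCP connection is initiated
rpk: remote (server) public key, obtained by fetching the server information through the API
rx: the key the remote side (server) encrypts with, i.e. the receive key, used for decryption
tx: the send key, used to encrypt data sent to the server

The rx and tx keys
rx and tx are computed from pk, sk and rpk:
<rx, tx> = session_keys(pk, sk, rpk)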

rx || tx = BLAKE2B-512(X25519(p.n))
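
The split of the 64-byte BLAKE2B-512 output into rx and tx can be illustrated with the Python standard library. This is a sketch under assumptions, not the code from the 2017 implementation: the X25519 shared secret is passed in as an argument because the stdlib has no X25519, the hash input order (shared secret, then client public key, then server public key) follows my reading of the libsodium key exchange documentation linked above, and the function signature differs from the `session_keys(pk, sk, rpk)` form in the comment.

```python
import hashlib

KEY_LEN = 32  # each direction gets half of the 64-byte digest


def session_keys(shared_secret, client_pk, server_pk, is_client):
    """Return (rx, tx) for one side of the connection.

    shared_secret stands in for the X25519 output, which would normally
    come from a crypto library such as libsodium.
    """
    digest = hashlib.blake2b(
        shared_secret + client_pk + server_pk, digest_size=64
    ).digest()
    first, second = digest[:KEY_LEN], digest[KEY_LEN:]
    # The two sides use the halves in opposite roles, so that the
    # client's tx key is the server's rx key and vice versa.
    return (first, second) if is_client else (second, first)
```

Both ends hash the same inputs, so the only thing the client has to put on the wire is its one-time public key, which matches the 32 bytes of overhead mentioned above.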

@ohsorry

ohsorry commented Mar 9, 2020

So I guess this is the place to discuss the ss v2 protocol mentioned by @studentmain?
I haven't read the SOCKS v6 draft yet, so I may not be able to share my thoughts on v2 right now.
However, have a look at my efforts on refactoring shadowsocks-windows.
I've tried several times to figure out how shadowsocks works by reading the source code of shadowsocks-windows, but failed each time. Therefore I decided to refactor it and give it a redesign.
The key work of refactoring has been done, and both the server and client work properly on my computer. Have a look at it: https://github.com/shadowsocks/Shadowsocks-Net

Thanks to @celeron533 for creating a repository for me.
My English really sucks and I shouldn't talk so much.

@EkkoG

EkkoG commented Sep 1, 2020

The only problem with TLS is that it needs a domain name.

TLS had an extension named TLS-PSK before TLS 1.3; TLS 1.3 includes this as part of the core protocol rather than as an extension: https://tools.ietf.org/html/rfc8446#section-2.2
(image from https://www.wikiwand.com/en/Transport_Layer_Security)

OpenSSL officially supports TLS-PSK, and there is a Python wrapper for it; it doesn't need a domain at all. For the user, the config can be the same as over plain TCP: no domain, no certificates.

The problem here is that a TLS-PSK API is usually missing from language standard libraries, so it needs a third-party lib or has to be developed from scratch.

@riobard @studentmain @Mygod

@riobard
Contributor Author

riobard commented Sep 1, 2020

@cielpy Presumably if we use TLS, we'd like to look as innocent as possible to blend in normal TLS traffic. The problem with TLS-PSK is that it has an easily-detectable & unique feature, and it is extremely rare in normal TLS traffic.

We could in theory just adopt TLS-PSK and call it a day, but if enough people are using it to evade GFW, it will be investigated by GFW admins and blocked.

@EkkoG

EkkoG commented Sep 1, 2020

@riobard Yes and no. I know the situation you describe is much like the GFW blocking ESNI recently, but this is a little different: TLS-PSK has existed for years and has its own use cases, mostly IoT devices, whereas ESNI is very new, so the GFW can simply block it. For TLS-PSK, I think the decision to block will be harder for the admins to make.

@EkkoG

EkkoG commented Sep 1, 2020

If we knew how much TLS-PSK traffic there is on the Internet right now, it would be easier to make a choice, but we don't...

@riobard
Contributor Author

riobard commented Sep 1, 2020

There's no need to know global traffic pattern. Just capture traffic for a day at your home router and calculate the percentage of TLS-PSK in all TLS connections. I'd be impressed if it is more than 0.01% for an ordinary household.

@EkkoG

EkkoG commented Sep 1, 2020

Maybe, I will if it's possible.

@ghost

ghost commented Sep 1, 2020

Can TLS operate without a domain name? Technically yes, practically no. Here's the thing: no wide support = no wide use (= if we use it, we look strange).

@darhwa

darhwa commented Oct 10, 2020

What is the status of TLS-PSK in this TLS 1.3 era? In my understanding, TLS 1.3 uses PSK for connection resumptions. But I'm not sure if the client can establish a new TLS 1.3 connection to the server with only PSK (without certificate).
