Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIP004 - Support for AEADs implemented by large libraries #30

Closed
2 of 4 tasks
Mygod opened this issue Jan 12, 2017 · 280 comments
Closed
2 of 4 tasks

SIP004 - Support for AEADs implemented by large libraries #30

Mygod opened this issue Jan 12, 2017 · 280 comments

Comments

@Mygod
Copy link
Contributor

Mygod commented Jan 12, 2017

Use AEADs to replace stream cipher + OTA. Previous discussion: #29.

Proposed AEAD algorithms:

  • ChaCha20-Poly1305 (see also: xSocks)
  • XChaCha20-Poly1305
  • Salsa20-Poly1305
  • AES-256-GCM (faster but not low-end-box-friendly)

Update: The following shows an example of TCP stream in chacha20-ietf-poly1305 mode (original idea by @breakwa11 and @Noisyfox). Other AEAD should follow the similar format.

Cipher: chacha20-ietf-poly1305

TCP request (after encryption, *ciphertext*)
+--------+----------------+--------------+--------------+---------------+
| NONCE  | PayloadLen_TAG | *PayloadLen* | Payload_TAG  |   *Payload*   |
+--------+----------------+--------------+--------------+---------------+
|  12    |       16       |       2      |     16       |    Variable   |
+--------+----------------+--------------+--------------+---------------+

TCP Chunk (after encryption, *ciphertext*)
+--------------+------------+-----------+----------+
| DATA_LEN_TAG | *DATA_LEN* |  DATA_TAG |  *DATA*  |
+--------------+------------+-----------+----------+
|      16      |     2      |     16    | Variable |
+--------------+------------+-----------+----------+
@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

My plan is forking ourselves, dropping all the support of stream ciphers and OTA.

After that the we can provide a new version of the protocol (v2?).

@wongsyrone
Copy link

I want to clarify AES-GCM is the fastest on machines with particular instruction sets, for example, server and client are both running on modern Intel CPU.

@wongsyrone
Copy link

I agree with #30 (comment)

@Mygod
Copy link
Contributor Author

Mygod commented Jan 12, 2017

I don't see the reason to drop compatibility for stream ciphers for now as @madeye has said:

However, I don't think any known attacks to shadowsocks protocol is realistic. And please don't forget that most of the traffic over shadowsocks is TLS...

People who has higher security standards could migrate to using these algorithms. And if benchmark shows its influence to performance is negligible we should start to encourage everyone to migrate to AEADs.

And I don't think adding these additional algorithms would be a lot of work (we need at most 1 more abstraction layer).

@wongsyrone
Copy link

They can coexist as different projects, it is easy to maintain IMO.

@Mygod
Copy link
Contributor Author

Mygod commented Jan 12, 2017

Considering SIP003, I think it's easier to maintain if implemented in one single project.

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

To support AEADs, first we need to define chunks for the TCP stream. Here is a proposal:

+----------+----------+----------+----------+
| DATA.LEN |   NONCE  |    TAG   |   DATA   | ...
+----------+----------+----------+----------+
|     2    |    8     |    16    | Variable | ...
+----------+----------+----------+----------| 

Assuming average data length equals to a typical MTU - 26 (1492 - 26 = 1466), then the maximum transfer efficiency is 1466 / 1492 = 98%.

@Noisyfox
Copy link

So which part of the chunk is encrypted? DATA? Or the whole chunk? Or both of them?

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

@Noisyfox DATA.LEN, NONCE and TAG should not be encrypted. Only DATA is encrypted. TAG is generated from encrypted DATA. For more details: https://github.com/lparam/xSocks/blob/master/src/crypto.c#L25

@Noisyfox
Copy link

That's basically the pattern of TLS record if I did understand correctly.

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

@Noisyfox Yes, exactly.

@wongsyrone
Copy link

@Mygod
Copy link
Contributor Author

Mygod commented Jan 12, 2017

Is DATA.LEN necessary?

@wongsyrone I don't think we need AEAD actually?

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

@Mygod Yes, I think so. It's required to split TCP stream to chunks.

@wongsyrone
Copy link

AE:

  • Encrypts a message with a key and a nonce to keep it confidential
  • Computes an authentication tag. This tag is used to make sure that the message hasn't been tampered with before decrypting it.

AEAD:

  • Encrypts a message with a key and a nonce to keep it confidential
  • Computes an authentication tag. This tag is used to make sure that the message, as well as optional, non-confidential (non-encrypted) data, haven't been tampered with.

AEAD can protect plain text as well.

@Mygod
Copy link
Contributor Author

Mygod commented Jan 12, 2017

@madeye What about letting it equal to packet length - 24?

@wongsyrone I don't think we have non-confidential data to protect however.

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

@Mygod In a TCP stream, there is not packet length...

@Mygod
Copy link
Contributor Author

Mygod commented Jan 12, 2017

Oh sorry I got confused because you mentioned MTU. 😅

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

MTU here is only to demonstrate how large a chunk could be in the real world... In our implementation, we should also limit the DATA.LEN, make it smaller than the socket buffer size, e.g. 2048 in shadowsocks-libev.

@wongsyrone
Copy link

NONCE is too short.

Note: the currently included AEAD cipher based on the draft only uses an eight byte nonce, which is too short to be safely generated randomly. A later version of libsodium will include the updated version with a longer nonce.

https://crypto.stackexchange.com/questions/29311/how-do-i-tack-on-additional-data-to-libsodiums-public-key-authenticated-encry

@madeye
Copy link
Contributor

madeye commented Jan 12, 2017

Then what about this:

+----------+----------+----------+----------+----------|
| DATA.LEN |   SEQ    |   NONCE  |    TAG   |   DATA   | ...
+----------+----------+----------+----------+----------+
|     2    |    8     |    8     |    16    | Variable | ...
+----------+----------+----------+----------+----------+

SEQ is a 64-bit sequence number. The real NONCE is SEQ+ NONCE. DATA.LEN + SEQ is also the AD of the AEAD.

(Why shouldn't I use TLS directly? 😅 )

@wongsyrone
Copy link

wongsyrone commented Jan 18, 2017

[draft] main idea from @breakwa11 and @Noisyfox
The purpose of Length tag is to keep length as-is and not tempered by the third party since this is the key to separate general ciphertext and auth tag.

Please note that this is not compatible with current OTA and original protocol.

/*
 * The main difference between OTA designed by madeye and this one is MtE vs EtM.
 *
 * The first nonce is either from client or server side, it is generated via randombytes_buf()
 * in libsodium, and the sequent nonces are incremented via sodium_increment() in libsodium.
 * Please note that sodium_increment() considers the number to be encoded in little-endian format.
 * IMPORTANT: nonce should be used only once, let us running on the right track.
 *
 * Data.Len is used to separate general ciphertext and Auth tag. We can start decryption
 * if and only if the verification is passed. 
 * Firstly, we do length check, then decrypt it, separate ciphertext and attached data tag
 * based on the verified length, verify data tag and decrypt the corresponding data.
 * Finally, do what you supposed to do, e.g. forward user data.
 *
 * For UDP, nonces are generated randomly without the incrementation.
 *
 * TCP request (before encryption)
 * +------+---------------------+------------------+
 * | ATYP | Destination Address | Destination Port |
 * +------+---------------------+------------------+
 * |  1   |       Variable      |         2        |
 * +------+---------------------+------------------+
 *
 * TCP request (after encryption)
 * +--------+-----------+---------------+----------+---------------+
 * | NONCE  |PayloadLen |PayloadLen_TAG | Payload  |  Payload_TAG  |
 * +--------+-----------+---------------+----------+---------------+
 * | Fixed  |     2     |     Fixed     | Variable |     Fixed     |
 * +--------+-----------+---------------+----------+---------------+
 * 
 * Payload input: atyp + dst.addr + dst.port
 * PayloadLen is length(atyp + dst.addr + dst.port)
 * Payload_TAG and PayloadLen_TAG are in plaintext
 *
 * TCP Chunk (before encryption)
 * +----------+
 * |  DATA    |
 * +----------+
 * | Variable |
 * +----------+
 *
 * Data.Len is a 16-bit big-endian integer indicating the length of DATA.
 *
 * TCP Chunk (after encryption)
 * +------------+---------+-----------+----------+
 * | Data.Len*  | Len_TAG |  DATA_TAG |  DATA*   |
 * +------------+---------+-----------+----------+
 * |      2     | Fixed   |  Fixed    | Variable |
 * +------------+---------+-----------+----------+
 *
 * Len_TAG and DATA_TAG have the same length, they are in plaintext.
 * After encryption, DATA -> DATA*
 *
 * UDP (before encryption)
 * +------+---------------------+------------------+----------+
 * | ATYP | Destination Address | Destination Port |   DATA   |
 * +------+---------------------+------------------+----------+
 * |  1   |       Variable      |         2        | Variable |
 * +------+---------------------+------------------+----------+
 *
 * UDP (after encryption)
 * +--------+----------+-----------+
 * | NONCE  |  DATA*   |  DATA_TAG |
 * +--------+----------+-----------+
 * | Fixed  | Variable |  Fixed    |
 * +--------+----------+-----------+
 *
 * DATA* is Encrypt(atyp + dst.addr + dst.port + DATA)
 * RSV and FRAG are dropped
 * Since UDP packet is either received completely or missed,
 * we don't have to keep a field to track its length.
 *
 */

@v2ray
Copy link

v2ray commented Jan 18, 2017

I don't see the point of "length-tag". AEAD support additional data to be authenticated as plaintext. You can use length as the "additional data" for temper proofing.

@madeye
Copy link
Contributor

madeye commented Jan 18, 2017

@v2ray Agreed. The DATA.LEN can be protected by AEAD as plaintext.

@Mygod
Copy link
Contributor Author

Mygod commented Jan 18, 2017

+1. But this reminds me that we need random noise indistinguishability more than authentication (i.e. IND-CCA2) since the original purpose of this repo is to circumvent GFW and it's much harder to perform a realistic CCA2 distinction. Let's review our additions against random noise indistinguishability:

  1. TAG: Check. This should be guaranteed by the security of the hash algorithm we used.
  2. DATA.LEN: Oh shit this is not random at all. Adversary can distinguish them very easily.
  3. NONCE: No problem as long as they are randomly generated. (which isn't the case in TCP)
  4. SEQ: Terrible idea. It makes distinction much more easier.

OTA also has this problem. So what do you think?

@Mygod
Copy link
Contributor Author

Mygod commented Jan 18, 2017

removes some of my useless comments

Okay I think I got this sorted out. What we need to do is just to put data.len in the encryption data; send nonce ONLY in the first data and it auto increments for EVERY data. And voilà!

@v3aqb
Copy link

v3aqb commented Jan 18, 2017 via email

@Mygod
Copy link
Contributor Author

Mygod commented Jan 18, 2017

@v3aqb

better encrypted

I beg to differ. I took a quick glance and you implemented what shadowsocks was designed to prevent.

@shinku721
Copy link

@madeye Chunks are not the pattern, but the length of the chunks is.
However, I agree with you that this cannot be a practical attack. I mean, it has some characters similar to the weakness of OTA mentioned above, that it has to replay the request header. Both attacks are currently in theory and no evidence can prove that the GFW has already adopted any of them.

The reason why I support AEAD ciphers is just that AEAD ciphers are widely used in current practice.
IMO there is no urgent need to deprecate OTA, but we must stop using the original protocol without forcing OTA.

@riobard
Copy link
Contributor

riobard commented Feb 11, 2017

Please note that SIP004 is superseded by SIP007 (#42).

@tigaly
Copy link

tigaly commented Feb 11, 2017

从梅林被老司机领到这里,我就想问问大家,有什么决定性的代码或者证据证明OTA是先天不足必须给钦定的AEAD让路么?如果有,请联系我,我提供个C段地址,求扫描

@hellofwy
Copy link

hellofwy commented Feb 11, 2017

@tigaly
可参考:https://blessing.studio/why-do-shadowsocks-deprecate-ota/,当然我并不是完全认同文章的内容。
就我这段时间对他们讨论的观察,决定用 AEAD 主要对安全的的考虑,并不是防识别。
因为现在也引入了插件功能,可以通过插件实现协议混淆防检测,或实现协议伪装欺骗运营商 QoS 获取加速,或者其他功能。

@tigaly
Copy link

tigaly commented Feb 11, 2017

@hellofwy 这篇文章针对OTA不就是那个很容易fix的漏洞么(通篇吹破哇)?还有什么直接证据呢?那么,出于安全级别的考虑有没有什么别的漏洞无解呢?

@hellofwy
Copy link

@tigaly
就我目前知道的,并没什么直接证据。
安全这东西无止境,建议关注 TLS 协议最新相关研究,毕竟这个是世界范围内广泛使用和关注的。
作为普通使用者,在意安全的时候,使用 HTTPS 就好。不要太依赖底层 ss 协议。

@tigaly
Copy link

tigaly commented Feb 11, 2017

@hellofwy 我并不是在意安全问题,我只是用梅林的时候发现了OTA更新被直接干掉了,才想来学习下,不过这种漏洞解决的方式真奇怪哈,要是哪天证明AEAD也有问题?是不是要搞个DEAD协议出来?

@v3aqb
Copy link

v3aqb commented Feb 11, 2017 via email

@breakwa11
Copy link

@tigaly 既然不在意安全问题,那你直接用最早的版本的协议就好了,用OTA干什么?OTA本质就是增强安全为目的设计的

@hellofwy
Copy link

@tigaly
想太多。
OTA 一直也只是标为实验性功能。
AEAD 是信息安全领域的已有名词,并不是这里的人自己创造的。

@tigaly
Copy link

tigaly commented Feb 11, 2017

@breakwa11 那你去掉OTA就是为了安全?有没有证据可以证明呢?我并不在意安全问题,并不是我心大到完全不在意安全,只要gfw探测不到就行了

@hellofwy 我当然知道,只是开玩笑的说法,对于大家这种严谨的做法,我真是大开眼界

@breakwa11
Copy link

@tigaly 什么叫做“我去掉OTA”??????????

@tigaly
Copy link

tigaly commented Feb 11, 2017

@breakwa11 除了你还有谁一直在鼓吹OTA有不可修复的漏洞?

@breakwa11
Copy link

所以我也可以说:windows和linux有一大堆漏洞,然后修复漏洞的人就是我了,是这种逻辑吧,太好了,成百上千的都是我弄的

@tigaly
Copy link

tigaly commented Feb 11, 2017

@breakwa11 你这手驴打滚偷换概念玩的6,我也懒得跟你说,OTA和AEAD这两个功能的更替,有多少你在里面带节奏,明眼人一看就知道,不是你想推脱就推脱的

@breakwa11
Copy link

那不然呢?SS的代码我一行没动,在一行没动的情况下就成了是我改的?我不动一根手指但几千代码都是我改的,哇,难不成你说madeye的号被盗了?

@RobertYim
Copy link

@tigaly @breakwa11 Please stop posting any irrelevant comments here.

@tigaly
Copy link

tigaly commented Feb 11, 2017

@breakwa11 有种叫教唆犯你不会没听过吧

@hellofwy
Copy link

@tigaly
这里讨论关于人的问题不好。
毕竟这里说最好都是软件的问题。
针对人的在其他地方讨论就好。

@tigaly
Copy link

tigaly commented Feb 11, 2017

@hellofwy 是啊,本来说的好好的,不知道哪来的人跳出来一阵节奏和歪楼,
另外我翻了下好多关于OTA的讨论都完全被人带节奏,TLS和SS的用途被混为一谈,我就不说了,反正我也不怎么用ss,你们大家慢慢加油吧

@qinyuhang
Copy link

qinyuhang commented Feb 11, 2017 via email

@tigaly
Copy link

tigaly commented Feb 11, 2017

@qiuyuzhou 我进这个讨论组第一句话就说了,谁能针对OTA探测,求扫描,我提供一个C段地址,只是其他人一直歪楼带节奏才变成这样

@qinyuhang
Copy link

qinyuhang commented Feb 11, 2017 via email

@tigaly
Copy link

tigaly commented Feb 11, 2017

@qiuyuzhou 如果她敢接就不会一直打嘴仗了,从sadoneli一直扯到这里来

@qinyuhang
Copy link

qinyuhang commented Feb 11, 2017 via email

@shadowsocks shadowsocks locked and limited conversation to collaborators Feb 11, 2017
@madeye
Copy link
Contributor

madeye commented Feb 11, 2017

Locked as off-topic.

@qiuyuzhou
Copy link

@tigaly 你@错人了。。。

dszhengyu referenced this issue in shadowsocks/shadowsocks-go Sep 4, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests