SIP001: Header obfuscating #26

Closed
madeye opened this Issue Dec 13, 2016 · 78 comments

madeye commented Dec 13, 2016

Shadowsocks Improvement Proposal 001

SIP001 - Allow header obfuscating to cheat on QoS.

Recently, the QoS policies of some ISPs have become unreasonable. A cheap way to work around this is header obfuscation, which inserts fake headers before the shadowsocks handshake packets.

For example, before a shadowsocks request, we insert this HTTP POST header:

    POST / HTTP/1.1\r\n
    Host: www.baidu.com:8388\r\n
    User-Agent: curl/7.45.1\r\n
    Accept: */*\r\n
    Content-Type: application/octet-stream\r\n
    Content-Length: 176\r\n
    \r\n

Similarly, we insert this HTTP header before a shadowsocks response.

    HTTP/1.1 200 OK\r\n
    Server: nginx/1.0.2\r\n
    Date: Tue, 13 Dec 2016 13:25:12 GMT\r\n
    Content-Type: application/octet-stream\r\n
    Content-Length: 176\r\n
    Connection: keep-alive\r\n
    Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform\r\n
    Pragma: no-cache\r\n
    \r\n

With this SIP, we may be able to cheat most QoS mechanisms, avoiding QoS-related packet drops or bandwidth limits.
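
To make the idea concrete, here is a minimal Python sketch of prepending and stripping such a header. The function names `wrap_request` and `strip_header` are hypothetical, and setting `Content-Length` to the length of the first payload is an assumption; the demonstration branch may choose the value differently.

```python
def wrap_request(payload: bytes, host: str = "www.baidu.com:8388") -> bytes:
    """Prepend a fake HTTP POST header to the first shadowsocks packet."""
    header = (
        "POST / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "User-Agent: curl/7.45.1\r\n"
        "Accept: */*\r\n"
        "Content-Type: application/octet-stream\r\n"
        # Assumption: advertise the length of the first payload.
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("ascii")
    return header + payload


def strip_header(data: bytes) -> bytes:
    """Drop everything up to and including the blank line ending the header."""
    head, sep, body = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("header is incomplete; wait for more data")
    return body
```

The receiving side only needs to find the first `\r\n\r\n` and hand the rest to the normal shadowsocks decoder, which is what keeps this approach cheap.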

A demonstration can be found here: https://github.com/shadowsocks/shadowsocks-libev/tree/obfs

Any suggestion is welcome.

madeye added the enhancement label Dec 13, 2016

Mygod commented Dec 13, 2016

  1. This feature is optional and configurable right?
  2. Why does it use \r\n instead of \n?
  3. May I suggest using POST and adding Content-Length to the request, since we need to post data to the server?
  4. Content-Type: text/html and Content-Encoding: gzip don't match the content the server returns, which would be suspicious. How about using application/octet-stream and removing Content-Encoding (which means anything is valid)?

madeye commented Dec 13, 2016

  1. Yes, it would introduce additional features of the traffic. We may refine the implementation to make it closer to real HTTP traffic.
  2. It should be a problem. For now, we should warn the user about the risk and make this feature disabled by default.
  3. Do you mean we should fake the header like a CDN header?

madeye commented Dec 13, 2016

@Mygod

  1. Right, optional and configurable.
  2. From the RFC, it seems to be \r\n. Correct me if I'm wrong.
   HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all
   protocol elements except the entity-body (see appendix 19.3 for
   tolerant applications). The end-of-line marker within an entity-body
   is defined by its associated media type, as described in section 3.7.

       CRLF           = CR LF
  3. Yes, it looks like a good idea.
  4. Ditto.

nekolab commented Dec 13, 2016

I suggest letting users define the request and response headers themselves, rather than using a fixed template.

Mygod commented Dec 13, 2016

nekolab commented Dec 13, 2016

Fine. Another question: is this a connection-level header or a conversation-level header?

A connection-level header only appears when the TCP connection is established; after that it is not sent any more. A conversation-level header appears throughout the TCP stream: every call to send adds a fake header to the stream.

Semantically, neither the POST nor the GET method in HTTP can represent a connection-level header. After one send-recv round, an ordinary HTTP client either closes the TCP connection or keeps it for another HTTP request (with another header), whereas here the TCP connection keeps sending and receiving data.

I'm not familiar with the libev version of SS. After a quick look I believe this implementation uses the connection-level header; correct me if I'm wrong.

The conversation-level header may look more like an ordinary HTTP client using the POST method and reusing the connection, but will it decrease performance, add complexity in finding and removing the fake headers, or add even more characteristics to the protocol?
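
The difference between the two modes can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `fake_header`, `connection_level` and `conversation_level` are hypothetical names, and the header contents are a shortened placeholder.

```python
def fake_header(length: int) -> bytes:
    """A shortened placeholder header; the real template has more fields."""
    return (
        "POST / HTTP/1.1\r\n"
        "Host: www.baidu.com:8388\r\n"
        f"Content-Length: {length}\r\n"
        "\r\n"
    ).encode("ascii")


def connection_level(chunks) -> bytes:
    """One fake header at the start of the TCP stream, then raw data."""
    chunks = list(chunks)
    first = chunks[0] if chunks else b""
    return fake_header(len(first)) + b"".join(chunks)


def conversation_level(chunks) -> bytes:
    """A fake header in front of every chunk passed to send."""
    return b"".join(fake_header(len(c)) + c for c in chunks)
```

With three 4-byte chunks, the conversation-level stream carries two extra headers, so the overhead grows with the number of sends, and the receiver has to locate and remove a header before every chunk.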

v3aqb commented Dec 13, 2016

How about using a WebSocket header?

Mygod commented Dec 14, 2016

Hmm. Maybe we can support both HTTP mode and WebSocket mode?

madeye commented Dec 14, 2016

WebSocket looks like a great idea. It helps to avoid the conversation-level headers mentioned by @nekolab.

I'm not a big fan of fully customized headers, which may introduce illegal usage of this feature.

nekolab commented Dec 14, 2016

We could run some tests to confirm whether the WebSocket header can cheat QoS successfully. I'm not quite sure, since it's a new protocol and may be ignored by QoS. If it works, I vote yes for it.

ayanamist commented Dec 14, 2016

I don't think a WebSocket header will cheat QoS, since the cheat that has been proven to work appears to be very badly implemented.

SSR with simple_http has been proven to cheat QoS successfully on Hangzhou Telecom. SSR with simple_http uses the GET method with a request body, which is definitely an illegally formed HTTP request.

Do you plan to move some data, like the IV, from the request body to the request path as SSR does? This makes the request URL differ from request to request, which I think will make detection harder.

ayanamist commented Dec 14, 2016

@wongsyrone I don't understand what you said. If a request is invalid, it can't pass shadowsocks' existing verification mechanism, so where would a correct response come from?
In fact, I think it will decrease the risk of exposing the server side, since the server can behave like a normal HTTP server.

madeye commented Dec 14, 2016

Updated with WebSocket obfuscation via shadowsocks/shadowsocks-libev@6176903

Request:

    GET / HTTP/1.1\r\n
    Host: www.baidu.com:8388\r\n
    User-Agent: curl/7.18.1\r\n
    Upgrade: websocket\r\n
    Connection: Upgrade\r\n
    Sec-WebSocket-Key: XVOfcm44bdPb0+xNrmf4tg==\r\n
    \r\n

Response:

    HTTP/1.1 101 Switching Protocols\r\n
    Server: nginx/1.2.2\r\n
    Date: Wed, 14 Dec 2016 13:42:07 GMT\r\n
    Upgrade: websocket\r\n
    Connection: Upgrade\r\n
    Sec-WebSocket-Accept: byeMGrcAr+bKUtt+i2Thaw==\r\n
    \r\n

Basically, it's still HTTP GET obfuscation. However, the WebSocket protocol makes the whole traffic stream look more normal.
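
For reference, in a real WebSocket handshake the `Sec-WebSocket-Accept` value is derived from the client's `Sec-WebSocket-Key` as specified in RFC 6455: append a fixed GUID, take the SHA-1 digest, and base64-encode it. The sketch below shows that derivation; whether the obfuscation needs to compute a matching value, or whether fixed strings are enough to cheat QoS, is an open question here. The function name `websocket_accept` is my own.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Example key/accept pair taken from RFC 6455 itself.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A middlebox that validates handshakes this strictly could tell a mismatched pair from a genuine one, so computing the value is cheap insurance if the extra realism is wanted.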

Mygod commented Dec 14, 2016

"illegal usage of this feature."

Hmm I thought that was the point of this feature.

simonsmh commented Dec 14, 2016

@wongsyrone That's why it should be disabled by default if necessary.
@Mygod In another project, shadowsocksr, these IPs/domains can be banned on the server side for illegal usage. That's not the major issue.

madeye commented Dec 14, 2016

Actually, I don't think we need to worry about adding new features.

The soul of shadowsocks is to solve a stupid problem (you know what I mean) with as little effort as possible. If any small change works well, we just add it. If not, we drop it.

As an optional protocol extension, even if this proposal introduces new problems, we can continue to refine it or just drop it.

As the next step, I suggest doing more tests in real environments and seeing what happens.

ghost commented Dec 14, 2016

Assuming the proposal applies to TCP connections only, this feature is equivalent to a customized HTTP proxy (say, ShadowHTTP).

The only difference is that ShadowHTTP only transfers encrypted content, whereas a normal HTTP proxy allows both plain and encrypted payloads. The HTTP method may be different but configurable (discussed above). Going further, ShadowHTTP may be able to proxy out (or deny) invalid requests in order to avoid detection/probing. This is one step closer to being a normal HTTP proxy.

An HTTP proxy is fine, but it doesn't fit the needs of a SOCKS proxy. If your end goal is to cover UDP or provide other types of obfuscation, I would suggest making the design more fundamental and extensible, to allow for potential growth in the future.

ayanamist commented Dec 14, 2016

@v2ray No, it is not an HTTP proxy, but a SOCKS proxy obfuscated as an HTTP proxy, which definitely fits the needs of a SOCKS proxy.

madeye commented Dec 14, 2016

@v2ray The proposal here is header obfuscation, and the goal is to find a cheap way to cheat QoS. In other words, it just does some simple obfuscation, with no plan to implement the full HTTP protocol.

pexcn commented Dec 14, 2016

Good idea.

librehat commented Dec 14, 2016

Will it be optionally enabled on the client side and the server side separately? (I.e., as a server, if I receive an obfuscated request, am I allowed to respond with a non-obfuscated response?) Or would it be similar to OTA, where an obfuscated request ensures that the response is also obfuscated?

v3aqb commented Dec 14, 2016

with URI like this?

ss://method:password@hostname:port/?obfs=http[&hostname=www.baidu.com]

or

ss://method:password@hostname:port/?obfs=http[&header=BASE64-ENCODED-HEADER-DATA]
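
A client could read the first form with the standard library alone. This is a rough sketch that takes the URI literally as written above (plain-text `method:password`, which is an assumption about the encoding); `parse_ss_uri` and the returned key names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_ss_uri(uri: str) -> dict:
    """Split an ss:// URI with optional obfs parameters into its parts."""
    parts = urlsplit(uri)
    if parts.scheme != "ss":
        raise ValueError("not an ss:// URI")
    query = parse_qs(parts.query)
    return {
        "method": parts.username,
        "password": parts.password,
        "server": parts.hostname,
        "port": parts.port,
        # Both parameters are optional; None means no obfuscation requested.
        "obfs": query.get("obfs", [None])[0],
        "obfs_host": query.get("hostname", [None])[0],
    }
```

Clients that don't know the `obfs` parameter can simply ignore the query string, so the scheme stays backward-compatible.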

madeye commented Dec 15, 2016

@librehat Right, it's totally optional. Both client and server should enable the same obfuscation. On the server side, when the obfuscation is enabled, it still can handle normal protocol without obfuscating. So,

| Client-Obfs | Server-Obfs | Working |
|-------------|-------------|---------|
| Yes         | Yes         | Yes     |
| Yes         | No          | No      |
| No          | Yes         | Yes     |
| No          | No          | Yes     |

@v3aqb The first one looks better. As the hostname should be ASCII, no need to do base64 encoding.
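
The compatibility table above boils down to one rule: a connection works unless the client obfuscates and the server does not. A minimal sketch, with hypothetical function names, and with the server-side detection (checking whether the stream starts with an HTTP request line) being my assumption about how the fallback could be implemented:

```python
def connection_works(client_obfs: bool, server_obfs: bool) -> bool:
    """An obfs-enabled server accepts both kinds of client."""
    return server_obfs or not client_obfs


def looks_obfuscated(first_bytes: bytes) -> bool:
    """Guess whether a new stream starts with a fake HTTP request line.

    Assumption: a plain shadowsocks stream starts with a random IV, so a
    false positive here is extremely unlikely but not impossible.
    """
    return first_bytes.startswith((b"GET ", b"POST "))
```

A server with obfuscation enabled would call something like `looks_obfuscated` on the first bytes of each connection and fall back to the plain protocol when it returns False.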

librehat commented Dec 15, 2016

@madeye Actually, I don't think the server needs to be able to disable obfs if it supports it, since it should be fully backward-compatible. We don't have to add one more config option on the server side each time a new feature is proposed (but that can also be up to each implementation).

madeye commented Dec 15, 2016

@librehat I think there are two reasons why we need to provide an option on the server side:

  1. Prevent potential security issues. If any security issue is found in the future, users can easily disable obfuscation support on their servers. And if a user doesn't want to take the risk of enabling obfuscation, they can still keep updating to the latest software with obfuscation disabled by default.
  2. Support different kinds of obfuscation. Currently, we only have HTTP obfuscation, but someday we may have more. So, it's necessary to provide an option for switching between different obfuscation implementations.

ghost commented Dec 17, 2016

Is there a reproducible test that shows the problem in the first place, i.e. that an ISP will favor an HTTP request over a shadowsocks TCP request? Because I am not observing it.

ghost commented Dec 17, 2016

@nekolab I don't believe the HTTP/1.1 spec rules out multiplexing; in other words, strict request/response semantics are only a convention. A single obfuscation at the start of the TCP stream should be sufficient.

madeye commented Dec 18, 2016

@nfjinjing If you have a link with China Telecom, you may try experiments between 9:00 PM and 11:00 PM every day. Actually, according to some internal sources at Cisco, they deployed a similar QoS mechanism on the ASR 1000 series for China Telecom years ago.

ghost commented Dec 18, 2016

@madeye That's very interesting. Unfortunately because of a different ISP, I can't verify it myself.

I tried the obfs branch at 3d71c2. How do I know if obfuscation is turned on? There seem to be no options to enable it, and I didn't find any HTTP headers with tcpdump.

Artoria2e5 commented Dec 30, 2016

Since @falseen has mentioned some ideas regarding a pluggable obfuscation system, I would like to draw some attention to supporting Tor's Pluggable Transport protocol, which allows Tor to speak with separate obfuscating programs ("Pluggable Transports"; PTs) like obfs2/3/4, meek, fteproxy and ScrambleSuit. Tor has a very rich repository of PTs, and there is no reason not to use these field-tested and well-reviewed implementations.

For faking HTTP traffic for better QoS, Tor already has a fteproxy, which transforms traffic into something that matches a specified regex. Tor's evaluation highlights a few weaknesses in fteproxy, but some of them are actually not hard to fix since SS deployments have more space for customization:

  • fteproxy performs no effort in hiding the packet size/timing signatures. Since obfs4 can do all of these, a very lame hack is possible: just wrap fteproxy around an obfs4 configured to do these.
  • fteproxy uses a static key on Tor deployments, and is therefore vulnerable to active probing on its own level. But SS itself can perform some key derivation from the given password(s) to make it non-static.
  • fteproxy currently cuts the connection on receiving a normal HTTP response. This is a fatal issue to be fixed by SS developers.

Regarding the super-well-known obfs4, there is actually some timing obfuscation not enabled by default due to non-trivial performance penalty and costs on censors like GFW themselves. It might be worth mentioning as there are increasing concerns over timing detection on looks-like-nothing transports like SS itself and obfs4.

A successful non-Tor PT protocol implementation is @gumblex's ptproxy.

In retrospect, even kcptun can be made a PT this way. The name "Pluggable Transport" itself does not limit the transport to obfuscators; it can be anything that provides a transport-layer tunnel. And who said that we can't chain them?


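The SSR obfuscation that @falseen mentioned reminded me of obfs4. obfs4 is actually one of Tor's Pluggable Transports (PTs). Transport programs (usually obfuscators) talk to Tor through a public protocol, which in effect already implements this plugin-mode proposal. Tor has many good obfuscation components, and there is no reason not to use them.

Correction: SSR's obfs is just short for obfuscation; I mistook it for obfs4.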

anonymous-contributor commented Dec 30, 2016

Personally speaking, I don't really think the current obfuscation is really obfuscating anything.
Packet sequence and timing are not changed at all.
This seems to be a dirty hack for a given ISP. Neither elegant nor generic.

So I never liked the idea itself.

Here's a +1 for Tor PT; in fact, I've been using obfsproxy (ScrambleSuit) with SS for a long time.
My ISP seems to RST my connection quite often with plain ss (of course, AES encrypted).
I switched to obfsproxy + ss, and things have worked fine since then.

I'm using the proxied mode for now, but it should not be hard to support managed mode.
(I've always wanted to add managed mode, but since the current proxy mode works fine and I'm too lazy, so...)

BTW, the latest obfs4 PT only supports managed mode.

So I prefer to deprecate the current dirty hack, and just implement obfsproxy managed mode.
This is not only generic, but also KISS.

Thanks

madeye commented Dec 31, 2016

After reviewing the whole proposal again, I realized that I made a big mistake here.

As mentioned by @Artoria2e5 and @anonymous-contributor, this proposal is actually "a dirty hack for given ISP". We should not add this proposal directly to the shadowsocks protocol; doing so would also break the KISS principle that we have insisted on for the past four years.

So, here are my next steps:

  1. I'll deprecate this change in the next release of shadowsocks-libev.
  2. As this proposal is still useful for many users, I plan to move all the related implementation from shadowsocks-libev to a new project (simple-obfs?). So, if you're already using this feature or working on your own compatible implementation, don't worry: the new project will continue to work as a plugin server for you.
  3. As proposed by @Mygod in #27, we can keep adding more obfuscating tools, e.g. obfs4, as plugin server and recommend them to all the shadowsocks users.

BTW, I forked obfs4 months ago and modified it to work in standalone mode as a simple tunnel tool. It may be useful if we plan to add obfs4 as a plugin server in the future. https://github.com/madeye/obfs4-tunnel

Thanks again to all the suggestions and comments in this issue. You're awesome!

Mygod commented Dec 31, 2016

How does a plugin server work? Shadowsocks clients are written in very different languages. Does this mean every client has to build a plugin platform first before implementing new plugins?

EDIT: In comparison, Tor's approach seems more doable.

madeye commented Dec 31, 2016

@Mygod I mean plugin servers like shadowsocks over kcptun, tor over obfs4. Any plugin server can work for every implementation of shadowsocks.

Mygod commented Dec 31, 2016

Yeah I just realized that. In that case is it possible to use multiple plugins at the same time? For example, shadowsocks over obfs4 over kcptun?

madeye commented Dec 31, 2016

For now, I think we should avoid this kind of plugin over plugin... And actually, obfs over kcptun is meaningless.

anonymous-contributor commented Dec 31, 2016

For easy configuration, I prefer to use obfsproxy managed mode instead of standalone one.
(More and more like tor, right?)

This makes us able to configure ss client like:

{
  "server": "test.example.com",
  "server_port": "obfs4: 6666"
}

And configure server like:

{
  "port_password":
  {
        "obfs4: 6666": "OBFSpaSsWoRD",
        "6667": "PLAINpaSSwoRD"
  }
}

Such a configuration can save a lot of time, and avoids needing a second password for ScrambleSuit.
(We can just hash the ss password and use the hash as the ScrambleSuit password.)

Furthermore, it's possible to stack all plugins together:
(I must be crazy to do that, although I haven't found a good way to pass tcptun parameters)

{
  "port_password":
   {
       "obfs4+tcptun: 6666": "WTFarewedoing?"
   }
}
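
To show how little parsing such keys would need, here is a rough Python sketch. Everything in it is hypothetical: the key syntax is only the one used in the examples above, the names `parse_port_key` and `derive_plugin_password` are made up, and SHA-256 over `plugin:password` is just one possible way to "hash the ss password"; a real transport such as ScrambleSuit may require a specific password format.

```python
import hashlib


def parse_port_key(key: str):
    """Split a key like "obfs4+tcptun: 6666" into (plugin list, port)."""
    name, sep, port = key.rpartition(":")
    # A bare "6667" has no colon, so there are no plugins.
    plugins = [p.strip() for p in name.split("+") if p.strip()] if sep else []
    return plugins, int(port)


def derive_plugin_password(ss_password: str, plugin: str) -> str:
    """Derive a per-plugin secret from the ss password (assumed scheme)."""
    data = f"{plugin}:{ss_password}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()
```

For example, `parse_port_key("obfs4: 6666")` gives `(["obfs4"], 6666)`, and the plugin list preserves stacking order for the chained case.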

simonsmh commented Dec 31, 2016

ghost commented Dec 31, 2016

I suggest we first clarify what it is that we are trying to obfuscate from:

There are two things that could disrupt internet usage: the GFW and the ISP, which might be dealt with similarly or differently.

In this case, if I understand correctly, we are dealing with QoS, which is deployed at ISPs. What has been proposed, which has proven useful on China Telecom, might be a dirty hack, but it is a working one.

obfs4 does not seem to me to solve this particular problem, since it's a "look-like nothing obfuscation protocol" as described on the project page, and that's exactly how an ISP sees shadowsocks. We at least need some tests showing its effectiveness on China Telecom before even considering it as an alternative.

obfs4 seems to prevent ISPs from resetting TCP connections according to @anonymous-contributor, but is this a general phenomenon or an isolated instance?

None of this should stop the development of the general architecture, of course. And since the original proposal still fits and has been tested, I see no reason to abandon it. As long as there is a pluggable design, it can be amended anytime.

anonymous-contributor commented Dec 31, 2016

@nfjinjing It depends more on your VPS (and TCP congestion algorithm setup) than on the so-called obfs.

Just as your benchmark shows, the dirty hack does improve performance, but only on a small scale compared to the BBR TCP congestion algorithm.
The bottleneck lies in the VPS location/route and the TCP congestion algorithm.

So, at least for me, integrating such a hack into SS is not worth it.
Who knows when other users will ask for some extra dirty hack for another ISP.
If we start integrating it, it will begin a game of whac-a-mole in ss.

The Tor PT method, meanwhile, is both generic and KISS: any ISP-specific hack can be one PT, and if a lot of users really need it, the project will grow and we will know.
And the generic Tor PT style interface will be quite easy for us to integrate (if using managed mode, just several lines of JSON config).

As for the obfs4 vs. RST problem, it may be an individual problem, but it doesn't change the above KISS principle.

gumblex commented Dec 31, 2016

@nfjinjing obfs4 can't prevent RSTs. Its purpose is to disrupt blocking or QoS that is based on observing specific protocol characteristics and timing.

madeye commented Dec 31, 2016

@anonymous-contributor Supporting PT looks like a good idea. Any interest in opening a pull request?

ghost commented Dec 31, 2016

@anonymous-contributor Thanks for the clarification. I'm not against PT or something similar; I'm in favor of it. I might have had the wrong impression that the proposal, which could be implemented as a "plugin", was being replaced by obfs4.

anonymous-contributor commented Dec 31, 2016

@madeye I'll spare some time for implementing the managed mode support, along with the json configuration part.

But don't expect it soon; it may take one or two months.

(I'm just too busy launching satellites in KSP 🚀 )

madeye commented Dec 31, 2016

@anonymous-contributor Great! I'll keep studying more details about PT.

(And to be honest, I'm also busy with fighting against Germans in the Argonne Forest)

ghost commented Dec 31, 2016

@gumblex I'm curious whether such a capability has been observed at any ISP?

liaozibo commented Dec 31, 2016

I think ss could expose a server-side API, and the client side should do the same.
Developers could then write lots of server-side and client-side plugins to obfuscate the TCP/UDP stream.
If any developer drops out, it would not affect the ss community.
This development model could make the ss community (and protocol) much stronger.

cat-new commented Jan 2, 2017

Should we also hold a vote to decide whether to add obfuscation?
https://goo.gl/forms/PIJ4ykg6NCViKtdD2
This form expires at 24:00 on January 31, 2017!

simonsmh commented Jan 2, 2017

ghost commented Jan 2, 2017

On second thought, the "optional" nature of simple-obfs might be a problem. Just imagine what the GFW will think when it sees a service that sometimes looks like an almost valid HTTP request, where the host name probably won't match, and sometimes looks like nothing at all.

blackgear added a commit to blackgear/shadowsocks-libev that referenced this issue Mar 15, 2017

Add HTTP/TLS obfuscating. [SIP001] (shadowsocks#1009)
Add experimental HTTP/TLS obfuscating as an **optional extension** of shadowsocks protocol.

More discussions can be found here: shadowsocks/shadowsocks-org#26

As this feature is still a SIP (Shadowsocks Improvement Proposal), it's very unstable and experimental. So,

1. Don't enable it unless you know what it is.
2. Be very careful when using it in production environment.

madeye removed the enhancement label Nov 9, 2018
