Rosen: censorship-resistant proxy tunnel based on encapsulating traffic within cover protocols #57
Comments
Thanks for the hard work! Just want to point out a clarification: uTLS doesn't require you to run/include a headless Chrome, it's just a fork of the Go tls library. The idea is to let you specify/mimic the client hello of popular implementations (like Chrome or Firefox), so censors can't block you based on your unique-looking client hello "fingerprint". If you want to see what your tool's fingerprint(s) currently are, you can upload a pcap here: https://tlsfingerprint.io/pcap and it will give you a list and compare those to our dataset of captured TLS fingerprints in live traffic.

Also, in case you are not aware, Sergey Frolov and I published a paper at FOCI this year that had a similar design to yours, and also supported WebSockets (primarily to cut down on tunneling overhead): https://www.usenix.org/conference/foci20/presentation/frolov A prototype of the HTTPT tool is available here: https://github.com/sergeyfrolov/httpt

However, there remains a lot of work to be done in figuring out what counts as "cover" traffic for a server. How do we avoid a censor being able to tell that serving an assets folder is a tell-tale sign of these kinds of proxies? We had some preliminary ideas in the HTTPT paper, but many were not implemented or tested against real censors. It would be interesting to have a suite of these (500s, 404s, asset folders, mimicking popular sites, etc.) and see which ones censors are able to block.
v2ray didn't do anything special about it; it lets the user choose what to serve by putting the proxy behind a real web server. By doing so, there are no auto-generated assets and no common pattern to detect. AFAIK, it works pretty well under real censors.
This is great, thanks for working on Rosen.
I see you already know about HTTPT, which has a similar design and goals. Here are a few other projects to look at for comparison or inspiration:
A modular architecture has been thought of and implemented many times. V2Fly may be the leader here. Pluggable transports can be thought of as adding a layer of modularity over dedicated transport programs. obfs4proxy is internally modular and supports a number of transports; currently in Tor Browser both obfs4 and meek are done by obfs4proxy.
This may just be a matter of terminology, but I wouldn't call what Rosen is doing "tunneling", except insofar as it resembles Go crypto/tls. For tunneling, you would have to use an actual instance of some common HTTPS implementation; a headless browser would be one way of doing that. But it could be even more specific; e.g., using a browser through a specific web service. uTLS is a good option; as @ewust notes, it's a modification of crypto/tls and no browser is involved.

For what it's worth, the meek that's deployed in Tor Browser has used uTLS since October 2019, and in that time it has not been blocked by its TLS fingerprint, as far as I know. Before switching to uTLS, the meek deployment in Tor Browser used a headless Firefox, which worked okay but was logistically hard to work with and keep up to date; see Section V of "The use of TLS in censorship circumvention" for more discussion.

You need to think about the server-side TLS fingerprint as well, and uTLS does not help there. One option is to use a frontend web server such as Apache or Nginx, with its own TLS certificate, that forwards everything to a local Rosen server. HTTPT and V2Fly are designed with this kind of deployment in mind.
This is a good idea. To my knowledge, there is still scant evidence of censors actually using attacks based on timing and packet sizes; the closest thing is probably the GFW considering the length of the first data packet (Section 4) when detecting possible Shadowsocks connections. Still, it is a good thing to prepare for. We have talked about padding schemes before at #9 (comment). Harder than designing a padding scheme, though, is deciding how to apply it: characterization of "normal" traffic patterns is IMO still an open and vital research question. Snowflake uses this protocol for padding and dnstt uses this one, but both are currently unused (they insert either no padding or a fixed amount).
It's great that you have been able to test against actual DPI engines. Bear in mind that on-path DPI is, empirically, not the favored tool of censors; when possible they prefer to use "setup" features rather than "usage" features (Recommendation 3 on page 11). This means that protecting the IP address or domain name of the proxy is as important as having a good protocol fingerprint. The challenge here is informing censored users of where the proxies are located, without also informing the censor. One way to deal with this is to ask each user to set up their own personal proxy.
^_^
I'm aware that uTLS is just a fork of Go's TLS library. Thanks for the link to tlsfingerprint.io, it will definitely be useful.
Yes I came across HTTPT, it's interesting. The approach you ended up with is similar to the one used in Rosen, except a typical HTTP handler is implemented instead of using nginx or something else as a reverse proxy. This latter approach could easily be implemented in Rosen by starting the server on a different port (without HTTPS) and putting any kind of reverse proxy in front of it. Using Cloudflare is another option. You say you used WebSockets to cut down on overhead. Did you see a significant improvement in performance? Rosen's performance seems to be mostly network limited for me so the main reason I was looking to use WebSockets is for a less suspicious traffic pattern.
I've thought about this too. Rosen currently implements the approach that @studentmain mentions: we have a hot-swappable local handler:

```go
func staticHandler(w http.ResponseWriter, r *http.Request) {
	staticWebsiteHandler.ServeHTTP(w, r)
}
```

So this static "default" handler can be swapped out with any function that takes a request object and a response writer. @wkrp Thanks for the links.
Again, I wonder how rare Go's default fingerprint actually is and whether this is necessary. uTLS does look quite simple and flexible in terms of its API, though, so there seem to be few downsides to including it. Since meek has used it with success, that's even more reason to include it. In terms of using another frontend, I would say the same as what I replied to @ewust.
Censors not using packet sizes and timing patterns is what I expected; it likely requires more computational resources than is feasible in real time. I have been told that the GFW activates a more restrictive mode during sensitive times, so perhaps in those situations it could be detected. To be honest, (except for meek) Shadowsocks was the only software I looked at deeply while designing and implementing Rosen. I was kind of shocked by its poor cryptographic design and overall simplicity, so the fact that it works well makes me more confident about Rosen and the other projects you mentioned. In terms of random padding schemes, my goal is just to destroy characteristics of Rosen's own fingerprint. For example, if there's no communication happening in the HTTP tunnel, there will be small, fixed-size pings. Appending between 1 and 4 KB of random padding would destroy this fingerprint with little overhead. Of course, making traffic look like existing services such as Netflix or YouTube could be feasible, but censors could tell we're not connecting to Netflix's servers, for example. Also, these services are dominated by downloads, which restricts us quite a bit.
My intuition tells me you are right. Endpoint-fingerprinting resistance is very important, as its absence could render all of our effort in other areas useless. The distribution problem is hard to solve, yes. My view is that users have to set up their own services or pay someone to do it for them. Putting servers on CDNs like Amazon's or behind Cloudflare, under their own domains, seems like the most resilient method.
The early history of Shadowsocks is interesting. At first (2010) it used a substitution cipher. Soon they found that a substitution cipher is not secure: frequency analysis is enough to detect it. Then Shadowsocks introduced the RC4 cipher, but this time they forgot to add an initialization vector. RC4 itself has no place for an IV, so they generated one and mixed it with the password via MD5. This was the first stream cipher in Shadowsocks. Then they finally imported (#include <>, [DllImport], depending on which language the implementation used...) OpenSSL and added the aes/camellia-xxx-yyy stream ciphers from OpenSSL. You probably already know the rest. You see, none of this was really about cryptography at the very beginning.
And why did it work? Just as DES worked in 1980, Shadowsocks worked in 2010. Every new protocol is "unknown traffic" when it's born, and censors usually decide to allow unknown traffic. Sure, they learn that it's Shadowsocks in a few days/months/years, and then Shadowsocks stops working, just like DES stopped working in the 1990s. Then Shadowsocks changed its protocol by replacing its S-box with RC4, which worked for a while, just like 3DES worked for a while before eventually being broken.
As part of my ongoing masters thesis I've been working on Rosen: https://github.com/awnumar/rosen
Related post here: https://spacetime.dev/rosen-censorship-resistant-proxy-tunnel