Introduction to V2Ray #36
At the next Tor anti-censorship team reading group (Thursday June 11 at 16:00 UTC), we are going to be discussing V2Ray. The members of the team are not very familiar with V2Ray, and we want to broaden our understanding.
Here are some of my preliminary notes on V2Ray. I hope that some readers who are more familiar with V2Ray will be able to correct my misunderstandings and provide more detail. For example, I'm unsure about the relationship between V2Ray and VMess—they seem to have some historical relationship, but I'm not sure.
V2Ray is not a protocol or circumvention system by itself. Rather, V2Ray is a platform or framework that allows you to run one or more proxies, with various layered proxy protocols, transports, and obfuscation. For example, you could run SOCKS-in-TLS on one port, and VMess-in-QUIC (with the QUIC packets optionally obfuscated) on another port. On the client side, you can configure routing to control which traffic uses which proxy, and which traffic is not proxied at all.
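To make the client-side routing idea concrete, here is a minimal sketch. It is not V2Ray's actual routing engine or configuration format; the rule list, the outbound names ("direct", "proxy", "proxy-quic"), and the function name are all hypothetical.

```python
def choose_outbound(host, rules, default="proxy"):
    """Return the name of the outbound for `host`.

    `rules` is an ordered list of (domain_suffix, outbound_name) pairs;
    the first matching suffix wins. All names here are hypothetical.
    """
    host = host.lower().rstrip(".")
    for suffix, outbound in rules:
        if host == suffix or host.endswith("." + suffix):
            return outbound
    return default


# Hypothetical rule set: some traffic goes direct, some through a
# specific proxy, and everything else through the default proxy.
RULES = [
    ("local.example", "direct"),
    ("video.example", "proxy-quic"),
]

print(choose_outbound("www.local.example", RULES))
print(choose_outbound("video.example", RULES))
print(choose_outbound("other.example", RULES))
```

The first-match-wins order is a design choice of this sketch, not a statement about how V2Ray evaluates its rules.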
At the lowest level, V2Ray supports a variety of proxy protocols, some inherently obfuscated and some not:
There's an optional mux (multiplex) layer to tunnel multiple streams through one proxy connection.
The proxy protocols are not inherently implemented over any particular kind of network connection. Instead, you must specify a transport for each:
Any of the transport layers may optionally have a layer of TLS applied to them. The TLS option is obligatory with the HTTP/2 and QUIC transports.
Finally, at the highest level, some transports support additional, optional obfuscation options:
The V2Ray model provides a lot of flexibility. You could set up an unauthenticated SOCKS proxy without any encryption, or you could set up VMess open only to authorized users, tunneled through WebSocket with TLS.
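The layering described above can be sketched as a small data model. This is only an illustration of the concepts in these notes; the field names and string values are made up for the example and are not V2Ray configuration keys. The one rule it checks is the one stated above: TLS is obligatory with the HTTP/2 and QUIC transports.

```python
from dataclasses import dataclass
from typing import List, Optional

# Transports for which, per the notes above, TLS is obligatory.
TLS_REQUIRED = {"http2", "quic"}


@dataclass(frozen=True)
class Stack:
    """One proxy endpoint: proxy protocol + transport + optional layers.

    Field names are illustrative only, not V2Ray config keys.
    """
    proxy: str                         # e.g. "socks", "vmess"
    transport: str                     # e.g. "tcp", "websocket", "quic"
    tls: bool = False                  # optional TLS layer
    mux: bool = False                  # optional multiplexing layer
    obfuscation: Optional[str] = None  # optional transport obfuscation


def validate(stack: Stack) -> List[str]:
    """Return a list of problems with the combination (empty if none)."""
    problems = []
    if stack.transport in TLS_REQUIRED and not stack.tls:
        problems.append(f"{stack.transport} transport requires TLS")
    return problems


# The two extremes mentioned in the text.
open_socks = Stack(proxy="socks", transport="tcp")
vmess_ws_tls = Stack(proxy="vmess", transport="websocket", tls=True)

for s in (open_socks, vmess_ws_tls, Stack(proxy="vmess", transport="quic")):
    print(s, validate(s))
```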
Hi, I am xiaokangwang aka Shelikhoo. I am currently one of the developers maintaining the V2Ray(V2Fly) project. I can provide more information about V2Ray projects and answer questions about it.
V2Ray is designed to be a platform on which users can design their protocol stack to suit their own needs. One user might need a proxy that shelters them from the most formidable adversary and decide to use WebSocket + TLS + VMess over a CDN and nginx. Another might just need to fool an ISP's QoS that throttles encrypted traffic, using Fake HTTP Header + VMess. Another might just want to convert a SOCKS5 proxy into an HTTP proxy. V2Ray covers them all. This also means that not all tools provided in V2Ray are safe against the most advanced detection, although we aim to provide more tools that are.
V2Ray is currently being developed to be more undetectable, more secure, and suitable for more use cases.
Hi, thank you for your reply.
What's the relationship between V2Ray and VMess? I get the impression that VMess was the "original" protocol supported by V2Ray, and that V2Ray later grew to support more protocols. Is that right?
I see that V2Ray is the main project under Project V. What else is part of Project V?
VMess is a protocol designed by and for V2Ray. It is the primary "proxy" protocol used by V2Ray; more on that later. V2Ray was never designed to focus on one specific "proxy" protocol, but in practice VMess is used in most cases for communication between V2Ray instances, and other "proxy" protocols are used for communicating with third-party software that doesn't support the VMess protocol. Yes, V2Ray started with VMess and added support for more proxies and transports later.
So far V2Ray is the only primary project in Project V; put another way, Project V is an alias for V2Ray. Other projects, including domain-list-community, are designed to support V2Ray and are eventually integrated into V2Ray. We used to develop GUI clients for V2Ray, but with limited personnel, the task of developing GUI clients eventually shifted to third-party developers.
V2Ray purposefully does not support external transports that need to be executed. V2Ray allows configuration files to be read from the network, users will blindly copy configuration files without checking them, and V2Ray is often integrated into other software published on platforms that forbid developers from publishing software whose behavior can be updated independently of the software itself (in particular the Apple App Store and the Google Play Store). Instead, users manually configure tunnels and other proxies, run them alongside V2Ray, and change V2Ray's configuration file to send traffic to these tunnels or proxies. V2Ray has internal APIs that allow anyone to define a proxy or transport by implementing specific interfaces and registering it with V2Ray. But for practical reasons these kinds of changes get merged into V2Ray's mainline code and become a part of V2Ray. The WebSocket transport is one example. After I adapted the WebSocket transport to V2Ray's API, it was accepted into V2Ray's mainline codebase and has been used ever since.
In V2Ray, we define a "proxy" to be a way to communicate where traffic should go and what the traffic is. A "transport" is a way to communicate an arbitrary data stream, regardless of the content of that data. This allows users to make interesting combinations of them to suit their own needs.
We had a great discussion about V2Ray today. We were lucky to have several knowledgeable people in attendance, including @xiaokangwang, @studentmain, @DuckSoft, and @tomac4t. Here is a summary of the main points.
About next-generation protocols: for V2Ray, discussions are spread across https://github.com/v2ray/discussion/issues and https://github.com/v2ray/v2ray-core/issues ... There are so many proposals that you'll need some time to read all of them. For Shadowsocks, the proposal is in shadowsocks/shadowsocks-org#157
Here are some interesting ideas from their discussion:
Thanks for the excellent information about V2Ray. I lead the team that created the Outline VPN and I consider other protocols, even though our flavor of Shadowsocks is working, including in Iran, Turkmenistan and somewhat in China.
One issue I've found with many protocol options is the lack of support for UDP. Some of them don't support UDP at all other than DNS; others tunnel UDP over TCP, defeating the purpose of UDP.
Does the V2Ray suite offer options to proxy UDP over UDP without delivery guarantees? It seems all transport options are reliable channels.
Currently, V2Ray's support for UDP traffic is quite limited. There is no support for SOCK_DGRAM-like UDP traffic; UDP flows are treated like SOCK_SEQPACKET (SCTP-like). However, there are plans to support UDP traffic and integrate a protocol called VLite that is optimized for gaming. In fact, it is expected to be included in the next major functional update.
What kind of UDP usage would you advise V2Ray to support or optimize for? Is it gaming which requires low latency or torrenting (or QUIC) that requires high throughput?
The reason V2Ray deprioritized UDP-over-UDP support is that Chinese ISPs always try to sabotage UDP traffic. It often requires a specialized and expensive setup to get a UDP tunnel optimized for gaming to work as expected. Without such a setup, the packet loss imposed by many ISPs makes any UDP-based protocol ineffective. For downloading and web browsing, it is almost always better and more economical to just use TCP (a reliable connection) and get rid of head-of-line blocking by establishing more TCP (reliable) connections (by changing network.http.max-connections in browser settings).
@xiaokangwang We see significant UDP traffic in Outline servers, so UDP is indeed being used. I don't have numbers, but I imagine the main use for UDP is video calls and watching videos (e.g., YouTube). Gaming is probably relevant too. I believe none of those use cases work well on TCP.
Summary on Recently Discovered V2Ray Weaknesses
Date: Tuesday, June 16, 2020
This summary first appeared on GFW Report. We also maintain an up-to-date copy of the report on both net4people and ntc.party.
Several weaknesses were recently discovered in V2Ray, which could be used to identify V2Ray clients or servers that run the VMess, TLS or HTTP protocols. Below is our summary and understanding of these weaknesses.
In general, these weaknesses fall into three categories:
Replay Attacks against the VMess Protocol
The following table shows the structure of
VMess authenticates each request in two steps, using
First, the VMess server validates whether the timestamp in
Second, since the
VMess server indeed has a replay defense mechanism. In particular, the server records the (
Exploiting these weaknesses, many replay-based probes are creatively crafted to identify the VMess server. We introduce them below in separate sections.
Replays with padding length field changed
The malicious probe is a replay of the legitimate request, with many bytes changed as follows:
In total, the attacker makes 16 connections to the server. In each connection, the attacker:
If the values of M recorded across the 16 connections are 16 distinct integers whose maximum and minimum differ by exactly 15, then it is very likely that the server runs the VMess protocol.
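The decision rule above can be written down in a few lines. This sketch only covers the final check; it assumes the attacker has already collected one integer M per connection as described in the steps, and the function name is hypothetical.

```python
def looks_like_vmess(ms):
    """Apply the decision rule to the M values from 16 probe connections.

    Returns True when the values are 16 distinct integers whose
    maximum and minimum differ by exactly 15, i.e. they form a
    run of consecutive integers in some order.
    """
    values = list(ms)
    return (
        len(values) == 16
        and len(set(values)) == 16
        and max(values) - min(values) == 15
    )


print(looks_like_vmess(range(40, 56)))
print(looks_like_vmess([7] * 16))
```

The order in which the values were observed does not matter to the rule, which is why the check works on the set of values rather than on the sequence.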
The explanations of the attack are as follows:
Replays that trigger inconsistent draining behaviors
After the patches to defeat the probes above, @nametoolong found two more types of replay-based probes that can still detect VMess servers. Both of them are related to how the server closes the connection. Below, we introduce the first of them, and we leave the explanation of the second attack as an exercise for the reader.
@nametoolong described the probes and the behaviors of the server as follows:
The byte 48 (counting from 0) that got changed is the last byte of the
In this attack, the attacker intentionally triggers the replay defense, and expects the inconsistent behaviors of the servers when seeing the same (
V2Ray has actually been patched so that it closes connections after reading a random number of bytes within a certain range, or after waiting for a random amount of time within a certain range. However, this attack is possible because the draining methods are used inconsistently when different types of errors happen.
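As a rough illustration of the point about consistency, the sketch below routes every error type through one draining function, so the close behavior does not depend on which check failed. It is not V2Ray's code: the byte range, the chunk size, the error names, and the function names are all assumptions made for the example, and it only models the "random number of bytes" variant.

```python
import io
import random


def drain_then_close(stream, rng, min_bytes=512, max_bytes=4096, chunk=256):
    """Discard up to a randomly chosen number of bytes from `stream`.

    Returns the number of bytes discarded; the caller would close the
    connection afterwards. The range 512..4096 is an arbitrary choice.
    """
    budget = rng.randint(min_bytes, max_bytes)
    drained = 0
    while drained < budget:
        data = stream.read(min(chunk, budget - drained))
        if not data:  # the peer has nothing more to send
            break
        drained += len(data)
    return drained


def on_error(kind, stream, rng):
    """Handle any failed request the same way.

    `kind` (e.g. "bad_auth", "replay") is deliberately ignored: using
    one drain path for all errors is the whole point of the sketch.
    """
    return drain_then_close(stream, rng)


probe = io.BytesIO(b"\x00" * 10000)
print(on_error("replay", probe, random.Random()))
```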
@nametoolong thus suggested:
Although we do not know whether the GFW uses active probing against the VMess protocol, the attacks proposed above are feasible for the GFW. For example, it has been observed that the GFW is capable of sending replay-based probes with no delay or with an arbitrarily long delay. We will investigate whether the GFW uses active probing against the VMess protocol in future work. At the same time, it will save us a lot of time if users can report which V2Ray servers were blocked and with what settings.
It may be a good idea to use a replay defense mechanism for the
Frolov et al. found that various popular circumvention tools, including obfs4, Shadowsocks Outline, Psiphon's OSSH and Lantern's Lampshade, can be identified using the TCP flags and timing information when the servers close the connections. Frolov et al. thus suggested that servers should "forever read" on errors, so that the probers will be the first to close the connection. This not only reduces the information leaked by the server's timeout value, but also lets the server close the connection with FIN/ACK consistently (see Fig. 1 here for more details).
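A minimal sketch of the "forever read" idea, assuming a plain blocking socket: on an error the server stops responding and simply discards input until the peer closes first. The function name is hypothetical, and a real server would likely need resource limits that this sketch leaves out.

```python
import socket


def read_until_peer_closes(conn, bufsize=4096):
    """Discard everything the peer sends until the peer closes.

    Returns the number of bytes discarded. Because the server never
    closes first and never times out, the prober is the side that
    ends the connection.
    """
    discarded = 0
    while True:
        data = conn.recv(bufsize)
        if not data:  # empty read: the peer closed its side
            return discarded
        discarded += len(data)


# Local demonstration with a connected socket pair standing in for
# the server side (`server`) and the prober (`prober`).
server, prober = socket.socketpair()
prober.sendall(b"probe bytes")
prober.close()
print(read_until_peer_closes(server))
server.close()
```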
Unique TLS ClientHello Fingerprints
On May 30, 2020, @p4gefau1t reported that V2Ray clients sent TLS ClientHello messages with highly distinctive fingerprints. Such fingerprints not only gave a censor the opportunity to identify V2Ray clients and servers, but also allowed a censor to accurately block TLS traffic from V2Ray without much collateral damage.
@p4gefau1t further identified that these unique fingerprints were partially caused by the use of a hardcoded ciphersuite. Specifically, this rarely seen ciphersuite would be used,
V2Ray developer @xiaokangwang mitigated this weakness by using the default settings of go-tls library since
To the best of our knowledge, as early as November 2019, @klzgrad had already investigated the fingerprints of V2Ray v4.21.3 as well as many other TLS-based circumvention tools. The results show that most of them have rarely seen TLS ClientHello fingerprints.
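To show why a hardcoded ciphersuite list matters, here is a simplified sketch of how a ClientHello could be reduced to a fingerprint. The choice of fields and the hashing scheme are assumptions made for illustration and do not reproduce any particular fingerprinting tool; the two ciphersuite lists are hypothetical and are not the lists V2Ray or any browser actually sends.

```python
import hashlib


def clienthello_fingerprint(tls_version, cipher_suites, extensions):
    """Hash a few ClientHello fields, keeping their on-the-wire order.

    A client that sends an unusual list, or a usual list in an unusual
    order, ends up with a fingerprint that few other clients share.
    """
    fields = [
        str(tls_version),
        "-".join(str(c) for c in cipher_suites),
        "-".join(str(e) for e in extensions),
    ]
    return hashlib.sha256(",".join(fields).encode("ascii")).hexdigest()


# Hypothetical lists built from real ciphersuite code points
# (0x1301 TLS_AES_128_GCM_SHA256, 0xC02B and 0xC02F ECDHE AES-128-GCM).
common_list = [0x1301, 0xC02B, 0xC02F]
hardcoded_list = [0xC02F, 0xC02B, 0x1301]  # same suites, different order

extensions = [0, 10, 11, 13]  # server_name, supported_groups, ec_point_formats, signature_algorithms
print(clienthello_fingerprint(0x0303, common_list, extensions))
print(clienthello_fingerprint(0x0303, hardcoded_list, extensions))
```

Even though both lists contain the same suites, the ordering alone yields a different fingerprint, which is the kind of difference a censor can match on.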
Failed to Mimic the HTTP Server
Since the parrot has been dead since 2013, instead of reviving the parrot, using a real HTTP engine may be a more promising solution here. Many circumvention tools have been using the idea of
All credit goes to the authors of the corresponding works.
We want to thank @studentmain and @p4gefau1t for helping us understand their proposed replay attacks, and for sharing their inspiring thoughts on the future works. We are also grateful to David Fifield and @studentmain for offering detailed feedback on a draft of this summary.
We will investigate whether the GFW uses active probing against the VMess protocol in future work. At the same time, it will save us a lot of time if you, as a user, can report which circumvention services were blocked and with what settings. We encourage you to share your comments publicly or privately. Our private contact information can be found at the footer of GFW Report.
Since I am mentioned here, it seems I'm obligated to say a few words.
I think there is some dysfunction in the community. It was much earlier than November 2019 when I raised the issue of unique ClientHellos. In Jan 2019 I posted a Twitter thread criticizing reinventing crypto, HTTP parroting, etc. in Shadowsocks and V2Ray, and a user re-reported it to V2Ray. It was not well received by the developers.
It got to the point where it was a glaring issue, yet obviously nobody was going to take action, which forced me to materialize the ClientHello signature report of Nov 2019, as it's long established from the Tor Project's practice that the cipher list in ClientHello is being actively probed by the GFW. Unfortunately this report also seemed to have yielded no actionable results. In one discussion several months ago, half the people were skeptical about it.
I think most of these could be avoided with a minimum amount of literature reading, but in reality an adequate level of awareness is not achieved unless hard, working exploits are being posted publicly. Most of the exploits listed above are well-predicted from existing literature: fuzz your protocol, check your ClientHello, don't parrot. (I don't think V2Ray is really fuzzed enough with proper tooling?)
I did have incentives not to publicize these issues the whole time, unlike the talented authors above. Materializing these issues from principle would be paradoxically detrimental to the logistics of circumvention work, because the time and energy invested into materializing these exploits do not produce new and diverse circumvention tools, while materializing and publicizing them strictly reduces the amount of research overhead on the adversarial side. This argument is technically very wrong, but it's argued for the sake of logistics, given that the mentions of V2Ray airports already point to a shift of design thinking from pure academic exercise to practical ecosystem survivability with economic considerations (i.e. the proverbial "worse is better").
This incident was a wake-up call to the entire community. I'm sure the community will be more attentive to every report on security afterwards.
Also, the incident allowed the community to refine its security breach reporting mechanism, which allows developers to patch a problem before the details are published.
Moreover, I believe that few developers would intentionally leave a backdoor in their code. The times limited developers' horizons. Back in 2016, Frolov's great paper did not exist yet. The developers may not have realized the fingerprint problem, and they probably didn't mean to create this "bug".
In a word, spreading knowledge is as important as fixing the problem. Even now, ordinary people don't know enough about TLS fingerprints. They don't know what having a unique fingerprint means. I would like to pay respect to all the people who've organised these things into documents for people. Your work is as important as the developers'.
Thank you all.
Corrected: TLS fingerprinting was nothing new in 2016. It was discussed in Ensafi2015b and Fifield2015a, and was mentioned in Dingledine2006a. (But Sergey's tlsfingerprint.io is the most interesting one.) I think it's the common problem of non-academic circumvention systems: nobody follows the latest papers.
I would like to quote dcf's words in IRC, "people have to see something for themselves to believe it." I agree with this. If you tell them how unique a TLS fingerprint is, they don't pay much attention to it. But if you provide a PoC or something like that, they will treat it as a big problem. (In this case, @DuckSoft use