Introduction to V2Ray #36

Open
wkrp opened this issue Jun 4, 2020 · 16 comments
Comments

@wkrp
Collaborator

wkrp commented Jun 4, 2020

At the next Tor anti-censorship team reading group (Thursday June 11 at 16:00 UTC), we are going to be discussing V2Ray. The members of the team are not very familiar with V2Ray, and we want to broaden our understanding.

Here are some of my preliminary notes on V2Ray. I hope that some readers who are more familiar with V2Ray will be able to correct my misunderstandings and provide more detail. For example, I'm unsure about the relationship between V2Ray and VMess—they seem to have some historical relationship, but I'm not sure.

V2Ray is not a protocol or circumvention system by itself. Rather, V2Ray is a platform or framework that allows you to run one or more proxies, with various layered proxy protocols, transports, and obfuscation. For example, you could run SOCKS-in-TLS on one port, and VMess-in-QUIC (with the QUIC packets optionally obfuscated) on another port. On the client side, you can configure routing to control which traffic should use which proxy, or should not be proxied at all.

At the lowest level, V2Ray supports a variety of proxy protocols, some inherently obfuscated and some not:

There's an optional mux (multiplex) layer to tunnel multiple streams through one proxy connection.

The proxy protocols are not inherently implemented over any particular kind of network connection. Instead, you must specify a transport for each:

Any of the transport layers may optionally have a layer of TLS applied to them. The TLS option is obligatory with the HTTP/2 and QUIC transports.

Finally, at the highest level, some transports support additional, optional obfuscation options:

The V2Ray model provides a lot of flexibility. You could set up an unauthenticated SOCKS proxy without any encryption, or you could set up VMess open only to authorized users, tunneled through WebSocket with TLS.
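
To make the layering concrete, here is a minimal sketch of the second setup (VMess for authorized users, over WebSocket, with TLS), written as a Python dict that serializes to a JSON config. The field names (`inbounds`, `settings.clients`, `streamSettings`, `wsSettings`, `tlsSettings`) are my recollection of the V2Ray v4 JSON format and should be checked against the official documentation; the UUID, port, path, and file names are placeholders, and the `tcp`/`none` fallbacks in `describe_layers` are assumed defaults.

```python
import json

# Hypothetical server config: proxy = VMess, transport = WebSocket, plus TLS.
# Field names are recalled from the V2Ray v4 JSON format (an assumption);
# every value below is a placeholder.
server_config = {
    "inbounds": [{
        "port": 443,
        "protocol": "vmess",                      # proxy layer
        "settings": {
            "clients": [{"id": "00000000-0000-0000-0000-000000000001"}],
        },
        "streamSettings": {
            "network": "ws",                      # transport layer
            "security": "tls",                    # optional TLS layer
            "wsSettings": {"path": "/example"},
            "tlsSettings": {
                "certificates": [{
                    "certificateFile": "/path/to/cert.pem",
                    "keyFile": "/path/to/key.pem",
                }],
            },
        },
    }],
    # "freedom" means no further proxying on the way out.
    "outbounds": [{"protocol": "freedom"}],
}


def describe_layers(inbound):
    """Return (proxy, transport, security) for one inbound entry."""
    stream = inbound.get("streamSettings", {})
    # Assumed defaults when streamSettings is absent: plain TCP, no TLS.
    return (inbound["protocol"],
            stream.get("network", "tcp"),
            stream.get("security", "none"))


print(json.dumps(server_config, indent=2))
print(describe_layers(server_config["inbounds"][0]))
```

Swapping `"protocol"` or `"network"` changes one layer without touching the others, which is the flexibility described above.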

@wkrp
Collaborator Author

wkrp commented Jun 4, 2020

I found a V2Ray Beginner's Guide. Also on GitHub at https://github.com/v2fly/v2ray-step-by-step.

What is the difference between V2Ray and Shadowsocks?

The difference is still that Shadowsocks is just a simple proxy tool; it is a protocol of encryption. However, V2Ray is designed as a platform, and any developer can use the modules provided by V2Ray to develop new proxy software.

Anyone familiar with the history of Shadowsocks knows that clowwindy originally developed it for personal use, with the aim of crossing the firewall and censorship easily and efficiently. Shadowsocks was used as a private proxy protocol for a long time before clowwindy made it open source. V2Ray, by contrast, was developed by the Project V team as a protest after clowwindy was threatened by the Chinese government.

Due to the different historical backgrounds at birth, they have different features.

Simply put, Shadowsocks is a single proxy protocol, while V2Ray is more complicated than a single-protocol proxy. Does that sound bleak for Shadowsocks? Of course not! From another point of view, Shadowsocks is easy to deploy, whereas V2Ray requires more complicated configuration.

Since V2Ray is more complicated, why use it?

Advantages and disadvantages always come together. For instance, V2Ray has the following advantages:

  • A new and powerful protocol: V2Ray uses the new self-developed VMess protocol, which improves some of the existing shortcomings of Shadowsocks and is more difficult to detect by the firewall.
  • Better performance: better network performance; specific data can be found on the official V2Ray blog.
  • More features: The following are some of the features of V2Ray:
    • mKCP: KCP protocol implementation on V2Ray, you don't need to install another kcptun.
    • Dynamic port: dynamically change the communication port to combat the speed limit of long-term large traffic port
    • Routing features: you can freely set the flow direction of the specified data packet, to block advertisements and enable anti-tracking
    • Outbound proxy, or say chain proxy, uses many links for better privacy
    • Obfuscation: similar to the obfuscation of ShadowsocksR; mKCP packets can also be obfuscated. Traffic is disguised as the traffic of other protocols, making inspection more difficult
    • WebSocket protocol: Only use WebSocket proxy, or for CDN intermediate proxy (better anti-blocking)
    • Mux: Multiplexing, further improving the concurrent performance of the proxy

VMess

The VMess protocol originated in and is used by V2Ray. Like Shadowsocks, it is designed to obfuscate internet traffic in order to evade the GFW's deep packet inspection. VMess is the primary protocol used for communication between server and client.

@xiaokangwang

xiaokangwang commented Jun 5, 2020

Hi, I am xiaokangwang aka Shelikhoo. I am currently one of the developers maintaining the V2Ray(V2Fly) project. I can provide more information about V2Ray projects and answer questions about it.

V2Ray is designed to be a platform on which users can design their own protocol stack to suit their needs. One user might need a proxy that shelters them from the most formidable adversary and decide to use WebSocket + TLS + VMess over a CDN and nginx. Another might just need to fool an ISP's QoS that throttles encrypted traffic, using Fake HTTP Header + VMess. Another might just want to convert a SOCKS5 proxy into an HTTP proxy. V2Ray covers them all. This also means that not all tools provided in V2Ray are safe against the most advanced detection, although we aim to provide more tools that are.

V2Ray is currently being developed to be more undetectable, more secure, and suitable for more use cases.

@wkrp
Collaborator Author

wkrp commented Jun 5, 2020

> Hi, I am xiaokangwang aka Shelikhoo. I am currently one of the developers maintaining the V2Ray(V2Fly) project. I can provide more information about V2Ray projects and answer questions about it.

Hi, thank you for your reply.

What's the relationship between V2Ray and VMess? I get the impression that VMess was the "original" protocol supported by V2Ray, and that V2Ray later grew to support more protocols. Is that right?

I see that V2Ray is the main project under Project V. What else is part of Project V?

What is the procedure for adding a new proxy or transport to V2Ray? For example, Tor has pluggable transports and Shadowsocks has SIP003. Is there something similar for V2Ray?

@xiaokangwang

xiaokangwang commented Jun 6, 2020

VMess is a protocol designed by and for V2Ray. It is the primary "proxy" protocol used by V2Ray (more on that later). V2Ray was never designed to focus on one specific "proxy" protocol, but in practice VMess is used in most cases for communication between V2Ray instances, while other "proxy" protocols are used for communicating with third-party software that doesn't support VMess. Yes, V2Ray started with VMess and added support for more proxies and transports later.

So far V2Ray is the only primary project in Project V; put another way, Project V is an alias for V2Ray. Other projects, including domain-list-community, are designed to support V2Ray and are eventually integrated into it. We used to develop GUI clients for V2Ray, but with limited personnel, the task of developing GUI clients eventually shifted to third-party developers.

V2Ray purposefully does not support external transports that need to be executed, because V2Ray allows configuration files to be read from the network, users will blindly copy configuration files without checking them, and V2Ray is often integrated into other software published on platforms that forbid software whose behavior can be updated independently of the software itself (in particular the Apple App Store and Google Play Store). Instead, users manually configure tunnels and other proxies, run them alongside V2Ray, and change V2Ray's configuration file to send traffic to these tunnels or proxies. V2Ray has internal APIs that allow anyone to define a proxy or transport by implementing specific interfaces and registering it with V2Ray. But for practical reasons, these kinds of changes get merged into V2Ray's mainline code and become a part of V2Ray. The WebSocket transport is one example: after I adapted the WebSocket transport to V2Ray's API, it was accepted into V2Ray's mainline codebase and has been used ever since.

@xiaokangwang

xiaokangwang commented Jun 6, 2020

In V2Ray, we define a "proxy" as a way to communicate where traffic should go and what the traffic is. A "transport" is a way to carry an arbitrary data stream regardless of its content. This allows users to combine them in interesting ways to suit their own needs.
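
The split can be illustrated with a toy model. This is only a sketch of the idea, not V2Ray's internal API: `ToyProxy`, `plain_transport`, `xor_transport`, and `toy_server_receive` are hypothetical names, and the XOR "transport" is a stand-in with no security value. The proxy layer says where the traffic should go; the transport layer just moves opaque bytes.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical model: a transport maps bytes to bytes and never looks inside.
Transport = Callable[[bytes], bytes]


def plain_transport(data: bytes) -> bytes:
    """Stand-in for a plain TCP-like stream: bytes pass through unchanged."""
    return data


def xor_transport(data: bytes) -> bytes:
    """Stand-in for an obfuscating transport (self-inverse, NOT secure)."""
    return bytes(b ^ 0x5A for b in data)


@dataclass
class ToyProxy:
    """Proxy layer: prepends the destination, then hands off to a transport."""
    transport: Transport

    def send(self, host: str, port: int, payload: bytes) -> bytes:
        header = f"{host}:{port}\n".encode()
        return self.transport(header + payload)


def toy_server_receive(wire: bytes, inverse: Transport) -> Tuple[str, int, bytes]:
    """Undo the transport, then read the destination the proxy layer added."""
    data = inverse(wire)
    header, _, payload = data.partition(b"\n")
    host, _, port = header.decode().rpartition(":")
    return host, int(port), payload


for transport in (plain_transport, xor_transport):
    wire = ToyProxy(transport).send("example.com", 443, b"hello")
    print(toy_server_receive(wire, transport))
```

The same `ToyProxy` works unchanged over either transport, which is the property that lets users mix and match the two layers.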

@wkrp
Collaborator Author

wkrp commented Jun 11, 2020

We had a great discussion about V2Ray today. We were lucky to have several knowledgeable people in attendance, including @xiaokangwang, @studentmain, @DuckSoft, and @tomac4t. Here is a summary of the main points.

  • V2Ray (or V2Fly) is a platform for deploying proxy protocols. It is possible to configure it for circumvention purposes.
    • The V2Fly name exists because the team does not have full management access to the V2Ray domain name and GitHub organization. Currently, releases are being mirrored at both https://github.com/v2ray/ and https://github.com/v2fly, but V2Fly is where future development is going to happen.
    • There is a large and active development community around it.
  • V2Ray has an interesting layered architecture.
    • It's based on a general notion of "inbounds" and "outbounds". On the client, you may have a "socks" inbound and a "vmess" outbound. On the server, you may have a "vmess" inbound and a "freedom" outbound ("freedom" means no further proxying).
    • Each inbound or outbound is divided into "proxy" and "transport" layers. You can compose any proxy (SOCKS, Shadowsocks, VMess...) with any transport (TCP, QUIC, WebSocket...). Some transports have their own additional obfuscation options.
    • You can chain proxies as proxy | proxy | proxy | ... | transport, but only the final hop can use a transport other than TCP.
    • Composition of this kind is difficult, because different pieces may accept/offer different interfaces. The proxy/transport split in V2Ray is pretty sensible and clean. It basically means that in place of a TCP connection, you can substitute any other reliable stream abstraction. You can still end up with some weird effects, though, like how in some configurations it's the proxy that's responsible for obfuscation, while in others it's the transport.
  • VMess is the original proxy protocol invented by the V2Ray project. It is an authenticated look-like-nothing protocol similar to ScrambleSuit, obfs4, or Shadowsocks.
    • There is a dynamic port feature where the VMess server can instruct the client to use a different port. This is to avoid concentrating a large amount of unidentifiable traffic at one port for a long time. (The English version of the protocol description does not seem to document this feature.)
    • In the past couple of weeks, there have been several changes made in VMess as a result of recently discovered replay / active probing attacks.
    • Starting with v4.24.2 (released yesterday), VMess has an experimental AEAD mode, being tested as future-proof resistance to these kinds of attacks. You can find details in extra-VMessAEADdoc.zip.
    • It is not known for sure whether the GFW already knew about the flaws in VMess and was actively exploiting them. VMess servers are known to receive active probing, though it's not known for sure whether the probes were targeting VMess or some other protocol (Shadowsocks, for example, which also gets active-probed).
    • The mKCP transport reportedly gets detected and blocked within a day. The v4.24.2 release now encrypts the mKCP layer to try to resist this detection.
  • In China, there is a commercial market of resellers who set up proxy servers and charge money for subscription access to them. Such vendors are known as 机场, "airports," probably because the Shadowsocks logo is a paper airplane. V2Ray is taking over in the commercial circumvention market from Shadowsocks. There is a perception that airports based on V2Ray or trojan are more up-to-date technically than those using Shadowsocks.
    • V2Ray users either set up their own server, or pay one of these commercial providers.
    • The reseller model is an interesting contrast to the BridgeDB model used in Tor. The resellers are naturally distributed, and have a financial motive to run proxies and keep them unblocked. It costs money to learn proxy addresses, which, in principle, makes them harder to enumerate. There is a cycle of airports starting up, getting popular, getting blocked, then starting over somewhere else or being replaced by another operator.
  • V2Ray and Shadowsocks are designing a next-generation protocol. (There weren't other details on this point.)
  • V2Ray developers would appreciate external attention on the code and protocols. The current maintainers are not the original developers and cannot find everything themselves.
  • There is some desire to share modular transports across circumvention systems, like SIP003 in Shadowsocks and pt-spec in Tor.

@ghost

ghost commented Jun 13, 2020

About next-generation protocol. For v2ray, discussions are spread in https://github.com/v2ray/discussion/issues and https://github.com/v2ray/v2ray-core/issues ... There are so many proposals, you'll need some time to read all of them. For shadowsocks, their proposal is in shadowsocks/shadowsocks-org#157

Here are some interesting ideas from their discussions:

  • Do key exchange and handshake with a key server, maybe using a REST API or something similar, so the handshake fingerprint can be hidden in another channel.
  • Design a "protocol generator" that programmatically generates many protocols.
  • Should the new protocol be based on TLS? There are many arguments about it.

@fortuna

fortuna commented Jun 15, 2020

Thanks for the excellent information about V2Ray. I lead the team that created the Outline VPN and I consider other protocols, even though our flavor of Shadowsocks is working, including in Iran, Turkmenistan and somewhat in China.

One issue I've found with many protocol options is the lack of support for UDP. Some of them just don't support UDP at all other than DNS, some options tunnel UDP over TCP, defeating the purpose of UDP.

Does the V2Ray suite offer options to proxy UDP over UDP without delivery guarantees? It seems all transport options are reliable channels.

@xiaokangwang

xiaokangwang commented Jun 16, 2020

> Thanks for the excellent information about V2Ray. I lead the team that created the Outline VPN and I consider other protocols, even though our flavor of Shadowsocks is working, including in Iran, Turkmenistan and somewhat in China.
>
> One issue I've found with many protocol options is the lack of support for UDP. Some of them just don't support UDP at all other than DNS, some options tunnel UDP over TCP, defeating the purpose of UDP.
>
> Does the V2Ray suite offer options to proxy UDP over UDP without delivery guarantees? It seems all transport options are reliable channels.

Currently, V2Ray's support for UDP traffic is quite limited. There is no support for SOCK_DGRAM-like UDP traffic; UDP packets are treated like SOCK_SEQPACKET (SCTP-like). However, there are plans to support UDP traffic and to integrate a protocol called VLite that is optimized for gaming. In fact, it is expected to be included in the next major functional update.

What kind of UDP usage would you advise V2Ray to support or optimize for? Is it gaming which requires low latency or torrenting (or QUIC) that requires high throughput?

The reason V2Ray deprioritized UDP-over-UDP support is that Chinese ISPs always try to sabotage UDP traffic. It often requires a specialized and expensive setup to get a UDP tunnel optimized for gaming to work as expected, and without such a setup, the packet loss imposed by many ISPs makes any UDP-based protocol ineffective. For downloading and web browsing, it is almost always better and more economical to just use TCP (a reliable connection) and to get rid of head-of-line blocking by establishing more TCP connections (by changing network.http.max-connections in browser settings).

@outliners

outliners commented Jun 16, 2020

@xiaokangwang I wish everyone there safety. Would it be better to keep personal information cloaked?

@fortuna

fortuna commented Jun 16, 2020

@xiaokangwang We see significant UDP traffic in Outline servers, so UDP is indeed being used. I don't have numbers, but I imagine the main use for UDP is video calls and watching videos (e.g., YouTube). Gaming is probably relevant too. I believe none of those use cases work well on TCP.

@gfw-report

gfw-report commented Jun 16, 2020

Summary on Recently Discovered V2Ray Weaknesses

Authors: Anonymous

Date: Tuesday, June 16, 2020

中文版: 总结近期发现的V2Ray弱点

This summary first appeared on GFW Report. We also maintain an up-to-date copy of the report on both net4people and ntc.party.


Several weaknesses were recently discovered in V2Ray that could be used to identify V2Ray clients or servers that run the VMess, TLS or HTTP protocols. Below is our summary and understanding of these weaknesses.

In general, these weaknesses fall into three categories:

  • Inappropriate authentications in VMess, making the servers vulnerable to replay attacks.
  • Hardcoded unique ciphersuites, leading to the rarely-seen fingerprints of the TLS ClientHello messages.
  • Failed attempt to parrot/mimic the HTTP server.

Replay Attacks against the VMess Protocol

As introduced in the specification (English version) of the VMess protocol, a VMess request looks like this:

| 16 bytes | X bytes | Other Parts |
|---|---|---|
| Authentication Credential | Command | Data |

  • The 16-byte Authentication Credential is an HMAC associated with the user ID and a UTC timestamp.
  • The Command is encrypted using AES-128-CFB(iv, key), where the iv is the MD5 hash of the UTC timestamp, and the key is the preshared one associated with the user ID.
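
A minimal sketch of these two derivations, and of why a timestamp-only check admits replays, might look as follows. Several details are assumptions rather than facts from this summary: HMAC-MD5 as the 16-byte HMAC, the 8-byte big-endian timestamp encoding, the exact input to the IV hash, and the `window` parameter of the hypothetical `server_accepts` helper. Consult the VMess specification for the real encodings.

```python
import hashlib
import hmac
import struct
import uuid


def auth_credential(user_id: bytes, ts: int) -> bytes:
    # Assumption: HMAC-MD5 keyed by the 16-byte user ID over an
    # 8-byte big-endian UNIX timestamp (16-byte output).
    return hmac.new(user_id, struct.pack(">Q", ts), hashlib.md5).digest()


def command_iv(ts: int) -> bytes:
    # Assumption: MD5 over the encoded timestamp; the real protocol may
    # feed the hash a differently encoded input.
    return hashlib.md5(struct.pack(">Q", ts)).digest()


def server_accepts(user_id: bytes, credential: bytes, now: int,
                   window: int = 120) -> bool:
    """Toy check: accept any credential whose timestamp is within `window`."""
    return any(
        hmac.compare_digest(credential, auth_credential(user_id, t))
        for t in range(now - window, now + window + 1)
    )


user_id = uuid.UUID("00000000-0000-0000-0000-000000000001").bytes  # placeholder
ts = 1_591_000_000
cred = auth_credential(user_id, ts)

print(len(cred), len(command_iv(ts)))
# A recorded credential is still accepted a minute later (replay):
print(server_accepts(user_id, cred, ts + 60))
# ...but not once the window has passed:
print(server_accepts(user_id, cred, ts + 1000))
```

Nothing in the credential itself is single-use, so anything recorded inside the window can be replayed unless the server adds a separate replay defense.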

The following table shows the structure of Command after decryption:

| 1 byte | 16 bytes | 16 bytes | 1 byte | 1 byte | 4 bits | 4 bits | 1 byte | 1 byte | 2 bytes | 1 byte | N bytes | P bytes | 4 bytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Version | Encryption IV | Encryption Key | Response Auth V | Options | Margin P | Encrypt Method | Reserved | CMD | Port | Address Type | Address | Random Value | Checksum F |

  • The Encryption IV and the Encryption Key are used to decrypt Data, not Command.
  • The Margin P and Random Value are used as a padding scheme. Specifically, the 4-bit Margin P specifies the length of the Random Value to be between 0 and 15 bytes.
  • The Checksum F, serving as a MAC, should be the FNV1a hash of all plaintext in Command, excluding itself.
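
The layout above can be sketched in a few lines. FNV-1a (32-bit) is standard; everything else is an illustrative assumption: big-endian Port and Checksum, Margin P in the high nibble of its byte (following the column order of the table), and the default field values in the hypothetical `build_command` helper.

```python
import struct

# Bytes before Address: 1 + 16 + 16 + 1 + 1 + 1 (two nibbles) + 1 + 1 + 2 + 1
FIXED_LEN = 41
MARGIN_OFFSET = 35  # byte holding Margin P (high nibble) and Encrypt Method


def fnv1a32(data: bytes) -> int:
    """32-bit FNV-1a hash."""
    h = 0x811C9DC5
    for b in data:
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
    return h


def build_command(iv: bytes, key: bytes, margin_p: int, address: bytes,
                  padding: bytes, port: int = 443, method: int = 0,
                  version: int = 1, cmd: int = 1, addr_type: int = 1) -> bytes:
    """Assemble a plaintext Command; default field values are placeholders."""
    assert len(iv) == 16 and len(key) == 16
    assert 0 <= margin_p <= 15 and len(padding) == margin_p
    body = (bytes([version]) + iv + key
            + bytes([0, 0])                        # Response Auth V, Options
            + bytes([(margin_p << 4) | method])    # Margin P | Encrypt Method
            + bytes([0, cmd])                      # Reserved, CMD
            + struct.pack(">H", port) + bytes([addr_type])
            + address + padding)
    return body + struct.pack(">I", fnv1a32(body))


def checksum_offset(plaintext: bytes, addr_len: int) -> int:
    """The reader must trust Margin P to know where Checksum F starts."""
    margin_p = plaintext[MARGIN_OFFSET] >> 4
    return FIXED_LEN + addr_len + margin_p


cmd_bytes = build_command(bytes(16), bytes(16), margin_p=5,
                          address=bytes(4), padding=bytes(5))
off = checksum_offset(cmd_bytes, addr_len=4)
print(len(cmd_bytes), off)
```

Note that `checksum_offset` reads Margin P before anything has been verified, which is the unauthenticated length field the following sections exploit.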

Inappropriate authentication

On May 31, 2020, @p4gefau1t reported that VMess servers could be identified by replay-based active probing, due to the inappropriate authentications.

VMess authenticates each request in two steps, using Authentication Credential and checksum. Unfortunately both of them can be circumvented.

First, the VMess server validates whether the timestamp in Authentication Credential is expired. The expiration time is 120 seconds at maximum and 60 seconds on average (see here and here for implementation details). That is to say, an attacker can record and replay a legitimate Authentication Credential within around 60 seconds to bypass this authentication.

Second, since the aes-cfb used to encrypt the Command does not provide any authentication, a MAC-then-Encrypt mechanism is used. As pointed out by @p4gefau1t, VMess fell into the same pitfall as Shadowsocks OTA mode did (See the English summary on the weakness of Shadowsocks OTA mode here). Specifically, since the length of the Random Value varies, the server will not be able to know where the Checksum F (MAC) is located, unless it blindly trusts the value in Margin P without any authentication (see here for implementation details). In other words, only after reading P+4 bytes, V2Ray will be able to validate whether the decrypted content is legal. If not legal, the V2Ray server will close the connection.

The VMess server does have a replay defense mechanism. In particular, the server records the (Encryption IV, Encryption Key) of each request, regardless of the validity of the request, and closes the connection immediately when the (Encryption IV, Encryption Key) has been seen before. Depending on her needs, an attacker can:

  1. bypass this replay defense by altering the (ciphertext of the) Encryption IV or Encryption Key;
  2. or intentionally trigger the replay defense, in order to observe how differently the server behaves when it sees the same (Encryption IV, Encryption Key) for the first time versus subsequent times.

Exploiting these weaknesses, many replay-based probes are creatively crafted to identify the VMess server. We introduce them below in separate sections.

Replays with padding length field changed

Based on @p4gefau1t's findings, @studentmain proposed and @p4gefau1t improved an attack to identify the VMess servers. For simplicity, below we present this attack in a slightly different way.

The malicious probe is a replay of the legitimate request, with many bytes changed as follows:

| 16 bytes | 41 bytes | M bytes |
|---|---|---|
| Auth Info | Malicious Incomplete Command | Zeros |

The Malicious Incomplete Command includes:

| 1 byte | 16 bytes | 16 bytes | 1 byte | 1 byte | 4 bits | 4 bits | 1 byte | 1 byte | 2 bytes | 1 byte |
|---|---|---|---|---|---|---|---|---|---|---|
| Version | Encryption IV | Encryption Key | Response Auth V | Options | Margin P | Encrypt Method | Reserved | CMD | Port | Address Type |

In total, the attacker makes 16 connections to the server. In each connection, the attacker:

  1. first sends a replay of the first 16 + 41 bytes of the legitimate connection, with both the last byte of the Encryption Key and the 4-bit Margin P changed to a value different from the ones in other connections;
  2. then sends M bytes of zero (or random) data one byte per second, until the server closes the connection.

If the Ms recorded across the 16 connections form a list of distinct integers whose maximum and minimum differ by 15, then it is very likely that the server runs the VMess protocol.
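
The decision rule is simple enough to write down directly. In this sketch, `looks_like_vmess` implements the rule as stated, and `simulated_ms` is a hypothetical stand-in for the measurements, using the relationship M = N + P + 4 given in the explanation that follows.

```python
import random


def looks_like_vmess(ms):
    """16 measurements, all distinct, spanning a range of exactly 15."""
    return len(ms) == 16 and len(set(ms)) == 16 and max(ms) - min(ms) == 15


def simulated_ms(addr_len):
    # Toy model: before closing, the server reads the N-byte address,
    # P bytes of padding, and the 4-byte checksum, for each P in 0..15.
    return [addr_len + p + 4 for p in range(16)]


ms = simulated_ms(addr_len=4)
random.shuffle(ms)  # the order in which Margin P values are tried is irrelevant
print(looks_like_vmess(ms))
print(looks_like_vmess([8] * 16))  # a server that always reads 8 bytes
```

A server that closes after a fixed or random number of bytes is very unlikely to produce 16 distinct consecutive values, which is what makes the signal strong.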

The explanations of the attack are as follows:

  • To circumvent the authentication based on Auth Info, the attacker replays an Auth Info sent by the legitimate client within around 60 seconds.
  • To circumvent the replay defense based on (Encryption IV, Encryption Key), the attacker uses a different value of the Encryption Key in each connection.
  • To avoid bit errors propagating to the Margin P, the attacker carefully chooses the last byte of the Encryption Key to alter, because this byte happens to be within the same 16-byte cipher block as the Margin P. (Note that bit error propagation in AES-128-CFB with full-block feedback works as follows: changing a bit in cipher block Ci changes 1) the corresponding bit in plaintext block Pi, and 2) causes random bit errors in the following block Pi+1, after which decryption resynchronizes.)
  • The attacker then exploits the malleability of the stream cipher to enumerate all possible values of the 4-bit Margin P in 16 connections.
  • After reading the 16+41 bytes, the server waits for the Address, Paddings and Checksum before closing the connection due to checksum error. Thus, the M measured here is actually N-byte address + P-byte padding + 4-byte checksum.
  • The attacker can thus infer the value of Margin P from M because the Paddings is the only field with varied length. (The length of the Address is a fixed value, because the address type is not changed.)
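
The malleability this attack relies on can be demonstrated with a toy CFB implementation. CFB only ever uses the block cipher in the forward direction, so this sketch substitutes truncated HMAC-SHA256 for AES-128 to stay within the standard library; that substitution is an assumption made for illustration and is not what V2Ray uses. Offsets 32 and 35 are the last byte of the Encryption Key and the byte holding Margin P, counted from the start of the Command.

```python
import hashlib
import hmac

BLOCK = 16


def block_fn(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES-128 block encryption (assumption, for illustration).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(plaintext), BLOCK):
        c = xor(plaintext[i:i + BLOCK], block_fn(key, prev))
        out += c
        prev = c  # full-block feedback: next keystream depends on this block
    return out


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(ciphertext), BLOCK):
        c = ciphertext[i:i + BLOCK]
        out += xor(c, block_fn(key, prev))
        prev = c
    return out


def flip(data: bytes, offset: int, mask: int) -> bytes:
    out = bytearray(data)
    out[offset] ^= mask
    return bytes(out)


key, iv = b"k" * 16, b"i" * 16
plaintext = bytes(range(80))  # five 16-byte blocks
ciphertext = cfb_encrypt(key, iv, plaintext)
# Alter the last key byte (offset 32) and the Margin P nibble (offset 35).
tampered = flip(flip(ciphertext, 32, 0x01), 35, 0x70)
recovered = cfb_decrypt(key, iv, tampered)
changed = [i for i in range(len(plaintext)) if recovered[i] != plaintext[i]]
print(changed)
```

In this model the attacker's XOR masks land exactly on offsets 32 and 35 of the plaintext, while the garbling is confined to the block that follows (offsets 48 to 63), which in the real request falls on the Address, padding, and checksum region that the probe never completes anyway.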

Replays that trigger inconsistent draining behaviors

After the patches that defeated the probes above, @nametoolong found two more types of replay-based probes that can still detect VMess servers. Both of them are related to how the server closes the connection. Below, we introduce the first of them, and we leave the explanation of the second attack as an exercise for the reader.

@nametoolong described the probes and the behaviors of the server as follows:

    Vector 1:
    Let M1 be the first 54 bytes of a valid session.
    Let M2=M1. Tamper with M2[48] (i.e. alter the 49th byte of M2).
    Replay M1. Connection is closed immediately.
    Replay M2. Connection is not closed.
    Replay M2 again. Connection is closed immediately.

The byte 48 (counting from 0) that got changed is the last byte of the Encryption Key.

In this attack, the attacker intentionally triggers the replay defense and looks for inconsistent server behavior between the first time and subsequent times it sees the same (Encryption IV, Encryption Key). The detailed explanations are as follows:

  1. Since the (Encryption IV, Encryption Key) in M1 is the same as the one in the legitimate connection, the server will detect this replay attack and thus close the connection immediately.
  2. The first time M2 is sent, the server has never seen the altered (Encryption IV, Encryption Key), so the probe bypasses the replay defense. The server thus waits for more bytes to come, rather than closing the connection.
  3. The second time M2 is sent, the server has seen the same (Encryption IV, Encryption Key) before, so it closes the connection immediately.
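
The three steps can be reproduced against a toy model of the behaviour described here. `ToyServer` is hypothetical and is not V2Ray code: it skips encryption entirely and treats stream offsets 17 to 48 (after the 16-byte Auth Info and the 1-byte Version) as the (Encryption IV, Encryption Key) pair the server remembers.

```python
AUTH_LEN = 16
# Stream offsets of Encryption IV (16 bytes) + Encryption Key (16 bytes).
PAIR = slice(AUTH_LEN + 1, AUTH_LEN + 1 + 32)


class ToyServer:
    """Remembers every (IV, Key) pair; reacts differently to repeats."""

    def __init__(self):
        self.seen_pairs = set()

    def handle(self, stream_prefix: bytes) -> str:
        pair = stream_prefix[PAIR]
        if pair in self.seen_pairs:
            return "close immediately"   # replay defense fires
        self.seen_pairs.add(pair)
        return "keep reading"            # waits for the rest of the Command


def vector1(server: ToyServer, m1: bytes):
    m2 = bytearray(m1)
    m2[48] ^= 0xFF                       # last byte of the Encryption Key
    m2 = bytes(m2)
    return [server.handle(m1), server.handle(m2), server.handle(m2)]


server = ToyServer()
m1 = bytes(range(54))                    # first 54 bytes of a "valid session"
server.handle(m1)                        # the legitimate connection itself
print(vector1(server, m1))
# ['close immediately', 'keep reading', 'close immediately']
```

A server that does not speak VMess has no reason to answer the same bytes differently on the second attempt, so the close/wait/close pattern is the fingerprint.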

V2Ray has actually been patched so that it closes connections after reading a random number of bytes within a certain range, or after waiting for a random amount of time within a certain range. However, this attack is possible because the draining methods are used inconsistently across different types of errors.

@nametoolong thus suggested:

    Drain the connection on all types of errors.
    It still needs to be considered whether draining the connection itself is an attack vector.

Our comments

Although we do not know whether the GFW uses active probing against the VMess protocol, the attacks proposed above are feasible for the GFW. For example, the GFW has been observed sending replay-based probes with no delay or with arbitrarily long delays. We will investigate in future work whether the GFW uses active probing against the VMess protocol. In the meantime, it will save us a lot of time if users can report which V2Ray servers were blocked and with which settings.

It may be a good idea to use a replay defense mechanism for the auth info that is based on both expiration time and nonce. On one hand, V2Ray uses a replay defense mechanism based on expiration time, so it considers a replay sent within the expiration time valid. On the other hand, Shadowsocks-libev uses a replay defense mechanism based on nonces, but this requires the server to remember the nonces until the key is changed. That is complicated to implement, because the server must still remember the nonces even after the software restarts. Therefore, a replay defense mechanism based on both expiration time and nonce may be a good choice.
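
A minimal sketch of such a combined filter is shown below. `ReplayFilter` and its parameters are hypothetical, and the sketch assumes that the timestamp and the nonce are both covered by the request's authentication, so that an attacker cannot pair an old nonce with a fresh timestamp. The 120-second window mirrors the maximum expiration time mentioned earlier.

```python
class ReplayFilter:
    """Reject requests that are too old or whose nonce was already used."""

    def __init__(self, window: int = 120):
        self.window = window
        self.seen = {}  # nonce -> timestamp carried by the request

    def accept(self, nonce: bytes, ts: int, now: int) -> bool:
        # Forget nonces whose timestamps have left the window; replays of
        # those requests are rejected by the timestamp check below anyway.
        self.seen = {n: t for n, t in self.seen.items()
                     if now - t <= self.window}
        if abs(now - ts) > self.window:
            return False            # expired (or too far in the future)
        if nonce in self.seen:
            return False            # replay inside the window
        self.seen[nonce] = ts
        return True


f = ReplayFilter()
print(f.accept(b"nonce-1", ts=1000, now=1000))   # fresh request
print(f.accept(b"nonce-1", ts=1000, now=1010))   # replay within the window
print(f.accept(b"nonce-2", ts=1000, now=2000))   # too old
print(len(f.seen))                               # old nonces were forgotten
```

Because entries expire together with their timestamps, the table stays bounded and its loss across a restart only matters for requests younger than the window.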

Frolov et al. found that various popular circumvention tools, including obfs4, Shadowsocks Outline, Psiphon's OSSH and Lantern's Lampshade, can be identified using the TCP flags and timing information when the servers close the connections. Frolov et al. thus suggested that servers should "forever read" on errors, so that the probers will be the first to close the connection. This not only reduces the information leaked by the server's timeout value, but also lets the server close the connection with FIN/ACK consistently (see Fig. 1 here for more details).
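
As a rough illustration of the "forever read" idea, the hypothetical helper below keeps reading and discarding data after an error and returns only once the peer has closed its side. It is a sketch of the behaviour only; a real server would also need to bound the resources that such lingering connections consume.

```python
import socket


def drain_until_peer_closes(conn: socket.socket, bufsize: int = 4096) -> int:
    """Read and discard everything until the peer closes; return byte count."""
    total = 0
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:          # empty read: the peer closed first
            return total
        total += len(chunk)


# Demonstration with a local socket pair standing in for prober and server.
server_side, prober_side = socket.socketpair()
prober_side.sendall(b"\x00" * 100)   # probe bytes the server cannot validate
prober_side.close()                  # the prober gives up first
print(drain_until_peer_closes(server_side))
server_side.close()
```

Because the function never closes on its own, the server's reaction no longer depends on how many bytes were sent or on which check failed.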

Unique TLS ClientHello Fingerprints

On May 30, 2020, @p4gefau1t reported V2Ray clients would send TLS ClientHello messages with very unique fingerprints. Such unique fingerprints not only gave a censor the opportunity to identify the V2Ray clients and servers, but also allowed a censor to accurately block the TLS traffic by V2Ray without much collateral damage.

@p4gefau1t further identified that these unique fingerprints were partially caused by the use of a hardcoded ciphersuite. Specifically, this rarely seen ciphersuite would be used when the AllowInsecureCiphers flag was left at its default value, false.

V2Ray developer @xiaokangwang mitigated this weakness by using the default settings of the go-tls library since v4.23.4 (see patches #2510, #2512, #2518). @tomac4t compiled a table comparing the ClientHello fingerprints before and after the patches using tlsfingerprint.io. However, the fingerprints still seem to be quite unique.

To the best of our knowledge, as early as November 2019, @klzgrad had already investigated the fingerprints of V2Ray v4.21.3 as well as many other TLS-based circumvention tools. The results show that most of them have rarely seen TLS ClientHello fingerprints.

Side notes:

Failed to Mimic the HTTP Server

On June 2, 2020, @p4gefau1t reported that V2Ray failed to mimic real HTTP communications. In particular, the two reported issues are:

  1. Both V2Ray clients and servers prepend an HTTP header only to the first TCP payload they send in each connection, making the mimicked traffic easy to detect.
  2. V2Ray servers use a hardcoded 500 response for various types of failures, making the mimicking server easy to distinguish with active probes.

Since the parrot has been dead since 2013, using a real HTTP engine may be a more promising solution here than reviving the parrot. Many circumvention tools already use the idea of application fronting, including forwardproxy, naiveproxy and trojan.

Credits

All credit goes to the authors of the corresponding works.

Thanks

We want to thank @studentmain and @p4gefau1t for helping us understand their proposed replay attacks, and for sharing their inspiring thoughts on the future works. We are also grateful to David Fifield and @studentmain for offering detailed feedback on a draft of this summary.

Contacts

We will investigate whether the GFW uses active probing against VMess protocol in the following work. At the same time, it will save us a lot of time if you, as a user, can report which circumvention services were blocked when using what settings. We encourage you to share your comments publicly or privately. Our private contact information can be found at the footer of GFW Report.

@klzgrad

klzgrad commented Jun 17, 2020

Since I am mentioned here, it seems I'm obligated to say a few words.

I think there is some dysfunction in the community. It was much earlier than November 2019 when I raised the issue of unique ClientHellos. In Jan 2019 I posted a twitter thread criticizing reinventing cryptos, HTTP parroting, etc. in Shadowsocks and V2Ray, and a user re-reported it to V2Ray. It was not well-received by the developers.

It went to the point that it was such a glaring issue but obviously nobody was going to take action, which forced me to materialize the ClientHello signature report of Nov 2019, as it's long established from Tor project's practice that the cipher list in ClientHello is being actively probed by the GFW. Unfortunately this report also seemed to have yielded no actionable results. In one discussion several months ago, half the people were skeptical about it.

I think most of these could be avoided with a minimum amount of literature reading, but in reality an adequate level of awareness is not achieved unless hard, working exploits are being posted publicly. Most of the exploits listed above are well-predicted from existing literature: fuzz your protocol, check your ClientHello, don't parrot. (I don't think V2Ray is really fuzzed enough with proper tooling?)

I had incentives not to publicize these issues the whole time, though, unlike the talented authors above. Materializing these issues from principle would be paradoxically detrimental to the logistics of circumvention work, because the time and energy invested into materializing these exploits do not produce new and diverse circumvention tools while materializing and publicizing it strictly reduces the amount of research overhead on the adversarial side. This argument is technically very wrong, but it's argued for the sake of logistics, given that the mentions of V2Ray airports already point to a shift of design thinking from pure academic exercise to practical ecosystem survivability with economic considerations (i.e. the proverbial "worse is better").

@DuckSoft

DuckSoft commented Jun 17, 2020

This incident was a wake-up call to the entire community. I'm sure the community will be more attentive to every report on security afterwards.

The incident also allowed the community to refine its security breach reporting mechanism, which allows developers to patch a problem before its details are published.

Moreover, I believe that few developers would intentionally leave a backdoor in their code. The times limited the developers' horizons. Back in 2016, Frolov's great paper did not exist yet. The developers may not have realized the fingerprint problem, and they probably didn't mean to create this "bug".

In a word, spreading knowledge is as important as fixing the problem. Even now, ordinary people don't know enough about TLS fingerprints. They don't know what having a unique fingerprint means. I would like to pay my respects to all the people who've organised these things into documents for others. Your work is as important as the developers'.

Thank you all.

@tomac4t

tomac4t commented Jun 18, 2020

> Back in 2016, Frolov's great paper did not exist yet. The developers may not have realized the fingerprint problem, and they probably didn't mean to create this "bug".

Correction: TLS fingerprinting was nothing new in 2016. It was discussed in Ensafi2015b and Fifield2015a, and was mentioned in Dingledine2006a. (But Sergey's tlsfingerprint.io is the most interesting one.) I think it's a common problem of non-academic circumvention systems -- nobody follows the latest papers.

I would like to quote dcf's words on IRC: "people have to see something for themselves to believe it." I agree with this. If you tell people how unique a TLS fingerprint is, they don't pay much attention to it. But if you provide a PoC or something like that, they treat it as a big problem. (In this case, @DuckSoft used iptables -I OUTPUT -m string --algo kmp --hex-string "|001ecca8cca9c02fc02bc030c02cc027c013c023c009c014c00a130113031302|" -j DROP to prove it.)
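
For readers wondering what that hex string matches: it has the shape of a ClientHello cipher suite list, that is, a 2-byte length followed by 2-byte suite identifiers. That reading is my interpretation of the bytes rather than something stated in the thread, and `parse_cipher_suites` is a hypothetical helper written for this sketch.

```python
PATTERN = "001ecca8cca9c02fc02bc030c02cc027c013c023c009c014c00a130113031302"


def parse_cipher_suites(hex_string: str):
    """Split a length-prefixed list of 2-byte cipher suite identifiers."""
    raw = bytes.fromhex(hex_string)
    length = int.from_bytes(raw[:2], "big")
    body = raw[2:2 + length]
    if len(body) != length or length % 2:
        raise ValueError("length prefix does not match the data")
    return [int.from_bytes(body[i:i + 2], "big") for i in range(0, length, 2)]


suites = parse_cipher_suites(PATTERN)
print(len(suites))
print(" ".join(f"0x{s:04x}" for s in suites))
```

A fixed byte string like this is all a censor needs when the list is rare among other TLS clients, which is why a one-line firewall rule was enough to make the point.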

@itshaadi

itshaadi commented Feb 5, 2021

@xiaokangwang
How come V2Ray doesn't support tun devices? I'm not looking for a workaround or anything like that; I'm just curious to know the reason. Does it affect resistance and evasion? I ask because other solutions such as trojan or Rosen don't support it either.
