DPI(Deep packet inspection) #5222

Closed
kovalensky opened this issue Oct 16, 2020 · 16 comments

@kovalensky

Hello, I'm from Russia. Most of our internet providers block BitTorrent traffic, and this has been going on for a while (2-3 years).
All trackers are available and speeds are otherwise around 300 KB/s, but when downloading a torrent the speed drops to 4-10 KB/s.
All of this can be bypassed with a VPN, a proxy, or Tor.
But I don't want to lose speed that way; downloading files from sites over HTTP(S) causes no problems and runs at maximum speed.

As I understand it (I don't have a broad technical background), modern DPI (deep packet inspection) with heuristics and behavioral analysis of packets is being used. While an application is running it generates dynamic traffic, which can be identified and labeled. For example, BitTorrent generates traffic with a certain sequence of packets that share the same characteristics (incoming and outgoing port, packet size, number of sessions opened per unit of time), so it can be classified according to a behavioral (heuristic) model.

I am sure that this practice will soon be used by many providers all around the world.
It may also be that the provider probes my port, or the destination's, to check whether a BitTorrent client is running there, or that it uses the connection-probe technique (as in China): when I try to connect to any IP address, the request is first "frozen", and the DPI system then makes its own connection to the target address to inspect it.
I ordered a seedbox, but this is not the case.

What I wanted to ask is: can the evolution of the BitTorrent protocol solve these problems, and in your opinion, what could be done?
What changes could be made so that BitTorrent clients automatically bypass these restrictions?

As we remember from the wars between internet providers and torrent downloaders (from 2005 until now): the first BitTorrent clients worked on TCP ports 6881 to 6889, so providers blocked those ports. Later, with a change to the protocol, BitTorrent clients were able to work on any port, so providers began to analyze traffic by content. And now that encryption (obfuscation) has been added, they have begun to implement heuristics and other techniques. What will be the next great step for the BitTorrent protocol?
Thanks.

@5523eeOMEGALUL

What changes could be made so that Bittorrent clients automatically bypasses these restrictions?

A Russian provider that blocks torrents? They would have no customers left.
Try setting a low connections_limit and unchoke_slots_limit to trick the detection.
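
A minimal sketch of that suggestion using the libtorrent Python bindings. `connections_limit` and `unchoke_slots_limit` are the setting names mentioned above; the values 30 and 2 and the helper names are assumptions picked for illustration, not recommendations, and there is no evidence here that low limits actually defeat the throttling.

```python
def low_profile_settings(connections_limit=30, unchoke_slots_limit=2):
    """Build a settings dict with deliberately low limits (values are guesses)."""
    return {
        "connections_limit": connections_limit,      # max simultaneous peer connections
        "unchoke_slots_limit": unchoke_slots_limit,  # max peers we upload to at once
    }


def make_session(settings):
    """Create a libtorrent session with the given settings, if the bindings are installed."""
    try:
        import libtorrent as lt  # third-party; not part of the standard library
    except ImportError:
        return None
    return lt.session(settings)


if __name__ == "__main__":
    settings = low_profile_settings()
    session = make_session(settings)
    print(settings, "session created" if session else "libtorrent not installed")
```

Clients built on libtorrent usually expose the connection limit in their preferences; whether the unchoke slot limit is reachable from the UI depends on the client.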

@kovalensky
Author

kovalensky commented Oct 18, 2020

What changes could be made so that Bittorrent clients automatically bypasses these restrictions?

russian provider with torrent block? they could have no clients.
try set low connections_limit and unchoke_slots_limit to trick detection

I doubt that it will work, but as I said, I don't have a broad technical background. If you know of any client that exposes the unchoke_slots_limit setting in its user interface, I would like to test it and report back.
For now I can change the connection limit in the most popular torrent clients.
Sometimes the speed jumps up to 100 KB/s, but then it dies again.
The question here is not how to bypass this restriction (a VPN works great for that) but how to retain the regular maximum speed without reducing speed, maximum connections, etc., like it was 2-3 years ago.

@arvidn
Owner

arvidn commented Oct 19, 2020

what I learned from the cat-and-mouse race with Comcast et al. back in 2007 was:

  • there's no end to the game, you can sink any amount of effort into it, and the ISP can keep up
  • ultimately there is no way to "hide" from your ISP. with comcast they started implementing heuristics to kill connections they didn't recognize. You'd be surprised how few of their customers suffered (but some people reported IMAP over SSH stopped working e.g.)
  • BitTorrent implementations have already gone too far to obfuscate the protocol (and make implementations more complex and brittle), in my mind.

Features I think have played out their role are:

  • protocol encryption/obfuscation
  • lazy-bitfields (this was removed in libtorrent 1.2)
  • randomizing TCP segment sizes to confuse heuristics (this was also removed from libtorrent)

The network traffic of bittorrent is probably not too hard to identify with high probability, if one really tries to. I don't think you need to see the actual bits going back and forth, just the approximate packet sizes and timing
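
To make that point concrete, here is a rough sketch of the kind of per-flow features an observer could compute without reading any payload. The function name, the feature set, and the sample packets are all made up for illustration; this is not the method of any particular DPI product or paper.

```python
from statistics import mean


def flow_features(packets, first_n=100):
    """Summarize a flow from metadata only.

    packets: iterable of (timestamp_seconds, direction, size_bytes),
             where direction is +1 for outgoing and -1 for incoming.
    """
    head = list(packets)[:first_n]
    if not head:
        return {}
    sizes = [size for _, _, size in head]
    times = [t for t, _, _ in head]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        # Size with the direction folded into the sign.
        "signed_sizes": [direction * size for _, direction, size in head],
        "mean_size": mean(sizes),
        "outgoing_fraction": sum(1 for _, d, _ in head if d > 0) / len(head),
        "mean_gap": mean(gaps) if gaps else 0.0,
    }


if __name__ == "__main__":
    # Invented sample: two small packets, a large incoming one, a small reply.
    sample = [(0.0, +1, 68), (0.5, -1, 68), (1.0, -1, 1400), (1.5, +1, 17)]
    print(flow_features(sample))
```

A classifier trained on vectors like these never needs the decrypted bytes, which is why payload encryption alone does little against this kind of detection.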

@FranciscoPombal
Contributor

  • BitTorrent implementations have already gone too far to obfuscate the protocol (and make implementations more complex and brittle), in my mind.

👍, Screw ISPs that kill or tamper with BitTorrent traffic. They should simply be neutral providers. Vote with your wallet, ballot, or fight by other means to that end. Network protocols shouldn't have to accumulate baggage and bloat just to fight the bullshit ISPs do.

@kovalensky
Author

Closed until it becomes a global issue, or until someone considers the topic worth a serious discussion.

@Seeker2

Seeker2 commented Oct 21, 2020

The more BitTorrent traffic resembles HTTP/HTTPS (or some other ubiquitous) traffic, the harder it will be for ISPs to disrupt it without also affecting everyone else.
So tracker updates will have to be HTTPS to be even semi-secure against it...or no tracker updates at all.
Things like DHT and LSD/LPD will also have to be encrypted or disabled.
Number of connections and connection attempts need to be minimized as well.
In short, about the only way to fight automatic limiting and disruption by ISPs is to be self-limiting to a huge degree.

"with comcast they started implementing heuristics to kill connections they didn't recognize. You'd be surprised how few of their customers suffered (but some people reported IMAP over SSH stopped working e.g.)"

At the time (circa 2006-2011) ComCast was disrupting a lot of 3rd party apps that used the internet that had nothing to do with BitTorrent.
ComCast made claims of minimal disruption...
But considering ComCast's business model of "pay up or we disrupt your online service" towards companies like Netflix, we cannot accept at face value that few customers suffered.

@arvidn
Owner

arvidn commented Oct 21, 2020

You're right. I cannot back the claim that few customers were affected. I assumed so based on not seeing many reports of it in mainstream media (just one that I recall).

Peer traffic could be made to run over SSL to mimic HTTPS. I can think of a few challenges off the top of my head:

  • all connections will look like HTTP upgraded to websocket connections (since they are bi-directional). An ISP may find it suspicious that you have THAT many active websocket connections
  • what certificate would peers offer up? A well known cert whose private key is publicly known? (That cert would be a hint to the ISP). Alternatively, a random cert is used and nobody actually authenticates anything.
  • I would expect most web traffic these days includes the SNI field in the SSL handshake. If peer connections don't, that might be suspicious. If they do, what hostname would they use? Would that hostname resolve to the IP of the peer? If not, that would be another hint to the ISP.
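
The SNI point can be checked locally with Python's standard `ssl` module. The sketch below generates a TLS ClientHello in memory (no network involved) with and without a server name and looks for the name in the raw bytes. The hostname `peer.example` is a placeholder, and disabling verification mirrors the "random cert, nobody authenticates" option above rather than anything a real client should do.

```python
import ssl


def client_hello_bytes(server_hostname=None):
    """Return the raw first flight a TLS client would send, using in-memory BIOs."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # No authentication at all, as in the "random cert" scenario.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    incoming = ssl.MemoryBIO()
    outgoing = ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello is written, then the client waits for a reply.
        pass
    return outgoing.read()


if __name__ == "__main__":
    with_sni = client_hello_bytes("peer.example")
    without_sni = client_hello_bytes(None)
    print("hostname visible with SNI:   ", b"peer.example" in with_sni)
    print("hostname visible without SNI:", b"peer.example" in without_sni)
```

Either way an on-path observer learns something: a plaintext hostname it can try to resolve, or the absence of one.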

@kovalensky kovalensky reopened this Oct 21, 2020
@Seeker2

Seeker2 commented Oct 22, 2020

If an ISP is using heuristics on unknown traffic and not caring that they have a high false positive rate, BitTorrent traffic is extremely likely to be found and limited/blocked.

smell-of-rain probably has that problem...the behavior even sounds like how Sandvine's RSTs worked, although affecting downloading as well as seeding. The problem is...RST packets don't work on UDP-using uTP traffic. So unless uTP is disabled it's not ACTUALLY the same as Sandvine's method.

Once a BitTorrent listening port is determined, it's easy to block that port...preventing probably all incoming traffic and most outgoing uTP traffic -- which often uses the same port number even for outgoing connections.
"Hopelessly firewalled" peers need a good way to either hide their listening port or directly advertise that they don't have one. I don't even know how to set that in any BitTorrent client without unwanted additional consequences, such as disabling DHT and PEX.

It may be possible for ISPs to break BitTorrent's current peer-level encryption ...at the price of needing extreme resources to do it realtime, which I doubt many ISPs are willing to do.

@arvidn
Each peer getting its own SSL websocket connection by default for everyone is probably overkill -- the people that might benefit from that are better served by a regular proxy or seedbox. So if you change libtorrent to do that, most/all peers will need to accept incoming SSL connections even if they're not actively making them outgoing. Which gets into compatibility...would this be commonly supported by other BitTorrent clients or libtorrent-only feature? If it goes like uTP, even when other BitTorrent clients get support for it -- it may take a few years to work well/fully. At worst, would that be any worse than some ISPs partially throttling/blocking BitTorrent traffic?

IBM (the company) found out ComCast was blocking Lotus Notes at all times of day -- ComCast of course denied it, but later changed their methods to reduce doing so. (No penalties for any wrongdoing or inconvenience were ever publicly paid to IBM or Lotus.) I doubt Lotus Notes was sending sensitive traffic as cleartext, and I'm at a loss how that "resembled" BitTorrent traffic even if it did...however it probably benefited Lotus's competitors.

@kovalensky
Author

kovalensky commented Oct 26, 2020

I will also add my 5 kopecks.
As for the last point: as far as I know, the ESNI (Encrypted Server Name Indication) extension was drafted for the TLS 1.3 protocol and is already supported by sites behind Cloudflare.
I don't know if it's possible to implement the same in BitTorrent.
Also, about certs: wouldn't it be possible to derive a public/private key pair by hashing torrent file names, the info_hash, etc.?
As far as I know, Telegram's MTProto proxy uses TLS to evade blocking.
I don't know about the websocket connections either, but that is not the biggest concern.
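
The idea of deriving key material from the info_hash could look roughly like the sketch below. This is a hypothetical scheme, not an existing BitTorrent extension: the label string, the function names, and the challenge-response step are all assumptions. It derives a shared secret with HMAC-SHA256 rather than a real certificate key pair.

```python
import hashlib
import hmac
import os


def derive_psk(info_hash, label=b"hypothetical-bt-tls-psk"):
    """Derive a shared secret from the torrent's info_hash (label is made up)."""
    return hmac.new(info_hash, label, hashlib.sha256).digest()


def prove(psk, nonce):
    """Answer a peer's random challenge to show knowledge of the shared secret."""
    return hmac.new(psk, nonce, hashlib.sha256).digest()


def verify(psk, nonce, tag):
    """Constant-time check of the peer's answer."""
    return hmac.compare_digest(prove(psk, nonce), tag)


if __name__ == "__main__":
    info_hash = hashlib.sha1(b"stand-in for a bencoded info dict").digest()
    nonce = os.urandom(16)
    tag = prove(derive_psk(info_hash), nonce)        # peer A
    print(verify(derive_psk(info_hash), nonce, tag))  # peer B
```

The limitation is that the info_hash is not a secret: anyone who knows which torrent is being shared, including an ISP that scrapes trackers or the DHT, can derive the same key. Such a scheme only proves that both sides know the torrent.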

@Tykov

Tykov commented Dec 19, 2020

As for the last option, as I know, the ESNI(Encrypted Server Name Indication) extension was added to the TLS 1.3 Protocol, which is already supported by sites behind Cloudflare.

The ECH extension, which is being standardized for TLS 1.3, fixes the issue with SNI. So no problems with that.

P.S.
From wiki "Analysis of the BitTorrent protocol encryption (a.k.a. MSE) has shown that statistical measurements of packet sizes and packet directions of the first 100 packets in a TCP session can be used to identify the obfuscated protocol with over 96% accuracy. "

The countermeasures proposed in that research for BitTorrent were:

  • Obfuscation of Payload Data
  • Obfuscation of Flow Features
  • Randomized Flushing of Data Streams
  • Random Padding (packets)
  • Tricks with Packet Directions
  • Hiding Inside Well Known Protocols

Those could help against static pattern-based detection in DPI systems.
Maybe there are a couple more things to do in 2021, but those are the essentials.
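
As an illustration of the "Random Padding" item, here is a small sketch of a framing layer that appends a random amount of junk to each message so that on-the-wire sizes no longer match message sizes. The 4-byte header layout and the function names are invented for this example and are not part of any BitTorrent specification.

```python
import os
import secrets
import struct

HEADER = struct.Struct("!HH")  # payload length, padding length (both 16-bit)


def pad(payload, max_pad=255):
    """Frame payload with a random number of trailing padding bytes."""
    if len(payload) > 0xFFFF or max_pad > 0xFFFF:
        raise ValueError("payload or padding too large for a 16-bit length field")
    pad_len = secrets.randbelow(max_pad + 1)
    return HEADER.pack(len(payload), pad_len) + payload + os.urandom(pad_len)


def unpad(frame):
    """Recover the payload and discard the padding."""
    payload_len, pad_len = HEADER.unpack(frame[:HEADER.size])
    if len(frame) != HEADER.size + payload_len + pad_len:
        raise ValueError("frame length does not match its header")
    return frame[HEADER.size:HEADER.size + payload_len]


if __name__ == "__main__":
    frame = pad(b"interested")
    print(len(frame), unpad(frame))
```

In a real design the header would have to sit inside the encrypted stream, otherwise the true lengths stay readable. Padding also costs bandwidth, and on its own it does nothing about timing or packet direction.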

@stale

stale bot commented Apr 24, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 24, 2021
@stale stale bot removed the stale label May 9, 2021
@Seeker2

Seeker2 commented May 9, 2021

A question to anyone having issues like this:
Do you have IPv6 peers+seeds connecting to your BitTorrent client?
(They may be routed differently.)

@stale

stale bot commented Sep 1, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 1, 2021
@stale stale bot closed this as completed Sep 21, 2021
@zero77

zero77 commented Dec 9, 2021

Alternatively, there is Surge which is a P2P file sharing application built on NKN and is specifically designed to be anonymous.

@Tykov

Tykov commented Mar 31, 2022

  • all connections will look like HTTP upgraded to websocket connections (since they are bi-directional). An ISP may find it suspicious that you have THAT many active websocket connections.

HTTP/3 has bidirectional streams [differs from HTTP/2], and most of the internet will use it. It is already in use: almost 600k devices can be found in Shodan, and this is far from the limit. I don't think an ISP could limit the number of connections. They wouldn't do it because of the critical infrastructure of industrial clients that use cloud services under contract (e.g. banks). They can't limit the number of websocket connections either; there has to be at least some recognition of BitTorrent traffic to do so.

  • what certificate would peers offer up? A well known cert whose private key is publicly known? (That cert would be a hint to the ISP). Alternative, a random cert is used and nobody actually authenticates anything.

We could use the info_hash as an encryption key for an encrypted Client Hello SNI message, and since both sides know this key, we can authenticate the session.
For example, Telegram proxies mostly work this way: they don't use certs but are able to circumvent censorship; they use domain fronting and have an encryption key that is known to the user agent before communication starts.
I don't know what can be done in the DHT architecture, since we need to search for the info_hash, but on the Telegram side there's:
_The client comes up with a random 32-byte key and a random 16-byte Initialization Vector, which encrypts each packet using AES CTR, and so that the server knows how to decrypt it... the key and IV are added to the beginning of the packet before the encrypted content.

You will call it stupidity, because what is the point of sending encrypted packets and immediately attaching a decryption key to them? Of course, this is an absolutely useless defense in a logical sense, but it makes a lot of sense in practice.

After obfuscation, all packets look like random garbage, so to determine whether it is Telegram traffic or not, the provider will have to decrypt each incomprehensible packet using the obfuscated2 method before conducting further checks. Such actions require an unjustified amount of computing power, which providers simply do not have_
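
The scheme described in that quote (random key and IV sent in front of the data they encrypt) can be sketched as follows. This follows only the simplified description above, not the actual MTProto obfuscation. Python's standard library has no AES, so a SHAKE-256 keystream is swapped in for AES-CTR here; that substitution and the function names are assumptions made for illustration.

```python
import hashlib
import os

KEY_LEN, IV_LEN = 32, 16  # sizes taken from the quoted description


def _keystream(key, iv, length):
    """Stand-in for AES-CTR: SHAKE-256 used as an extendable-output keystream."""
    return hashlib.shake_256(key + iv).digest(length)


def obfuscate(payload):
    """Encrypt with a fresh random key and IV, then prepend both to the output."""
    key, iv = os.urandom(KEY_LEN), os.urandom(IV_LEN)
    stream = _keystream(key, iv, len(payload))
    body = bytes(a ^ b for a, b in zip(payload, stream))
    return key + iv + body


def deobfuscate(packet):
    """Read the key and IV from the front of the packet and undo the XOR."""
    key = packet[:KEY_LEN]
    iv = packet[KEY_LEN:KEY_LEN + IV_LEN]
    body = packet[KEY_LEN + IV_LEN:]
    stream = _keystream(key, iv, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


if __name__ == "__main__":
    wire = obfuscate(b"\x13BitTorrent protocol")
    print(wire.hex())
    print(deobfuscate(wire))
```

As the quote says, this gives no confidentiality, since anyone holding the packet can run `deobfuscate`. Its only effect is to remove fixed byte patterns, so an observer has to do the extra work per packet; packet sizes and timing are unchanged.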

  • I would expect most web traffic these days include the SNI field in the SSL handshake. If peer connections don't, that might be suspicious. If they do, what hostname would they use? Would that hostname resolve to the IP of the peer? If not, that would be another hint to the ISP.

Encrypted Client Hello will come as a solution for this [it has arrived: ECH is working for OpenSSL, BoringSSL, nginx, Apache HTTPD, lighttpd, HAProxy, Conscrypt, curl, and more], together with DoH and DoQ.
Ideally there could be a BitTorrent client extension that announces a version, to allow further improvements in hiding the TLS fingerprint.
I think this is a huge step, like uTP was, and its development should be started in the near future.

What problems should be solved here additionally @arvidn ?

It's possible.

Maybe other people could give some review on this @everyone ?

@judith996

Back this issue +1
