TorKameleon: Improving Tor's Censorship Resistance with K-anonymization and Media-based Covert Channels (TrustCom 2023) #331

Open
AfonsoVilalonga opened this issue Feb 12, 2024 · 1 comment
Labels
reading group summaries and discussions of research papers and other publications

Hello!

My name is Afonso Vilalonga, and along with Henrique Domingos and João S. Resende, I am one of the authors of the paper titled "TorKameleon: Improving Tor's Censorship Resistance with K-anonymization and Media-based Covert Channels." I'm sharing information about this paper here because I believe it may be of interest to this forum.

TorKameleon is a Tor pluggable transport that encapsulates Tor traffic within the WebRTC video frames of a videoconference, similar to Protozoa and Stegozoa (two state-of-the-art censorship evasion systems that use WebRTC video encapsulation). Unlike Protozoa and Stegozoa, however, TorKameleon's mechanisms are based on the Insertable Stream/Encoded Transforms WebRTC API. This allows for a much simpler and more easily deployable traffic encapsulation system than previous ones, without requiring invasive modifications to a browser. TorKameleon can also function as a standalone proxy, forming networks of TorKameleon proxies that route traffic through multiple paths and encapsulate it in WebRTC video frames. The main idea is to create networks of K proxies through which K users route their traffic, achieving a form of K-anonymization; this aspect of the work still requires further development. While the solution needs more testing and additional features to be usable in the real world, our primary goal was to create a proof-of-concept WebRTC video encapsulation pluggable transport that is easy to deploy and use by leveraging the Insertable Stream/Encoded Transforms WebRTC API.

The paper will be published in the TrustCom/ITCCN 2023 proceedings, and a Java version of the code is available on my GitHub profile (https://github.com/AfonsoVilalonga/TorKameleon). We also have a new version of TorKameleon written in Go using PION (which is much simpler to work with!). However, it is currently in a private repository as we are still testing it. It will be made public in the future.

Thank you for your time, and if you have any questions or would like to know more about both the TorKameleon Java and Go versions, please feel free to ask!

Afonso Vilalonga

LINK TO PAPER: https://arxiv.org/abs/2303.17544

References
Protozoa: https://dl.acm.org/doi/10.1145/3372297.3417874
Stegozoa: https://dl.acm.org/doi/abs/10.1145/3488932.3517419

@wkrp wkrp added the reading group summaries and discussions of research papers and other publications label Feb 12, 2024
@wkrp wkrp changed the title TorKameleon: Improving Tor's Censorship Resistance with K-anonymization and Media-based Covert Channels TorKameleon: Improving Tor's Censorship Resistance with K-anonymization and Media-based Covert Channels (TrustCom 2023) Feb 12, 2024
wkrp commented Feb 13, 2024

> TorKameleon's mechanisms are based on the Insertable Stream/Encoded Transforms WebRTC API

This is a really interesting aspect of the work to me. The WebRTC Encoded Transform API (formerly called Insertable Streams, I think) provides hooks for JavaScript code to manipulate the raw bytes of encoded audio/video streams. Not the images or waveforms that are input to the encoder – you could always do that – but the output bytes of the encoder, e.g., VP8 data.

The main intended use case of WebRTC Encoded Transform, as I understand it, is end-to-end encryption for WebRTC media data: two peers who each have a hop-by-hop WebRTC connection to a WebRTC gateway can encrypt their media streams such that the gateway cannot inspect them. But the transform can be anything: the draft standard has a simple example of inverting all the bits. In particular, with Encoded Transform you can throw away the original media data and replace it with any data of your choosing. I.e., you can do traffic replacement like Slitheen, Protozoa, or Balboa.
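
To make the "transform can be anything" point concrete, here is a minimal sketch of two transforms over an encoded frame: the bit-inversion toy example, and a traffic-replacement transform that discards the media bytes and substitutes covert data. The `EncodedFrameLike` interface is a stand-in for the browser's frame objects, and the 2-byte length-prefix framing, `embedCovert`, and `extractCovert` are hypothetical names and choices made up for illustration. They are not taken from TorKameleon, Protozoa, or the draft standard.

```typescript
// Stand-in for RTCEncodedVideoFrame / RTCEncodedAudioFrame, which expose the
// encoder's output bytes as a writable `data` ArrayBuffer.
interface EncodedFrameLike {
  data: ArrayBuffer;
}

// Toy transform: invert every bit of the encoded payload, in place.
function invertBits(frame: EncodedFrameLike): void {
  const bytes = new Uint8Array(frame.data);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = ~bytes[i] & 0xff;
  }
}

// ASSUMPTION: hypothetical framing, a 2-byte big-endian length prefix
// followed by covert bytes, zero-padded to the original frame size.
const LENGTH_PREFIX = 2;

// Sender side: throw away the media bytes and embed covert data instead.
// Returns how many covert bytes fit into this frame.
function embedCovert(frame: EncodedFrameLike, covert: Uint8Array): number {
  const size = frame.data.byteLength;
  const capacity = Math.max(0, size - LENGTH_PREFIX);
  const n = Math.min(covert.length, capacity, 0xffff);
  const buf = new ArrayBuffer(size); // zero-filled, same size as the original
  if (size >= LENGTH_PREFIX) {
    new DataView(buf).setUint16(0, n); // big-endian by default
    new Uint8Array(buf).set(covert.subarray(0, n), LENGTH_PREFIX);
  }
  frame.data = buf;
  return n;
}

// Receiver side: recover the covert bytes from a frame built by embedCovert.
function extractCovert(frame: EncodedFrameLike): Uint8Array {
  if (frame.data.byteLength < LENGTH_PREFIX) return new Uint8Array(0);
  const n = new DataView(frame.data).getUint16(0);
  return new Uint8Array(frame.data.slice(LENGTH_PREFIX, LENGTH_PREFIX + n));
}

// In a browser, functions like these would run inside a worker, roughly:
//   self.onrtctransform = (event) => {
//     const { readable, writable } = event.transformer;
//     readable
//       .pipeThrough(new TransformStream({
//         transform(frame, controller) {
//           invertBits(frame);
//           controller.enqueue(frame);
//         },
//       }))
//       .pipeTo(writable);
//   };
// with the main thread attaching it via
//   sender.transform = new RTCRtpScriptTransform(worker);
```

This sketch overwrites the whole frame. A real system would probably need to leave some leading codec header bytes intact so that packetization and depacketization keep working, and would encrypt the covert bytes. Both are omitted here.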

The effect is you get a media stream covert channel that is more efficient than, for example, FreeWave, CovertCast, and Stegozoa (which modulate a signal into the input to the media encoder, rather than modifying its output), while being easier to deploy than Protozoa (which similarly replaces encoded media, but whose implementation requires a recompiled browser).

The Stegozoa paper had in fact considered Insertable Streams, but the standard was less well developed at the time, and Stegozoa anyway was designing for an adversary in the WebRTC gateway position, which would be able to detect encrypted data even if it could not decrypt it:

> Insertable streams: Insertable streams [31] are recently proposed WebRTC extensions to further strengthen the end-to-end security of WebRTC calls. They allow call participants to encrypt their media streams with a secret key (possibly exchanged out-of-band), prior to applying WebRTC’s default DTLS-SRTP-based encryption at the network layer. Hence, the use of insertable streams could prevent WebRTC gateways from inspecting the media being relayed. However, at the moment, the use of insertable streams in WebRTC calls is optional and remains undeployed in the majority of WebRTC services. In addition, we expect adversaries to deploy generic WebRTC gateway implementations like Janus [3], which require access to the exchanged media streams for additional processing.

The Snowflake paper has a paragraph reflecting on the pros and cons of WebRTC data channels versus WebRTC media streams. Snowflake currently uses data channels, but if that turns out to be too much of a fingerprinting vector, it could likely switch to media streams using Encoded Transform, similar to TorKameleon.

> Data channel or media stream: Besides data channels, WebRTC offers media streams, serving the purpose of real-time audio and video communication. Though both are encrypted, data channels and media streams are externally distinguishable because they use different containers. Data channels use DTLS, while media streams use DTLS-SRTP; that is, the Secure Real-Time Transport Protocol with a DTLS key exchange [32 §4.3].
>
> Data channels are a closer match to Snowflake's communication model: media streams are meant to contain encoded audio and video, not arbitrary binary data. But the use of DTLS rather than DTLS-SRTP could become a significant feature if other WebRTC applications mainly use media streams. Although it would be less convenient, it would be possible to adapt the WebRTC link between client and proxy to use a media stream rather than a data channel, either by modulating binary data into a well-formed encoded audio or video signal in the manner of, say, Stegozoa [12 §3.3], or by replacing encoded media content within SRTP packets, as in Protozoa [2 §4.4] or TorKameleon [40 §III-D].
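
As a rough illustration of why the two are "externally distinguishable", the sketch below classifies UDP payloads by their first byte, using the demultiplexing ranges from RFC 7983. DTLS records (including data channel traffic) start with a byte in 20..63, while RTP/RTCP packets (including SRTP media) start with a byte in 128..191. The function names and the `flowCarriesMedia` heuristic are hypothetical. It is a deliberately simplistic sketch of what an on-path observer could check, not a description of any real classifier.

```typescript
type PacketKind =
  | "stun"
  | "zrtp"
  | "dtls"
  | "turn-channel"
  | "rtp-or-rtcp"
  | "unknown";

// First-byte demultiplexing ranges as given in RFC 7983.
function classifyFirstByte(b: number): PacketKind {
  if (b >= 0 && b <= 3) return "stun";
  if (b >= 16 && b <= 19) return "zrtp";
  if (b >= 20 && b <= 63) return "dtls";
  if (b >= 64 && b <= 79) return "turn-channel";
  if (b >= 128 && b <= 191) return "rtp-or-rtcp";
  return "unknown";
}

// ASSUMPTION: a toy heuristic. A flow that only ever shows STUN and DTLS
// records looks like a data channel; one that also shows RTP/RTCP-range
// packets looks like a media stream.
function flowCarriesMedia(payloads: Uint8Array[]): boolean {
  return payloads.some(
    (p) => p.length > 0 && classifyFirstByte(p[0]) === "rtp-or-rtcp",
  );
}
```

No decryption is needed for this check, which is why the choice of container can act as a fingerprint on its own.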
