Add event buffering for cloaking user input patterns #149

Merged
marmarek merged 3 commits into QubesOS:main from ArrayBolt3:main on Nov 18, 2024

Conversation

@ArrayBolt3
Contributor

@ArrayBolt3 ArrayBolt3 commented Oct 1, 2024

Goal

Implement the functionality of kloak (a tool designed to hide biometric behavior patterns in keystrokes and mouse movements) in qubes-gui-daemon. This PR will implement the functionality requested in QubesOS/qubes-issues#1850 and fleshed out further in QubesOS/qubes-issues#8541. It will also close QubesOS/qubes-issues#8534 as it will no longer be necessary.

TODOs

  • Test rigorously on Qubes R4.3 (an earlier iteration of the code has been smoke-tested on Qubes R4.2, this hasn't been tested at all on R4.3 yet)

Fixed TODOs:

  • Figure out why the domU window occasionally freezes until another input event is sent - we aren't buffering info coming from domU to dom0 so why this is happening is a mystery to me, and something for later investigation. (Solved, see #149 (comment))
  • Potentially change how events are treated (do some events have to operate in pairs for best results?). (Lots of X events are now not buffered in the latest implementation. Only ones that look valuable to buffer are buffered.)
  • Make the delay duration user-configurable (right now it's hardcoded to 150 milliseconds). (Implemented.)
  • Allow configuring the event delay duration for individual VMs (right now it is applied equally to all VMs). (Implemented.)
  • Get the configuration code working and test it. (Solved, this ended up requiring a change to core-admin-client which I will be submitting as a separate PR.)
  • Ensure all new code adheres to Qubes OS standards (didn't have time to finish that up) (should be done now)

Rationale

Kloak, the inspiration for this PR, is a user input buffering and obfuscation tool. It intercepts keyboard and mouse events at the evdev layer, holds them in a queue for release at a later scheduled time, then releases them to the applications they were intended for periodically. By adding random noise into the user's input patterns, kloak aims to make otherwise recognizable patterns in user behavior (such as keystroke rhythm and mouse movement patterns) too erratic to be used as a method of identifying the user. This is potentially very useful especially for Whonix Workstation domUs, as it denies an adversary access to a remarkably effective biometric fingerprinting mechanism they could otherwise access without specialized tools.

Kloak is currently able to operate directly in Qubes domUs if (and only if!) gui-agent-virtual-input-device is enabled for the domU in question. Even in these instances, only keyboard events are anonymized, and additionally the domU must have an evdev X driver installed. This is less than ideal from a functionality standpoint, and as @DemiMarie has explained in QubesOS/qubes-issues#8541 it will eventually stop working entirely. There's also the possibility of malware compromise in the domU resulting in the deanonymization of the user. For these reasons, enabling the use of evdev in domUs and running kloak in the domU is not a good solution.

The other obvious option is to run kloak directly in dom0. This has several disadvantages:

  • kloak can now potentially wreak havoc on the user's ability to use their computer. If a bug in kloak locks up the keyboard, or the user does something inadvisable like setting a 20-second event delay, regaining control of the system could be difficult or impossible without doing a hard reset (or worse, booting an external USB in order to chroot into dom0 and disable kloak).
  • Application of kloak's functionality becomes all-or-nothing - you either anonymize all keyboard and mouse input everywhere, or you anonymize none of it. This could make management of dom0 annoying with larger delay times, and it could prevent the user from making use of applications or websites that require input pattern telemetry to function (such as some bank websites).
  • kloak's configuration options similarly apply globally. One might want a comfortable delay of only 25ms in a domU they expect to be safe, but wish to use an extremely long one like 1000ms in a domU they believe is compromised and actively exfiltrating data. With kloak running directly in dom0, this is impossible.

This PR implements a third option - inserting the functionality of kloak directly into qubes-gui-daemon. Upstream kloak never needs to be involved; only its functionality is needed. I have termed this functionality "event buffering", or "ebuf" for short (which is the term used for it in the code). Previously I had called this "X event buffering" and used "xbuf" for short, but as @3hhh pointed out, that name would become inaccurate when this is ported to Wayland, so I changed it to "ebuf" to make the name display-server agnostic.

By working inside the GUI daemon, the following advantages are gained:

  • No evdev support needed at all, we can work with X events instead.
  • The amount of additional code needed is smaller.
  • Per-VM application and configuration of event buffering is now possible - some VMs can use a small, comfortable delay, others can use a very long one, and others can skip delays entirely.
  • Even if something goes very wrong and buffering prevents the user from inputting anything into any Qube, the user retains control of dom0 and can recover their system from there.
  • A compromised domU with event buffering enabled will be less likely to leak valuable biometric info to the malware within the VM.

How it works

Most of the code should be fairly self-explanatory. In a nutshell, we use a tail queue to store a list of delayed, scheduled X events. As events come from dom0 to a domU, they are captured, scheduled for release at a later time, and thrown into the queue. Events in the queue are regularly checked to see if their scheduled release time has arrived, and those events are released when appropriate. The scheduler inserts some random noise into the delays, making it difficult to uniquely identify the user's typing and mouse movement/usage patterns.
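As a rough illustration of the queue described above, a minimal C sketch using the `TAILQ` macros from `<sys/queue.h>` might look like the following. The names `ebuf_entry`, `ebuf_push` and `ebuf_release_due` are hypothetical and the integer `payload` merely stands in for a real X event; none of this is taken from the PR's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for one buffered event (not the PR's actual type). */
struct ebuf_entry {
    int64_t release_ms;               /* scheduled release time, in ms */
    int payload;                      /* placeholder for the real event data */
    TAILQ_ENTRY(ebuf_entry) link;     /* tail queue linkage */
};

TAILQ_HEAD(ebuf_queue, ebuf_entry);

/* Append an event scheduled for release at release_ms. Returns 0 on success. */
int ebuf_push(struct ebuf_queue *q, int payload, int64_t release_ms)
{
    assert(q != NULL);
    struct ebuf_entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->release_ms = release_ms;
    e->payload = payload;
    TAILQ_INSERT_TAIL(q, e, link);
    return 0;
}

/* Release events from the head while their scheduled time has arrived.
 * Stops at the first event that is not yet due, so order is preserved.
 * Returns the number of events released. */
int ebuf_release_due(struct ebuf_queue *q, int64_t now_ms,
                     void (*deliver)(int payload))
{
    assert(q != NULL);
    int released = 0;
    struct ebuf_entry *e;
    while ((e = TAILQ_FIRST(q)) != NULL && e->release_ms <= now_ms) {
        TAILQ_REMOVE(q, e, link);
        if (deliver != NULL)
            deliver(e->payload);
        free(e);
        released++;
    }
    return released;
}
```

A caller would initialise the queue with `TAILQ_INIT`, push each captured event with its scheduled time, and call `ebuf_release_due` from the main loop with the current monotonic time.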

By default, event buffering is disabled and all events are passed through without buffering. To enable it, one must use qvm-features to set gui-ebuf-max-delay to a value greater than 0. It is worth noting that 0 is interpreted not as "don't add any delay when buffering events", but rather as "don't buffer events at all". This configuration feature does not work without the ebuf_max_events setting being added to the list of GUI daemon configuration settings in qubes-core-admin-client. The pull request for that is at QubesOS/qubes-core-admin-client#309.

This PR needs more testing (especially on Qubes R4.3), but it is solid enough that I feel comfortable asking for a review on it. Thanks for your help!

@DemiMarie
Contributor

Please use getrandom() instead of ISAAC.

@ArrayBolt3
Contributor Author

@DemiMarie I can do that, but that will drain the system's entropy sources at a fairly constant rate (every mouse movement, keystroke, etc. will result in entropy drain). Theoretically that could weaken any automatically generated encryption keys used for things like HTTPS and the like. Using ISAAC requires only a single initial use of system entropy.

If entropy drain isn't a concern, then I'm happy to swap it out.

@ArrayBolt3
Contributor Author

ISAAC removed, getrandom()-based delay mechanism implemented.

I also tracked down the source of the GUI freeze bug - turned out to be because of gui-common/txrx-vchan.c:wait_for_vchan_or_argfd.

int wait_for_vchan_or_argfd(libvchan_t *vchan, int fd) 
{
    int ret;
    while ((ret=wait_for_vchan_or_argfd_once(vchan, fd)) == 0);
    return ret;
}

This was apparently busy-waiting for something to happen and thus keeping queued events from ever being released until the user did something like press a key or move the mouse. I used a hack to make wait_for_vchan_or_argfd non-blocking, but I'm not totally sure that's going to be acceptable in the long run since this will probably significantly increase the CPU usage of qubes-guid. If this is an acceptable solution, wait_for_vchan_or_argfd should just be removed and the underlying function wait_for_vchan_or_argfd_once should be made public so that xside.c can use it directly.
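The trade-off described above (blocking until input arrives versus polling without blocking) has a common middle ground: bound each wait by the time remaining until the next scheduled release. The following is only a sketch of that idea using plain `poll()`; it is not the change made in this PR, and `poll_timeout_ms` and `wait_fd_or_deadline` are hypothetical names rather than functions from gui-common.

```c
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stdint.h>

/* Compute a poll() timeout: -1 (block indefinitely) when nothing is queued,
 * otherwise the milliseconds until the next release, clamped to [0, INT_MAX]. */
int poll_timeout_ms(int have_queued, int64_t next_release_ms, int64_t now_ms)
{
    if (!have_queued)
        return -1;
    int64_t remaining = next_release_ms - now_ms;
    if (remaining <= 0)
        return 0;
    if (remaining > INT_MAX)
        return INT_MAX;
    return (int)remaining;
}

/* Wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_fd_or_deadline(int fd, int timeout_ms)
{
    assert(timeout_ms >= -1);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;
    return ret > 0 ? 1 : 0;
}
```

With this shape the loop sleeps while the queue is empty and otherwise wakes exactly when the head of the queue becomes due, so CPU usage stays low without stalling queued events.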

My commit also attempts to implement configuration support - I don't see any reason why the code shouldn't work, but I haven't yet gotten it to work on my machine as I seem to be having trouble setting a custom VM feature properly.

@DemiMarie
Contributor

@DemiMarie I can do that, but that will drain the system's entropy sources at a fairly constant rate (every mouse movement, keystroke, etc. will result in entropy drain). Theoretically that could weaken any automatically generated encryption keys used for things like HTTPS and the like. Using ISAAC requires only a single initial use of system entropy.

If entropy drain isn't a concern, then I'm happy to swap it out.

Entropy drain doesn’t actually exist. The entropy obtained from the getrandom() syscall cannot be used to derive the internal state of the Linux CSPRNG.
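For reference, drawing a bounded random delay from `getrandom()` could look roughly like the sketch below. It assumes Linux with glibc 2.25 or later (for `<sys/random.h>`); the helper names `random_u32` and `random_delay_ms` are hypothetical, and the rejection-sampling step is one standard way to avoid modulo bias, not necessarily what the PR does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/random.h>
#include <sys/types.h>

/* Fill *out with 32 bits from the kernel CSPRNG. Returns 0 on success. */
int random_u32(uint32_t *out)
{
    unsigned char *p = (unsigned char *)out;
    size_t got = 0;
    while (got < sizeof(*out)) {
        ssize_t n = getrandom(p + got, sizeof(*out) - got, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return 0;
}

/* Uniform delay in [lo_ms, hi_ms], inclusive. Returns -1 on error. */
int64_t random_delay_ms(uint32_t lo_ms, uint32_t hi_ms)
{
    assert(lo_ms <= hi_ms);
    assert(hi_ms - lo_ms < UINT32_MAX);   /* keeps span from wrapping to 0 */
    uint32_t span = hi_ms - lo_ms + 1;
    /* Largest value below which 2^32 splits into whole multiples of span. */
    uint32_t limit = UINT32_MAX - (UINT32_MAX % span + 1) % span;
    uint32_t r;
    do {
        if (random_u32(&r) != 0)
            return -1;
    } while (r > limit);                  /* reject the biased tail */
    return (int64_t)lo_ms + (int64_t)(r % span);
}
```

Each call reads only four bytes per accepted draw, which is negligible for the kernel CSPRNG even at mouse-event rates.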

@ArrayBolt3 ArrayBolt3 marked this pull request as ready for review October 4, 2024 22:53
@ArrayBolt3
Contributor Author

I believe this is now ready for review.

@ArrayBolt3
Contributor Author

Pull request for configuring X event buffering: QubesOS/qubes-core-admin-client#309

Comment thread gui-common/txrx-vchan.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
@3hhh

3hhh commented Oct 12, 2024

It's nice to see you work on this @ArrayBolt3, much appreciated!

I also believe that this may be relatively easy to port to Wayland.
@DemiMarie: What do you think?

@DemiMarie
Contributor

For Wayland the tricky part is that the protocol is written assuming that events are dispatched in-order. Buffering events therefore creates a risk of head-of-line blocking of events that really should not be delayed, like buffer release events indicating that memory can safely be reused. Avoiding this requires reordering events, but that requires effort to ensure no bugs are introduced.

@marmarek
Member

For Wayland the tricky part is that the protocol is written assuming that events are dispatched in-order. Buffering events therefore creates a risk of head-of-line blocking of events that really should not be delayed, like buffer release events indicating that memory can safely be reused. Avoiding this requires reordering events, but that requires effort to ensure no bugs are introduced.

Reordering is problematic for the X11 version too (see earlier comments). Which events are problematic if delayed by a few hundred milliseconds? Will such a delay of buffer release events cause practical issues beyond possibly requiring a bit more memory on the VM side?

@ArrayBolt3
Contributor Author

I wouldn't expect delay to be a problem for X11 since, as I mentioned in one of the earlier comments, the X protocol is designed to operate over a network (for instance SSH with X tunneling). Networks incur a decent amount of latency. That latency causes delays similar to the ones this PR artificially introduces.

Wayland is a concern; however, in practice waypipe exists and is used to provide network transparency to Wayland in a fashion similar to X11, and it appears to work well from what I've heard. That would cause the same latency and delay there, so if it's not a problem there it probably won't be a problem here.

Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/guid.conf Outdated
Comment thread gui-daemon/guid.conf Outdated
@ArrayBolt3
Contributor Author

Next iteration ready for review. Smoke-tested on Qubes OS R4.3, all requested changes implemented.

@ArrayBolt3 ArrayBolt3 changed the title Add X event buffering for cloaking user input patterns Add event buffering for cloaking user input patterns Oct 16, 2024
@ArrayBolt3
Contributor Author

Had to force-push again because I forgot to update the commit message.

@ArrayBolt3
Contributor Author

Switched to using events_max_delay terminology, also rebased onto the tip of main.

Contributor

@HW42 HW42 left a comment

By working inside the GUI daemon, the following advantages are gained:

  • No evdev support needed at all, we can work with X events instead.

Getting rid of having multiple input paths that need to be tested would be nice.

  • A compromised domU with event buffering enabled will be less likely to leak valuable biometric info to the malware within the VM.

If someone compromised a VM enough to get to the raw event stream, they already have a lot of fingerprinting options. So I'm not sure how much of an advantage that really is.

At the same time this approach has a significant cost. It makes the security critical side of the gui handling more complex. To be fair the implementation is not that complicated.

Comment thread gui-daemon/xside.c
Comment thread gui-daemon/xside.c
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c Outdated
Comment thread gui-daemon/xside.c
Comment thread gui-daemon/xside.c Outdated
@ArrayBolt3 ArrayBolt3 force-pushed the main branch 2 times, most recently from bec428c to 4f04d1b Compare October 28, 2024 19:24
@ArrayBolt3
Contributor Author

ArrayBolt3 commented Oct 28, 2024

Should be ready for another review. Ended up doing three force-pushes because I forgot to sign all my commits the first two times. That'll teach me to enable automatic commit signing :-/

The user may wish to prevent biometric information about their mouse
and keyboard patterns from leaking into certain Qubes. To make this
possible, this feature inserts random noise into the delivery timing of
all X events, making it more difficult to distinguish the user from
other X event buffering users. The maximum delay in event delivery is
user-configurable through the "events_max_delay" configuration option.
@ArrayBolt3
Contributor Author

Gentle ping, is there any progress here, or any further changes that need to be made?

@marmarek
Member

I've tested it a bit and have an observation: the more events are queued (or rather: the quicker new events are produced), the less random the delay is. You can easily get into this situation with mouse move events. After an initial short time, further events get a delay near the max configured value (in the top 10-20ms, depending on movement speed). Then, you need to wait the configured max delay to regain the full random delay range.
In practice, it means the random range is limited by the frequency of incoming events (except short initial time).

This is probably not an issue for keyboard events (delay between those events in practice is large enough to leave some room for randomness). But may be an issue for mouse events.
In practice this is okay, right? I think you mostly care about obfuscating typing pattern, which should still be okay (except short time after moving the mouse).
This also means that setting higher values for the max delay has not only UX implications, but also affects how long you need to wait after moving the mouse to regain full randomness range.

But if you do care about obfuscating mouse movements too, this PR isn't enough in its current shape. I haven't read research on this, but I'd expect that profiling based on mouse movements isn't only about speed but also (mostly) about move patterns (directions, mouse travel etc). This you don't change with delays at all. Some simple approach could be dropping some move events (which incidentally would also help with the earlier issue), but it's a risky thing to do (knowing which moves are safe to drop to not break functionality).

Anyway, I'm okay with merging this as is. But whoever wants to use this feature, needs to be aware of the limitation, and be careful about choosing max delay value.
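The shrinking range described in this comment can be made concrete with a small sketch. It assumes a kloak-style scheduler in which each new event's delay is drawn between the time remaining until the previously scheduled release and the configured maximum; the struct and function names are hypothetical and not taken from the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range from which the next event's delay would be drawn. */
struct delay_range {
    int64_t lo_ms;
    int64_t hi_ms;
};

/* Assumed kloak-style rule: the lower bound is the time left until the
 * previously scheduled release (so events never overtake each other),
 * clamped to [0, max_delay_ms]; the upper bound is always max_delay_ms. */
struct delay_range delay_range_for(int64_t prev_release_ms, int64_t now_ms,
                                   int64_t max_delay_ms)
{
    assert(max_delay_ms >= 0);
    int64_t lo = prev_release_ms - now_ms;
    if (lo < 0)
        lo = 0;
    if (lo > max_delay_ms)
        lo = max_delay_ms;
    return (struct delay_range){ lo, max_delay_ms };
}
```

For example, with a 150 ms maximum, an event arriving 8 ms after one that was scheduled 140 ms out gets a range of only 132 to 150 ms, which matches the "top 10-20ms" behaviour observed above; once the queue has been idle past the last release time, the full 0 to 150 ms range is available again.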

@HW42
Contributor

HW42 commented Nov 15, 2024

I've tested it a bit and have an observation: the more events are queued (or rather: the quicker new events are produced), the less random the delay is.

This is exactly the behavior I was trying to point to with #149 (comment)
Sorry for not updating the comment with a better explanation.

Unless there's a good reason for the current behavior I think something like max(last, now) + random(min, max) would be most likely better.

@marmarek
Member

marmarek commented Nov 15, 2024

Unless there's a good reason for the current behavior I think something like max(last, now) + random(min, max) would be most likely better.

The problem with this is that you then get no upper bound on the actual delay. Imagine you set the max delay to 500ms and then move the mouse a bit, which generates, say, 200 events. Now you need to wait on average 50s for the queue to be processed. It isn't hard to imagine more extreme cases...
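The 50 s figure follows from chaining delays: under `max(last, now) + random(min, max)`, a burst that arrives much faster than it drains accumulates roughly one average delay per event. A small sketch of that arithmetic, assuming a uniform delay between `min_ms` and `max_ms` and a burst that arrives effectively at once (function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Release time under the proposed chained policy, for an already-drawn delay. */
int64_t chained_release_ms(int64_t last_release_ms, int64_t now_ms,
                           int64_t delay_ms)
{
    int64_t base = last_release_ms > now_ms ? last_release_ms : now_ms;
    return base + delay_ms;
}

/* Expected time to drain a burst of n events: n times the mean delay. */
int64_t expected_drain_ms(int64_t n_events, int64_t min_ms, int64_t max_ms)
{
    assert(n_events >= 0 && 0 <= min_ms && min_ms <= max_ms);
    return n_events * (min_ms + max_ms) / 2;
}

/* Worst case: every event draws the maximum delay. */
int64_t worst_case_drain_ms(int64_t n_events, int64_t max_ms)
{
    assert(n_events >= 0 && max_ms >= 0);
    return n_events * max_ms;
}
```

With 200 events and delays between 0 and 500 ms, `expected_drain_ms(200, 0, 500)` gives 50000 ms (50 s) and the worst case is 100000 ms.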

@HW42
Contributor

HW42 commented Nov 18, 2024

Ah, I see that problem now. Are shorter randomization delays good enough for mouse events (or in general non-keyboard events)? In that case maybe having different delay settings for keyboard and non-keyboard delays might be an easy and good solution.

In general: I haven't read up on the research about user input timing fingerprinting, so I don't know what's critical and what's considered good enough to prevent fingerprinting. But the current implementation has rather surprising behavior. For example, consider having a big delay setting, as has been mentioned in the comments above. When the user then starts typing fast, later keystrokes in the sequence will get much less randomization. So I would like to at least have confirmation that this is the intended behavior.

@ArrayBolt3
Contributor Author

To be clear, the issue with "shrinking timeouts" exists in the upstream kloak code which this PR was based upon, and it still did a good job at obfuscating user identity according to tests run by kloak's original author. (Sadly the tool he used, KeyTrac, is no longer publicly available, so it's not so easy to test this anymore.) Thus I would say that this is intended behavior, since it's in line with behavior that is known to work well enough. (That being said, I do see how mouse and keyboard events will end up conflicting with each other to some degree, i.e. if someone's typing and moving the mouse at the same time, it will really reduce or eliminate the typing randomization. That's still in line with what kloak does, but it probably hasn't been tested the same way as a keyboard-only use case.)

Separating mouse and keyboard events is theoretically doable, and I'm willing to give it a shot if it would be useful. However, this does mean that the user may be moving the mouse, clicking on things, etc., while the keyboard buffer is still being flushed. With a larger timeout, I can easily see this resulting in the user's keystrokes going to the wrong app or input widgets.

@HW42
Contributor

HW42 commented Nov 18, 2024

To be clear, the issue with "shrinking timeouts" exists in the upstream kloak code which this PR was based upon, and it still did a good job at obfuscating user identity according to tests run by kloak's original author. [...] Thus I would say that this is intended behavior, since it's in line with behavior that is known to work well enough. [...]

Ok, thanks for confirming. I guess this means the current solution could be considered good enough.

Separating mouse and keyboard events is theoretically doable, and I'm willing to give it a shot if it would be useful. However, this does mean that the user may be moving the mouse, clicking on things, etc., while the keyboard buffer is still being flushed. With a larger timeout, I can easily see this resulting in the user's keystrokes going to the wrong app or input widgets.

I didn't mean different queues, since those would lead to issues as you point out. What I was thinking of is limiting the randomization for non-keystroke events to much smaller values, so that those don't add up to big values, while still having longer delays for keyboard events. (But without any re-ordering. An event is always behind the previous one in the queue, and its delay is generated independently of the previous one's.)

@ArrayBolt3
Contributor Author

I've been working on a new implementation of kloak for non-Qubes systems that fixes the mouse anonymization issues of the original kloak implementation. The algorithm should be portable to Qubes OS. The code is at https://github.com/ArrayBolt3/kloak-v2 if you're interested. The way the algorithm works is that when a mouse movement is made, the last queued event is checked to see if it is a mouse event, and if so, that event's intended end location for the mouse is changed rather than adding a new mouse event to the queue. It's currently designed specifically for wlroots-based Wayland compositors (i.e. it will not work on X11), so if you want to test it you'll probably want to use a fairly recent version of labwc (which is what I tested against). The Qubes reimplementation won't have the problem of being X11-incompatible since it will be working specifically with X11 input events rather than libinput events.
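The coalescing step described in this comment can be sketched as follows. This is a simplified illustration with a fixed-size array and hypothetical names; it is not code from kloak-v2.

```c
#include <assert.h>
#include <stddef.h>

enum q_kind { Q_MOTION, Q_KEY };

struct q_event {
    enum q_kind kind;
    int x, y;        /* target pointer position (motion events only) */
    int keycode;     /* key identifier (key events only) */
};

#define Q_CAP 64

struct q {
    struct q_event ev[Q_CAP];
    size_t len;
};

/* Queue a key event. Returns 0 on success, -1 if the queue is full. */
int q_push_key(struct q *q, int keycode)
{
    assert(q != NULL);
    if (q->len == Q_CAP)
        return -1;
    q->ev[q->len++] = (struct q_event){ .kind = Q_KEY, .keycode = keycode };
    return 0;
}

/* Queue a motion event. If the last queued event is also a motion event,
 * overwrite its target position instead of adding a new entry. */
int q_push_motion(struct q *q, int x, int y)
{
    assert(q != NULL);
    if (q->len > 0 && q->ev[q->len - 1].kind == Q_MOTION) {
        q->ev[q->len - 1].x = x;
        q->ev[q->len - 1].y = y;
        return 0;
    }
    if (q->len == Q_CAP)
        return -1;
    q->ev[q->len++] = (struct q_event){ .kind = Q_MOTION, .x = x, .y = y };
    return 0;
}
```

A fast sweep of the mouse then occupies a single queue slot whose target keeps updating until it is released, so motion bursts no longer crowd out the delay range available to other events.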

@3hhh

3hhh commented May 10, 2025 via email

@ArrayBolt3
Contributor Author

@3hhh Yes, it will, but we have an identical situation with the way keyboard patterns are obfuscated by kloak. The only way to make an obfuscation program for keyboard or mouse activity that could prevent fingerprinting without requiring a substantial userbase would be to clone someone else's keyboard and mouse usage patterns, which is many times more complicated if it's even possible with existing technology. kloak in its current incarnation is enabled by default in Whonix though, and the new version of kloak will likely be shipped with and enabled by default in Whonix as well, which will provide the needed userbase to allow for this to actually provide enhanced anonymity.

@adrelanos
Member

adrelanos commented May 11, 2025

Quote Whonix: Identifiers Design Goals:

Should the goal be,

A) a shared personality: to have all Whonix appear with the same uniform fingerprint at all times, OR
B) a virtual personality: to invent and emulate a different unique fingerprint for each user?

Whonix design, at the time of writing, is A) (shared personality).

Tor and the Tor Browser take a similar approach to this issue. They do not try to generate a new random identity (pseudonym) each time they are used. Instead, their strategy is to make all users appear the same to outside observers. The Tor Project refers to this concept as Anonymity Loves Company (you can search the web for this term for more background). Since Whonix is designed as an extension of Tor, it follows the same principle.


Emulating a different virtual personality for each user would require extensive research, complex algorithms or even artificial intelligence (AI).

Due to practically unsolvable issues like VM Fingerprinting and Browser Fingerprinting, we don't attempt to implement a virtual personality and stick with the shared personality design.

@adrelanos
Member

Documented here just now:
Keystroke and Mouse Deanonymization wiki page, limitations, Shared Personality.

@3hhh

3hhh commented May 11, 2025 via email

@marmarek
Member

wait for the final click to be made before passing the position

There's a significant issue with this approach - you can't have hover effects anymore. For example, when you want to see a tooltip, or a URL before clicking it.

@ArrayBolt3
Contributor Author

Yeah, that's why I went with automatic updates of the mouse position. Delaying movement events until a click is made causes all sorts of problems - hover is broken, as Marek mentioned, but drag will also be somewhat broken, as well as anything that actually needs to track mouse movement to provide functionality (this is kind of a silly example but slither.io is a game that needs to track mouse movements to even be playable). Mouser tries to work around this by allowing the use of a middle click to move the mouse cursor, but that only solves the hover problem, and it also means the middle mouse button (which some applications have a legitimate use for) will also become unusable.

Development

Successfully merging this pull request may close these issues.

enable qvm-service gui-agent-virtual-input-device for Whonix-Workstation App Qubes by default

6 participants