kloak status #10

Open
HulaHoopWhonix opened this issue Aug 9, 2018 · 31 comments

@HulaHoopWhonix

Hi, just pinging you about the status of this project. An eager user asked us about it :)

@vmonaco
Owner

vmonaco commented Aug 9, 2018

Glad there's still interest, I will start to make this a priority. I just moved across country and transitioned to a new position, so productivity has been low lately, but that will change soon. I'll try to reproduce #1 next week. Thanks for checking in.

@HulaHoopWhonix
Author

Awesome and congrats on your new job

@adrelanos
Contributor

There certainly is interest! This is one of my most awaited pieces of software. Looking forward to adding it to Whonix as soon as possible. An internet miscommunication: I guess I was refraining from bothering people I think are busy with other tasks, which then got interpreted as non-interest. Learned something. :)

@vmonaco
Owner

vmonaco commented Aug 17, 2018

It's no bother at all :) Sorry to have put this off for so long. I do have big plans in terms of obfuscating user behavior, not just keystroke biometrics. Also looking at how to obfuscate actions, e.g., temporal keylogging (see my '18 S&P paper). I'll have more bandwidth for this stuff in the new job.

@HulaHoopWhonix
Author

The most comprehensive paper on the topic I've ever seen. Impressive and made a great read.

@PowerPress

Really looking forward to all your enhancements!

@vmonaco
Owner

vmonaco commented Feb 28, 2019

Sorry again for the hiatus. I've been working on this over the past couple weeks and recently made a number of improvements:

  • Rewrote kloak as a single-threaded application and removed the dependency on pthreads. This should fix #1 (Repeating keys, leads to memory error).
  • Added functions to auto-detect the keyboard device and the location of uinput. This should fix #4 (autodetect keyboard device).
  • Added some quantization to emitted event times through a call to sleep in the main loop (a call to sleep() is necessary to avoid hogging the CPU, and this has the side effect of quantizing the time between output events).
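
To make the quantization side effect concrete, here is a minimal simulation sketch in Python. It is not kloak's actual code: it assumes each event gets a uniformly random delay of at most the max delay, that a later event is never scheduled before an earlier one, and that the main loop wakes up at a fixed interval. The function name, the 10 ms polling interval, and the sample timestamps are all made up for illustration.

```python
import random


def release_times(press_times_ms, max_delay_ms=100, poll_ms=10, rng=None):
    """Simulate delayed, quantized release of input events.

    press_times_ms: sorted integer arrival times of input events (ms).
    Returns the integer times (ms) at which each event would be emitted.
    """
    rng = rng or random.Random()
    released = []
    prev_scheduled = 0
    for t in press_times_ms:
        # Never schedule an event before the previous one, so key order is kept.
        lower = min(max(prev_scheduled - t, 0), max_delay_ms)
        scheduled = t + rng.randint(lower, max_delay_ms)
        prev_scheduled = scheduled
        # The loop only wakes up every poll_ms, so the event leaves on the
        # first wake-up at or after its scheduled time (ceiling division).
        released.append(-(-scheduled // poll_ms) * poll_ms)
    return released


if __name__ == "__main__":
    presses = [0, 35, 120, 128, 400]  # illustrative key event times in ms
    print(release_times(presses, rng=random.Random(0)))
```

In this sketch every output time is a multiple of the polling interval, so the gaps between emitted events are quantized even though the random delays themselves are not.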

Note that these changes still don't rule out the possibility of distinguishing kloak users from non-kloak users. I'm currently working on this. I think a potential solution is spoofing (another user or operating system) instead of obfuscation. More info to follow.

@HulaHoopWhonix
Author

Thanks for the update. Ready when you are :-) We are still very eager about Kloak and the need that it fills.

@PowerPress

Curious how you would spoof an OS? Do you mean the network stack?

@vmonaco
Owner

vmonaco commented Mar 27, 2019

Should be good to go for testing. I updated the docs and just uploaded a deb to releases.

I tested on the latest Whonix in VirtualBox and on Ubuntu on bare metal with no issues.

The changelog still needs to be updated (and the package version?). I'm not entirely sure about the format of these files, so I'll refer to @adrelanos.

As for OS spoofing: I have some work in progress that indicates OS family and version can be determined from key event timings. This has to do with the global system clock (if any is used) and the way the scheduler handles interrupts. So, a website could fingerprint your host based on DOM input event timestamps, which defeats other methods of obfuscation, such as spoofing user agent string. I plan to address this issue in a future release of kloak after fleshing out the attack.

@adrelanos
Contributor

> The changelog still needs to be updated (and the package version?). I'm not entirely sure about the format of these files, so I'll refer to @adrelanos.

Run make uch (upstream changelog), which basically just does git log > changelog.upstream.

Packages without an upstream changelog cause a lintian --pedantic warning. I implemented changelog.upstream just for perfection's sake, to eliminate this last lintian warning with a reasonable, acceptable implementation. Since Debian packages upstreams, I don't think there is a standard or convention for upstream changelogs. Unless you'd like to provide a hand-typed changelog.upstream (or one generated by some fancier git log or similar command), I think git log is a good stopgap.

(Btw, for the next version, run make deb-uachl-bumpup-major to bump debian/changelog; it is also just a shortcut that calls debchange.)

@adrelanos
Contributor

https://twitter.com/Whonix/status/1111053743905226752

@HulaHoopWhonix
Author

Package installed and service ran without a hitch. Trained Kloak / authenticated Kloak gives accuracy of 42%! :-D

I'll soon be posting results from another team member for the 2 other tests with normal training, which I avoided for anonymity reasons.

Thanks for the incredible effort and dedication Vinnie.

@vmonaco
Owner

vmonaco commented Mar 28, 2019

@adrelanos Thanks, I should have known to check the other genmkfile targets.

@HulaHoopWhonix Glad it works, and sorry I didn't get to this sooner! More updates to come this year as I work on a method to obfuscate mouse behavior.

@HulaHoopWhonix
Author

HulaHoopWhonix commented Mar 29, 2019

@vmonaco Thanks again, can't wait to see the great things in store. :)

OK here's the results. He did three trials for each scenario for confirmation.

train normal / auth normal

trial 1: 94% accuracy
trial 2: 92% accuracy
trial 3: 94% accuracy

train normal / auth kloak

trial 1: 18% accuracy
trial 2: 15% accuracy
trial 3: 19% accuracy

train kloak / auth kloak

trial 1: 40% accuracy
trial 2: 42% accuracy
trial 3: 36% accuracy

@vmonaco
Owner

vmonaco commented Mar 29, 2019

Nice results.

I suspect that while kloak definitely obfuscates typing behavior, making it difficult to authenticate or identify a particular user, users running kloak may look "similar" to other users running kloak. That is, it might be possible to identify kloak users from non-kloak users. If this is the case, the anonymity set will increase as more users start running kloak.

@adrelanos
Contributor

kloak - Keystroke-level online anonymization kernel: obfuscates typing behavior at the device level - Testers Wanted!

https://forums.whonix.org/t/kloak-keystroke-level-online-anonymization-kernel-obfuscates-typing-behavior-at-the-device-level-testers-wanted/7089

https://twitter.com/Whonix/status/1113071411025928192

https://www.facebook.com/Whonix/photos/a.1138314816210772/2618285614880344

@HulaHoopWhonix
Author

HulaHoopWhonix commented Apr 7, 2019

With kloak running concurrently on both the host and VM I get an even better result of <10% accuracy with train kloak / auth kloak (when testing from within VM).

Testing longer paragraphs yields the same results.

@HulaHoopWhonix
Author

keytrac.net recently switched to longer text paragraphs for authentication only. Here are the test results from someone in the Whonix team. @vmonaco what do you make of the accuracy level of "train kloak/test kloak"?

Train normal, test kloak

Test 1: 6% accuracy
Test 2: 8% accuracy
Test 3: 12% accuracy

Train kloak, test kloak

Test 1: 75% accuracy
Test 2: 70% accuracy
Test 3: 73% accuracy

Train normal, test normal

Test 1: 98% accuracy
Test 2: 96% accuracy
Test 3: 96% accuracy
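
For a quick side-by-side view, the per-trial numbers from this comment and from the Mar 29 results can be averaged per scenario. The short Python sketch below only does that arithmetic. The dictionary names are made up, and the earlier "auth" scenarios are labelled "test" here so that both rounds share the same keys.

```python
from statistics import mean

# Accuracy (%) per trial, copied from the two rounds of results in this thread.
earlier_round = {
    "train normal / test normal": [94, 92, 94],
    "train normal / test kloak": [18, 15, 19],
    "train kloak / test kloak": [40, 42, 36],
}
longer_paragraphs = {
    "train normal / test normal": [98, 96, 96],
    "train normal / test kloak": [6, 8, 12],
    "train kloak / test kloak": [75, 70, 73],
}


def summarize(results):
    """Return the mean accuracy per scenario, rounded to one decimal."""
    return {scenario: round(mean(trials), 1) for scenario, trials in results.items()}


if __name__ == "__main__":
    rounds = (("earlier", earlier_round), ("longer text", longer_paragraphs))
    for label, results in rounds:
        for scenario, avg in summarize(results).items():
            print(f"{label:12s} {scenario:28s} {avg:5.1f}%")
```

The train kloak / test kloak mean goes from roughly 39% in the earlier round to roughly 73% with the longer paragraphs, while train normal / test kloak drops from roughly 17% to roughly 9%.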

@vmonaco
Owner

vmonaco commented May 21, 2019

Thanks for letting me know. It looks like keytrac also underwent some rebranding since I last checked.

Re. the result above, the relatively high accuracy in the train kloak/test kloak scenario (compared to others who have tested) highlights one of the current limitations of kloak: it obfuscates your actual typing behavior (achieving low accuracy in the train normal/test kloak scenario), but does not attempt to make two different kloak sessions from the same user look different. Using kloak, typing behavior starts to look more like white noise. But it does this for everyone, so two different kloak users will both start to look like this white noise. That is, as more people use kloak, the size of the anonymity set will grow.

So kloak currently tries to obfuscate your own behavior and make everyone look similar (can't differentiate between kloak users). I have some ideas for other privacy objectives, such as obfuscating your own behavior and making everyone look different. This could be done by spoofing another (made-up) identity, which would make it difficult to detect kloak vs non-kloak users. This spoofed identity could change over time, say at each login.

With that said, the default max delay of 100 ms might not be the best option for everyone. This really depends on typing speed: slower typists should use a larger max delay. A to-do item is to dynamically adjust the max delay to typing speed.

Edit: can differentiate -> can't differentiate

@adrelanos
Contributor

Tor Project gave up on making users appear different across different sessions. Instead, they attempt to put all Tor Browser users into the same anonymity set. (Or multiple sets according to security slider settings.) Dunno if this would apply here too.

Everyone looking the same, everyone "looking kloak", might be sufficient. Doing better than that, e.g. with a made-up identity, may not be possible or worth it?

> With that said, the default max delay of 100 ms might not be the best option for everyone. This really depends on typing speed: slower typists should use a larger max delay. A to-do item is to dynamically adjust the max delay to typing speed.

That sounds great! Perhaps 3-4 (as much as needed) anonymity sets for different speeds of typists?
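
One way to picture "a few anonymity sets by typing speed" is to snap each user to one of a small number of fixed max-delay tiers rather than tuning the delay per user. The Python sketch below is purely hypothetical: only the 100 ms default comes from this thread. The other tier values, the use of the median inter-key interval, the factor parameter, and the function name are assumptions for illustration and not something kloak implements.

```python
from statistics import median

# Hypothetical tiers in ms; only the 100 ms default is mentioned in the thread.
DELAY_TIERS_MS = (100, 200, 400)


def pick_max_delay(key_times_ms, tiers=DELAY_TIERS_MS, factor=1.0):
    """Pick the smallest tier that is >= factor * median inter-key interval.

    Falls back to the smallest tier when there is too little data and to
    the largest tier for very slow typing.
    """
    intervals = [b - a for a, b in zip(key_times_ms, key_times_ms[1:])]
    if not intervals:
        return tiers[0]
    target = factor * median(intervals)
    for tier in tiers:
        if tier >= target:
            return tier
    return tiers[-1]


if __name__ == "__main__":
    fast_typist = [0, 80, 170, 250]    # illustrative key times in ms
    slow_typist = [0, 150, 320, 480]
    print(pick_max_delay(fast_typist), pick_max_delay(slow_typist))
```

Keeping the number of tiers small is the point of the design: everyone in a tier shares the same max delay, so the tier itself reveals only a coarse typing-speed bucket.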

@HulaHoopWhonix
Author

@adrelanos If everybody looks kloak, but uniquely differ from each other and their style with kloak is the same across all sessions, you would have pseudonymous typing patterns. If a user types with kloak once non-anonymously, an adversary with stored patterns can go back and link all texts typed by the same person.

@adrelanos
Contributor

> If everybody looks kloak, but uniquely differ from each other and their style with kloak is the same across all sessions, you would have pseudonymous typing patterns.

That would be bad indeed.

That's, btw, not what I meant, I think. What I meant to say is "If everyone is looking the same, if everyone is looking kloak without a uniquely identifiable pseudonym, then that's not that bad." That's what we are used to with Tor Browser too.

But I indeed overlooked something important here. What if someone uses kloak non-anonymously first and anonymously later (or vice versa)? Since the number of kloak users will be initially low, data harvesters could just guess (give a probability) that it's the same person. Could we derive a recommendation "don't ever use kloak non-anonymously, only use kloak anonymously" from that?

In this context...

> This could be done by spoofing another (made-up) identity, which would make it difficult to detect kloak vs non-kloak users.

In this context I somehow doubt that's possible. Suppose someone is using a standard browser like most people are doing nowadays and is completely tracked by cookies (or similar tracking technology for the sake of argument). If the typing fingerprint changes all the time to another made up identity, then that is quite unlikely from the perspective of the data harvester and the data harvester would more likely conclude "user of kloak".

> This spoofed identity could change over time, say at each login.

If going for this: why not change the spoofed identity all the time, why only at some to be specified trigger (such as login)?

@vmonaco
Owner

vmonaco commented May 27, 2019

> That sounds great! Perhaps 3-4 (as much as needed) anonymity sets for different speeds of typists?

> That's, btw, not what I meant, I think. What I meant to say is "If everyone is looking the same, if everyone is looking kloak without a uniquely identifiable pseudonym, then that's not that bad." That's what we are used to with Tor Browser too.

Yes, I think that's a good approach. Presumably, most kloak users are Tor users, so having a similar anonymity model makes sense. In this case, the "slider" would be the max delay setting, choosing a higher value to be part of a stronger anonymity set.

> But I indeed overlooked something important here. What if someone uses kloak non-anonymously first and anonymously later (or vice versa)? Since the number of kloak users will be initially low, data harvesters could just guess (give a probability) that it's the same person. Could we derive a recommendation "don't ever use kloak non-anonymously, only use kloak anonymously" from that?

Since kloak tries to give everyone the "same pseudonym", this is certainly a concern. This is one motivation for making that pseudonym a moving target. With a low number of kloak users, it's also a concern that the small anonymity set enables tracking kloak users, assuming it's easy to distinguish kloak from non-kloak users (which I think it is). But from your comments above, placing all users in the same anonymity set and making the recommendation to only use kloak anonymously seems like a good compromise.

> In this context I somehow doubt that's possible. Suppose someone is using a standard browser like most people are doing nowadays and is completely tracked by cookies (or similar tracking technology for the sake of argument). If the typing fingerprint changes all the time to another made up identity, then that is quite unlikely from the perspective of the data harvester and the data harvester would more likely conclude "user of kloak".

> If going for this: why not change the spoofed identity all the time, why only at some to be specified trigger (such as login)?

Yep, the change could be continuous. But probably with the same caveat you point out above (an indication of someone using the tool).

Thinking generally: ideally, a behavior obfuscation tool like kloak would make it difficult to differentiate between:

  1. My obfuscated self and my true self
  2. My obfuscated self in two different sessions
  3. Two different obfuscated users
  4. Tool users and non-users

The challenge is, some of these objectives are competing (potentially 3 and 4) and make other assumptions (if 4 can't be achieved, 3 assumes all users agree to be in the same anonymity set).

There are some other recent works on obfuscating behavior, like authorship (https://www.usenix.org/conference/usenixsecurity18/presentation/shetty). Compared to anonymizing networks, I don't think we yet have a good framework for reasoning about these methods (I'm working on this!).

@vmonaco
Owner

vmonaco commented Sep 10, 2019

FYI, here's another use case for kloak: info leaked through network traffic induced by keystrokes.

@adrelanos
Contributor

Happy to announce that kloak is installed by default for all users of Non-Qubes-Whonix.

(Qubes-Whonix issue: QubesOS/qubes-issues#2558)

Documented here:
https://www.whonix.org/wiki/Keystroke_Deanonymization

@PowerPress

PowerPress commented Sep 12, 2019 via email

@adrelanos
Contributor

The ticket for Qubes-Whonix is here:

Is Kloak available for Qubes-Whonix?

@adrelanos
Contributor

@siliconwaffle

Has kloak been abandoned? I noticed @vmonaco has been pretty much completely inactive on GitHub for ~5 months now, and this repo hasn't been touched for ~6 months. @adrelanos are you or someone else with Whonix still maintaining kloak independently of @vmonaco? I want to submit a kloak package for Fedora; should I just track Whonix's kloak repo instead of this one?

@adrelanos
Contributor

adrelanos commented Mar 24, 2024 via email
