Export and visualize the keystroke data generated by GiacomoLaw's Keylogger Tool.

calebfergie/keylogger-parsing

Keylogger Parsing

This repository contains a tool to export and visualize the keystroke data generated by GiacomoLaw's Keylogger Tool.

This tool ingests the keystroke.log file produced by the Keylogger tool and generates two things:

A. Five JSON files:

  1. Frequency of single commands typed (e.g. [return], [left-cmd], [del]).

  2. Frequency of single non-commands (keystrokes & words) typed (e.g. t, 13, not).

  3. Frequency of bi-grams (2 keystroke combinations - e.g. hi+there, [left-cmd]+[tab], ma+[return]).

  4. Frequency of tri-grams (3 keystroke combinations - e.g. [left-cmd]+[left-shift]+v, [left-cmd]+[tab]+[tab], i+love+you).

^these 4 JSON files can be used for data analysis.

  5. Frequency of bi-grams (like #3, but in the "graph" data format used by the D3 Sankey).

^this JSON is used to create the Sankey visualization described below.

B. A Sankey Diagram of Bi-grams:

An interactive visualization of bi-grams made with Evan Galloway's D3 Sankey Diagram. You can see my version of it here.

Using This Tool - Quick & Dirty

If you have data from GiacomoLaw's Keylogger Tool in a keystroke.log file, you can use this tool by following these steps:

  1. Clone or download this repository: git clone https://github.com/calebfergie/keylogger-parsing.git

  2. Install node dependencies: cd keylogger-parsing && npm install

  3. Add/copy your keystroke.log file into the data folder - public/data

  4. Start the node server: node bin/www

You should see the following in your terminal:

updated words JSON
updated bigrams JSON
updated commands JSON
updated trigrams JSON
updated bigrams-sankey JSON
finished running log parser

...and the data folder should now have files (commands.json, words.json, bigrams.json, trigrams.json) updated with your data.

If you navigate to localhost:5000 in your browser, the Sankey diagram should appear. It is slightly interactive: try dragging the nodes up & down.
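
Once the data folder is populated, the generated files can also be inspected from a short script. The sketch below is not part of this repository: the helper names loadFrequencies and topEntries are hypothetical, and it only assumes that the files sit in public/data and follow the value/frequency shape described in this README.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: read one of the generated JSON files from the data folder.
function loadFrequencies(fileName, dataDir = path.join("public", "data")) {
  const raw = fs.readFileSync(path.join(dataDir, fileName), "utf8");
  return JSON.parse(raw);
}

// Return the n most frequent entries without mutating the input array.
function topEntries(entries, n) {
  return [...entries].sort((a, b) => b.frequency - a.frequency).slice(0, n);
}

// Inline data shaped like bigrams.json, so the example runs without any files.
const sample = [
  { value: ["left-cmd", "c"], frequency: 2388 },
  { value: ["left-cmd", "v"], frequency: 2496 },
  { value: ["left-option", "left-shift"], frequency: 2206 },
];

const top2 = topEntries(sample, 2);
console.log(top2.map((e) => e.value.join("+") + ": " + e.frequency));
// prints [ 'left-cmd+v: 2496', 'left-cmd+c: 2388' ]
```

With real data, something like topEntries(loadFrequencies("bigrams.json"), 10) should list the ten most frequent bi-grams.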

Using This Tool - Details

JSON File Details

The JSON files mentioned above are formatted as in the examples below:

  1. Frequency of single commands typed (e.g. [return], [left-cmd], [del]):
[{"value":"left-cmd","type":"command","frequency":62706},
{"value":"del","type":"command","frequency":33336},
{"value":"left-shift","type":"command","frequency":27040}]
  2. Frequency of single non-commands (keystrokes & words) typed (e.g. t, 13, not):
[{"value":"if","type":"character","frequency":97},
{"value":"can","type":"character","frequency":96},
{"value":"do","type":"character","frequency":95}]
  3. Frequency of bi-grams (2 keystroke combinations - e.g. hi+there, [left-cmd]+[tab], ma+[return]):
[{"value":["left-cmd","v"],"frequency":2496},
{"value":["left-cmd","c"],"frequency":2388},
{"value":["left-option","left-shift"],"frequency":2206}]
  4. Frequency of tri-grams (3 keystroke combinations - e.g. [left-cmd]+[left-shift]+v, [left-cmd]+[tab]+[tab], i+love+you):
[{"value":["return","return","return"],"frequency":718},
{"value":["left","left-option","left-shift"],"frequency":713},
{"value":["s","left-cmd","left-cmd"],"frequency":712}]
  5. Frequency of bi-grams in the "graph" data format used by the D3 Sankey:
  {
    "nodes":[
            {"name":"left-cmd","type":"source"},
            {"name":"down","type":"target"}
            ...],
    "links":[
            {"source":14,"target":3,"value":527},
            {"source":14,"target":41,"value":526}
            ...]
  }
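
To make the relationship between the bi-gram list and the "graph" format concrete, here is a rough sketch of one way to derive nodes and links from bi-gram entries. The function bigramsToSankey is hypothetical and is not taken from log-parser.js. It assumes that a key gets a separate node for each role (source or target) and that links refer to nodes by array index, as the example above suggests. The node ordering in the real file may differ.

```javascript
// Hypothetical conversion from bigrams.json-style entries to a nodes/links graph.
function bigramsToSankey(bigrams) {
  const nodes = [];
  const positions = new Map(); // "type:name" -> index in the nodes array

  // Look up a node's index, creating the node on first use.
  const nodeIndex = (name, type) => {
    const key = type + ":" + name;
    if (!positions.has(key)) {
      positions.set(key, nodes.length);
      nodes.push({ name, type });
    }
    return positions.get(key);
  };

  const links = bigrams.map(({ value: [first, second], frequency }) => ({
    source: nodeIndex(first, "source"),
    target: nodeIndex(second, "target"),
    value: frequency,
  }));

  return { nodes, links };
}

const graph = bigramsToSankey([
  { value: ["left-cmd", "v"], frequency: 2496 },
  { value: ["left-cmd", "c"], frequency: 2388 },
  { value: ["left-option", "left-shift"], frequency: 2206 },
]);
console.log(JSON.stringify(graph.links[0]));
// prints {"source":0,"target":1,"value":2496}
```

Keeping the lookup in a Map rather than on a plain object or array means that key names are never confused with built-in properties.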

Parsing the .log file:

This tool is written in Node.js. The code that processes keystroke.log is stored in the log-parser.js file in the repository.
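
For readers who want a feel for the counting step, here is a minimal sketch of how n-gram frequencies could be computed once the log has been split into tokens. It is an illustration only: countNgrams and the token array are hypothetical, and the actual log-parser.js may tokenize and count differently.

```javascript
// Hypothetical n-gram counter: slides a window of size n over the token list.
function countNgrams(tokens, n) {
  const counts = new Map();
  for (let i = 0; i + n <= tokens.length; i++) {
    const gram = tokens.slice(i, i + n);
    const key = JSON.stringify(gram); // arrays are not usable as Map keys by value
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Convert to the value/frequency shape and sort by descending frequency.
  return [...counts.entries()]
    .map(([key, frequency]) => ({ value: JSON.parse(key), frequency }))
    .sort((a, b) => b.frequency - a.frequency);
}

// Assumed token stream: command names and typed characters in order.
const tokens = ["left-cmd", "c", "left-cmd", "v", "left-cmd", "v"];
const bigrams = countNgrams(tokens, 2);
console.log(JSON.stringify(bigrams[0]));
// prints {"value":["left-cmd","v"],"frequency":2}
```

Calling countNgrams(tokens, 3) yields tri-grams in the same shape.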

The bigrams.json and trigrams.json files don't include all bi-grams and tri-grams. They are limited to results that appear at least a minimum number of times. You can change this threshold by changing the value of freqFilter in log-parser.js, set to 250 in the example below:

var freqFilter = 250; //minimum number of occurrences to be included in the output
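
As an illustration of what the threshold does, the sketch below keeps only entries whose frequency is at least the given minimum. The function applyFreqFilter is hypothetical and is not copied from log-parser.js, which may apply the filter in a different place or manner.

```javascript
// Hypothetical filter: keep n-grams that occur at least minFrequency times.
function applyFreqFilter(ngrams, minFrequency) {
  return ngrams.filter((entry) => entry.frequency >= minFrequency);
}

// Tri-gram entries taken from the examples in this README.
const trigramSample = [
  { value: ["return", "return", "return"], frequency: 718 },
  { value: ["left", "left-option", "left-shift"], frequency: 713 },
  { value: ["s", "left-cmd", "left-cmd"], frequency: 712 },
];

const kept = applyFreqFilter(trigramSample, 715);
console.log(kept.length); // prints 1
```

A lower threshold produces larger files and a denser diagram, while a higher one keeps only the most common combinations.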

The app.js file runs the log-parser.js file and then serves the D3 visualization through an express server.

The code for the D3 tool is adapted from Evan Galloway's D3 Sankey Diagram, stored in the file galloway-sankey.js.

Known Issues

1. Presses and Releases Are Both Recorded

The Keylogger records both the press and release of some commands (e.g. [shift], [cmd], [ctrl]). For example, the keystroke combo Command+Tab would actually appear as ["left-cmd", "tab", "left-cmd"]. Here's a video demonstrating what I mean.

This 'double-dipping' effect makes it harder to analyze this information, as there is a superfluous keystroke injected between other real ones.

I put in a feature request for this on GitHub, so we'll see if any updates occur. Otherwise, the log-parser.js file will need to be updated to handle this.
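
One possible workaround, sketched here and not implemented in log-parser.js, is to treat each modifier as a toggle: the first sighting counts as the press and is kept, and the next sighting counts as the release and is dropped. The dropReleases function and the MODIFIERS list are assumptions for illustration. In particular, the right-hand and ctrl key names are guesses at how the Keylogger labels them, and a missed event in the log would flip the press/release parity.

```javascript
// Assumed set of keys that are logged on both press and release.
const MODIFIERS = new Set([
  "left-cmd", "right-cmd",
  "left-shift", "right-shift",
  "left-option", "right-option",
  "left-ctrl", "right-ctrl",
]);

// Hypothetical cleanup pass: keep presses, drop the matching releases.
function dropReleases(tokens, modifiers = MODIFIERS) {
  const held = new Set(); // modifiers currently considered pressed
  const result = [];
  for (const token of tokens) {
    if (!modifiers.has(token)) {
      result.push(token); // ordinary key: always keep
    } else if (held.has(token)) {
      held.delete(token); // second sighting: treat as release and skip it
    } else {
      held.add(token); // first sighting: treat as press and keep it
      result.push(token);
    }
  }
  return result;
}

console.log(dropReleases(["left-cmd", "tab", "left-cmd", "a"]));
// prints [ 'left-cmd', 'tab', 'a' ]
```

Running a pass like this before counting n-grams would mean that Command+Tab is counted as the bi-gram left-cmd+tab and not as a tri-gram ending in a second left-cmd.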

2. Words that are also Array Methods

Words that are also array methods (e.g. push, pop, shift) are not processed correctly for the D3 data viz by log-parser.js. For my personal data set, I added the following alterations to handle it for source and target nodes:

if (source.match(/^(push|find|keys|some|map|shift|every|pop|unshift)$/)) {
  source = source + "_"
}

and...

if (target.match(/^(push|find|keys|some|map|shift|every|pop|unshift)$/)) {
  target = target + "_"
}

If you receive an error that reads "could not find X of type Y in the nodes array" (which breaks the Sankey diagram), add the word X to the list of words above.
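
If maintaining the word list by hand becomes tedious, a more general check is conceivable. The sketch below assumes, without having confirmed it against log-parser.js, that the problem comes from words colliding with property names inherited from Array.prototype. The helper escapeNodeName is hypothetical and would be applied to both source and target.

```javascript
// Hypothetical helper: suffix any word that is also an Array.prototype property
// name (push, pop, shift, map, length, ...), so it cannot collide with one.
function escapeNodeName(name) {
  return name in Array.prototype ? name + "_" : name;
}

console.log(escapeNodeName("push"));  // prints push_
console.log(escapeNodeName("hello")); // prints hello
```

This also catches names that are not in the list above, such as includes or length, at the cost of renaming a few more words in the diagram.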

Context

This tool was made in order to perform analysis on my own keystroke data. Use at your own risk! ⚠️

It was done in an effort to understand my conscious and subconscious decisions - as part of NYU ITP's Rest of You class.

  • Feb. 4: Installed this keylogger on my Mac.
  • Feb. 23: Created first log-parser.js file.
  • March 23: Added Sankey data visualization & cleaned up tool.

Data Analysis

I was mostly interested in what keystrokes I typed in combination - keyboard shortcuts (e.g. ctrl+c, ctrl+v, ctrl+tab) and repeated key presses (e.g. tab+tab+tab, delete+delete+delete).

You can read more about it here.
