Datasets created for the purpose of classifying keyboard acoustic emanations.

RoyalDonkey/put-kbd-thesis-datasets

Keyboard Acoustic Emanations datasets

This repository holds five distinct datasets, each containing recordings of individual keystroke sounds. The datasets were created for classification experiments in the bachelor thesis "The sound of typing: using Machine Learning to classify Keyboard Acoustic Emanations" by Marcin Gólski, Piotr Kaszubski and Bartłomiej Woroch.

Dataset structure

Each dataset contains 110 (100 train + 10 test) recordings of each of 43 physical keys: A-Z, 0-9, '-' (dash), ';' (semicolon), "'" (quote), ',' (comma), '.' (period), '/' (forward slash) and ' ' (space).
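As a quick sanity check of the numbers above, the key set and the resulting recording counts can be written out in a few lines of Python. This is only an illustrative sketch: the names `KEYS`, `TRAIN_PER_KEY`, `TEST_PER_KEY`, `NUM_DATASETS` and `recordings_per_dataset` are made up for this example and do not come from the thesis source code.

```python
import string

# The 43 physical keys listed above (names here are illustrative only).
KEYS = (
    list(string.ascii_uppercase)             # A-Z: 26 keys
    + list(string.digits)                    # 0-9: 10 keys
    + ["-", ";", "'", ",", ".", "/", " "]    # 7 remaining keys, including space
)

TRAIN_PER_KEY = 100   # train recordings per key
TEST_PER_KEY = 10     # test recordings per key
NUM_DATASETS = 5      # datasets in this repository


def recordings_per_dataset() -> dict:
    """Return the number of train and test keystrokes in one dataset."""
    return {
        "train": len(KEYS) * TRAIN_PER_KEY,
        "test": len(KEYS) * TEST_PER_KEY,
    }


if __name__ == "__main__":
    counts = recordings_per_dataset()
    print("keys:", len(KEYS))
    print("per dataset:", counts)
    print("all datasets:", NUM_DATASETS * sum(counts.values()))
```

Under these figures, each dataset holds 4300 train and 430 test keystrokes, and the five datasets together hold 23650.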

Each train and test set is stored as a .wav file with a corresponding .keys file. The .keys file is plain text and gives the offset, in seconds, of each keystroke within its .wav file.
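The sketch below shows one way a .wav/.keys pair might be turned into individual keystroke clips using only the Python standard library. It rests on assumptions that this README does not confirm: that each non-blank line of a .keys file holds exactly one offset as a decimal number, and that a fixed 0.2-second window after each offset is a reasonable clip length. The function names `parse_offsets` and `slice_keystrokes` and the `window_s` parameter are hypothetical. Check the official source code repository for the actual file layout and loading code before relying on this.

```python
import wave


def parse_offsets(text: str) -> list[float]:
    """Parse keystroke offsets in seconds.

    ASSUMPTION: one decimal number per non-blank line. The real .keys
    layout may differ; see the official source code repository.
    """
    return [float(line) for line in text.splitlines() if line.strip()]


def slice_keystrokes(wav_path: str, offsets: list[float],
                     window_s: float = 0.2) -> list[bytes]:
    """Cut a fixed-length clip of raw PCM bytes starting at each offset.

    ASSUMPTION: window_s (0.2 s) is an arbitrary illustrative value,
    not a figure taken from the thesis.
    """
    clips = []
    with wave.open(wav_path, "rb") as wav:
        rate = wav.getframerate()
        total = wav.getnframes()
        frames_per_clip = round(window_s * rate)
        for offset in offsets:
            # Clamp the start position so setpos() never goes out of range.
            start = min(max(round(offset * rate), 0), total)
            wav.setpos(start)
            # readframes() returns fewer bytes if the file ends early.
            clips.append(wav.readframes(frames_per_clip))
    return clips
```

A caller would read the .keys file as text, pass it to `parse_offsets`, and hand the result to `slice_keystrokes` together with the matching .wav path. The clips are returned as raw bytes, so decoding them into samples depends on the sample width and channel count of the recording.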

More information

For more information on how the datasets were recorded, their structure, and how to use them, please read the original paper and check the official source code repository: https://github.com/RoyalDonkey/put-kbd-thesis.
