# Keyboard Walking Mitigation
## Jeremy Filizetti (@JayFoxtrot)
---


## What is it?

Creating a password based on a keyboard pattern rather than memorizing the actual characters

## How did I get here?

### Poor Policies
Password policies that are obnoxious and impractical

- 12-16 characters long including 4 character classes (upper, lower, number, symbol)
- No words can be part of the password
- Maximum of 3-4 of one character class before changing
- No reuse of the last 20 passwords
- Changes every 60-90 days
- No use of a password manager
- Cannot match a password used elsewhere
      
### Why is this guidance ignored

- Impractical guidance with many accounts 
- Some people are lazy (like me)
- Easy to satisfy complexity rules with minimal thought
- Easy to accommodate frequent password changes
  
### Modern Guidance

From NIST SP 800-63B

- Verifiers SHALL require subscriber-chosen memorized secrets to be at least 8 characters in length. Verifiers SHOULD permit subscriber-chosen memorized secrets at least 64 characters in length
- When processing requests to establish and change memorized secrets, verifiers SHALL compare the prospective secrets against a list that contains values known to be commonly-used, expected, or compromised. For example, the list MAY include, but is not limited to:
    - Passwords obtained from previous breach corpuses.
    - Dictionary words.
    - Repetitive or sequential characters (e.g. ‘aaaaaa’, ‘1234abcd’).
    - Context-specific words, such as the name of the service, the username, and derivatives thereof.
- Verifiers SHOULD NOT impose other composition rules (e.g., requiring mixtures of different character types or prohibiting consecutively repeated characters) for memorized secrets. Verifiers SHOULD NOT require memorized secrets to be changed arbitrarily (e.g., periodically). However, verifiers SHALL force a change if there is evidence of compromise of the authenticator


## Why do I care?

- Passwords are still all over despite attempts to supplant them
- MFA isn't easy to deploy everywhere
    - In the environments I commonly work in, cell phones and USB devices are not permitted
    - Some horrible implementations still require a separate password after the initial sign-on
- Large blacklists have usability concerns
    - Passwords shouldn't be passed around in a non-cryptographic form, so using a centrally located large list is impractical
    - A 16-character keyboard walk file is approximately 5 GB
        - https://github.com/Rich5/Keyboard-Walk-Generators/blob/master/README.txt
    - The performance cost of parsing large lists could be prohibitive
- If things can be done algorithmically this eliminates the need to create and manage large lists


## What's here?
- Characterizing keyboard walking
- Quantifying keyboard walking factors
- Effectiveness against most common passwords
- Code used for the Python side of the analysis
- Analysis of several lists
    - English Language Words
    - Hashcat kwprocessor (utility for generating keyboard walks)
    - pwqgen generated passwords
    - keepassxc generated passwords
    - rockyou.txt
    - rockyou2021.txt


## What's missing?
- I didn't find many details online about attempts to prevent this, so perhaps there is a better resource
- Only English is addressed here
- Currently no Unicode or spaces
    - Characters that don't match a standard keyboard character are ignored for the determination
- The pipe-separated CSV output is problematic for passwords that contain a pipe
    - Will fix in the future
- A specific implementation
    - Just haven't made it here yet, but it's in the plans
    - Will be a Linux PAM module for the change auth token phase
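
One possible way around the pipe-separator issue is to let a CSV library quote any field that contains the delimiter. The sketch below is an illustration only, not the fix planned for this project: `write_rows` and `read_rows` are hypothetical helper names, and the two-column layout is a simplified assumption.

```python
import csv
import io

def write_rows(rows):
    """Serialize rows with '|' as the delimiter, quoting fields when needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="|", quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    return buf.getvalue()

def read_rows(text):
    """Parse text produced by write_rows back into lists of strings."""
    return list(csv.reader(io.StringIO(text), delimiter="|"))

# A password containing both the delimiter and a quote character
rows = [["word", "wordlen"], ['pa|ss"word', "10"]]
round_trip = read_rows(write_rows(rows))
```

Because the quoting follows the usual CSV convention, `pandas.read_csv(..., sep='|')` should read such a file with its default quote handling.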



---

# Detecting keyboard walking

## Attributes of Keyboard Walking

- Distance between keys
- Direction of travel


## Measure Keyboard distances

- To measure distance, the first character of the password is considered the base and measurements start with the second character.
- For the sake of this talk, the A and " keys are not adjacent

|  |  |
:-:|:-:
![Distance 0](images/base-d-w0.png) | ![Distance 1](images/base-d-w1.png)
![Distance 2](images/base-d-w2.png) | ![Distance 3](images/base-d-w3.png) 
![Distance 4](images/base-d-w4.png) | ![Distance 5](images/base-d-w5.png)
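
The distance vectors shown in the examples in this talk are consistent with counting horizontal plus vertical key steps on an unstaggered grid, with shifted characters sharing the position of their unshifted key. The sketch below assumes exactly that; the `ROWS` layout, `KEY_POS`, and `distance_vector` are illustrative names, not the code used for the analysis.

```python
# Assumed US QWERTY layout on a square grid. A leading space is a placeholder
# for the Tab / Caps Lock / Shift column so the letter rows line up.
ROWS = [
    ("`1234567890-=", "~!@#$%^&*()_+"),
    (" qwertyuiop[]\\", " QWERTYUIOP{}|"),
    (" asdfghjkl;'", ' ASDFGHJKL:"'),
    (" zxcvbnm,./", " ZXCVBNM<>?"),
]

# Map every character to the (column, row) of its physical key.
KEY_POS = {}
for row, (plain, shifted) in enumerate(ROWS):
    for col, (lower, upper) in enumerate(zip(plain, shifted)):
        if lower != " ":
            KEY_POS[lower] = (col, row)
            KEY_POS[upper] = (col, row)

def distance_vector(password):
    """Distances between consecutive keys; unknown characters are ignored."""
    keys = [KEY_POS[c] for c in password if c in KEY_POS]
    return [abs(x2 - x1) + abs(y2 - y1)
            for (x1, y1), (x2, y2) in zip(keys, keys[1:])]
```

Under these assumptions `distance_vector("dDfFrReE3#4$")` alternates between 0 and 1, since each shifted character sits on the same key as the character before it.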


## Capture Directions

- Look at the direction of travel
- Group directions

Why group directions?  For the purposes of a walk, directions are symmetrical.  Consider the following:

|  |  |
:-:|:-:
Horizontal (&#x2194;) | Vertical (&#x2195;)
![Direction Horizontal](images/horizontal-d.png) | ![Direction Vertical](images/vertical-d.png)
Positive Slope (&#x2922;)| Negative Slope (&#x2921;)
![Direction Positive Slope](images/posslope-d.png) | ![Direction Negative Slope](images/negslope-d.png)
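
A minimal sketch of the grouping, assuming each key is described by a (column, row) pair with row 0 being the number row and rows counting downward. The function name `direction` and the string labels are illustrative choices, not taken from the original code.

```python
def direction(start, end):
    """Classify the move between two (column, row) key positions.

    Opposite moves fall into the same group, so left-to-right and
    right-to-left are both "horizontal".
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    if dx == 0 and dy == 0:
        return "same"          # same key, e.g. 'd' followed by 'D'
    if dy == 0:
        return "horizontal"
    if dx == 0:
        return "vertical"
    # Rows count downward, so moving up and to the right (or down and to
    # the left) gives dx and dy opposite signs: a positive slope on screen.
    return "positive" if dx * dy < 0 else "negative"
```

With 'q' at (1, 1) and '2' at (2, 0), `direction((1, 1), (2, 0))` falls in the positive-slope group, and so does the reverse move.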




---

# Quantifying Keyboard Walking Factors

1. At the most basic level, keyboard walking won't move more than one space
    - What ratio of the distances is equal to 0?
    - What ratio of the distances is equal to 1?
2. Find sequences in the distance vector
3. Consider partial sequences
4. Consider the direction to minimize false positives


## 1. Looking at distances of 0 and 1

- Small distances mean minimal keyboard travel
- Most password criteria would reject repeat characters.
    - Except the shift key changes that factor
- Direction not really a factor
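
The two ratios can be computed directly from a distance vector. This is a small sketch with an illustrative function name; it divides by the number of distances, which matches how the factors are counted on these slides.

```python
def small_distance_ratios(distances):
    """Return (ratio of 0s, ratio of 1s, ratio of 0s or 1s) for a distance vector."""
    total = len(distances)
    if total == 0:
        return 0.0, 0.0, 0.0
    zeros = distances.count(0)
    ones = distances.count(1)
    return zeros / total, ones / total, (zeros + ones) / total

# Distance vector for a password that alternates a key with its shifted form
zeros_ratio, ones_ratio, combined = small_distance_ratios(
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
```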

### Examples

Example 1 <br>
Password:  dDfFrReE3#4$ <br>
Distance:  0,1,0,1,0,1,0,1,0,1,0 <br>
Direction: &#x2b6f; &#x2194; &#x2b6f; &#x2195; &#x2b6f; &#x2194; &#x2b6f; &#x2195; &#x2b6f; &#x2194; &#x2b6f;<br>
Factors:  6 of 11 distances = 0;   5 of 11 distances = 1;  11 of 11 distances = 0 or 1 <br>

![example1.gif](images/example1.gif)


Example 2 <br>
Password: qwer4321QWER$#@! <br>
Distance: 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1<br>
Direction: &#x2194; &#x2194; &#x2194; &#x2195; &#x2194; &#x2194; &#x2194; &#x2195; &#x2194; &#x2194; &#x2194; &#x2195; &#x2194; &#x2194; &#x2194; <br>
Factors: 15 of 15 distances = 1 <br>

![example2.gif](images/example2.gif)

Example 3 <br>
Password: zxcvfdsaqwer4321 <br>
Distance: 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 <br>
Direction: &#x2194; &#x2194; &#x2194; &#x2195; &#x2194; &#x2194; &#x2194; &#x2195; &#x2194; &#x2194; &#x2194; &#x2195; &#x2194; &#x2194; &#x2194; <br>
Factors: 15 of 15 distances = 1 <br>

![example3.gif](images/example3.gif)

Example 4 <br>
Password: zxcvasdfqwer1234 <br>
Distance: 1,1,1,4,1,1,1,4,1,1,1,4,1,1,1 <br>
Direction: &#x2194; &#x2194; &#x2194; &#x2921; &#x2194; &#x2194; &#x2194; &#x2921; &#x2194; &#x2194; &#x2194; &#x2921; &#x2194; &#x2194; &#x2194; <br>
Factors: 12 of 15 (80%) distances = 1<br>

![example4.gif](images/example4.gif)

Example 5 - The limits of small distances<br>
Password: 1q2w3e4r!Q@W#E$R <br>
Distance: 1,2,1,2,1,2,1,4,1,2,1,2,1,2,1 <br>
Direction: &#x2195; &#x2922; &#x2195; &#x2922; &#x2195; &#x2922; &#x2195; &#x2921; &#x2195; &#x2922; &#x2195; &#x2922; &#x2195; &#x2922; &#x2195; <br>
Factors: 8 of 15 distances = 1<br>

![example5.gif](images/example5.gif)

Example 6 - The need for something more complex<br>
Password: bzgatq51BZGATQ%! <br>
Distance: 4,5,4,5,4,5,4,7,4,5,4,5,4,5,4 <br>
Direction: &#x2194; &#x2922; &#x2194; &#x2922; &#x2194; &#x2922; &#x2194; &#x2921; &#x2194; &#x2922; &#x2194; &#x2922; &#x2194; &#x2922; &#x2194; <br>
Factors: None<br>

![example6.gif](images/example6.gif)

## 2. Find matching sequences in the distance vector

Look back at example 5 and there appear to be some patterns in the distance vector

    Password: 1q2w3e4r!Q@W#E$R
    Distance: 1,2,1,2,1,2,1,4,1,2,1,2,1,2,1


- Of all the possible sequences found, which makes up the largest portion of the vector?
- Find sequences using a sliding window of increasing size
    - Try window sizes up to floor(distance vector size / 2)
    - For each size, slide the window's starting position across the entire vector and look for later matches

<pre> 
    Sequence length:  1/1/1   distance_total:  2 <span style="color:#00AA00">1</span>2<span style="color:#AA0000">1</span>212141212121
    Sequence length:  1/2/1   distance_total:  3 <span style="color:#00AA00">1</span>212<span style="color:#AA0000">1</span>2141212121
    Sequence length:  1/3/1   distance_total:  4 <span style="color:#00AA00">1</span>21212<span style="color:#AA0000">1</span>41212121
    Sequence length:  1/4/1   distance_total:  5 <span style="color:#00AA00">1</span>2121214<span style="color:#AA0000">1</span>212121
    Sequence length:  1/5/1   distance_total:  6 <span style="color:#00AA00">1</span>212121412<span style="color:#AA0000">1</span>2121
    Sequence length:  1/6/1   distance_total:  7 <span style="color:#00AA00">1</span>21212141212<span style="color:#AA0000">1</span>21
    Sequence length:  1/7/1   distance_total:  8 <span style="color:#00AA00">1</span>2121214121212<span style="color:#AA0000">1</span>
    Sequence length:  1/1/1   distance_total:  2 1<span style="color:#00AA00">2</span>1<span style="color:#AA0000">2</span>12141212121
    Sequence length:  1/2/1   distance_total:  3 1<span style="color:#00AA00">2</span>121<span style="color:#AA0000">2</span>141212121
    Sequence length:  1/3/1   distance_total:  4 1<span style="color:#00AA00">2</span>1212141<span style="color:#AA0000">2</span>12121
    Sequence length:  1/4/1   distance_total:  5 1<span style="color:#00AA00">2</span>121214121<span style="color:#AA0000">2</span>121
    Sequence length:  1/5/1   distance_total:  6 1<span style="color:#00AA00">2</span>12121412121<span style="color:#AA0000">2</span>1
    Sequence length:  1/1/1   distance_total:  2 12<span style="color:#00AA00">1</span>2<span style="color:#AA0000">1</span>2141212121
    Sequence length:  1/2/1   distance_total:  3 12<span style="color:#00AA00">1</span>212<span style="color:#AA0000">1</span>41212121
    Sequence length:  1/3/1   distance_total:  4 12<span style="color:#00AA00">1</span>21214<span style="color:#AA0000">1</span>212121
    Sequence length:  1/4/1   distance_total:  5 12<span style="color:#00AA00">1</span>2121412<span style="color:#AA0000">1</span>2121
    Sequence length:  1/5/1   distance_total:  6 12<span style="color:#00AA00">1</span>212141212<span style="color:#AA0000">1</span>21
    Sequence length:  1/6/1   distance_total:  7 12<span style="color:#00AA00">1</span>21214121212<span style="color:#AA0000">1</span>
    Sequence length:  1/1/1   distance_total:  2 121<span style="color:#00AA00">2</span>1<span style="color:#AA0000">2</span>141212121
    Sequence length:  1/2/1   distance_total:  3 121<span style="color:#00AA00">2</span>12141<span style="color:#AA0000">2</span>12121
    Sequence length:  1/3/1   distance_total:  4 121<span style="color:#00AA00">2</span>1214121<span style="color:#AA0000">2</span>121
    Sequence length:  1/4/1   distance_total:  5 121<span style="color:#00AA00">2</span>121412121<span style="color:#AA0000">2</span>1
    Sequence length:  1/1/1   distance_total:  2 1212<span style="color:#00AA00">1</span>2<span style="color:#AA0000">1</span>41212121
    Sequence length:  1/2/1   distance_total:  3 1212<span style="color:#00AA00">1</span>214<span style="color:#AA0000">1</span>212121
    Sequence length:  1/3/1   distance_total:  4 1212<span style="color:#00AA00">1</span>21412<span style="color:#AA0000">1</span>2121
    Sequence length:  1/4/1   distance_total:  5 1212<span style="color:#00AA00">1</span>2141212<span style="color:#AA0000">1</span>21
    Sequence length:  1/5/1   distance_total:  6 1212<span style="color:#00AA00">1</span>214121212<span style="color:#AA0000">1</span>
    Sequence length:  1/1/1   distance_total:  2 12121<span style="color:#00AA00">2</span>141<span style="color:#AA0000">2</span>12121
    Sequence length:  1/2/1   distance_total:  3 12121<span style="color:#00AA00">2</span>14121<span style="color:#AA0000">2</span>121
    Sequence length:  1/3/1   distance_total:  4 12121<span style="color:#00AA00">2</span>1412121<span style="color:#AA0000">2</span>1
    Sequence length:  1/1/1   distance_total:  2 121212<span style="color:#00AA00">1</span>4<span style="color:#AA0000">1</span>212121
    Sequence length:  1/2/1   distance_total:  3 121212<span style="color:#00AA00">1</span>412<span style="color:#AA0000">1</span>2121
    Sequence length:  1/3/1   distance_total:  4 121212<span style="color:#00AA00">1</span>41212<span style="color:#AA0000">1</span>21
    Sequence length:  1/4/1   distance_total:  5 121212<span style="color:#00AA00">1</span>4121212<span style="color:#AA0000">1</span>
    Sequence length:  1/1/1   distance_total:  2 12121214<span style="color:#00AA00">1</span>2<span style="color:#AA0000">1</span>2121
    Sequence length:  1/2/1   distance_total:  3 12121214<span style="color:#00AA00">1</span>212<span style="color:#AA0000">1</span>21
    Sequence length:  1/3/1   distance_total:  4 12121214<span style="color:#00AA00">1</span>21212<span style="color:#AA0000">1</span>
    Sequence length:  1/1/1   distance_total:  2 121212141<span style="color:#00AA00">2</span>1<span style="color:#AA0000">2</span>121
    Sequence length:  1/2/1   distance_total:  3 121212141<span style="color:#00AA00">2</span>121<span style="color:#AA0000">2</span>1
    Sequence length:  1/1/1   distance_total:  2 1212121412<span style="color:#00AA00">1</span>2<span style="color:#AA0000">1</span>21
    Sequence length:  1/2/1   distance_total:  3 1212121412<span style="color:#00AA00">1</span>212<span style="color:#AA0000">1</span>
    Sequence length:  1/1/1   distance_total:  2 12121214121<span style="color:#00AA00">2</span>1<span style="color:#AA0000">2</span>1
    Sequence length:  1/1/1   distance_total:  2 121212141212<span style="color:#00AA00">1</span>2<span style="color:#AA0000">1</span>
    Sequence length:  2/1/2   distance_total:  4 <span style="color:#00AA00">12</span><span style="color:#AA0000">12</span>12141212121
    Sequence length:  2/2/2   distance_total:  6 <span style="color:#00AA00">12</span>12<span style="color:#AA0000">12</span>141212121
    Sequence length:  2/3/2   distance_total:  8 <span style="color:#00AA00">12</span>121214<span style="color:#AA0000">12</span>12121
    Sequence length:  2/4/2   distance_total: 10 <span style="color:#00AA00">12</span>12121412<span style="color:#AA0000">12</span>121
    Sequence length:  2/5/2   distance_total: 12 <span style="color:#00AA00">12</span>1212141212<span style="color:#AA0000">12</span>1
    Sequence length:  2/1/2   distance_total:  4 1<span style="color:#00AA00">21</span><span style="color:#AA0000">21</span>2141212121
    Sequence length:  2/2/2   distance_total:  6 1<span style="color:#00AA00">21</span>21<span style="color:#AA0000">21</span>41212121
    Sequence length:  2/3/2   distance_total:  8 1<span style="color:#00AA00">21</span>212141<span style="color:#AA0000">21</span>2121
    Sequence length:  2/4/2   distance_total: 10 1<span style="color:#00AA00">21</span>21214121<span style="color:#AA0000">21</span>21
    Sequence length:  2/5/2   distance_total: 12 1<span style="color:#00AA00">21</span>2121412121<span style="color:#AA0000">21</span>
    Sequence length:  2/1/2   distance_total:  4 12<span style="color:#00AA00">12</span><span style="color:#AA0000">12</span>141212121
    Sequence length:  2/2/2   distance_total:  6 12<span style="color:#00AA00">12</span>1214<span style="color:#AA0000">12</span>12121
    Sequence length:  2/3/2   distance_total:  8 12<span style="color:#00AA00">12</span>121412<span style="color:#AA0000">12</span>121
    Sequence length:  2/4/2   distance_total: 10 12<span style="color:#00AA00">12</span>12141212<span style="color:#AA0000">12</span>1
    Sequence length:  2/1/2   distance_total:  4 121<span style="color:#00AA00">21</span><span style="color:#AA0000">21</span>41212121
    Sequence length:  2/2/2   distance_total:  6 121<span style="color:#00AA00">21</span>2141<span style="color:#AA0000">21</span>2121
    Sequence length:  2/3/2   distance_total:  8 121<span style="color:#00AA00">21</span>214121<span style="color:#AA0000">21</span>21
    Sequence length:  2/4/2   distance_total: 10 121<span style="color:#00AA00">21</span>21412121<span style="color:#AA0000">21</span>
    Sequence length:  2/1/2   distance_total:  4 1212<span style="color:#00AA00">12</span>14<span style="color:#AA0000">12</span>12121
    Sequence length:  2/2/2   distance_total:  6 1212<span style="color:#00AA00">12</span>1412<span style="color:#AA0000">12</span>121
    Sequence length:  2/3/2   distance_total:  8 1212<span style="color:#00AA00">12</span>141212<span style="color:#AA0000">12</span>1
    Sequence length:  2/1/2   distance_total:  4 12121<span style="color:#00AA00">21</span>41<span style="color:#AA0000">21</span>2121
    Sequence length:  2/2/2   distance_total:  6 12121<span style="color:#00AA00">21</span>4121<span style="color:#AA0000">21</span>21
    Sequence length:  2/3/2   distance_total:  8 12121<span style="color:#00AA00">21</span>412121<span style="color:#AA0000">21</span>
    Sequence length:  2/1/2   distance_total:  4 12121214<span style="color:#00AA00">12</span><span style="color:#AA0000">12</span>121
    Sequence length:  2/2/2   distance_total:  6 12121214<span style="color:#00AA00">12</span>12<span style="color:#AA0000">12</span>1
    Sequence length:  2/1/2   distance_total:  4 121212141<span style="color:#00AA00">21</span><span style="color:#AA0000">21</span>21
    Sequence length:  2/2/2   distance_total:  6 121212141<span style="color:#00AA00">21</span>21<span style="color:#AA0000">21</span>
    Sequence length:  2/1/2   distance_total:  4 1212121412<span style="color:#00AA00">12</span><span style="color:#AA0000">12</span>1
    Sequence length:  2/1/2   distance_total:  4 12121214121<span style="color:#00AA00">21</span><span style="color:#AA0000">21</span>
    Sequence length:  3/1/3   distance_total:  6 <span style="color:#00AA00">121</span>2<span style="color:#AA0000">121</span>41212121
    Sequence length:  3/2/3   distance_total:  9 <span style="color:#00AA00">121</span>21214<span style="color:#AA0000">121</span>2121
    Sequence length:  3/3/3   distance_total: 12 <span style="color:#00AA00">121</span>212141212<span style="color:#AA0000">121</span>
    Sequence length:  3/1/3   distance_total:  6 1<span style="color:#00AA00">212</span>12141<span style="color:#AA0000">212</span>121
    Sequence length:  3/1/3   distance_total:  6 12<span style="color:#00AA00">121</span>214<span style="color:#AA0000">121</span>2121
    Sequence length:  3/2/3   distance_total:  9 12<span style="color:#00AA00">121</span>2141212<span style="color:#AA0000">121</span>
    Sequence length:  3/1/3   distance_total:  6 121<span style="color:#00AA00">212</span>141<span style="color:#AA0000">212</span>121
    Sequence length:  3/1/3   distance_total:  6 1212<span style="color:#00AA00">121</span>4<span style="color:#AA0000">121</span>2121
    Sequence length:  3/2/3   distance_total:  9 1212<span style="color:#00AA00">121</span>41212<span style="color:#AA0000">121</span>
    Sequence length:  3/1/3   distance_total:  6 12121214<span style="color:#00AA00">121</span>2<span style="color:#AA0000">121</span>
    Sequence length:  4/1/4   distance_total:  8 <span style="color:#00AA00">1212</span>1214<span style="color:#AA0000">1212</span>121
    Sequence length:  4/1/4   distance_total:  8 1<span style="color:#00AA00">2121</span>2141<span style="color:#AA0000">2121</span>21
    Sequence length:  4/1/4   distance_total:  8 12<span style="color:#00AA00">1212</span>14<span style="color:#AA0000">1212</span>121
    Sequence length:  4/1/4   distance_total:  8 121<span style="color:#00AA00">2121</span>41<span style="color:#AA0000">2121</span>21
    Sequence length:  5/1/5   distance_total: 10 <span style="color:#00AA00">12121</span>214<span style="color:#AA0000">12121</span>21
    Sequence length:  5/1/5   distance_total: 10 1<span style="color:#00AA00">21212</span>141<span style="color:#AA0000">21212</span>1
    Sequence length:  5/1/5   distance_total: 10 12<span style="color:#00AA00">12121</span>4<span style="color:#AA0000">12121</span>21
    Sequence length:  6/1/6   distance_total: 12 <span style="color:#00AA00">121212</span>14<span style="color:#AA0000">121212</span>1
    Sequence length:  6/1/6   distance_total: 12 1<span style="color:#00AA00">212121</span>41<span style="color:#AA0000">212121</span>
    Sequence length:  7/1/7   distance_total: 14 <span style="color:#00AA00">1212121</span>4<span style="color:#AA0000">1212121</span>
</pre>
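
The search traced above can be sketched as follows. This is a simplified reading of that output, not the original implementation: it assumes repeats are matched greedily and may not overlap, and it only reports the largest number of distances covered by a window plus its repeats. The name `best_repeat_coverage` is illustrative.

```python
def best_repeat_coverage(vector):
    """Largest count of elements covered by a window and its later,
    non-overlapping repeats (0 if no window repeats)."""
    size = len(vector)
    best = 0
    for n in range(1, size // 2 + 1):            # window length
        for s in range(0, size - 2 * n + 1):     # window start
            window = vector[s:s + n]
            repeats = 0
            i = s + n                            # first slot after the window
            while i + n <= size:
                if vector[i:i + n] == window:
                    repeats += 1
                    i += n                       # skip past the match
                else:
                    i += 1
            if repeats:
                best = max(best, n * (repeats + 1))
    return best

example5 = [1, 2, 1, 2, 1, 2, 1, 4, 1, 2, 1, 2, 1, 2, 1]
coverage = best_repeat_coverage(example5)
```

For the example 5 vector the best window is the 7-element run that appears on both sides of the 4, covering 14 of the 15 distances.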

---

## 3. Consider Partial Sequences

If we consider partial sequences that may not be fully repeated we can account for the jump that occurs in example 5 back to the original position.

![example5.gif](images/example5.gif)

Without considering partial matches, the jump from the 'r' key back to the '1' key (typed as '!') is not covered:

<pre>
    Sequence length:  7/1/7   distance_total: 14 <span style="color:#00AA00">1212121</span>4<span style="color:#AA0000">1212121</span>
</pre>
With partial matches, the first partial sequence includes all of the characters:
<pre>
    Sequence length:  8/1/7   distance_total: 15  <span style="color:#00AA00">12121214</span><span style="color:#AA0000">1212121</span>
    Sequence length:  8/1/6   distance_total: 14  1<span style="color:#00AA00">21212141</span><span style="color:#AA0000">212121</span>
    Sequence length:  8/1/5   distance_total: 13  12<span style="color:#00AA00">12121412</span><span style="color:#AA0000">12121</span>
    Sequence length:  8/1/4   distance_total: 12  121<span style="color:#00AA00">21214121</span><span style="color:#AA0000">2121</span>
</pre>
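
A sketch of how partial matches might be added to the search, under stated assumptions: windows may be longer than half the vector, full repeats are matched greedily without overlap, and a final repeat cut off by the end of the vector counts if it is a prefix of the window. The `min_partial` parameter is a hypothetical guard (not from the original) that keeps a single trailing element from counting as a match.

```python
def best_coverage_with_partial(vector, min_partial=2):
    """Largest count of elements covered by a window, its full repeats, and
    an optional trailing partial repeat truncated by the end of the vector."""
    size = len(vector)
    best = 0
    for n in range(1, size):                     # window length
        for s in range(0, size - n):             # window start
            window = vector[s:s + n]
            covered = n
            i = s + n
            while i < size:
                remaining = size - i
                if remaining >= n:
                    if vector[i:i + n] == window:
                        covered += n             # full repeat
                        i += n
                    else:
                        i += 1
                else:
                    # Fewer than n elements left: accept a prefix of the window
                    if remaining >= min_partial and vector[i:] == window[:remaining]:
                        covered += remaining
                        break
                    i += 1
            if covered > n:                      # at least one repeat was found
                best = max(best, covered)
    return best

example5 = [1, 2, 1, 2, 1, 2, 1, 4, 1, 2, 1, 2, 1, 2, 1]
coverage = best_coverage_with_partial(example5)
```

With the 8-element window that includes the jump, the trailing 7 elements match the start of the window, so all 15 distances are covered.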

---

# Effectiveness

## Common Passwords According to WPEngine.com

20 Most common keyboard patterns according to WPEngine.com (https://wpengine.com/resources/passwords-unmasked-infographic/)

|  |   |  |
:-:|:-:|:-:
![common_0.gif](images/common_0.gif) | ![common_1.gif](images/common_1.gif) | ![common_2.gif](images/common_2.gif)
![common_3.gif](images/common_3.gif) | ![common_4.gif](images/common_4.gif) | ![common_5.gif](images/common_5.gif)
![common_6.gif](images/common_6.gif) | ![common_7.gif](images/common_7.gif) | ![common_8.gif](images/common_8.gif)
![common_9.gif](images/common_9.gif) | ![common_10.gif](images/common_10.gif) | ![common_11.gif](images/common_11.gif)
![common_12.gif](images/common_12.gif) | ![common_13.gif](images/common_13.gif) | ![common_14.gif](images/common_14.gif)
![common_15.gif](images/common_15.gif) | ![common_16.gif](images/common_16.gif) | ![common_17.gif](images/common_17.gif)
![common_18.gif](images/common_18.gif) | ![common_19.gif](images/common_19.gif)

## Looking at results

While many of these won't meet modern password lengths we can detect the following ratios based on the criteria above:
- Distances of only ones: 6 of 20
- Ratio of ones >= 75%: 14 of 20
- Sequence makes up >= 75% of characters: 18 of 20
- Not matched at all: 2 of 20
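
The criteria above can be combined into a simple check. This is only a sketch: `flag_reasons` is a hypothetical helper, the ratios are assumed to be precomputed values between 0 and 1, and the 0.75 default mirrors the 75% threshold used in the list above.

```python
def flag_reasons(ones_ratio, sequence_ratio, threshold=0.75):
    """Return the reasons a password would be flagged as a keyboard walk."""
    reasons = []
    if ones_ratio >= 1.0:
        reasons.append("all distances are 1")
    elif ones_ratio >= threshold:
        reasons.append("ratio of ones >= threshold")
    if sequence_ratio >= threshold:
        reasons.append("sequence ratio >= threshold")
    return reasons

# A password with 8 of 15 distances equal to 1 and a repeated sequence
# covering 14 of 15 distances is caught only by the sequence criterion.
reasons = flag_reasons(8 / 15, 14 / 15)
```

An empty list means none of the criteria matched.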

There are 2 problematic patterns, and neither matches a QWERTY-based keyboard walk that I can see.

The first is more of a construction of the English language and numerics, and the second is based on a phone's numeric pattern hitting keys 2 - 9.

|  |   |  |
:-:|:-:|:-:
![common_18.gif](images/common_18.gif) | ![common_19.gif](images/common_19.gif)


---

# Code used

In [2]:
%matplotlib inline
import time
import numpy as np
import pandas
import matplotlib.pyplot as plt
import matplotlib.cm as cm

def bargraph_results(df, word_length=(8, 30), ylim=(0, 100)):
    #colors = np.append(plt.cm.Greys(np.linspace(.1, .8, 12)), plt.cm.Reds(np.linspace(.6, .8, 10)), axis=0)
    colors = plt.cm.Spectral_r(np.linspace(.1, 1, 25))
    f = plt.figure(figsize=(16,16), dpi=100)

    ax1 = f.add_subplot(2, 3, 1)
    ax2 = f.add_subplot(2, 3, 2)
    ax3 = f.add_subplot(2, 3, 3)
    ax4 = f.add_subplot(2, 3, 4)
    ax5 = f.add_subplot(2, 3, 5)
    ax6 = f.add_subplot(2, 3, 6)


    # iterate through password lengths
    for i in range(word_length[0], word_length[1]):
        # colors will be black below threshold and red above
        threshold = .75
        grays = 0
        reds = 0
        for j in range(0, i+1):
            if j / i > threshold:
                reds += 1
            else:
                grays += 1
        colors = np.append(plt.cm.Greys(np.linspace(.2, .8, grays)), plt.cm.Reds(np.linspace(.3, .9, reds)), axis=0)
        y = []
        #d2 = df[df['word-len'] == i]
        d2 = df[(df.wordlen == i)]
        count = len(d2)
        if count == 0:
            continue
            
        last_zero = 0
        last_ones = 0
        last_dist = 0
        last_dir = 0
        last_word = 0

        # iterate through every possible ratio j / i for this password length
        index = 0
        #for j in np.arange(0, 1.25, .25):
        for j in np.arange(0, i + 1):
            limit = (j / i)
            y = len(d2[(d2.zeros <= limit) & (d2.zeros.notna())])
            zero = (y / count) * 100
            
            y = len(d2[(d2.ones <= limit) & (d2.ones.notna())])
            ones = (y / count) * 100
            
            y = len(d2[(d2.distances <= limit) & (d2.distances.notna())])
            dist = (y / count) * 100

            y = len(d2[(d2.directions <= limit) & (d2.directions.notna())])
            dir = (y / count) * 100

            y = len(d2[(d2.characters <= limit) & (d2.characters.notna())])
            word = (y / count) * 100

            #print(i, j, zero, ones, seq)
            ax1.bar(i, zero - last_zero, bottom=last_zero, color=colors[index], label=j)
            ax2.bar(i, ones - last_ones, bottom=last_ones, color=colors[index], label=j)
            ax3.bar(i, dist - last_dist, bottom=last_dist, color=colors[index], label=j)
            ax4.bar(i, dir - last_dir, bottom=last_dir, color=colors[index], label=j)
            ax5.bar(i, word - last_word, bottom=last_word, color=colors[index], label=j)

            last_zero = zero
            last_ones = ones
            last_dist = dist
            last_dir = dir
            last_word = word
            index += 1
        
        ax1.set_ylabel('Percent')
        ax1.set_ylim(ylim)
        ax1.set_xlabel('Password Length')
        ax1.set_title('Ratio of characters with distance=0')
        
        ax2.set_ylabel('Percent')
        ax2.set_ylim(ylim)
        ax2.set_xlabel('Password Length')
        ax2.set_title('Ratio of characters with distance=1')

        ax3.set_ylabel('Percent')
        ax3.set_ylim(ylim)
        ax3.set_xlabel('Password Length')
        ax3.set_title('Ratio of distances part of a sequence')
        
        ax4.set_ylabel('Percent')
        ax4.set_ylim(ylim)
        ax4.set_xlabel('Password Length')
        ax4.set_title('Ratio of directions part of a sequence')
        
        ax5.set_ylabel('Percent')
        ax5.set_ylim(ylim)
        ax5.set_xlabel('Password Length')
        ax5.set_title('Ratio of characters part of a sequence')

    # Rather than a typical color bar it seemed easier to achieve what I was looking
    # for here with another image plot and some axis manipulation
    ax6.imshow(np.expand_dims(colors, axis=1))
    ax6.set_xticks([])
    #ax6.set_yticks(np.arange(0, 1.05, .25) * 4)
    #labels = ['<=%d%%' % (x) for x in np.arange(0, 105, 25)]
    #labels = ['0%', '1-25%', '26-50%', '51-75%', '75-100%']
    #ax6.set_yticklabels(labels)
    
    plt.show()

def graph_words(df, zoom=None, word_length=(8, 30), ylim=(0, 100), threshold=.8):
    # default zoom is 1% of the data
    if not zoom:
        zoom = (0, len(df) / 100)
    #colors = np.append(plt.cm.Reds(np.linspace(.1, .8, 8)), plt.cm.Greens(np.linspace(.5, 1, 5)), axis=0)
    colors = np.append(plt.cm.Greys(np.linspace(.1, .8, 5)), plt.cm.Reds(np.linspace(.4, .8, 3)), axis=0)
    #colors = plt.cm.Spectral_r(np.linspace(.1, 1, 20))
    f = plt.figure(figsize=(16,8), dpi=100)
    x = []
    y = []
    zeros = []
    ones = []
    distance = []
    direction = []
    zoommax = 0
    words = []
    for i in range(word_length[0], word_length[1]):
        d2 = df[df.wordlen == i]
        count = len(d2)
        if count == 0:
            continue

        tmp = len(d2[d2.zeros >= threshold])
        zeros.append(tmp)
        if zoommax < tmp:
            zoommax = tmp
        
        tmp = len(d2[d2.ones >= threshold])
        ones.append(tmp)
        if zoommax < tmp:
            zoommax = tmp
        
        tmp = len(d2[d2.distances >= threshold])
        distance.append(tmp)
        if zoommax < tmp:
            zoommax = tmp
        
        tmp = len(d2[d2.directions >= threshold])
        direction.append(tmp)
        if zoommax < tmp:
            zoommax = tmp
            
        tmp = len(d2[d2.characters >= threshold])
        words.append(tmp)
        if zoommax < tmp:
            zoommax = tmp
            
        x.append(i)
        y.append(len(d2))
        
    ax1 = f.add_subplot(121)
    ax2 = f.add_subplot(122)

    ax1.plot(x, y, marker='.', label='Word Count')
    ax1.plot(x, zeros, marker='.', label='Ratio >= %d%% Zeros Ratio' % (threshold * 100))
    ax1.plot(x, ones, marker='.', label='Ratio >= %d%% Ones Ratio' % (threshold * 100))
    ax1.plot(x, distance, marker='.', label='Ratio >= %d%% Distance Sequence Ratio' % (threshold * 100))
    ax1.plot(x, direction, marker='.', label='Ratio >= %d%% Direction Sequence Ratio' % (threshold * 100))
    ax1.plot(x, words, marker='.', label='Ratio >= %d%% Character Sequence Ratio' % (threshold * 100))
    ax1.set_xlabel('Word Length')
    ax1.set_ylabel('Count')
    ax1.set_yscale('linear')
    ax1.set_ylim(bottom=0)
    ax1.set_title('Words matched by keyboard walking')
    ax1.legend()

    ax2.plot(x, y, marker='.', label='Word Count')
    ax2.plot(x, zeros, marker='.', label='Ratio >= %d%% Zeros Ratio' % (threshold * 100))
    ax2.plot(x, ones, marker='.', label='Ratio >= %d%% Ones Ratio' % (threshold * 100))
    ax2.plot(x, distance, marker='.', label='Ratio >= %d%% Distance Sequence Ratio' % (threshold * 100))
    ax2.plot(x, direction, marker='.', label='Ratio >= %d%% Direction Sequence Ratio' % (threshold * 100))
    ax2.plot(x, words, marker='.', label='Ratio >= %d%% Character Sequence Ratio' % (threshold * 100))
    ax2.set_xlabel('Word Length')
    ax2.set_ylabel('Count')
    ax2.set_yscale('linear')
    if not zoommax:
        zoommax = len(df) / 100
    ax2.set_ylim((0, zoommax))
    ax2.set_title('Words matched by keyboard walking (Zoomed)')
    ax2.legend()
    
    plt.show()

def histo_results(dataset, bins=20):
    f = plt.figure(figsize=(12,12), dpi=100)

    ax1 = f.add_subplot(2, 3, 1)
    ax2 = f.add_subplot(2, 3, 2)
    ax3 = f.add_subplot(2, 3, 3)
    ax4 = f.add_subplot(2, 3, 4)
    ax5 = f.add_subplot(2, 3, 5)
    ax6 = f.add_subplot(2, 3, 6)
    ax1.hist(dataset.zeros, bins=bins)
    ax2.hist(dataset.ones, bins=bins)
    ax3.hist(dataset.combo, bins=bins)
    ax4.hist(dataset.distances, bins=bins)
    ax5.hist(dataset.directions, bins=bins)
    ax6.hist(dataset.characters, bins=bins)

def summarize_file(path):
    tmp = pandas.read_csv(path, sep='|', on_bad_lines='skip', encoding_errors='ignore')
    # copy() so the ratio columns below are added to a new frame rather than a slice of tmp
    dataset = tmp[tmp.zeros.notna() & tmp.ones.notna() & tmp.distances.notna() & tmp.directions.notna()].copy()
    dataset['zeros_ratio'] = dataset.zeros / dataset.wordlen
    dataset['ones_ratio'] = dataset.ones / dataset.wordlen
    dataset['combo_ratio'] = dataset.combo / dataset.wordlen
    dataset['distances_ratio'] = dataset.distances / dataset.wordlen
    dataset['directions_ratio'] = dataset.directions / dataset.wordlen
    dataset['characters_ratio'] = dataset.characters / dataset.wordlen
    #graph_words(dataset, zoom=(0, 5), threshold=.75)
    #bargraph_results(dataset)
    print('Length: %d' % len(dataset))
    #display(dataset.sample(20))
    return dataset

---

# Review of various word lists


## English Language Words

![images/english.txt.png](images/english.txt.png)

![images/english.txt_zoom.png](images/english.txt_zoom.png)



In [15]:
data = summarize_file('wordlists/results/english.txt.results')
english = data[(data.distlen >= 8) & (data.distances_ratio >= .75) & (data.directions_ratio >= .75)]
print("%d of %d words meet all of the criteria to be flagged" % (len(english), len(data)))
e2 = english[['word', 'distvector', 'dirvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]
display(e2.sample(10).round(2))

Length: 335441
343 of 335441 words meet all of the criteria to be flagged


| index | word | distvector | dirvector | wordlen | distlen | distances_ratio | directions_ratio | characters_ratio |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 166583 | `moko-moko` | 42237422 | 33333333 | 9 | 8 | 0.78 | 0.78 | 1.0 |
| 13404 | `anting-anting` | 6334281263342 | 441343344134 | 13 | 12 | 0.85 | 0.85 | 1.0 |
| 117038 | `heart-to-heart` | 4341774374341 | 4331331334331 | 14 | 13 | 0.79 | 0.79 | 0.86 |
| 22238 | `Baden-Baden` | 5215895215 | 4124334124 | 11 | 10 | 0.82 | 0.82 | 1.0 |
| 284868 | `tat-tat-tat` | 5577557755 | 3333333333 | 11 | 10 | 0.91 | 0.91 | 1.0 |
| 116214 | `harum-scarum` | 543271123432 | 13123344312 | 12 | 11 | 0.75 | 0.75 | 0.83 |
| 282751 | `sweetberry` | 110224102 | 210124101 | 10 | 9 | 0.8 | 0.8 | 0.3 |
| 281426 | `supersuperb` | 6371363713 | 3111331114 | 11 | 10 | 0.91 | 0.82 | 0.91 |
| 69160 | `demarcatordemarcators` | 16743354521674335453 | 24433431132443343113 | 21 | 20 | 0.86 | 0.86 | 0.95 |
| 14333 | `antsy-pantsy` | 634562106345 | 44333334433 | 12 | 11 | 0.83 | 0.83 | 0.92 |
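
The same three-part filter is repeated for every word list in this review. As a minimal sketch, it could be wrapped in a small helper. The name `flag_walks` and its parameters are hypothetical (they are not part of the original notebook), and the toy frame stands in for the output of `summarize_file`. Because the conditions are joined with `&`, a word is flagged only when it satisfies all three.

```python
import pandas as pd

def flag_walks(df, min_distlen=8, min_ratio=0.75):
    """Hypothetical helper: keep rows that meet all three thresholds."""
    mask = (
        (df.distlen >= min_distlen)
        & (df.distances_ratio >= min_ratio)
        & (df.directions_ratio >= min_ratio)
    )
    return df[mask]

# Toy data: the first two rows are copied from the sample output above,
# the third row is made up to show a word that is too short to be flagged.
toy = pd.DataFrame({
    'word': ['moko-moko', 'supersuperb', 'example'],
    'distlen': [8, 10, 6],
    'distances_ratio': [0.78, 0.91, 0.90],
    'directions_ratio': [0.78, 0.82, 0.90],
})

flagged = flag_walks(toy)
print("%d of %d words flagged" % (len(flagged), len(toy)))
```

Keeping the thresholds as parameters makes it easy to try values other than 8 and .75 without editing each cell.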


---

## Hashcat kwprocessor 

**Results from (word-list-2to16-3)**

![images/hashcat_kwprocessor.txt.png](images/hashcat_kwprocessor.txt.png) 


In [3]:
data = summarize_file('wordlists/results/hashcat_kwprocessor.txt.results')
# Inverted filter: generated walks that are long enough but fall below the distance threshold
hashcat = data[(data.distlen >= 8) & (data.distances_ratio < .75)]
print("%d of %d words fall below the distances_ratio threshold" % (len(hashcat), len(data)))
d2 = hashcat[['word', 'distvector', 'dirvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]
display(d2.sample(10).round(2))

Length: 835481
2287 of 835481 words fall below the distances_ratio threshold


| index | word | distvector | dirvector | wordlen | distlen | distances_ratio | directions_ratio | characters_ratio |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 148736 | `` `\qwertyUIO `` | 141211111111 | 4111111111 | 11 | 10 | 0.73 | 0.73 | 0.09 |
| 555417 | `` qWERTREWQ\` `` | 111111111214 | 1111111114 | 11 | 10 | 0.73 | 0.73 | 0.18 |
| 772915 | `iUYTREWQ\qa` | 111111112121 | 1111111112 | 11 | 10 | 0.73 | 0.64 | 0.09 |
| 102017 | `` `\qwerEWQ\ `` | 141211111112 | 411111111 | 10 | 9 | 0.6 | 0.6 | 0.4 |
| 161601 | `~\qwertyu&` | 14121111111 | 411111112 | 10 | 9 | 0.7 | 0.6 | 0.1 |
| 309979 | `wQ\QWERTREWQ\` | 112121111111112 | 111111111111 | 13 | 12 | 0.69 | 0.69 | 0.54 |
| 370763 | `#@!~\QWERTY` | 111141211111 | 1114111111 | 11 | 10 | 0.73 | 0.73 | 0.09 |
| 669913 | `YTREWQ\~!@#` | 111111214111 | 1111114111 | 11 | 10 | 0.73 | 0.73 | 0.09 |
| 173901 | `q\QWERTYUIK` | 121211111111 | 1111111112 | 11 | 10 | 0.73 | 0.64 | 0.09 |
| 84398 | `WeWQ\QWERTY` | 111121211111 | 1111111111 | 11 | 10 | 0.73 | 0.73 | 0.27 |


---

## pwqgen


**Generated 100000 passwords using pwqgen with the following:**

    [jeremy@devone results]$ for len in $(seq 24 8 96); do seq 1 1000 | xargs -i pwqgen random=$len; done > pwqgen.txt

![images/pwqgen.txt.png](images/pwqgen.txt.png)

![images/pwqgen.txt_zoom.png](images/pwqgen.txt_zoom.png)


In [4]:
data = summarize_file('wordlists/results/pwqgen.txt.results')
pwq = data[(data.distlen >= 8) & (data.distances_ratio >= .75) & (data.directions_ratio >= .75)]
print("%d of %d words meet all of the criteria to be flagged" % (len(pwq), len(data)))
d2 = pwq[['word', 'distvector', 'dirvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]

display(d2.head(10).round(2))

Length: 76165
9 of 76165 words meet all of the criteria to be flagged


| index | word | distvector | dirvector | wordlen | distlen | distances_ratio | directions_ratio | characters_ratio |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 2379 | `ample-ample` | 75279127527 | 4334334334 | 11 | 10 | 0.82 | 0.82 | 1.0 |
| 5454 | `bring-ring` | 344288442 | 413433134 | 10 | 9 | 0.8 | 0.8 | 0.9 |
| 5520 | `messy-messy` | 6205676205 | 4303334303 | 11 | 10 | 0.82 | 0.82 | 1.0 |
| 5536 | `shore-more` | 445197451 | 131133311 | 10 | 9 | 0.8 | 0.8 | 0.8 |
| 5749 | `creep-decree` | 310721012310 | 31013322310 | 12 | 11 | 0.75 | 0.75 | 0.83 |
| 8415 | `power-expert` | 171189310711 | 11113333111 | 12 | 11 | 0.75 | 0.75 | 0.33 |
| 9545 | `purely-surely` | 3317461163174 | 111443331144 | 13 | 12 | 0.77 | 0.77 | 0.92 |
| 14607 | `inmate!mate!` | 41752397523 | 31431444314 | 12 | 11 | 0.75 | 0.75 | 0.83 |
| 16929 | `subway&must!` | 64526232645 | 33433322334 | 12 | 11 | 0.75 | 0.75 | 0.17 |


In [16]:
data = summarize_file('wordlists/results/random_english_combo.txt.results')
combo = data[(data.distlen >= 8) & (data.distances_ratio >= .75) & (data.directions_ratio >= .75)]
print("%d of %d words meet all of the criteria to be flagged" % (len(combo), len(data)))
# Select columns from combo (the original cell reused pwq from the previous cell by mistake)
d2 = combo[['word', 'distvector', 'dirvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]

Length: 10000
0 of 10000 words meet all of the criteria to be flagged


---

## keepassxc 

![images/keepassxc.txt.png](images/keepassxc.txt.png)

![images/keepassxc.txt_zoom.png](images/keepassxc.txt_zoom.png)

In [13]:
data = summarize_file('wordlists/results/keepassxc.txt.results')
keepass = data[(data.distlen >= 8) & (data.distances_ratio >= .75) & (data.directions_ratio >= .75)]
print("%d of %d words meet all of the criteria to be flagged" % (len(keepass), len(data)))
d2 = keepass[['word', 'distvector', 'dirvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]

display(d2.head(10).round(2))

Length: 110000
7 of 110000 words meet all of the criteria to be flagged


| index | word | distvector | dirvector | wordlen | distlen | distances_ratio | directions_ratio | characters_ratio |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 4435 | `WVgp3ReHuS` | 426821426 | 433441433 | 10 | 9 | 0.8 | 0.8 | 0.1 |
| 5194 | `XCU5dKJWC4` | 163451634 | 134311443 | 10 | 9 | 0.9 | 0.8 | 0.2 |
| 10859 | `U3FznbVezNm` | 5345113451 | 4431114311 | 11 | 10 | 0.82 | 0.82 | 0.18 |
| 22184 | `fDT55LZ2UYfr` | 13106946131 | 13204334132 | 12 | 11 | 0.75 | 0.75 | 0.17 |
| 23685 | `p247CPXUaf9S` | 923791077379 | 41133333133 | 12 | 11 | 0.75 | 0.75 | 0.08 |
| 25852 | `NydnzSjKS4hA` | 24452516445 | 23413111341 | 12 | 11 | 0.75 | 0.75 | 0.17 |
| 25922 | `Vqg2FWw5WHWh` | 55543044555 | 44444033444 | 12 | 11 | 0.75 | 0.75 | 0.25 |


### keepassxc generated false positives

None of these look like keyboard patterns to me. Taking the direction of flow into account as an additional criterion would likely prevent these false positives. This will be investigated in the future.

|  |  |
:-:|:-:
![keepass_0.gif](images/keepass_0.gif) | ![keepass_1.gif](images/keepass_1.gif)
![keepass_2.gif](images/keepass_2.gif) | ![keepass_3.gif](images/keepass_3.gif)
![keepass_4.gif](images/keepass_4.gif) | ![keepass_5.gif](images/keepass_5.gif)
![keepass_6.gif](images/keepass_6.gif) 


---
## rockyou.txt

![images/rockyou.txt.png](images/rockyou.txt.png)

![images/rockyou.txt_zoom.png](images/rockyou.txt_zoom.png)



---
## rockyou2021.txt

![images/rockyou2021.txt.png](images/rockyou2021.txt.png)

![images/rockyou2021.txt_zoom.png](images/rockyou2021.txt_zoom.png)



# Using Euclidean Distance for Keyboard Distances

Distances between keys are measured as Euclidean distance.

- To measure that distance, the first character of the password is treated as the base and measurements start with the second character.
- For the sake of this talk, the A and " keys are not adjacent

|  |  |
:-:|:-:
![Distance 0](images/ed-d-w0.png) | ![Distance 1](images/ed-d-w1.png)
![Distance 1.4](images/ed-d-w1.4.png) | ![Distance 2](images/ed-d-w2.png) 
![Distance 2.2](images/ed-d-w2.2.png) | ![Distance 2.8](images/ed-d-w2.8.png)
![Distance 3](images/ed-d-w3.png) | ![Distance 3.2](images/ed-d-w3.2.png)
![Distance 3.6](images/ed-d-w3.6.png) | ![Distance 4](images/ed-d-w4.png)
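
As a rough sketch of the difference between the two measurements, the snippet below places the keys on an integer grid and computes both vectors for a password. The grid (rows aligned on whole columns, with `1`, `q`, `a` and `z` sharing a column) and the reading of the earlier method as a Manhattan-style count are assumptions inferred from the sample `distvector` values in this deck, not taken from the detector's source. `ROWS`, `SHIFTED`, `key_pos`, `manhattan_vector` and `euclidean_vector` are hypothetical names.

```python
import math

# Assumed layout: (starting column, keys) for each row of a US QWERTY keyboard.
ROWS = [
    (0, "`1234567890-="),
    (1, "qwertyuiop[]\\"),
    (1, "asdfghjkl;'"),
    (1, "zxcvbnm,./"),
]
# Shifted symbols map back to the key that produces them.
SHIFTED = dict(zip('~!@#$%^&*()_+{}|:"<>?', "`1234567890-=[]\\;',./"))

KEY_POS = {
    ch: (start + i, row)
    for row, (start, keys) in enumerate(ROWS)
    for i, ch in enumerate(keys)
}

def key_pos(ch):
    """Return the (column, row) of the key that types ch; raises KeyError for unmapped keys."""
    return KEY_POS[SHIFTED.get(ch, ch.lower())]

def manhattan_vector(word):
    """Column steps plus row steps between consecutive characters."""
    pts = [key_pos(c) for c in word]
    return [abs(x2 - x1) + abs(y2 - y1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]

def euclidean_vector(word):
    """Straight-line distance between consecutive characters, rounded to one decimal."""
    pts = [key_pos(c) for c in word]
    return [round(math.hypot(x2 - x1, y2 - y1), 1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]

# The first character is the base, so an n-character word yields n-1 distances.
print(manhattan_vector('asdf'))
print(euclidean_vector('qaz'))
print(euclidean_vector('qsc'))
```

Under this assumed grid a diagonal neighbor is 1.4 away rather than 2, which is why the Euclidean vector separates straight runs from diagonal ones more clearly.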

---
## Minimizing False Positives using Euclidean Distance

### keepass
(top uses the previous keyboard distance method, bottom uses Euclidean distance)
![images/compare/keepass_montage.png](images/compare/keepass_montage.png)

### pwqgen
(top uses the previous keyboard distance method, bottom uses Euclidean distance)
![images/compare/pwqgen_montage.png](images/compare/pwqgen_montage.png)

### rockyou.txt
(top uses the previous keyboard distance method, bottom uses Euclidean distance)
![images/compare/rockyou_montage.png](images/compare/rockyou_montage.png)



In [10]:
data = summarize_file('wordlists/results/keepassxc.txt.results')
data_float = summarize_file('wordlists/results/keepassxc.txt.float.results')
keepass = data[(data.distlen >= 8) & (data.distances_ratio >= .75) & (data.directions_ratio >= .75)]
keepass_float = data_float[(data_float.distlen >= 8) & (data_float.distances_ratio >= .75) & (data_float.directions_ratio >= .75)]
print("%d of %d words meet all of the criteria to be flagged" % (len(keepass), len(data)))
print("%d of %d words meet all of the criteria to be flagged using floats" % (len(keepass_float), len(data_float)))
d2 = keepass[['word', 'distvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]
display(d2.head(10).round(2))
d2_float = keepass_float[['word', 'distvector', 'wordlen', 'distlen', 'distances_ratio', 'directions_ratio', 'characters_ratio']]
display(d2_float.head(10).round(2))


Length: 110000
Length: 110000
7 of 110000 words meet all of the criteria to be flagged
1 of 110000 words meet all of the criteria to be flagged using floats


| index | word | distvector | wordlen | distlen | distances_ratio | directions_ratio | characters_ratio |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 4435 | `WVgp3ReHuS` | 426821426 | 10 | 9 | 0.8 | 0.8 | 0.1 |
| 5194 | `XCU5dKJWC4` | 163451634 | 10 | 9 | 0.9 | 0.8 | 0.2 |
| 10859 | `U3FznbVezNm` | 5345113451 | 11 | 10 | 0.82 | 0.82 | 0.18 |
| 22184 | `fDT55LZ2UYfr` | 13106946131 | 12 | 11 | 0.75 | 0.75 | 0.17 |
| 23685 | `p247CPXUaf9S` | 923791077379 | 12 | 11 | 0.75 | 0.75 | 0.08 |
| 25852 | `NydnzSjKS4hA` | 24452516445 | 12 | 11 | 0.75 | 0.75 | 0.17 |
| 25922 | `Vqg2FWw5WHWh` | 55543044555 | 12 | 11 | 0.75 | 0.75 | 0.25 |


| index | word | distvector | wordlen | distlen | distances_ratio | directions_ratio | characters_ratio |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 22184 | `fDT55LZ2UYfr` | 1.0,2.2,1.0,0.0,4.5,8.1,3.2,5.1,1.0,2.2,1.0 | 12 | 11 | 0.75 | 0.75 | 0.17 |
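
To see which keepassxc passwords stop being flagged once float distances are used, the two result sets can be compared by word. This is only a small sketch: the word lists are copied from the two tables above, and the variable names `flagged_int`, `flagged_float`, `cleared` and `still_flagged` are hypothetical.

```python
# Words flagged by the previous (integer) distance method, from the first table.
flagged_int = [
    'WVgp3ReHuS', 'XCU5dKJWC4', 'U3FznbVezNm', 'fDT55LZ2UYfr',
    'p247CPXUaf9S', 'NydnzSjKS4hA', 'Vqg2FWw5WHWh',
]
# Words flagged using float (Euclidean) distances, from the second table.
flagged_float = ['fDT55LZ2UYfr']

float_set = set(flagged_float)
cleared = [w for w in flagged_int if w not in float_set]    # no longer flagged
still_flagged = [w for w in flagged_int if w in float_set]  # flagged by both methods

print("%d cleared, %d still flagged" % (len(cleared), len(still_flagged)))
print("still flagged:", still_flagged)
```

With the DataFrames from the cell above, the same comparison could be written as `keepass[~keepass.word.isin(keepass_float.word)]`.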
