Separator Matcher doesn't catch first separator #193
It's up to you. It made a little more sense to me to allow for any special
character. I also didn't like how much I was mangling the bruteforce and
repeating checks, so I loosened those. Sorry, I should have brought that up
and not just left it for you to discover. Although I don't know why it
prefers the bruteforce over the dictionary + separator.
The limited character set shouldn't matter, I could revert back to only the
handful I had. I would assume most people expect only a small handful to
act as separators, but to me the repeating special char between words makes
sense to mark as a separator regardless of what is used.
The bigger thing would be to ensure that the bruteforce algorithm no longer
gets to select such broad ranges. That was the bigger problem, and it
seemed to eat more inputs when I was testing; I relaxed it a bit in later
versions (I let the separator match and then let the algorithm determine
whether bruteforce was better or not).
I have some time today in a few hours and I can put out another PR with
changes. Let me know what you think would be best.
On Sat, May 6, 2023, 05:04 MrWook wrote:
@domosapien I wanted to publish the new major version and tested everything
beforehand, and it seems some of your later changes broke the separator
matcher a little. Your video from #115 no longer seems to be the source of
truth, as the string `buy by beer` splits into `buy by` as bruteforce, ` ` as
separator, and `beer` as a dictionary match.
I think the first approach, having specific chars that act as separators,
was the better idea 🤔 What do you think?
|
It appears reasonable to check for a specific set of special characters, as the Java port may also be utilizing any special character. However, when adding other languages such as Persian (#136), every character could be considered a special character. Therefore, I propose that we define a fixed set of separators, and trigger them if at least one is present. The suggested separators include:
[
' ',
',',
';',
':',
'|',
'/',
'\\',
'-',
'_',
'.',
]
Naturally, this list should be customizable, allowing users to define their own set of separators or consider all special characters as separators if they prefer. I also had the opportunity to explore the earlier implementation where you made adjustments to the repeat and brute-force matchers. That version had a significant flaw, as it considered `by by` a strong password. Consequently, the current implementation appears to be an improvement in this aspect.
|
Ok, I've made some changes but they aren't 100% done and I have to run
again. I will try to finish this up later today (I'm in eastern US).
I wasn't aware a regex of `\W` or `[^\w]` wouldn't match on other locales.
I've reverted the broader range of allowed chars and am matching only the
ones you give. I also fixed a bug in the changes I added to the bruteforce
matcher: it uses the same container to track patterns and to look up the
regex, and that container wasn't reset when the password changed (for
example, when we were checking for repeats).
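For context on the `\W` point: in JavaScript regular expressions `\w` covers only `[A-Za-z0-9_]`, so `\W` treats every Persian letter as a non-word character. The sketch below illustrates that with a single Persian letter and contrasts it with a Unicode-aware class built from property escapes. The variable names are made up for illustration, and the Unicode-aware class is only one possible alternative, not something the thread settled on.

```typescript
// `\w` is ASCII-only in JavaScript, so `\W` matches any non-ASCII letter.
const asciiNonWord = new RegExp('\\W');

// Unicode-aware variant: anything that is not a letter, a number, or "_".
// Built with the RegExp constructor so the `u` flag does not depend on the
// TypeScript compile target.
const unicodeNonWord = new RegExp('[^\\p{L}\\p{N}_]', 'u');

const persianLetter = 'س'; // U+0633, an ordinary letter
const space = ' ';

const asciiFlagsPersianLetter = asciiNonWord.test(persianLetter);
const unicodeFlagsPersianLetter = unicodeNonWord.test(persianLetter);
const asciiFlagsSpace = asciiNonWord.test(space);
const unicodeFlagsSpace = unicodeNonWord.test(space);

console.log({
  asciiFlagsPersianLetter, // true: the letter is treated as "special"
  unicodeFlagsPersianLetter, // false: the letter is recognised as a letter
  asciiFlagsSpace, // true
  unicodeFlagsSpace, // true
});
```

This is why a matcher based on `\W` would mark every character of a Persian password as a separator candidate, while a fixed character list does not have that problem.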
Now, `buy by beer` gives:
```
{
  calcTime: 212,
  password: 'buy by beer',
  guesses: 130000000,
  guessesLog10: 8.113943352306837,
  sequence: [
    {
      pattern: 'bruteforce',
      token: 'bu',
      i: 0,
      j: 1,
      guesses: 100,
      guessesLog10: 2
    },
    {
      pattern: 'repeat',
      i: 2,
      j: 7,
      token: 'y by b',
      baseToken: 'y b',
      baseGuesses: 23,
      repeatCount: 2,
      guesses: 50,
      guessesLog10: 1.6989700043360185
    },
    {
      pattern: 'bruteforce',
      token: 'eer',
      i: 8,
      j: 10,
      guesses: 1000,
      guessesLog10: 2.9999999999999996
    }
  ],
  crackTimesSeconds: {
    onlineThrottling100PerHour: 4680000000,
    onlineNoThrottling10PerSecond: 13000000,
    offlineSlowHashing1e4PerSecond: 13000,
    offlineFastHashing1e10PerSecond: 0.013
  },
  crackTimesDisplay: {
    onlineThrottling100PerHour: 'centuries',
    onlineNoThrottling10PerSecond: '5 months',
    offlineSlowHashing1e4PerSecond: '4 hours',
    offlineFastHashing1e10PerSecond: 'less than a second'
  },
  score: 3,
  feedback: { warning: null, suggestions: [] }
}
```
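As a reading aid for the output above: the `crackTimesSeconds` values are consistent with dividing `guesses` by the attack rate encoded in each key name. The sketch below reproduces them; `estimateCrackTimes` is a hypothetical helper written for this note, not a function taken from the library.

```typescript
interface CrackTimesSeconds {
  onlineThrottling100PerHour: number;
  onlineNoThrottling10PerSecond: number;
  offlineSlowHashing1e4PerSecond: number;
  offlineFastHashing1e10PerSecond: number;
}

// Hypothetical helper: seconds needed = guesses / (guesses per second).
function estimateCrackTimes(guesses: number): CrackTimesSeconds {
  return {
    // 100 guesses per hour; multiply first to keep the arithmetic exact.
    onlineThrottling100PerHour: (guesses * 3600) / 100,
    onlineNoThrottling10PerSecond: guesses / 10,
    offlineSlowHashing1e4PerSecond: guesses / 1e4,
    offlineFastHashing1e10PerSecond: guesses / 1e10,
  };
}

const crackTimes = estimateCrackTimes(130000000);
console.log(crackTimes);
```

With `guesses: 130000000` this gives 4680000000, 13000000, 13000 and 0.013 seconds, matching the reported values.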
Removing the changes I made to bruteforce matching, I get
```
{
  calcTime: 209,
  password: 'buy by beer',
  guesses: 100000000,
  guessesLog10: 8,
  sequence: [
    {
      pattern: 'bruteforce',
      token: 'buy by',
      i: 0,
      j: 5,
      guesses: 1000000,
      guessesLog10: 5.999999999999999
    },
    {
      pattern: 'separator',
      token: ' ',
      i: 6,
      j: 6,
      guesses: 0,
      guessesLog10: 0
    },
    {
      pattern: 'dictionary',
      i: 7,
      j: 10,
      token: 'beer',
      matchedWord: 'beer',
      rank: 514,
      dictionaryName: 'passwords',
      reversed: false,
      l33t: false,
      baseGuesses: 514,
      uppercaseVariations: 1,
      l33tVariations: 1,
      guesses: 514,
      guessesLog10: 2.7109631189952754
    }
  ],
  crackTimesSeconds: {
    onlineThrottling100PerHour: 3600000000,
    onlineNoThrottling10PerSecond: 10000000,
    offlineSlowHashing1e4PerSecond: 10000,
    offlineFastHashing1e10PerSecond: 0.01
  },
  crackTimesDisplay: {
    onlineThrottling100PerHour: 'centuries',
    onlineNoThrottling10PerSecond: '4 months',
    offlineSlowHashing1e4PerSecond: '3 hours',
    offlineFastHashing1e10PerSecond: 'less than a second'
  },
  score: 2,
  feedback: {
    warning: null,
    suggestions: [ 'Add more words that are less common.' ]
  }
}
```
So it seems like I should probably remove that or change the repeat to not
include separators (which I had before, and what the original demo showed).
Like I said, I'll mess around with it some more later, unless you have an
opinion on what I should or should not do.
* Let separator matches just match and let the algorithm pick
* Remove separator matching from brute force options (bruteforce will be
used more often than separator + dictionary or other match)
* Remove separator matching from repeat (this is less likely, but with
separators a repeat pattern can happen)
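The third option could be sketched as a filter that discards repeat matches whose base token contains a separator, such as the `'y b'` base token in the first output. Everything here is hypothetical (the `RepeatMatch` shape is trimmed down from the printed output, and `dropRepeatsAcrossSeparators` is an invented name); it only illustrates the idea, not how the matcher is actually wired up.

```typescript
// Separator set proposed in this thread.
const SEPARATOR_CHARS = new Set<string>([' ', ',', ';', ':', '|', '/', '\\', '-', '_', '.']);

// Trimmed-down shape of a repeat match, based on the printed output.
interface RepeatMatch {
  pattern: 'repeat';
  token: string;
  baseToken: string;
  i: number;
  j: number;
}

// True when the repeated unit itself contains a separator character.
function baseTokenHasSeparator(match: RepeatMatch): boolean {
  return match.baseToken.split('').some((char) => SEPARATOR_CHARS.has(char));
}

// Keep only repeats that do not straddle a separator.
function dropRepeatsAcrossSeparators(matches: RepeatMatch[]): RepeatMatch[] {
  return matches.filter((match) => !baseTokenHasSeparator(match));
}

const candidateRepeats: RepeatMatch[] = [
  { pattern: 'repeat', token: 'y by b', baseToken: 'y b', i: 2, j: 7 },
  { pattern: 'repeat', token: 'abab', baseToken: 'ab', i: 0, j: 3 },
];

const keptRepeats = dropRepeatsAcrossSeparators(candidateRepeats);
console.log(keptRepeats.map((match) => match.token));
```

With this filter the `'y by b'` repeat from the first output would be rejected, while an ordinary repeat such as `'abab'` would still be reported.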
On Sat, May 6, 2023 at 1:49 PM MrWook wrote:
It appears reasonable to check for a specific set of special characters,
as the Java port may also be utilizing any special character. However, when
adding other languages such as Persian (#136), every character could be
considered a special character. Therefore, I propose that we define a fixed
set of separators, and trigger them if at least one is present. The
suggested separators include:
[
' ',
',',
';',
':',
'|',
'/',
'\\',
'-',
'_',
'.',
]
Naturally, this list should be customizable, allowing users to define
their own set of separators or consider all special characters as
separators if they prefer.
I also had the opportunity to explore the earlier implementation where you
made adjustments to the repeat and brute-force matchers. That version had a
significant flaw, as it considered `by by` a strong password. Consequently,
the current implementation appears to be an improvement in this aspect.
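The proposal quoted above (a fixed default set, overridable by the user, firing on a single occurrence) could look roughly like the sketch below. `matchSeparators`, `SeparatorMatch`, and the optional `separators` parameter are assumptions made for illustration; the real matcher and its options may be shaped differently.

```typescript
// Default separator set suggested in this thread.
const DEFAULT_SEPARATORS: readonly string[] = [' ', ',', ';', ':', '|', '/', '\\', '-', '_', '.'];

// Hypothetical match shape, modelled on the printed output.
interface SeparatorMatch {
  pattern: 'separator';
  token: string;
  i: number;
  j: number;
}

// Report every occurrence of an allowed separator, including the first one.
// Callers may pass their own list to replace the defaults.
function matchSeparators(
  password: string,
  separators: readonly string[] = DEFAULT_SEPARATORS,
): SeparatorMatch[] {
  const allowed = new Set(separators);
  const matches: SeparatorMatch[] = [];
  for (let index = 0; index < password.length; index += 1) {
    const char = password.charAt(index);
    if (allowed.has(char)) {
      matches.push({ pattern: 'separator', token: char, i: index, j: index });
    }
  }
  return matches;
}

const defaultMatches = matchSeparators('buy by beer');
const customMatches = matchSeparators('buy+by+beer', ['+']);
console.log(defaultMatches.map((match) => match.i), customMatches.map((match) => match.i));
```

For `buy by beer` this reports separators at indices 3 and 6, the same positions as the separator entries in the outputs posted in this thread.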
|
With the repeat matcher turned off:
```
{
  calcTime: 146,
  password: 'buy by beer',
  guesses: 10000000000000000,
  guessesLog10: 16,
  sequence: [
    {
      pattern: 'dictionary',
      i: 0,
      j: 2,
      token: 'buy',
      matchedWord: 'buy',
      rank: 509,
      dictionaryName: 'commonWords',
      reversed: false,
      l33t: false,
      baseGuesses: 509,
      uppercaseVariations: 1,
      l33tVariations: 1,
      guesses: 509,
      guessesLog10: 2.7067177823367583
    },
    {
      pattern: 'separator',
      token: ' ',
      i: 3,
      j: 3,
      guesses: 10,
      guessesLog10: 1
    },
    {
      pattern: 'dictionary',
      i: 4,
      j: 5,
      token: 'by',
      matchedWord: 'by',
      rank: 11,
      dictionaryName: 'wikipedia',
      reversed: false,
      l33t: false,
      baseGuesses: 11,
      uppercaseVariations: 1,
      l33tVariations: 1,
      guesses: 50,
      guessesLog10: 1.6989700043360185
    },
    {
      pattern: 'separator',
      token: ' ',
      i: 6,
      j: 6,
      guesses: 0,
      guessesLog10: 0
    },
    {
      pattern: 'dictionary',
      i: 7,
      j: 10,
      token: 'beer',
      matchedWord: 'beer',
      rank: 514,
      dictionaryName: 'passwords',
      reversed: false,
      l33t: false,
      baseGuesses: 514,
      uppercaseVariations: 1,
      l33tVariations: 1,
      guesses: 514,
      guessesLog10: 2.7109631189952754
    }
  ],
  crackTimesSeconds: {
    onlineThrottling100PerHour: 360000000000000000,
    onlineNoThrottling10PerSecond: 1000000000000000,
    offlineSlowHashing1e4PerSecond: 1000000000000,
    offlineFastHashing1e10PerSecond: 1000000
  },
  crackTimesDisplay: {
    onlineThrottling100PerHour: 'centuries',
    onlineNoThrottling10PerSecond: 'centuries',
    offlineSlowHashing1e4PerSecond: 'centuries',
    offlineFastHashing1e10PerSecond: '12 days'
  },
  score: 4,
  feedback: { warning: null, suggestions: [] }
}
```
So the score is definitely ballooning. It seems like, at least for now,
I will just match separators and remove the changes in the bruteforce /
repeat matchers so the algorithm can make whatever guess it wants (but
this may not use separators as often as a result).
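One way to see where the ballooning comes from: all three totals posted in this thread are consistent with the sequence cost used by the original zxcvbn scoring, `factorial(l) * product(guesses) + 10000^(l - 1)` for a sequence of `l` matches, together with its score thresholds. Whether this project computes it exactly this way is an assumption; the sketch below only checks that the formula reproduces the reported numbers.

```typescript
// Constant and thresholds as in the original zxcvbn scoring (assumed to apply here).
const MIN_GUESSES_BEFORE_GROWING_SEQUENCE = 10000;

function factorial(n: number): number {
  let result = 1;
  for (let k = 2; k <= n; k += 1) result *= k;
  return result;
}

// Cost of one candidate sequence, given the guesses of each match in it.
function sequenceGuesses(matchGuesses: number[]): number {
  const length = matchGuesses.length;
  const product = matchGuesses.reduce((acc, guesses) => acc * guesses, 1);
  // Additive penalty that grows with the number of matches: 10000^(l - 1).
  let additive = 1;
  for (let k = 1; k < length; k += 1) additive *= MIN_GUESSES_BEFORE_GROWING_SEQUENCE;
  return factorial(length) * product + additive;
}

function guessesToScore(guesses: number): number {
  const DELTA = 5;
  if (guesses < 1e3 + DELTA) return 0;
  if (guesses < 1e6 + DELTA) return 1;
  if (guesses < 1e8 + DELTA) return 2;
  if (guesses < 1e10 + DELTA) return 3;
  return 4;
}

// Per-match guesses copied from the three outputs in this thread.
const withBruteforceChanges = sequenceGuesses([100, 50, 1000]);
const withoutBruteforceChanges = sequenceGuesses([1000000, 0, 514]);
const repeatMatcherOff = sequenceGuesses([509, 10, 50, 0, 514]);

console.log(withBruteforceChanges, guessesToScore(withBruteforceChanges));
console.log(withoutBruteforceChanges, guessesToScore(withoutBruteforceChanges));
console.log(repeatMatcherOff, guessesToScore(repeatMatcherOff));
```

Under this formula a separator with `guesses: 0` zeroes the product, so the total comes entirely from the additive term: `10000^2 = 1e8` for the three-match sequence and `10000^4 = 1e16` for the five-match sequence, which lines up with the jump from score 2 to score 4.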
|
@domosapien I wanted to publish the new major version and tested everything beforehand, and it seems some of your later changes broke the separator matcher a little. Your video from #115 no longer seems to be the source of truth, as the string `buy by beer` splits into `buy by` as bruteforce, ` ` as separator, and `beer` as a dictionary match.
I think the first approach, having specific chars that act as separators, was the better idea 🤔 What do you think?