Many unique lines with spaces or non-ASCII characters being deleted #11

Closed
tinyapps opened this issue Nov 6, 2019 · 2 comments

@tinyapps tinyapps commented Nov 6, 2019

Thanks so much for crafting and sharing duplicut.

It appears to delete many unique lines containing spaces or non-ASCII characters, e.g.,

$ cat 1

foo
bar
pass with spaces
a pass word
another unique password

$ duplicut 1 -o 1.out

$ cat 1.out

foo
bar
a pass word

Any line with 5 or more Japanese characters is cut:

$ cat 2

一
十一
百十一
千百十一
万千百十一
ほげふがぴよ

$ duplicut 2 -o 2.out

$ cat 2.out

一
十一
百十一
千百十一

@nil0x42 nil0x42 (Owner) commented Nov 6, 2019

Hi! I guess you ran duplicut without changing the options.
As stated in duplicut --help, the --line-max-size option defaults to 14.
This behavior is open to discussion, but I implemented it that way because duplicut was meant to optimize password wordlists, and I rarely want passwords longer than 14 characters in my lists. It filters out garbage lines from wordlists downloaded from the internet, which commonly contain long, useless HTML lines or irrelevant lines caused by parsing errors.

Also, empty-line deletion is hardcoded, because the goal is to speed up tools like hashcat and john, so they don't have to read empty lines (or guess empty passwords).
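
For illustration, the filtering described above can be sketched in a few lines of Python. This is not duplicut's actual code. It assumes the size is counted in UTF-8 bytes rather than characters (which would explain why a line of 5 Japanese characters, 15 bytes, is cut while 4 characters, 12 bytes, survive) and that the limit is inclusive; neither detail is confirmed in this thread. The name filter_wordlist is hypothetical.

```python
# Sketch only: mimics the behaviour described in this thread, not duplicut's real code.
MAX_LINE_SIZE = 14  # default of --line-max-size, per the comment above


def filter_wordlist(lines, max_size=MAX_LINE_SIZE):
    """Drop empty lines, over-long lines and duplicates, keeping first-seen order."""
    seen = set()
    kept = []
    for line in lines:
        size = len(line.encode("utf-8"))  # assumption: size is measured in bytes
        if size == 0 or size > max_size:  # assumption: the limit is inclusive
            continue
        if line not in seen:
            seen.add(line)
            kept.append(line)
    return kept


file_1 = ["foo", "bar", "pass with spaces", "a pass word", "another unique password"]
file_2 = ["一", "十一", "百十一", "千百十一", "万千百十一", "ほげふがぴよ"]

print(filter_wordlist(file_1))  # ['foo', 'bar', 'a pass word']
print(filter_wordlist(file_2))  # ['一', '十一', '百十一', '千百十一']
```

Under those assumptions the sketch reproduces both outputs from the report, so raising --line-max-size should keep the missing lines.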

Also note that --line-max-size is limited to 254, because line_sz is stored in 8 bits in the hashmap (a memory optimization).
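
As a rough illustration of why an 8-bit length field caps the usable line size, the Python sketch below masks a value to 8 bits the way an unsigned byte would hold it. The helper stored_size is hypothetical and only stands in for the line_sz field mentioned above; the actual hashmap layout, and the reason the ceiling is 254 rather than 255, are not shown in this thread.

```python
# Sketch only: models an unsigned 8-bit field, not duplicut's real hashmap entry.
def stored_size(n):
    """Return the value an unsigned 8-bit field would hold after storing n."""
    return n & 0xFF  # only the low 8 bits survive


print(stored_size(14))   # 14
print(stored_size(254))  # 254
print(stored_size(300))  # 44, i.e. 300 mod 256: the real length is lost
```

Any length above 255 would wrap around, so a larger limit could not be represented without widening the field.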

Please close the issue if my comment resolved it, and thank you for the feedback!

@tinyapps tinyapps (Author) commented Nov 6, 2019

Sorry to have missed that, nil0x42 - thanks so much for letting me know!

@tinyapps tinyapps closed this Nov 6, 2019