False positive for short regular expressions #178

meanrin · 2022-08-08T08:40:12Z

Many short regular expressions may randomly appear in hashes or base64 lines.

Example:

https://github.com/tosca/web/blob/master/packages/EntityFramework.5.0.0/tools/EntityFramework.PS3.psd1#L172

$ python -m credsweeper --path EntityFramework.PS3.psd1 
rule: JSON Web Token / severity: medium / line_data_list: [line: '# UK3O3RhOJA/u0afRTK10MCAR6wfVVJUVSZQbQpKumFwwJtoAa+h7veyJBw/3DgSY' / line_num: 172 / path: EntityFramework.PS3.psd1 / value: 'eyJBw' / entropy_validation: False] / api_validation: NOT_AVAILABLE / ml_validation: VALIDATED_KEY

99.9% sure it's not really a JWT

While this example is a JWT, the same can probably happen with other short regular expressions (e.g. the Google API rule based on the ya29. prefix).
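To make the failure mode concrete, here is a minimal reproduction sketch. The pattern `JWT_LIKE` is a hypothetical simplification (the `eyJ` marker followed by base64url characters) and is an assumption, not the actual CredSweeper rule. It only shows how a short marker can match inside an unrelated base64 line.

```python
import re

# Hypothetical simplified pattern: "eyJ" followed by base64url characters.
# This is NOT the real CredSweeper rule, only an approximation for illustration.
JWT_LIKE = re.compile(r"eyJ[A-Za-z0-9_-]+")

# The line reported in the scan output above (line 172 of the .psd1 file).
line = "# UK3O3RhOJA/u0afRTK10MCAR6wfVVJUVSZQbQpKumFwwJtoAa+h7veyJBw/3DgSY"

match = JWT_LIKE.search(line)
found = match.group(0) if match else None
print(found)
```

With this simplified pattern the match stops at the `/` character, which is why such a short value can be reported.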

For now I see a few options to solve it:

1. Add a prefix check and reject matches that have alphanumeric characters directly to their left. Example: =eyJBw... is OK, dasjkfseyJBw is bad.
2. Add a minimal value length for some regexes. For example, require a minimal JWT length.
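The prefix check could be sketched with a negative lookbehind. The pattern and the helper `find_value` are hypothetical names used for illustration and are assumptions, not CredSweeper code.

```python
import re

# Hypothetical simplified pattern, not the actual CredSweeper rule.
# (?<![A-Za-z0-9]) is a negative lookbehind: the match must not be
# directly preceded by an alphanumeric character.
BOUNDED_JWT_LIKE = re.compile(r"(?<![A-Za-z0-9])eyJ[A-Za-z0-9_-]+")


def find_value(line: str):
    """Return the matched value, or None if nothing matches."""
    m = BOUNDED_JWT_LIKE.search(line)
    return m.group(0) if m else None


print(find_value("token=eyJBwabc"))  # preceded by "=", so it may match
print(find_value("dasjkfseyJBw"))    # preceded by "s", so it is rejected
```

Note that this sketch only looks at alphanumeric characters. A match preceded by `+` or `/`, which also occur in base64 data, would still pass, so the check reduces such false positives rather than eliminating them.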

Maybe there are other options. Please propose them if you have any.

I haven't tested it with the CredData metrics yet, just ideas


csh519 · 2022-08-08T09:49:41Z

Both options are good.
Regarding option 1, maybe a prefix filter can be added like below.

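allowed_prefix = [" ", "=", ":"]  # separators that may directly precede a value

def is_filtered(candidate_prefix):
    # Filter out the candidate unless it directly follows an allowed separator
    if candidate_prefix not in allowed_prefix:
        return True  # Filtered

    return False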

babenek · 2022-08-08T17:31:30Z

The ML decision is 0.546 for the example from http://calebb.net/
There might possibly be a false negative.

meanrin · 2022-08-09T08:25:28Z

2. Add minimal value length for some regexes. For example require minimal JWT length

Adding a minimal length of 12 fixes the issue for the mentioned examples without reducing metrics on CredData.
It can be implemented by adding ValueLengthCheck to get_pattern_base_filters.
I will open a PR today.
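As a rough illustration of the length check, here is a standalone sketch. The function `is_too_short` is a hypothetical helper and does not reproduce the real ValueLengthCheck class or its signature. Treating values strictly shorter than 12 characters as filtered is also an assumption.

```python
MIN_VALUE_LENGTH = 12  # threshold reported in the comment above


def is_too_short(value: str, min_len: int = MIN_VALUE_LENGTH) -> bool:
    """Return True when the candidate value should be filtered out.

    Assumption: values strictly shorter than min_len are rejected.
    """
    return len(value) < min_len


print(is_too_short("eyJBw"))                 # 5 characters
print(is_too_short("eyJhbGciOiJIUzI1NiJ9"))  # 20 characters
```

A real JWT header is much longer than the five-character value from the report, so a modest threshold removes this kind of accidental match while leaving realistic tokens alone.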

meanrin · 2022-08-11T13:00:19Z

Other related PR: #184

meanrin · 2022-08-11T13:01:31Z

I'll close this issue as both ideas are already implemented (left alphanumeric character check and minimal length).

babenek mentioned this issue Aug 8, 2022

Fix JWT pattern #179

Closed

meanrin assigned babenek and meanrin Aug 9, 2022

meanrin mentioned this issue Aug 9, 2022

Add minimal pattern value length #182

Merged

meanrin closed this as completed Aug 11, 2022
