Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Keyword Plugin] Various accuracy improvements #229

Merged
merged 6 commits into from
Aug 24, 2019

Conversation

KevinHock
Copy link
Collaborator

πŸ”­ βž• βž•

These are no longer necessary as we made quotes required in the last
commit.
- 🐍 Alphabetize variables
- 🐍 Replace `elif` chain with a dict
- 🐍 Update QUOTES_REQUIRED_FILE_EXTENSIONS in the test
e.g. "secret": "{secret}"

- Change keyword test to not have the positives start with { and end with }
- 🐍 Make .pyi files have FileType.PYTHON
- Add FP check for '/etc/' in secret
- Add FP heuristic for secrets that look like directories
- Add <secret> FP heuristic for .example files
- Add more to FALSE_POSITIVES dict
- Make all non-quotes required filetypes skip secrets starting with a $
- Make quotes required for Terraform files
):
return FileType.YAML
return FileType.OTHER
_, file_extension = os.path.splitext(filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this handle my.weird.filename?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup

>>> os.path.splitext('my.weirdo.filename')
('my.weirdo', '.filename')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, and does this work with no file extension?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it does, the get call in the next line has a default

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For posterity, it'll be '' and then get set to FileType.OTHER

(I had an internal πŸ€” about setdefault vs. the .get fallback but πŸ€·β€β™‚ I kinda liked the way it looked.)

'account_password',
'api_key',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in both DENYLIST and FALSE_POSITIVES?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is for the case wherein silly code does magical_api_key: api_key, or something to that effect, normally in a test.

DENYLIST will catch the fact that it could be an API key from magical_api_key on the LHS, but then FALSE_POSITIVES will check the RHS and cause us not to alert on it.

'password',
'password)',
'password))',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, does this mean a word that contains password)) will be marked as a false-positive?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short-answer is, only if it == password)), not contains.

The FALSE_POSITIVES set is for RHS values normally resulting in false-positives, you could make your real password password)), but it's more likely that e.g. I made some mistake in not requiring quotes for that particular filetype, or haven't made a custom regex for that language when I should, this set patches over certain things like that.

FOLLOWED_BY_EQUAL_SIGNS_OPTIONAL_BRACKETS_OPTIONAL_AT_SIGN_QUOTES_REQUIRED_REGEX = re.compile(
# e.g. my_password = "bar"
# e.g. my_password = @"bar"
# e.g. my_password[] = "bar";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on below this is Objective-C. What does this actually do in Objective-C? (I couldn't Google it)

Copy link
Collaborator Author

@KevinHock KevinHock Aug 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ means it is a constant
[] means it is a C style array e.g. char myString[] = "This is a C character array";

Something I do not handle yet, but can ticket, are strings like the following

password = [str1 stringByAppendingFormat:@"World"];
NSMutableString *password = [NSMutableString stringWithString:@"This string is mutable"];

Prior to this PR, we'd capture e.g. [str1 and [NSMutableString, as secrets, which was 🀐

)
) or lowered_secret in FALSE_POSITIVES
or lowered_secret.count('/') >= 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, added comment in 5de43e2

@KevinHock KevinHock merged commit e1ad354 into master Aug 24, 2019
@KevinHock KevinHock deleted the cleanup_keyword_detector branch September 21, 2019 00:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants