Added a KeywordDetector plugin #76
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
""" | ||
This code was extracted in part from | ||
https://github.com/PyCQA/bandit. Using similar heuristic logic, | ||
Greetings @ericwb and @viraptor, you both seem like the primary maintainers of Bandit. This code is largely borrowed from plugins/general_hardcoded_password.py and I've tried my best to reference it as such per the Apache license of Bandit. Can you please give this your blessing, or let me know if it's not to your liking? Cheers,
This looks 👌 to me. Thank you very much for checking!
|
||
we adapted it to fit our plugin infrastructure, to create an organized, | ||
concerted effort in detecting all type of secrets in code. | ||
|
||
Copyright (c) 2014 Hewlett-Packard Development Company, L.P. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
""" | ||
from __future__ import absolute_import | ||
|
||
from .base import BasePlugin | ||
from detect_secrets.core.potential_secret import PotentialSecret | ||
|
||
|
||
BLACKLIST = ( | ||
'PASS =', | ||
'password', | ||
'passwd', | ||
'pwd', | ||
'secret', | ||
'secrete', | ||
'token', | ||
) | ||
|
||
|
||
class PasswordDetector(BasePlugin): | ||
|
||
"""This checks for private keys by determining whether the blacklisted | ||
whoops. copy pasta?
Indeed
|
||
lines are present in the analyzed string. | ||
""" | ||
|
||
secret_type = 'Password' | ||
|
||
def analyze_string(self, string, line_num, filename): | ||
output = {} | ||
|
||
for identifier in self.secret_generator(string.lower()): | ||
secret = PotentialSecret( | ||
self.secret_type, | ||
filename, | ||
line_num, | ||
identifier, | ||
) | ||
output[secret] = secret | ||
|
||
return output | ||
|
||
def secret_generator(self, string): | ||
for line in BLACKLIST: | ||
if line in string: | ||
yield line |
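As a minimal, standalone sketch of the matching logic in the diff above (not the plugin itself: `find_keywords` is a hypothetical helper, and the `BasePlugin` and `PotentialSecret` machinery is left out), the detector lowercases each line and reports every blacklist entry that appears as a substring. One consequence, assuming the code runs as shown, is that the uppercase entry `'PASS ='` cannot match a line that has already been lowercased.

```python
# Standalone sketch of the substring matching shown in the diff.
# `find_keywords` is a hypothetical helper, not part of detect-secrets.
BLACKLIST = (
    'PASS =',
    'password',
    'passwd',
    'pwd',
    'secret',
    'secrete',
    'token',
)


def find_keywords(line):
    """Return the blacklist entries found in the lowercased line."""
    lowered = line.lower()
    return [keyword for keyword in BLACKLIST if keyword in lowered]


if __name__ == '__main__':
    for sample in (
        'login_somewhere --http-password hopenobodyfindsthisone',
        'token = "noentropy"',
        'PASS = "hunter2"',  # lowercased first, so 'PASS =' is never found
    ):
        print(sample, '->', find_keywords(sample))
```

Because the check is a plain substring test, a line can match more than one entry (for example, anything containing `secrete` also contains `secret`).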
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
[some section] | ||
-secrets_for_no_one_to_find = | ||
+secreets_for_no_one_to_find = | ||
hunter2 | ||
-password123 | ||
+passsword123 | ||
BEEF0123456789a |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
deploy: | ||
user: aaronloo | ||
-password: | ||
+passhword: | ||
secure: thequickbrownfoxjumpsoverthelazydog | ||
on: | ||
-repo: Yelp/detect-secrets | ||
+repo: Yelp/detect-sechrets |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
from __future__ import absolute_import | ||
from __future__ import unicode_literals | ||
|
||
import pytest | ||
|
||
from detect_secrets.plugins.password import PasswordDetector | ||
from testing.mocks import mock_file_object | ||
|
||
|
||
class TestPasswordDetector(object): | ||
|
||
@pytest.mark.parametrize( | ||
'file_content', | ||
[ | ||
( | ||
'login_somewhere --http-password hopenobodyfindsthisone\n' | ||
), | ||
( | ||
'token = "noentropy"' | ||
), | ||
], | ||
) | ||
def test_analyze(self, file_content): | ||
logic = PasswordDetector() | ||
|
||
f = mock_file_object(file_content) | ||
output = logic.analyze(f, 'mock_filename') | ||
assert len(output) == 1 | ||
for potential_secret in output: | ||
assert 'mock_filename' == potential_secret.filename |
Instead of removing, why don't we change our baseline schema to display types in list form? This way, we can preserve all the intelligence that we gather on a particular flagged secret.
Well, it would break all the existing baselines we made (like when we changed the high entropy string types). I'm not sure how hard that would be to implement though; I can look into that.
Hmm, good point. Though, arguably a step in the right direction?
Thankfully, we built things for doing better version bumps, since the last time. We would need to update the merge_baseline function, and the pre-commit hook should get all active clients slowly upgrading.
The only gotchas I see are:
Perhaps the fix is merely to implement the dedup logic in auditing (as it's the only place that I can find that uses the `type` output from the baseline)?
e.g.
re: "Instead of removing, why don't we change our baseline schema to display types in list form?"
This is gonna lead to some super duper ugly code b/c a lot of the code was built around how a single plugin generates a `PotentialSecret`. I think it is okay to write, but I'm just mentioning b/c it'll get pretty ugly. 😁
e.g. we would still want to `yield` each individual plugin in `_results_accumulator`, and so after changing the `type` attribute of a `PotentialSecret` to be a list, I would do what I'm doing now, except in addition to removing, I would also add the type to the secret reported by another plugin:
If you can give that your blessing I'm happy to make the types into a list 🙏
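A rough sketch of the idea in this comment, under loud assumptions: `Secret`, `merge_result`, and the `(filename, line_num)` identity key are hypothetical stand-ins for illustration, not the real `PotentialSecret` or `_results_accumulator` code, and the second type name is made up. Each plugin still reports its own secret; when another plugin has already flagged the same location, the new type is appended to the existing entry instead of the duplicate simply being dropped.

```python
class Secret(object):
    """Hypothetical stand-in for PotentialSecret, holding a list of types."""

    def __init__(self, secret_type, filename, line_num):
        self.types = [secret_type]
        self.filename = filename
        self.line_num = line_num

    def key(self):
        # Assumption: two reports are "the same secret" when they point
        # at the same file and line. The real identity rule may differ.
        return (self.filename, self.line_num)


def merge_result(results, new_secret):
    """Store new_secret in results, merging types if already reported."""
    existing = results.get(new_secret.key())
    if existing is None:
        results[new_secret.key()] = new_secret
        return
    for secret_type in new_secret.types:
        if secret_type not in existing.types:
            existing.types.append(secret_type)


if __name__ == '__main__':
    results = {}
    merge_result(results, Secret('Password', 'deploy.yml', 3))
    merge_result(results, Secret('Illustrative Other Type', 'deploy.yml', 3))
    print(results[('deploy.yml', 3)].types)
```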
When you said "Perhaps the fix is merely to implement the dedup logic in auditing", did you mean you don't mind having all the `Password` ones separate from the others in the baseline? We might double our baseline sizes but I think I'm leaning towards that a little :D
(Stating for the record: we discussed this in person yesterday and decided to just take the baseline size increase, and to punt on de-duping in the audit functionality until later.)
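For reference, the deferred audit-time de-duplication could look roughly like the sketch below. It assumes a baseline shaped like `{'results': {filename: [{'type': ..., 'line_number': ...}, ...]}}` and a hypothetical `group_for_audit` helper; neither is confirmed by this thread, and grouping by file and line number is likewise an assumption. The baseline itself stays as-is (with its larger size), and only the view presented during auditing is merged.

```python
from collections import OrderedDict


def group_for_audit(baseline):
    """Group baseline entries by (filename, line_number), collecting types.

    Hypothetical helper: the baseline is not modified, only read.
    """
    grouped = OrderedDict()
    for filename, secrets in sorted(baseline['results'].items()):
        for entry in secrets:
            key = (filename, entry['line_number'])
            grouped.setdefault(key, []).append(entry['type'])
    return grouped


if __name__ == '__main__':
    # Assumed baseline shape, for illustration only.
    baseline = {
        'results': {
            'deploy.yml': [
                {'type': 'Password', 'line_number': 3},
                {'type': 'Illustrative Other Type', 'line_number': 3},
                {'type': 'Password', 'line_number': 8},
            ],
        },
    }
    for (filename, line_number), types in group_for_audit(baseline).items():
        print(filename, line_number, types)
```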