-
Notifications
You must be signed in to change notification settings - Fork 472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bring-your own-plugins (BYOP), via --custom-plugins option #255
Conversation
if parser.prog == 'detect-secrets-hook': | ||
return parser | ||
|
||
for action in parser._actions: # pragma: no cover (Always returns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bottom answer of https://stackoverflow.com/questions/43688450/how-to-obtain-argparse-subparsers-from-a-parent-parser-to-inspect-defaults had some great pointers, works in both 2 and 3.
1afce7b
to
570b41b
Compare
@@ -20,6 +20,7 @@ class SecretsCollection(object): | |||
def __init__( | |||
self, | |||
plugins=(), | |||
custom_plugin_paths=[], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we switch this to None
and then fill in the correct default value below? Re: the default-parameter-gets-re-used-in-Python feature.
I think the ()
above is fine because ()
are immutable.
# The difference will show whenever the word list changes | ||
automaton, result.word_list_hash = build_automaton(result.word_list_file) | ||
|
||
# In v0.12.8 the `--custom-plugins` option got added | ||
result.custom_plugin_paths = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do result.custom_plugin_paths = data.get('custom_plugin_paths', [])
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch, I caught this too after reviewing it again, reminded me of bloodhound code.
plugin_paths = ['detect_secrets/plugins'] + list(custom_plugin_paths) | ||
plugin_modules = [] | ||
for plugin_path in plugin_paths: | ||
for root, _, files in os.walk( # pragma: no cover (Always breaks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the rationale for os.walk
vs os.listdir
? Seems like with listdir
we could reduce the code complexity here.
Nevermind, I see that it was already like that. You can fix it if you want though :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most definitely :)
detect_secrets/core/usage.py
Outdated
action='append', | ||
default=[], | ||
dest='custom_plugin_paths', | ||
help='Paths containing custom plugin Python files, not searched recursively.', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Directories containing ...
clearer? Reading the code, it seems like things will break if we specify a file here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that makes sense, ++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that it now works for directories and files.
tests/core/audit_test.py
Outdated
@@ -243,6 +243,7 @@ def mock_env(self, user_inputs=None, baseline=None): | |||
@property | |||
def baseline(self): | |||
return { | |||
'custom_plugin_paths': (), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be lists so that it matches the actual interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! It was originally a tuple, so I must have missed a spot 👍
|
||
|
||
def _change_custom_plugin_paths_to_tuple(custom_plugin_paths_function): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we renamed this decorator to def tuplify
(or whatever the canonical verb is) it would be a lot easier to understand. Alternatively, I'm not opposed to just putting it in the function body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally prefer genericized naming too, but as written the argument has to be named custom_plugin_paths_function
, otherwise it won't work, so I'm ambivalent. Maybe I can try to make it so the decorator tuplifies all arguments 🤔 , ++.
If we put it in the function body, I don't think we could lru_cache
it, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand why the parameter has to be named custom_plugin_paths_function
. Can't we s/custom_plugin_paths_function/fn/g
?
I don't know where the lru_cache
is applied right now, so we can leave it as a decorator if it works as intended already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*Ouch my bad @OiCMudkips, I mis-typed last Friday, the argument to the function being decorated has to be named custom_plugin_paths
, as it's currently written. So it's not applicable to any function or any function with 1 arg, only 1 arg func's with custom_plugin_paths
as an arg.
I will try to make it so that it can accept any 1 arg func, so we can name it a more genericized name though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: where the lru_cache is applied
I wrap functions that accept a list, to turn them into functions that accept a tuple, before the lru_cache
decorator gets to them, so that they can be cached. Lists are unacceptable for immutability reasons IIRC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up with this: the reason the arg must be named custom_plugin_paths
, is because any function decorated by this decorator may be called with custom_plugin_paths
as a keyword argument.
i.e. If I changed this to
def wrapper_of_custom_plugin_paths_function(one_arg):
return custom_plugin_paths_function(tuple(one_arg))
it would result in
============================================== ERRORS ===============================================
________________________________ ERROR collecting tests/main_test.py ________________________________
tests/main_test.py:76: in <module>
class TestMain:
tests/main_test.py:294: in TestMain
'name': 'Base64HighEntropyString',
tests/main_test.py:33: in get_list_of_plugins
for name, plugin in import_plugins(custom_plugin_paths=[]).items():
E TypeError: wrapper_of_custom_plugin_paths_function() got an unexpected keyword argument 'custom_plugin_paths'
@@ -78,10 +127,36 @@ def add_console_use_arguments(self): | |||
return self | |||
|
|||
def parse_args(self, argv): | |||
output = self.parser.parse_args(argv) | |||
PluginOptions.consolidate_args(output) | |||
# We temporarily remove '--help' so that we can give the full |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand this, can you explain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do --help
with --custom-plugins
, you want to see the options for e.g. --no-custom-detector
, if we just parse --help
before --custom-plugins
, it will SystemExit and give the help message without any options from the custom plugins.
570b41b
to
f3214ee
Compare
# Dynamically imported classes have different | ||
# addresses for the same functions as statically | ||
# imported classes do, so issubclass does not work. | ||
# We use __mro__ to loop through all parent classes. | ||
# issubclass does work in Python 3, since parent classes | ||
# do have the same addresses. | ||
if not any( | ||
'plugins.base.BasePlugin' in str(klass) | ||
for klass in attr.__mro__ | ||
): | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the issue explained in this comment, super
calls in Python 2 for plugins no longer work!
e.g. the super
call in
@property
def __dict__(self):
output = super(HighEntropyStringsPlugin, self).__dict__
output.update({
'hex_limit': self.entropy_limit,
})
return output
will throw a super(type, obj): obj must be an instance or subtype of type
error because it thinks self
is not a subtype, when really it just has different addresses for its functions. I patched around the same issue with issubclass
in this highlighted code, but with super
there is not a desirable path forward that I can see, I shall ask on StackOverflow for advice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For posterity, I did look into this, although I time-boxed myself.
You can change the metaclass of BasePlugin to be e.g.
+class AlwaysTrue(ABCMeta):
+ def __instancecheck__(self, instance):
+ def __subclasscheck__(self, instance):
however, super still throws the above error, only isinstance
and issubclass
work.
In other words, I believe this is definitely blocked on dropping Python 2 support #260. Either that or have the user supply the import path as well as the file path, but I don't want to do that.
In general it sounds like a good idea. Have we thought about how to deal with things like additional pip packages required by new plugins? Another thing is it would be nice to include a helloworld or sample plugin in the sample_custom_plugins folder, so others can easily follow and start developing new plugins. |
For the server-side case it's easy to include any packages you need next to e.g.
I did include a couple custom plugins in the tests, but those are kind of too basic. Since we link to the plugins/ dir in the README right before we mention We should also add a link to the right |
I think this should work. Given some of the relative imports might be broken (e.g. example) depends on who user structure their custom import folders.
👍 |
To detect AWS secrets keys in function calls
8498770
to
484970e
Compare
assert str(plugin.__class__) == str(HexHighEntropyString) | ||
assert dir(plugin.__class__) == dir(HexHighEntropyString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to do
- assert all(
- func(plugin.__class__) == func(Base64HighEntropyString)
- for func in (str, dir)
- )
+ assert str(plugin.__class__) == str(Base64HighEntropyString)
+ assert dir(plugin.__class__) == dir(Base64HighEntropyString)
because otherwise it was 94% 21->exit, 71->exit
I rebased with master so this should be ready to review now @OiCMudkips / @killuazhu 😁 |
- 🐍 Add add_shared_arguments() function in usage.py for DRYness - 🐛 Fix issue #242 via passing `should_verify_secrets=not args.no_verify` to `from_parser_builder` call - 🐛 Fix sorting issue in format_baseline_for_output() where --update and regular scan had different secret order - 💯 All non-separated out files again :D - 🎓 Mention `--custom-plugins` in README - 🎓 Standardize NOTE -> Note - 🐛 Fix test pollution due to `all_plugins` cls attribute - 🐍 Change all relative imports to absolute, to avoid broken imports if someone copies an existing plugin to make a custom plugin 🙈 Hacks located in `def parse_args` of usage.py
Polite re-ping on this one 💟 I have some fun funemployed time at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, hopefully you're holding up OK :)
It's been a while so I'll probably do one more pass tomorrow to make sure this is sane but here's some more for you to chew on for now.
@@ -283,13 +290,13 @@ def determine_audit_results(baseline, baseline_path): | |||
audit_results['stats'][audit_result]['count'] += 1 | |||
audit_results['stats'][audit_result]['files'][filename] += 1 | |||
total += 1 | |||
audit_results['stats']['signal'] = str( | |||
audit_results['stats']['signal'] = '{:.2f}'.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The %
go into this format string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch!! I no longer need the [:4]
so that makes a lot of sense 🎊
|
||
from detect_secrets.plugins.base import BasePlugin | ||
from detect_secrets.util import get_root_directory | ||
|
||
|
||
def change_custom_plugin_paths_to_tuple(custom_plugin_paths_function): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems kinda excessive, can we just call tuple(argument)
wherever we want to try to be safer?
One alternative is to make it so that the ArgumentParser
returns a tuple
by implementing a custom action but honestly I'd be OK with us just not worry about the accidentally mutating the list of paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for tuplifying is so that we can lru_cache
the functions this decorator is used on, not b/c I'm afraid the list will get mutated.
I'll re-read the code to see if tuplifying as soon as we parse args (/the baseline) makes things easier. 👍 Convert to X as early as possible seems like always good advice, I'll have to see if there was a reason I wrote it this way. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not appear there was a good reason 😁 I'll go ahead and do the custom action, something similar to https://stackoverflow.com/questions/50365835/how-to-make-argparse-argumentparser-return-a-tuple-or-an-np-array-rather-than-a
* Fail build * Support python 2 * One more 🐛 * Add no cov to avoid py2 code in py3 env
Made new PR due to change in GitHub permissions. So closing this one. |
* Fail build * Support python 2 * One more 🐛 * Add no cov to avoid py2 code in py3 env
In case you wanted to make e.g. a new entropy plugin with different metacharacters, or a new keyword type detector, or a company-specific plugin, or use it/tune it privately before merging upstream.
To be used like so:
detect-secrets scan --custom-plugins custom_plugins/ test_data/each_secret.py
We had a "truffleHog vs. metasploit" discussion internally. With the former, you can e.g. edit a config file with regexes in it like
shhgit
, with the latter you write your own modules.Main reasons why this seems like a better fit for
detect-secrets