Massive reduction in performance after 1.5.0 #438

Plazmaz · 2018-12-18T15:52:40Z

Describe the bug
After the 1.5.0 release, there appears to have been a significant reduction in performance. For instance, here is the output of time bandit -r . using 1.4.0:

17.56user 2.12system 0:21.22elapsed 92%CPU (0avgtext+0avgdata 90712maxresident)k
182728inputs+0outputs (0major+45973minor)pagefaults 0swaps

And here it is with 1.5.0:

261.46user 3.96system 4:27.26elapsed 99%CPU (0avgtext+0avgdata 89164maxresident)k
362728inputs+0outputs (8major+45690minor)pagefaults 0swaps

To Reproduce
Steps to reproduce the behavior:

Run time bandit -r . within a medium to large-size project on 1.4.0
Note the time.
Run the same command again, but with bandit 1.5.0
Compare the two times. You should see a significant drop in performance

Expected behavior
Performance consistent or comparable with 1.4.0

Bandit version

bandit 1.5.0
  python version = 2.7.15rc1 (default, Apr 15 2018, 21:51:34) [GCC 7.3.0]

The text was updated successfully, but these errors were encountered:

Plazmaz · 2018-12-18T15:58:15Z

Worth noting this is also an issue in 1.5.1

ehooo · 2018-12-22T19:28:53Z

I just test rule by rule and the issue is not look related to this.

The only point I see that could be caused is the number of rules from 64 to 70 but according to the test I made the degradation time is around 3 or 4 sec, so should be another thing.

Here you are the code and the result in ODS file Bandit_time.zip

To get this data I clone twice bandit one for 1.4.0 and other for 1.5.0 and on a virtual env with each bandit installed to run the script attached.

Bandit 1.4.0

B602: 0.550184
B405: 0.640002
Total: 0:00:36.750709 - Num: 64
Mean: 0.574229 - Median: 0.569994
With all: 0.730448

Bandit 1.5.0
B110: 0.547545
B402: 0.649634
Total: 0:00:40.508832 - Num: 70
Mean: 0.578697 - Median: 0.577912
With all: 0:00:08.433786

I also check the code diffing and I don't see anything that could cause that degradation.
Any idea?

lukehinds · 2018-12-23T15:39:47Z

It might be worth us putting in some debug timers on each plugin call to see if its any in particular that are causing the issue.

ehooo · 2018-12-24T18:25:23Z

I just run line_profile

import line_profiler
from bandit.cli.main import _get_profile
from bandit.core import config as b_config
from bandit.core import manager as b_manager


def line_profiler_function():
    targets = ('../django-moth/', )
    formatter = 'txt'

    prof = line_profiler.LineProfiler()

    b_conf = b_config.BanditConfig(config_file=None)
    profile = _get_profile(b_conf, None, None)
    b_mgr = b_manager.BanditManager(b_conf, 'file', False,
                                    profile=profile, verbose=False,
                                    ignore_nosec=False)
    b_mgr.discover_files(targets, True, None)
    for key in b_mgr.b_ts.tests:
        for funct in b_mgr.b_ts.tests[key]:
            prof.add_function(funct)
    f = prof(b_mgr.run_tests)
    f()
    prof.print_stats(open('profile.log', 'w'))
    prof.disable()

Just to resume the issue is on blacklist function:
On 1.4.0:

Total time: 0.307109 s
File: bandit/core/blacklisting.py
Function: blacklist at line 30
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
...
    54                                                   else:
    55       624        683.0      1.1      0.2              name = context.call_function_name_qual
    56     21828      11170.0      0.5      3.6          for check in blacklists[node_type]:
    57     77940      41703.0      0.5     13.6              for qn in check['qualnames']:
    58     56737     233936.0      4.1     76.2                  if fnmatch.fnmatch(name, qn):
    59         4         57.0     14.2      0.0                      return report_issue(check, name)
    60                                           
    61       863        859.0      1.0      0.3      if node_type.startswith('Import'):
    62       242        137.0      0.6      0.0          prefix = ""
    63       242        153.0      0.6      0.0          if node_type == "ImportFrom":
    64       198        252.0      1.3      0.1              if context.node.module is not None:
    65       198        237.0      1.2      0.1                  prefix = context.node.module + "."
    66                                           
    67      3104       1679.0      0.5      0.5          for check in blacklists[node_type]:
    68      5964       5013.0      0.8      1.6              for name in context.node.names:
    69      7226       4191.0      0.6      1.4                  for qn in check['qualnames']:
    70      4130       3066.0      0.7      1.0                      if (prefix + name.name).startswith(qn):
    71         6         99.0     16.5      0.0                          return report_issue(check, name.name)

On 1.5.0:

Total time: 37.5339 s
File: bandit/core/blacklisting.py
Function: blacklist at line 30
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
...
    54                                                   else:
    55       624        688.0      1.1      0.0              name = context.call_function_name_qual
    56                                                       # In the case the Call is an importlib.import, treat the first
    57                                                       # argument name as an actual import module name.
    58       624        393.0      0.6      0.0              if name in ["importlib.import_module", "importlib.__import__"]:
    59                                                           name = context.call_args[0]
    60     24320      15044.0      0.6      0.0          for check in blacklists[node_type]:
    61    102295      88582.0      0.9      0.2              for qn in check['qualnames']:
    62     78600   37401448.0    475.8     99.6                  if fnmatch.fnmatch(name, qn):
    63         4        161.0     40.2      0.0                      return report_issue(check, name)
    64                                           
    65       863       1130.0      1.3      0.0      if node_type.startswith('Import'):
    66       242        146.0      0.6      0.0          prefix = ""
    67       242        161.0      0.7      0.0          if node_type == "ImportFrom":
    68       198        264.0      1.3      0.0              if context.node.module is not None:
    69       198        264.0      1.3      0.0                  prefix = context.node.module + "."
    70                                           
    71      3576       2015.0      0.6      0.0          for check in blacklists[node_type]:
    72      6944       5963.0      0.9      0.0              for name in context.node.names:
    73     12060       7133.0      0.6      0.0                  for qn in check['qualnames']:
    74      8456       6229.0      0.7      0.0                      if (prefix + name.name).startswith(qn):
    75         6        104.0     17.3      0.0                          return report_issue(check, name.name)
name)

ehooo · 2018-12-26T21:14:14Z

I don't see any possible solution to solve it.
If I'm not wrong the only changes was on:

For call.py file

'pickle', 'B301',
- 'dill.loads',
- 'dill.load',
- 'dill.Unpickler'
'md5', 'B303':
- 'hashlib.sha1',
- 'Crypto.Hash.SHA.new',
- 'Cryptodome.Hash.MD2.new',
- 'Cryptodome.Hash.MD4.new',
- 'Cryptodome.Hash.MD5.new',
- 'Cryptodome.Hash.SHA.new',
- 'cryptography.hazmat.primitives.hashes.SHA1'
'ciphers', 'B304':
- 'Cryptodome.Cipher.ARC2.new',
- 'Cryptodome.Cipher.ARC4.new',
- 'Cryptodome.Cipher.Blowfish.new',
- 'Cryptodome.Cipher.DES.new',
- 'Cryptodome.Cipher.XOR.new',
'tempnam', 'B325':
- 'os.tempnam'
- 'os.tmpnam'
'unverified_context', 'B323',:
- 'ssl._create_unverified_context'

For imports.py file

import_pycrypto', 'B413':
- 'Crypto.Cipher',
- 'Crypto.Hash',
- 'Crypto.IO',
- 'Crypto.Protocol',
- 'Crypto.PublicKey',
- 'Crypto.Random',
- 'Crypto.Signature',
- 'Crypto.Util'
'import_pycryptodome', 'B414':
- 'Cryptodome.Cipher',
- 'Cryptodome.Hash',
- 'Cryptodome.IO',
- 'Cryptodome.Protocol',
- 'Cryptodome.PublicKey',
- 'Cryptodome.Random',
- 'Cryptodome.Signature',
- 'Cryptodome.Util'

I also test the code with the commit 257b5dd the different is near to None

I realy don't know how to fix it. :(

Any idea??

lukehinds · 2019-01-11T12:55:33Z

Tracked down the issue to Python 2 and < Release 1.50

Python 3.7.1 & Bandit 1.4.0

result: bandit -r youtube-dl  64.17s user 0.12s system 99% cpu 1:04.47 total

Python 3.7.1 & Bandit 1.5.0

result: bandit -r youtube-dl  64.52s user 0.07s system 99% cpu 1:04.68 total

Python 3.7.1 & Bandit 1.5.1

result: bandit -r youtube-dl  63.88s user 0.09s system 99% cpu 1:04.06 total

Python version 2.7.15 & Bandit 1.4.0

result: bandit -r youtube-dl  73.62s user 0.17s system 99% cpu 1:14.00 total

Python version 2.7.15 & Bandit 1.5.0

result: bandit -r youtube-dl  597.42s user 0.72s system 99% cpu 10:00.45 total

Python version 2.7.15 & Bandit 1.5.1

result: bandit -r youtube-dl  581.50s user 0.46s system 99% cpu 9:45.59 total

Now trying to find the particular change that caused this. As per #422 B303, B304, B305 are where the more delaying factors are based.

For now recommend anyone feeling the impact of this in their gate, change to running bandit under python3.

Plazmaz · 2019-01-14T19:00:18Z

@lukehinds unfortunately that's not currently an option in my scenario. The temp fix I've implemented scans a much smaller subset of files, but it's definitely sub-optimal. One thing I noticed was that calls to re seemed to be taking up a large amount of the execution time, so if there's anywhere that doesn't use precompiled regexes, that might be worth looking at.

ghugo · 2019-01-14T21:41:00Z

Looks like this is the commit where performance tanks:
85e5667

Commenting out the added blacklisted items brings performance back to being the original speedy self.

gagehugo@zbook:~$ time bandit -r keystone/keystone/ -x keystone/keystone/tests/ -f txt -o /dev/null
[main]	INFO	profile include tests: None
[main]	INFO	profile exclude tests: None
[main]	INFO	cli include tests: None
[main]	INFO	cli exclude tests: None
[main]	INFO	running on Python 2.7.12
525 [0.. 50.. 100.. 150.. 200.. 250.. 300.. 350.. 400.. 450.. 500.. ]
[text]	INFO	Text output written to file: /dev/null

real	1m33.092s
user	1m33.008s
sys	0m0.080s

After commenting out:

    sets.append(utils.build_conf_dict(
        'md5', 'B303',
        ['hashlib.md5',
         'hashlib.sha1',
         'Crypto.Hash.MD2.new',
         'Crypto.Hash.MD4.new',
         'Crypto.Hash.MD5.new',
#         'Crypto.Hash.SHA.new',
         'Cryptodome.Hash.MD2.new',
         'Cryptodome.Hash.MD4.new',
         'Cryptodome.Hash.MD5.new'],
#         'Cryptodome.Hash.SHA.new',
#         'cryptography.hazmat.primitives.hashes.MD5',
#         'cryptography.hazmat.primitives.hashes.SHA1'],
        'Use of insecure MD2, MD4, MD5, or SHA1 hash function.'
        ))

gagehugo@zbook:~$ time bandit -r keystone/keystone/ -x keystone/keystone/tests/ -f txt -o /dev/null
[main]	INFO	profile include tests: None
[main]	INFO	profile exclude tests: None
[main]	INFO	cli include tests: None
[main]	INFO	cli exclude tests: None
[main]	INFO	running on Python 2.7.12
525 [0.. 50.. 100.. 150.. 200.. 250.. 300.. 350.. 400.. 450.. 500.. ]
[text]	INFO	Text output written to file: /dev/null

real	0m6.508s
user	0m6.452s
sys	0m0.052s

lukehinds · 2019-01-17T11:37:42Z

nice find @ghugo

I wonder what is the factor that causes the performance variations between python 2&3, something.

ehooo · 2019-01-17T12:07:07Z

I just check the function fnmatch on python core and it was changed:
https://github.com/python/cpython/blob/2.7/Lib/fnmatch.py#L69
https://github.com/python/cpython/blob/3.4/Lib/fnmatch.py#L64

Changed on the commit:
python/cpython@5b0c1e0#diff-b62de5c6bce57a277ae7bc60324011ec

lukehinds · 2019-01-18T07:50:07Z

In light of @ehooo's findings I suggest we update the README with a recommendation that folks use py3 if utilising the crypto plugins with a link to this issue. That way if they have a super need to run Bandit under py2 and with the crypto plugins, they can pursue the python code linked in comment

If no objections I will assign this to me, and close it on the basis of the above readme additions.

Plazmaz · 2019-02-22T21:16:45Z

Just wanted to add that there is still a significant reduction in performance with ANY of the tests under B303 enabled, not just the ones added in 85e5667. It seems to get better the more of those I comment out, but is still very slow in general.
With commented out lines in B303:

231.61user 0.15system 3:51.80elapsed 99%CPU (0avgtext+0avgdata 76112maxresident)k
0inputs+0outputs (0major+24931minor)pagefaults 0swaps

B303 Disabled:

15.70user 0.09system 0:15.84elapsed 99%CPU (0avgtext+0avgdata 74372maxresident)k
104inputs+0outputs (0major+21624minor)pagefaults 0swaps

Also, my original test was with the same version of python in two different versions of bandit, so it seems somewhat suspicious that this is an issue in python 2.7.

KelSolaar · 2019-05-21T02:07:01Z

Hi,

I just came around this issue for similar problems, I'm extending the blacklists with more content and noticed that the time skyrocketed, the culprit being indeed the fnmatch definition and the calls to the re regex library:

Is there a strong rational for using regex here? Looking at the shipped blacklists, none seem to be using regex patterns.

KelSolaar · 2019-05-21T02:14:54Z

A quick test I did was just changing this line:

From

if name is not None and fnmatch.fnmatch(name, qn):

To

if name is not None and name == qn:

and I went from 14m48.641s to 0m20.871s while removing 1 false positive. Given that it seems like regex matching should be optional and not the default? I will certainly monkey-patch the definition on our end.

Plazmaz · 2019-05-23T00:12:45Z

FWIW fnmatch is not regex matching. That'd be via the re library, which also allows for precompiled patterns (which might improve performance)

KelSolaar · 2019-05-23T01:37:03Z

@Plazmaz : I might misunderstood what you just said but fnmatch is definitely regex matching: https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/fnmatch.py#L39 and it is also quite clear from the profile screengrab I posted.

Plazmaz · 2019-05-23T01:38:43Z

Ah my bad, I didn't notice it used regex internally. I was looking at https://docs.python.org/3.7/library/fnmatch.html and figured it was just glob matching.

KelSolaar · 2019-05-23T01:42:59Z

Yeah, nah :) It explains why it is uber slow.

Plazmaz · 2019-05-23T01:44:32Z

I mean, it could precompile those regexes and it would probably help a lot.

Bandit UX is seriously broken, only <1.6 works predictably. Exclude/ignore of files is currently broken in Bandit: - PyCQA/bandit#693 - PyCQA/bandit#490 - PyCQA/bandit#438 (comment) Reading settings from configuration files is broken: - PyCQA/bandit#753 - PyCQA/bandit#595 Reading from pyproject.toml not yet functional: - Must install "toml" package and use "-c pyproject.toml". - PyCQA/bandit#758 INI file configuration and CLI usage is unclear: - PyCQA/bandit#603 - PyCQA/bandit#467 - PyCQA/bandit#396

Bandit UX is seriously broken, only <1.6 works predictably. Exclude/ignore of files is currently broken in Bandit: - PyCQA/bandit#693 - PyCQA/bandit#490 - PyCQA/bandit#438 (comment) Reading settings from configuration files is broken: - PyCQA/bandit#753 - PyCQA/bandit#595 Reading from pyproject.toml not yet functional: Must install "toml" package and use "-c pyproject.toml". - PyCQA/bandit#758 INI file configuration and CLI usage is unclear: - PyCQA/bandit#603 - PyCQA/bandit#467 - PyCQA/bandit#396

The blacklisting function is currently using fnmatch.fnmatch() to do matching of qualified names of blacklist calls. It seems it is only used for telnetlib and ftplib where they are setting the qualified name in a file glob style (telnetlib.*). This change would slightly break backward compatibility if there are any third-party plugins that use globbing in the qualified names for blacklisting. I think the likelyhood is small. I also think it is better to be more explicit in the qualified name patterns. In the case of ftplib, FTP is insecure, but FTP_TLS is not. So this already is resolving one false postive. The other effect of this change is a slight boost to performance. When scanning cpython prior to this fix, it would take around 1 min. After the fix, closer to 50 seconds. So a nice little bump in speed. Fixes: PyCQA#438 Signed-off-by: Eric Brown <eric_wade_brown@yahoo.com>

* Performance improvement in blacklist function The blacklisting function is currently using fnmatch.fnmatch() to do matching of qualified names of blacklist calls. It seems it is only used for telnetlib and ftplib where they are setting the qualified name in a file glob style (telnetlib.*). This change would slightly break backward compatibility if there are any third-party plugins that use globbing in the qualified names for blacklisting. I think the likelyhood is small. I also think it is better to be more explicit in the qualified name patterns. In the case of ftplib, FTP is insecure, but FTP_TLS is not. So this already is resolving one false postive. The other effect of this change is a slight boost to performance. When scanning cpython prior to this fix, it would take around 1 min. After the fix, closer to 50 seconds. So a nice little bump in speed. Fixes: #438 Signed-off-by: Eric Brown <eric_wade_brown@yahoo.com> * Add test for usage of FTP_TLS This change adds an FTP_TLS call to the examples. A high severity error is no longer reported as a result of the fix in PR #1148 that explicitly now matches blacklist call qualified names rather than using a file glob. However, you will notice that there is one more high severity issue reported in the tests as a result of the import of ftplib.FTP_TLS because the blacklist import is only checking for "ftplib". Fixes: #148 Signed-off-by: Eric Brown <eric_wade_brown@yahoo.com> --------- Signed-off-by: Eric Brown <eric_wade_brown@yahoo.com>

ericwb added the bug Something isn't working label Dec 19, 2018

lukehinds mentioned this issue Jan 11, 2019

Hash and crypto blacklists greatly increasing execution time #422

Closed

ericwb added this to the Near Future milestone May 9, 2019

ehooo mentioned this issue May 29, 2019

degraded performance in 1.6.0 release #490

Closed

bittner mentioned this issue Jan 10, 2022

Restore older Bandit version for predictive setup behave/behave-django#134

Merged

ericwb mentioned this issue Jun 23, 2024

Performance improvement in blacklist function #1148

Merged

ericwb closed this as completed in #1148 Jun 24, 2024

ericwb closed this as completed in 4208e9d Jun 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Massive reduction in performance after 1.5.0 #438

Massive reduction in performance after 1.5.0 #438

Plazmaz commented Dec 18, 2018

Plazmaz commented Dec 18, 2018

ehooo commented Dec 22, 2018

lukehinds commented Dec 23, 2018

ehooo commented Dec 24, 2018 •

edited

Loading

ehooo commented Dec 26, 2018

lukehinds commented Jan 11, 2019 •

edited

Loading

Plazmaz commented Jan 14, 2019

ghugo commented Jan 14, 2019

lukehinds commented Jan 17, 2019

ehooo commented Jan 17, 2019

lukehinds commented Jan 18, 2019 •

edited

Loading

Plazmaz commented Feb 22, 2019 •

edited

Loading

KelSolaar commented May 21, 2019 •

edited

Loading

KelSolaar commented May 21, 2019 •

edited

Loading

Plazmaz commented May 23, 2019 •

edited

Loading

KelSolaar commented May 23, 2019

Plazmaz commented May 23, 2019 •

edited

Loading

KelSolaar commented May 23, 2019

Plazmaz commented May 23, 2019

Massive reduction in performance after 1.5.0 #438

Massive reduction in performance after 1.5.0 #438

Comments

Plazmaz commented Dec 18, 2018

Plazmaz commented Dec 18, 2018

ehooo commented Dec 22, 2018

lukehinds commented Dec 23, 2018

ehooo commented Dec 24, 2018 • edited Loading

ehooo commented Dec 26, 2018

lukehinds commented Jan 11, 2019 • edited Loading

Plazmaz commented Jan 14, 2019

ghugo commented Jan 14, 2019

lukehinds commented Jan 17, 2019

ehooo commented Jan 17, 2019

lukehinds commented Jan 18, 2019 • edited Loading

Plazmaz commented Feb 22, 2019 • edited Loading

KelSolaar commented May 21, 2019 • edited Loading

KelSolaar commented May 21, 2019 • edited Loading

Plazmaz commented May 23, 2019 • edited Loading

KelSolaar commented May 23, 2019

Plazmaz commented May 23, 2019 • edited Loading

KelSolaar commented May 23, 2019

Plazmaz commented May 23, 2019

ehooo commented Dec 24, 2018 •

edited

Loading

lukehinds commented Jan 11, 2019 •

edited

Loading

lukehinds commented Jan 18, 2019 •

edited

Loading

Plazmaz commented Feb 22, 2019 •

edited

Loading

KelSolaar commented May 21, 2019 •

edited

Loading

KelSolaar commented May 21, 2019 •

edited

Loading

Plazmaz commented May 23, 2019 •

edited

Loading

Plazmaz commented May 23, 2019 •

edited

Loading