CVE-2022-42964 ReDOS vulnerability in GaussianInput #2755

Open
drew-parsons opened this issue Nov 28, 2022 · 2 comments

drew-parsons commented Nov 28, 2022

Describe the bug
A ReDoS vulnerability, CVE-2022-42964, has been reported in the GaussianInput.from_string method.

An exponential ReDoS (Regular Expression Denial of Service) can be triggered in the pymatgen PyPI package when an attacker is able to supply arbitrary input to the GaussianInput.from_string method.

The report was made at https://research.jfrog.com/vulnerabilities/pymatgen-redos-xray-257184/
and documented by Debian at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024017 (see also https://security-tracker.debian.org/tracker/CVE-2022-42964 )

To Reproduce
Steps to reproduce the behavior:

  1. Use test code CVE-2022-42964.py
import time
from pymatgen.io.gaussian import GaussianInput

def str_and_from_string(i):
    ans = """#P HF/6-31G(d) SCF=Tight SP

H4 C1

0 1
"""
    vulnerable_input = ans + 'C'+'0' * i + '!'+'\n'
    GaussianInput.from_string(vulnerable_input)

for i in range(1000):
    start = time.time()
    str_and_from_string(i)
    print(f"{i}: Done in {time.time() - start}")
  2. python3 CVE-2022-42964.py
  3. Output shows exponentially growing execution time for what should be a trivial constant loop

Expected behavior
Parsing strings of the kind in this example should take roughly the same sub-millisecond time in each iteration.

Screenshots

$ python3 CVE-2022-42964.py
0: Done in 0.0006997585296630859
1: Done in 4.506111145019531e-05
2: Done in 3.814697265625e-05
3: Done in 4.291534423828125e-05
4: Done in 5.6743621826171875e-05
5: Done in 6.365776062011719e-05
6: Done in 5.555152893066406e-05
7: Done in 7.033348083496094e-05
8: Done in 0.00010371208190917969
9: Done in 0.00017571449279785156
10: Done in 0.0003418922424316406
11: Done in 0.0006191730499267578
12: Done in 0.0012633800506591797
13: Done in 0.002537250518798828
14: Done in 0.005010366439819336
15: Done in 0.009590387344360352
16: Done in 0.01953911781311035
17: Done in 0.03992795944213867
18: Done in 0.07311630249023438
19: Done in 0.13045120239257812
20: Done in 0.2530491352081299
21: Done in 0.5362303256988525
22: Done in 1.0537843704223633
23: Done in 2.012873888015747
24: Done in 4.074865102767944
25: Done in 8.38607931137085
26: Done in 17.248133182525635
27: Done in 38.30663585662842
28: Done in 79.40008401870728
...
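
To make the "exponential" claim concrete, here is a small sketch that computes the ratio between successive timings. The values are copied from the output above and rounded to four significant digits; the variable names are only illustrative.

```python
# Timings in seconds for i = 14..28, copied from the output above (rounded).
timings = {
    14: 0.005010, 15: 0.009590, 16: 0.01954, 17: 0.03993, 18: 0.07312,
    19: 0.1305, 20: 0.2530, 21: 0.5362, 22: 1.054, 23: 2.013,
    24: 4.075, 25: 8.386, 26: 17.25, 27: 38.31, 28: 79.40,
}

# Ratio of each timing to the previous one; a roughly constant ratio
# means the run time grows exponentially with the input length.
ratios = {i: timings[i] / timings[i - 1] for i in sorted(timings) if i - 1 in timings}

for i, ratio in ratios.items():
    print(f"{i - 1} -> {i}: x{ratio:.2f}")
```

Every added character roughly doubles the run time, which is what you would expect from a regex that backtracks over all ways of splitting the input.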

Desktop:

  • OS: Debian Linux
  • Linux Version 6.0.8-1 (debian unstable)

Additional context
Python 3.10.8
pymatgen 2022.11.7

@ScottNotFound
Contributor

Looks like this stems from ^(\w+)* in

_zmat_patt = re.compile(r"^(\w+)*([\s,]+(\w+)[\s,]+(\w+))*[\-\.\s,\w]*$")

which you can explode with "0" * 100 + "!".

I'm not a Gaussian user, but is this part of the regex necessary? Can the input be satisfied with ^(\w)*...?
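
For anyone who wants to see the blow-up without pymatgen installed, here is a minimal sketch that runs the quoted pattern directly with the standard re module. The helper name time_match is made up for illustration, and n is kept small on purpose because "0" * 100 + "!" would effectively never finish.

```python
import re
import time

# Pattern quoted above; the nested quantifier in (\w+)* is the problem.
zmat_patt = re.compile(r"^(\w+)*([\s,]+(\w+)[\s,]+(\w+))*[\-\.\s,\w]*$")


def time_match(n):
    """Return (match result, elapsed seconds) for n zeros followed by '!'."""
    payload = "0" * n + "!"
    start = time.perf_counter()
    result = zmat_patt.match(payload)
    return result, time.perf_counter() - start


# Keep n small: the time grows steeply with each extra character.
for n in (10, 12, 14, 16, 18):
    result, elapsed = time_match(n)
    print(f"n={n}: match={result!r}, {elapsed:.6f} s")
```

The trailing "!" can never match, so the engine tries every way of splitting the run of zeros between repetitions of (\w+) before giving up.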

@gVallverdu
Contributor

@ScottNotFound I don't think it is possible to change ^(\w+)* to ^(\w)*.

This regex is for example looking for lines such as

C1
C2  1   CC
H3  1   CH1  2  asp2
H4  1   CH1  2  asp2  3  180.
H5  2   CH2  1  asp3  3  D1
H6  2   CH2  1  asp3  5  D2

The line starts with an element symbol, which can optionally be followed by a number.
If only chemical element symbols were valid here, the regex could have been [a-zA-Z]{1,2}. But, as in the example above, you can write C1 or C.

Using the following regex at line 93 for the class attribute _zmat_patt should be OK. The tests in test_gaussian.py still pass.

_zmat_patt = re.compile(r"^(\w+)([\s,]+(\w+)[\s,]+(\w+)){0,3}[\-\.\s,\w]*$")

It looks like it fixes the vulnerability:

1: Done in 0.0005118846893310547
21: Done in 6.198883056640625e-05
41: Done in 5.817413330078125e-05
61: Done in 7.319450378417969e-05
81: Done in 9.799003601074219e-05
101: Done in 0.00012993812561035156
121: Done in 0.0001690387725830078
141: Done in 0.00022125244140625
161: Done in 0.00026917457580566406
181: Done in 0.00033211708068847656
201: Done in 0.0004401206970214844
221: Done in 0.0005009174346923828
241: Done in 0.0005931854248046875
261: Done in 0.0006520748138427734
281: Done in 0.0007491111755371094
301: Done in 0.0008258819580078125
321: Done in 0.00090789794921875
341: Done in 0.001276254653930664
361: Done in 0.0011610984802246094
381: Done in 0.0013689994812011719
401: Done in 0.0014448165893554688
421: Done in 0.001538991928100586
441: Done in 0.0017888545989990234
461: Done in 0.0019121170043945312
481: Done in 0.0019729137420654297
501: Done in 0.002460002899169922
521: Done in 0.0028901100158691406
541: Done in 0.002777099609375
561: Done in 0.0030379295349121094
581: Done in 0.0029418468475341797
601: Done in 0.0033109188079833984
621: Done in 0.003364086151123047
641: Done in 0.0034112930297851562
661: Done in 0.003716707229614258
681: Done in 0.0038809776306152344
701: Done in 0.004051923751831055
721: Done in 0.004559993743896484
741: Done in 0.004487037658691406
761: Done in 0.0049211978912353516
781: Done in 0.00495600700378418
801: Done in 0.00517582893371582
821: Done in 0.005485057830810547
841: Done in 0.005793094635009766
861: Done in 0.005961894989013672
881: Done in 0.007217884063720703
901: Done in 0.006846904754638672
921: Done in 0.007014751434326172
941: Done in 0.00799417495727539
961: Done in 0.007646322250366211
981: Done in 0.008347272872924805
1001: Done in 0.008255958557128906
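
As a quick sanity check outside pymatgen, the sketch below runs the proposed pattern against the sample z-matrix lines from this comment and against a long adversarial line. It only exercises the regex in isolation, not GaussianInput.from_string, and the variable names are illustrative.

```python
import re
import time

# Proposed replacement pattern from this comment.
proposed = re.compile(r"^(\w+)([\s,]+(\w+)[\s,]+(\w+)){0,3}[\-\.\s,\w]*$")

# Sample z-matrix lines from the example above.
zmat_lines = [
    "C1",
    "C2  1   CC",
    "H3  1   CH1  2  asp2",
    "H4  1   CH1  2  asp2  3  180.",
    "H5  2   CH2  1  asp3  3  D1",
    "H6  2   CH2  1  asp3  5  D2",
]
all_match = all(proposed.match(line) for line in zmat_lines)
print("sample lines match:", all_match)

# Same shape as the line built by the reproduction script, with i = 1000.
payload = "C" + "0" * 1000 + "!"
start = time.perf_counter()
rejected = proposed.match(payload) is None
elapsed = time.perf_counter() - start
print(f"adversarial line rejected: {rejected} in {elapsed:.4f} s")
```

The pattern can still backtrack between the leading (\w+) and the trailing character class, which would explain the slow, roughly quadratic growth in the timings above. It no longer shows the doubling seen with the original pattern.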
