verify=False causing memory leak #5215

Closed
tallona opened this issue Sep 27, 2019 · 13 comments

@tallona

tallona commented Sep 27, 2019

When I use verify=False in requests.get, memory usage creeps up over time until all memory is used up. If I remove verify=False, memory usage stays stable at approximately 14.2-14.9 MB.

I've changed the request to use verify=True and memory usage is stable at the expected usage I've seen as outlined above.

When using the requests.get function I can see the memory usage slowly creeping up via task manager by watching the spawned process.

I've watched this over a two hour period in both cases and when using verify=False the memory usage is at 1GB+ after two hours.

I've recreated the issue using a free API that uses no auth, sample script attached.

What you expected

Memory usage to remain stable during the lifetime of the running script.

What happened instead

Memory usage creeps up over time and will use up all server memory available eventually.

Sample Script

#!/usr/bin/env python

import time, threading, requests, urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def check_status():
    global running

    url = "https://swapi.co/api/films/1/"

    while running:
        response = requests.get(url, verify=False)
        
        time.sleep(5)

if __name__ == "__main__":
    running = True
    
    try:    
        check_status = threading.Thread(target=check_status)
        check_status.start()
        
        # To stop falling off main thread so it will catch a keyboard interrupt
        while running:
            time.sleep(1)
    except KeyboardInterrupt:
        running = False
    except Exception as e:
        running = False
        

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.8"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.7.4"
  },
  "platform": {
    "release": "10",
    "system": "Windows"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.22.0"
  },
  "system_ssl": {
    "version": "1010103f"
  },
  "urllib3": {
    "version": "1.25.5"
  },
  "using_pyopenssl": false
}

Pympler summary

Some additional information: I've used pympler to capture memory usage summaries with verify set to both False and True.

1 - Start of script - verify=False

                               types |   # objects |   total size
==================================== | =========== | ============
<class 'str | 13937 | 1.59 MB
<class 'dict | 2014 | 1.09 MB
<class 'code | 4445 | 626.77 KB
<class 'type | 563 | 588.95 KB
<class 'set | 147 | 144.16 KB
<class 'tuple | 1574 | 107.02 KB
<class 'wrapper_descriptor | 1213 | 94.77 KB
<class 'weakref | 1093 | 85.39 KB
<class 'abc.ABCMeta | 84 | 83.24 KB
<class 'builtin_function_or_method | 1042 | 73.27 KB
<class 'method_descriptor | 945 | 66.45 KB
<class 'int | 1891 | 56.13 KB
<class 'list | 314 | 53.05 KB
<class 'getset_descriptor | 689 | 48.45 KB
<class 're.Pattern | 87 | 44.62 KB

2 - 4/5 minutes into run - verify=False

                               types |   # objects |   total size
==================================== | =========== | ============
<class 'list | 61274 | 22.22 MB
<class 'str | 75877 | 5.81 MB
<class 'dict | 2434 | 1.17 MB
<class 'code | 4445 | 626.77 KB
<class 'type | 563 | 588.95 KB
<class 'int | 12663 | 350.91 KB
<class 'set | 167 | 148.53 KB
<class 'tuple | 1573 | 106.91 KB
<class 'wrapper_descriptor | 1213 | 94.77 KB
<class 'collections.OrderedDict | 104 | 92.80 KB
<class 'weakref | 1093 | 85.39 KB
<class 'abc.ABCMeta | 84 | 83.24 KB
<class 'builtin_function_or_method | 1042 | 73.27 KB
<class 'method_descriptor | 945 | 66.45 KB
<class 'bytes | 123 | 50.75 KB

3 - Start of script - verify=True

                               types |   # objects |   total size
==================================== | =========== | ============
<class 'str | 13937 | 1.59 MB
<class 'dict | 2014 | 1.09 MB
<class 'code | 4445 | 626.77 KB
<class 'type | 563 | 588.95 KB
<class 'set | 147 | 144.16 KB
<class 'tuple | 1574 | 107.02 KB
<class 'wrapper_descriptor | 1213 | 94.77 KB
<class 'weakref | 1081 | 84.45 KB
<class 'abc.ABCMeta | 84 | 83.24 KB
<class 'builtin_function_or_method | 1030 | 72.42 KB
<class 'method_descriptor | 945 | 66.45 KB
<class 'int | 1891 | 56.13 KB
<class 'list | 314 | 53.05 KB
<class 'getset_descriptor | 689 | 48.45 KB
<class 're.Pattern | 87 | 44.62 KB

4 - 4/5 minutes into run - verify=True

                               types |   # objects |   total size
==================================== | =========== | ============
<class 'str | 13936 | 1.59 MB
<class 'dict | 2014 | 1.09 MB
<class 'code | 4444 | 626.62 KB
<class 'type | 563 | 588.95 KB
<class 'set | 147 | 144.16 KB
<class 'tuple | 1573 | 106.91 KB
<class 'wrapper_descriptor | 1213 | 94.77 KB
<class 'weakref | 1081 | 84.45 KB
<class 'abc.ABCMeta | 84 | 83.24 KB
<class 'builtin_function_or_method | 1030 | 72.42 KB
<class 'method_descriptor | 945 | 66.45 KB
<class 'int | 1891 | 56.13 KB
<class 'list | 314 | 53.05 KB
<class 'getset_descriptor | 689 | 48.45 KB
<class 're.Pattern | 87 | 44.62 KB

@tallona tallona changed the title verify=False causing memory leak verify=False causing memory leak Sep 27, 2019
@AndTheDaysGoBy

AndTheDaysGoBy commented Sep 28, 2019

@tallona Have you tried the same experiment with straight urllib3? I ask because it seems the verify code just reduces to this function where, if verify = False, then conn.ca_certs is not set on the urllib3 Connection object.

def cert_verify(self, conn, url, verify, cert):

@tallona
Author

tallona commented Sep 28, 2019

@AndTheDaysGoBy I've just tried using urllib3 directly, as in the function below, and memory usage is stable. It is slightly higher than with the requests package, but it stays at approximately 15.1 MB for over 30 minutes.

It's strange, as it only happens when using verify=False on requests.get, but like you said, cert_verify doesn't do a whole lot. I see that in cert_verify there are default certs loaded when verify is False, and there is also a call to extract_zipped_paths when verify is False. I haven't had time to dig further into either of these areas yet. Could one of these be responsible for the memory usage creeping up?

I've also added some additional memory usage stats using pympler to the original post.

def check_status():
	global running

	url = "https://swapi.co/api/films/1/"
	http = urllib3.PoolManager(key_file=None, cert_file=None, cert_reqs='CERT_NONE', ca_certs=None)

	while running:
		response = http.request('GET', url)
		
		time.sleep(5)

@AndTheDaysGoBy

AndTheDaysGoBy commented Sep 29, 2019

@tallona Seems your statistics imply something is repeatedly generating lists in the case of verify = False? Also, surprising if things are constant in the case of just running the urllib3 code. In particular because the requests code can be understood as:

def get/post/put/etc. -> def request -> adapter.send() (HTTPAdapter) where HTTPAdapter leads to a file which makes use of urllib3. E.g. the above function I mentioned.

The approach I'd take if I were you is to isolate the issue further and further. I.e., does the issue appear when using requests.request()? Does it appear when you use a PreparedRequest + send? Etc.

Edit:
As an aside, presuming I'm performing your test properly, I don't believe I experience the same behavior. I.e., running the code below shows a diff during the first 5 seconds, but empty diffs all subsequent times. Presumably that's because I'm not experiencing a leak, hence nothing is being continuously added to the memory footprint.

#!/usr/bin/env python
from pympler import tracker
import time, threading, requests, urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def check_status():
    global running, tr
    url = "https://swapi.co/api/films/1/"
    while running:
        response = requests.get(url, verify=False)
        tr.print_diff()     
        time.sleep(5)

if __name__ == "__main__":
    running = True
    tr = tracker.SummaryTracker()  # keep a reference; check_status() calls tr.print_diff()
    try: 
        check_status = threading.Thread(target=check_status)
        check_status.start()
        while running:
            time.sleep(1)
    except KeyboardInterrupt:
        running = False
    except:
        running = False
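For anyone without pympler installed, a rough stdlib-only way to check whether a repeated call retains memory is sketched below. This is only an illustration: `measure_growth`, `leaky_action`, and `stable_action` are hypothetical names that do not appear in this thread, and the two actions are stand-ins for the real HTTP call. Note that tracemalloc only sees allocations made through Python's memory allocators, so it may not show everything a process monitor shows.

```python
import tracemalloc


def measure_growth(action, iterations=5):
    """Call action() repeatedly and return the traced-memory growth
    (in bytes, relative to the starting point) after each call."""
    tracemalloc.start()
    try:
        baseline, _peak = tracemalloc.get_traced_memory()
        growth = []
        for _ in range(iterations):
            action()
            current, _peak = tracemalloc.get_traced_memory()
            growth.append(current - baseline)
        return growth
    finally:
        tracemalloc.stop()


# Hypothetical stand-ins for the HTTP call under test.
_retained = []


def leaky_action():
    # Keeps a reference to every buffer, so memory grows on each call.
    _retained.append(bytearray(100_000))


def stable_action():
    # The buffer is dropped as soon as the call returns.
    bytearray(100_000)


if __name__ == "__main__":
    print("leaky :", measure_growth(leaky_action))
    print("stable:", measure_growth(stable_action))
```

To apply it to the report, one could pass something like `lambda: requests.get(url, verify=False)` as `action`. Steadily increasing numbers would point at retained Python objects, while flat numbers alongside a growing process size would point elsewhere.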

@tallona
Author

tallona commented Sep 29, 2019

@AndTheDaysGoBy Done a bit more testing using requests.request (example function below) and I'm seeing the same as using requests.get. When I set verify=False the memory usage creeps up over time and keeps increasing, but when I set verify=True the memory usage is stable.

I've also taken your code example and run it locally, when using verify False I'm seeing a memory increase over time but using verify True the memory usage is stable.

def check_status():
    global running

    url = "https://swapi.co/api/films/1/"

    while running:
        response = requests.request(method='GET', url=url, verify=False)
        
        time.sleep(5)

@AndTheDaysGoBy

AndTheDaysGoBy commented Sep 29, 2019

@tallona Meaning, you constantly see a memory diff? Or do you use something like top or watch pmap -x <pid>? Because, if what you say is true, I'm (personally) at a loss, since I can't reproduce it. Granted, I'm running Ubuntu, not Windows 10, but I have tried with Python 2.7 and 3.7 to reproduce your issue. I also noticed you in a different thread regarding memory leaks. Someone in that thread suggested that requests's caching is the issue. Again, though, I can't reproduce it personally, so the most I can give is guidance on narrowing down the issue area.

Since you state requests.request gives you the same issue, does preparing and sending the request yourself, e.g. prepared = requests.Request(...).prepare() followed by requests.Session().send(prepared, verify=False), exhibit the same behavior?

If that still is the issue, then try instantiating

class HTTPAdapter(BaseAdapter):

manually, i.e. the adapters.py HTTPAdapter object and sending via that. If that no longer fails then the issue would at least be something in the sessions.py. If the issue persists, then the issue is likely somewhere in adapters.py.
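The two isolation steps described above could be sketched roughly as follows. This is a minimal sketch, not a confirmed reproduction: the helper names `build_prepared`, `send_via_session`, and `send_via_adapter` are made up for illustration, and the URL is simply the one from the original report (it may no longer be reachable).

```python
import requests
from requests.adapters import HTTPAdapter

URL = "https://swapi.co/api/films/1/"  # URL taken from the original report


def build_prepared(url):
    """Build a PreparedRequest without going through requests.get()."""
    return requests.Request(method="GET", url=url).prepare()


def send_via_session(prepared):
    """Step 1: skip requests.request(), but still go through a Session."""
    with requests.Session() as session:
        return session.send(prepared, verify=False)


def send_via_adapter(prepared):
    """Step 2: skip the Session and call the HTTPAdapter directly."""
    adapter = HTTPAdapter()
    try:
        return adapter.send(prepared, verify=False)
    finally:
        adapter.close()


if __name__ == "__main__":
    prepared = build_prepared(URL)
    print(send_via_session(prepared).status_code)
    print(send_via_adapter(prepared).status_code)
```

Running each function in the same loop as the sample script, one at a time, would show whether the growth follows the Session layer or the adapter layer.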

@tiran
Contributor

tiran commented Nov 18, 2019

The initial report was for Python 3.7.4 on Windows. You might be affected by the Windows-only bug https://bugs.python.org/issue37702

@nepix32

nepix32 commented Jan 15, 2020

Can confirm this bug when using verify=False on requests to https addresses. Memory consumption grows without bounds (requests 2.22.0).

Affected environment: Winpython 64 (3740/3741)
Environment not affected: Winpython 64 3760

Thanks to @tiran for pointing that out!

@NoiseControllers

NoiseControllers commented Mar 10, 2020

Any solution for this? I have the same problem.

python: 3.7
requests: 2.23.0

@harrisonwhiskey

OMG! I can confirm verify=False leaks memory so bad. The past 10 days has been hell for me trying to find the leak in my python app

@tiran
Contributor

tiran commented Jun 3, 2020

> OMG! I can confirm verify=False leaks memory so bad. The past 10 days has been hell for me trying to find the leak in my python app

What's your platform, Python version, and requests version?

@harrisonwhiskey

> > OMG! I can confirm verify=False leaks memory so bad. The past 10 days has been hell for me trying to find the leak in my python app
>
> What's your platform, Python version, and requests version?

Windows 10
python: 3.7.4
requests: 2.23.0

@tiran
Contributor

tiran commented Jun 3, 2020

You have to update to a newer release in the 3.7 series. 3.7.4 has a memory leak that only affects Windows.

@nateprewitt
Member

As @tiran pointed out, this does appear to be resolved in https://bugs.python.org/issue37702. We haven't seen new reports of this in almost a year, so I'm going to resolve this.

If you are still experiencing this on Python 3.7.5+, please open a new issue linking back to this one so we can track separately. Thanks!
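For scripts that must run on mixed environments, a small guard along these lines could flag interpreters that may be hit by the bug referenced above. It is a sketch under assumptions: `may_hit_bpo_37702` is a hypothetical helper name, and the thread only confirms 3.7.4 on Windows as affected and 3.7.5+ as the cutoff, so treating earlier 3.7 micro releases as affected is a guess.

```python
import sys


def may_hit_bpo_37702(platform=None, version_info=None):
    """Return True for Windows interpreters in the 3.7 series older than 3.7.5.

    Assumption: only 3.7.4 is confirmed as affected in this thread; earlier
    3.7.x releases are included here as a precaution, not as a known fact.
    """
    platform = sys.platform if platform is None else platform
    version_info = sys.version_info if version_info is None else version_info
    major, minor, micro = tuple(version_info)[:3]
    return platform == "win32" and (major, minor) == (3, 7) and micro < 5


if __name__ == "__main__":
    if may_hit_bpo_37702():
        print("This interpreter may be affected; consider upgrading to Python 3.7.5 or later.")
    else:
        print("This interpreter is outside the range flagged by this check.")
```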

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 3, 2021