Strange memory leak(?) consuming behaviour #65
Comments
Example adding all the usual del's and gc() cleanups: 120 MB still in use, unable to be released.
using
|
Update: this looks like a Python lxml memory leak issue, see https://medium.com/devopss-hole/python-lxml-memory-leak-b8d0b1000dc7
Thank you for reporting this. Although the issue is caused by lxml, it still looks like an interesting problem. The following code releases the allocated memory (see https://www.mail-archive.com/lxml@python.org/msg00029.html), but it probably requires more investigation and testing to determine whether it works stably and across systems.

    import ctypes

    def trim_memory() -> int:
        # Ask glibc to hand free heap memory back to the operating system.
        libc = ctypes.CDLL("libc.so.6")
        return libc.malloc_trim(0)
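A slightly more defensive variant of the snippet above might look like the following. This is only a sketch, and it assumes a Linux system with glibc: `malloc_trim` is a glibc extension, so on other platforms (musl, macOS, Windows) the lookup is expected to fail, and the function then simply reports that nothing was released. The name `trim_memory` is kept from the comment above; the boolean return value is my own choice and not part of inscriptis or lxml.

```python
import ctypes


def trim_memory() -> bool:
    """Ask glibc to return free heap memory to the operating system.

    Returns True if glibc reports that memory was released, and False
    otherwise, including when glibc's malloc_trim is not available.
    """
    try:
        # Assumption: glibc is installed under its usual soname.
        libc = ctypes.CDLL("libc.so.6")
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError):
        # No glibc, or a libc without malloc_trim: nothing we can do here.
        return False

    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # malloc_trim(0) returns 1 if memory was released, 0 otherwise.
    return malloc_trim(0) == 1
```

In a long-running process, this would be called after the `get_text()` call has finished and its result is no longer referenced. Whether it actually reduces the resident memory in a given setup still needs to be measured.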
@AlbertWeichselbraun OK, I was thinking of closing this issue and opening a new PR with a small note for the README, but let's keep it open and see what we can find. It definitely looks more like a bug in liblxml, like you say. Thanks!
2.2.0
For some background, I'm using your wonderful library in my Flask application, which means the process does not get restarted. I've tried solving this by moving the inscriptis step to its own thread, but it still seems to make the whole app bleed memory.
See the script and the test HTML here
leaky.html.zip
What I'm seeing is that on some more complex HTML, it will consume something like 150 MB on the first `get_text(..)` call, and then it will never let the process release that memory. That's the problem for me.

I've tried:
- `gc.collect()` after `get_text()`, but it never releases the memory
- `del text_content` etc., but that didn't help

Ideas? Is this a bug?
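For anyone who wants to reproduce the measurement, a minimal sketch could look like this. It is not the script attached to this issue. It assumes Linux, since it reads `VmRSS` from `/proc/self/status`, and the helper names `rss_mib` and `measure` are made up for illustration.

```python
import gc


def rss_mib() -> float:
    """Return the current resident set size in MiB (Linux only, via /proc)."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # The kernel reports this value in kB.
                return int(line.split()[1]) / 1024
    raise RuntimeError("VmRSS not found in /proc/self/status")


def measure(func, *args):
    """Call func(*args), discard the result, and return RSS before and after."""
    before = rss_mib()
    result = func(*args)
    # Drop the only reference to the result and force a collection, so that
    # any memory still held afterwards is not explained by live Python objects.
    del result
    gc.collect()
    after = rss_mib()
    return before, after
```

With inscriptis installed, something like `before, after = measure(get_text, html)` (after `from inscriptis import get_text`) should show whether the resident size stays high once the text has been discarded.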
Happy to throw a few dollars across for supporting your wonderful project!