[🐛 Bug]: Log File Process Is Not Closed in Firefox Driver Binary #11730

Closed
acbilson opened this issue Mar 3, 2023 · 3 comments

Comments

acbilson commented Mar 3, 2023

What happened?

I am running a Selenium instance with a Firefox driver to scrape and download many files from a website. This runs as a Python Flask web service inside a Docker container.

I discovered that my container would scrape a few pages before it began to hit its memory limits and needed a restart. I used Python's default profiler to investigate where the memory allocation was growing and found that the handler for the log file continued to grow with each execution. This was especially surprising given that I was pointing the logging service to /dev/null.
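For anyone trying to confirm a similar pattern, here is a minimal sketch of comparing allocation snapshots across repeated runs. The report does not say which profiler was used, so `tracemalloc` is an assumption, and `LeakyResource`, `run_once`, and `top_growth` are hypothetical stand-ins rather than Selenium code.

```python
import os
import tracemalloc


class LeakyResource:
    """Hypothetical stand-in for an object that opens a log file and never closes it."""

    def __init__(self):
        # Opening os.devnull still allocates a buffered file object.
        self._log_file = open(os.devnull, "wb")


def run_once(held):
    # Each "execution" creates a resource whose file object stays referenced.
    held.append(LeakyResource())


def top_growth(iterations=50, limit=3):
    """Return the largest allocation changes between two snapshots, grouped by source line."""
    held = []
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(iterations):
        run_once(held)
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()

    # Clean up the handles so the demo itself does not leak.
    for resource in held:
        resource._log_file.close()

    return after.compare_to(before, "lineno")[:limit]


if __name__ == "__main__":
    for stat in top_growth():
        print(stat)
```

In a real service, `run_once` would be replaced by one scrape cycle; a source line whose size keeps increasing between snapshots is a candidate for the leak.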

I was able to resolve this in my code by manually closing the file handler prior to calling driver.quit(). I think it might be best if driver.quit() handled closing this handler internally.

How can we reproduce the issue?

# This was a wrapper I created to ensure that the log file handler closes when
# I am finished with the driver. If you remove the line that closes the handler
# and run this driver instance against a site multiple times, you'll observe
# that the handler eats up more and more space. If you don't have a lot of
# memory, you may also observe that each execution gets slower.

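import os

from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service


class DriverManager: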
    """wraps a selenium driver instance
    attrs:
        download_path (str): the folder location that driver downloads will be placed inside
        firefox_exe_path (str): the path to the Firefox executable
        gecko_driver_exe_path (str): the path to the Gecko Driver executable
    """

    def __init__(
        self,
        download_path: str,
        firefox_exe_path: str,
        gecko_driver_exe_path: str,
    ):
        self.download_path = download_path

        # Setup the firefox webdriver
        service = Service(executable_path=gecko_driver_exe_path, log_path=os.devnull)

        options = Options()
        options.headless = True
        options.binary = firefox_exe_path
        options.set_preference("browser.download.folderList", 2)
        options.set_preference("browser.download.manager.showWhenStarting", False)
        options.set_preference("browser.download.dir", download_path)
        options.set_preference("download.prompt_for_download", False)
        options.set_preference(
            "browser.helperApps.neverAsk.saveToDisk", "application/pdf"
        )
        options.set_preference("pdfjs.disabled", True)
        options.set_capability("marionette", True)

        self.driver = Firefox(options=options, service=service)

    def __enter__(self) -> Firefox:
        return self.driver

    def __exit__(self, exception_type, exception_val, trace):
        # closes file handler manually to fix memory leak
        self.driver.binary._log_file.close()
        self.driver.quit()
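One caveat with the workaround above: if closing the handler raises, `driver.quit()` is never reached. A slightly more defensive variant might look like the sketch below. `FakeBinary` and `FakeDriver` are hypothetical stand-ins so the example runs without a browser; they are not Selenium's API, and the private `_log_file` attribute is taken from the wrapper above and may not exist in other Selenium versions.

```python
import os


class FakeBinary:
    """Hypothetical stand-in for the object that owns the log file handle."""

    def __init__(self):
        self._log_file = open(os.devnull, "wb")


class FakeDriver:
    """Hypothetical stand-in for a Firefox driver; not Selenium's API."""

    def __init__(self):
        self.binary = FakeBinary()
        self.quit_called = False

    def quit(self):
        self.quit_called = True


class SafeDriverManager:
    """Context manager that closes the log handle and always calls quit()."""

    def __init__(self, driver):
        self.driver = driver

    def __enter__(self):
        return self.driver

    def __exit__(self, exception_type, exception_val, trace):
        try:
            # Look the handle up defensively, since it is a private attribute.
            binary = getattr(self.driver, "binary", None)
            log_file = getattr(binary, "_log_file", None)
            if log_file is not None and not log_file.closed:
                log_file.close()
        finally:
            # Runs even if closing the handle raised.
            self.driver.quit()
```

With a real driver, the same `__exit__` body could replace the one in `DriverManager`; the `getattr` lookups simply skip the close step when the attribute is missing.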

Relevant log output

I wish I'd kept the profiler output, but I don't have it anymore.

Operating System

Debian Buster

Selenium version

Python 4.1.3

What are the browser(s) and version(s) where you see this issue?

Firefox 102.0.1

What are the browser driver(s) and version(s) where you see this issue?

GeckoDriver v0.31.0

Are you using Selenium Grid?

No response

github-actions bot commented Mar 3, 2023

@acbilson, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

acbilson commented Mar 3, 2023

It was a few months back that I ran into this, so my memory is a little fuzzy. While looking at the configuration, I just remembered that I had added harakiri mode to my uWSGI config. It's possible that the log file handler was not only leaving a handle to the log file open but was actually keeping the main process alive.

@symonk symonk closed this as completed in e4b87d4 Mar 11, 2023
alpatron pushed a commit to alpatron/selenium that referenced this issue Mar 15, 2023

github-actions bot commented Dec 9, 2023

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 9, 2023