[Bug]: Node and Chromium processes remain despite calling IPlaywright.Dispose() #1749

Closed
corygehr opened this issue Sep 18, 2021 · 12 comments · Fixed by #1864

Comments

@corygehr
Member

Playwright version

1.14.1

Operating system

Linux

What browsers are you seeing the problem on?

Chromium

Other information

.NET 5; Ubuntu Server 18.04 (Azure Kubernetes).

What happened? / Describe the bug

Despite calling IBrowserContext.DisposeAsync() and IPlaywright.Dispose(), I see zombie processes on my Linux host - one Node process paired with two Chromium instances (see the Log Output).

My service is a .NET console application that runs indefinitely (until terminated by the Kubernetes host) and needs to tear down the IDriver after it finishes processing a request.

Code snippet to reproduce your bug

// Create Driver and Browsers.
var driver = Playwright.CreateAsync().Result;
var browserInstance = await driver.Chromium.LaunchAsync();

// Do work...

// Dispose.
browserInstance.CloseAsync().Wait();
driver.Dispose();

Relevant log output

ps aux

root     23085  0.0  0.0      0     0 ?        Z    01:35   0:00 [chrome] <defunct>
root     23086  0.1  0.0      0     0 ?        Z    01:35   0:02 [chrome] <defunct>
root     24215  0.3  0.0      0     0 ?        Z    Sep17   0:34 [node] <defunct>
root     24334  0.0  0.0      0     0 ?        Z    Sep17   0:00 [chrome] <defunct>
root     24335  0.0  0.0      0     0 ?        Z    Sep17   0:02 [chrome] <defunct>
root     25653  1.2  0.0      0     0 ?        Z    01:16   0:35 [node] <defunct>
root     25770  0.0  0.0      0     0 ?        Z    01:16   0:00 [chrome] <defunct>
root     25771  0.0  0.0      0     0 ?        Z    01:16   0:02 [chrome] <defunct>
root     26958  0.3  0.0      0     0 ?        Z    Sep17   0:35 [node] <defunct>
root     27076  0.0  0.0      0     0 ?        Z    Sep17   0:00 [chrome] <defunct>
root     27077  0.0  0.0      0     0 ?        Z    Sep17   0:02 [chrome] <defunct>
root     28314  0.9  0.0      0     0 ?        Z    00:57   0:36 [node] <defunct>
root     28431  0.0  0.0      0     0 ?        Z    00:57   0:00 [chrome] <defunct>
root     28432  0.0  0.0      0     0 ?        Z    00:57   0:02 [chrome] <defunct>
root     29734  0.3  0.0      0     0 ?        Z    Sep17   0:33 [node] <defunct>
root     29851  0.0  0.0      0     0 ?        Z    Sep17   0:00 [chrome] <defunct>
root     29852  0.0  0.0      0     0 ?        Z    Sep17   0:01 [chrome] <defunct>
root     30989  0.7  0.0      0     0 ?        Z    00:38   0:36 [node] <defunct>
root     31107  0.0  0.0      0     0 ?        Z    00:38   0:00 [chrome] <defunct>
root     31108  0.0  0.0      0     0 ?        Z    00:38   0:02 [chrome] <defunct>
@pavelfeldman
Member

Can you reproduce this outside K8S?

@trebor678

trebor678 commented Sep 28, 2021

I have a similar issue where node isn't closing and it creates a new process each time it creates a new instance. I'm using Firefox though and that seems to be closing fine.

I'm running this on a standard Windows machine

@corygehr
Member Author

I have not tried outside of K8S, though I do occasionally see a stuck Chromium window when running on my local Windows machine (non-headless mode). Will try to test against a Linux VM without K8S.

@corygehr
Member Author

corygehr commented Sep 30, 2021

Still haven't played with this outside of Kubernetes, but I came across an article discussing this:

https://www.back2code.me/2020/02/zombie-processes-back-in-k8s/

The short version is: adding shareProcessNamespace: true to the spec segment of the deployment seems to fix the issue. However, it doesn't feel like a solid solution - I don't think it's obvious to folks that they need to do it until they notice the problem and start searching, and I don't know about the security implications of adding this flag.
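For reference, a minimal sketch of where that flag goes. In a Deployment manifest, shareProcessNamespace is a field of the pod spec, so it sits under spec.template.spec rather than at the top-level spec. The deployment name, labels, and image below are placeholders (assumptions for illustration, not taken from the reporter's cluster).

```yaml
# Hypothetical Deployment; the name, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: playwright-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: playwright-worker
  template:
    metadata:
      labels:
        app: playwright-worker
    spec:
      # All containers in the pod share one PID namespace. The pod's pause
      # container becomes PID 1 and reaps orphaned zombie processes.
      shareProcessNamespace: true
      containers:
        - name: worker
          image: example.invalid/playwright-worker:latest
```

Note that PID 1 can only reap processes that have been orphaned; a defunct child whose parent is still alive and never waits on it would presumably remain.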

I also came across a Stack Overflow post with a similar issue:

https://stackoverflow.com/questions/43515360/net-core-process-start-leaving-defunct-child-process-behind

I do notice that Program.cs in Playwright.Core creates a Process but doesn't set the EnableRaisingEvents property to true. I'm not sure that's relevant here: I don't think Program is used when calling Playwright.CreateAsync(), and I have not dug deeper to see where else the .NET library may create Process objects. It might still be worth investigating, since I'm not sure whether something (either in dotnet or the container) waits for that event to be raised before cleaning up the driver.
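To make the EnableRaisingEvents point concrete, here is a minimal sketch using only System.Diagnostics.Process. It is not Playwright's code: the ChildProcessDemo class and RunAsync method are hypothetical names, and whether this property has any effect on the defunct processes reported above is not established in this thread.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ChildProcessDemo
{
    // Starts a child process and completes with its exit code
    // once the Exited event fires.
    public static Task<int> RunAsync(string fileName, string arguments)
    {
        var tcs = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        var process = new Process
        {
            StartInfo = new ProcessStartInfo(fileName, arguments)
            {
                UseShellExecute = false,
            },
            // Without this, the Exited event is never raised.
            EnableRaisingEvents = true,
        };

        // Attach the handler before Start() so a fast exit is not missed.
        process.Exited += (sender, args) =>
        {
            tcs.TrySetResult(process.ExitCode);
            process.Dispose();
        };

        process.Start();
        return tcs.Task;
    }
}
```

For example, `await ChildProcessDemo.RunAsync("dotnet", "--version")` should complete with the exit code of that command.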

@fr4gles

fr4gles commented Oct 1, 2021

We have the same problem on Windows 10 / Windows Server 2019: node.exe processes remain, detached from the parent process, even though

  • all pages are closed
  • all contexts are closed & disposed
  • playwright "server" / "driver" is disposed

The node.exe processes are closed automatically once the main process exits.

@corygehr
Member Author

corygehr commented Oct 1, 2021

I can repro this on my developer machine as well (Windows). node.exe sticks around, along with two chrome.exe processes - this is after updating my code per our offline discussion to properly implement the async methods provided by the Playwright library.

I'm also having a hard time telling whether this issue is actually resolved in my K8s cluster per my comments above. I no longer see defunct processes, but several still appear to be running (I think this might be expected due to sharing processes across the namespace?).

Meanwhile, they seem to be getting OOMKilled after some time which I suspect is due to going over their memory allocation because of the lingering processes.

@kababoom

kababoom commented Oct 12, 2021

Same issue here.

  • Context closed & disposed
  • Browser closed & disposed
  • Playwright disposed

Node processes are left behind until the main program exits.

Verified on Windows 10, Ubuntu 20 and OSX 11

I assume this is because StdIOTransport.cs starts _process as playwright.sh, which opens in bash. Later we kill _process, but that only kills the bash process.

@kababoom

First of all, I may be doing something wrong; if so, please let me know.

Without any K8s, I looped the front-page sample with close/dispose calls added, like this:

        while (true)
        {
            using var playwright = await Playwright.CreateAsync();
            await using var browser = await playwright.Chromium.LaunchAsync(new() { Headless = false });
            var page = await browser.NewPageAsync();
            await page.GotoAsync("https://playwright.dev/dotnet");
            await page.ScreenshotAsync(new() { Path = "screenshot.png" });

            await browser.CloseAsync();
            await browser.DisposeAsync();
            playwright.Dispose();

            await Task.Delay(2500);
        }

The number of node processes keeps growing until the main program exits.

The problem might lie elsewhere (node + cli.js not exiting cleanly), but adding a Dispose() call before the Kill() in StdIOTransport.cs fixes it.

        public void Close(string closeReason)
        {
            if (!IsClosed)
            {
                IsClosed = true;
                TransportClosed?.Invoke(this, new() { CloseReason = closeReason });
                _readerCancellationSource?.Cancel();
                try
                {
                    _process?.Dispose(); //This releases the other resources like node.
                    _process?.Kill();
                }
                catch
                {
                }
            }
        }

@fr4gles

fr4gles commented Oct 14, 2021

BTW @kababoom

_process?.Dispose(); //This releases the other resources like node.
_process?.Kill();

IMO Kill() should be called before Dispose()
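A minimal sketch of that ordering, using only System.Diagnostics.Process. This is not the fix that was merged into Playwright: ProcessCleanup and KillThenDispose are hypothetical names, and the sketch assumes the process was actually started. Calling Kill() on an already disposed Process should throw InvalidOperationException, which an empty catch block would silently swallow, so the sketch kills first, waits, and disposes last. The entireProcessTree overload (available since .NET Core 3.0) is included because of the earlier observation that killing _process only kills the bash wrapper.

```csharp
using System;
using System.Diagnostics;

public static class ProcessCleanup
{
    // Kills the process together with its descendants, waits for it to exit,
    // and only then releases the Process handle.
    // Returns true if the process had exited within the timeout.
    public static bool KillThenDispose(Process process, int timeoutMs = 5000)
    {
        try
        {
            if (!process.HasExited)
            {
                // Also terminates child processes started by this process.
                process.Kill(entireProcessTree: true);
            }

            // Waiting lets the runtime observe the exit before the handle is released.
            return process.WaitForExit(timeoutMs);
        }
        finally
        {
            process.Dispose();
        }
    }
}
```

A caller would pass the already started Process instance and treat a false return value as "still running after the timeout".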

@kababoom

@fr4gles

It's just to illustrate that Dispose() does release the child node processes.
How to do it is up to the devs.

@jasomusc

On this version it works - #1813 (comment)

It adds process?.Dispose(); as mentioned by @kababoom

@andliang

andliang commented Dec 3, 2021

I'm also seeing zombie node processes despite calling the Dispose() and DisposeAsync() methods.


8 participants