WSL2 Slowing down over time (August 2020) #5832

Closed
raphaelzulliger opened this issue Aug 30, 2020 · 9 comments


@raphaelzulliger

Environment

PS C:\WINDOWS\system32> [Environment]::OSVersion                                                                        
Platform ServicePack Version      VersionString
-------- ----------- -------      -------------
 Win32NT             10.0.19041.0 Microsoft Windows NT 10.0.19041.0

Release:        20.04
(Issue also happened with 18.04)

Linux version 4.19.104-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Wed Feb 19 06:37:35 UTC 2020

Steps to reproduce

In the home folder (i.e. on the Linux FS):

wget https://dl.bintray.com/boostorg/release/1.74.0/source/boost_1_74_0.tar.gz
tar xvfz boost_1_74_0.tar.gz
cd boost_1_74_0
./bootstrap.sh
./b2

for i in {1..10}
do
  ./b2 clean && time ./b2
done
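To make the per-run times easier to compare, the loop above could be wrapped in a small helper. This is only a sketch: `time_runs` is a hypothetical name, not part of the original report, and it measures whole seconds of wall-clock time rather than the real/user/sys split that `time` prints.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): run a command N times
# and print the wall-clock seconds taken by each run.
time_runs() {
    local runs=$1
    shift
    local i=1 start end
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $(( end - start ))s"
        i=$(( i + 1 ))
    done
}

# Possible use for the build above (note: this also times the clean step):
#   time_runs 10 bash -c './b2 clean && ./b2'
```

Redirecting the output to a file keeps a record that can be compared before and after a reboot.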

WSL logs: Will follow as per the instruction

Expected behavior

Compiling Boost with ./b2 should take roughly the same time on each invocation.

Actual behavior

Each time Boost is built, it takes longer than the previous time, and the difference keeps growing. wsl --shutdown does not seem to bring the times back down; only a Windows restart resolves the issue. In my test run, the command took the following times:

  • After Windows 10 reboot (real/user/sys):
    • 5:23/25:32/22:32
    • 5:41/29:23/22:21
    • 7:11/39:10/26:15
    • 24:54/144:45/83:04
  • After manually restarting lxssManager service and a wsl --shutdown:
    • 30:11/173:25/101:02

I originally encountered this while compiling a closed source project. Boost was just taken for the sake of this issue.

Related issues

I'm aware of #4498 and I've already applied the wsl.conf PATH workaround (that one hurt me too, by the way).

More considerations

The system is running on a MacBook Pro. Windows 10 is running within Parallels 16.0, in other words: nested virtualization. The VM has 10 cores and 16 GByte of RAM assigned, of which only about 8 GByte is in use according to Task Manager. Right now, I do not have access to a native Windows box. When I have, I will run the same test there as well.

While running the test, CPU usage (in Task Manager) is almost constantly at 100% (which seems "reasonable" given the compilation load): VMMem (~80%), System Interrupts (2..9%), Taskmgr.exe, dwm.exe, ...

Windows 10 is nearly unusable while the compilation is running. For example, clicking the Start menu takes about 20 seconds before anything happens. At least this is the situation after having run the test 4-5 times.

I changed the Parallels hypervisor setting from "Adaptive" to "Non-Adaptive", without any obvious change.

Thread and handle counts in Task Manager show no anomalies.

From macOS's point of view, the Parallels process consumes around 970% of CPU (where 1000% = 10 cores). Thus, it seems like the Windows VM really does fully utilize the CPU. The loud fans seem to confirm this.

@raphaelzulliger
Author

Regarding the "recorded logs": I've recorded them when the system was in a bad state already, i.e. very slow. Note that I aborted the ./b2 command (Ctrl+C) after some minutes of running.

@therealkenc
Collaborator

Cannot reproduce here, at least on WSL2 20201. The runs are consistent:

[Screenshot of the run timings; image not preserved]

Since you are on 19041, as a wild guess, run:

$ sync && sudo bash -c "echo 1 > /proc/sys/vm/drop_caches"

Also setting memory=N where N is half your physical RAM in .wslconfig doesn't hurt. In your case that would be 8GB since your VM is 16GB.
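For reference, a minimal sketch of such a .wslconfig, assuming the file is placed at %UserProfile%\.wslconfig on the Windows side and using the 8GB value suggested above. The change should only take effect once the WSL2 VM has been restarted, for example after wsl --shutdown.

```ini
# %UserProfile%\.wslconfig (sketch; the value is taken from the suggestion above)
[wsl2]
# Cap the WSL2 VM at half of the 16 GB assigned to the Windows VM.
memory=8GB
```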

@cmorty

cmorty commented Sep 1, 2020

Running the following script as root in the background helps a lot during docker build:

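#!/bin/bash
# Drop the page cache whenever free memory falls below 1 GiB. Must run as root.

while :
do
    # "free -g" reports whole GiB, so 0 in the "free" column means less than 1 GiB.
    freemem=$(free -g | awk 'NR==2 {print $4}')
    if [ "$freemem" -eq 0 ]; then
        echo "Dropping Cache"
        echo 1 > /proc/sys/vm/drop_caches
    fi
    sleep 2
done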

@raphaelzulliger
Author

Thanks for the ultra fast response!

In short: cmorty's approach helps.

Here's what I did:

  • I adjusted .wslconfig to "reasonable values":
    • Use only 8 out of 10 CPU cores
    • Use only 8 GByte of RAM
    • htop shows that these changes worked
  • I ran sync && sudo bash -c "echo 1 > /proc/sys/vm/drop_caches" in-between each ./b2 invocation

Unfortunately, the times still increased.

I then ran cmorty's script in the background and the times became constant. I ran this test with the adjusted .wslconfig in place, so I'm not sure whether running cmorty's script alone, without limiting CPU/RAM, would be sufficient.

@raphaelzulliger
Author

I wonder why the system behaves like that. Is this considered a bug (and thus will be fixed) or is it some kind of limitation we have to deal with forever (🤪)?

@therealkenc
Collaborator

therealkenc commented Sep 1, 2020

Interesting that dropping the cache every two seconds behaves so differently from dropping it between runs, at least smoothed over time. It isn't obvious that "drop caches harder" would be better. Academic; if it works, it works. You've hit known #4166, over which enough ink has been spilt.

@cmorty

cmorty commented Sep 2, 2020

@therealkenc My script isn't dropping caches every 2 seconds. It checks every 2 seconds and drops caches only when free memory goes below 1G. My assumption is that the code that drops the caches to reclaim memory for Windows when Docker is idle broke the original Linux replacement strategy, so the cache is probably no longer freed to make room for new files.
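The threshold check described here can be separated from the polling loop, which makes it easy to try different limits. This is a sketch only: `free_mib` and `should_drop` are hypothetical names, and it reads MemFree from /proc/meminfo instead of parsing the output of free as the script above does.

```shell
#!/bin/bash
# Hypothetical helpers (not from the thread).

# free_mib [FILE]: print MemFree in MiB from a /proc/meminfo-style file.
free_mib() {
    awk '/^MemFree:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# should_drop THRESHOLD_MIB [FILE]: succeed when free memory is below the threshold.
should_drop() {
    [ "$(free_mib "${2:-/proc/meminfo}")" -lt "$1" ]
}

# Possible use inside the polling loop (requires root):
#   if should_drop 1024; then echo 1 > /proc/sys/vm/drop_caches; fi
```

A threshold of 1024 corresponds to the "below 1G" behaviour described in this comment.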

@cmorty

cmorty commented Sep 2, 2020

@therealkenc: Also, as opposed to reducing WSL's memory, as suggested by #4166, my solution works. My wild(!) guess as to why reducing the memory helps a bit is that Windows then has more memory for caching, and we all know that Windows sucks at file I/O.

@thomas-haller

On my system, 1G of free memory is too little. It already starts to become slow at 1.4 GB.
My modified version of @cmorty's script:

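#!/bin/bash
# Drop the page cache whenever free memory falls below 1500 MiB. Must run as root.

while :
do
    # "free -m" reports MiB; column 4 of the second line is the free memory.
    freemem=$(free -m | awk 'NR==2 {print $4}')
    if (( freemem < 1500 )); then
        echo "Dropping Cache"
        echo 1 > /proc/sys/vm/drop_caches
    fi
    sleep 2
done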
