corlib-xunit tests taking 10x longer in the coop configuration #12945

Open
akoeplinger opened this issue Feb 11, 2019 · 3 comments

@akoeplinger akoeplinger commented Feb 11, 2019

Steps to Reproduce

  1. Run corlib xunit tests in the coop configuration: make -C mcs/class/corlib run-xunit-test
  2. Observe that the test suite takes about 20 minutes, compared with about 2 minutes on a normal build

Current Behavior

The coop lane takes ~20 minutes:
https://jenkins.mono-project.com/job/test-mono-pull-request-coop/12631/

Expected Behavior

The test suite should finish much faster, closer to the ~2 minutes of a normal build.

On which platforms did you notice this

[ ] macOS
[ X ] Linux
[ ] Windows

Version Used: master

akoeplinger added a commit that referenced this issue Feb 11, 2019

@luhenry luhenry commented Feb 11, 2019

@lambdageek @BrzVlad could you please take a quick look at what could be going wrong?

@lambdageek lambdageek commented Feb 11, 2019

The test suite summary page says "00:18:36" and the xunit log says "1093.173s" (about 00:18:13). So the time is going into the actual test run, not just shutdown (which is our usual culprit). Something in the run itself is behaving pathologically.
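
As a quick sanity check on those two figures, here is a small Python sketch that converts between the xunit log's seconds and the summary page's HH:MM:SS form. The helper names `to_hms` and `hms_to_seconds` are made up for illustration; they are not part of any Mono tooling.

```python
def to_hms(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS, dropping the fractional part."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def hms_to_seconds(hms: str) -> int:
    """Parse an HH:MM:SS string into whole seconds."""
    hours, minutes, secs = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + secs


xunit_seconds = 1093.173                        # from the xunit log
summary_seconds = hms_to_seconds("00:18:36")    # from the summary page

print(to_hms(xunit_seconds))                    # 00:18:13
print(summary_seconds - int(xunit_seconds))     # 23
```

The gap between the two is only about 23 seconds, which is why startup and shutdown cannot account for the slowdown.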

Note that we also build the coop lane as a checked build and run with thread and gc checks. (see ci/scripts/run-jenkins.sh where it mentions "checked-coop")

@lambdageek lambdageek commented Feb 12, 2019

It runs in a reasonable amount of time on macOS with MONO_THREADS_SUSPEND=coop MONO_CHECK_MODE=thread,gc. On Linux/amd64, MONO_THREADS_SUSPEND=coop on its own is also decently fast (a smidge over 2 minutes).

But Linux/amd64 with MONO_THREADS_SUSPEND=coop MONO_CHECK_MODE=thread,gc is slooooow (14 minutes).

So checked build mode with coop is behaving pathologically.
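
For anyone who wants to repeat the comparison, a rough Python sketch of the timing loop is below. It assumes a checked build of mono and that it is run from the root of the checkout. The environment variables and the make command come from this thread; the names `CONFIGS`, `build_env`, `time_command` and `main` are hypothetical helpers, not existing scripts.

```python
import os
import subprocess
import time

# Environment overrides for the two configurations compared above.
CONFIGS = {
    "coop": {"MONO_THREADS_SUSPEND": "coop"},
    "coop+checks": {"MONO_THREADS_SUSPEND": "coop", "MONO_CHECK_MODE": "thread,gc"},
}

# Command from the reproduction steps (assumes cwd is the mono checkout root).
TEST_CMD = ["make", "-C", "mcs/class/corlib", "run-xunit-test"]

MONO_VARS = ("MONO_THREADS_SUSPEND", "MONO_CHECK_MODE")


def build_env(overrides, base=None):
    """Copy the base environment, clear the two Mono variables, apply overrides."""
    env = dict(os.environ if base is None else base)
    for name in MONO_VARS:
        env.pop(name, None)  # so a stale setting cannot leak into a run
    env.update(overrides)
    return env


def time_command(cmd, overrides):
    """Run cmd with the given overrides; return (elapsed seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, env=build_env(overrides))
    return time.perf_counter() - start, result.returncode


def main():
    """Time the corlib xunit run once per configuration and print a summary."""
    for name, overrides in CONFIGS.items():
        elapsed, code = time_command(TEST_CMD, overrides)
        print(f"{name}: {elapsed:.1f}s (exit code {code})")
```

Calling `main()` runs the full suite once per configuration, so expect it to take at least as long as the slow case reported here. A single run per configuration is only a coarse comparison, not a benchmark.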

Possibly it's collecting stack traces on each thread state transition and hitting contention around backtrace(); I think Linux has a lock in there. I haven't tried to instrument anything yet. Just a wild guess.
