Fix t2002-qmanager-reload.t with new core resource module #661

Merged
merged 1 commit into from May 27, 2020

@grondo grondo commented May 27, 2020

This PR fixes a hang that will come up once flux-framework/flux-core#2949 is merged. Since the resource module now shares resource configuration with sched-simple via the acquire interface, the resource module must be reloaded once hwloc data is manually overridden; otherwise sched-simple will reject free requests after it is reloaded at the end of the test.

I suppose since #657 has not been merged yet, I can tack this onto that PR if that would be better.

codecov-commenter commented May 27, 2020

Codecov Report

Merging #661 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #661   +/-   ##
=======================================
  Coverage   75.45%   75.45%           
=======================================
  Files          78       78           
  Lines        7860     7860           
=======================================
  Hits         5931     5931           
  Misses       1929     1929           


Problem: The t1002-qmanager-reload.t test will hang during rc3 when
used against a Flux where sched-simple acquires resources from the
resource module. This occurs because the test reloads hwloc XML
out of band, and the core resource module (from which sched-simple
gets its resource list) has a different resource configuration.
This causes sched-simple to reject all free requests as invalid
at the end of the test, causing the wait for queue drain to block
forever.

In order to fix this, the resource module must be reloaded, which
in turn requires that sched-simple be reloaded, after hwloc XML is
overwritten.
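
The ordering described above can be sketched as a small shell helper. This is an illustrative sketch, not the change made in this PR: the function name `reload_after_hwloc_override`, the `FLUX_CMD` variable, and the XML directory argument are hypothetical, and the exact sequence (unloading sched-simple before the resource module, then loading both back) is an assumption based on the description. By default it only prints the commands, so it can be read or run without a Flux instance.

```shell
#!/bin/sh
# Dry-run by default: FLUX_CMD (hypothetical variable) prints each command.
# Set FLUX_CMD=flux to execute against a real Flux instance instead.
FLUX_CMD=${FLUX_CMD:-echo flux}

# Assumed sequence, following the commit message:
#   1. override hwloc XML out of band
#   2. unload sched-simple, which holds resources acquired from the resource module
#   3. unload and load the resource module so it picks up the new configuration
#   4. load sched-simple again so it acquires the updated resource list
reload_after_hwloc_override() {
    xml_dir=$1   # hypothetical directory containing the replacement hwloc XML
    $FLUX_CMD hwloc reload "$xml_dir" &&
    $FLUX_CMD module remove sched-simple &&
    $FLUX_CMD module remove resource &&
    $FLUX_CMD module load resource &&
    $FLUX_CMD module load sched-simple
}
```

Calling `reload_after_hwloc_override /path/to/xml` in dry-run mode lists the five commands in order, which makes the dependency explicit: the scheduler is only brought back after the resource module has the overridden configuration.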

grondo commented May 27, 2020

Rebased on current master

@garlick garlick left a comment

This makes sense, thanks!


grondo commented May 27, 2020

Thanks! Setting MWP.

@grondo grondo added the merge-when-passing label (mergify.io: merge PR automatically once CI passes) May 27, 2020
@mergify mergify bot merged commit 6cb79ff into flux-framework:master May 27, 2020
@grondo grondo deleted the t-qmanager-reload-fixup branch May 27, 2020 15:01