Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

t/t2000-wreck.t: Fix invalid compare on per-task affinity test #837

Merged
merged 2 commits into from Oct 7, 2016

Conversation

Projects
None yet
5 participants
@chu11
Copy link
Contributor

chu11 commented Oct 7, 2016

Found similar invalid compare just like #836

@garlick garlick added the review label Oct 7, 2016

@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 7, 2016

Coverage Status

Coverage increased (+0.004%) to 75.33% when pulling be08305 on chu11:t2000-wreck.t-bad-test_cmp2 into c94fdd3 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Oct 7, 2016

Looks like the test error you found was masking some other problem with the test. Running the test with your fix and --verbose on my desktop:


--- expected_cpus2  2016-10-07 16:12:02.474234574 +0000
+++ output_cpus2    2016-10-07 16:12:02.474234574 +0000
@@ -1,2 +1,2 @@
 0: Cpus_allowed_list:  0
-1: Cpus_allowed_list:  0-11
+1: Cpus_allowed_list:  0-63
not ok 32 - wreckrun: supports per-task affinity assignment
#   
#       mask=$(cpus_allowed) &&
#       newmask=$(cpus_allowed first) &&
#       run_timeout 5 flux wreckrun -ln2 \
#         --pre-launch-hook="lwj[\"0.cpumask\"] = \"$newmask\"" \
#         grep ^Cpus_allowed_list /proc/self/status | sort > output_cpus2 &&
#       cat <<-EOF >expected_cpus2 &&
#       0: Cpus_allowed_list:   $newmask
#       1: Cpus_allowed_list:   $mask
#       EOF
#       test_cmp expected_cpus2 output_cpus2
#   
@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 7, 2016

Hm, the problem above is that the apparent default cpus_allowed list in the script is only 0-11 but the wreckrunned task had a cpus_allowed list of 0-63 (unless I misinterpret diff output). What is the cpumask of the shell from which you are running the tests?

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Oct 7, 2016

jimbo:~ > grep ^Cpus_allowed_list /proc/self/status
Cpus_allowed_list:  0-63
@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 7, 2016

Hm, perhaps a bug in scripts/cpus-allowed.lua or the cpuset lua bindings then (or perhaps somewhere previous in the script we reset our own cpus allowed mask). In any event I'm taking a look at it.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 7, 2016

Ok, I've noticed that sched_getaffinity() could return a different Cpus list than Cpus_allowed, which appears can be a superset of actual available CPUs. On hype for instance I see:

$ grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list:  0-239
$ t/scripts/cpus-allowed.lua 
0-31

Which is an additional failure mode for this test. I think the fix is to use the cpus_allowed.lua script for both expected and job output, e.g. something like:

diff --git a/t/t2000-wreck.t b/t/t2000-wreck.t
index fac8e14..e460762 100755
--- a/t/t2000-wreck.t
+++ b/t/t2000-wreck.t
@@ -200,32 +200,30 @@ test_expect_success 'wreckrun: -t1 -n${SIZE} sets nnodes i
        test "$n" = "${SIZE}"
 '

-cpus_allowed() {
-       ${SHARNESS_TEST_SRCDIR}/scripts/cpus-allowed.lua "$@"
-}
+cpus_allowed=${SHARNESS_TEST_SRCDIR}/scripts/cpus-allowed.lua
 test "$(cpus_allowed count)" = "0" || test_set_prereq MULTICORE

 test_expect_success MULTICORE 'wreckrun: supports affinity assignment' '
-       newmask=$(cpus_allowed last) &&
+       newmask=$($cpus_allowed last) &&
        run_timeout 5 flux wreckrun -n1 \
          --pre-launch-hook="lwj.rank[0].cpumask = \"$newmask\"" \
-         grep ^Cpus_allowed_list /proc/self/status > output_cpus &&
+         $cpus_allowed > output_cpus &&
        cat <<-EOF >expected_cpus &&
-       Cpus_allowed_list:      $newmask
+       $newmask
        EOF
        test_cmp expected_cpus output_cpus
 '
 test_expect_success MULTICORE 'wreckrun: supports per-task affinity assignment'
-       mask=$(cpus_allowed) &&
-       newmask=$(cpus_allowed first) &&
+       mask=$($cpus_allowed) &&
+       newmask=$($cpus_allowed first) &&
        run_timeout 5 flux wreckrun -ln2 \
          --pre-launch-hook="lwj[\"0.cpumask\"] = \"$newmask\"" \
-         grep ^Cpus_allowed_list /proc/self/status | sort > output_cpus2 &&
+         $cpus_allowed | sort > output_cpus2 &&
        cat <<-EOF >expected_cpus2 &&
-       0: Cpus_allowed_list:   $newmask
-       1: Cpus_allowed_list:   $mask
+       0: $newmask
+       1: $mask
        EOF
-       test_cmp expected_cpus output_cpus
+       test_cmp expected_cpus2 output_cpus2
 '
 test_expect_success 'wreckrun: top level environment' '
        flux kvs put lwj.environ="{ \"TEST_ENV_VAR\": \"foo\" }" &&

This works on hype, but I'm not sure about jimbo

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Oct 7, 2016

Just verified the above does work on jimbo.

Minor but I think

test "$(cpus_allowed count)" = "0" || test_set_prereq MULTICORE

should be

test "$($cpus_allowed count)" = "0" || test_set_prereq MULTICORE
@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 7, 2016

Yeah, good catch, I went a bit too fast. @chu11 want to append those changes to this PR?

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 7, 2016

@grondo Sure, do you have a branch to pull from?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 7, 2016

I didn't push the change anywhere, sorry. I can do that after lunch if you
still need it

On Oct 7, 2016 11:01 AM, "Al Chu" notifications@github.com wrote:

@grondo https://github.com/grondo Sure, do you have a branch to pull
from?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#837 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAtSUiGr4jyOkm0l8AmW3xlA2qt77BLSks5qxokMgaJpZM4KQohx
.

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 7, 2016

@grondo Nah, just thought you had it somewhere and would be easier. I can just integrate the patch.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Oct 7, 2016

Current coverage is 75.10% (diff: 100%)

Merging #837 into master will increase coverage by 0.08%

@@             master       #837   diff @@
==========================================
  Files           146        146          
  Lines         25415      25415          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          19066      19087    +21   
+ Misses         6349       6328    -21   
  Partials          0          0          

Powered by Codecov. Last update c94fdd3...8a3d53f

@chu11 chu11 force-pushed the chu11:t2000-wreck.t-bad-test_cmp2 branch from cbb1cb3 to 076599a Oct 7, 2016

t/t2000-wreck.t: Fix error in affinity assignment tests
sched_getaffinity() can return a different Cpus list than
cpus_allowed, which appears can be a superset of actual
available CPUs.

@chu11 chu11 force-pushed the chu11:t2000-wreck.t-bad-test_cmp2 branch from 076599a to 8a3d53f Oct 7, 2016

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 7, 2016

Just merged. Verified everything passes on atleast hype2 & apex

@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 7, 2016

Coverage Status

Coverage decreased (-0.01%) to 75.315% when pulling 8a3d53f on chu11:t2000-wreck.t-bad-test_cmp2 into c94fdd3 on flux-framework:master.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 7, 2016

Coverage Status

Coverage increased (+0.008%) to 75.334% when pulling 8a3d53f on chu11:t2000-wreck.t-bad-test_cmp2 into c94fdd3 on flux-framework:master.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 7, 2016

Coverage Status

Coverage increased (+0.08%) to 75.407% when pulling 8a3d53f on chu11:t2000-wreck.t-bad-test_cmp2 into c94fdd3 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Oct 7, 2016

Looks good! Thanks Al.

@garlick garlick merged commit 2df89fd into flux-framework:master Oct 7, 2016

4 checks passed

codecov/patch Coverage not affected when comparing c94fdd3...8a3d53f
Details
codecov/project 75.10% (+0.08%) compared to c94fdd3
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.08%) to 75.407%
Details

@garlick garlick removed the review label Oct 7, 2016

@garlick garlick referenced this pull request Oct 26, 2016

Closed

0.5.0 release notes #879

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.