wreck: minor enhancements for scale testing #782

Merged
garlick merged 9 commits into flux-framework:master from grondo:wreck-cts1 on Aug 24, 2016

3 participants
@grondo
Contributor

grondo commented Aug 24, 2016

This is a list of minor fixes and workarounds that came out of scale testing of program launch. Most notable are the kludge to work around the use of hostlist for lists of tasks (eventually we'll use something more apt) and a convenience option for flux wreckrun, --wait-until=state, which skips processing of stdio and task exit states and simply waits until the job reaches a given state.
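For readers unfamiliar with the option, here is a minimal sketch of the --wait-until idea in Python. It is not the flux-wreckrun implementation (which is Lua); the function name `wait_until` and the iterable of state strings are assumptions made for illustration. Only the state names starting, running, and complete come from this PR.

```python
from typing import Iterable, Optional

# State names accepted by the option, as listed in this PR's commit message.
VALID_STATES = ("starting", "running", "complete")


def wait_until(states: Iterable[str], target: str) -> Optional[int]:
    """Hypothetical helper: consume job state updates until `target` is seen.

    Returns 0 (the exit status) as soon as the target state is observed,
    or None if the stream of states ends before the target is reached.
    stdio and task exit status are deliberately not examined.
    """
    if target not in VALID_STATES:
        raise ValueError(f"unsupported state: {target!r}")
    for state in states:
        if state == target:
            return 0
    return None
```

With a stream such as `["starting", "running", "complete"]` and a target of `"running"`, the helper stops at the second update and never looks at the rest of the stream.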

The kvs_watch for lwj.state is moved before the run request event, mainly to support --wait-until but also because it was found to be confusing to have "missing" states in verbose output during large scale program launch.
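The ordering issue can be shown with a toy model in Python. This is only a sketch: `StateKey`, `launch`, and `observe` are hypothetical names, the toy key does not replay its current value to a new watcher, and none of it is the Flux KVS API. It only shows why a watcher installed after the run request can miss states.

```python
from typing import Callable, List


class StateKey:
    """Toy stand-in for a watched key such as lwj.state (not the Flux KVS API)."""

    def __init__(self) -> None:
        self._watchers: List[Callable[[str], None]] = []

    def watch(self, callback: Callable[[str], None]) -> None:
        # Register a callback that is invoked on every later update.
        self._watchers.append(callback)

    def set(self, value: str) -> None:
        # Notify only the watchers that are registered at this moment.
        for callback in self._watchers:
            callback(value)


def launch(key: StateKey) -> None:
    """Toy run request: the job moves through its states immediately."""
    for state in ("starting", "running", "complete"):
        key.set(state)


def observe(watch_first: bool) -> List[str]:
    """Return the states a client sees for each ordering of watch and launch."""
    key = StateKey()
    seen: List[str] = []
    if watch_first:
        key.watch(seen.append)  # watcher installed before the run request
        launch(key)
    else:
        launch(key)             # run request issued before the watcher exists
        key.watch(seen.append)
    return seen
```

In this model `observe(True)` records every transition, while `observe(False)` records none, which corresponds to the "missing" states seen in verbose output.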

grondo added some commits Aug 17, 2016

lua-hostlist: temporarily bump MAX_RANGE
Since lua-hostlist is used for tasks in some cases, temporarily
bump MAX_RANGE (max upper bound for a host range) to 1M hosts
so that errors are not reported for task counts > 16K tasks.

Eventually this hostlist use case should be replaced with
something like nodeset.
flux-wreck: add NTASKS to flux-wreck timing
Add number of tasks column to `flux wreck timing` output.
flux-wreckrun: watch lwj.state before run request
Create kvswatcher (install kvs_watch()) before issuing wrexec.run
request, so no states are lost when running in verbose mode. This
should not affect functionality, but is useful when debugging
flux-wreckrun and/or job launch.
wreck: set nocommit flags on fatal log kzio
Add flags to fatal error log kz object such that a kvs commit
is not issued per line, on open, or on close.
wreck/lua.d: input.lua: avoid writing nil to input kz
Check for nil input data (e.g. when reading eof only) in the
iowatcher handler for stdin fd. Avoids potential runtime error
in the input.lua plugin.
bindings/lua: check for NULL kz object in kz:write()
Check for NULL kz object (though this should not happen)
in lua bindings for kz:write, before calling kz_put().
cmd/flux-wreckrun: Add -w, --wait-until=state option
Add an option to flux-wreckrun to ignore stdio and task exit
status, and instead just wait until the job hits a certain state
(starting, running, or complete), then exit with 0 status.
doc/flux-wreckrun(1): document -w, --wait-until=STATE
Document the -w, --wait-until option to flux-wreckrun.
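
To illustrate the lua-hostlist commit above, here is a rough Python sketch of how a cap on the upper bound of a range turns large task lists into errors. It is not the lua-hostlist code: `expand_range` is a hypothetical helper, and the two limit values are assumptions (16 * 1024 stands in for the old limit implied by "16K tasks", 1,000,000 for "1M").

```python
import re
from typing import List

# Assumed values for illustration only; the real constants live in lua-hostlist.
OLD_MAX_RANGE = 16 * 1024
NEW_MAX_RANGE = 1_000_000


def expand_range(expr: str, max_range: int) -> List[str]:
    """Expand 'prefix[lo-hi]' into names, rejecting upper bounds above max_range."""
    match = re.fullmatch(r"(\w*)\[(\d+)-(\d+)\]", expr)
    if match is None:
        raise ValueError(f"cannot parse range expression: {expr!r}")
    prefix, lo, hi = match.group(1), int(match.group(2)), int(match.group(3))
    if lo > hi:
        raise ValueError(f"invalid range: {lo} > {hi}")
    if hi > max_range:
        # This is the kind of error that large task lists would trigger.
        raise ValueError(f"upper bound {hi} exceeds limit {max_range}")
    return [f"{prefix}{i}" for i in range(lo, hi + 1)]
```

A task list such as `[0-19999]` is rejected under the smaller assumed limit and expands normally under the larger one, which is the effect the temporary bump is after.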

@grondo grondo added the review label Aug 24, 2016

@garlick

Member

garlick commented Aug 24, 2016

All looks good to me. Merging.

@garlick garlick merged commit 569565e into flux-framework:master Aug 24, 2016

0 of 2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build is in progress
coverage/coveralls: Coverage pending from Coveralls.io

@garlick garlick removed the review label Aug 24, 2016

@coveralls

coveralls commented Aug 24, 2016

Coverage decreased (-0.03%) to 75.092% when pulling d722d5e on grondo:wreck-cts1 into 752bcbf on flux-framework:master.

@grondo grondo deleted the grondo:wreck-cts1 branch Sep 15, 2016

@garlick garlick referenced this pull request Oct 26, 2016

Closed

0.5.0 release notes #879
