Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for the new wreck.state.submitted event #295

Merged
merged 3 commits into from Mar 30, 2018

Conversation

Projects
None yet
6 participants
@dongahn
Copy link
Contributor

dongahn commented Mar 27, 2018

This requires the pending flux-core PR #1389.

Add support within the scheduler and emulator for the new wreck.state.submitted event with which job request info such as nnodes and walltime is piggybacked.

Use the event's payload and other changes within jsc to optimize job submission handling performance -- i.e., reducing KVS accesses and the number of state transitions.

This should improve the job throughput when the submission rate is super high although I didn't measure the improvements.

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Mar 27, 2018

@SteVwonder: this has a change to your emulator code. Please review if you can. Thanks.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Mar 27, 2018

Coverage Status

Coverage decreased (-6.9%) to 68.968% when pulling 0e099cb on dongahn:aug_submit_event into 09ad395 on flux-framework:master.

@@ -462,11 +453,11 @@ static int update_state (flux_t *h, uint64_t jid, job_state_t os, job_state_t ns
return rc;
}

static inline bool is_newjob (json_t *jcb)
static inline bool is_newjob (flux_t *h, json_t *jcb)

This comment has been minimized.

@SteVwonder

SteVwonder Mar 27, 2018

Member

Can the flux_t *h argument be removed? It doesn't appear to be used.

This comment has been minimized.

@dongahn

dongahn Mar 27, 2018

Author Contributor

It can. It was there to print some debug msgs.

@@ -239,7 +238,7 @@ int schedule_next_job (flux_t *h, sim_state_t *sim_state)
return -1;
}

f = flux_rpc_pack (h, "job.create", FLUX_NODEID_ANY, 0,
f = flux_rpc_pack (h, "job.submit", FLUX_NODEID_ANY, 0,

This comment has been minimized.

@SteVwonder

SteVwonder Mar 27, 2018

Member

I think this will cause a race in the KVS for simulator-specific job information. Previously, the simulator submit module worked by creating the job, then populating the KVS directory with simulator specific information (via the put_job_in_kvs function), and then setting the job to submitted. At that point, the scheduler processed the job, scheduled it, and the sim_execsrv module would simulate the execution. In the process of simulating the job execution, sim_execsrv would pull the simulator specific information out of the kvs. So in the previous version, the simulator specific information was guaranteed to exist in the KVS before the sim_execsrv module tried to pull it, because the submit module wouldn't permit the scheduling of the job until the custom simulator information was placed into the KVS.

After a quick glance at flux-framework/flux-core#1389, it seems the new wreck still supports job.create. Is there a way we can keep that flow? Maybe when the submit module sends the wreck.state.submitted event, it can include the new ntasks, nnodes, and walltime data in the json payload?

I'm not married to that control flow though. I'm sure there is a cleaner way to solve this race-condition that I'm not seeing....

This comment has been minimized.

@grondo

grondo Mar 27, 2018

Contributor

After a quick glance at flux-framework/flux-core#1389, it seems the new wreck still supports job.create. Is there a way we can keep that flow?

Sorry we didn't check with you first about these changes @SteVwonder!

Since the scheduler should now ignore the reserved state, you could still use job.create from the simulator to "reserve" a job entry in the kvs, fill in values specific to the scheduler, and once those have been committed to the KVS issue the job.sumbit event yourself as you suggested. I don't think the scheduler could tell the difference between the simulator and the flux-submit program. (This is exactly the purpose for which the "reserved" state was designed)

That is my $0.02

This comment has been minimized.

@SteVwonder

SteVwonder Mar 27, 2018

Member

No need for apologies. I saw the discussion about the state changes in the github issue, but I forgot this is how the simulator submit module works. So I'll take the blame for not catching this earlier.

you could still use job.create from the simulator to "reserve" a job entry in the kvs, fill in values specific to the scheduler, and once those have been committed to the KVS issue the job.sumbit event yourself as you suggested.

That works for me. @dongahn, are you ok with the simulator still using job.create and the reserved state for now? For the splash target, I think we get this PR in sooner rather than later. In the future, we could refactor to include the simulator specific information in the job.submit message (when in sim mode) to avoid the race condition.

This comment has been minimized.

@dongahn

dongahn Mar 27, 2018

Author Contributor

Well, it won't work, at least without changes to flux-core#1398.

Right now, when the reserved state is seen, jsc puts the new job into the active job hash. So when submitted state of this job is seen afterwards, jsc won't be able to insert this job into the active job hash, leading to an error.

The way the scheduler ignores the state change of the job starting withreserved state is that, it just doesn't recognize null->reserved as the new job event. So the scheduler won't put the job into its job index at all. Thus, any subsequent event of that job will be ignored.

This comment has been minimized.

@dongahn

dongahn Mar 27, 2018

Author Contributor

I will try to make a minor modification to the JSC code in flux-core#1398 to allow your original tool flow. Stay tuned.

This comment has been minimized.

@grondo

grondo Mar 27, 2018

Contributor

Thanks @dongahn. IMO the "reserved" state should be ignored by the scheduler -- it was originally just a state that marked a kvs directory as reserved for a writer. In the wreck changes I considered not emitting an event for it, but I knew it might break the scheduler. (Of course, we can revisit this once we start work on job-ingest module there might not be a use for a reserved state)

This comment has been minimized.

@dongahn

dongahn Mar 27, 2018

Author Contributor

@grondo: effectively the scheduler is ignoring it. The reason JSC needs to keep track of this event is to allow the scheduler to ignore all of the state transitions that started from reserved state which I initially thought was just for the wreckrun use case. (If it makes sense). But now the emulator does the same, and I think I know how to accommodate it.

This comment has been minimized.

@dongahn

dongahn Mar 27, 2018

Author Contributor

@SteVwonder or @grondo: wreck.state.submitted will have to change something like the following to emulate the new wreck behavior. What kvs_path should I use with the wreck.state.submitted event emitted from the submit emulator module?

    if (!(msg = flux_event_pack ("wreck.state.submitted",
                                 "{ s:I s:s s:i s:i s:i}",
                                 "lwj", new_jobid,
                                 "kvs_path", path,
                                 "nnodes", job->nnodes,
                                 "ntasks", job->ncpus,
                                 "walltime", (int64_t)job->time_limit))
        || flux_send (h, msg, 0) < 0) {
        return -1;
    }

This comment has been minimized.

@grondo

grondo Mar 27, 2018

Contributor

@dongahn, I will have to look at the simulator, but it should be the same path returned in the response to the job.create call(?)

@dongahn dongahn force-pushed the dongahn:aug_submit_event branch from be67009 to 4c6749b Mar 28, 2018

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Mar 28, 2018

Codecov Report

Merging #295 into master will decrease coverage by 6.65%.
The diff coverage is 55.55%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #295      +/-   ##
==========================================
- Coverage   74.39%   67.74%   -6.66%     
==========================================
  Files          49       46       -3     
  Lines        9279     8692     -587     
==========================================
- Hits         6903     5888    -1015     
- Misses       2376     2804     +428
Impacted Files Coverage Δ
sched/sched.c 61.69% <55.55%> (-12.39%) ⬇️
simulator/simulator.c 0.84% <0%> (-89.92%) ⬇️
sched/sched_backfill.c 13.37% <0%> (-77.08%) ⬇️
src/common/libutil/log.c 28.88% <0%> (-33.34%) ⬇️
sched/rsreader.c 76.38% <0%> (-20.14%) ⬇️
sched/rs2rank.c 80.17% <0%> (-14.66%) ⬇️
src/common/libutil/shortjansson.h 77.96% <0%> (-10.37%) ⬇️
sched/sched_fcfs.c 88.63% <0%> (-5.69%) ⬇️
sched/flux-waitjob.c 85.12% <0%> (-3.31%) ⬇️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 09ad395...0e099cb. Read the comment docs.

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Mar 28, 2018

@SteVwonder: I added this support for both in this PR and flux-framework/flux-core#1389. But somehow the emulator-based tests all failed with "no event remaining message" after scheduling the first job.

2018-03-27T23:44:11.925585Z submit.debug[0]: sent a reply request: {"sim_time": 19643.0, "event_timers": {"sched": -1.0, "sim_exec": -1.0, "submit": -1.0}, "mod_name": "submit"}
2018-03-27T23:44:11.925709Z sim.debug[0]: No events remaining

I'm not familiar enough w/ the emulator code that I feel I am not effectively debugging this. Could you please take a look at the broker log and see if you can spot anything? Log is here:
t2000-fcfs.broker.log.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Mar 28, 2018

@dongahn, for some reason GitHub also thinks this is out-of-date with master, you may need to rebase and push again

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Mar 28, 2018

I restarted the build, though I guess we know it will fail. This does need to be rebased on the current master apparently.

dongahn added some commits Mar 26, 2018

jsc: Add support for the augmented wreck.state.submitted event
Add support within the scheduler and emulator for
the new wreck.state.submitted event with which
job request info such as nnodes and walltime is piggybacked.

Reduce KVS accesses to fetch job request information
as well as a legacy event: null to null job state
transition event.

Use job.submit within the emulator because this optimized
eventing is implemented only in submit event, not create
event. job.create event still emits wreck.state.reserved
event.
@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Mar 29, 2018

@dongahn, I would merge this branch but it does still need to be rebased on master for some reason. Also quickly disable the failing tests so future PRs don't appear to fail known failing tests.

testsuite: disable all simulator driven tests
Disable known broken simulator driven tests for now.
Renable temporarily by setting FLUX_SCHED_ENABLE_SIM_TESTS

@grondo grondo force-pushed the dongahn:aug_submit_event branch from 4c6749b to 0e099cb Mar 29, 2018

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Mar 30, 2018

@dongahn, I went ahead and rebased this branch on current master, and disabled all the simulator driven tests. Just a warning that I force-pushed over your branch, so if you have local work be sure to save it on a different branch or at least don't reset against your upstream branch.

All simulator tests are disabled for now, to re-enable temporarily set FLUX_SCHED_ENABLE_SIM_TESTS in the environment

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Mar 30, 2018

@grondo: thank you! Sorry for the late response. Had a dentist appointment.

So if Traverse turns green, this should go in?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Mar 30, 2018

Yes. I've tested it against my pending flux-core branch and seems to work well. (Note: that PR adds a new ncores attribute to job submission, if you have a chance you might want to take a look)

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Mar 30, 2018

@grondo's change looks great. @garlick, @trws or @SteVwonder, could you push the button? It seems not having this in the master is confusing many at this point.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Mar 30, 2018

Agreed!

@garlick garlick merged commit 92fc364 into flux-framework:master Mar 30, 2018

1 of 4 checks passed

codecov/patch 55.55% of diff hit (target 74.39%)
Details
codecov/project 67.74% (-6.66%) compared to 09ad395
Details
coverage/coveralls Coverage decreased (-6.9%) to 68.968%
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@SteVwonder SteVwonder referenced this pull request Apr 1, 2018

Closed

re-enable emulator tests #303

@grondo grondo referenced this pull request May 11, 2018

Closed

Need 0.5.0 Release #340

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.