Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add resource update support #543

Merged
merged 17 commits into from Dec 5, 2019
Merged

add resource update support #543

merged 17 commits into from Dec 5, 2019

Conversation

@dongahn
Copy link
Contributor

dongahn commented Nov 5, 2019

This PR is WIP because I will need to clean up #537 first. It adds support for update as described in Issue #528.

  • Add update() to the base reader class as well as its derived classes.

  • Implement update() within the JGF reader class. Other derived classes get an empty update() implementation for future extension.

  • Integrate the update method of readers into the traverser code and Overload run() method so that the traverser can run to update the resource graph using a JGF subgraph, jobid, starttime and duration as well as a reader object.

  • Integrate update command into resource-query. The overall syntax:
    resource-query> update allocate jgf_file jobid starttime duration

  • Add two shares tests: one dealing with a machine with a single subsystem and the other containing two subsystems.

@dongahn dongahn force-pushed the dongahn:reader_update branch from 65597bd to 167b0be Nov 5, 2019
@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Nov 5, 2019

FYI -- Travis CI was failing because there is a git filter for the files that end with .out so a couple of expected output files didn't go in. I just forced to add them back.

[gingerfoot:data/resource/expected] ahn1% git add -f update_multi/

Once CI is happy, this PR should be okay to be reviewed and I requested @milroy and @SteVwonder as the reviewers.

However, what would be better for everyone is to work and land PR #527 first and rebase this PR on top of that. So that will be my next task.

@dongahn dongahn force-pushed the dongahn:reader_update branch from 167b0be to db27967 Nov 5, 2019
@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Nov 5, 2019

Codecov Report

Merging #543 into master will increase coverage by 0.2%.
The diff coverage is 75.02%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #543     +/-   ##
========================================
+ Coverage   75.59%   75.8%   +0.2%     
========================================
  Files          69      70      +1     
  Lines        6515    6992    +477     
========================================
+ Hits         4925    5300    +375     
- Misses       1590    1692    +102
Impacted Files Coverage Δ
resource/schema/sched_data.cpp 62.5% <ø> (+1.97%) ⬆️
resource/readers/resource_reader_grug.hpp 100% <ø> (ø) ⬆️
resource/utilities/command.hpp 100% <ø> (ø) ⬆️
resource/readers/resource_reader_base.hpp 100% <ø> (ø) ⬆️
resource/store/resource_graph_store.hpp 100% <ø> (ø) ⬆️
resource/readers/resource_reader_jgf.hpp 100% <ø> (ø) ⬆️
resource/traversers/dfu_impl.hpp 100% <ø> (ø) ⬆️
resource/readers/resource_reader_hwloc.hpp 100% <ø> (ø) ⬆️
resource/store/resource_graph_store.cpp 100% <100%> (ø) ⬆️
resource/writers/match_writers.hpp 100% <100%> (ø) ⬆️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b54ff48...9d5a973. Read the comment docs.

@dongahn dongahn requested a review from milroy Nov 14, 2019
@dongahn dongahn force-pushed the dongahn:reader_update branch from db27967 to 5e31e1c Nov 14, 2019
@dongahn dongahn changed the title WIP: add resource update support add resource update support Nov 14, 2019
@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Nov 14, 2019

I rebased this to PR #537 which has two intermediate commits that I just added to address @milroy's review.

As PR #537 is getting closer to be merged, I also dropped "WIP" from the title of this PR: This PR should be ready to be reviewed as well although this will needs to be adjusted once PR #537 is merged.

@SteVwonder and @milroy: if you want to review this PR, you can do that from
the 7d10b0c commit.

@dongahn dongahn force-pushed the dongahn:reader_update branch from 5e31e1c to 4b97e2d Nov 16, 2019
@dongahn dongahn requested review from SteVwonder and milroy and removed request for milroy Nov 16, 2019
@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Nov 16, 2019

@milroy and @SteVwonder: I rebased this w/ PR #537 and the Travis CI is green. There is a slight decrease in the testing coverage. I glanced through the diffs and they seem to be reasonable.

This should be good for your reviews.

I wonder if there is a way to decrease the sensitivity for PASS/FAILURE criteria. @grondo: do you know how?

Copy link
Member

milroy left a comment

I added some very minor comments, but the new contributions of this PR look good to me.

However, the first 14 commits have already been merged in PR #537, so this PR is a bit confusing. I'm not sure what complexity that creates for merging.

const f_resource_graph_t &g,
const vtx_t &u,
unsigned int needs,
bool exclusive)
Comment on lines 356 to 359

This comment has been minimized.

Copy link
@milroy

milroy Dec 3, 2019

Member

Super nit here: looks like some misalignment in the arguments' spacing.

const f_resource_graph_t &g,
const edg_t &e)
Comment on lines 321 to 322

This comment has been minimized.

Copy link
@milroy

milroy Dec 3, 2019

Member

Super nit here: looks like some misalignment in the arguments' spacing.

std::string relation = "contains";
std::string rev_relation = "in";
edg_t e;
bool inserted; // set to false when we try and insert a parallel edge

This comment has been minimized.

Copy link
@milroy

milroy Dec 3, 2019

Member

General comment: I would like to understand the performance advantages of using different edge types in the adjacency list. Associative containers can obviate the inserted bool here. I'm commenting here to tie into issue #534.

// is reloaded with a different match policy and the job-manager wants
// to reconstruct a previous allocated job.
// In this case, you can't find allocated resources from an accelerated
// depth first visit (dfv). There shouldn't be no idata for the allocation.

This comment has been minimized.

Copy link
@milroy

milroy Dec 3, 2019

Member

Consider rephrasing: "There should be idata for the allocation."

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 4, 2019

However, the first 14 commits have already been merged in PR #537, so this PR is a bit confusing. I'm not sure what complexity that creates for merging.

Sorry about this. This should have been rebased and conflict should have been resolved. This probably fell through the crack through travel and such. I will address your comments and as part I'd that rebase this to the current master.

 

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 4, 2019

@milroy: I pushed two new commits (which should be squashed later) for your review. I will also rebase this to the master to see if the previous commits will all disappear.

dongahn added 13 commits Nov 4, 2019
Position so that emit methods can return error
codes in the future.
Fix a bug where those paths generated from association
rules were not making into the by_path indexer.

Also update JGF definition of power system conf

Problem: It turned out the paths in the power
subsystem can be ambiguous because of pdu ID naming.

Change the pdu ID naming rule within the
power.graphml grug file and update the power.jgf
definition as well.
Problem: The traverser uses graph edges to annotate
selection information such as how much resource
in their target vertices can be selected. This info
is then later used by update() to finalize the best
matching subgraph. Now, because the root vertices
of the graph do not have in-edges, this scheme required
special casing for the roots.

Add virtual in-edge support for root vertices so
that the core selection/update algorithm does
not have to have special casing.

Fix one of the side effect of the special casing:
longer method parameters for core methods like
select () and update ().
Change where we manage the meta data
used by the traverser to accelerate its operations:
to infrastructure data from scheduler data.

Facilitate separation of concerns.
Add update() to the base class as well as
its derived classes.

Implement update() within the JGF reader class.
Other derived classes get an empty update()
implementation for future extension.

- Refactor the JGF parser code for the unpack()
method so that it can be reused for the update()
operation as well.

- Find the vertices and edges corresponding to
the subgraph written in JGF.

- If the vertex is marked "exclusive", update
its schedule.plans planner object as well as
schedule.reservations and schedule.allocations
table. Note that we update these data in this
reader code, not in the traverser code, so
even those vertices that the traverser won't
walk still get their plans updated correctly.

- update_edges() updates the source edge
and target edge of each JGF edge element. It
adds selection information such as how much
resource in the target vertex must be selected
to the edge object which then later gets used
by the traverser code's update() method.
Add support so that upper layer can update the existing
resource graph using a JGF subgraph.

Overload run() method so that the traverser can run
to update the resource graph using a JGF subgraph,
jobid, starttime and duration as well as a reader
object. Note that only the JGF reader supports this.

Significantly refactor and clean up the traverser's
update walking code so that it can be used both by
the existing run() as well as this alternative
run() method.

Pass the resource_graph_db_t object as well as
the filter graph object to handle the cases
where it must update or cancel a JGF subgraph
created by a different traverser instantiation.
For example, this condition can arise when the
resource module is is reloaded with a different
match policy and the job-manager wants
to reconstruct a previous allocated job.
Integrate the traverser's new run () code to
provide the update operation within resource-query.

Add two sub-commands, allocate and reserve so
that a user can differentiate different types
of update.

The overall syntax:

resource-query> update allocate jgf_file jobid starttime duration
Populate tests by leveraging the fact that
the resource state created by "match" and "update"
must be identical.

t/t4009-update-basic.t uses a machine configuration
with only one subsystem (containment).

t/t4010-update-multi.t uses a machine configuration
with two subsystems.

For the latter, add coverage where resource
is reloaded with a different dominant
subsystem system.
@dongahn dongahn force-pushed the dongahn:reader_update branch from 98a7746 to 9d5a973 Dec 4, 2019
@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 4, 2019

@milroy: OK. With the new rebase, all of the previous commits disappeared with no conflicts! Once you take a brief look at the last two commits which then gets squashed, this PR can go in. I am okay if this goes in before our new tag. I will just add a few more items to NEWS.

@milroy

This comment has been minimized.

Copy link
Member

milroy commented Dec 5, 2019

The new commits look good. I'm merging this PR.

@milroy milroy merged commit af4447f into flux-framework:master Dec 5, 2019
2 of 3 checks passed
2 of 3 checks passed
codecov/patch 75.02% of diff hit (target 75.59%)
Details
codecov/project 75.8% (+0.2%) compared to b54ff48
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 5, 2019

Oops. I needed to squash the last two commits though...

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 5, 2019

For future reference, I use a commit message with "[squash]:... " to indicate that commit needs to be squashed once the review is done...

@grondo or @garlick: is okay to rewrite the commit history of master? Maybe this is minor enough we just leave this as is.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Dec 5, 2019

Your call, but I'd probably just leave it.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Dec 5, 2019

If it is soon enough to where travis builds haven't even finished, then it may be ok to rewrite history of master in extreme cases. (force-stop all travis builds first).

I wonder if we could make it so that branches with commits having these types of tags can't be merged. Something to look into.

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 5, 2019

Ok. I will just leave it. Not much harm done, other than having some intermediate commits

I wonder if we could make it so that branches with commits having these types of tags can't be merged. Something to look into.

This is a common enough workflow so having a way to enforce this (not merge yet) makes sense to me.

@milroy

This comment has been minimized.

Copy link
Member

milroy commented Dec 5, 2019

Sorry about the premature merge @dongahn. I misunderstood your comment about squashing the last two commits. I'll be careful not to do that in the future.

@dongahn

This comment has been minimized.

Copy link
Contributor Author

dongahn commented Dec 5, 2019

Not a big deal. Thank you for the review -- much harder part!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.