Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log loadbalancing info on rank 0 #448

Merged
merged 1 commit into from
Mar 2, 2020

Conversation

fgfuchs
Copy link
Contributor

@fgfuchs fgfuchs commented Feb 27, 2020

load balancing info is collected on rank 0 and logged with OpmLog

also removed unused variable my_num

@akva2
Copy link
Member

akva2 commented Feb 27, 2020

jenkins build this please

opm/grid/cpgrid/CpGrid.cpp Outdated Show resolved Hide resolved
@blattms
Copy link
Member

blattms commented Feb 27, 2020 via email

@bska
Copy link
Member

bska commented Feb 27, 2020

BTW: Is it me or is github not correctly working, currently?

There does seem to be something that's not quite right. The GitHub site status page also suggests similar experience.

opm/grid/cpgrid/CpGrid.cpp Outdated Show resolved Hide resolved
@bska
Copy link
Member

bska commented Feb 27, 2020

I'm just going to rerun the build test. The earlier test, though successful, did not propagate the successful status back to the PR.

@bska
Copy link
Member

bska commented Feb 27, 2020

jenkins build this please

@fgfuchs
Copy link
Contributor Author

fgfuchs commented Feb 28, 2020

thanks for the code review. I have fixed the typo. I will leave it up to Atgeirr to decide if I should change more.

@atgeirr
Copy link
Member

atgeirr commented Feb 28, 2020

Interesting approach. Do we need the sum? I would have used cc.gather(&num_cells, num_cells.data(), 1, 0);

I think you are right, in that gather() is perhaps more like how it is usually done. It is interesting though that sum() is simpler to understand (do not need the 1, 0 arguments). Perhaps "single item from each -> array on root/all" overloads of gather()/allgather() could be added to the collective communication class for cases like this.

As is I think that using sum() is fine here.

Copy link
Member

@atgeirr atgeirr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, asking for minor fixes only.

opm/grid/cpgrid/CpGrid.cpp Outdated Show resolved Hide resolved
opm/grid/cpgrid/CpGrid.cpp Outdated Show resolved Hide resolved
@atgeirr
Copy link
Member

atgeirr commented Feb 28, 2020

Ok, squash the commits and we are done!

opm/grid/cpgrid/CpGrid.cpp Outdated Show resolved Hide resolved
opm/grid/cpgrid/CpGrid.cpp Outdated Show resolved Hide resolved
@fgfuchs
Copy link
Contributor Author

fgfuchs commented Feb 28, 2020

jenkins build this please

Copy link
Member

@atgeirr atgeirr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turned out there were a few more things with the last changes.

str += "----------------\n";
Opm::OpmLog::info(str);
}
for(const auto& cc_num_cells: nc) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must put nc before cc_num_cells.

Also some nitpicks: space after for, space around colon, and move the definition of cc_num_cells to line 251 instead of 100 lines above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@fgfuchs
Copy link
Contributor Author

fgfuchs commented Feb 28, 2020

I think that formatting should be handled automatically via some hooks.

removed unused variable my_num

aborting all processes if at least one process has zero cells
@fgfuchs
Copy link
Contributor Author

fgfuchs commented Feb 28, 2020

jenkins build this please

@bska
Copy link
Member

bska commented Feb 28, 2020

jenkins build this please

It appears you (currently?) don't have the ability to launch Jenkins build tests. Don't worry. I'll do it for you.

@bska
Copy link
Member

bska commented Feb 28, 2020

jenkins build this please

@fgfuchs
Copy link
Contributor Author

fgfuchs commented Feb 28, 2020

Thank you

Copy link
Member

@blattms blattms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right, in that gather() is perhaps more like how it is usually done. It is interesting though that sum() is simpler to understand (do not need the 1, 0 arguments).

1 is the size and I think you are passing the size of the array to sum, too. In addition you need to tell it where to gather. That is the 0.

As is I think that using sum() is fine here.

I don't.
It is an all-to-all communication versus an all-to-one communication. That means that it sends twice as many message as gather (that might depend on the implementation of course). In addition, each message has size P, the number of processes, versus size 1 for the gather.

@atgeirr
Copy link
Member

atgeirr commented Mar 2, 2020

It is an all-to-all communication versus an all-to-one communication. That means that it sends twice as many message as gather (that might depend on the implementation of course). In addition, each message has size P, the number of processes, versus size 1 for the gather.

Point taken, I am convinced. Let's use gather().

@fgfuchs
Copy link
Contributor Author

fgfuchs commented Mar 2, 2020

I don't.
It is an all-to-all communication versus an all-to-one communication. That means that it sends twice as many message as gather (that might depend on the implementation of course). In addition, each message has size P, the number of processes, versus size 1 for the gather.

At the moment, I am using sum to throw an exception on all processes, if one process has zero cells. What I mean are these lines:

        for (const auto& nc : cc_num_cells) {
            if (nc == 0) {
                throw std::runtime_error("At least one process has zero cells. Aborting.");
            }
        }

@blattms
Copy link
Member

blattms commented Mar 2, 2020

Was that throw there before and needed?

Somehow I thought the bug with one process having no cells has been fixed somewhen. Am I mistaking?

@atgeirr
Copy link
Member

atgeirr commented Mar 2, 2020

Yes, the check and throw was already there. I think adding it was the fix, it used to just crash.

@blattms
Copy link
Member

blattms commented Mar 2, 2020 via email

@atgeirr
Copy link
Member

atgeirr commented Mar 2, 2020

we should retest whether there is still a problem with zero elements on one process

I think this was observed with MPI but not Zoltan, because then cartesian division of the domain is used. Of course, running a 3-cell case on 4 processes should also trigger it, Zoltan or not...

@atgeirr
Copy link
Member

atgeirr commented Mar 2, 2020

Issue made as requested, I'll then merge.

@atgeirr atgeirr merged commit f6cda38 into OPM:master Mar 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants