Work balance #11216

friedmud · 2018-04-07T16:50:17Z

Create a new VectorPostprocessor that can help in determining the quality of a partitioning.

closes #11209

moosebuild · 2018-04-07T17:10:34Z

Job Documentation on b185589 wanted to post the following:

View the site here

This comment will be updated on new commits.

permcody

Looks pretty good, just a few minor comments.

permcody · 2018-04-07T19:11:53Z

framework/src/vectorpostprocessors/StatisticsVectorPostprocessor.C

+    }
+    case 5: // norm2
+      return std::sqrt(std::accumulate(
+          stat_vector.begin(), stat_vector.end(), 0, [](Real running_value, Real current_value) {


Minor inconsistency between your two lambdas. Neither one needs a capture, but the one above includes an ampersand.

The one with the & needs to capture mean.

Again: I am not writing this code randomly.

Oh, well then how about &mean?
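The capture question in this thread can be illustrated with a small standalone sketch. It is not the code from the PR: the function names `sumSquaredDeviations` and `norm2` are hypothetical, and `double` stands in for `Real`. One lambda reads `mean` and therefore captures it explicitly; the other uses only its arguments and has an empty capture list. The sketch also passes `0.0` as the initial value, because `std::accumulate` accumulates in the type of that argument and an integer `0` would truncate the partial sums; that detail is an observation here, not something raised in the review.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Sum of squared deviations from the mean. The lambda reads `mean`,
// so it has to capture it; naming the capture makes the dependency explicit.
double sumSquaredDeviations(const std::vector<double> & values)
{
  assert(!values.empty());
  const double mean = std::accumulate(values.begin(), values.end(), 0.0) / values.size();

  return std::accumulate(values.begin(), values.end(), 0.0,
                         [&mean](double running_value, double current_value) {
                           const double diff = current_value - mean;
                           return running_value + diff * diff;
                         });
}

// Euclidean norm. This lambda only uses its two arguments,
// so the capture list stays empty.
double norm2(const std::vector<double> & values)
{
  return std::sqrt(std::accumulate(values.begin(), values.end(), 0.0,
                                   [](double running_value, double current_value) {
                                     return running_value + current_value * current_value;
                                   }));
}
```

Calling `norm2` on a vector holding 3 and 4 returns 5, and `sumSquaredDeviations` on 1, 2, 3 returns 2.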

permcody · 2018-04-07T19:12:58Z

framework/src/vectorpostprocessors/StatisticsVectorPostprocessor.C

+
+          }));
+    default:
+      mooseError("Unknown statistics type1");


Typo? Maybe print the stat number.
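A minimal sketch of the suggestion to print the stat number, assuming a plain `std::runtime_error` as a stand-in for `mooseError` and a hypothetical `computeStat` function with only two cases (the real switch has more):

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, reduced version of a statistics switch.
// std::runtime_error stands in for mooseError so the sketch is standalone.
double computeStat(const std::vector<double> & values, int stat_id)
{
  assert(!values.empty());

  switch (stat_id)
  {
    case 0: // min
      return *std::min_element(values.begin(), values.end());
    case 1: // max
      return *std::max_element(values.begin(), values.end());
    default:
      // Include the offending id so the message is actionable
      throw std::runtime_error("Unknown statistics type: " + std::to_string(stat_id));
  }
}
```

With this shape, an unexpected id such as 42 produces a message that names the value instead of a fixed string.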

permcody · 2018-04-07T19:17:53Z

framework/src/vectorpostprocessors/WorkBalance.C

+  // Now Node info
+  auto wb_nl = WBNodeLoop(_fe_problem);
+
+  Threads::parallel_reduce(*mesh.getLocalNodeRange(), wb_nl);


Hmm, on one hand you aren't really after efficiency (you are doing two complete sweeps over the mesh when you could just do them together). On the other hand you went through the trouble of creating two custom threaded loops (when we don't run threads very often). Are you sure this is a better design than just gathering both stats at once by deriving from an existing looping type?

I do really wish you would give me the benefit of the doubt. I didn't just "do" this stuff because it makes me happy...

You can't do an element loop and count things on nodes because nodes are shared by elements so you double count. The same goes for doing a node loop and counting elemental things. Yes, you could invent some complicated logic / structure... but it would be a mess or waste memory / efficiency. It's better to do them in two loops.

This is exactly the same reason why we have two different loops for elemental things (like elemental postprocessors) and nodal things (like nodal postprocessors).

As for doing it threaded: any mesh-sweep we do should be threaded.

Not only that - but I've done the hard work to make it threaded. Why on earth would we remove it now?

Just asking the question. I just saw a bunch of new loops and wanted to make sure that this was the right choice here. Sounds like you've thought about it.
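The double-counting argument above can be seen on a toy mesh. This is a sketch under stated assumptions, not libMesh code: `Element` is a hypothetical type holding global node ids, and a `std::set` stands in for a separate node range.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy mesh: each element is just a list of global node ids.
using Element = std::vector<std::size_t>;

// Counting nodes while looping over elements: a node shared by
// several elements is counted once per element that touches it.
std::size_t countNodesViaElements(const std::vector<Element> & elements)
{
  std::size_t count = 0;
  for (const auto & elem : elements)
    count += elem.size();
  return count;
}

// Visiting each node exactly once. The set plays the role of a
// dedicated node loop; inside an element loop it would be extra
// bookkeeping and memory.
std::size_t countUniqueNodes(const std::vector<Element> & elements)
{
  std::set<std::size_t> seen;
  for (const auto & elem : elements)
    seen.insert(elem.begin(), elem.end());
  return seen.size();
}
```

For two quadrilaterals that share an edge, with node lists `{0, 1, 4, 3}` and `{1, 2, 5, 4}`, the element-based count is 8 while the mesh has only 6 nodes.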

permcody · 2018-04-07T19:26:11Z

framework/src/vectorpostprocessors/StatisticsVectorPostprocessor.C

+#include "ThreadedNodeLoop.h"
+
+#include "libmesh/quadrature.h"
+


#include <numeric>

I guess - there are LOTS of things that we use that we never #include... tons in fact.

If I #included everything used in this file it would be ~20 or so header files...

Because you need it: https://civet.inl.gov/job/174638/

That's not what caused that failure. The failure was an actual copy-and-paste bug that you missed: https://github.com/idaholab/moose/pull/11216/files#diff-5e0bb82e85bf3311c34cfcf6d004e3e9R13

Hmm, are you sure? Doing a "grep" shows that we only have three includes of "numeric" in the whole framework. Two of them in vector postprocessors and one in MooseUtils.h. The unity build will work because of the former, but we are definitely pushing the lucky side of things if we are relying on MooseUtils to pull it in through an indirect include.

I #included it to be sure

permcody · 2018-04-07T19:27:41Z

framework/src/vectorpostprocessors/StatisticsVectorPostprocessor.C

+      const auto & name = the_pair.first;
+      const auto & values = *the_pair.second.current;
+
+      auto & stat_vector = *_stat_vectors[name];


Potential accidental insertion... Try to avoid using map [] operators on the RHS - like ever :)

It can't be - initialize() above guarantees that it will exist. But I'll change it because I don't feel like fighting over pedantry.

I've fixed literally dozens of these over the years. Maybe even hundreds. The pedantry is more than worth it. I'm not picking on you, I'm trying to encourage defensive coding. If you want to keep the RHS bracket usage, add a mooseAssert right before it, or at least at the top of the method. Maybe someday we'll refactor and initialize and execute won't appear next to one another in the code.

Fair - I will change it. What is your preferred way here again? Do we have it written down somewhere (maybe it should be on the code-standards... maybe we should even look for this in precheck!)

Almost impossible in a pre-check. Using the bracket operator on the RHS with a vector is common and the only way to get values out. Telling the difference between a map and a vector would require full semantic analysis. I usually just go with std::map::find(), Robert likes std::map::count() better. If you want to use the brackets, then just put a mooseAssert somewhere before so we can actually guarantee (at least in debug) that the value is there. Finally there is std::map::at() but we've more or less decided against using that because it can throw (which at least finds the problem, but doesn't give useful information when it causes your simulation to crash, especially in parallel).

I wonder if it would be possible to write a custom clang plugin that would error if you call std::map::operator[]...

Anyway - I changed it...
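The lookup options discussed in this thread can be compared side by side. This is a standalone sketch: `VectorMap` and the function names are hypothetical, and plain `assert` stands in for `mooseAssert`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using VectorMap = std::map<std::string, std::vector<double>>;

// operator[] default-constructs a missing entry, so a read with an
// unexpected key silently grows the map. The map is taken by value
// here so the caller's copy is left untouched.
std::size_t sizeAfterBracketLookup(VectorMap m, const std::string & name)
{
  (void)m[name];
  return m.size();
}

// find() never inserts; a missing key is reported as nullptr.
const std::vector<double> * findVector(const VectorMap & m, const std::string & name)
{
  const auto it = m.find(name);
  return it == m.end() ? nullptr : &it->second;
}

// If the brackets are kept, assert first so debug builds catch a missing key.
std::vector<double> & checkedLookup(VectorMap & m, const std::string & name)
{
  assert(m.count(name) == 1 && "entry must exist before lookup");
  return m[name];
}
```

`std::map::at()` is the third option mentioned above; it throws `std::out_of_range` on a missing key, which is the behavior the reviewers said they prefer to avoid.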

permcody · 2018-04-07T19:29:42Z

test/tests/vectorpostprocessors/work_balance/tests

+    csvdiff = 'distributed_work_balance_0000.csv'
+    min_parallel = 2
+    max_parallel = 2
+    mesh_mode = distributed


This shouldn't be necessary for a framework test, but I'll leave that up to you. We do run distributed mesh in parallel. This is fine.

Again: please give me the benefit of the doubt. It's not just random. If you see something that I've done that looks like I did it on purpose (not just copy and paste)... THEN I DID IT ON PURPOSE. Please try to think about why I would have done that instead of just criticizing. I've been working hard on making good code that is tested and documented...

In this case the gold files are particular to the partitioner and the number of processors the test is run on. There isn't one gold file that will work for both replicated and distributed - but I want to make sure that both are tested with this object (it's obviously important that it works with both).

Oh right, good point. In fact there's a possibility it won't be stable on different platforms. I guess we'll see.

permcody · 2018-04-07T19:30:38Z

framework/src/vectorpostprocessors/WorkBalance.C

+
+    auto n_vars = node.n_vars(sys);
+
+    for (decltype(n_vars) var = 0; var < n_vars; var++)


Minor: Lots of whitespace (newlines), not really clear why.

permcody · 2018-04-07T19:32:19Z

framework/src/vectorpostprocessors/WorkBalance.C

+
+    // For MOOSE, system 0 is the nonnlinear system - that's what we care about here
+    // I've left the other code commented out here because I might change my mind in a little while
+    // and add back the ability to set the system or compute over all


Perhaps an option for either or both is in order:

moose/framework/src/postprocessors/NumDOFs.C

Line 20 in 215da4b

MooseEnum system_enum("NL AUX ALL", "ALL");

Possibly - but I accept pull requests... I don't have time to do all the possibilities.

Yeah, I'm just commenting that we ran into this before; I read your comment here and am showing you that we eventually added the option. You don't have to change anything, but you can see that we may want to add options. Again, I'm not saying that you should change it, just getting you to think and be consistent with other parts of the code.

Well - the other problem here is that it might be some other system other than NL or Aux. For instance: I have a few systems in my ray tracing stuff.

I know that I can get the enum to accept more values than what are given... but how exactly do I get an id for those? Do you have to give the real name of the system (like rt0) or do you provide the system number or something?

This is what I wasn't quite sure of so I didn't know how I wanted to proceed.

I'll at least add in NL, AUX and ALL for now.

Also: thanks for pointing to NumDOFs... I'll make this consistent.

I'm not sure how we get the extra ids. @aeslaughter has rewritten that system and things have changed quite a bit. NL and Aux will be sufficient for almost anyone.
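A rough sketch of what an NL / AUX / ALL option could translate to when choosing which systems to sweep. Everything here is hypothetical: `SystemChoice` and `systemsToVisit` are illustrative names, not MOOSE API. The code comment quoted above states that system 0 is the nonlinear system; that the auxiliary system is number 1 is an assumption made for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical mirror of the "NL AUX ALL" option.
enum class SystemChoice
{
  NL,
  AUX,
  ALL
};

// Returns the system numbers to visit.
// Assumptions: system 0 is nonlinear (per the quoted comment),
// system 1 is auxiliary (assumed for this sketch).
std::vector<unsigned int> systemsToVisit(SystemChoice choice, unsigned int n_systems)
{
  assert(n_systems >= 2);

  std::vector<unsigned int> systems;
  switch (choice)
  {
    case SystemChoice::NL:
      systems.push_back(0);
      break;
    case SystemChoice::AUX:
      systems.push_back(1);
      break;
    case SystemChoice::ALL:
      for (unsigned int sys = 0; sys < n_systems; ++sys)
        systems.push_back(sys);
      break;
  }
  return systems;
}
```

Note that `ALL` picks up any additional systems (such as the ray tracing systems mentioned above), while selecting one of those individually would need an identifier scheme that this thread leaves open.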

permcody · 2018-04-09T14:42:19Z

framework/src/vectorpostprocessors/StatisticsVectorPostprocessor.C

+                                     stat_vector.end(),
+                                     0,
+                                     [&mean](Real running_value, Real current_value) {
+                                       return running_value + std::pow(current_value - mean, 2);


Minor: Could use one of the fast pow utilities we have (either libMesh's templated one, or Moose's fast one).
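For illustration, a compile-time integer power can replace `std::pow(x, 2)` with plain multiplication. The helper below is a hypothetical standalone sketch named `intPow`; it is not the libMesh or MOOSE utility the comment refers to, and `double` stands in for `Real`.

```cpp
#include <cassert>

// Hypothetical helper: raises x to a non-negative integer power known
// at compile time using repeated multiplication.
template <int N>
inline double intPow(double x)
{
  static_assert(N >= 0, "exponent must be non-negative");
  double result = 1.0;
  for (int i = 0; i < N; ++i)
    result *= x;
  return result;
}

// The squared-deviation term from the lambda above, written with the helper.
inline double squaredDeviation(double current_value, double mean)
{
  return intPow<2>(current_value - mean);
}
```

Because the exponent is a template parameter, the loop bound is a constant and a compiler can typically reduce `intPow<2>(x)` to a single multiplication.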

friedmud added 2 commits April 7, 2018 09:09

Add new WorkBalance VectorPostprocessor refs idaholab#11209

54d427b

Add documentation closes idaholab#11209

76d54a6

friedmud added 3 commits April 7, 2018 12:46

Add StatisticsVectorPostprocessor closes idaholab#11217

68e3214

Test work balance with distributed mesh refs idaholab#11209

33fa4a9

Add some documentation for StatisticsVectorPostprocessor and some small modifications to WorkBalance documentation closes idaholab#11217 refs idaholab#11209

84d997d

permcody reviewed Apr 7, 2018

friedmud added a commit to friedmud/moose that referenced this pull request Apr 7, 2018

Address comments refs idaholab#11216 idaholab#11209 idaholab#11217

994b5be

friedmud force-pushed the work_balance_11209 branch from 994b5be to 717d952 Compare April 7, 2018 23:26

friedmud added a commit to friedmud/moose that referenced this pull request Apr 7, 2018

Address comments refs idaholab#11216 idaholab#11209 idaholab#11217

717d952

Address comments refs idaholab#11216 idaholab#11209 idaholab#11217

5bdd0e9

friedmud force-pushed the work_balance_11209 branch from 717d952 to 5bdd0e9 Compare April 7, 2018 23:39

friedmud added 2 commits April 7, 2018 21:44

Add HistogramVectorPostprocessor refs idaholab#11218

1ed12ab

Add some documentation on how to plot a HistogramVectorPostprocessor closes idaholab#11218

b185589

friedmud force-pushed the work_balance_11209 branch from 2821790 to b185589 Compare April 8, 2018 14:43

friedmud mentioned this pull request Apr 8, 2018

Set node processor ids less unevenly libMesh/libmesh#1621

Closed

permcody approved these changes Apr 9, 2018

permcody merged commit f1a649d into idaholab:devel Apr 9, 2018

brianmoose mentioned this pull request Apr 10, 2018

CIVET: '64bit dof id Test' failure #11228

Closed
