Add capability to store Monte Carlo sample data in parallel #13906

jiangwen84 · 2019-08-13T19:22:25Z

Please do not close ref #13879.

moosebuild · 2019-08-13T20:04:28Z

Job Documentation on d666216 wanted to post the following:

View the site here

This comment will be updated on new commits.

jiangwen84 · 2019-08-14T16:59:18Z

@aeslaughter could you take a look at this PR? Thanks.

aeslaughter

I don't prefer this design, as it restricts the distributed version to a very specific case where the number of matrices matches the number of processors. It also introduces two new virtual methods to implement.

My goal for the Sampler object is for the number of matrices and rows per matrix to be arbitrary. I don't want to run into a case in the future where a rigid design of the matrix and row structure makes it difficult to implement.

I am open to discussing this further. Hopefully we can get something that is general but allows you to meet your needs.

What if "reinit" accepted size pairs for the number of rows and columns for each matrix and an overloaded version for when they are all the same:
reinit( (10,2), (11,3), ...)
reinit(3, (10, 10))

Also, the distribution can just break matrices wherever needed, and we can provide a smart iterator, localSampleRowBegin() and localSampleRowEnd(), that would allow each row on the local process to be accessed.
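To make the two proposed "reinit" call forms concrete, here is a minimal sketch. The class name `SamplerSizes`, the `MatrixSize` alias, and the `locate()` helper are hypothetical names made up for illustration; this is not the MOOSE Sampler API, only one way the size-pair idea could look.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical alias: (rows, columns) of one sample matrix.
using MatrixSize = std::pair<std::size_t, std::size_t>;

// Hypothetical stand-in for the sizing part of a Sampler-like object.
class SamplerSizes
{
public:
  // One (rows, columns) pair per matrix; the sizes may all differ.
  void reinit(const std::vector<MatrixSize> & sizes) { _sizes = sizes; }

  // Overload for n matrices that all share the same size.
  void reinit(std::size_t n, const MatrixSize & size) { _sizes.assign(n, size); }

  std::size_t numMatrices() const { return _sizes.size(); }

  // Total number of rows summed over all matrices.
  std::size_t totalRows() const
  {
    std::size_t total = 0;
    for (const auto & s : _sizes)
      total += s.first;
    return total;
  }

  // Map a global row index to (matrix index, row within that matrix), so a
  // distribution can break matrices wherever the row ranges happen to fall.
  std::pair<std::size_t, std::size_t> locate(std::size_t global_row) const
  {
    assert(global_row < totalRows());
    std::size_t m = 0;
    while (m < _sizes.size() && global_row >= _sizes[m].first)
    {
      global_row -= _sizes[m].first;
      ++m;
    }
    return {m, global_row};
  }

private:
  std::vector<MatrixSize> _sizes;
};
```

With this sketch, `reinit({{10, 2}, {11, 3}})` describes two matrices with 21 rows in total, and `reinit(3, {10, 10})` describes three 10x10 matrices. A local row iterator could then walk a contiguous range of global rows and use something like `locate()` to find the matrix each row belongs to.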

framework/include/samplers/Sampler.h

...ols/test/tests/multiapps/batch_commandline_control/gold/master_multiple_out_storage_0002.csv

aeslaughter · 2019-08-20T19:50:15Z

Another possibility is that I am trying to be too general. In that case we should just remove the multiple DenseMatrices and use a single one. This would allow a single offset to be stored for distributing the data.

jiangwen84 · 2019-08-21T04:35:52Z

Shouldn't the current implementation already allow the number of rows/matrices to differ from the number of processors? Maybe I misunderstood what you said about the "specific case where the number of matrices matches the number of processors".

If you agree to have just one DenseMatrix, it will be easier to implement, at least for now.

Could you explain how you would like the DenseMatrix to be distributed across all processors? Do you think linear partitioning of the rows is not a good idea?
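For reference, linear partitioning of rows can be sketched as below. The function name `linearPartition` and the rule that leftover rows go to the lowest ranks are assumptions made for this illustration; the scheme actually used in the PR may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Half-open range [begin, end) of global rows owned by `rank` when `n_rows`
// rows are split as evenly as possible across `n_procs` processes.
// Assumption for this sketch: the first (n_rows % n_procs) ranks each
// receive one extra row.
std::pair<std::size_t, std::size_t>
linearPartition(std::size_t n_rows, std::size_t n_procs, std::size_t rank)
{
  assert(n_procs > 0 && rank < n_procs);
  const std::size_t base = n_rows / n_procs;  // rows every rank receives
  const std::size_t extra = n_rows % n_procs; // leftover rows to spread out
  const std::size_t begin = rank * base + (rank < extra ? rank : extra);
  const std::size_t end = begin + base + (rank < extra ? 1 : 0);
  return {begin, end};
}
```

Each process only needs the total row count, the number of processes, and its own rank, so a single stored offset (the `begin` value) is enough to map local rows back to global rows. Ranks past the row count simply get an empty range.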

jiangwen84 · 2019-08-21T20:47:18Z

@bwspenc since you are also interested in this for the Grizzly application, I am tagging you to join our discussion.

aeslaughter · 2019-08-22T23:10:49Z

@jiangwen84 I misunderstood what you were doing with the two getTotal... methods. Also, the distribution looks fine. I just went through this too quickly, sorry about that.

The only thing I don't like about this is the two function overrides. Can we go with setters instead, called from the constructor of MonteCarloSampler? The reinit() method can error if they have not been called.

setTotalNumberOfMatrices(42);
setTotalNumberOfSamples(10000);
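A minimal sketch of that setter pattern follows. The class names `SamplerBase`, `MonteCarloSamplerSketch`, and `ForgetfulSampler` are hypothetical, and `std::logic_error` stands in for whatever error mechanism the framework would use; only the two setter names and `reinit()` come from the comment above.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical base class: sizes are supplied through setters rather than
// through virtual overrides, and reinit() refuses to run without them.
class SamplerBase
{
public:
  virtual ~SamplerBase() = default;

  void reinit()
  {
    if (!_matrices_set || !_samples_set)
      throw std::logic_error("setTotalNumberOfMatrices() and "
                             "setTotalNumberOfSamples() must be called before reinit()");
    _initialized = true;
  }

  bool initialized() const { return _initialized; }
  std::size_t totalMatrices() const { return _total_matrices; }
  std::size_t totalSamples() const { return _total_samples; }

protected:
  void setTotalNumberOfMatrices(std::size_t n)
  {
    _total_matrices = n;
    _matrices_set = true;
  }

  void setTotalNumberOfSamples(std::size_t n)
  {
    _total_samples = n;
    _samples_set = true;
  }

private:
  std::size_t _total_matrices = 0;
  std::size_t _total_samples = 0;
  bool _matrices_set = false;
  bool _samples_set = false;
  bool _initialized = false;
};

// Derived sampler that sets its sizes in the constructor, as suggested.
class MonteCarloSamplerSketch : public SamplerBase
{
public:
  MonteCarloSamplerSketch()
  {
    setTotalNumberOfMatrices(42);
    setTotalNumberOfSamples(10000);
  }
};

// Derived sampler that forgets to call the setters; reinit() will throw.
class ForgetfulSampler : public SamplerBase
{
};
```

The trade-off in this sketch is that a missing size is caught at run time in `reinit()` rather than at compile time, as a pure virtual method would be, in exchange for a derived class having no extra methods to override.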

jiangwen84 · 2019-08-28T17:53:21Z

@aeslaughter Can you take another look? I made the sample matrix partitioning consistent with multi-app batch. I also added a new option in MCSampler to reuse the computed sample matrix, because recomputing it might take a long time when there are more than 100 million samples. Thanks!
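The reuse option can be pictured with the sketch below. The class `CachedSampleMatrix`, its `reuse` flag, and the use of `std::mt19937` are assumptions for illustration only; the actual option name and random number generator in MCSampler are not shown in this thread.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sample matrix holder: when `reuse` is true the samples are
// computed once and handed back on later requests; otherwise every request
// draws a fresh set of samples.
class CachedSampleMatrix
{
public:
  CachedSampleMatrix(std::size_t rows, std::size_t cols, unsigned int seed, bool reuse)
    : _rows(rows), _cols(cols), _reuse(reuse), _generator(seed)
  {
  }

  // Returns the samples in row-major order, recomputing only when needed.
  const std::vector<double> & get()
  {
    if (!_reuse || !_valid)
      compute();
    return _data;
  }

  std::size_t computeCount() const { return _compute_count; }

private:
  void compute()
  {
    _data.resize(_rows * _cols);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    for (auto & value : _data)
      value = dist(_generator);
    _valid = true;
    ++_compute_count;
  }

  std::size_t _rows;
  std::size_t _cols;
  bool _reuse;
  std::mt19937 _generator;
  std::vector<double> _data;
  bool _valid = false;
  std::size_t _compute_count = 0;
};
```

With `reuse` enabled, repeated calls to `get()` cost nothing beyond the first, which is where the savings for very large sample counts would come from. The price is that the full matrix stays in memory between calls.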

ref idaholab#13879

jiangwen84 · 2019-09-03T20:42:27Z

@aeslaughter any thoughts? Thanks.

aeslaughter · 2019-09-04T16:38:15Z

I like where this is going. I am going to add a few things to this rather than try to explain what I am thinking.


jiangwen84 changed the title ~~Added an option to store Monte Carlo sample data in parallel~~ Add option to store Monte Carlo sample data in parallel Aug 13, 2019

jiangwen84 changed the title ~~Add option to store Monte Carlo sample data in parallel~~ Add capability to store Monte Carlo sample data in parallel Aug 13, 2019

jiangwen84 force-pushed the distribute_sample_matrix branch from f5823db to dea3f22 Compare August 13, 2019 19:25

moosebuild added the PR: Failed but allowed label Aug 13, 2019

aeslaughter self-assigned this Aug 20, 2019

aeslaughter suggested changes Aug 20, 2019

framework/include/samplers/Sampler.h (outdated)

...ols/test/tests/multiapps/batch_commandline_control/gold/master_multiple_out_storage_0002.csv

jiangwen84 force-pushed the distribute_sample_matrix branch from dea3f22 to 61f73c9 Compare August 22, 2019 18:14

moosebuild added PR: Failed but allowed and removed PR: Failed but allowed labels Aug 22, 2019

jiangwen84 force-pushed the distribute_sample_matrix branch from 61f73c9 to acdbcca Compare August 22, 2019 22:46

moosebuild removed the PR: Failed but allowed label Aug 22, 2019

jiangwen84 force-pushed the distribute_sample_matrix branch from acdbcca to 9035ca0 Compare August 28, 2019 14:28

moosebuild added the PR: Failed but allowed label Aug 28, 2019

jiangwen84 force-pushed the distribute_sample_matrix branch from 9035ca0 to 3778376 Compare August 28, 2019 17:15

jiangwen84 force-pushed the distribute_sample_matrix branch from 3778376 to 87612f6 Compare August 29, 2019 16:22

moosebuild removed the PR: Failed but allowed label Aug 29, 2019

Add capability to store Monte Carlo sample data in parallel

d666216

ref idaholab#13879

jiangwen84 force-pushed the distribute_sample_matrix branch from 87612f6 to d666216 Compare August 29, 2019 16:28

moosebuild added the PR: Failed but allowed label Aug 29, 2019

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Sep 6, 2019

WIP: New interface for Sampler objects

bbb4005

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Sep 6, 2019

Redesign of MC working

efcb047

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 20, 2019

Remove copy of test that ran with threads

fa10d7e

(refs idaholab#13906) The test removed isn't needed, it is executed with threads by the test system

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 20, 2019

fixup! Update stochastic tools documentation to include getNextLocalRow

e1957d6

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 20, 2019

Update stochastic tools documentation to include getNextLocalRow

755986f

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 20, 2019

Remove copy of test that ran with threads

54427f8

(refs idaholab#13906) The test removed isn't needed, it is executed with threads by the test system

aeslaughter mentioned this pull request Nov 21, 2019

Distribute Sampler data in parallel #14191

Merged

aeslaughter added this to In progress in Stochastic Tools via automation Nov 21, 2019

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 21, 2019

Disable recover testing for python scripts.

ac35b8b

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

New interface for Sampler objects to allow for global and/or local re…

fa39858

…trival (closes idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Create row iterator methods for Sampler object

ce50b62

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

WIP: Add matrix/vector size limits to Sampler

368158f

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Update SamplerData for getNextLocalRow

1185660

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Update SamplerTransfer to use getNextLocalRow

fa06dcd

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Update MultiAppCommandLineControl to use getNextLocalRow

d16db11

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Update SamplerFullSolveMultiApp to use getNextLocalRow

522bc5c

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Test errors within Sampler base class

4b1e0ed

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Improve interface for python util for Sampler memory data

3f57f77

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Add row compute method to API and improve documentation

f37594f

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Add perf graph to Sampler and Distribution objects

3ec4b7c

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Update stochastic tools documentation to include getNextLocalRow

21b8e79

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Remove copy of test that ran with threads

1b8b2be

(refs idaholab#13906) The test removed isn't needed, it is executed with threads by the test system

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Update Sampler API to use getGlobalSamples to avoid ambiguity

0e3bc67

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Add legacy API for short-term application support

4c3684c

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 25, 2019

Disable recover for python script testing

0414e10

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 27, 2019

Fix input values for 64-bit integer support

b1bc07c

(refs idaholab#13906)

aeslaughter mentioned this pull request Nov 27, 2019

Fix input values for 64-bit integer support #14453

Merged

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 27, 2019

Disable valgrind for python test

9901494

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 27, 2019

Disable valgrind for python test

d8f5667

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 27, 2019

Fix input values for 64-bit integer support

cff954d

(refs idaholab#13906)

aeslaughter added a commit to aeslaughter/moose that referenced this pull request Nov 27, 2019

Disable valgrind for python test

29c4385

(refs idaholab#13906)

aeslaughter moved this from In progress to Done in Stochastic Tools Dec 10, 2019
