Available simulation results? #20

dombrno · 2018-03-21T19:09:38Z

Good evening,
I am using ALPSCORE/CT-HYB for the calculation of dynamic susceptibilities. This involves calculating G2, inverting the BSE equation, and performing the analytic continuation. The analytic continuation uses Maxent and thus needs some error information as input data. In order to feed a god estimate of the error to Maxent, I am considering a Jackknife resampling procedure on the output of the simulation. Hence, my questions:
a) is there anything better/easier than this available somewhere within the ALPS project?
b) if not, is the detailed simulation data available in the h5 output file?

Thanks a lot for your help.

shinaoka · 2018-03-21T21:37:21Z

Hi, I did not implement estimate of error bars simply because I did not see a need at that point.
There is no available error information in current h5 output file.

How do you propagate the errors of average sign in a Jackknife resampling?

dombrno · 2018-03-21T21:48:27Z

The idea would be to mimick a series of shorter QMC runs by splitting the full set of points of the simulation from a single run into bunches. We would then invert the BSE and obtain the dynamic susceptibility in imaginary bosonic frequency, based on each of these bunches. For this, what we would need is G2(l,l', omega_n) at each measurement point of the QMC simulation, if that is dumped anywhere?
Said differently, we are trying to avoid having to run a number of independent runs by analyzing the data of the long run we already have. but probably these data are not dumped, I would imagine this would be several Go large.

dombrno · 2018-03-21T21:53:07Z

In the old times, in the very first version of Alps, there was a function "evaluate" which actually had to be called once the QMC was finished, and did the all the averaging etc. Based on this architecture, it would have been possible to do the procedure mentioned above. But I understand that the new architecture probably does not give access to this fine-grained detail, which is not necessary, since the averaging is done at the end of the QMC run before results are dumped.

shinaoka · 2018-03-21T21:54:29Z

It may be possible with the accumulator or alea libraries in ALPSCore.
Markus Wallerberger may be the right person to answer this technical question.
Alex, Emanuel, could you assign him to this thread?

shinaoka · 2018-03-27T04:16:48Z

Alex, Emanuel, any idea?

egull · 2018-03-27T12:11:12Z

Yes. We have the FullBinning observable in the old ALPS. That one will keep a number of bins from the simulation (typically 128, I think) and fill them with the values. They are then written into the HDF5 file. A BS equation calculation can then take the data from the bins and post process them.
A word of warning though: this will require 128 times the storage for the HDF5 file (and in memory), which is substantial for vertex functions.
Hiroshi, to enable this you would have to find the accumulator and change it from mean or nobinning to fullbinning. @dombrno would then have to rerun the calculation.
With the new ALPS we have the same capabilities, just the rewrite of the ALEA has the binning procedure optimized a bit. A paper is in preparation.

dombrno · 2018-03-27T12:15:57Z

I understand, that makes sense: I remember that in the course of adapting the old Alps code (segment) for my needs (expand it to two orbitals, and use non diagonal hybridization function), I replaced the proposed accumulators with others which did not do any binning, and enjoyed a much more compact output file (I did not check memory consumption, and I was only calculating single-particle GF).

egull · 2018-03-27T12:16:46Z

So... how can we help? Should we get you a version that can do binning so you can try?

dombrno · 2018-03-27T12:19:55Z

I think that before anything is done, I should assess the memory and storage which are needed at the moment, and consider how much more the hardware is able to handle on my side.

shinaoka · 2018-03-27T12:20:54Z

Thank you, Emanuel.

@dombrno
I want to ask you a real need for your BS equation stuff.
What if you just run the CT-HYB solver several times with different random seeds?
This may also give some estimates of error bars.

dombrno · 2018-03-27T12:25:05Z

Yes, that is an option.

Maybe we can keep this feature as a "nice to have" option, but I would give it the lowest priority.

dombrno · 2018-03-27T12:37:39Z

For what it's worth, with the current code, the h5 file is 100 Mo large, while, and the calculation consumes 5Go RAM. The available RAM is 128Go on my hardware, so I could probably use up to 20 bins, if this option were ever implemented.

If I do have a real strong need for it, I will ask for guidance as to which type of accumulator is best to use in your opinion, and implement it on my side, but for the time being I thank you for your answer and your help, which perfectly answers my initial question.

dombrno · 2018-04-27T22:58:14Z

@shinaoka I finally went for the option you suggested: run the CT-HYB solver several times with different random seeds. I do this by using the job array feature of the PBS scheduler: a number of jobs are launched with the exact same inputs. Each job uses one full node with 24 cpus. The only difference between each job is the value of SEED in the input.ini file.

Does it look reasonable to you? In particular, as the seed in my setup increases by one at each node, I would like to make sure that the same seed is used by all cpus controlled by a given job, so that there can be no seed overlap between cpus from different jobs. In other terms, I would like a confirmation that all the mpi processes of a single run share the same seed. This is a probably a question for @galexv ? Thanks a lot.

shinaoka · 2018-04-27T23:29:06Z

Hmm, the value of the seed increases one by one for different nodes?
This could cause a problem.
For the MPI process of rank n, its pseudorandom number generator is initialized with SEED + n.
Here, SEED is the seed in a given file. (This is the specification of ALPSCore libraries)
So, the value of SEED should increase at least by 24 from one node to another one.

shinaoka · 2018-04-27T23:31:18Z

To prevent this happen, I may be able to apply some non-linear transformation $f$ to SEED to initialize Brandon-number generators with f(SEED) + n.
What do you think?

dombrno · 2018-04-28T06:45:20Z

I suspected this could be the case, thanks for clarifying - I will simply increase the seed by 24 on each node, and should be fine then. Thank you!

dombrno · 2018-04-28T07:24:41Z

Maybe just one detail to make sure everything is working as expected: I am controlling the seed via the key "SEED" at the top level of the .ini file, based on what I saw implemented in alps/mc/mcbase.cp. Is this the recommended way to control this parameter?

shinaoka · 2018-04-28T09:39:43Z

You're right!

dombrno · 2018-05-01T08:23:12Z

I have now obtained the data corresponding to 64 runs on a single node, using different seed values. I would like to do some resampling of the quantities G1_LEGENDRE and G2_LEGENDRE. For this purpose I need to use the number of measurements performed on each node for these quantities (the average sign is 1.0). It looks like

/simulation/results/G1_Re/count
/simulation/results/G2_Re/count

might be the suitable fields - can you please confirm if this is correct?
Thank you!

egull · 2018-05-01T10:52:36Z

Yes, that is correct.

…

On Tue, May 1, 2018, 4:23 AM Dominique Geffroy ***@***.***> wrote: I have now obtained the data corresponding to 64 runs on a single node, using different seed values. I would like to do some resampling of the quantities G1_LEGENDRE and G2_LEGENDRE. For this purpose I need to use the number of measurements performed on each node for these quantities. It looks like - /simulation/results/G1_Re/count - /simulation/results/G2_Re/count might be te suitable fields - can you please confirm if this is correct? Thank you! — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#20 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AG29Re2uVBs_BXN0vqtiEOboQW0wzz7Aks5tuBtxgaJpZM4S1W9m> .

shinaoka · 2018-05-01T11:19:03Z

BTW, why do you need to know the number of measurements?

dombrno · 2018-05-01T15:09:43Z

Well, I have 64 samples, and need to calculate the values of G1 and G2 over subsets of these samples, so I was thinking that the number of measurements is the reasonable weight to apply to the contribution of each sample to the partial resummation, given the fact that the average sign is 1.0

shinaoka self-assigned this Mar 21, 2018

shinaoka assigned galexv and egull Mar 27, 2018

dombrno closed this as completed Mar 27, 2018

shinaoka reopened this Apr 27, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Available simulation results? #20

Available simulation results? #20

dombrno commented Mar 21, 2018

shinaoka commented Mar 21, 2018 •

edited

Loading

dombrno commented Mar 21, 2018

dombrno commented Mar 21, 2018

shinaoka commented Mar 21, 2018

shinaoka commented Mar 27, 2018

egull commented Mar 27, 2018

dombrno commented Mar 27, 2018

egull commented Mar 27, 2018

dombrno commented Mar 27, 2018

shinaoka commented Mar 27, 2018

dombrno commented Mar 27, 2018

dombrno commented Mar 27, 2018

dombrno commented Apr 27, 2018

shinaoka commented Apr 27, 2018

shinaoka commented Apr 27, 2018

dombrno commented Apr 28, 2018

dombrno commented Apr 28, 2018

shinaoka commented Apr 28, 2018

dombrno commented May 1, 2018 •

edited

Loading

egull commented May 1, 2018 via email

shinaoka commented May 1, 2018

dombrno commented May 1, 2018

Available simulation results? #20

Available simulation results? #20

Comments

dombrno commented Mar 21, 2018

shinaoka commented Mar 21, 2018 • edited Loading

dombrno commented Mar 21, 2018

dombrno commented Mar 21, 2018

shinaoka commented Mar 21, 2018

shinaoka commented Mar 27, 2018

egull commented Mar 27, 2018

dombrno commented Mar 27, 2018

egull commented Mar 27, 2018

dombrno commented Mar 27, 2018

shinaoka commented Mar 27, 2018

dombrno commented Mar 27, 2018

dombrno commented Mar 27, 2018

dombrno commented Apr 27, 2018

shinaoka commented Apr 27, 2018

shinaoka commented Apr 27, 2018

dombrno commented Apr 28, 2018

dombrno commented Apr 28, 2018

shinaoka commented Apr 28, 2018

dombrno commented May 1, 2018 • edited Loading

egull commented May 1, 2018 via email

shinaoka commented May 1, 2018

dombrno commented May 1, 2018

shinaoka commented Mar 21, 2018 •

edited

Loading

dombrno commented May 1, 2018 •

edited

Loading