
Restart won't work on large meshes #80

Open
yungyuc opened this issue Nov 30, 2013 · 8 comments

Comments


yungyuc commented Nov 30, 2013

Originally reported by: David Bilyeu (BitBucket: david_b, GitHub: david_b)


When restarting from a large mesh, SOLVCON fails without any error message.
Increasing the number of MPI processes does not help.
Each solvcon.dump.solver file is about 900 MB,
and the solvcon.dump.case.obj file was about 50 GB.
The system that I ran it on has ~30 GB of memory per node.

It might be useful to decrease the size of the solvcon.dump.case.obj files or split up the loading.



yungyuc commented Nov 30, 2013

Original comment by Yung-Yu Chen (BitBucket: yungyuc, GitHub: yungyuc):


Does this happen in the 0.1.2+ version of SOLVCON?


yungyuc commented Dec 22, 2013

Original comment by Yung-Yu Chen (BitBucket: yungyuc, GitHub: yungyuc):


@david_b Do you have further update for this issue?


yungyuc commented Jan 31, 2014

Original comment by David Bilyeu (BitBucket: david_b, GitHub: david_b):


Not much; I couldn't get SOLVCON to run on the large-memory nodes. I cannot get gcc or Python to compile on them, and the default versions are too old for SOLVCON to work. I have a ticket in for the admins to provide an updated version of gcc, but it could take some time before it gets installed.


yungyuc commented Feb 1, 2014

Original comment by Yung-Yu Chen (BitBucket: yungyuc, GitHub: yungyuc):


As we discussed in the hangout, I think the first step to resolve this issue is
to create a tiny test case to study how we can shrink the dump file. 50 GB
is just unacceptable.
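To make the test-case idea concrete, here is a minimal sketch of one way to shrink a case dump: exclude the bulky mesh payload from the pickle via `__getstate__` so the case object carries only small metadata. The `Case` class and its attribute names are hypothetical stand-ins, not SOLVCON's actual API.

```python
import pickle

class Case:
    """Hypothetical stand-in for a case object with a large mesh payload."""

    def __init__(self, mesh_data, settings):
        self.mesh_data = mesh_data  # large array-like payload
        self.settings = settings    # small restart metadata

    def __getstate__(self):
        # Drop the bulky payload before pickling; a restart would
        # reload the mesh from the per-solver dump files instead.
        state = self.__dict__.copy()
        state["mesh_data"] = None
        return state

case = Case(mesh_data=list(range(1_000_000)), settings={"dt": 0.1})
slim_dump = pickle.dumps(case)
restored = pickle.loads(slim_dump)
# slim_dump stays tiny because mesh_data is excluded;
# restored.settings survives, restored.mesh_data is None.
```

A tiny test case along these lines would let us measure which attributes dominate the 50 GB dump before touching the real restart path.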


yungyuc commented May 21, 2014

Original comment by Yung-Yu Chen (BitBucket: yungyuc, GitHub: yungyuc):


@david_b any update on this issue? A small, reproducing test case would be the next step.


yungyuc commented Jul 12, 2014

Original comment by David Bilyeu (BitBucket: david_b, GitHub: david_b):


Sorry, I don't have any update on this issue. I was unable to compile the
required version of gcc on the large-memory node and am waiting for the
system admins to install a newer version.
This issue should be reproducible if you try to load a restart file that is
larger than the available memory. If this is run on a single computer, then
virtual memory would need to be used during the simulation. I have never done a
simulation that required virtual memory before, so I don't know how Python
would handle it. Another possibility would be to limit the amount of
available memory on the system, e.g. load a large file into ParaView.

David


yungyuc commented Jul 13, 2014

Original comment by Yung-Yu Chen (BitBucket: yungyuc, GitHub: yungyuc):


@david_b I think I may be able to try that on AWS.


yungyuc commented Nov 25, 2014

Original comment by Yung-Yu Chen (BitBucket: yungyuc, GitHub: yungyuc):


v0.1.3 will focus on a sequential run. Let's postpone this to later releases.
