Resampling a very big file with nbodykit #646
Comments
Hi,

The current resampling algorithm in pmesh does more or less what you want to do. It calls into https://github.com/rainwoodman/pmesh/blob/f8f1d78eb22027fbcf71345d382477cea25ab9e3/pmesh/pm.py#L479

What you do appears to be correct, but the memory is actually a bit tight. The resampling algorithm uses a lot of memory -- it was never very optimized. It was originally written to handle 128/256 meshes under extremely low memory load (in the over-decomposed regime), I think.

Scanning the code in pmesh, I see scratch space for r2c (1x), for the indices (2 to 4 long integers, so 4x or 8x of floats), for the global sort (roughly 2x), for c2r (1x), for the output (1x), and some masks that can probably be ignored. So 512 GB appears to be a bit tight (13 * 40 GB = 520 GB). Since more nodes usually mean faster processing anyway, give 32 nodes a try if you can get them. We could also try to curb the memory usage of resample; it is likely that some of the objects can be eagerly freed with del, for example.

Another way to run into OOM is using so many ranks that 1152x1152 is not divisible by the process mesh; in that case some ranks will not receive any part of the mesh. That seems unlikely here, as you only have 16 nodes (and at most 16*64 = 32x32 ranks; how many ranks did you run the script with?), and 1152 = 128 * 9.
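To put rough numbers on that (the ~13x scratch factor is the estimate above; the 40 GB mesh size is the value from this thread, and the process-mesh sizes below are just examples):

```python
# Back-of-envelope version of the estimate above.
mesh_gb = 40.0          # size of the 2304^3 input mesh, as reported
scratch_factor = 13     # rough sum of the temporary copies listed above
print("estimated peak usage: ~%.0f GB (vs. 512 GB requested)" % (scratch_factor * mesh_gb))

# The other OOM mode: if the 2-d process mesh does not divide 1152 x 1152,
# some ranks receive no slab at all. 1152 = 128 * 9, so e.g. 32 x 32 is fine.
Nmesh_new = 1152
for nside in (32, 40):
    print("process mesh %dx%d divides evenly:" % (nside, nside), Nmesh_new % nside == 0)
```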
Thank you -- that's good to know! I am currently using, I believe, 64 ranks (2 tasks per node and 32 nodes):
I am not super familiar with the desc environment. Could you do a conda list? Is this the environment you built? Did you check whether the results of these commands make sense to you?
and
(A year ago they were giving unexpected results.) When running on NERSC computing nodes with MPI, I usually use the bcast-based environment provided by m3035:
Hi Yu Feng,

I will attempt to follow those instructions in more detail if I run into more issues in the future, but for now it seems like mpirun is working properly across nodes and outputs different host names!

I was more worried about whether the code is properly parallelized when I do manipulations on a bigfile mesh, e.g. mesh.paint('real'), mesh.apply(filter), ArrayMesh(numpy_array), FFTPower(mesh), and mesh.save(). I am slightly confused by the documentation about which functions are and are not parallelized, but I may also be misreading!

Cheers,
Boryana
Good to know! I've seen right hostnames but wrong comm.rank values before;
so definitely check that too.
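A minimal check along those lines (launch it with the same launcher and rank count as the real job; the script name and flags below are just examples):

```python
# check_ranks.py -- print rank, size, and hostname from every MPI rank, e.g.
#   srun -N 32 --ntasks-per-node 2 python check_ranks.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("rank %d of %d on %s" % (comm.rank, comm.size, MPI.Get_processor_name()))
```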
When we designed nbodykit, all functions were intended to work on data distributed across multiple ranks (parallelized), unless specifically mentioned otherwise or we are hitting a bug / scaling bottleneck.

preview() is the only one that does not scale well that pops into my mind (visualization must be single-headed). HaloCatalog used to be bad due to its dependency on halotools, but that got fixed last year. Can't think of anything else right now.
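For instance, a typical distributed use of these pieces looks roughly like this sketch (the path and dataset name are placeholders, not your actual files):

```python
from nbodykit.lab import BigFileMesh, FFTPower

# every call below is collective; each rank only ever holds its own slab
mesh = BigFileMesh('path/to/mesh_2304', dataset='Field')   # placeholder path / dataset

rfield = mesh.paint(mode='real')      # distributed RealField
print(rfield.cmean())                 # collective mean, same value on every rank

power = FFTPower(mesh, mode='1d')     # parallel FFT and binning
power.save('power_2304.json')         # root rank writes the result
```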
- Yu
Thanks a lot for your help! This is all very useful to know!
Hi Yu Feng,
I am trying to resample a mesh stored in a bigfile from 2304^3 (40 GB) to 1152^3 and save it again to a bigfile mesh using a parallel script (on NERSC).
So far I have tried doing the following, granting the task 512 GB spread across 16 nodes, but got an out of memory error. Here is the script:
(The failure happens after loading the bigfile. Also, I have very slightly modified the save function in the source code to let the user specify the new downsampled mesh size Nmesh_new, by passing Nmesh=Nmesh to the self.compute(mode, Nmesh=Nmesh) line in save; I have tested through a simple example that this works.)
The failure of this line makes me think that either the save step is not parallelized, or that the self._paint_XXX function isn't when resampling (i.e. when Nmesh is not equal to the Nmesh of the original mesh).
A new approach that I intend to pursue is loading the bigfile mesh, converting it to a complex field through the paint command (as I believe that uses pfft), zeroing all modes higher than pi * Nmesh_new / Lbox, and then reverting back to real space through the paint command (and saving the downsampled field to a bigfile). Is that approach parallelized? And if not, what functions can I apply to make it so?
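Concretely, what I have in mind is something like this sketch (the paths, dataset name, Lbox and Nmesh_new are placeholders; I realize zeroing modes by itself does not shrink the grid, so the final decimation to 1152^3 would still go through the resampling step):

```python
import numpy as np
from nbodykit.lab import BigFileMesh, FieldMesh

Lbox = 2000.0                      # box size in Mpc/h (placeholder -- read from the header)
Nmesh_new = 1152
k_nyq = np.pi * Nmesh_new / Lbox   # Nyquist frequency of the target mesh

mesh = BigFileMesh('path/to/mesh_2304', dataset='Field')   # placeholder path / dataset

def lowpass(k, v):
    # zero every mode at or above the target Nyquist along any axis;
    # k is a list of broadcastable wavenumber arrays, v is the complex field
    keep = np.ones(v.shape, dtype=bool)
    for ki in k:
        keep &= (abs(ki) < k_nyq)
    return v * keep

filtered = mesh.apply(lowpass, mode='complex', kind='wavenumber')

# paint back to real space and save; the band-limited field is still 2304^3,
# so the actual decimation to 1152^3 still needs the resampling (Nmesh) step
rfield = filtered.paint(mode='real')
FieldMesh(rfield).save('path/to/mesh_1152', dataset='Field', mode='real')
```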
My nbodykit version is 0.3.15