Issue #3: Writing Large Files
Comments
I will take a look, but it may be a day or two until I can get you a
proper response. Please let me know if you find out anything new in the
meantime.
- Rhys
On Dec 13, 2016 10:26 AM, "clarkpede" wrote:
I have encountered a problem writing a large array using a single processor.
I'm running a serial code that works with a large array. When I work
with smaller arrays (64x64x64, for example), the following example works
fine: my *.h5 files contain 1e-6 in every position, as they should. But
when I bump up the size, my *.h5 file output contains only 0's.
Here's my minimal working example:
program ESIO_test
  use, intrinsic :: iso_c_binding
  use mpi
  use esio
  implicit none

  integer :: myrank, nprocs, ierr
  real(C_DOUBLE) :: Udata(2,1024,1024,512,3)
  type(esio_handle) :: h

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, myrank, ierr)
  call mpi_comm_size(MPI_COMM_WORLD, nprocs, ierr)

  call esio_handle_initialize(h, MPI_COMM_WORLD)
  call esio_file_create(h,"/work/04114/clarkp/lonestar/fields/512/PS/restart00000000.h5",.true.)

  Udata = 1e-6
  call esio_field_establish(h, 1024, 1, 1024, 1024, 1, 1024, 512, 1, 512, ierr)
  call esio_field_writev_double(h, "u", Udata(:,:,:,:,1), 2)
  call esio_field_writev_double(h, "v", Udata(:,:,:,:,2), 2)
  call esio_field_writev_double(h, "w", Udata(:,:,:,:,3), 2)
  call mpi_barrier(MPI_COMM_WORLD, ierr)

  call esio_file_close(h)
  call esio_handle_finalize(h)

  call mpi_finalize(ierr)
end program ESIO_test
I also modified this example to check the "ierr" flags at each step, but
they remained 0.
Yes, I know that ESIO is really meant for parallel reading/writing, and
yes, I know that parallelizing the rest of my code would fix the problem.
But up until now the serial portion of the code has worked fine, and even
with large arrays it only takes a minute to run. While switching to plain
HDF5 library calls and/or making the code parallel might be better, both
would require time to rewrite code and add extra complexity. I'd prefer
to use a serial code if I can get away with it.
System Information:
My compiler is ifort (IFORT) 16.0.1 20151021 and I'm using the -fopenmp
flag.
I'm using the release branch 0.1.9 of ESIO.
I'm using Cray mpich 7.3.0
I'm running this on an interactive session on Lonestar 5 at TACC, with 1
node and 16 tasks allocated. I'm only running the above example with 1 MPI
task.
|
Any change in behavior if you try...
a) Adding the TARGET attribute to Udata?
b) Breaking Udata into Udata, Vdata, and Wdata (thereby dropping the last
"3" dimension)?
c) Writing scalar-valued data instead of 2-vectors (thereby dropping the
first "2" dimension)?
Hunch (a) is that somehow you're spilling into a different memory layout
based on the size of the array; because there's no TARGET attribute, I
think the compiler is free to do whatever it wants. Hunches (b) and (c) are
just wild guesses about funkiness in the dope vector, or attempts to reduce
the problem to a smaller test case.
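For concreteness, a minimal sketch of what variants (a) and (b) could look like, adapted from the example above; the separate Vdata/Wdata arrays and the output filename are purely illustrative, and variant (c) is only noted in a comment:

program ESIO_test_variants
  use, intrinsic :: iso_c_binding
  use mpi
  use esio
  implicit none

  integer :: ierr
  ! (a) TARGET attribute added; (b) one array per field instead of a trailing "3" index
  real(C_DOUBLE), target :: Udata(2,1024,1024,512)
  real(C_DOUBLE), target :: Vdata(2,1024,1024,512)
  real(C_DOUBLE), target :: Wdata(2,1024,1024,512)
  type(esio_handle) :: h

  call mpi_init(ierr)
  call esio_handle_initialize(h, MPI_COMM_WORLD)
  call esio_file_create(h, "variants.h5", .true.)   ! illustrative filename

  Udata = 1e-6
  Vdata = 1e-6
  Wdata = 1e-6

  call esio_field_establish(h, 1024, 1, 1024, 1024, 1, 1024, 512, 1, 512, ierr)
  ! Whole arrays are passed, so no array-section temporaries are involved.
  call esio_field_writev_double(h, "u", Udata, 2)
  call esio_field_writev_double(h, "v", Vdata, 2)
  call esio_field_writev_double(h, "w", Wdata, 2)
  ! (c) would instead drop the leading "2" dimension and pass a component
  !     count of 1, as in the later desktop example in this thread.

  call esio_file_close(h)
  call esio_handle_finalize(h)
  call mpi_finalize(ierr)
end program ESIO_test_variants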
Let me know what you find,
Rhys
|
There's no change if I apply (a), (b), or (c). Sorry. After some experimentation, I've found that this happens when I cross the threshold from 512x512x512 to 1024x1024x512. Therefore, if I break the array down into smaller blocks (such as 512x512x512 blocks) and write them individually, the code works. |
Any chance you can isolate the behavior to the particular compiler you are
using? That edge in sizes is bizarre.
- Rhys
|
I just tried it with gcc 4.9.3 and cray_mpich 7.3.0. I got the exact same result. ESIO stores all zeros for arrays that are 1024x1024x512, but stores the arrays properly for arrays that are 64x64x512. I've also tried using the development branch and releases 0.1.7 and 0.1.9 (all with the intel compiler). This problem doesn't appear to be version-dependent. |
Thanks. I will see what I can do. May be a few days.
- Rhys
|
Ok. I can work around this issue by splitting the third index (the 512 in the examples) into suitably small chunks, and the speed is only slightly slower when I do that. So there's not really a rush. |
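A sketch of that chunked-write workaround, for reference. It assumes the last (global, start, local) triplet passed to esio_field_establish maps to the slowest-varying Fortran dimension (the 512), and that the decomposition can be re-established before each partial write; neither detail is spelled out in this thread, and the chunk size and filename are arbitrary, so treat this as illustrative rather than authoritative:

program ESIO_test_chunked
  use, intrinsic :: iso_c_binding
  use mpi
  use esio
  implicit none

  integer, parameter :: nz = 512, chunk = 64   ! chunk size chosen arbitrarily
  integer :: ierr, kstart, klocal
  real(C_DOUBLE) :: Udata(2,1024,1024,nz)
  type(esio_handle) :: h

  call mpi_init(ierr)
  call esio_handle_initialize(h, MPI_COMM_WORLD)
  call esio_file_create(h, "chunked.h5", .true.)   ! illustrative filename
  Udata = 1e-6

  do kstart = 1, nz, chunk
    klocal = min(chunk, nz - kstart + 1)
    ! Re-establish so this pass covers only planes kstart..kstart+klocal-1
    call esio_field_establish(h, 1024, 1, 1024, 1024, 1, 1024, &
                              nz, kstart, klocal, ierr)
    call esio_field_writev_double(h, "u", Udata(:,:,:,kstart:kstart+klocal-1), 2)
  end do

  call esio_file_close(h)
  call esio_handle_finalize(h)
  call mpi_finalize(ierr)
end program ESIO_test_chunked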
Any updates on this issue? |
No news here. Can you reproduce with some MPI besides cray_mpich 7.3.0? |
I tested a modified example program on my desktop. A 512x512x512 array works fine, but a 1024x512x512 array gave the following error message:
I got the same error with Intel 16.0.0 compilers and GCC 5.2.0 compilers. I also tested both MPICH2 3.1.4 and OpenMPI 1.10.0. The modified Fortran program is:

program ESIO_test
  use, intrinsic :: iso_c_binding
  use mpi
  use esio
  implicit none

  integer :: ierr
  real(C_DOUBLE) :: Udata(1024,512,512)
  type(esio_handle) :: h

  call mpi_init(ierr)
  call esio_handle_initialize(h, MPI_COMM_WORLD)
  call esio_file_create(h, "output.h5", .true.)

  Udata = 1e-6
  call esio_field_establish(h, 1024, 1, 1024, 512, 1, 512, 512, 1, 512, ierr)
  call esio_field_writev_double(h, "u", Udata(:,:,:), 1)
  call mpi_barrier(MPI_COMM_WORLD, ierr)

  call esio_file_close(h)
  call esio_handle_finalize(h)
  call mpi_finalize(ierr)
end program ESIO_test

Have you been able to reproduce any of these problems yourself? |
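For reference, the reported sizes straddle a power-of-two boundary in the per-field payload: the cases that work write at most 1 GiB of doubles per scalar field, while the cases that fail write 2 GiB or more. A minimal sketch of the arithmetic (an observation only; nothing in this thread pins the failure on that boundary):

program size_check
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer(int64), parameter :: bytes_per_double = 8
  ! Bytes per scalar field for the cases reported in this thread
  print *, 'works:  512 x  512 x 512 ->', 512_int64*512*512*bytes_per_double    ! 2**30 (1 GiB)
  print *, 'fails: 1024 x  512 x 512 ->', 1024_int64*512*512*bytes_per_double   ! 2**31 (2 GiB)
  print *, 'fails: 1024 x 1024 x 512 ->', 1024_int64*1024*512*bytes_per_double  ! 2**32 (4 GiB)
end program size_check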
Additional information:
|
I am able to compile/run your reproducer against the develop branch. I also see...
...but see sensible data coming in on the stack...
...by which I mean the strides/sizes all seem to check out. At the entry to...
...which feels sane as...
I think the trick will be understanding why that absurd...
For posterity, my setup:
|
Realistically, I'm not going to have time to track this down. I'm sorry. Valgrind shows the ESIO layer clean at 512x512x512. Where is Valgrind complaining about things? This smells fishy at the HDF5 level. |
One thing I did not check was that the values passed in from Fortran are arriving at... |
|
Have you been able to confirm/deny that the sizes/parameters going into HDF5 are sane on your install? |
No, I haven't been able to confirm that. |