Bug: Zero-Sized (NULL) Blocks and Operators #3460

Closed
ax3l opened this issue Feb 2, 2023 · 8 comments

@ax3l
Contributor

ax3l commented Feb 2, 2023

On Summit with ADIOS 2.8.1, writes crash when writing zero-sized (null) blocks with c-blosc 1.21.0 compression enabled.

Crash in void adios2::format::BPSerializer::PutOperationPayloadInBuffer<unsigned long>(adios2::core::Variable<unsigned long> const&, adios2::core::Variable<unsigned long>::BPInfo const&)+0xd8:

 3: /sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.3.0/adios2-2.8.1-r37wqwbpanpyyelbijmm5qj27v6xe4x5/lib64/../lib64/libadios2_core.so.2(_ZN6adios26format12BPSerializer27PutOperationPayloadInBufferImEEvRKNS_4core8VariableIT_EERKNS6_6BPInfoE+0xd8) [0x2000378725d8]
    ?? ??:0

 4: /sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.3.0/adios2-2.8.1-r37wqwbpanpyyelbijmm5qj27v6xe4x5/lib64/../lib64/libadios2_core.so.2(_ZN6adios24core6engine9BP4Writer13PutSyncCommonImEEvRNS0_8VariableIT_EERKNS6_6BPInfoEb+0x130) [0x2000377e3820]
    ?? ??:0

 5: /sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.3.0/adios2-2.8.1-r37wqwbpanpyyelbijmm5qj27v6xe4x5/lib64/../lib64/libadios2_core.so.2(_ZN6adios24core6engine9BP4Writer16PerformPutCommonImEEvRNS0_8VariableIT_EE+0xec) [0x2000377eaafc]
    ?? ??:0

 6: /sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.3.0/adios2-2.8.1-r37wqwbpanpyyelbijmm5qj27v6xe4x5/lib64/../lib64/libadios2_core.so.2(_ZN6adios24core6engine9BP4Writer11PerformPutsEv+0x81c) [0x2000377dae8c]
    ?? ??:0

 7: /sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-9.3.0/adios2-2.8.1-r37wqwbpanpyyelbijmm5qj27v6xe4x5/lib64/libadios2_cxx11.so.2(_ZN6adios26Engine11PerformPutsEv+0xf0) [0x20002696a780]
    ?? ??:0

Compression parameters used:

operator.type = blosc
operator.parameters.compressor = zstd
operator.parameters.clevel = 1
operator.parameters.doshuffle = BLOSC_BITSHUFFLE
operator.parameters.threshold = 2048
operator.parameters.nthreads = 6

attn @pnorbert @guj
cc @franzpoeschel

X-Ref.:

@ax3l
Contributor Author

ax3l commented Feb 3, 2023

Latest release, ADIOS 2.8.3 w/ c-blosc 1.21.0, fails with the same segfault.

Testing master as of 98bef0e and c-blosc 2.6.1 next.

@ax3l
Contributor Author

ax3l commented Feb 3, 2023

Compile error in SST as of master:

/ccs/home/huebl/src/adios2/source/adios2/toolkit/sst/dp/ucx_dp.c: In function 'UcxReadRemoteMemory':
/ccs/home/huebl/src/adios2/source/adios2/toolkit/sst/dp/ucx_dp.c:457:5: error: unknown type name 'ucp_request_param_t'
  457 |     ucp_request_param_t param;
      |     ^~~~~~~~~~~~~~~~~~~
/ccs/home/huebl/src/adios2/source/adios2/toolkit/sst/dp/ucx_dp.c:458:10: error: request for member 'op_attr_mask' in something not a structure or union
  458 |     param.op_attr_mask = 0;
      |          ^
/ccs/home/huebl/src/adios2/source/adios2/toolkit/sst/dp/ucx_dp.c:459:16: warning: implicit declaration of function 'ucp_get_nbx'; did you mean 'ucp_get_nb'? [-Wimplicit-function-declaration]
  459 |     ret->req = ucp_get_nbx(RS_Stream->WriterEP[Rank], Buffer, Length,
      |                ^~~~~~~~~~~
      |                ucp_get_nb
/ccs/home/huebl/src/adios2/source/adios2/toolkit/sst/dp/ucx_dp.c:459:14: warning: assignment to 'ucs_status_ptr_t' {aka 'void *'} from 'int' makes pointer from integer without a cast [-Wint-conversion]
  459 |     ret->req = ucp_get_nbx(RS_Stream->WriterEP[Rank], Buffer, Length,
      |              ^
gmake[2]: *** [source/adios2/toolkit/sst/CMakeFiles/sst.dir/build.make:188: source/adios2/toolkit/sst/CMakeFiles/sst.dir/dp/ucx_dp.c.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:3830: source/adios2/toolkit/sst/CMakeFiles/sst.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....

Will disable SST (#3463) to proceed.

@ax3l
Contributor Author

ax3l commented Feb 3, 2023

The master branch as of 98bef0e with c-blosc 2.6.1 segfaults on the same routines:

 3: /ccs/home/huebl/sw/summit/adios2-master-c-blosc-2.6.1/lib64/../lib64/libadios2_core.so.2(_ZN6adios26format12BPSerializer27PutOperationPayloadInBufferImEEvRKNS_4core8VariableIT_EERKNS6_6BPInfoE+0xd8) [0x200037851388]
    ?? ??:0

 4: /ccs/home/huebl/sw/summit/adios2-master-c-blosc-2.6.1/lib64/../lib64/libadios2_core.so.2(_ZN6adios24core6engine9BP4Writer13PutSyncCommonImEEvRNS0_8VariableIT_EERKNS6_6BPInfoEb+0x130) [0x2000377c4980]
    ?? ??:0

 5: /ccs/home/huebl/sw/summit/adios2-master-c-blosc-2.6.1/lib64/../lib64/libadios2_core.so.2(_ZN6adios24core6engine9BP4Writer16PerformPutCommonImEEvRNS0_8VariableIT_EE+0xec) [0x2000377cbc5c]
    ?? ??:0

 6: /ccs/home/huebl/sw/summit/adios2-master-c-blosc-2.6.1/lib64/../lib64/libadios2_core.so.2(_ZN6adios24core6engine9BP4Writer11PerformPutsEv+0x81c) [0x2000377bb83c]
    ?? ??:0

 7: /ccs/home/huebl/sw/summit/adios2-master-c-blosc-2.6.1/lib64/libadios2_cxx11.so.2(_ZN6adios26Engine11PerformPutsEv+0xf0) [0x20002696eaf0]
    ?? ??:0

@ax3l
Contributor Author

ax3l commented Feb 3, 2023

I think the segfault is here:

@ax3l
Contributor Author

ax3l commented Feb 3, 2023

Tried #3465, but void adios2::core::engine::BP4Writer::PutSyncCommon<unsigned long>(adios2::core::Variable<unsigned long>&, adios2::core::Variable<unsigned long>::BPInfo const&, bool) still segfaults. I will have to stop for now; happy if someone can write a small test and take it over. :)

@anagainaru
Contributor

@ax3l we merged a fix for BP4. Could you try running this again and see whether you still get an error?

@pnorbert
Contributor

PR #3542 should fix this bug.

@vicentebolea vicentebolea modified the milestones: v2.9.0, v2.9.1 Mar 29, 2023
@vicentebolea vicentebolea removed this from the v2.9.1 milestone Jul 7, 2023
@pnorbert
Contributor

Can we have a test for this? Zero blocks and compression.
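
For anyone picking up the test request, here is a rough, library-free sketch of the behaviour such a test would pin down: an empty block (count of zero, possibly with a null pointer) should pass through the write path without ever reaching the compression operator. This is not the ADIOS2 code and not the change made in #3542. `FakeCompress` and `PutBlock` are hypothetical names made up for illustration, and the "compressor" just copies bytes, so treat it as a model of the guard rather than of the actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a compression operator (NOT the ADIOS2 or
// c-blosc API). Like many C-style routines it requires a valid source
// pointer; here it simply appends the raw bytes to the output buffer.
std::size_t FakeCompress(const std::uint64_t *src, std::size_t count,
                         std::vector<char> &out)
{
    assert(src != nullptr && "operator called with a null payload");
    const std::size_t bytes = count * sizeof(std::uint64_t);
    const std::size_t offset = out.size();
    out.resize(offset + bytes);
    std::memcpy(out.data() + offset, src, bytes);
    return bytes;
}

// Hypothetical write path: appends one block to the buffer and returns the
// number of bytes appended. Empty blocks skip the operator entirely.
std::size_t PutBlock(const std::uint64_t *data, std::size_t count,
                     std::vector<char> &out)
{
    if (count == 0)
    {
        return 0; // zero-sized block: nothing to compress, buffer untouched
    }
    if (data == nullptr)
    {
        throw std::invalid_argument("non-empty block with null data pointer");
    }
    return FakeCompress(data, count, out);
}
```

A real regression test would do the equivalent through the ADIOS2 API: define a variable with a compression operator attached, have at least one rank contribute a zero-sized block, and check that the write completes and the data reads back.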
