ch3: valgrind "Syscall param writev points to uninitialised byte(s)" #6843

jczhang07 opened this issue Dec 13, 2023 · 0 comments

$ /scratch/jczhang/petsc/arch-kokkos-dbg/bin/mpichversion 
MPICH Version:      4.1.2
MPICH Release date: Wed Jun  7 15:22:45 CDT 2023
MPICH ABI:          15:1:3
MPICH Device:       ch3:sock
MPICH configure:    --prefix=/scratch/jczhang/petsc/arch-kokkos-dbg MAKE=/usr/bin/gmake --libdir=/scratch/jczhang/petsc/arch-kokkos-dbg/lib CC=gcc CFLAGS=-fPIC -Wno-lto-type-mismatch -Wno-stringop-overflow -g -O0 AR=/usr/bin/ar ARFLAGS=cr CXX=g++ CXXFLAGS=-Wno-lto-type-mismatch -Wno-psabi -g -O0 -std=gnu++20 -fPIC FFLAGS=-fPIC -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -g -O0 -fallow-argument-mismatch FC=gfortran F77=gfortran FCFLAGS=-fPIC -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -g -O0 -fallow-argument-mismatch --enable-shared --with-pm=hydra --disable-java --with-hwloc=embedded --enable-fast=no --enable-error-messages=all --with-device=ch3:sock --enable-g=meminit,dbg PYTHON=/usr/bin/python3 --disable-maintainer-mode --disable-dependency-tracking
MPICH CC:           gcc -fPIC -Wno-lto-type-mismatch -Wno-stringop-overflow -g -O0   -O0
MPICH CXX:          g++ -Wno-lto-type-mismatch -Wno-psabi -g -O0 -std=gnu++20 -fPIC  -O0
MPICH F77:          gfortran -fPIC -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -g -O0 -fallow-argument-mismatch  -O0
MPICH FC:           gfortran -fPIC -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -g -O0 -fallow-argument-mismatch  -O0

Use this MPICH build to compile the MPI_Bcast example at https://rookiehpc.org/mpi/docs/mpi_bcast/index.html, then run it under valgrind:

$ /scratch/jczhang/petsc/arch-kokkos-dbg/bin/mpirun  -n 2 valgrind ./ex100
==1866237== Memcheck, a memory error detector
==1866237== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1866237== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==1866237== Command: ./ex100
==1866237== 
==1866238== Memcheck, a memory error detector
==1866238== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1866238== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==1866238== Command: ./ex100
==1866238== 
[MPI process 0] I am the broadcast root, and send value 12345.
==1866237== Syscall param writev(vector[0]) points to uninitialised byte(s)
==1866237==    at 0x5191867: writev (writev.c:26)
==1866237==    by 0x4DA2EA8: MPL_large_writev (mpl_sock.c:31)
==1866237==    by 0x4D75A8B: MPIDI_CH3I_Socki_handle_write (sock.c:3688)
==1866237==    by 0x4D74834: MPIDI_CH3I_Sock_wait (sock.c:3367)
==1866237==    by 0x4D7D09B: MPIDI_CH3i_Progress_wait (ch3_progress.c:173)
==1866237==    by 0x4D7F248: MPIDI_CH3I_Progress (ch3_progress.c:808)
==1866237==    by 0x4C30234: MPIR_Wait_state (request_impl.c:885)
==1866237==    by 0x4C3036F: MPIR_Wait_impl (request_impl.c:908)
==1866237==    by 0x4BAC801: MPID_Wait (mpidpost.h:273)
==1866237==    by 0x4BACA58: MPIC_Wait (helper_fns.c:63)
==1866237==    by 0x4BACD35: MPIC_Send (helper_fns.c:130)
==1866237==    by 0x4ABC5D5: MPIR_Bcast_intra_binomial (bcast_intra_binomial.c:146)
==1866237==  Address 0x5b5cd6c is 60 bytes inside a block of size 176 alloc'd
==1866237==    at 0x4848889: malloc (in /home/jczhang/homebrew/Cellar/valgrind/3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1866237==    by 0x4D6A6D7: MPL_malloc (mpl_trmem.h:373)
==1866237==    by 0x4D6A92D: MPIDI_CH3I_Connection_alloc (ch3u_connect_sock.c:150)
==1866237==    by 0x4D6D243: MPIDI_CH3I_Sock_connect (ch3u_connect_sock.c:1070)
==1866237==    by 0x4D6D0C7: MPIDI_CH3I_VC_post_sockconnect (ch3u_connect_sock.c:1016)
==1866237==    by 0x4D7EB8D: MPIDI_CH3I_VC_post_connect (ch3_progress.c:652)
==1866237==    by 0x4D7A1A4: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:154)
==1866237==    by 0x4D29ABC: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:262)
==1866237==    by 0x4D4C7FB: MPID_Send (mpid_send.c:119)
==1866237==    by 0x4D4CCF9: MPID_Send_coll (mpid_send.c:206)
==1866237==    by 0x4BACC93: MPIC_Send (helper_fns.c:126)
==1866237==    by 0x4ABC5D5: MPIR_Bcast_intra_binomial (bcast_intra_binomial.c:146)
==1866237== 
[MPI process 1] I am a broadcast receiver, and obtained value 12345.
==1866237== Syscall param write(buf) points to uninitialised byte(s)
==1866237==    at 0x518B697: write (write.c:26)
==1866237==    by 0x4D72FE1: MPIDI_CH3I_Sock_write (sock.c:2614)
==1866237==    by 0x4D79B72: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68)
==1866237==    by 0x4CA39BD: MPIDI_CH3U_VC_SendClose (ch3u_handle_connection.c:237)
==1866237==    by 0x4D6931B: MPIDI_PG_Close_VCs (mpidi_pg.c:973)
==1866237==    by 0x4D3A9EF: MPID_Finalize (mpid_finalize.c:99)
==1866237==    by 0x4C228B8: MPII_Finalize (mpir_init.c:399)
==1866237==    by 0x4C22B5F: MPIR_Finalize_impl (mpir_init.c:454)
==1866237==    by 0x49C8727: internal_Finalize (finalize.c:37)
==1866237==    by 0x49C879E: PMPI_Finalize (finalize.c:83)
==1866237==    by 0x10927B: main (ex100.c:40)
==1866237==  Address 0x1ffeffcdd8 is on thread 1's stack
==1866237==  in frame #3, created by MPIDI_CH3U_VC_SendClose (ch3u_handle_connection.c:199)
==1866237== 
==1866238== Syscall param write(buf) points to uninitialised byte(s)
==1866238==    at 0x518B697: write (write.c:26)
==1866238==    by 0x4D72FE1: MPIDI_CH3I_Sock_write (sock.c:2614)
==1866238==    by 0x4D79B72: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68)
==1866238==    by 0x4CA39BD: MPIDI_CH3U_VC_SendClose (ch3u_handle_connection.c:237)
==1866238==    by 0x4D6931B: MPIDI_PG_Close_VCs (mpidi_pg.c:973)
==1866238==    by 0x4D3A9EF: MPID_Finalize (mpid_finalize.c:99)
==1866238==    by 0x4C228B8: MPII_Finalize (mpir_init.c:399)
==1866238==    by 0x4C22B5F: MPIR_Finalize_impl (mpir_init.c:454)
==1866238==    by 0x49C8727: internal_Finalize (finalize.c:37)
==1866238==    by 0x49C879E: PMPI_Finalize (finalize.c:83)
==1866238==    by 0x10927B: main (ex100.c:40)
==1866238==  Address 0x1ffeffcdd8 is on thread 1's stack
==1866238==  in frame #3, created by MPIDI_CH3U_VC_SendClose (ch3u_handle_connection.c:199)
==1866238== 
==1866237== Syscall param write(buf) points to uninitialised byte(s)
==1866237==    at 0x518B697: write (write.c:26)
==1866237==    by 0x4D72FE1: MPIDI_CH3I_Sock_write (sock.c:2614)
==1866237==    by 0x4D79B72: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:68)
==1866237==    by 0x4CA3B1B: MPIDI_CH3_PktHandler_Close (ch3u_handle_connection.c:274)
==1866237==    by 0x4D7DA5E: MPIDI_CH3I_Progress_handle_sock_event (ch3_progress.c:397)
==1866237==    by 0x4D7D121: MPIDI_CH3i_Progress_wait (ch3_progress.c:185)
==1866237==    by 0x4D7F248: MPIDI_CH3I_Progress (ch3_progress.c:808)
==1866237==    by 0x4CA3CFC: MPIDI_CH3U_VC_WaitForClose (ch3u_handle_connection.c:356)
==1866237==    by 0x4D3AA72: MPID_Finalize (mpid_finalize.c:104)
==1866237==    by 0x4C228B8: MPII_Finalize (mpir_init.c:399)
==1866237==    by 0x4C22B5F: MPIR_Finalize_impl (mpir_init.c:454)
==1866237==    by 0x49C8727: internal_Finalize (finalize.c:37)
==1866237==  Address 0x1ffeffccf8 is on thread 1's stack
==1866237==  in frame #3, created by MPIDI_CH3_PktHandler_Close (ch3u_handle_connection.c:259)
==1866237== 
==1866238== 
==1866238== HEAP SUMMARY:
==1866238==     in use at exit: 149 bytes in 3 blocks
==1866238==   total heap usage: 9,524 allocs, 9,513 frees, 6,207,412 bytes allocated
==1866238== 
==1866237== 
==1866237== HEAP SUMMARY:
==1866237==     in use at exit: 149 bytes in 3 blocks
==1866237==   total heap usage: 9,546 allocs, 9,533 frees, 6,221,330 bytes allocated
==1866237== 
==1866238== LEAK SUMMARY:
==1866238==    definitely lost: 128 bytes in 1 blocks
==1866238==    indirectly lost: 21 bytes in 2 blocks
==1866238==      possibly lost: 0 bytes in 0 blocks
==1866238==    still reachable: 0 bytes in 0 blocks
==1866238==         suppressed: 0 bytes in 0 blocks
==1866238== Rerun with --leak-check=full to see details of leaked memory
==1866238== 
==1866238== Use --track-origins=yes to see where uninitialised values come from
==1866238== For lists of detected and suppressed errors, rerun with: -s
==1866238== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==1866237== LEAK SUMMARY:
==1866237==    definitely lost: 128 bytes in 1 blocks
==1866237==    indirectly lost: 21 bytes in 2 blocks
==1866237==      possibly lost: 0 bytes in 0 blocks
==1866237==    still reachable: 0 bytes in 0 blocks
==1866237==         suppressed: 0 bytes in 0 blocks
==1866237== Rerun with --leak-check=full to see details of leaked memory
==1866237== 
==1866237== Use --track-origins=yes to see where uninitialised values come from
==1866237== For lists of detected and suppressed errors, rerun with: -s
==1866237== ERROR SUMMARY: 4 errors from 3 contexts (suppressed: 0 from 0)
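
For readers unfamiliar with this class of Memcheck report: the write(buf) errors above point at stack addresses in frames created by MPIDI_CH3U_VC_SendClose and MPIDI_CH3_PktHandler_Close. One common way to get such a report is to fill in a stack-allocated struct field by field and then pass the whole object to write(), so padding or unused members are still undefined. Whether that is what happens inside MPICH is an assumption; the trace does not confirm it. The sketch below only illustrates the pattern. The type demo_pkt_t and both functions are hypothetical and are not MPICH code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical packet layout for illustration only (NOT the MPICH definition).
 * On typical ABIs the compiler inserts padding after `type`, and `unused`
 * stands in for space reserved for larger packet kinds. */
typedef struct {
    uint8_t  type;
    uint32_t ack;
    uint8_t  unused[24];
} demo_pkt_t;

/* Sets only the fields it cares about, then writes the whole struct.
 * Padding and `unused` are never written, so Memcheck would be expected to
 * report "Syscall param write(buf) points to uninitialised byte(s)". */
ssize_t send_pkt_partial(int fd, uint32_t ack)
{
    demo_pkt_t pkt;

    assert(fd >= 0);
    pkt.type = 1;
    pkt.ack = ack;
    return write(fd, &pkt, sizeof pkt);
}

/* Same packet, but the whole object is zeroed first, so every byte handed
 * to write() is defined. */
ssize_t send_pkt_zeroed(int fd, uint32_t ack)
{
    demo_pkt_t pkt;

    assert(fd >= 0);
    memset(&pkt, 0, sizeof pkt);
    pkt.type = 1;
    pkt.ack = ack;
    return write(fd, &pkt, sizeof pkt);
}
```

Calling send_pkt_partial on one end of a pipe() from a small main and running the program under valgrind should reproduce the same kind of warning, while send_pkt_zeroed should stay quiet. The receiver never interprets the undefined bytes in either case, which is why reports like this are often harmless noise rather than a functional bug.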
@hzhou changed the title from 'Valgrind "Syscall param write(buf) points to uninitialised byte(s)" with MPI_Bcast' to 'ch3: valgrind "Syscall param writev points to uninitialised byte(s)"' on Apr 10, 2024