
Memory debugging using Valgrind tools

Ali.Abdolali edited this page Nov 9, 2021 · 7 revisions

VALGRIND INFO

Valgrind is a free programming tool for memory debugging, memory leak detection, and profiling.
The program is available as a module on NOAA RDHPC machines.

Description

  • Non-interactive tool for Linux environments
  • Runs the compiled binary, so it works with any programming language. It targets C/C++ but also works for Fortran.
  • Open source / free software

Pros

  • Extremely easy to set up and run
  • Widely available

Cons

  • False positives, especially with MPI
  • For MPI, Valgrind must run on every processor, although this is still a single call

What Errors does Valgrind/Memcheck Detect?

  • Reading/writing freed memory or incorrect memory areas
  • Uninitialized values
  • Incorrect freeing of memory, such as double freeing heap blocks
  • Misuse of functions for memory allocations: new(), malloc(), free(), deallocate(), etc.
  • Memory leaks - unintentional memory consumption often related to program logic flaws which lead to loss of memory pointers prior to deallocation.

Limitations of Valgrind

  • Does not perform bounds checking on arrays that are allocated statically or on the stack
  • Checks programs only at run time, so it may report no errors for a particular input set even though the program contains bugs
  • Consumes more memory (~2x)
  • Slows programs down (10x or more)
  • Optimized binaries can cause Valgrind to wrongly report uninitialized value errors

See the MPI Debugging link below for additional suggestions on MPI prep.

Useful Links:
Valgrind homepage
Quickstart Guide
Explanation of error messages
MPI Debugging (have not used before, but might be useful)


In order to utilize Valgrind, the following steps should be taken:

1. Prep: Compile in debug mode and run the programs up to the executable that you want to examine with Valgrind. Use -c intel_debug (required flags: -g -traceback -O0).
Use -q ww3_multi to get the ww3_multi executable.
See the sample prep job card below.

Sample prep job card:

#!/bin/sh --login
#SBATCH -n 1
#SBATCH -q debug
#SBATCH -t 00:30:00
#SBATCH -A marine-cpu
#SBATCH -J ww3_vlgd
#SBATCH -o prep_vlgd.out

cd <home>/WW3/regtests
module purge
module use /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/modulefiles/stack
module load hpc/1.1.0
module load hpc-intel/18.0.5.274
module load hpc-impi/2018.0.4
module load netcdf/4.7.4
module load jasper/2.0.25
module load zlib/1.2.11
module load png/1.6.35
module load hdf5/1.10.6
module load bacio/2.4.1
module load g2/3.4.1
module load w3nco/2.4.1
module load esmf/8_1_1

export NETCDF_CONFIG=$NETCDF_ROOT/bin/nc-config
export METIS_PATH=/scratch2/COASTAL/coastal/save/Ali.Abdolali/hpc-stack/parmetis-4.0.3
export JASPER_LIB=$JASPER_ROOT/lib64/libjasper.a
export PNG_LIB=$PNG_ROOT/lib64/libpng.a
export Z_LIB=$ZLIB_ROOT/lib/libz.a
export ESMFMKFILE=$ESMF_LIB/esmf.mk
export WW3_PARCOMPN=4

echo ' '
echo ' **********************************************'
echo ' *** WAVEWATCH III matrix of regression tests ***'
echo ' **********************************************'
echo ' '

./bin/run_test -b slurm -c intel_debug -S -T -s MPI -w work_1 -m grdset_c -f -p srun -n 1 -q ww3_multi -o all ../model <test>

echo ' '
echo ' **************************************************************'
echo ' * end of WAVEWATCH III matrix of regression tests *'
echo ' **************************************************************'
echo ' '

2. Run: Run the executable with Valgrind
valgrind --leak-check=full /<path-to-exe>/<executable> (e.g. WW3/model/exe/ww3_multi)
Other available flags:
--leak-check=full: "each individual leak will be shown in detail".
--show-leak-kinds=all: Show all of "definite, indirect, possible, reachable" leak kinds in the "full" report.
--track-origins=yes: Favor useful output over speed. This tracks the origins of uninitialized values, which could be very useful for memory errors. Consider turning off if Valgrind is unacceptably slow.
--verbose: Can tell you about the unusual behavior of your program. Repeat for more verbosity.
--log-file=<filename>: Write the report to a file. Useful when output exceeds terminal space.
--show-reachable=yes: Find absolutely every unpaired call to allocate/deallocate (equivalent to --show-leak-kinds=all).
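
The flags above can be combined in a single call. The sketch below is an illustration rather than part of the WW3 scripts: a hypothetical helper, valgrind_args, prints either a cheaper or a more thorough flag set, so that the extra cost of --track-origins=yes is only paid when it is needed. The helper name, the "fast" and "thorough" modes, and the log file name are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not part of WW3): print a Valgrind flag set.
#   $1: "fast" or "thorough"
#   $2: log file name passed to --log-file
valgrind_args() {
    args="--leak-check=full --show-leak-kinds=all --log-file=$2"
    # --track-origins=yes slows Valgrind down further, so add it only on request.
    if [ "$1" = "thorough" ]; then
        args="$args --track-origins=yes"
    fi
    printf '%s\n' "$args"
}

# Example: show the flags for a thorough run.
valgrind_args thorough vlgd.out
```

In a job card this could be used as: valgrind $(valgrind_args thorough vlgd.out) /<path-to-exe>/<executable>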

See the summary at the bottom of the output for some suggestions.
See the sample Valgrind run card below.
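
For MPI runs, Valgrind has to wrap every rank, but this is still a single launcher call. A minimal sketch, assuming Slurm's srun is the launcher (as in the job cards on this page): Valgrind's %q{VAR} substitution in --log-file inserts the value of an environment variable, and Slurm sets SLURM_PROCID for each task, so each rank writes its own report. The helper mpi_valgrind_cmd is hypothetical and only prints the command so it can be reviewed before use.

```shell
#!/bin/sh
# Hypothetical helper (not part of WW3): print an srun + Valgrind command line.
#   $1: number of MPI tasks
#   $2: path to the executable
mpi_valgrind_cmd() {
    # %% is a literal % for printf; Valgrind itself later expands
    # %q{SLURM_PROCID} to the rank number, giving one log file per rank.
    printf 'srun -n %s valgrind --leak-check=full --log-file=vlgd.%%q{SLURM_PROCID}.out %s\n' "$1" "$2"
}

# Example: command line for a 4-task run.
mpi_valgrind_cmd 4 ./ww3_multi
```

Writing one log per rank keeps the reports from interleaving, which makes it easier to tell MPI start-up noise from leaks in the model code.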

Sample valgrind run card:

#!/bin/sh --login
#SBATCH -n 1
#SBATCH -q batch
#SBATCH -t 1:00:00
#SBATCH -A marine-cpu
#SBATCH -J ww3_vlgd
#SBATCH -o prep_vlgd.out

cd <home>/WW3/regtests
module purge
module use /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/modulefiles/stack
module load hpc/1.1.0
module load hpc-intel/18.0.5.274
module load hpc-impi/2018.0.4
module load netcdf/4.7.4
module load jasper/2.0.25
module load zlib/1.2.11
module load png/1.6.35
module load hdf5/1.10.6
module load bacio/2.4.1
module load g2/3.4.1
module load w3nco/2.4.1
module load esmf/8_1_1
module load valgrind

export NETCDF_CONFIG=$NETCDF_ROOT/bin/nc-config
export METIS_PATH=/scratch2/COASTAL/coastal/save/Ali.Abdolali/hpc-stack/parmetis-4.0.3
export JASPER_LIB=$JASPER_ROOT/lib64/libjasper.a
export PNG_LIB=$PNG_ROOT/lib64/libpng.a
export Z_LIB=$ZLIB_ROOT/lib/libz.a
export ESMFMKFILE=$ESMF_LIB/esmf.mk
export WW3_PARCOMPN=4
export VGDIR=/regtests/<test>/work_1
export VGEXE=/WW3/model/exe

echo ' '
echo ' *********************************************'
echo ' *** WAVEWATCH III --*-- VALGRIND ***'
echo ' **********************************************'
echo ' '

cd ${VGDIR}
valgrind --leak-check=full --show-reachable=yes --log-file=vlgd.out ${VGEXE}/ww3_multi
#output report will be in outfile specified above (vlgd.out)

echo ' '
echo ' **************************************************************'
echo ' * end of WAVEWATCH III --*-- valgrind *'
echo ' **************************************************************'
echo ' '

3. Analyze: Start at the bottom of the report, with the LEAK SUMMARY.
Focus on "definitely lost" blocks, starting with the highest-yield items and/or the easiest to fix. Some will require more sleuthing around the code than others.
For memory leaks, you’re essentially looking for an ALLOCATE without a matching DEALLOCATE. Valgrind shows you the ALLOCATE; you need to assess whether it needs to be DEALLOCATE’d and, if so, where that statement should go (the line numbers in the Valgrind output refer to the source files in model/src/*.F90).
Loosely speaking, you are probably OK to skip past MPI-related items initially. MPI, especially its initialization, produces the most false positives.
For example, "Conditional jump or move depends on uninitialised value(s)" is typical for MPI init.
You will build intuition pretty quickly about what may be a ‘false positive’ and which items are likely to be easier than others.
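
To find the highest-yield items first, the "definitely lost" loss records can be ranked by size. A minimal sketch, assuming the report was written with --log-file (vlgd.out in the run card) and that loss-record headers follow Memcheck's usual form, "N bytes in M blocks are definitely lost in loss record X of Y". The function name rank_definite_leaks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list "definitely lost" loss records, largest first.
#   $1: Valgrind log file (written with --log-file)
rank_definite_leaks() {
    grep 'are definitely lost in loss record' "$1" |
        # Drop the "==PID==" prefix that Valgrind puts on every line.
        sed 's/^==[0-9]*== *//' |
        # Prepend the byte count without thousands separators as a sort key.
        awk '{ n = $1; gsub(/,/, "", n); print n "\t" $0 }' |
        sort -rn |
        # Remove the sort key again, leaving the original header text.
        cut -f2-
}

# Example: rank the leaks in the run card's log file, if it exists.
if [ -f vlgd.out ]; then
    rank_definite_leaks vlgd.out
fi
```

Each header printed here is followed in the log by its call stack, so searching the log for the "loss record X of Y" text leads to the ALLOCATE in question.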

Ideally, we would like to have zero leaks:

HEAP SUMMARY:
in use at exit: 0 bytes in 0 blocks
total heap usage: 636 allocs, 636 frees, 25,393 bytes allocated

All heap blocks were freed -- no leaks are possible

ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

In reality, the report usually looks more like this:
==145333==
==145333== HEAP SUMMARY:
==145333== in use at exit: 20,558 bytes in 7 blocks
==145333== total heap usage: 25 allocs, 18 frees, 32,653 bytes allocated
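
The gap between allocations and frees on the "total heap usage" line gives a quick count of allocations that were never freed. In the example above, 25 allocs and 18 frees leave 7, which matches the 7 blocks still in use at exit. A minimal sketch that reads this from a log file; the function name unfreed_allocs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (allocs - frees) from the HEAP SUMMARY.
#   $1: Valgrind log file
unfreed_allocs() {
    awk '/total heap usage:/ {
        # The number sits in the field just before "allocs," and "frees,".
        for (i = 1; i < NF; i++) {
            v = $i
            gsub(/,/, "", v)              # drop thousands separators
            if ($(i + 1) ~ /^allocs/) allocs = v
            if ($(i + 1) ~ /^frees/)  frees = v
        }
        print allocs - frees
    }' "$1"
}

# Example: check the run card's log file, if it exists.
if [ -f vlgd.out ]; then
    unfreed_allocs vlgd.out
fi
```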

Several kinds of leaks are reported:

  • "definitely lost": leaking memory -- fix it!
  • "possibly lost": generally indicates leaking memory -- fix it!
  • "indirectly lost": usually disappears once the "definitely lost" block that caused the indirect leak is fixed.
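
These categories can drive a simple pass/fail check. A minimal sketch, assuming the LEAK SUMMARY lines have Memcheck's usual form, "definitely lost: N bytes in M blocks". The function name bytes_to_fix is hypothetical. It sums the "definitely lost" and "possibly lost" bytes and leaves out "indirectly lost", since those usually clear once the direct leak is fixed.

```shell
#!/bin/sh
# Hypothetical helper: total bytes reported as definitely or possibly lost.
#   $1: Valgrind log file
bytes_to_fix() {
    awk '/definitely lost:|possibly lost:/ {
        # The byte count is the field right after "lost:".
        for (i = 1; i < NF; i++) {
            if ($i == "lost:") {
                b = $(i + 1)
                gsub(/,/, "", b)          # drop thousands separators
                total += b
            }
        }
    }
    END { print total + 0 }' "$1"
}

# Example: report on the run card's log file, if it exists.
if [ -f vlgd.out ]; then
    echo "bytes to fix: $(bytes_to_fix vlgd.out)"
fi
```

A result of 0 means the summary reports no definite or possible leaks; it says nothing about other Memcheck errors such as uninitialised values.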