Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Valgrind findings #8

Closed
maxaehle opened this issue Feb 23, 2022 · 3 comments · Fixed by #26
Closed

Valgrind findings #8

maxaehle opened this issue Feb 23, 2022 · 3 comments · Fixed by #26

Comments

@maxaehle
Copy link
Contributor

maxaehle commented Feb 23, 2022

In this issue I'd like to collect findings from applying valgrind to the Python interpreter while it loads and uses the gam-gate and gam-g4 modules.

This continues OpenGATE/Gate#480.

My current procedure (for the record)

Setup singularity container, download and build Geant4 etc.:
First clone gate-singularity and checkout the gam branch (commit eeb1ca2d), build the Singularity image, and inside the Singularity container, download/build GATE and its dependencies. This is described in more detail in the README.md. Exit and restart the container to update the environment.

gam-g4:
As described in the gam-gate readme.md,

cd /software
git clone --recurse-submodules https://github.com/OpenGATE/gam-g4.git 
cd gam-g4
pip3 install -e . # ignore the error
cd build/temp.*
cmake -DGeant4_DIR=/software/geant4/install/ -DPYTHON_EXECUTABLE=`which python3` -DPYTHON_INCLUDE_DIR=/usr/include/python3.6m -DPYTHON_LIBRARY=/usr/lib/libpython3.6m.so -DITK_DIR=/software/ITK/bin -DROOT_DIR=/software/root-cern/install -DCMAKE_CXX_FLAGS="-Wno-pedantic" -DCMAKE_BUILD_TYPE=Debug .
make -j 1

in order to create the Python module /software/gam-g4/gam_g4/gam_g4.cpython-36m-x86_64-linux-gnu.so. Exit and restart the container to update the environment.

gam-gate:
Then build the Python module gam-gate via

cd /software
git clone --recurse-submodules https://github.com/OpenGATE/gam-gate.git # having git-lfs installed
cd gam-gate
pip3 install -e .

gatetools:
In addition to the gam-gate readme.md, we need the latest gatetools package for Python:

cd /software
git clone --recursive https://github.com/OpenGATE/GateTools.git
cd GateTools
pip install -e .

run tests:
Now we can run the individual tests, e.g.

python3 /software/gam-gate/gam_tests/src/test001_G4ThreeVector.py

To run with valgrind, e.g.

cd /software/gam-gate/gam_tests/src/
valgrind --leak-check=full --track-origins=yes --log-file=valg --suppressions=$ROOTSYS/etc/valgrind-root.supp python3 test001_G4ThreeVector.py
@maxaehle
Copy link
Contributor Author

Running the test004_double_WIP.py testcase in valgrind/vgdb, I found that the initialization of the random number generator involves an uninitialized value:

In Simulation.py:282, we have the following Python call:

g4.G4Random.setTheSeeds(self.actual_random_seed, 0)

where self.actual_random_seed is a Python integer.

According to the binding in pyRandomize.cpp:28,

py::class_<HepRandom>(m, "G4Random")
// other def's ...
.def("setTheSeeds", &HepRandom::setTheSeeds);

this calls the following C++ function in CLHEP's Random.cc:

void HepRandom::setTheSeeds(const long* seeds, int aux)
{
  theDefaults().theEngine->setSeeds(seeds,aux);
}

Note that the first argument has the C++ type const long*! This type is typically used to pass a read-only long array. Indeed, in my setup the call to setSeeds dispatches to MTwistEngine::setSeeds(const long* seeds, int k) which reads and utilizes seeds[1].

I don't know whether the pybind11 documentation specifies how it maps the Python integer into a C++ const long*. In my case, the C++ seeds[0] has the same value 123654 as the Python integer, so I would guess that it makes a copy and references it. Either way it is very unlikely that pybind11 assigns a value to seeds[1], e.g. because the array length cannot be inferred at compile-time. Therefore seeds[1] is uninitialized.

@maxaehle
Copy link
Contributor Author

Possible fix:

gam-g4:

diff --git a/gam_g4/g4_bindings/pyRandomize.cpp b/gam_g4/g4_bindings/pyRandomize.cpp
index 7936b98..4df7ece 100644
--- a/gam_g4/g4_bindings/pyRandomize.cpp
+++ b/gam_g4/g4_bindings/pyRandomize.cpp
@@ -25,7 +25,7 @@ void init_Randomize(py::module &m) {
         .def("setTheEngine", &HepRandom::setTheEngine)
         .def("showEngineStatus", &HepRandom::showEngineStatus)
         .def("getTheSeed", &HepRandom::getTheSeed)
-        .def("setTheSeeds", &HepRandom::setTheSeeds);
+        .def("setTheSeed", &HepRandom::setTheSeed);
 
     py::class_<HepRandomEngine>(m, "HepRandomEngine");

gam-gate:

diff --git a/gam_gate/Simulation.py b/gam_gate/Simulation.py
index 9df0be3..b5d67fd 100644
--- a/gam_gate/Simulation.py
+++ b/gam_gate/Simulation.py
@@ -279,7 +279,7 @@ class Simulation:
             self.actual_random_seed = self.user_info.random_seed
 
         # set the seed
-        g4.G4Random.setTheSeeds(self.actual_random_seed, 0)
+        g4.G4Random.setTheSeed(self.actual_random_seed, 0)
 
     def initialize_g4_verbose(self):
         # For a unknow reason, when verbose_level == 0, there are some

Alternatively, if the setTheSeeds functionality must be accessible, pybind11 can automatically convert a Python list into a C++ std::vector, whose data() we could pass to setTheSeeds. However then we should check that the list is sufficiently long, so setTheSeeds does not read past the end.

@maxaehle
Copy link
Contributor Author

maxaehle commented Mar 2, 2022

Another error comes up in e.g. test009_voxels.py, see this valgrind log entry.

The problem in G4Region.icc:78 is that the G4Region object *this has already been delete'd/free'd by Python's garbage collector when a member of it is written to. A quick fix in the gam-g4 repository would be

diff --git a/gam_g4/g4_bindings/pyG4RegionStore.cpp b/gam_g4/g4_bindings/pyG4RegionStore.cpp
index d71aead..a0aebe4 100644
--- a/gam_g4/g4_bindings/pyG4RegionStore.cpp
+++ b/gam_g4/g4_bindings/pyG4RegionStore.cpp
@@ -20,5 +20,5 @@ void init_G4RegionStore(py::module &m) {
         .def("Get", [](G4RegionStore *r, int i) { return (*r)[i]; }, py::return_value_policy::reference)
         .def("GetInstance", &G4RegionStore::GetInstance, py::return_value_policy::reference)
         .def("GetRegion", &G4RegionStore::GetRegion, py::return_value_policy::reference)
-        .def("FindOrCreateRegion", &G4RegionStore::FindOrCreateRegion);
+        .def("FindOrCreateRegion", &G4RegionStore::FindOrCreateRegion, py::return_value_policy::reference);
 }

With this change, Python will not take ownership of the G4Region object returned by

G4Region* G4RegionStore::FindOrCreateRegion(const G4String& name)

and will therefore not garbage collect it when the returned object is not referenced any more from the Python side. In turn the suggested change makes memory leaks more likely. I'm wondering if the py::return_value_policy::reference was left away intentionally, because the other .def's have it.

maxaehle added a commit to maxaehle/gam-gate that referenced this issue Mar 30, 2022
See OpenGATE#8 .
The C++ code for HepRandom::setTheSeeds expects an array of seeds.
If we pass a single number on the Python side, the C++ code
accesses memory that was not (intentionally) initialized.
maxaehle added a commit to maxaehle/gam-gate that referenced this issue Mar 30, 2022
This was referenced Mar 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant