Crash (segfault) of the Deep CNN sulci recognition in about 6% of cases #96
Comments
Here is the complete output of the command using the dataset mentioned in the bug description:
|
Using the pdb debugger I was able to locate the crash in this block of code: aims-free/pyaimsalgo/python/soma/aimsalgo/sulci/graph_pointcloud.py Lines 245 to 273 in fd90fef
Therefore, I am transferring this issue to aims-free and continuing to investigate... |
More precisely, the crash happens on that statement: aims-free/pyaimsalgo/python/soma/aimsalgo/sulci/graph_pointcloud.py Lines 259 to 262 in fd90fef
|
It looks like the graph structure is corrupted: the call that triggers the segfault is |
Instrumenting the code as shown below shows that the same vertex is present in different split groups, which results in a crash when the same vertex is considered a second time, after having been merged:

```python
# fusion pass: in each split group, merge vertices which share the same
# label and are adjacent
for split_group in split_groups.values():
    labels = {}
    for v in split_group:
        labels.setdefault(v['label'], []).append(v)
    if len(labels) == len(split_group):
        # all vertices have different labels: skip this step
        continue
    for label, vertices in labels.items():
        vertices = set(vertices)  # copy set
        while len(vertices) >= 2:
            v = next(iter(vertices))
            print(f"v={id(v)}, calling v.edges()")
            v.edges()
            # check junctions
            junctions = [j for j in v.edges()
                         if j.getSyntax() == 'junction'
                         and all(v2 in vertices
                                 for v2 in j.vertices())]
            if len(junctions) == 0:
                vertices.remove(v)
            else:
                # merge v and 1st connected other vertex
                v2 = [v3 for v3 in junctions[0].vertices()
                      if v3 is not v][0]
                # v2 will disappear
                vertices.remove(v2)
                print(f"calling aims.FoldArgOverSegment(graph).mergeVertices(v={id(v)}, v2={id(v2)})")
                aims.FoldArgOverSegment(graph).mergeVertices(v, v2)
                del v2
                # do v again next time since it may have other edges
```
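The observation that one vertex belongs to several split groups can also be checked directly, without waiting for the crash. The following is only a minimal sketch: `find_shared_vertices` is a hypothetical helper (not part of aims), and the toy dictionaries stand in for real graph vertices. It assumes `split_groups` maps a key to a collection of vertex objects, as in the loop above, and compares vertices by `id()`, like the instrumented prints do.

```python
from collections import defaultdict


def find_shared_vertices(split_groups):
    """Return {id(vertex): [group keys]} for vertices found in 2+ groups.

    Hypothetical diagnostic helper. Identity (id) is used rather than
    equality, mirroring the id(v) prints in the instrumented code.
    """
    seen = defaultdict(list)
    for key, group in split_groups.items():
        # count each vertex at most once per group, even if listed twice
        for vid in {id(v) for v in group}:
            seen[vid].append(key)
    return {vid: keys for vid, keys in seen.items() if len(keys) > 1}


# Toy stand-ins for graph vertices (made-up data, not a real aims graph)
a = {'label': 'label_A'}
b = {'label': 'label_A'}
c = {'label': 'label_B'}
split_groups = {1: [a, b], 2: [b, c]}  # b belongs to both groups

shared = find_shared_vertices(split_groups)
print(shared)  # non-empty: b is referenced by groups 1 and 2
```

A non-empty result means that merging inside one group can invalidate a vertex that another group still references, which matches the failure described above.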
|
I got it:
I am about to push a fix. |
I confirm that the issue is fixed, I could label all 1558 hemispheres without a single crash. |
Thanks @ylep ! |
Describe the bug
Sulci recognition with Deep CNN crashes on some datasets. On the 1558 hemispheres that I processed today, 91 exhibited the crash.
Testing on one of the faulty hemispheres showed that the crash is not systematic, but happens frequently (9 times out of 10 runs). It happens irrespective of whether the GPU (cuda = 0) or the CPU (cuda = -1) is used.
The last messages printed on standard output are:
The crash seems related to a segfault in aimssip.so, since the kernel log contains:
To Reproduce
Steps to reproduce the behavior:
/neurospin/tmp/yleprince/2023-03-13_deepcnn_crash.zip
bv python3 -m capsul deepsulci.sulci_labeling.capsul.labeling.SulciDeepLabeling graph=Lsub-50011.arg labeled_graph=output.arg model_file=/casa/install/share/brainvisa-share-5.1/models/models_2019/cnn_models/sulci_unet_model_left.mdsm param_file=/casa/install/share/brainvisa-share-5.1/models/models_2019/cnn_models/sulci_unet_model_params_left.json roots=Lroots_sub-50011.nii.gz skeleton=Lskeleton_sub-50011.nii.gz fix_random_seed=True
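Because the crash is intermittent, it can help to run the command several times and count how many runs end in a segmentation fault. The sketch below is one possible way to do that with the Python standard library; `count_segfaults` is a hypothetical helper name. It assumes a POSIX system, where a directly launched child killed by a signal has a negative return code, and it also accepts the 128 + signal convention in case a wrapper such as a shell reports the crash that way (an assumption, not something verified for `bv`).

```python
import signal
import subprocess


def count_segfaults(cmd, runs=10):
    """Run cmd `runs` times and count the runs that ended with SIGSEGV."""
    # A direct child killed by a signal has returncode == -signum (POSIX);
    # a wrapping shell usually reports 128 + signum instead (assumption
    # for any wrapper used here).
    segv_codes = {-signal.SIGSEGV, 128 + signal.SIGSEGV}
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        if proc.returncode in segv_codes:
            crashes += 1
    return crashes
```

To use it, pass the command above as a list of arguments (for example `["bv", "python3", "-m", "capsul", ...]` with the same parameters) and compare the returned count with the number of runs.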
Environment: