Performance degrades with parallel openmp threads #47
Comments
Hi,
Hi, yes: the filterreg algorithm (in the case where the computation fails to converge in the M-step) runs more than 6 times faster on one core than on 8.
Thanks.
As far as I can tell from the source code, filterreg uses OpenMP both for the
Thanks.
import os
import time
import copy
import numpy as np
import open3d as o3d
import igl
from probreg import filterreg

#import logging
#log = logging.getLogger('probreg')
#log.setLevel(logging.DEBUG)

# Load two consecutive frames; keep only vertex positions (v) and normals (n).
frame_vertices = []
frame_normals = []
for filename in ['pt2pl-no-converge/frame00019.obj', 'pt2pl-no-converge/frame00018.obj']:
    [v, _, n, _, _, _] = igl.read_obj(filename)
    frame_vertices.append(o3d.utility.Vector3dVector(v))
    frame_normals.append(o3d.utility.Vector3dVector(n))
print('read frames: ' + str(len(frame_vertices)))

# Build the source and target point clouds, including normals.
test_source = o3d.geometry.PointCloud()
test_source.points = frame_vertices[0]
test_source.normals = frame_normals[0]
test_target = o3d.geometry.PointCloud()
test_target.points = frame_vertices[1]
test_target.normals = frame_normals[1]

# Time a single point-to-plane registration run.
s = time.time()
mstepresult = filterreg.registration_filterreg(test_source, test_target,
                                               target_normals=test_target.normals,
                                               maxiter=1000,
                                               sigma2=0.1,
                                               tol=0.001,
                                               objective_type='pt2pl',
                                               callbacks=[])
print(time.time() - s)
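If exporting the variable in the shell is inconvenient, the thread count can also be pinned from inside the script. The sketch below is only an illustration: `pin_openmp_threads` is a hypothetical helper, not part of probreg. It relies on the assumption that the OpenMP runtime reads `OMP_NUM_THREADS` when it initializes, so the call has to come before the first import of any OpenMP-backed module.

```python
import os


def pin_openmp_threads(n_threads: int) -> str:
    """Set OMP_NUM_THREADS for the current process and return the stored value.

    Hypothetical helper: call it before importing any OpenMP-backed module,
    because the runtime is assumed to read the variable at start-up.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be >= 1")
    os.environ["OMP_NUM_THREADS"] = str(n_threads)
    return os.environ["OMP_NUM_THREADS"]


pin_openmp_threads(1)
# from probreg import filterreg  # import only after the variable is set
```

Placed at the very top of the script above, this should give the same single-threaded run as launching it with `OMP_NUM_THREADS=1` in the environment.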
CPU usage history.
For the record:
While searching for the cause of #46 I set OMP_NUM_THREADS=1 to exclude any threading issues. Interestingly, the performance increased by about a factor of 6 on this 8-core Intel machine when compared to OMP_NUM_THREADS=8.
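For anyone who wants to reproduce the comparison, here is a rough sketch of a harness that runs the same command under different `OMP_NUM_THREADS` values and reports wall-clock time. The helper `run_with_threads` and the probe command are made up for illustration; the probe only echoes the variable and would need to be swapped for the actual reproduction script. The measured time includes interpreter start-up and imports, so it is only meaningful when the registration itself dominates.

```python
import os
import subprocess
import sys
import time


def run_with_threads(cmd, n_threads):
    """Run cmd with OMP_NUM_THREADS=n_threads; return (elapsed_seconds, stdout)."""
    # Copy the parent environment and override only the thread count.
    env = dict(os.environ, OMP_NUM_THREADS=str(n_threads))
    start = time.perf_counter()
    done = subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)
    return time.perf_counter() - start, done.stdout


if __name__ == "__main__":
    # Placeholder command that just prints the variable seen by the child.
    # Replace it with the real script, e.g. [sys.executable, "repro.py"]
    # (hypothetical file name).
    probe = [sys.executable, "-c",
             "import os; print(os.environ['OMP_NUM_THREADS'])"]
    for n in (1, 8):
        elapsed, out = run_with_threads(probe, n)
        print(f"OMP_NUM_THREADS={n}: {elapsed:.3f}s, child saw {out.strip()}")
```

Running each setting in its own process avoids the question of whether the OpenMP runtime picks up a change made after it has already started.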