Rewrite make_whole in Cython #1965
Conversation
package/setup.py
Outdated
    include_dirs=include_dirs,
    define_macros=define_macros,
-   extra_compile_args=extra_compile_args)
+   extra_compile_args=[a for a in extra_compile_args
+                       if not a.startswith('-std')])
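As a side note, the intent of the filter in this hunk can be sketched in plain Python. The flag lists and helper name below are hypothetical, for illustration only, not MDAnalysis' actual build configuration:

```python
# Hypothetical flag lists -- illustration only, not the project's real config.
c_args = ['-O3', '-funroll-loops', '-std=c99']

def strip_std_flags(args):
    """Drop any -std=... flag so C++ extensions don't inherit the C standard."""
    return [a for a in args if not a.startswith('-std')]

# C++ extensions get their own standard flag instead of -std=c99
cpp_args = strip_std_flags(c_args) + ['-std=c++11']
print(cpp_args)  # ['-O3', '-funroll-loops', '-std=c++11']
```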
So C++ didn't play well with the -std=c99 flag. What does this flag do? And does removing it break anything?
This flag sets the C standard we are using. Yeah, C++ doesn't have this standard. Having an explicit standard for the C code is nice. I did a PR a while ago with some small cleanups in the C code, and I would like to keep the explicit standard requirement.
We should also set the C++ standard we use. I'll have a look at the functions you are using and give you my advice after reading up on the docs. The behavior changes subtly depending on the standard version used.
None of the functions I used make use of modern C++ features. If we want compatibility with very old compilers (GCC 4.8 and earlier, and therefore almost all Linux installations out there) we can go for the 2003 standard. Generally I prefer the C++11/14 standards, aka modern C++, but we don't use any of their features just yet.
package/MDAnalysis/lib/_cutil.pyx
Outdated
@cython.boundscheck(False)
@cython.wraparound(False)
def _is_contiguous(int[:] atoms, int[:, :] bonds, int start):
this needs a docstring! I don't see why this function should be hidden.
package/MDAnalysis/lib/_cutil.pyx
Outdated
output = intset()

for val in a:
    if b.count(val) != 1:
There is no need to be defensive here. The C++ set container will simply return the existing element if it is already there: https://en.cppreference.com/w/cpp/container/set/insert
Doing both the count and the insert doubles the computational cost, and because sets don't have duplicates the count will always be either 0 or 1.
Actually, go directly for the std version of this algorithm. It scales better: linearly instead of O(a.size() * log(b.size())).
https://en.cppreference.com/w/cpp/algorithm/set_difference

from libcpp.algorithm cimport set_difference
from libcpp.iterator cimport inserter

cdef intset difference(intset a, intset b):
    output = intset()
    set_difference(a.begin(), a.end(), b.begin(), b.end(),
                   inserter(output, output.begin()))
    return output
This will be a bit more complicated because the STL algorithms aren't exposed by Cython -.-
package/MDAnalysis/lib/_cutil.pyx
Outdated
x = bonds[i, 0]
y = bonds[i, 1]
# only add bonds if both atoms are in atoms set
if total.count(x):
Combine into one if statement: if total.count(x) and total.count(y)
package/MDAnalysis/lib/_cutil.pyx
Outdated
N = total.size()

nloops = 0
while seen.size() < N:
while seen.size() < N and nloops < N:
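The point of the suggested guard is that a fragmented group can otherwise loop forever: seen.size() stops growing, but the while condition stays true. A pure-Python sketch of the same traversal with the guard (names are illustrative, not the actual Cython code):

```python
def is_contiguous(atoms, bonds, start):
    """Return True if every atom in `atoms` is reachable from `start`
    via `bonds`. Pure-Python sketch; the nloops guard bounds the walk
    so a non-contiguous group terminates instead of spinning."""
    atoms = set(atoms)
    # adjacency restricted to atoms inside the group
    adj = {a: set() for a in atoms}
    for x, y in bonds:
        if x in atoms and y in atoms:
            adj[x].add(y)
            adj[y].add(x)
    seen = {start}
    frontier = [start]
    nloops = 0
    # a connected group needs at most len(atoms) expansion rounds
    while len(seen) < len(atoms) and nloops < len(atoms):
        nloops += 1
        frontier = [n for a in frontier for n in adj[a] if n not in seen]
        if not frontier:
            break  # nothing new reachable: group is fragmented
        seen.update(frontier)
    return len(seen) == len(atoms)
```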
package/MDAnalysis/lib/_cutil.pyx
Outdated
@@ -51,7 +53,7 @@ def unique_int_1d(np.ndarray[np.int64_t, ndim=1] values):
     cdef int i = 0
     cdef int j = 0
     cdef int n_values = values.shape[0]
-    cdef np.ndarray[np.int64_t, ndim=1] result = np.empty(n_values, dtype=np.int64)
+    cdef long[:] result = np.empty(n_values, dtype=np.int64)
You should continue to use the np.int64_t type to ensure the size of the type is independent of the platform. The size of a long can vary.
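The platform dependence is easy to demonstrate; this small sketch just checks type sizes:

```python
import ctypes
import numpy as np

# np.int64 is a fixed-width type: always 8 bytes on every platform
print(np.dtype(np.int64).itemsize)  # 8

# a C long is whatever the platform ABI defines: 8 bytes on 64-bit
# Linux/macOS, but only 4 bytes on 64-bit Windows
print(ctypes.sizeof(ctypes.c_long))
```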
I addressed my own comments already. If this works correctly it could give another little boost in performance, but this should be checked!
Force-pushed from d7be442 to ec69f49
Ping #1961
They might be faster. It's always worth checking the C code Cython generates. In this case I don't assume it made a huge difference, especially since we have to cast the memoryview back into a numpy array.
If C++14 isn't available on macOS, give C++11 a try. That will also work here.
Force-pushed from 8c6ac1e to 0d5150e
@kain88-de thanks for the help, I've hardly ever used C++!
Well, in the end I didn't add much. I just fixed the unique_int_1d code you broke by accident and cleaned up the code, that's it. OSX means we can't use nice things -.-
Btw @richardjgowers, how much faster is this version?
@kain88-de the Cython version of check_contiguous was about 2x faster than pure Python; we were mostly just doing set operations to check connectivity, and Python's sets are good. I've reworked
oldpos = atomgroup.positions
newpos = np.zeros((oldpos.shape[0], 3), dtype=np.float32)
So I checked the C code generated for this; between here and the return at the bottom it is pure C :)
package/MDAnalysis/lib/_cutil.pyx
Outdated
vec[i] = oldpos[other, i] - oldpos[atom, i]
# Apply periodic boundary conditions to this vector
if ortho:
    apply_pbc_ortho(&vec[0], &box[0])
So here ideally we'd use the minimum_image functions from calc_distances.h, but there are type mismatches I need to figure out.
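For the orthorhombic case, what a minimum-image correction computes per component can be sketched with NumPy (an illustration of the concept, not the calc_distances.h implementation):

```python
import numpy as np

def minimum_image_ortho(vec, box):
    """Map a separation vector onto its minimum image in an
    orthorhombic box: shift each component by whole box lengths
    until it lies within [-box/2, box/2]."""
    return vec - box * np.round(vec / box)

box = np.array([10.0, 10.0, 10.0])
vec = np.array([9.0, 0.5, -9.5])   # a bond vector that crossed the boundary
print(minimum_image_ortho(vec, box))  # [-1.   0.5  0.5]
```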
package/MDAnalysis/lib/_cutil.pyx
Outdated
ortho = True
box = atomgroup.dimensions
for i in range(3):
    if box[i + 3] != 90.0:
Should I use something like fabs(box[i + 3] - 90) > SOMETHING_SMALL here?
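A tolerance comparison along those lines might look like this; the tolerance value and function name are hypothetical, chosen only to illustrate why an exact != 90.0 test is fragile for float32 box angles read from a trajectory:

```python
import math

ANGLE_TOL = 1e-4  # hypothetical tolerance, in degrees

def is_ortho(dimensions):
    """True if all three box angles are 90 degrees within tolerance.

    float32 angles from a trajectory are rarely exactly 90.0, so an
    exact `!= 90.0` test can misclassify an orthorhombic box."""
    return all(math.fabs(angle - 90.0) < ANGLE_TOL
               for angle in dimensions[3:6])

print(is_ortho([10, 10, 10, 90.00001, 90.0, 89.99999]))  # True
print(is_ortho([10, 10, 10, 90.0, 90.0, 60.0]))          # False
```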
Force-pushed from b6ef988 to 0d62ca5
package/MDAnalysis/lib/_cutil.pyx
Outdated
else:
    from .mdamath import triclinic_vectors
    # forgive me code Jesus for I have sinned
    B = triclinic_vectors(box)
Does anyone know how we'd turn a (3, 3) numpy array into the coordinate[3] type we need for the pbc_triclinic below?
See libmdaxdr: <coordinate*>&box[0, 0]
Ah, the typecast is applied to B, of course.
The line takes a pointer to the first element of box, i.e. box[0, 0], and casts it to a pointer of type coordinate.
def test_zero_box_size(self, universe, ag):
    universe.dimensions = [0., 0., 0., 90., 90., 90.]
    with pytest.raises(ValueError):
        mdamath.make_whole(ag)

def test_too_small_box_size(self, universe, ag):
Why remove that test?
I don't think it's likely to happen; you either have box dimensions or you don't. The chance of an incorrect box being supplied is low.
We could keep the check & test, but it would make the algorithm slower because you'd need to calculate all bond lengths. I thought it was too unlikely a case to slow things down for.
Force-pushed from 27c4a9e to 20f8a52
Ok, thanks to @davidercruz we've got a fullerene test too. Should be good to go now.
@MDAnalysis/coredevs reviews please :)
I plan on looking at this either tonight or tomorrow.
Mostly questions. The code looks sane. I am a bit annoyed that the precision of the tests needed to be reduced, though.
    return np.array(result)


cdef intset difference(intset a, intset b):
Does not look like the most efficient way to do it, see https://en.cppreference.com/w/cpp/algorithm/set_difference. But it should do for now.
I think @kain88-de tried to use this but it's not available on osx or something
We can rewrite the algorithm ourselves. It assumes the containers are sorted (which they are for C++11 and later). Using the std algorithm is difficult on Linux; I could not get the compiler to resolve the template. This could get better when we update to the new conda compilers.
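A hand-rolled replacement is indeed straightforward. Here is a pure-Python sketch of the single merge pass that std::set_difference performs on sorted ranges, O(len(a) + len(b)) instead of one lookup per element (an illustration, not the actual Cython code in this PR):

```python
def sorted_difference(a, b):
    """Elements of sorted sequence `a` not present in sorted sequence `b`,
    found in a single merge pass over both sequences."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])  # a[i] cannot be in b: keep it
            i += 1
        elif b[j] < a[i]:
            j += 1            # b[j] is irrelevant: skip it
        else:
            i += 1            # present in both: drop it
            j += 1
    out.extend(a[i:])         # tail of a has nothing left in b to match
    return out

print(sorted_difference([1, 2, 4, 7], [2, 3, 7]))  # [1, 4]
```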
package/MDAnalysis/lib/_cutil.pyx
Outdated
"You can set dimensions using 'atomgroup.dimensions='")

ortho = True
for i in range(3):
Why not range(3, 6) instead of doing index gymnastics?
if ortho:
    minimum_image(&vec[0], &box[0], &inverse_box[0])
else:
    minimum_image_triclinic(&vec[0], <coordinate*>&tri_box[0])
Couldn't tri_box be cast only once, before the loops?
Does casting take any time? I thought it was more an instruction to the compiler than a copy of data.
I do not expect any cost from the cast; though I do not see why tri_box could not directly be the right type. Nevertheless, it is really not a big deal.
raise ValueError("Reference atom not in atomgroup")

# Check all of atomgroup is accessible from ref
if not _is_contiguous(atomgroup, ref):
This test is not there in the new version. From what I understand, a non-contiguous atom group will trigger the check at the very end of the new version, but the error message is not super clear about the reason, and it fails late instead of early.
When benchmarking this I found it takes a non-negligible amount of time. Because we also implicitly do this check when applying the algorithm, it made more sense not to check beforehand and just try. We only apply the position change if the algorithm worked, so it's defensive in that respect. But yes, this means failing because of non-contiguity is slower, although succeeding is faster.
OK. Good for me.
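To make the trade-off concrete, here is a simplified pure-Python/NumPy sketch of such a traversal: atoms are shifted as they are reached from the reference, and non-contiguity only surfaces at the end, when some atoms were never reached. Orthorhombic box only; all names and the error message are illustrative, not the PR's actual Cython code:

```python
import numpy as np

def make_whole_sketch(positions, bonds, box, ref=0):
    """Walk outward from `ref` along bonds, placing each newly reached
    atom next to its neighbour via the minimum-image bond vector.
    Raises at the end if the bond network missed any atom."""
    newpos = positions.astype(float).copy()
    adj = {i: [] for i in range(len(positions))}
    for x, y in bonds:
        adj[x].append(y)
        adj[y].append(x)
    placed = {ref}
    frontier = [ref]
    while frontier:
        nxt = []
        for atom in frontier:
            for other in adj[atom]:
                if other in placed:
                    continue
                vec = positions[other] - positions[atom]
                vec -= box * np.round(vec / box)  # minimum image
                newpos[other] = newpos[atom] + vec
                placed.add(other)
                nxt.append(other)
        frontier = nxt
    # the contiguity check happens implicitly, at the end of the walk
    if len(placed) != len(positions):
        raise ValueError("AtomGroup was not contiguous from bonds")
    return newpos
```

Succeeding needs only one pass; only the failure path pays for discovering non-contiguity late, which matches the trade-off described above.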
@@ -305,13 +282,13 @@ def test_solve_1(self, universe, ag):

     assert_array_almost_equal(universe.atoms[:4].positions, refpos)
     assert_array_almost_equal(universe.atoms[4].position,
-                              np.array([110.0, 50.0, 0.0]))
+                              np.array([110.0, 50.0, 0.0]), decimal=3)
This looks like a drastic decrease in precision, doesn't it? Could it be that make_whole needs to work on doubles?
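For context on the float32 limits involved: near a coordinate value of 110, single precision can only resolve steps of a few 1e-6, which is roughly what decimal=5 can still assert:

```python
import numpy as np

# spacing() gives the gap to the next representable float32 value
gap = np.spacing(np.float32(110.0))
print(gap)  # ~7.6e-06: float32 carries ~7 significant decimal digits

# decimal=5 is close to the best float32 positions can guarantee at
# this magnitude; working in doubles would allow far tighter checks.
```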
Force-pushed from 6e5fb65 to cac9201
lib.mdamath.make_whole now supports triclinic boxes; renamed from new to avoid a cpp name conflict in calc_distances.h; added fullerene test case for make_whole
Force-pushed from cac9201 to 9ea6328
@jbarnoud I bumped precision up to decimal=5 and it still passes (hopefully)
The performance of make_whole was raised in #1961; benchmarking showed that the check that the molecule is traversable via bonds was taking a significant amount of the time. This rewrites make_whole in Cython and also adds support for triclinic unit cells.