
primitive neighbor list #68

Closed · moradza opened this issue Sep 2, 2022 · 9 comments
@moradza commented Sep 2, 2022

Hi,

Do you have support for a primitive neighbor list? https://wiki.fysik.dtu.dk/ase/_modules/ase/neighborlist.html

I am also using CellListMap in Python; the memory copy from the list to numpy and torch takes a long time. Is it possible to return a numpy array directly?

Best

@lmiq (Member) commented Sep 3, 2022

Can you tell me a little bit more about your problem? I guess it is not the conversion that takes time, but the fact that a new list of neighbors is created (a new array is allocated) on every call (the return type of the neighborlist function is equivalent to an np.array([], dtype="i,i,f")).

It is of course possible to implement an in-place version of the neighborlist function, which would be faster on successive calls. That is something I have almost implemented several times, so perhaps you can give me the motivation to do it; it wouldn't take me more than a couple of days.

But, in general, one uses the neighbor list for something, and with CellListMap you can compute that something directly without even having to build the list, which is even faster.

Do you have support for a primitive neighbor list? https://wiki.fysik.dtu.dk/ase/_modules/ase/neighborlist.html

I read that, but I didn't get what exactly you want that is not already provided by the neighborlist function we have.

@moradza (Author) commented Sep 3, 2022

The motivation for my work is this: for each frame of a trajectory I need to compute a neighbor list. This neighbor list should be converted to an edge_index = torch.tensor of shape (2, -1) and moved to the GPU. The Julia computation gives me a speedup of about 180x, but converting the list to a torch.tensor reduces the overall speedup to about 2x.
I am a newbie when it comes to Julia, but the return type of CellListMap is a list on the Python side, not a numpy array, am I right?
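
For concreteness, the conversion I do looks roughly like this (a sketch; the index arrays here are just placeholders, not the real neighbor list output):

import numpy as np
import torch

# Placeholder neighbor indices; in practice these come from the neighbor list
iinds = np.array([0, 0, 1], dtype=np.int64)
jinds = np.array([1, 2, 2], dtype=np.int64)

# Stack into an edge_index of shape (2, n_edges) and move it to the GPU
edge_index = torch.from_numpy(np.stack((iinds, jinds)))
if torch.cuda.is_available():
    edge_index = edge_index.to("cuda")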

I read that, but I didn't get what exactly you want that is not already provided by the neighborlist function we have.

This is not the primary issue right now. It is the case with crystals, where the unit cell is smaller than the cutoff. We have to repeat the system on the fly (kx, ky, kz) times to get a neighbor list up to the cutoff.
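
What I do for now is replicate the coordinates by hand before building the list; a rough sketch of what I mean (positions and cell are hypothetical inputs: positions of shape (N, 3), and the rows of cell being the lattice vectors):

import itertools
import numpy as np

def replicate(positions, cell, kx, ky, kz):
    # All integer shifts in [-kx, kx] x [-ky, ky] x [-kz, kz]
    shifts = np.array(list(itertools.product(range(-kx, kx + 1),
                                             range(-ky, ky + 1),
                                             range(-kz, kz + 1))), dtype=float)
    shifts_cart = shifts @ cell                                  # Cartesian translations, (M, 3)
    replicas = positions[None, :, :] + shifts_cart[:, None, :]   # (M, N, 3)
    return replicas.reshape(-1, 3)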

@lmiq (Member) commented Sep 4, 2022

This neighbor list should be converted to an edge_index = torch.tensor of shape (2, -1) and moved to the GPU. The Julia computation gives me a speedup of about 180x, but converting the list to a torch.tensor reduces the overall speedup to about 2x.

Maybe we could try to return exactly that already. How can I generate one such tensor in Python (I don't use torch)?

The return type of CellListMap is, I think, the equivalent of a numpy array with a custom dtype. That I can check, and I will implement the in-place version of neighborlist, which will receive such an array as a parameter and update it. I think that, in principle, will be easy.

This is the case with crystals, where the unit cell is smaller than the cutoff

Uhm... no, currently we do not support that. I never thought of that possibility. That is not straightforward to implement; for the moment the only idea that comes to mind is to replicate the system. At first sight, I think I would need to define two unit cells. I'll think about that.

@lmiq (Member) commented Sep 5, 2022

I am a newbie when it comes to Julia, but the return type of CellListMap is a list on the Python side, not a numpy array, am I right?

I am a newbie in Python, so let me know how bad this is in terms of performance:

In [4]: from juliacall import Main as jl

In [5]: x = jl.seval("[ (i,i,Float64(i)) for i in 1:3 ]")

In [6]: x
Out[6]: 
3-element Vector{Tuple{Int64, Int64, Float64}}:
 (1, 1, 1.0)
 (2, 2, 2.0)
 (3, 3, 3.0)

In [7]: import numpy as np

In [8]: y = np.full(len(x), (0,0,0.0), dtype="i,i,f")

In [9]: for i in range(0,len(y)) :
   ...:     y[i] = x[i]
   ...: 

In [10]: y
Out[10]: 
array([(1, 1, 1.), (2, 2, 2.), (3, 3, 3.)],
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<f4')])

This copies the elements of the output list into the y np.array, with the correct dtype. The for loop may be too slow; I am not sure whether there is a vectorized way to do that. The x array created there is of the same type as the list returned by CellListMap.
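
Once the data is in a structured array like y, the index and distance columns can be pulled out by field name without another Python loop (plain numpy, nothing specific to CellListMap):

import numpy as np

# The same structured array as above
y = np.array([(1, 1, 1.0), (2, 2, 2.0), (3, 3, 3.0)], dtype="i,i,f")

# Field access returns views; astype makes int64 copies that torch can index with
iinds = y["f0"].astype(np.int64)  # first-atom indices
jinds = y["f1"].astype(np.int64)  # second-atom indices
dists = y["f2"]                   # pair distances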

@lmiq (Member) commented Sep 6, 2022

I have played a little with the conversion of the list output by CellListMap to regular numpy arrays, and with this option there doesn't seem to be much overhead in copying the data. The key is to perform the copy with a simple Julia function, not on the Python side. I also tested preallocating all the data, but that had only a minor effect:

Basically, if the data is copied with the Julia function, you get:

In [8]: %timeit convert.run_withcopy_juliaside(x)
49.1 ms ± 568 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

But if the copying is done with a Python loop:

In [9]: %timeit convert.run_withcopy_pyside(x)
1.27 s ± 80.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The complete code is below; it can probably guide a practical implementation of this in general:

"""

# Execute with:

import numpy as np
import convert

x = np.random.random((50_000,3))
%timeit convert.run_nocopy(x)

%timeit convert.run_withcopy_juliaside(x)

%timeit convert.run_withcopy_pyside(x)

xt = x.transpose()
iinds = np.full((0,), 0, dtype=np.int64)  
jinds = np.full((0,), 0, dtype=np.int64)  

# first call: needs to allocate all indices
%timeit convert.run_inplace(xt, iinds, jinds)

# second call: the list is almost of the correct size
%timeit convert.run_inplace(xt, iinds, jinds)

"""
from juliacall import Main as jl
import numpy as np
import timeit

jl.seval("using CellListMap")

# Simple Julia function to copy the data on the Julia side
jl.seval("""
function copy_to_np(jl_list, iinds, jinds)
    for i in eachindex(jl_list)
        iinds[i] = jl_list[i][1]
        jinds[i] = jl_list[i][2]
    end
    return nothing
end
""")

# Python function to resize arrays of indices and copy with the Julia function
def copy_from_jl(jl_list, iinds, jinds) :
    iinds.resize((len(jl_list),), refcheck=False)
    jinds.resize((len(jl_list),), refcheck=False)
    jl.copy_to_np(jl_list, iinds, jinds)

# Python function to resize arrays of indices and copy with a Python loop
def copy_from_py(jl_list, iinds, jinds) :
    iinds.resize((len(jl_list),), refcheck=False)
    jinds.resize((len(jl_list),), refcheck=False)
    for i in range(0,len(jl_list)) :
        iinds[i] = jl_list[i][0]
        jinds[i] = jl_list[i][1]

# Run the neighborlist computation only
def run_nocopy(x) : 
    xt = x.transpose()
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)

# Run and copy using the Julia function
def run_withcopy_juliaside(x) :
    iinds = np.full((0,), 0, dtype=np.int64)
    jinds = np.full((0,), 0, dtype=np.int64)
    xt = x.transpose()
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)
    copy_from_jl(jl_list, iinds, jinds)
    return iinds, jinds
    
# Run and copy using the python function
def run_withcopy_pyside(x) :
    iinds = np.full((0,), 0, dtype=np.int64)
    jinds = np.full((0,), 0, dtype=np.int64)
    xt = x.transpose()
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)
    copy_from_py(jl_list, iinds, jinds)
    return iinds, jinds

# Run but do all inplace to test the cost of allocations
def run_inplace(xt, iinds, jinds) :
    jl_list = jl.CellListMap.neighborlist(xt, 0.05)
    copy_from_jl(jl_list, iinds, jinds)
    return iinds, jinds

@moradza (Author) commented Sep 6, 2022

Thanks for the code, I will run it this week to benchmark the performance.

I also tested preallocating all the data, but that had only a minor effect

My test also shows no improvement with pre-allocation.

@moradza (Author) commented Sep 6, 2022

It works fine and I get 20-100x faster neighbor list generation! Closing this.

@moradza closed this as completed on Sep 6, 2022
@lmiq (Member) commented Sep 6, 2022

Nice, thanks for the feedback. I will add something like that to the user guide.

@lmiq (Member) commented Sep 12, 2022

I've added a simpler interface for Python users here:

https://m3g.github.io/CellListMap.jl/stable/python/

You may be able to run the Julia code multi-threaded as well and get some additional speedup.
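
If I remember correctly, juliacall reads the number of threads from an environment variable that must be set before the first import (I believe it is PYTHON_JULIACALL_THREADS, but please check the juliacall documentation); something along these lines:

import os
os.environ["PYTHON_JULIACALL_THREADS"] = "auto"  # must be set before importing juliacall

from juliacall import Main as jl
jl.seval("using CellListMap")
print(jl.seval("Threads.nthreads()"))  # should be > 1 if threading is enabled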
