Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NeighborList should be constructible from system-like object and 2D array of bond indices #748

Open
bdice opened this issue Apr 14, 2021 · 3 comments
Labels

Comments

@bdice
Copy link
Member

bdice commented Apr 14, 2021

Is your feature request related to a problem? Please describe.
Occasionally I need to construct a NeighborList from bond topology data, which is currently somewhat hard to do. I've written the pattern below at least a dozen times, and it would be much better as a single function.

Describe the solution you'd like
A new class method like NeighborList.from_system(system, indices, weights) would be helpful.

Here's a rough implementation that needs to be improved/tested. Note that this snippet will break with #738 (distances need to be replaced by vectors).

    @classmethod
    def from_system(cls, system, indices, weights=None):
        R"""Create a NeighborList from a system-like object and indices.

        Example::

            import freud
            import numpy as np
            box = freud.box.Box(2, 3, 4, 0, 0, 0)
            points = np.array([[0, 0, -1], [0.5, -1, 0]])
            indices = np.array([[0, 1], [1, 0]])
            nlist = freud.locality.NeighborList.from_system(
                system=(box, points),
                indices=indices,
            )

        Args:
            system:
                Any object that is a valid argument to
                :class:`freud.locality.NeighborQuery.from_system`.
            indices (:math:`\left(N_{bonds}, 2\right)` :class:`np.ndarray`):
                Array of integers corresponding to indices in the sets of
                query points and points.
            weights (:math:`\left(N_{bonds}, \right)` :class:`np.ndarray`, optional):
                Array of per-bond weights (if :code:`None` is given, use a
                value of 1 for each weight) (Default value = :code:`None`).
        """
        cdef NeighborQuery nq = NeighborQuery.from_system(system)
        query_point_indices = indices[:, 0]
        point_indices = indices[:, 1]
        distances = np.linalg.norm(
            system.box.wrap(system.points[point_indices] - system.points[query_point_indices])
            axis=-1,
        )
        return cls.from_arrays(
            num_query_points=len(system.points),
            num_points=len(system.points),
            query_point_indices=query_point_indices,
            point_indices=point_indices,
            distances=distances,
            weights=weights,
        )
@bdice bdice added good first issue Good for newcomers locality labels Apr 14, 2021
@vyasr
Copy link
Collaborator

vyasr commented Apr 15, 2021

What types of system are you typically working with when this comes up? Is this in the context of OVITO? If you actually have a Box and numpy arrays of positions/indices, then (especially with Box.compute_distances, no need to do that yourself) this is as simple as:

distances = box.compute_distances(points[indices[:, 0]], points[indices[:, 1]])
NeighborList.from_arrays(...)

Admittedly the NeighborList.from_arrays API is a little bit clunky, and of course the from_system call becomes necessary if you're working with other inputs.

I'm not opposed to adding this function if it really is the best path forward, but I think the fact that something this straightforward is an annoyance suggests room for improvement in the existing APIs. For one, since as of #738 we're brainstorming API breaks for freud 3.0 we should consider moving num_query_points and num_points to the end of the arg list of from_arrays so that num_points can default to num_query_points like everywhere else in freud. Second, We we should consider extracting the first half of NeighborQuery.from_system into a separate function that just handles the conversion of various input types into a (box, points) tuple (with box always being a Box object rather than any valid input to Box.from_box). Then we'd have a simpler way to get this data without building a NeighborQuery, which is a roundabout way to import data from other sources (assuming that's why you're doing it here).

@bdice
Copy link
Member Author

bdice commented Apr 16, 2021

There are a few motivating cases for this. Sometimes I want to use another package to find neighbors, and I get back arrays of indices. Sometimes polymers/"bonds" in a GSD file also need this (e.g. to find the center of mass or radius of gyration). Being able to quickly combine (box, positions), indices into a NeighborList is purely a matter of brevity. This is simply a convenience API that would wrap the full from_arrays and replace the long snippet above with something more concise.

Regarding your idea of splitting the logic in NeighborQuery.from_system, I think a similar thing is already happening via the _RawPoints class, which is just a container for a box and points. I don't think we need to improve anything there?

freud/freud/locality.pyx

Lines 365 to 368 in ff9d561

if cls == NeighborQuery:
# If called from this abstract parent class, always make
# :class:`~._RawPoints`.
return _RawPoints(*system)

@vyasr
Copy link
Collaborator

vyasr commented Apr 23, 2021

The workflow you're suggesting makes sense, but I think it highlights the fact that there are any number of convenience APIs that we could implement that are fundamentally built around needing to coerce different input types into a (box, points) tuple. The conversion to a NeighborList is fine and could be added on top of that, but just the conversion from external objects to (box, points) would be useful to expose to users, especially if they need to do some more bespoke logic before inputting the data into freud (either for NeighborList construction or directly into a compute after some modification). If we wanted to expose _RawPoints publicly (probably renamed), then one option would be to implement __iter__ so that it could be transparently coerced via unpacking box, points = RawPoints(...).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants