Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration with CellListMap.jl #659

Closed
Datseris opened this issue Jul 24, 2022 · 47 comments
Closed

Integration with CellListMap.jl #659

Datseris opened this issue Jul 24, 2022 · 47 comments
Labels
enhancement New feature or request example-integration A new example or showcase of integration with the Julia ecosystem performance

Comments

@Datseris
Copy link
Member

Datseris commented Jul 24, 2022

I just became aware of this package:

https://github.com/m3g/CellListMap.jl

Its goal is to accelerate computations of forces, or any kind of interactions, that occur between pairs of "agents" in 2D or 3D continuous space. Simply put, it does both what interacting_pairs currently does, and also computes the forces, which a user would have to do by hand after getting the interacting pairs. Performance benchmarks also look like its better than competing alternatives (common story of Julia packages it would seem).

I think the best place to start integrating this package with Agents.jl is to write an integration example that showcases using it to accelerate computations occurring in an ABM. From there we can see what can be made even more general and integrated more formally into the source of Agents.jl.

@Datseris Datseris added enhancement New feature or request performance example-integration A new example or showcase of integration with the Julia ecosystem labels Jul 24, 2022
@ehsanirani
Copy link

Hi. I have a simple code to mimic run & tumble dynamics of bacteria in porous media (implemented for an unpublished researh project). CellListMap is used to find collisions between agents in the ABM model and an array of random fixed obstacles. Bacteria are modeled as sphero-cylinders, that adds a bit complexity to the code to find real distances and calculate torques. Would it be a good integration example (checkout the video)?

I can also simplify the code, to use disks instead of cylinders. Then it would be much simpler, similar to the SIR example with addition of random obstacles (another research project of mine).

Maybe @lmiq has a better suggestion for the integration example.

test.mp4

@lmiq
Copy link
Contributor

lmiq commented Jul 25, 2022

I was looking at the Agents examples, and found that probably the Flocking example of birds is the closer to the typical application of CellListMap.

Yet, as I understand, the function agend_step applies to one bird. To effectively use a neighbor list (or function to compute the interactions directly), the way I'm used to see simulations is to have a "double"-loop over the elements (the birds), to compute the interactions between neighboring birds. That double loop can be replaced by a single call to the map_pairwise! function from CellListMap.

That loop, is however, sort of implicit in the way Agents is implemented, as the first loop over agents is hidden (at least in that interface), and the function that is called for each agents computes the neighbors.

Thus, I think that I need to interact to something more internal, or access an API which is not apparent at first sight. Any help is appreciated.

edit: it seems that model_step! is what I'm searching for here.

@lmiq
Copy link
Contributor

lmiq commented Jul 25, 2022

A typical simulation written on top of CellListMap is here, if anyone wants to take a look: https://github.com/m3g/2021_FortranCon/blob/main/celllistmap_vs_namd/simulate.jl

(these are molecular simulations - it would be nice to see how Agents could be used for such applications as well).

@lmiq
Copy link
Contributor

lmiq commented Jul 25, 2022

@ehsanirani If you can share your running code (even if privately), I would like to see how do you integrate it with Agents. Mostly what I don't see is how to call a function once per step, and use the output of that in the update of the properties of every agent.

@ehsanirani
Copy link

@ehsanirani If you can share your running code (even if privately), I would like to see how do you integrate it with Agents. Mostly what I don't see is how to call a function once per step, and use the output of that in the update of the properties of every agent.

I created a minimal repository on codeberg, please check it out:

https://codeberg.org/ehsan/test-Agents-CellListMap

edit: it seems that model_step! is what I'm searching for here.

You are right, you need simething like model_step! function for using CellListMap.

@lmiq
Copy link
Contributor

lmiq commented Jul 26, 2022

That is a really nice application. I think as a an example it is better to make it simpler. But that is a perfect example. I will try to simplify it as much as possible, and put it as an integration example on the CellListMap docs as well. After trying to simplify it, I will post it here. Thank you very much, that is a very nice contribution to all, I think.

@ehsanirani
Copy link

I think as a an example it is better to make it simpler.

I totally agree. I think most of the complexity comes from the fact that my agents (bacteria) are modeled as spherocylinders. Maybe assuming agents with simpler shape (disks?) helps with simplifying the code.

@Datseris
Copy link
Member Author

Hello both, thanks so much for all the input and sorry for joining late!

Yes, I agree with the suggestion of simplifying the rods to spheres. The added complexity of rods does not really affect the integration of the two libraries (if I have understood things correctly, the "rods" are in fact a bunch of spheres that are "linked" to each other, so in the end one still calculates distances between spheres).

Although I must admit that the video with the roods looks so much cooler ;) . One thing we can do, provided that @ehsanirani is okay with this, is show the full integration with spheres only. Then, at the end, provide a comment like "one can create rods or other shapes from spheres, and the above would work in a similar manner. We do this in a different file, which produces this output:" and then show the video. I'm suggesting this because the code that makes rods out of spheres already exists and already works and is quite intelligent, so why not show it anyways?

@lmiq let me know if there are questions about using Agents.jl. Yes, model_step! is the way to go. That's also the way we do it here, which is a code that can completely be replaced by using CellListMap to make things fast: https://juliadynamics.github.io/AgentsExampleZoo.jl/dev/examples/social_distancing/

@Datseris
Copy link
Member Author

Depend on how hard or easy the integration example turns out to be, I'll consider whether it is useful to create some extra fields in the ContinuousSpace struct in Agents.jl, so that it out-of-the box has the things expected by CellListMap already set up (there will be a boolean flag whether to enable this so the user can choose).

@Datseris
Copy link
Member Author

Oh wow, looking at the example here https://codeberg.org/ehsan/test-Agents-CellListMap/src/branch/main/src/bacteria.jl it seems SUPER straightforward to couple the two packages. As expected of Julia of course.

@ehsanirani
Copy link

(if I have understood things correctly, the "rods" are in fact a bunch of spheres that are "linked" to each other, so in the end one still calculates distances between spheres).

@Datseris Spherocylinders are a bit more complicated than that. What you described is the model used in the bacteria growth example in Agents.jl docs.

Anyway, I totally agree that a simpler example fits better with the docs/tutorials. Just let me know how I can help.

PS. In a month or two, we are submitting a paper using the more advanced version of the code that I shared here, with chemotaxis and other cool biophysical features. Afterwards, I will make it open source and maybe it can be added to the docs, as an example of real research project done by Agents.jl and CellListMap.jl.

@ehsanirani
Copy link

Oh wow, looking at the example here https://codeberg.org/ehsan/test-Agents-CellListMap/src/branch/main/src/bacteria.jl it seems SUPER straightforward to couple the two packages. As expected of Julia of course.

I have started working with Julia since two months ago, and that is one of the reasons that I fell in love with it. Another reason is solving the two language problem. As a computational physicist, I really appreciate and admire it. I am already converting codes for 5 different projects to use Agents.jl, Molly.jl and CellListMap.jl.

@lmiq
Copy link
Contributor

lmiq commented Jul 27, 2022

What is the size (number of bacteria and obstacles) of the actual simulations? CellListMap is expected to provide reasonable speedups for >10k interactions, I guess. That is probably a relevant information to provide in the example.

I'm trying to simplify the example as much as possible (and learning how to use Agents.jl in the meantime). I will try to use that as an integration example on the CellListMap side as well.

@ehsanirani
Copy link

What is the size (number of bacteria and obstacles) of the actual simulations?

There are 500-1000 bacteria and 2500-5000 non-overlapping obstacles (different box sizes: volume fraction between 0.2 to 0.7) in each simulation.

@lmiq
Copy link
Contributor

lmiq commented Jul 27, 2022

I am working in a minimalist version of the integration here:

https://github.com/lmiq/agents_with_celllistmap/

The simulations ARE working (the video is produced), but:

  1. Seems that my agent_step! function is somewhat redundant to model_step!, as this last one is only computing the forces. I could just update the positions and velocities there.

  2. Agents.jl has move_agents! and update_velocities! functions, which seem the reasonable choice to use here. However, from the docs I would imagine that this could integrate with DifferentialEquations for more sophisticated updating schemes. Don't they?

  3. The fact that coordinates and velocities cannot be represented within an Agent as a StaticVector is somewhat odd, these are tuples but with all the arithmetic defined, so are more convenient than NTuples to represent coordinates.

  4. To use CellListMap at some point one need the coordinates as an vector of vectors. This sort of duplicates the coordinate information, which is contained in the agent positions. The continous layout in memory of the coordinates if fundamental for performance. No big deal, as these are scalar copies. but I wander if the Agents could be structured as an structure of array layout, which would be more efficient (except for distributed simulations, I think).

As I mentioned, the code is not working yet, because I am still learning the Agents.jl interface. Probably I'm not the best person now to do this part of the integration, but it should be easy. If you want to comment there or even pull, that would be great.

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

The example above is now running correctly, and produces this video: https://github.com/lmiq/agents_with_celllistmap/blob/main/test.mp4

The best way to take advantage of Agents.jl is to be discussed.

Also it would be interesting to know how much CellListMap accelerates this simulation, relative to the "standard" way implemented in Agents.jl, as a function of the number of particles.

@Datseris
Copy link
Member Author

  1. Seems that my agent_step! function is somewhat redundant to model_step!, as this last one is only computing the forces. I could just update the positions and velocities there.

You do not have to always use agent_step!. Use dummystep in place of agent_step! if you don't feel like you need to write rules that auto-loop over agents.

2. Agents.jl has move_agents! and update_velocities! functions, which seem the reasonable choice to use here. However, from the docs I would imagine that this could integrate with DifferentialEquations for more sophisticated updating schemes. Don't they?

No, not by default. They could though. In the docs of Agents.jl there is an integration example with DifferentialEquations.jl. However this version of update_velocities! I think will stay with the Euler scheme. It is an overkill to use DiffEq, and if a user wants to do it, they can integrate it formally anyways.

3. The fact that coordinates and velocities cannot be represented within an Agent as a StaticVector is somewhat odd, these are tuples but with all the arithmetic defined, so are more convenient than NTuples to represent coordinates.

Yes, this is a design decision I am not happy with and may likely change in the future to SVector. The problem is SVector itself is dubious as well. In my opinion, static vectors should be part of the standard library.

4. To use CellListMap at some point one need the coordinates as an vector of vectors. This sort of duplicates the coordinate information, which is contained in the agent positions. The continous layout in memory of the coordinates if fundamental for performance. No big deal, as these are scalar copies. but I wander if the Agents could be structured as an structure of array layout, which would be more efficient (except for distributed simulations, I think).

Unfortunately this duplication is necessary. It is by far the most intuitive and convenient way to represent agents as instances of structs with several fields, only one of which is a position. If each property of the agents was instead removed from the agents and into vectors, this would make for a less intuitive experience for the user, and much longer code. The bright side is that making this vector of positions is negligible cost both in memory and computationally when compared to computing anything using this vector.

@Datseris
Copy link
Member Author

@lmiq in your example version, why is the struct CellListMapData necessary? It isn't, right? One could make its fields directly be properties of the AgentBasedModel, saving the complexity of creating one more struct.

@Datseris
Copy link
Member Author

@lmiq when you are okay with the example, I'd suggest to open a PR here adding it, and there I can also push on the code and/or comment on the code!

@Datseris
Copy link
Member Author

Also it would be interesting to know how much CellListMap accelerates this simulation, relative to the "standard" way implemented in Agents.jl, as a function of the number of particles.

Yes, I think it is important of us to do this. Although it's 99.99% certain that CellListMap will accelerate performance, it is anyways good to know by how much.

What is the size (number of bacteria and obstacles) of the actual simulations? CellListMap is expected to provide reasonable speedups for >10k interactions, I guess. That is probably a relevant information to provide in the example.

That's something I don't immediately understand. The speedup here is versus what other version of the code? Just looping over all "agents"? I am supersprised if this is the case, as I would expect CellListMap to provide performance acceleration pretty much immediatelly even for a few agents. In Agents.jl we also have a structure that compartiments the continuous space and makes rectangular compartments for accelarating nearest neighbor searches. it is practically immediatelly worth it to use this accelerating structure according to our testing.

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

Yes, this is a design decision I am not happy with and may likely change in the future to SVector.

Couldn't the type of array be made generic?

@lmiq in your example version, why is the struct CellListMapData necessary? It isn't, right?

It isn't. But if one wants to use parallel in-place updates of the cell lists, some more fields are necessary and it may be clearer what if for what there. It is just a stylistic choice, yes.

That's something I don't immediately understand. The speedup here is versus what other version of the code? Just looping over all "agents"? I am supersprised if this is the case, as I would expect CellListMap to provide performance acceleration pretty much immediatelly even for a few agents.

There is a price that is paid on constructing the cell lists. If the number of particles is too small, it it not worth it (or building neighbor lists, etc). But I tested now and with 100 particles it is already faster than a naive loop, so your intuition was correct.

@lmiq when you are okay with the example, I'd suggest to open a PR here adding it, and there I can also push on the code and/or comment on the code!

The reason I'm not OK with the example is that I don't see what exactly I'm gaining in using Agents. I posted now an implementation of the same simulation without agents, here. The code is actually shorter, and the code is faster, as shown below.

The only thing I would get from using Agents, here, is the function to create the video, since the double loop on particles is implicit on CellListMap, and the integration scheme I had to write manually in both cases.

Thus, my feeling is that I don't know what I can really gain by using Agents and its integration with other tools (from Agents itself or from the ecosystem in general), and the example feels artificial.

julia> includet("./Particles.jl")

julia> using .Particles

julia> @time WithAgents.simulate();
run! progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:03
  6.561433 seconds (8.04 M allocations: 435.431 MiB, 2.06% gc time, 54.78% compilation time)

julia> @time WithAgents.simulate();
run! progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:03
  3.211807 seconds (64.54 k allocations: 4.913 MiB)

julia> @time NoAgents.simulate();
  2.377066 seconds (387.15 k allocations: 21.186 MiB, 12.53% compilation time)

julia> @time NoAgents.simulate();
  2.081114 seconds (2.12 k allocations: 622.031 KiB)

@Datseris
Copy link
Member Author

great, thanks!

Couldn't the type of array be made generic?

Hm, I am not sure what you mean by that. The position is a field of a struct, the agent struct. It needs to be concretely typed. We chose NTuple because it is simpler and more intuitive for a general user, but also because it is equivalent with SVector internally. (an SVector is nothing more than a wrapper of NTuple)

The reason I'm not OK with the example is that I don't see what exactly I'm gaining in using Agents

Well this of course depends on the application. However, as far as I can tell there are many reasons to be using Agents, even in your case where the simulation is simple enough that keeping track of only positions between particles is necessary. In a typical application the "particles" would have other properties as well, like size, charge, chemical whatever, name, or whatever else. Agents binds all of this into a dedicated struct that has all the properties corresponding to a particle, and then using the id::Int field allows finding this particle. Agents.jl also has several functions for finding things nearby a position. nearby_agents(pos, model, r) will iterate over agents given a position and radius. How would I code this from CellListMap.jl? Then, the plotting by itself is not just "a feature", but an absolutely massive piece of work that took many many months, and allows extreme convenience: simply providing the model stepping functions turns the "simple plotting" into a fully interactive GUI that allows you to step the model on demand as well as change model parameters live during model evolution. Another thing is data collection: it is trivial to collect data by using Agents.jl to later analyzing them with DataFrames.jl, see the Schelling example for how easy this becomes. For example, if I wanted to collect the chemical potential property of all agents, average it over agents, but only include agents that have mass > 0.5, it would still take me only a single line of code to do so. In fact, I'd say just the data collection by itself would be a reason to use Agents.jl. There are many more things you can get from Agents.jl that are well beyond computing distance-based interactions between particles. The API listing is too large to start listing things here one by one: https://juliadynamics.github.io/Agents.jl/stable/api/ . So, I think the better way to think about it is not "what am I gaining by using Agents as a user of CellListMap". The better way is "what am I gaining as a user of Agents by using CellListMap". The latter makes more sense, because Agents.jl is a generic simulation framework, and thus it often doesn't provide specialized computation acceleration tools out of the box.

Therefore, the best person to do this PR into Agents.jl that adds an integration example is a user that has already benefitted by combining both software, i.e., @ehsanirani .

and the example feels artificial

Here I have to disagree. The point of the example is to show how one can integrate the two software, not how to make novel science nor how to use the most of either. Just how to use both at the same time. Users of Agents.jl will then know how to utilize this to speed up calculations in continuous space scenarios. Users of CellListMap.jl will see that there are frameworks that allow doing really complicated simulations seamlessly, especially when it comes to the data collection part. Also providing incredible plotting infrastructure out of the box.

the code is faster, as shown below

That's true, but that's also expected, as the agents in Agents.jl are stored in a dictionary that is slower to iterate over. This cost comes from the convenience of being able to identify agents by their ID, which remains unique and unchangeable during the simulation. However, I am surprised that the code is twice as fast. Later I will take a look at the code in more detail because this difference is surprising and much larger than what it should be from the change of vector to dictionary.

@Datseris
Copy link
Member Author

Datseris commented Jul 28, 2022

With two very simple changes I immediatelly noticed, the difference in performance becomes much smaller:

With agents
  2.228872 seconds (3.91 k allocations: 833.219 KiB)
No agents
  1.751680 seconds (2.14 k allocations: 625.312 KiB)

code at: https://github.com/Datseris/agents_with_celllistmap

Notice that the run! function performs both the model evolution, as well as data collection. But the bulk of the performance difference came from using ntuple instead of Tuple.

@Datseris
Copy link
Member Author

Notice also that besides the fact that in one case we iterate over a dictionary while in the other over a vector, one more reason the performance of using both frameworks will never reach the perforamnce of using only CellListMap.jl is due to the duplicacy of information of the agent positions, as well as updating them twice in the integration case.

@Datseris
Copy link
Member Author

Separating the model creation from the model stepping gives:

With agents, only stepping
  2.260443 seconds
No agents, only stepping
  1.725879 seconds

which highlights that (1) the difference in memory usage is only during model creation and (2) all performance difference is still in the stepping, not initialization part.

I still need to make a version that uses only Agents.jl and interacting_pairs.

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

Hm, I am not sure what you mean by that. The position is a field of a struct, the agent struct.

The agent type could be parametrized for the type of coordinates (i. e. struct Agent{V} pos::V end).

Concerning all the integration importance and the features of Agents: To be clear, I do not doubt that a lot can be gained by using the Agents infrastructure. The thing is that, not being myself (currently) a user of Agents, I simply do not know if what I've written as an Integration example uses the Agents infrastructure enough so that all that is really possible.

I still need to make a version that uses only Agents.jl and interacting_pairs.

Yes, that is the interesting one (I would expect more than 2x :-), particularly for large systems, where the computation of pairwise interactions dominate - note that using a vector instead a dictionary allows for simd, memory alignment and locality, etc. There is a lot to be gained there).

@Datseris
Copy link
Member Author

Datseris commented Jul 28, 2022

@lmiq could you help me in the following, just to be sure I don't mess up:

The cutoff parameter that is calculated at the start is so that we only calculate forces between agents that are within distance cutoff of each other, right? Because later there is also

    d = sqrt(d2)
    if d  (ri + rj)

and only then forces are calculated. Could you please clarify why there are two "cutoffs" and what does each one succeed in doing? EDIT: And also please tell me what is d2. EDIT2: It's the squared distance between particles which I assume is automatically computed in CellListMap.jl but in the Agents-only iteration it isn't. Do I still have to compute it, or is the first cutoff enough?

@AayushSabharwal
Copy link
Collaborator

I had a quick glance at this thread, and it looks really cool! A couple things I'd like to add:

@Datseris
I'll consider whether it is useful to create some extra fields in the ContinuousSpace struct in Agents.jl, so that it out-of-the box has the things expected by CellListMap already set up (there will be a boolean flag whether to enable this so the user can choose).

While I'm sure you've already thought of this, I think we should only modify the struct if it's absolutely necessary for a clean API and is general enough to be applicable to a large variety of models, for the same reason that we keep pathfinding separate. If CellListMap has a breaking change, to support it we might need a breaking change as well.

@lmiq
note that using a vector instead a dictionary allows for simd, memory alignment and locality, etc. There is a lot to be gained there).

The downside here is that deleting and adding agents during a simulation becomes much more complicated. Either we do the O(n) operation of deleting an element from an array, or leave a hole there and add a check whenever that data is accessed. With the second approach, we also need to start tracking how many holes there are, periodically clean it up and/or add new agents to those holes. The exact semantics of doing so in a bug free manner are non-trivial at best. It would also require measuring at what point the performance difference is noticeable, and by how much.

That said, it would be cool if we could have our cake and eat it too, as the saying goes :)

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

Yeah, I could have simplfied that. The idea there is that each particle has a radius, and the interaction potential is zero if the distance between the particles is greater than the sum of the radii of the two particles involved. The cutoff that is used to build the cell lists must be the sum of the two greater radii of the system, so no interaction is lost.

Yet, in the example I am starting all particles with the same radius, because of this default:

Particle(id, pos, vel) = Particle(id, Tuple(pos), Tuple(vel), 10.0, 1.0, 1.0)

(all radii are 10.0), thus the cutoff can be just 20.

Clarifying a little bit further:

The cell list computation will run over all pairs of particles closer than a cutoff. This is the cutoff of the code. That is independent on how the forces are calculated for one pair of particles, except that the forces won't be calculated if the distance between the particles is greater than that cutoff. In this example, the particles have radii, and the interaction is only different from zero if the distance between the particles is smaller than the sum of their radii. That can be simply the same cutoff if all particles have the same radii and cutoff = 2 * radius, but it can be more general than that. We can simplify the model in this sense, but in this case it would not make sense to have a radius field for the agents, since all agents would be identical.

Yes, d2 is the squared distance, that is computed by CellListMap and passed to the inner function that is mapped, generally it is used, or it would have to be recalculated.

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

The downside here is that deleting and adding agents during a simulation becomes much more complicated.

Yes, certaintly there is some gained flexibility with the current Agents approach, although for the applications I'm used to the downsides I think are greater than the gains. Anyway, in the lines of your comment, I think having a clear path on to how use CellListMap in the context of Agents is good enough, CellListMap is a very specialized package for computing pairwise interactions within a cutoff (I think it may be much faster than the interaction_pairs function, actually, and it supports general periodic boundary conditions, is parallel, etc). How important would be to have Agents much faster in the cases where CellListMap applies depends on how frequent are these cases and the sizes of the systems people use to simulate.

@Datseris
Copy link
Member Author

Yes, as expected, it is much faster. About 10x. Here are the numbers (code is at https://github.com/Datseris/agents_with_celllistmap)

With Agents.jl
  0.209374 seconds (3.97 k allocations: 841.672 KiB)
No Agents.jl
  0.163874 seconds (2.14 k allocations: 625.641 KiB)
Only Agents.jl
  2.537267 seconds (14.01 M allocations: 812.777 MiB, 28.40% gc time)
With Agents.jl, only stepping
  0.215256 seconds
No Agents.jl, only stepping
  0.165318 seconds
Only Agents.jl, only stepping
  2.655041 seconds (14.01 M allocations: 815.216 MiB, 26.31% gc time)

I have to admit however, that we've spent no time optimizing interacting_pairs. We've spent a ridiculous amount of time optimizing nearby_ids, but this is a completely different beast ;)

How about we open a PR for an integration example using the code we all have, and start commenting on that PR how to make things clearer?

@Datseris
Copy link
Member Author

@AayushSabharwal a bit off topic, but here the idea we've breafly chatted about once becomes useful: The "different versions of AgentBasedModel and a version for "unkillable" agents, which uses vector and hence much better performance. Would you please be so kind to open a new issue discussing this, before we forget it? There are benefits to implement this, it's clear, so let's have an issue for it in case someone has the time to do it.

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

One note: CellListMap provides a neighborlist function, which can almost be used identically to what you implemented there in interaction_pairs, it will return the list of neighbors withing the cutoff (:all) - but also their distances. Maybe that is an interesting alternative, in terms of simplicity for the user - it will be slightly slower than the lower level interface we are using now, but probably still faster than what you have.

see: https://m3g.github.io/CellListMap.jl/stable/examples/#The-CellListMap.neighborlist-function

edit: one thing that you inspired me is to implement this as an iterator. I'll see if I can do that.

@Datseris
Copy link
Member Author

Cool, that's great to know! But I think if a user goes for integrating this package, they probably should also go for the extra 5 lines of code that use the low level and faster interface. That is, unless you manage to do the iterator version of neighborlist and you notice that there is practically no performance defecit in using that (at the moment simply allocating this large vector is already pretty costly)

@lmiq
Copy link
Contributor

lmiq commented Jul 28, 2022

I was looking at the Agents code now, and maybe it is possible to create a bridge package that implements the same interfaces you have, but with CellListMap on the lower level. Maybe then we would need only to load the package and specify some specific types of models, which would wrap the current models for dispatch. Probably nearby_ids can be accelerated as well, assuming the cellist is available already.

While we build the example I will consider that.

@ehsanirani
Copy link

am working in a minimalist version of the integration here:

https://github.com/lmiq/agents_with_celllistmap/

@lmiq Is there any specific reason that you use parallel=false in your integration example? I think an important gain that someone can get from such integration is the EASY parallelization that CellListMap.jl can bring to Agents.jl, to the neighbor finding and force calculation part (which is computationally the heaviest part?). So it might worth to have it in the integration example, and also in the benchmarks.

@lmiq
Copy link
Contributor

lmiq commented Jul 29, 2022

Yes, sure. That was the first attempt. To effectively parallelize we need to store two additional auxiliary structures. I'll do that asap.

@lmiq
Copy link
Contributor

lmiq commented Jul 29, 2022

I have updated all the codes including the parallel version and some additional benchmarks, in my repo (here).

If my benchmarks are correct, the 10x speedup is observed in the 1k particles example, but with more particles the difference becomes much greater (200x for 10k particles for example).

The parallelization is not good for 1k particles, but provides greater speedups for larger systems. For 100k particles I get 3.5x (with 4 cores/8 threads, in my laptop).

julia> include("./Particles.jl")
  Activating project at `~/Drive/Work/JuliaPlay/agents_with_celllistmap`
-------------------------
 n = 1000 
 nsteps = 1000 
-------------------------
With Agents.jl
  0.268133 seconds (176.20 k allocations: 20.066 MiB)
No Agents.jl
  0.215426 seconds (174.63 k allocations: 18.273 MiB)
Only Agents.jl
  3.096844 seconds (14.01 M allocations: 812.723 MiB, 33.90% gc time)
With Agents.jl, only stepping
 serial: 
  0.275467 seconds (141 allocations: 23.000 KiB)
 parallel: 
  0.303102 seconds (162.46 k allocations: 17.515 MiB)
No Agents.jl, only stepping
 serial: 
  0.210262 seconds (152 allocations: 24.562 KiB)
 parallel: 
  0.219809 seconds (162.18 k allocations: 15.926 MiB)
Only Agents.jl, only stepping
  2.674290 seconds (14.01 M allocations: 812.399 MiB, 25.45% gc time)
-------------------------
 n = 10000 
 nsteps = 50 
-------------------------
With Agents.jl
  0.182498 seconds (81.89 k allocations: 17.305 MiB)
No Agents.jl
  0.121114 seconds (70.41 k allocations: 15.874 MiB)
Only Agents.jl
 43.372756 seconds (7.01 M allocations: 478.473 MiB, 0.41% gc time)
With Agents.jl, only stepping
 serial: 
  0.371516 seconds (2.06 k allocations: 1.451 MiB)
 parallel: 
  0.177183 seconds (10.41 k allocations: 2.354 MiB)
No Agents.jl, only stepping
 serial: 
  0.216945 seconds (2.02 k allocations: 1.386 MiB)
 parallel: 
  0.116470 seconds (10.06 k allocations: 1.998 MiB)
Only Agents.jl, only stepping
 44.099356 seconds (7.00 M allocations: 476.659 MiB, 0.35% gc time)
-------------------------
 n = 100000 
 nsteps = 1 
-------------------------
julia> compare(n, nsteps)
With Agents.jl
  0.130113 seconds (202.37 k allocations: 70.953 MiB)
No Agents.jl
  0.063600 seconds (100.36 k allocations: 56.182 MiB)
Only Agents.jl
*** I killed it after some minutes

I'm not sure "where" do you want to post pull request about this. Feel free to do it. (@ehsanirani is the author of the original code from which I started simplifying, so his opinion and agreement would be important here - also I think his example can be provided in a linked repo for a more interesting and beautiful result).

@lmiq
Copy link
Contributor

lmiq commented Jul 29, 2022

I've added a version of the integration example here:

https://github.com/lmiq/agents_with_celllistmap/blob/main/celllistmap.jl

which I'm commenting and polishing to become a file in your Agents/examples directory (possible PR soon). Then I will write the corresponding docs page and submit a PR as well. If you want to contribute already, go for it, I won't be able to work further on it until Monday.

@Datseris
Copy link
Member Author

In Agents.jl we have a dedicated page for integration examples and hence the example needs to be added here: https://github.com/JuliaDynamics/Agents.jl/tree/main/examples . The examples use Literate.jl to convert runnable Julia files (which we already have) into markdown pages for use with Documenter.jl. I can guide the person that does the PR into correct format.

@ehsanirani please let us know whether you will continue from the source code we currently have in @lmiq 's repo and make an example or not. Otherwise I can do it sometime in the near future.

@lmiq
Copy link
Contributor

lmiq commented Jul 30, 2022

I was not aware of Literate.jl. I can easily format the file, and will probably use it at the CellListMap docs as well.

@ehsanirani
Copy link

ehsanirani commented Aug 2, 2022

@ehsanirani please let us know whether you will continue from the source code we currently have in @lmiq 's repo and make an example or not. Otherwise I can do it sometime in the near future.

I think @lmiq example is quite satisfactory. But if you mean the bacteria use-case, the repo. here is already revised. I can write some docs maybe in a week or two, if it is still needed by then.

@Datseris
Copy link
Member Author

Datseris commented Aug 2, 2022

Yes please, either of you should go ahead and open a PR here which I can then work on as well if need be.

@lmiq
Copy link
Contributor

lmiq commented Aug 2, 2022

I'm almost ready with the Literate version here: https://github.com/lmiq/agents_with_celllistmap/blob/main/celllistmap.jl

(let us not duplicate work :-) )

@lmiq
Copy link
Contributor

lmiq commented Aug 2, 2022

Added the example file. I hope that it will render correctly (never used that before, but I tried to follow the same path that was done for the other examples).

@AayushSabharwal
Copy link
Collaborator

Is this issue resolved? If so, can we close it?

@Datseris
Copy link
Member Author

yes. @lmiq had some ideas for further integration, but we should be clean and discuss these in a different issue when the time arrives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request example-integration A new example or showcase of integration with the Julia ecosystem performance
Projects
None yet
Development

No branches or pull requests

4 participants