-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration with CellListMap.jl #659
Comments
Hi. I have a simple code to mimic run & tumble dynamics of bacteria in porous media (implemented for an unpublished researh project). CellListMap is used to find collisions between agents in the ABM model and an array of random fixed obstacles. Bacteria are modeled as sphero-cylinders, that adds a bit complexity to the code to find real distances and calculate torques. Would it be a good integration example (checkout the video)? I can also simplify the code, to use disks instead of cylinders. Then it would be much simpler, similar to the SIR example with addition of random obstacles (another research project of mine). Maybe @lmiq has a better suggestion for the integration example. test.mp4 |
I was looking at the Yet, as I understand, the function That loop, is however, sort of implicit in the way Thus, I think that I need to interact to something more internal, or access an API which is not apparent at first sight. Any help is appreciated. edit: it seems that |
A typical simulation written on top of CellListMap is here, if anyone wants to take a look: https://github.com/m3g/2021_FortranCon/blob/main/celllistmap_vs_namd/simulate.jl (these are molecular simulations - it would be nice to see how Agents could be used for such applications as well). |
@ehsanirani If you can share your running code (even if privately), I would like to see how do you integrate it with Agents. Mostly what I don't see is how to call a function once per step, and use the output of that in the update of the properties of every agent. |
I created a minimal repository on codeberg, please check it out: https://codeberg.org/ehsan/test-Agents-CellListMap
You are right, you need simething like |
That is a really nice application. I think as a an example it is better to make it simpler. But that is a perfect example. I will try to simplify it as much as possible, and put it as an integration example on the CellListMap docs as well. After trying to simplify it, I will post it here. Thank you very much, that is a very nice contribution to all, I think. |
I totally agree. I think most of the complexity comes from the fact that my agents (bacteria) are modeled as spherocylinders. Maybe assuming agents with simpler shape (disks?) helps with simplifying the code. |
Hello both, thanks so much for all the input and sorry for joining late! Yes, I agree with the suggestion of simplifying the rods to spheres. The added complexity of rods does not really affect the integration of the two libraries (if I have understood things correctly, the "rods" are in fact a bunch of spheres that are "linked" to each other, so in the end one still calculates distances between spheres). Although I must admit that the video with the roods looks so much cooler ;) . One thing we can do, provided that @ehsanirani is okay with this, is show the full integration with spheres only. Then, at the end, provide a comment like "one can create rods or other shapes from spheres, and the above would work in a similar manner. We do this in a different file, which produces this output:" and then show the video. I'm suggesting this because the code that makes rods out of spheres already exists and already works and is quite intelligent, so why not show it anyways? @lmiq let me know if there are questions about using Agents.jl. Yes, |
Depend on how hard or easy the integration example turns out to be, I'll consider whether it is useful to create some extra fields in the |
Oh wow, looking at the example here https://codeberg.org/ehsan/test-Agents-CellListMap/src/branch/main/src/bacteria.jl it seems SUPER straightforward to couple the two packages. As expected of Julia of course. |
@Datseris Spherocylinders are a bit more complicated than that. What you described is the model used in the bacteria growth example in Agents.jl docs. Anyway, I totally agree that a simpler example fits better with the docs/tutorials. Just let me know how I can help. PS. In a month or two, we are submitting a paper using the more advanced version of the code that I shared here, with chemotaxis and other cool biophysical features. Afterwards, I will make it open source and maybe it can be added to the docs, as an example of real research project done by Agents.jl and CellListMap.jl. |
I have started working with Julia since two months ago, and that is one of the reasons that I fell in love with it. Another reason is solving the two language problem. As a computational physicist, I really appreciate and admire it. I am already converting codes for 5 different projects to use Agents.jl, Molly.jl and CellListMap.jl. |
What is the size (number of bacteria and obstacles) of the actual simulations? CellListMap is expected to provide reasonable speedups for >10k interactions, I guess. That is probably a relevant information to provide in the example. I'm trying to simplify the example as much as possible (and learning how to use Agents.jl in the meantime). I will try to use that as an integration example on the CellListMap side as well. |
There are 500-1000 bacteria and 2500-5000 non-overlapping obstacles (different box sizes: volume fraction between 0.2 to 0.7) in each simulation. |
I am working in a minimalist version of the integration here: https://github.com/lmiq/agents_with_celllistmap/ The simulations ARE working (the video is produced), but:
As I mentioned, the code is not working yet, because I am still learning the Agents.jl interface. Probably I'm not the best person now to do this part of the integration, but it should be easy. If you want to comment there or even pull, that would be great. |
The example above is now running correctly, and produces this video: https://github.com/lmiq/agents_with_celllistmap/blob/main/test.mp4 The best way to take advantage of Agents.jl is to be discussed. Also it would be interesting to know how much CellListMap accelerates this simulation, relative to the "standard" way implemented in Agents.jl, as a function of the number of particles. |
You do not have to always use
No, not by default. They could though. In the docs of Agents.jl there is an integration example with DifferentialEquations.jl. However this version of
Yes, this is a design decision I am not happy with and may likely change in the future to
Unfortunately this duplication is necessary. It is by far the most intuitive and convenient way to represent agents as instances of structs with several fields, only one of which is a position. If each property of the agents was instead removed from the agents and into vectors, this would make for a less intuitive experience for the user, and much longer code. The bright side is that making this vector of positions is negligible cost both in memory and computationally when compared to computing anything using this vector. |
@lmiq in your example version, why is the struct |
@lmiq when you are okay with the example, I'd suggest to open a PR here adding it, and there I can also push on the code and/or comment on the code! |
Yes, I think it is important of us to do this. Although it's 99.99% certain that CellListMap will accelerate performance, it is anyways good to know by how much.
That's something I don't immediately understand. The speedup here is versus what other version of the code? Just looping over all "agents"? I am supersprised if this is the case, as I would expect CellListMap to provide performance acceleration pretty much immediatelly even for a few agents. In Agents.jl we also have a structure that compartiments the continuous space and makes rectangular compartments for accelarating nearest neighbor searches. it is practically immediatelly worth it to use this accelerating structure according to our testing. |
Couldn't the type of array be made generic?
It isn't. But if one wants to use parallel in-place updates of the cell lists, some more fields are necessary and it may be clearer what if for what there. It is just a stylistic choice, yes.
There is a price that is paid on constructing the cell lists. If the number of particles is too small, it it not worth it (or building neighbor lists, etc). But I tested now and with 100 particles it is already faster than a naive loop, so your intuition was correct.
The reason I'm not OK with the example is that I don't see what exactly I'm gaining in using The only thing I would get from using Agents, here, is the function to create the video, since the double loop on particles is implicit on CellListMap, and the integration scheme I had to write manually in both cases. Thus, my feeling is that I don't know what I can really gain by using Agents and its integration with other tools (from Agents itself or from the ecosystem in general), and the example feels artificial. julia> includet("./Particles.jl")
julia> using .Particles
julia> @time WithAgents.simulate();
run! progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:03
6.561433 seconds (8.04 M allocations: 435.431 MiB, 2.06% gc time, 54.78% compilation time)
julia> @time WithAgents.simulate();
run! progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:03
3.211807 seconds (64.54 k allocations: 4.913 MiB)
julia> @time NoAgents.simulate();
2.377066 seconds (387.15 k allocations: 21.186 MiB, 12.53% compilation time)
julia> @time NoAgents.simulate();
2.081114 seconds (2.12 k allocations: 622.031 KiB) |
great, thanks!
Hm, I am not sure what you mean by that. The position is a field of a struct, the agent struct. It needs to be concretely typed. We chose
Well this of course depends on the application. However, as far as I can tell there are many reasons to be using Therefore, the best person to do this PR into Agents.jl that adds an integration example is a user that has already benefitted by combining both software, i.e., @ehsanirani .
Here I have to disagree. The point of the example is to show how one can integrate the two software, not how to make novel science nor how to use the most of either. Just how to use both at the same time. Users of Agents.jl will then know how to utilize this to speed up calculations in continuous space scenarios. Users of CellListMap.jl will see that there are frameworks that allow doing really complicated simulations seamlessly, especially when it comes to the data collection part. Also providing incredible plotting infrastructure out of the box.
That's true, but that's also expected, as the agents in Agents.jl are stored in a dictionary that is slower to iterate over. This cost comes from the convenience of being able to identify agents by their ID, which remains unique and unchangeable during the simulation. However, I am surprised that the code is twice as fast. Later I will take a look at the code in more detail because this difference is surprising and much larger than what it should be from the change of vector to dictionary. |
With two very simple changes I immediatelly noticed, the difference in performance becomes much smaller:
code at: https://github.com/Datseris/agents_with_celllistmap Notice that the |
Notice also that besides the fact that in one case we iterate over a dictionary while in the other over a vector, one more reason the performance of using both frameworks will never reach the perforamnce of using only CellListMap.jl is due to the duplicacy of information of the agent positions, as well as updating them twice in the integration case. |
Separating the model creation from the model stepping gives:
which highlights that (1) the difference in memory usage is only during model creation and (2) all performance difference is still in the stepping, not initialization part. I still need to make a version that uses only Agents.jl and |
The agent type could be parametrized for the type of coordinates (i. e. Concerning all the integration importance and the features of Agents: To be clear, I do not doubt that a lot can be gained by using the Agents infrastructure. The thing is that, not being myself (currently) a user of Agents, I simply do not know if what I've written as an Integration example uses the
Yes, that is the interesting one (I would expect more than 2x :-), particularly for large systems, where the computation of pairwise interactions dominate - note that using a vector instead a dictionary allows for simd, memory alignment and locality, etc. There is a lot to be gained there). |
@lmiq could you help me in the following, just to be sure I don't mess up: The d = sqrt(d2)
if d ≤ (ri + rj) and only then forces are calculated. Could you please clarify why there are two "cutoffs" and what does each one succeed in doing? EDIT: And also please tell me what is |
I had a quick glance at this thread, and it looks really cool! A couple things I'd like to add:
While I'm sure you've already thought of this, I think we should only modify the struct if it's absolutely necessary for a clean API and is general enough to be applicable to a large variety of models, for the same reason that we keep pathfinding separate. If CellListMap has a breaking change, to support it we might need a breaking change as well.
The downside here is that deleting and adding agents during a simulation becomes much more complicated. Either we do the That said, it would be cool if we could have our cake and eat it too, as the saying goes :) |
Yeah, I could have simplfied that. The idea there is that each particle has a radius, and the interaction potential is zero if the distance between the particles is greater than the sum of the radii of the two particles involved. The cutoff that is used to build the cell lists must be the sum of the two greater radii of the system, so no interaction is lost. Yet, in the example I am starting all particles with the same radius, because of this default:
(all radii are 10.0), thus the cutoff can be just 20. Clarifying a little bit further: The cell list computation will run over all pairs of particles closer than a cutoff. This is the Yes, |
Yes, certaintly there is some gained flexibility with the current |
Yes, as expected, it is much faster. About 10x. Here are the numbers (code is at https://github.com/Datseris/agents_with_celllistmap)
I have to admit however, that we've spent no time optimizing How about we open a PR for an integration example using the code we all have, and start commenting on that PR how to make things clearer? |
@AayushSabharwal a bit off topic, but here the idea we've breafly chatted about once becomes useful: The "different versions of |
One note: CellListMap provides a see: https://m3g.github.io/CellListMap.jl/stable/examples/#The-CellListMap.neighborlist-function edit: one thing that you inspired me is to implement this as an iterator. I'll see if I can do that. |
Cool, that's great to know! But I think if a user goes for integrating this package, they probably should also go for the extra 5 lines of code that use the low level and faster interface. That is, unless you manage to do the iterator version of |
I was looking at the Agents code now, and maybe it is possible to create a bridge package that implements the same interfaces you have, but with CellListMap on the lower level. Maybe then we would need only to load the package and specify some specific types of models, which would wrap the current models for dispatch. Probably While we build the example I will consider that. |
@lmiq Is there any specific reason that you use |
Yes, sure. That was the first attempt. To effectively parallelize we need to store two additional auxiliary structures. I'll do that asap. |
I have updated all the codes including the parallel version and some additional benchmarks, in my repo (here). If my benchmarks are correct, the 10x speedup is observed in the 1k particles example, but with more particles the difference becomes much greater (200x for 10k particles for example). The parallelization is not good for 1k particles, but provides greater speedups for larger systems. For 100k particles I get 3.5x (with 4 cores/8 threads, in my laptop). julia> include("./Particles.jl")
Activating project at `~/Drive/Work/JuliaPlay/agents_with_celllistmap`
-------------------------
n = 1000
nsteps = 1000
-------------------------
With Agents.jl
0.268133 seconds (176.20 k allocations: 20.066 MiB)
No Agents.jl
0.215426 seconds (174.63 k allocations: 18.273 MiB)
Only Agents.jl
3.096844 seconds (14.01 M allocations: 812.723 MiB, 33.90% gc time)
With Agents.jl, only stepping
serial:
0.275467 seconds (141 allocations: 23.000 KiB)
parallel:
0.303102 seconds (162.46 k allocations: 17.515 MiB)
No Agents.jl, only stepping
serial:
0.210262 seconds (152 allocations: 24.562 KiB)
parallel:
0.219809 seconds (162.18 k allocations: 15.926 MiB)
Only Agents.jl, only stepping
2.674290 seconds (14.01 M allocations: 812.399 MiB, 25.45% gc time)
-------------------------
n = 10000
nsteps = 50
-------------------------
With Agents.jl
0.182498 seconds (81.89 k allocations: 17.305 MiB)
No Agents.jl
0.121114 seconds (70.41 k allocations: 15.874 MiB)
Only Agents.jl
43.372756 seconds (7.01 M allocations: 478.473 MiB, 0.41% gc time)
With Agents.jl, only stepping
serial:
0.371516 seconds (2.06 k allocations: 1.451 MiB)
parallel:
0.177183 seconds (10.41 k allocations: 2.354 MiB)
No Agents.jl, only stepping
serial:
0.216945 seconds (2.02 k allocations: 1.386 MiB)
parallel:
0.116470 seconds (10.06 k allocations: 1.998 MiB)
Only Agents.jl, only stepping
44.099356 seconds (7.00 M allocations: 476.659 MiB, 0.35% gc time)
-------------------------
n = 100000
nsteps = 1
-------------------------
julia> compare(n, nsteps)
With Agents.jl
0.130113 seconds (202.37 k allocations: 70.953 MiB)
No Agents.jl
0.063600 seconds (100.36 k allocations: 56.182 MiB)
Only Agents.jl
*** I killed it after some minutes
I'm not sure "where" do you want to post pull request about this. Feel free to do it. (@ehsanirani is the author of the original code from which I started simplifying, so his opinion and agreement would be important here - also I think his example can be provided in a linked repo for a more interesting and beautiful result). |
I've added a version of the integration example here: https://github.com/lmiq/agents_with_celllistmap/blob/main/celllistmap.jl which I'm commenting and polishing to become a file in your |
In Agents.jl we have a dedicated page for integration examples and hence the example needs to be added here: https://github.com/JuliaDynamics/Agents.jl/tree/main/examples . The examples use Literate.jl to convert runnable Julia files (which we already have) into markdown pages for use with Documenter.jl. I can guide the person that does the PR into correct format. @ehsanirani please let us know whether you will continue from the source code we currently have in @lmiq 's repo and make an example or not. Otherwise I can do it sometime in the near future. |
I was not aware of Literate.jl. I can easily format the file, and will probably use it at the CellListMap docs as well. |
I think @lmiq example is quite satisfactory. But if you mean the bacteria use-case, the repo. here is already revised. I can write some docs maybe in a week or two, if it is still needed by then. |
Yes please, either of you should go ahead and open a PR here which I can then work on as well if need be. |
I'm almost ready with the Literate version here: https://github.com/lmiq/agents_with_celllistmap/blob/main/celllistmap.jl (let us not duplicate work :-) ) |
Added the example file. I hope that it will render correctly (never used that before, but I tried to follow the same path that was done for the other examples). |
Is this issue resolved? If so, can we close it? |
yes. @lmiq had some ideas for further integration, but we should be clean and discuss these in a different issue when the time arrives. |
I just became aware of this package:
https://github.com/m3g/CellListMap.jl
Its goal is to accelerate computations of forces, or any kind of interactions, that occur between pairs of "agents" in 2D or 3D continuous space. Simply put, it does both what
interacting_pairs
currently does, and also computes the forces, which a user would have to do by hand after getting the interacting pairs. Performance benchmarks also look like its better than competing alternatives (common story of Julia packages it would seem).I think the best place to start integrating this package with Agents.jl is to write an integration example that showcases using it to accelerate computations occurring in an ABM. From there we can see what can be made even more general and integrated more formally into the source of Agents.jl.
The text was updated successfully, but these errors were encountered: