Native support for multiple agent types #1894

EwoutH · 2023-11-30T11:56:43Z

On 2023-12-17 this PR was fully updated based on all feedback.

This PR adds a new agents dictionary to the Mesa Model class, enabling native support for handling multiple agent types within models. This way all modules can know which agents and agents type are in the model at any given time, by calling model.agents.

Motivation

NetLogo has had agent types, called breeds, built-in from the start. It works perfectly in all NetLogo components, because it's a first class citizen and all components need to be designed to consider different breeds.

In Mesa, agent types are an afterthought at best. Almost nothing is currently designed with multiple agent types in mind. That has caused several issues and limitations over the years, including:

Especially in scheduling, space and datacollection, lack of a native, consistent construct for agent types severely limits the possibilities. With the discussion about patches and "empty" this discussion done again. You might want empty to refer to all agents or only a subset of types or single type. That's currently cumbersome to implement.

Basically, by always having dictionary available of which agents of which types are in the model, you can always trust on a consistent construct to iterate over agents and agent types.

Conceptualisation

@tpike3 has made a really nice conceptual overview how Mesa now looks with this construct:

Implementation Details

The Model class now uses a defaultdict(list) to manage agents. This eliminates the need for explicit checks when adding new agent types.
Updated the Agent class to automatically register itself with the model upon creation, leveraging the defaultdict functionality.
Modified the remove method in the Agent class, ensuring safe removal of agents.

Advantages

In the agent initialization, the agent gets added to the Model's agents dictionary. Thereby, you can always, from everywhere:

Acces the available agent types (using the model.agent_types property)

Iterate over the different agent types, by:

for agent_type, agents in self.agents.items():
    yield agent_type, agents

Get all the agents of a certain type by model.agents_by_type[AgentType]

This will make proper multiple agent support in all Mesa components (the scheduler, grid, datacollector and visualisation) easier, because there is now a consistent model variable they can access.

Backwards compatibility

This PR itself breaks nothing, it just adds a dictionary to the model.

If you have a single agent, looping over the agents will just mean that only one agent type is iterated over (since there is always at least one agent type). So this work well with both model with a single and multiple agent types.

Future implications

While this PR itself won't break current models, if we modify the scheduler, datacollector or grids to use this construct, it might start breaking models. By using weak references (weakref), we might be able to limit backwards breaking changes.

Review

Since this is a very important change to the core of Mesa, I recommend really critically reviewing this PR, both conceptually (what are the implications for users) and implementation wise (is this the most efficient, scalable and flexible solution).

Any feedback is appreciated!

codecov · 2023-11-30T11:58:28Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (76229a5) 77.35% compared to head (2154cb1) 77.53%.
Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1894      +/-   ##
==========================================
+ Coverage   77.35%   77.53%   +0.17%     
==========================================
  Files          15       15              
  Lines        1007     1015       +8     
  Branches      220      221       +1     
==========================================
+ Hits          779      787       +8     
  Misses        197      197              
  Partials       31       31

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

EwoutH · 2023-11-30T12:10:35Z

mesa/model.py

@@ -41,6 +41,29 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:
        self.running = True
        self.schedule = None
        self.current_id = 0
+        self.agents_by_type = {}
+
+    def create_agent(self, agent_type, *args, **kwargs):


@rht Could we type hint that agent_type should be an instance of Agent? How would that work?

If I am not mistaken it should be the same as in the agent class where you pass in the model model: Model so it would be agent:Agent you will need to import the Agent module though.

rht · 2023-11-30T13:58:36Z

I'd say a less drastic change would be to modify the schedulers. Currently, there needs to be a permutation of [random activation, staged activation, simultaneous activation, discrete event, ...] x [1 agent, multiple agent]. This needs to be reduced to 1D of [random activation, staged activation, simultaneous activation, discrete event, ...] that is multiple type by default.

There are 2 problems:

if you specify agents_by_type in the model, the implementation-specific collection container will be tied to the model class and can't be changed. There could be a fast backend typed dict implementation via Numba/Cython/etc, and if you solder in the agents_by_type, it is no longer modular.
the agents container is replicated 3x in 3 objects: the model, scheduler, and space.

rht · 2023-11-30T14:02:26Z

the agents container is replicated 3x in 3 objects: the model, scheduler, and space.

This is not the case given that you said the scheduler will access model.agents_by_type directly, so that it is only 2x.

rht · 2023-11-30T14:11:38Z

There could be a fast backend typed dict implementation via Numba/Cython/etc, and if you solder in the agents_by_type, it is no longer modular.

Actually, it could still be modular even in the model class. There could be an argument container_backend that could be default, numba_typeddict, etc.

Note this all can be implemented without breaking compatibility by modifying the schedulers to by multiple type by default, so can happen without having to wait until Mesa 3.0. The remaining issue of whether to put the agents_by_type in the model or in the scheduler can be decided later.

rht · 2023-11-30T14:25:54Z

Another problem is that this is limited to grouping based on agent classes. Sometimes you want to "groupby" agents with various other metrics (inside/outside a bar in the El Farol model, the condition in Epstein civil violence model, or solvent/insolvent bank). This is the same issue I raised in the multi-agent data collection thread #348 (comment). See also https://github.com/tpike3/multilevel_mesa.

EwoutH · 2023-11-30T15:01:09Z

Thanks for the extensive replies!

if you specify agents_by_type in the model, the implementation-specific collection container will be tied to the model class and can't be changed. There could be a fast backend typed dict implementation via Numba/Cython/etc, and if you solder in the agents_by_type, it is no longer modular.

Sorry, I don’t understand the problem fully. Could you maybe give an example?

This is not the case given that you said the scheduler will access model.agents_by_type directly, so that it is only 2x.

Is there a fundamental reason we couldn’t keep everything in the model and let both the scheduler and the grid access if from there.

The remaining issue of whether to put the agents_by_type in the model or in the scheduler can be decided later.

For me, model feels like the right place. It’s the big overarching construct, from which other modules can access.

Another problem is that this is limited to grouping based on agent classes. Sometimes you want to "groupby" agents with various other metrics (inside/outside a bar in the El Farol model, the condition in Epstein civil violence model, or solvent/insolvent bank).

I think you’re spot on here: this is another problem. I’m suggesting native support for different kind (classes) of agents throughout the model, like NetLogo has breeds. You’re suggesting a universal way to filter agents on current state values. State values are parametric differences, while different classes are structural. So I think that are two different problems.

rht · 2023-11-30T15:10:34Z

if you specify agents_by_type in the model, the implementation-specific collection container will be tied to the model class and can't be changed. There could be a fast backend typed dict implementation via Numba/Cython/etc, and if you solder in the agents_by_type, it is no longer modular.

Sorry, I don’t understand the problem fully. Could you maybe give an example?

I already replied that it is not a problem, because you could pass a custom agent container constructor to the class.

Is there a fundamental reason we couldn’t keep everything in the model and let both the scheduler and the grid access if from there.

The grid/space is its own agent container that stores spatial relations between the agents. Needed for getting the neighbors.

I think you’re spot on here: this is another problem. I’m suggesting native support for different kind (classes) of agents throughout the model, like NetLogo has breeds. You’re suggesting a universal way to filter agents on current state values. State values are parametric differences, while different classes are structural. So I think that are two different problems.

If I were to summarize. #348 (comment) does collect() based on various ways to structure the agents. In this case, it is step() on various ways to structure the agents.

rht · 2023-11-30T17:20:52Z

I suppose I need to describe some concrete examples. I replaced model.agents_by_type with model.agents so that it is more concise.

# staged activation by type and condition
# Both wolf and sheep
model.agents.do_each("move")
model.agents.get_by(class=Sheep).do_each("eat_or_die")
model.agents.get_by(class=Wolf).do_each("hunt_or_die")
# Both wolf and sheep
model.agents.get_by(condition=lambda a: a.check_is_ready()).do_each("reproduce")

# RiskAverseTrader and AdaptiveTrader
# This is a string because a group could be a class that both classes inherit from, or for some conditional.
# The group argument is for when it is final, i.e. unchanged after the model initialization. For dynamic grouping, use the condition argument.
model.agents.get_by(group="trader").do_each("trade")

It's a bit more verbose than NetLogo's.
It's not declarative, because I suppose the scheduler is meant to describe chronology of actions.

EwoutH · 2023-11-30T18:27:24Z

This is a great example and I really like the cleanness. It reads well from left to right, like proper Python.

It's also nice to see the step not being one solid thing, but being modular and flexible. The Staged and ByType schedulers are also not that necessary anymore, and could maybe be replaced.

I have another idea, I will try to work it out tomorrow. But I already like this discussion and talking about the different options!

rht · 2023-11-30T18:52:14Z

I have another idea, I will try to work it out tomorrow. But I already like this discussion and talking about the different options!

I was using system 1 (in this context) to write that. But deliberating on system 2 is better, and need some time.

EwoutH · 2023-12-01T11:02:21Z

I started typing out this long detailed reply, only come to the conclusion that I basically ended up with the same solution you did.

model.agents solves the problem to access all agents from everywhere
get_by() solves the problem to easily and quickly select a subset of agents

I have still a few questions however:

What is agents exactly? Still a dict? Or a dataframe?
How does get_by() work on agents? Is it a method of it? (I imagined get_by as a model method, returning a set of agents)
What arguments does do_each() take? An agent or model function? As string? And probably something to randomize the order?

I would really like to see an implementation for your approach, since I think it's the right direction! We might still encounter some minor things but generally I really like it.

Okay, is my idea. I don't necessarily know if it's better, but maybe we can reach best of both.

It's based these premises:

A different agent class is by definition an different agent type. If you model an agent like a different class, there is some reason you felt the need to subclass, which could either be having different attributes or different methods. Otherwise you could just define an attribute for classification, like strategy is "risk averse" or "adaptive".
Since a different class has different attributes or methods, it is not guaranteed to be compatible among functions, so they should (by default) not be called together

Then, like I said before, I see this as two different problems:

A model-wide construct to keep track of which agents and agent types there are, which can be accessed from all modules
A efficient way to select/filter agent based on type, properties or states.

(note that in Python, a property and a state are both modelled as an agent attribute. How ever, the convention is to call things that don't change during the model run properties (they are static) and things that do change states (they are dynamic).

Now, the solution for problem 1) I have offered in this PR. I think that still is the best way to do it, just keep a dictionary of types. Is conceptually easy, straightforward to use and aligns well with PEP 20 in my opinion.

There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.

(I know I'm cheating since I'm Dutch)

Then, the solution for problem 2. I would propose to have an select_agents method added in the Model class that can select a subsection of agents based on different criteria. It could looks like this:

Edit: Started working on it and we basically have the same solution.

tpike3 · 2023-12-02T12:04:25Z

I started typing out this long detailed reply, only come to the conclusion that I basically ended up with the same solution you did.

model.agents solves the problem to access all agents from everywhere

get_by() solves the problem to easily and quickly select a subset of agents

I have still a few questions however:

What is agents exactly? Still a dict? Or a dataframe?

How does get_by() work on agents? Is it a method of it? (I imagined get_by as a model method, returning a set of agents)

What arguments does do_each() take? An agent or model function? As string? And probably something to randomize the order?

I would really like to see an implementation for your approach, since I think it's the right direction! We might still encounter some minor things but generally I really like it.

@EwoutH So this would be my take based on this conversation

Agents are python objects that are stored in a dictionary in mesa.time.py. Type only exists right now in RandomActivationByType to do get_by() should be fairly trivial to reference the dictionary key which will then return a list of those agent objects.*

do_each would then reference the function in that agent type object, So you would randomize the order of agents in the list which is already part of RandomActivationByType returned when you called get_by() and then iterate through the list executing the function passed in via do each()

*This leads to the question of whether the agents by type should be more basic (e.g. part of the time.py base scheduler), but code a little test a little :)

EwoutH · 2023-12-03T17:54:02Z

Thanks for getting back!

do_each would then reference the function in that agent type object, So you would randomize the order of agents in the list which is already part of RandomActivationByType returned when you called get_by() and then iterate through the list executing the function passed in via do each()

Wait, so you first get all the agent step functions, and then filter them? That sounds less efficient than first filtering the list of agents and then executing the step functions of those agents (but I'm not sure).

Originally, I imagined a model.select_agents() method that returned a selection of agents. Something like (from the model perspective):

rich_traders = self.select_agents(class=Trader, attributes=["wealth">=5])
total_wealth_rich_traders = sum(a.wealth for a in rich_traders)

However, I liked rth's proposal just as much, especially the do_each() part.

This leads to the question of whether the agents by type should be more basic (e.g. part of the time.py base scheduler), but code a little test a little :)

So in this proposal and in my opinion, definitely yes. And even more general than only in time.py, it should be in the model by default so you can always trust in any Mesa Module, Class and method that the type is available and you can iterate over it.

Agents are python objects that are stored in a dictionary in mesa.time.py.

This PR proposes to change that aspect (see the code diff and commit/PR messages). The agent types should be accessible from both the model (model.agents_by_type, or model.agents like rht proposed) and each agent (agent.agent_type, which is just the Class).

I'm now really curious for a full (draft) implementation of rht's proposal. @rht are you willing to make one?

tpike3 · 2023-12-08T12:26:24Z

Apologies for the delay, This is a pretty big architectural change, but merits discussion. And it makes me realize our current architectural drawing is out of date (below).

My thought, would be to discuss the architecture as I think it will reduce the cognitive load on the discussion. The idea would be I update the architecture design and then we can discuss around that. Let me know what you think.

EwoutH · 2023-12-13T12:59:49Z

@tpike3 looking forward to your new conceptualisation!

AgentPy also uses a construct like this in almost all their example models:

class VirusModel(ap.Model):
    def setup(self):
        # Create agents
        self.agents = ap.AgentList(self, self.p.population, Person)

class WealthModel(ap.Model):
    def setup(self):
        # Create agents
        self.agents = ap.AgentList(self, self.p.agents, WealthAgent)

So the concept of having one single variable that contains all the agents - in our case stored by type - in not new in Python ABM modelling.

EwoutH · 2023-12-16T12:42:04Z

Todo:
~~- Write tests for create_agent() method~~
~~- Update Boltzmann Wealth Model / tutorial~~

Edit: Both are not relevant anymore, the new implementation already takes care of both.

EwoutH · 2023-12-17T10:08:42Z

@quaquel thanks for you feedback. I looked at it, and have the following thoughts:

Using class names (.__name__) instead of direct class reference: I think going with class references is more logical in Mesa. We might want to filter, select, check or do other things depending on the type. Having the class reference themselves is then more flexible, it allows direct method access, inspection and type checking.
- As example, we might have a class Predator, which is a subclass of Animal. If we want to filter all animals, we can easily do that with the class reference (with isinstance()), but not with the class name.
Using the __subclasses__ dunder method: I didn't know this existed and this is very smart! The only concern I have with using that __subclasses__() method in Python does not track instantiation; it only tracks class definitions. So if we have some definitions but don't use them in a model run, there might be classes where there are no instances in the model of, leading to unexpected behavior.
```
class Model:
    @property
    def agent_types(self):
        return Agent.__subclasses__()
```
Considering that, an variable that's updated by the create_agent() and remove_agent() methods might be more fitting.
Specifying type as Property in the Agent Class instead of assigning it on init: I see the advantages of having it be a property and it's probably best practice. My only concern is the readability for less experienced programmers. I don't think I've ever encountered a model where the Agent class is changed during the model run, because that's the only case I can think of where it would be beneficial to define type as a property.
```
class Agent:
    @property
    def type(self):
        return self.__class__
```
Using Weak References is a really interesting idea. It would looks something like this right?
```
class Model:
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        self.agents_by_type = weakref.WeakValueDictionary()

    def create_agent(self, agent_type: type, *args, **kwargs):
        """Creates an agent of the given type with a unique ID."""
        agent = agent_type(self.next_id(), self, *args, **kwargs)
        self.agents_by_type.setdefault(agent.type, weakref.WeakSet()).add(agent)
        return agent
```
It would provide benefits in automatic cleanup and provide a backup if agents are removed using something else than the remove_agent() method, preventing memory leaks. However, I think the model would be the only place that they are explicitly referenced, assuming we modify the schedulers to use the model's agents_by_type in the future. If that only reference is a weak reference, it might . Considering that, I'm slightly leaning towards a robust remove_agent() method and keeping it a regular dict. That would be in line with the concept that in the end, the Model knows what Agents are part of it.
- It might be interesting to use weak references in the scheduler in the future however!

Thanks again for you suggestions, I'm learning a lot diving deeper in those Python constructs and thinking critically about how they would apply in our situation. So if you have the time, keep it coming!

@rht, @Corvince and @wang-boy, I'm also really curious on your takes on this!

quaquel · 2023-12-17T11:29:15Z

Just some thoughts without diving too deeply into MESA itself but coming at it as a generic devs/abm modeler with ample Python experience who also happens to teach ABM with MESA to MSc. students (and has build a pPython connector to NetLogo without ever having built a single NetLogo model).

I agree that having an agent_by_type overview at the model level is a good idea.

Similarly, keeping track of agents within the model should not be a concern of the user. This means using weakrefs everywhere (so in all schedulers and spaces) except in one place. You need to use hard refs in one place because, otherwise, agent instances are garbage collected once all references have been removed.Model.agent_by_type might be an ideal place for this one hard reference. Using weakrefs means no endless bookkeeping and removes possible locations for memory leaks.

What I don't like is burdening the user with additional api calls when creating and killing agent instances (i.e., create_agent and remove_agent). In particular, since all of this can, in principle, be handled within the agent class. As long as the user subclasses this and calls super in the __init__, the rest should just work. This means probably tying into some of Python's dunder methods. For example, __new__ and __subclasses__ can play a role in this.

When creating a new Agent instance, this should handle the creation of the identifier (do you even really need this....?) as well as adding the instance to Model.agent_by_type. If you want to remove an agent, you probably need a kill method on the agent class. This kill method should then remove the agent from Model.agent_by_type. This means that you have to think carefully about the data structure for Model.agent_by_type. It seems like defauldict(set) might be the easiest solution, although some benchmarking is probably in order. The reason for a kill method on the agent is that there is no Python equivalent for __new__ or __init__. The destructor dunder method is __del__, but this is only called once all references to the instance have been removed.

As an aside, I think looking at NetLogo for how you want to handle anything is strange. NetLogo is based on a now 50-year-old language. Why confine yourself to choices made 50 years ago if you can access a much more modern language with very sophisticated OO features?

EwoutH · 2023-12-17T11:46:36Z

Right, I forgot that Agents have access to their model instance from very start, since they’re initiated with it. I’m my head is was a problem, because I thought some things needed to happen from the model scope (so in a model method like create_agent), but since the agent knows it’s model, it can just happen from the agent __init__.

Thanks, I’m going to try to move the create and kill logic into the agent.

Edit: Not having to move people to start using create_agent() is also a major benefit that improved backwards compatibility. Obviously the way to go.

EwoutH · 2023-12-17T17:49:15Z

I reimplemented this PR fully in c29ab08 based on the discussions and updated the PR description. In short:

I removed the create_agent() model method, that's now handled in the Agent init
I replaced and renamed the agent_by_type dict to a agents defaultdict
I added a remove() method to the Agent
I use type(Agent) instead of Agent.__class__ (its more pythonic)
I added a agent_types property that gives a list of all agent classes in the model

With these changes, there is now no backwards incompatible behavior and users don't need to modify existing models.

mesa/agent.py

Tracks agents in the model with a defaultdict. This PR adds a new `agents` dictionary to the Mesa `Model` class, enabling native support for handling multiple agent types within models. This way all modules can know which agents and agents type are in the model at any given time, by calling `model.agents`. NetLogo has had agent types, called [`breeds`](https://ccl.northwestern.edu/netlogo/docs/dict/breed.html), built-in from the start. It works perfectly in all NetLogo components, because it's a first class citizen and all components need to be designed to consider different breeds. In Mesa, agent types are an afterthought at best. Almost nothing is currently designed with multiple agent types in mind. That has caused several issues and limitations over the years, including: - projectmesa#348 - projectmesa#1142 - projectmesa#1162 Especially in scheduling, space and datacollection, lack of a native, consistent construct for agent types severely limits the possibilities. With the discussion about patches and "empty" this discussion done again. You might want empty to refer to all agents or only a subset of types or single type. That's currently cumbersome to implement. Basically, by always having dictionary available of which agents of which types are in the model, you can always trust on a consistent construct to iterate over agents and agent types. - The `Model` class now uses a `defaultdict` to store agents, ensuring a set is automatically created for each new agent type. - The `Agent` class has been updated to leverage this feature, simplifying the registration process when an agent is created. - The `remove` method in the `Agent` class now uses `discard`, providing a safer way to remove agents from the model.

In the tests the MockModel currently isn't properly initialized, because super().__init__() isn't called, and thus the Model init isn't ran. This commit fixes that.

EwoutH · 2023-12-17T19:15:05Z

Final review checklist

PR description is updated including conceptual model
No breaking changes or user actions needed
Docstring is updated
Tests are added
Commits are cleaned up:
- b8b490c contains the feature changes
- 6b849da fixes old test
- 542a9b8 updates the model docstring
- a74b120 adds tests for new functionality
CI passes
Code coverage is increased

@tpike3 and @jackiekazil please do final review. Let's get this merged!

Corvince · 2023-12-17T22:03:34Z

I am sorry for putting on brakes on this, but I think one detail needs to be changed.

This stores agents in sets. While this appears to be the right datatype it has one big disadvantage: In Python the order of items in a set is non-deterministic. So in contrast to dictionaries they don't preserve insertion order.

Since mesa is used for research we shouldn't introduce non-deterministic behavior to allow completely reproducible results.

Since dictionaries work very similar, you can probably (ab)use dictionary keys instead of using a set.

Otherwise this really looks very clean and since it has no breaking changes very mergeable.

lists keeps insertion order.

EwoutH · 2023-12-17T22:38:09Z

In Python the order of items in a set is non-deterministic. So in contrast to dictionaries they don't preserve insertion order.

Fair point. Changed from a defaultdict of sets to a defaultdict of lists (see a8bb15f).

quaquel · 2023-12-18T06:56:34Z

You are taking a massive performance hit by going to lists when you want to remove agents.

I also fail to understand why, for research purposes, the order needs to be deterministic. The primary purpose of this is to keep track of agents per type. If you want deterministic behavior, you can always do a sort on ID and return a list. This would solve the performance loss while also ensuring deterministic behavior. Also, there are more sophisticated data structures that could be created that have the retrieval speed for removal of a set while providing deterministic behavior (e.g., ordered sets or some more involved linked list kind of thing such as a dict of dicts: [type]:{[id]:instance}).

So the question becomes: what is done more frequently: removing agents from the dict or retrieving the collection of agents for a specific agent type?

EwoutH · 2023-12-18T08:47:26Z

I did a benchmark to see the impact. Since most models never remove agents, it tool the Wolf-Sheep model from mesa-examples and ran 5 times with the default settings, 125 times. Here are the results:

The t-test is insignificant.

from scipy import stats
stats.ttest_ind(list_results, set_results)

TtestResult(statistic=-1.2073182676004544, pvalue=0.22845980311728672, df=248.0)

mesa_with_wolfsheep_benchmark.zip

Given that with a model that frequently deletes agents, the performance impact is unnoticeable, I would say lets go with the lists.

If we find some weird edge-case later we can always patch it. Perfect is the enemy of done.

EwoutH · 2023-12-18T10:37:41Z

@quaquel did some benchmarks (thanks!):

self.agents = defaultdict(list)
self.agents[type(agent)].append(agent)

Performance for adding and removing 1000 agents: 2.06 ms ± 3.11 µs

self.agents = defaultdict(set)
self.agents[type(agent)].add(agent)

Performance for adding and removing 1000 agents: 439 µs ± 1.3 µs

self.agents = defaultdict(dict)
self.agents[type(agent)][agent] = None

Performance for adding and removing 1000 agents: 460 µs ± 2.11 µs

All options are imperfect:

Lists are slow (and allow duplicates)
Sets don't keep insertion order (making runs non-deterministic)
dicts are ugly in usage (due to their redundant None values)

What we need in an OrderedSet: https://discuss.python.org/t/add-orderedset-to-stdlib/12730/85

Unfortunately, that doesn't exist in the standard library (yet).

There are some packages, the most popular being ordered-set: https://pypi.org/project/ordered-set/

from ordered_set import OrderedSet

self.agents: defaultdict[type, OrderedSet] = defaultdict(OrderedSet)

But we have to add that to the requirements.

Personally, I'm happy with:

Using lists
Using the OrderedSet from ordered-set

The dict with all values None I find simply too ugly from and UX point of view. Teaching Mesa is complicated enough as it's already. You wouldn't have to specify .keys() everytime you want some set of agents.

EwoutH · 2023-12-18T11:03:42Z

Benchmarked ordered-set, it's extremely slow (22.9 ms ± 1.09 ms, written in pure Python).

There's another package called Ordered-set-37 and it's a lot faster, almost as fast as the other set/dict options (615 µs ± 2.34 µs).

So, let's vote:

🚀: use defaultdict(dict) = None
🎉: use defaultdict(OrderedSet)
❤️: use defaultdict(list)

Corvince · 2023-12-18T11:57:57Z

Thanks for the performance benchmarks! I think for large lists, list should even perform much worse.

I voted for dict. It's a bit ugly, but it's also what happens for Ordered Set 37 and I think explicitly handling it in mesa instead of a hidden in a dependency is more explicit

tpike3 · 2023-12-18T11:59:25Z

self.agents = defaultdict(dict)
self.agents[type(agent)][agent] = None

This is really great @EwoutH! Substantive discussion and forward progress. I am very much leaning to option 1 but instead of the None would be possible to do

self.agents[type(agent)][agent.unique_id] = agent

This will get rid of the None values and although you will have things like .keys() I think that can also be a plus as it will give users more flexibility. In addition I would not be opposed to adding those simple functions (e.g. .keys(), .values() to model so users can get lists of agent ids or agent objects in an effort to mitigate your UX concerns. Let me know what you think

tpike3 · 2023-12-18T12:00:49Z

Thanks for the performance benchmarks! I think for large lists, list should even perform much worse.

I voted for dict. It's a bit ugly, but it's also what happens for Ordered Set 37 and I think explicitly handling it in mesa instead of a hidden in a dependency is more explicit

I agree and tried to think of a middle ground, let me know what you think

Corvince · 2023-12-18T12:01:13Z

Ah I didn't catch the last part of your comment. Dict should only be used internally and we should return a list via a getter. Turning the keys into a list should be negligible in terms of performance

EwoutH · 2023-12-18T12:09:47Z

Thanks for both your responses. My biggest concern is that with the dict you can't simply do

model.agents[SomeType]

anlymore. Now you have to add .keys(), and that's just ugly. I already hear a hunderd students ask "What does the .keys() do?" and then I have to explain it's a stupid implementation detail.

You can indeed solve that by hiding everything behind utility function, and while I'm a big fan of utility functions, in this case I think it just shouldn't be needed. It doesn't follow the spirit of PEP 20.

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Flat is better than nested.
Readability counts.
Special cases aren't special enough to break the rules.

Having agents be a dict is just so much more explainable than having it be a nested dict, even if the values are just None.

tpike3 · 2023-12-18T12:29:55Z

I registered my vote, but this is a great discussion and I am excited to see how it ends up. My two cents for what is worth are very much in line with @Corvince

I do not think we should import the Ordered Set 37 when we can just handle it natively
I like the performance of the dictionaries particularly as I saw someone run a model with Mesa last November with 22,921,301 nodes (https://github.com/njiang8/Shareable-Reusable-Synthetic-Population-AND-Three-Agent-Based-Models )

However, I do appreciate @EwoutH concerns having my own students struggle with dictionaries. Regardless, great job @EwoutH however this ends up it is going to be an awesome move forward.

EwoutH · 2023-12-18T12:35:51Z

Okay, it seems the majority is for a dict with None values. Let's do that.

I propose to handle all agent selection utility in the proposed select_agents() method from #1905.

This commit introduces dics nested into a single defaultdict for agent storage, using agent instances as inner keys and None as inner values. The decision is based on: 1. Determinism: A dictionary ensures deterministic behavior for reproducible research outcomes. 2. Performance: Benchmarks showed that dictionaries offer a good balance between performance and functionality, especially in models with frequent agent updates. 3. Usability and Explicitness: While the use of None values is unconventional, this approach is practical and avoids the complexity of nested structures or external dependencies. It aligns with clear and explicit coding practices, making the framework more accessible and maintainable. The choice of a defaultdict with None values addresses the need for deterministic behavior, performance efficiency, and clarity in implementation. Additional utility functions like `select_agents()` will be added to enhance usability. Discussion for historical reference: projectmesa#1894 (comment)

EwoutH · 2023-12-18T12:47:38Z

Implemented in 2154cb1. Let's get this merged.

tpike3

LGTM--- Thanks @EwoutH

And I always appreciate you improving the docstrings as you go!

EwoutH · 2023-12-18T13:10:05Z

Thanks everyone for all the feedback and critical thinking. We've made a great step here!

Next up: Selecting subsets of agents: #1905.

Corvince · 2023-12-18T13:13:51Z

Thanks for both your responses. My biggest concern is that with the dict you can't simply do
model.agents[SomeType]
anlymore. Now you have to add .keys(), and that's just ugly. I already hear a hunderd students ask "What does the .keys() do?" and then I have to explain it's a stupid implementation detail.

Yes, this should absolutely be the right syntax and I think we should make this work. I think we can implement our own OrderedSet (light) class by simply creating a class that inherits from dict and setting getitem. And we would use this to set the agents internally as _agents: defaultdict[type, OrderedSet] and public as agents: dict[type, list(agent)].

Native support for multiple agent types #1894

Native support for multiple agent types #1894

Conversation

EwoutH commented Nov 30, 2023 • edited Loading

Motivation

Conceptualisation

Implementation Details

Advantages

Backwards compatibility

Future implications

Review

codecov bot commented Nov 30, 2023 • edited Loading

Codecov Report

EwoutH Nov 30, 2023

Choose a reason for hiding this comment

tpike3 Nov 30, 2023

Choose a reason for hiding this comment

rht commented Nov 30, 2023

rht commented Nov 30, 2023

rht commented Nov 30, 2023

rht commented Nov 30, 2023

EwoutH commented Nov 30, 2023 • edited Loading

rht commented Nov 30, 2023

rht commented Nov 30, 2023

EwoutH commented Nov 30, 2023

rht commented Nov 30, 2023

EwoutH commented Dec 1, 2023

tpike3 commented Dec 2, 2023

EwoutH commented Dec 3, 2023 • edited Loading

tpike3 commented Dec 8, 2023

EwoutH commented Dec 13, 2023

EwoutH commented Dec 16, 2023 • edited Loading

EwoutH commented Dec 17, 2023

quaquel commented Dec 17, 2023

EwoutH commented Dec 17, 2023 • edited Loading

EwoutH commented Dec 17, 2023

EwoutH commented Dec 17, 2023 • edited Loading

Corvince commented Dec 17, 2023

EwoutH commented Dec 17, 2023

quaquel commented Dec 18, 2023 • edited Loading

EwoutH commented Dec 18, 2023

EwoutH commented Dec 18, 2023

EwoutH commented Dec 18, 2023

Corvince commented Dec 18, 2023

tpike3 commented Dec 18, 2023 • edited Loading

tpike3 commented Dec 18, 2023

Corvince commented Dec 18, 2023

EwoutH commented Dec 18, 2023 • edited Loading

tpike3 commented Dec 18, 2023

EwoutH commented Dec 18, 2023

EwoutH commented Dec 18, 2023

tpike3 left a comment

Choose a reason for hiding this comment

EwoutH commented Dec 18, 2023

Corvince commented Dec 18, 2023

EwoutH commented Nov 30, 2023 •

edited

Loading

codecov bot commented Nov 30, 2023 •

edited

Loading

EwoutH commented Nov 30, 2023 •

edited

Loading

EwoutH commented Dec 3, 2023 •

edited

Loading

EwoutH commented Dec 16, 2023 •

edited

Loading

EwoutH commented Dec 17, 2023 •

edited

Loading

EwoutH commented Dec 17, 2023 •

edited

Loading

quaquel commented Dec 18, 2023 •

edited

Loading

tpike3 commented Dec 18, 2023 •

edited

Loading

EwoutH commented Dec 18, 2023 •

edited

Loading