Step index of DataCollector Dataframes are not synchronized between agent and model variables #1768

jabrell · 2023-08-17T14:47:10Z

Describe the bug
When using the DataCollector, the model-step indexes of the DataFrames returned by .get_model_vars_dataframe() and .get_agent_vars_dataframe() differ if the DataCollector is called after schedule.step(), i.e., if the model's step function looks like:

def step(self):
    self.schedule.step()
    self.datacollector.collect(self)

The "Step" index of the DataFrame returned by .get_agent_vars_dataframe() will start at 1. In contrast, the index of the DataFrame from .get_model_vars_dataframe() will start at 0. This makes it difficult to merge the frames during result processing.

Expected behavior
The indexes referring to the model step should be consistent. In the setting above, I would expect both step indexes to start at 1 (since the model has completed one step before the first report).

To Reproduce
The behavior can be reproduced by changing the step function of the MoneyModel to the code above, in the "Collecting data" section of the introductory tutorial.

Additional context
I am a newbie to Mesa, so I can only guess at the reasons. My first guess is that the different indexes are caused by the different treatment of model_reporters and agent_reporters in the collect function of the DataCollector. The model_reporters data seems to be list-based, with the index derived implicitly, so the index starts at zero regardless of whether the collector is called before or after the step is executed. In contrast, the agent_reporters use the scheduler's steps property, which has already been advanced by one if the collector is called after the step is executed.
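
To make that guess concrete, here is a minimal sketch of the suspected mechanism. It is a simplified stand-in, not Mesa's actual code: ToyScheduler, ToyCollector and run are hypothetical names, and the only assumptions are the ones stated above (model values appended to a list, agent records keyed by the scheduler's step counter).

```python
class ToyScheduler:
    """Stand-in for a scheduler that counts completed steps."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


class ToyCollector:
    """Simplified stand-in for a data collector (not Mesa's actual code)."""

    def __init__(self):
        self.model_vars = []     # list-based: the index is the list position
        self.agent_records = {}  # keyed by the scheduler's step counter

    def collect(self, scheduler, value):
        self.model_vars.append(value)
        self.agent_records[scheduler.steps] = value


def run(collect_after_step, n_steps=3):
    """Return (model index, agent index) for the chosen call order."""
    scheduler, collector = ToyScheduler(), ToyCollector()
    for _ in range(n_steps):
        if collect_after_step:
            scheduler.step()
            collector.collect(scheduler, scheduler.steps)
        else:
            collector.collect(scheduler, scheduler.steps)
            scheduler.step()
    model_index = list(range(len(collector.model_vars)))
    agent_index = sorted(collector.agent_records)
    return model_index, agent_index


print(run(collect_after_step=False))  # ([0, 1, 2], [0, 1, 2])
print(run(collect_after_step=True))   # ([0, 1, 2], [1, 2, 3])
```

In this toy version, only the call order with collection after the step produces the mismatch, which matches what I observe.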


rht · 2023-08-19T10:16:16Z

It's definitely confusing for the use case when you want to merge the 2 DF's. I don't have a good design solution for this. For now you can just do df_model.index += 1 or something.

But even without merging, people could confuse the index of the model DF as step number.
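
A rough sketch of the shift-then-merge workaround, written in plain Python so it has no dependencies. The data values and the names model_rows, agent_rows and shift_index are made up for illustration; with pandas, the shift itself is the df_model.index += 1 one-liner mentioned above.

```python
# Hypothetical data from a run where collect() is called after the step.
model_rows = [10, 12, 15]                # implicit positions 0, 1, 2
agent_rows = {                           # keys are (step, agent_id)
    (1, "a"): 4, (1, "b"): 6,
    (2, "a"): 5, (2, "b"): 7,
    (3, "a"): 8, (3, "b"): 7,
}


def shift_index(rows, offset=1):
    """Re-key positional rows so that row i is labelled step i + offset."""
    return {i + offset: value for i, value in enumerate(rows)}


model_by_step = shift_index(model_rows)

# Attach the model-level value to every agent record of the same step.
merged = {
    (step, agent_id): {"agent_value": value, "model_value": model_by_step[step]}
    for (step, agent_id), value in agent_rows.items()
}

print(merged[(1, "a")])  # {'agent_value': 4, 'model_value': 10}
```

Without the shift, the lookup for step 3 would fail and steps 1 and 2 would silently pick up the wrong model rows, which is the real danger here.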

jabrell · 2023-08-21T14:17:35Z

That's what I ended up doing. I see the point that there is no "easy" fix for this.

To lower the confusion, what about adjusting the docstring of "get_model_vars_dataframe"? Currently it states:
The DataFrame has one column for each model variable, and the index is (implicitly) the model tick.

Maybe changing it to something like:
The DataFrame has one column for each model variable, and the index is equal to the order of reporting these variables. If the data collector is called before the step function of the model, that (implicitly) equals the model step.

rht · 2023-09-09T13:01:42Z

@jabrell we are marking this to be resolved before the next patch release.

NirobNabil · 2024-03-26T10:00:53Z

Hello @rht, I was looking into this issue as preparation for GSoC 2024. Do you think it would be a good idea to change model_vars[var] from lists to dictionaries? If so, I can try to submit a PR with the accompanying changes.

Also, I found some TODOs while going through the code. Given the extensive discussions about revamping the DataCollector, is it a good idea to work on those TODOs? If you are planning to replace the DataCollector with new code anyway, working on them wouldn't make sense.

rht · 2024-03-27T07:47:09Z

> hello @rht, I was looking into this issue as preparation for gsoc24. Do you think it would be a good idea to change model_vars[var] from Lists to Dictionaries? In that case i can try to submit a PR with accompanied changes.

SGTM. There shouldn't be much performance hit due to switching to dictionary.

It's fine to work on the TODO's for components that are about to be deprecated, for exercise/educational purpose. For example, place_agent will be deprecated, but we accept #2083 anyway.
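
For reference, a minimal sketch of what the list-to-dictionary idea could look like. This is a hypothetical illustration, not Mesa's DataCollector and not the eventual PR: StepKeyedCollector and ToyModel are invented names, and the sketch assumes the model exposes a steps counter.

```python
from collections import defaultdict


class StepKeyedCollector:
    """Hypothetical collector storing model variables as {step: value}."""

    def __init__(self, reporters):
        self.reporters = reporters           # name -> callable(model)
        self.model_vars = defaultdict(dict)  # name -> {step: value}

    def collect(self, model):
        step = model.steps                   # explicit key, not list position
        for name, reporter in self.reporters.items():
            self.model_vars[name][step] = reporter(model)


class ToyModel:
    """Toy model with an assumed `steps` counter."""

    def __init__(self):
        self.steps = 0
        self.wealth = 100
        self.collector = StepKeyedCollector({"wealth": lambda m: m.wealth})

    def step(self):
        self.steps += 1
        self.wealth += 10
        self.collector.collect(self)         # collected after the step


model = ToyModel()
for _ in range(3):
    model.step()

print(dict(model.collector.model_vars["wealth"]))  # {1: 110, 2: 120, 3: 130}
```

Keying by step means the model index follows the step counter no matter where collect() is called. One trade-off is that collecting twice within the same step overwrites the earlier value.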

NirobNabil · 2024-03-28T04:47:13Z

Thank you, I'll get to work right away.

tpike3 added the bug label on Sep 9, 2023
