Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solara: Let users rewind sessions #1805

Closed
wants to merge 4 commits into from
Closed

Solara: Let users rewind sessions #1805

wants to merge 4 commits into from

Conversation

Corvince
Copy link
Contributor

Hi there!

I decided to re-implement one of my favorite experimental feature that I never managed to publish to the old visualization server. Its a feature I think very few ABM libraries are implementing, although I find it quite powerful and useful.

It basically implements that you can go back in time and analyze each step.

Let me show you a quick screencast of the feature

screen-capture.webm

As you can see with this change it is possible to go back to previous steps. @rlskoeser since you are very active at the moment, do you think this would be a useful feature for you? This is also a stepping stone on my way to implement another possibly interesting feature and that is running models (with different settings) side-by-side.

And I know this will come up in the discussion: Yes this implementation seems to be a bit RAM hungry in that each model step is saved. In reality I couldn't find any notable spikes in my ram usage, so we shouldn't make this a problem before it is one. Remember that we are dealing with visualisable models that tend to be fairly small (more testing would be needed for larger models)

@codecov
Copy link

codecov bot commented Sep 15, 2023

Codecov Report

Attention: 40 lines in your changes are missing coverage. Please review.

Comparison is base (7f5a32e) 81.47% compared to head (cd9184b) 81.15%.
Report is 13 commits behind head on main.

❗ Current head cd9184b differs from pull request most recent head 9480ced. Consider uploading reports for the commit 9480ced to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1805      +/-   ##
==========================================
- Coverage   81.47%   81.15%   -0.32%     
==========================================
  Files          15       15              
  Lines         896      897       +1     
  Branches      193      197       +4     
==========================================
- Hits          730      728       -2     
- Misses        142      146       +4     
+ Partials       24       23       -1     
Files Coverage Δ
mesa/experimental/jupyter_viz.py 38.69% <34.42%> (+4.02%) ⬆️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rht
Copy link
Contributor

rht commented Sep 16, 2023

RAM issue is a solvable problem. You can store just enough info to recreate the model state at any given time, and this includes capturing only 1 step of data collector output, not the whole timeseries, for any particular step.

See https://github.com/Logende/mesa-replay/blob/main/mesa_replay/cacheable_model.py.

@rht
Copy link
Contributor

rht commented Sep 16, 2023

Though the space is probably the most RAM consuming part, unless it can be stored in a memory efficient way. That said, for huge model space, the space for each step can always be serialized to a file.

@rht
Copy link
Contributor

rht commented Sep 16, 2023

This is a very cool economy simulator: https://news.ycombinator.com/item?id=37527773, that has buttons to skip step by 4 or 200. And it is a space-less model that still has visualization. This would be a fun student project to recreate in Mesa.

@tpike3
Copy link
Contributor

tpike3 commented Sep 16, 2023

This is super cool @Corvince! I think the RAM challenge is just something to continually improve and mitigate as @rht commented. But this feature and its value added, makes the RAM concerns worth the cost.

@Corvince
Copy link
Contributor Author

Thanks for the kind words @tpike3

And maybe my wording was a bit unclear, but I haven't even seen ram usage as an issue so far. So I totally agree this is something we can mitigate if it comes up as an issue. It is just something I was worried about while developing but so far it seems fine.

@rht
Copy link
Contributor

rht commented Sep 20, 2023

For this to be mergeable, it needs to be optional, as a separate component. @tpike3 commented that having a minimalist default UI helps a lot so as not to overwhelm students.

The current jupyter_viz.py is still small, but this seems to be a suitable time to split it into several files. I'd say the space drawers and plot drawers should be moved to a separate file instead of the UI controllers.

@Corvince
Copy link
Contributor Author

Corvince commented Sep 20, 2023

I agree with most of your comment, but lets dissect it a bit to see where we agree and possibly disagree. I think this PR actually can serve as a good base for the things you want.

For this to be mergeable, it needs to be optional, as a separate component. @tpike3 commented that having a minimalist default UI helps a lot so as not to overwhelm students.

This gave me some ideas how to further simplify the code and now these controls are available as TimelineControls and a simplified version with only reset, step and play as BaseControls. This is toggleable via a new JupyterViz parameter with a boolean timeline. The further simplification makes it theoretically possible to use these controls without any model - they are fully control-agnostic for maximal reusability.
/edit but I really would like the timeline controls to be the default. For now I "reverted" to the BaseControls to avoid overwhelming students - but I am afraid this feature will be less discoverable now.

The current jupyter_viz.py is still small, but this seems to be a suitable time to split it into several files. I'd say the space drawers and plot drawers should be moved to a separate file instead of the UI controllers.

I think the file is already quite large so I totally agree on splitting - but I wouldn't want to do it as part of this PR, this should be a separate PR. From React projects I am used to have one component per file.

Also I would really like to have this PR merged soonish, because it bakes in some nice code simplifications. Foremost model is now a state variable that is newly set for every step. This means I could remove reset_counter and current_step to trigger any state changes. This also means that means problems like in #1789 are removed - we don't need something else to trigger a rerender.

My vision for our frontend is to have each component act as independently of each other as possible. Thus JupyterViz would act as the default glue between them and most of the dependent logic should happen there. Especially something like a SpaceView should be callable on its own to display single components in a notebook environment.

@Corvince Corvince force-pushed the solara-rewind branch 3 times, most recently from 99084a8 to 0dd6fcc Compare September 20, 2023 20:23
max_step=max(model_cache.keys()),
) if timeline else BaseControls(
play_interval=play_interval,
on_step=handle_step,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle_step does caching regardless of whether the control is TimelineControls or BaseControls. It's not a problem for models that can be visualized, but it's still consuming more RAM than necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarification: *It's not a problem for the scale of models that can be visualized, which is usually small.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it's still consuming more RAM than necessary.

There is the use case when the viz server needs to be hosted in an AWS EC2 t2.nano (0.5 GB) or t2.micro (1 GB).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, handle_step only does caching if you specify a specific step. If it is called without a step (like the BaseControls do) the model is simply advanced. Added a docstring to handle_step to clarify this. There is still the model cache and to keep the code simple it always holds the initial step. But I think we can live with 1 additional copy. That might also be handy if someone wants to add the option to go to the first step later, to re-run the model. Currently it is a bit ambiguous if the reset button should just go back to the first step or regenerate a new model (It does the latter but there is no reason why it shouldn't just rewind).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the hypothetical situation when I am not familiar with the frontend code, and want to read a small subset of this component, I have to read the docstring about caching and so on. Whereas if you have a simpler handle_step for the simpler controller, I can grasp the structure better before moving on to a more complicated component.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in, have 2 handle_step-like functions for each controller classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the hypothetical situation when I am not familiar with the frontend code, and want to read a small subset of this component, I have to read the docstring about caching and so on.

That doesn't make sense. If you want to understand the code, why wouldn't you read the docstring?
And the whole function is only 10 lines of code, 5 of which would need to be duplicated. I think this would be more confusing and more difficult to maintain. And imagine someone writing a new Controls component. Now they have to decide which handle step method they want to use or add yet another one. If the function were more complex I would agree but I think it is not worth the complications.

Is this a blocker?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now they have to decide which handle step method they want to use or add yet another one.

This is far better than fattening up handle_step with more if-else's if the existing function does not cover the use case.

Why do I need to go through the caching code and have to consciously read and then to ignore it, just to read about stepping for basic controls? Such a waste of effort. I strongly disagree.

Copy link
Contributor Author

@Corvince Corvince Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a third opinion. My final statement is that my experience as a software developer tells me that in 95% of cases one should aim for maintainability. And one function is easier to maintain than two.

@Corvince
Copy link
Contributor Author

There is a discussion about whether to split handle_step. I revisited the code and needed to do a small adjustment for an upcoming feature. With that it is easy enough to split handle_step into two functions without code duplication. I still think it would not be necessary, but this way I am fine to follow @rht suggestion about splitting the function.

So I will do that, but there is a bug I need to fix first. So, dont merge yet, update coming

@Corvince
Copy link
Contributor Author

Got it. @rht let me know if there is anything else you spot with this PR. 👍

@Corvince
Copy link
Contributor Author

What I now changed: I now only and always save the model state before step is changed. Spoiler: This is needed for interactively changing the model state (soonish delivered by another PR)

@Corvince
Copy link
Contributor Author

Would be great if we could merge this soonish @tpike3 @rht . It blocks me a little bit right now

if not model.running:
return

updated_model = copy.deepcopy(model)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably going to say it's fine for small models. But this recreates the grid, agent vars, and model vars from scratch every time. It will hit the resource ceiling sooner than before. Python objects are not meant to be performant for immutable paradigm, are they?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare with e.g. Agents.jl, which is frugal with resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably going to say it's fine for small models. But this recreates the grid, agent vars, and model vars from scratch every time. It will hit the resource ceiling sooner than before. Python objects are not meant to be performant for immutable paradigm, are they?

Yes this is exactly what I am going to say 😅

Plus, also for large models that should be okay for a long time. For performance there is no right or wrong, but in terms of speed copying is much faster than actually recreating and for memory usage when we deactivate the timeline we only ever hold 2 models at the same time.

I would say we should try it this way, because it makes the code much simpler and easier to reason about. If we eventually do hit a performance ceiling we can change things. But right now we should optimize for the 80% use case.

And our solara frontend is still experimental. We should focus on moving forward instead of worrying too much about hypothetical scenarios.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe what features you want to implement with this immutable model convenience? IIRC there was a GH issue where the OP requested for a feature to allow them to change model params (only if it makes sense to do so) without triggering a whole system reset. Is this one of them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that some Solara servers are meant to be accessed by lots of users in parallel, and some are run on a small AWS instance. They could be considered to be within 80% of the use case. I have to weigh over this use case vs the features you want to implement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all right now instead of a single "model" variable, we have "model", "current_step" and "reset_counter".

They all can be generalized to at most 2 reactive variables: to inform a model reset (reset_counter), and to inform a model update.

model_update = solara.reactive(0)

This reactive variable is updated whenever you want to trigger components rerender (e.g. after a model.step()).

While reset_counter can be removed once ipywidgets.Play is replaced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I have to ask again for clarification. I am really not doing this annoy you.

What you write is absolutely correct. I just don't know its purpose. Is there a hidden question or some form of conclusion I am not getting or is it a simple statement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to quote the remaining paragraph

Both of the latter are used for triggering change detection, because changes to the model object will go unnoticed as far as solara is concerned. If we implement further model-changing functionalities (ex. interactivity) we always have to worry about keeping model up to date (by adding more and more dependency variables to the make_model function).

What I am saying is that the model_update variable can be used as a proxy for any interactive features to be based on, if they want to get a redraw signal from model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that, but I am asking what you want to say. Are you seriously opposed to merging this PR, because it copies the model for state updates? This approach incidentally makes the code more straightforward, comprehensible, maintainable and easier to reason about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current Solara code is already far simpler than mesa-viz-tornado, for what it can already do. Either:

  • use model_update as a compromise
  • we consider your current immutable model implementation as a training wheel so that it is easier for you to experiment with features, but eventually, this must be optimized in the future

Which one do you prefer? It's rather paradoxical that Agents.jl is more performant yet they are more resource conscious, while you are being more extravagant.

set_current_step(model.schedule.steps)
def on_value_play(_):
if playing.value:
on_step()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically, when the simulation runs until model.running is False, the buttons will look like they are still playing indefinitely, but that there are just no more viz state update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, hadn't thought about this case. But I would consider the current "play" button a temporary solution, because it also doesn't fit the style of the other buttons and it shows an unnecessary additional stop button that does the same as the pause button. Therefore I would resolve that issue once we implement a better "play" solution

@Corvince
Copy link
Contributor Author

That is it. I’m done. @rht you won. The continuous hoops that I feel I’m being asked to jump through are becoming exhaustive.

I opened this PR with the intention of introducing a beneficial feature. Instead of being happy for the submission and helping to improve the PR, @rht makes me jump through hoops by constantly imagining, in my view, unlikely scenarios like people being overwhelmed by three additional buttons, people who want to understand the code but can’t be bothered to read a docstring and ten lines of code and finally people who want to serve their mesa models to lots users, but can only afford the smallest AWS instances. I want to clarify that I understand the importance of considering various user needs and scenarios, but the focus on such extensive hypothetical situations has been draining, leading to prolonged discussions that feel unproductive and detract from the substantive aspects of the PR. The bulk of the 40 comments on this PR revolve around these speculative situations rather than the concrete code changes proposed.

But to be clear this is not about any code changes. "It’s primarily about the tone of the discussion. I’ve yet to see comments like “nice feature, here are some suggestions,” or “thank you for implementing my comments.” Instead, the interactions often seem marked by a lack of trust and feel provocative, and the recent offer of “training wheels” felt quite personal and dismissive. I’ve been contributing to Mesa for six years and have several years of experience as a professional web developer; I believe I’m past needing training wheels

And, of course, this doesn’t only affect this PR. My last (non-trivial) merged PRs also needed lengthy discussions about absurd details. I am creating PRs because I have fun coding for mesa and want to add some nice features that I truly believe are useful and for example in this case are somewhat unique in the ABM landscape. Receiving positive feedback from @tpike, @jackiekazil and others has been reassuring, but with @rht comments it feels like I am a supplicant dependent on the grace of the master and not like a contributor on the same level.

This has left me feeling undervalued and, frankly, exhausted. It seems my contributions are more subjected to scrutiny over seemingly inconsequential details than valued for their merit and potential impact. This atmosphere is becoming toxic for me, leading me to decide to contribute in other forms. I will pursue some of my ideas in another repo and we can see if at some point I will contribute back to main mesa or just showcase features in mesa-examples. I will keep watching issues and try to help users and review PRs, but I don’t think I will open any PRs soon.

@rht
Copy link
Contributor

rht commented Sep 28, 2023

That sounds like how you would describe yourself. In the past years, you had initiated dismissive remarks to me (e.g. most recent one, which was definitely intentional). But when I somehow unintentionally irked you, you ragequitted.

My PR's were being scrutinized and I have been fine with it. This is a larger PR with non-atomic changes, you should be aware that of course larger surface has more comments.

unlikely scenarios like people being overwhelmed by three additional buttons

My words were being twisted. This was from @tpike3's feedback when teaching a class.

@jackiekazil
Copy link
Member

jackiekazil commented Sep 28, 2023

From my brief passing this evening, I see how frustrating it is to be in either position in this discussion. There is an opportunity for us to consider possible procedural changes in governance to ensure everyone's expectations are met. I will spend a day or two thinking about this and then respond further. It is essential for us as a project. I personally value both of your contributions to Mesa.

I am going to temporarily lock the conversation. (I think you both might still have access -- so consider this a symbolic gesture for the time being.)

@projectmesa projectmesa locked as too heated and limited conversation to collaborators Sep 28, 2023
@jackiekazil
Copy link
Member

We value these contributions. @tpike3 and I wanted to help the review on these to work through any issues -- @Corvince, please let us know if you are open to it. I will leave PRs closed unless you open and move forward OR unless someone else picks up your contributions to push forward.

We are also proposing a Code of conduct discussion which is happening here -- #1822 And will continue during our dev meeting Oct 14 in homes to provide guardrails to avoid situations like what occurred.

@projectmesa projectmesa unlocked this conversation Sep 29, 2023
@rht
Copy link
Contributor

rht commented Sep 30, 2023

Note that I review this PR with the objective of getting this merged soon.

@Corvince if I understand your criticisms correctly:

Instead of being happy for the submission and helping to improve the PR, @rht makes me jump through hoops by constantly imagining, in my view, unlikely scenarios ...

You understand that this is core Mesa repo, and so being careful with what is merged is justified. I intentionally wrote the current JupyterViz to be very similar in functionality as mesa-viz-tornado. As much as you ridiculed my concerns, I didn't find the scenarios I raised unlikely, and so, I disagreed. Am I not supposed to express disagreement? I have already said the rationale behind the 3 additional buttons earlier. Regarding with code clarity, in addition to the early version of the components not being orthogonal, I was confused by your code before you added the docstring, and this was coming from me who wrote the initial code, not some random users. Regarding with the AWS comment, this is rather inconvenient for me to disclose, but I am one of those people who can only afford the smallest AWS instances, and so, here is a concrete data point.

But to be clear this is not about any code changes. "It’s primarily about the tone of the discussion. I’ve yet to see comments like “nice feature, here are some suggestions,” or “thank you for implementing my comments.” Instead, the interactions often seem marked by a lack of trust and feel provocative, and the recent offer of “training wheels” felt quite personal and dismissive. I’ve been contributing to Mesa for six years and have several years of experience as a professional web developer; I believe I’m past needing training wheels

I thought @tpike3's comment was already sufficient? And given that you have been here for a long time, I wouldn't have to treat you as a first timer, because you know the worth of this PR. If you need an explicit opinion for contextualization, then here you go: I find the rewind feature very useful for exploration that was harder to do previously, where you would have to restart the simulation over and over. There would be a system which would undergo a phase transition several times, and so, just like a detective who would need to pause and rewind to do their investigation on a video recording, and so does a complexity researcher.

You extracted the "training wheels" remark out of context. I also said:
"It's rather paradoxical that Agents.jl is more performant yet they are more resource conscious, while you are being more extravagant."

This is in the context of your position that Mesa doesn't need performance optimization that badly: #1610 (comment). And you are entitled to it.

And, of course, this doesn’t only affect this PR. My last (non-trivial) merged PRs also needed lengthy discussions about absurd details.

I couldn't find such evidence for "absurd details" for the last merged PRs: #1786 and #1788. On the other hand, my PR's have had lengthy discussions from you, and I was fine with it. Appreciate the eyeballs though.

Lastly, you should be aware that I have been upset with your dismissive remarks. If you, after an internal self-review, acknowledge that they were inappropriate, then we can move on.

^Edit was because of typo, I at-mentioned @tpike3 without 3.

@Corvince
Copy link
Contributor Author

@rht I value the effort you've made to address my criticisms. However, I believe that continuing to discuss specific points further may not be productive.

@Corvince
Copy link
Contributor Author

Regarding with the AWS comment, this is rather inconvenient for me to disclose, but I am one of those people who can only afford the smallest AWS instances, and so, here is a concrete data point.

Again, not wanting to discuss any details here, but this should be completely orthogonal and I find this quite interesting. Can you provide any details for how to run mesa on AWS? If your workflow is in any way reproducible it would be great if you could write up how you host your models on AWS, I think this would be of interest for some people. Also (for the future of mesa frontend), how does your user base look like, for example how many users to you typically serve/ max user count?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants