Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REF-1885] Shard Substates when serializing to Redis #2574

Merged
merged 14 commits into from
Feb 21, 2024

Conversation

masenf
Copy link
Collaborator

@masenf masenf commented Feb 10, 2024

This is a performance optimization for apps that have many smaller states that end up taking excessive time to serialize/deserialize for every event, like reflex-web.

Instead of only serializing the top-level rx.State and implicitly all substates by virtue of their presence in the self.substates dictionary, the StateManagerRedis applies special logic to persist substates into their own keys. When retrieving a particular state, it's parent and all of it's substates (recursively) are fetched from redis and assembled into a structure resembling the previous implementation. The main difference is that when using redis, sibling states are not directly accessible. This will be addressed in a followup patch, that allows arbitrary access to any state defined in the app.

Fixes #2554

reflex/state.py Outdated
@@ -1405,6 +1419,24 @@ def on_load_internal(self) -> list[Event | EventSpec] | None:
type(self).set_is_hydrated(True), # type: ignore
]

def __getstate__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaves router_data on each substate. Is that the intended behavior?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, yes. But I might be able to get away with stripping it out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe router_data turns out to be a substate on its own in the end 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abulvenz not a bad idea actually.. it may involve having to save two states each time as a downside, but it might be a cleaner separation in the future.

Copy link
Contributor

@picklelo picklelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments

@@ -1002,7 +1002,8 @@ async def upload_file(request: Request, files: List[UploadFile]):
)

# Get the state for the session.
state = await app.state_manager.get_state(token)
substate_token = token + "_" + handler.rpartition(".")[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we can pull this out into a function get_substater_token(token, handler)

Returns:
The substate token.
"""
substate = self.name.rpartition(".")[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we pull it out, then we can reuse the function here

@@ -1479,6 +1511,8 @@ async def __aenter__(self) -> StateProxy:
"""
self._self_actx = self._self_app.modify_state(
self.__wrapped__.router.session.client_token
+ "_"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment about pulling this out

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a utility for this in the followup PR

parent_state_key, top_level=False, get_substates=False
)
# Set up Bidirectional linkage between this state and its parent.
if parent_state is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can just be an else:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think so, because we may or may not set parent_state in the previous condition (for example, when retrieving the base state of the app, parent_state will still be None as this point, even though we entered the previous condition)

@masenf masenf merged commit 756bf9b into main Feb 21, 2024
45 checks passed
@masenf masenf deleted the masenf/state-shard-redis2 branch February 21, 2024 09:50
@picklelo picklelo mentioned this pull request Feb 27, 2024
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[REF-1885] Explore: substate sharding
3 participants