
feat: Optimize bytewax pod resource with zero-copy #3826

Merged · 3 commits into feast-dev:master from feat/zero-copy-bytewax · Nov 14, 2023

Conversation

sudohainguyen (Collaborator):

What this PR does / why we need it: Optimizes bytewax pod resource usage; for the case in #3825, each pod now consumes only about 300MB of memory.

Which issue(s) this PR fixes:

Fixes #3825

Signed-off-by: Hai Nguyen <quanghai.ng1512@gmail.com>
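
For context, "zero-copy" here means keeping the offline data in Arrow buffers instead of copying every row into Python objects. A minimal sketch of the idea with pyarrow follows; the file path and the process_batch handler are placeholders for illustration, not the code in this PR:

import pyarrow as pa
import pyarrow.parquet as pq

def process_batch(batch: pa.RecordBatch) -> None:
    # Placeholder for the per-batch work a materialization dataflow would do
    # (e.g. writing rows to the online store). Operating on the RecordBatch
    # directly avoids materializing Python-level copies of every row.
    print(batch.num_rows)

# Memory-map the Parquet file so Arrow can read column buffers without an
# extra copy where possible, then stream record batches instead of converting
# rows to Python objects.
with pa.memory_map("offline_data.parquet", "r") as source:
    table = pq.read_table(source)
    for batch in table.to_batches():
        process_batch(batch)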
@sudohainguyen sudohainguyen marked this pull request as draft November 6, 2023 11:04
@sudohainguyen sudohainguyen marked this pull request as ready for review November 6, 2023 11:09
@sudohainguyen sudohainguyen force-pushed the feat/zero-copy-bytewax branch 4 times, most recently from c3fca82 to 4eb3209 on November 6, 2023 13:17
bytewax_replicas: int = 5
""" (optional) Number of process to spawn in each pods to handle a file in parallel"""

bytewax_worker_per_process: int = 1
Collaborator:

I don't believe that this setting, or the setting for the number of replicas, will currently be respected.

I'd be happy to help with getting this fixed, but perhaps that should be the subject of a separate PR.

sudohainguyen (Collaborator, Author):

Ah yeah, you're right. Looking forward to working on it in a separate PR.

sudohainguyen (Collaborator, Author), quoting the comment above:

I don't believe that this setting, or the setting for the number of replicas, will currently be respected.

I'd be happy to help with getting this fixed, but perhaps that should be the subject of a separate PR.

Could you point out exactly where the setting is not respected? It looks like the env of the bytewax pods consumes it correctly.

Collaborator:

This is where we call the bytewax entrypoint for the dataflow, which ends up calling:

https://github.com/bytewax/bytewax/blob/v0.15.1/src/execution/mod.rs#L709-L717

Since we aren't using any of the environment variables at this point, we should only ever see one Python process with one worker per pod.

Happy to chat further about enabling multiple workers. Feel free to reach out on Slack if you would prefer to do it there.
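
For illustration, one way the replica and worker settings could eventually be forwarded to the dataflow entrypoint, assuming bytewax 0.17's python -m bytewax.run CLI accepts -p (processes) and -w (workers per process); the environment variable names and the dataflow module path below are placeholders, not what the engine sets today:

import os
import subprocess

# Sketch only: read placeholder env vars set on the pod spec and pass them to
# the bytewax CLI so that more than one process/worker is actually started.
processes = os.environ.get("BYTEWAX_REPLICAS", "1")
workers_per_process = os.environ.get("BYTEWAX_WORKERS_PER_PROCESS", "1")

subprocess.run(
    [
        "python", "-m", "bytewax.run",
        "dataflow:flow",  # placeholder module:attribute for the dataflow
        "-p", processes,
        "-w", workers_per_process,
    ],
    check=True,
)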

sudohainguyen (Collaborator, Author):

Great, I'd love to chat more. By the way, we're using bytewax 0.17.1 as of now.

sudohainguyen (Collaborator, Author):

By the way, even if the REPLICAS setting doesn't work, we still have at least 1 process and 1 worker per pod.

And the implementation in this PR still reduces the memory footprint compared to master's. Should we proceed with this first? @achals

achals (Member):

@sudohainguyen I'd prefer to update the PR so that only the changes we expect to actually make a difference are merged (specifically the Arrow changes); we can bump the bytewax dependency and make any necessary changes for replicas in a separate PR.

sudohainguyen (Collaborator, Author):

Agreed. I'll narrow down the PR so that the bytewax settings are no longer exposed in the batch engine config.

sudohainguyen (Collaborator, Author):

Done, @achals.

achals (Member) commented Nov 8, 2023:

@sudohainguyen are the current tests sufficient to ensure this is working as expected? Can you test some of this manually?

sudohainguyen (Collaborator, Author):

Yep, I did, and I've already applied it in production; that's how I measured the reduced memory usage.

andrewshrout:

Bumping this out of interest. Hoping it's merged soon.

mjurkus commented Nov 13, 2023:

An amazing improvement! We used this in our workflows; peak usage was at 512GB before the update.


@achals achals merged commit ae1bb8b into feast-dev:master Nov 14, 2023
22 checks passed
@sudohainguyen sudohainguyen deleted the feat/zero-copy-bytewax branch November 15, 2023 02:57
Successfully merging this pull request may close these issues: Bytewax pods consume lots of resource (#3825).

5 participants