Calling .length() and .width() in a loop harms parallelism for Ray Backend #6841

Closed
arunjose696 opened this issue Dec 27, 2023 · 0 comments · Fixed by #6842
Labels
Ray ⚡ Issues related to the Ray engine

Comments

@arunjose696
Collaborator

Anti-pattern: Calling ray.get in a loop harms parallelism.

In the Dataframe class, `.length()` and `.width()` are called inside a for loop. For the Ray backend, this should be changed so that the loop calls these functions without materializing their results, and all results are materialized together once the loop has finished.
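To illustrate the difference, here is a minimal sketch that uses only the standard library's `concurrent.futures` as a stand-in for Ray, so it is an analogy and not Modin's actual code. `partition_length` and `partitions` are hypothetical names, and `Future.result()` plays the role that `ray.get` plays for Ray object references. With Ray itself, the deferred form would amount to collecting the object references in the loop and passing the whole list to a single `ray.get` call afterwards.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def partition_length(rows):
    """Hypothetical stand-in for a remote task that computes a partition's length."""
    time.sleep(0.05)  # simulate remote work
    return len(rows)


# Hypothetical partitions; in Modin these would be remote partition objects.
partitions = [list(range(n)) for n in (3, 5, 2, 4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Anti-pattern: block on each result inside the loop,
    # so each task finishes before the next one is submitted.
    start = time.perf_counter()
    serial_lengths = [pool.submit(partition_length, p).result() for p in partitions]
    serial_time = time.perf_counter() - start

    # Deferred form: submit everything first without materializing,
    # then materialize all results together after the loop.
    start = time.perf_counter()
    futures = [pool.submit(partition_length, p) for p in partitions]
    batched_lengths = [f.result() for f in futures]
    batched_time = time.perf_counter() - start

print(f"blocking in loop: {serial_time:.2f}s, deferred: {batched_time:.2f}s")
```

Both variants return the same lengths; only the point at which the code blocks changes, which is what lets the tasks overlap.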

Note: This issue is a subtask of #5524

arunjose696 added the Ray ⚡ Issues related to the Ray engine label Dec 27, 2023
anmyachev added a commit that referenced this issue Jan 8, 2024
… called in a loop (#6842)

Co-authored-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
Co-authored-by: Anatoly Myachev <anatoliimyachev@mail.com>