how to take down ray and put up again in local mode #7249
Comments
Hi @SiRumCz, thanks for posting this issue. I suspect there is a problem with initializing Ray multiple times in the Modin codebase. We will have to look into this more deeply. Meanwhile, can you explicitly put |
@YarShev Thanks for your response. Yes, I have tried that method, and unfortunately I got: |
@SiRumCz, could you try to execute |
Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>
@SiRumCz, I opened #7280, which adds reload_modin to modin.utils. Example usage:
import modin.pandas as pd
from modin.utils import reload_modin
import ray
ray.init(num_cpus=16) # can be commented out, works
df = pd.read_csv("example.csv")
df = df.abs()
print(df)
ray.shutdown()
reload_modin()
ray.init(num_cpus=16) # can be commented out, works
df = pd.read_csv("example.csv")
df = df.abs()
print(df) |
Thanks, I ended up wrapping my task in a Process so that it runs in a new process; Ray is taken down when the process ends. But I am happy that there will be a feature for this, cheers :-) |
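For readers who want to try the same workaround, here is a minimal sketch of running each chunk in its own child process. The names `process_chunk` and `run_isolated` are hypothetical, and the body of `process_chunk` is a stand-in for the real Modin work, which is not shown in this thread. The sketch assumes a POSIX system (it uses the `fork` start method) and that the parent process never starts Ray itself.

```python
import multiprocessing as mp


def process_chunk(chunk_id, results):
    # Hypothetical placeholder for the real per-chunk task. In the setup
    # described above, Modin would be imported and used here, so that Ray
    # is started inside the child process rather than in the parent.
    results.put((chunk_id, chunk_id * 2))


def run_isolated(chunk_id, timeout=60):
    # Assumption: POSIX system, and the parent has not initialized Ray.
    ctx = mp.get_context("fork")
    results = ctx.Queue()
    proc = ctx.Process(target=process_chunk, args=(chunk_id, results))
    proc.start()
    # Read the result before joining so the child is never blocked on a
    # full pipe; raises queue.Empty if nothing arrives within `timeout`.
    result = results.get(timeout=timeout)
    proc.join()
    return result
```

Calling `run_isolated(i)` once per loop iteration gives every chunk a fresh process, which matches the behaviour described in the comment above.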
@YarShev, I have another question regarding the reload function: can I shut down only the Ray instance I initialized from the process? My understanding is that |
Your understanding is correct: ray.shutdown() kills all Ray processes. As for your workaround, I think you can avoid the calls to ray.init() and ray.shutdown() in the process wrapping your task. You could set up a Ray cluster manually on your machine (with this instruction, for instance), and Modin will then be able to connect to the existing Ray cluster in your process. |
Wrapping my task in a Process only partially addressed my problem. I am also encountering another problem where the majority of the memory goes into Buff/Cache, leaving only a tiny amount of free memory. Have you encountered a similar situation? |
How much memory do you have on the system? What data sizes do you want to process? |
My system has 32GB of memory, and the data is around 5 million lines of log data (a ~1.5GB CSV file). But my project involves quite complicated work, and because it uses nested dataframes and nested Modin functions such as |
32GB might be insufficient, but Ray should start spilling objects to disk once available memory is depleted, and the flow should still finish. Do you encounter an OOM error? |
I tried to optimize my project to fit into 32GB, and yes, Ray object spilling helped a lot. But one of my real challenges is that after it finishes, not all of the memory is released; the majority goes into Buffer/Cache if I look at |
What I am seeing is very similar to this post: ray-project/ray#7053 (comment) |
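To see where the memory reported as Buff/Cache actually sits, it can help to read `/proc/meminfo` directly. The sketch below is Linux-only, and the helper names (`parse_meminfo`, `summarize`, `read_meminfo`) are hypothetical. On Linux, `Cached` includes shared memory (`Shmem`), and `MemAvailable` is the kernel's estimate of how much memory can be used without swapping, so comparing these fields with `MemFree` shows how much of the cache is likely reclaimable.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def summarize(info):
    """Return the fields relevant to the Buff/Cache question, in MiB."""
    keys = ("MemTotal", "MemFree", "MemAvailable", "Buffers", "Cached", "Shmem")
    return {key: info[key] // 1024 for key in keys if key in info}


def read_meminfo(path="/proc/meminfo"):
    """Read and summarize the live values (Linux only)."""
    with open(path) as f:
        return summarize(parse_meminfo(f.read()))
```

Calling `read_meminfo()` before and after a run gives a rough comparison. A large `Shmem` value after the run would point to shared memory that is still held, whereas a large `Cached` value with a small `Shmem` is usually ordinary page cache that the kernel can reclaim on demand.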
@SiRumCz, let's keep track of the issue in Ray. Also, we merged |
My program has a memory risk, part of which seems to come from a memory leak (idle Ray workers holding a big chunk of memory). I have a for loop that independently runs a series of tasks on chunks of a CSV file. I wish to kill Ray after each iteration to release memory and let Modin bring it up again with fresh Ray workers. My code is the following:
However, I got the error below: