I'm not sure if the following behavior is caused by Dagger.jl or not, but I'm wondering if it's related to the issues I've been seeing when using Dagger.jl with Distributed.jl in my code (see, e.g., #438).
I have code that I run that uses DTables.jl, but it frequently (but not always) does not run to completion. I tried to distill the essence of the code to create the MWE in #438, but that MWE doesn't capture everything I'm seeing with my actual code. (In fact, that MWE might be working perfectly now.) Currently, my code hangs frequently, and I'm not sure why. The CPU utilization drops to 0 when it hangs, so it's almost as if tasks aren't getting scheduled to run (but I really have no idea if that's the case).
Anyway, I also noticed that every time I exit Julia after running my code (as long as I don't have to kill the Julia processes from the terminal, which I sometimes have to do when it hangs because Ctrl-C won't work), I get quite a large dump of debugging messages about evicting datastore entries. See an example after the questions below. (Also note that Dagger.jl is my only dependency that uses MemPool.jl.)
Questions:
@jpsamaroo do you have any insight as to why the datastore fails to clean? Does it mean my code is doing something wrong, like somehow holding onto objects longer than it should? Or could it be another Dagger.jl issue?
Could it be that, when running a task, if the task gets interrupted (e.g., to do garbage collection) there is an issue with rescheduling it? (Do tasks get interrupted for garbage collection or other reasons when in a single-threaded, multi-process environment?)
(Unrelated) There are a couple of places in my code with @sync and @async. Would these cause any issues with Dagger.jl? I've tried removing them, but I haven't noticed any difference in behavior.
@jpsamaroo do you have any insight as to why the datastore fails to clean? Does it mean my code is doing something wrong, like somehow holding onto objects longer than it should? Or could it be another Dagger.jl issue?
It could be your code, DTables, or Dagger causing this, but it may be that it's not actually a problem and just looks like one because of the intimidating warnings. Since these are all pure in-memory objects (that's what MemPool.CPURAMDevice represents), I can work around this in MemPool by just ignoring such objects in this eviction code, because they'll be destroyed once Julia exits anyway. I'll put this together momentarily.
Could it be that, when running a task, if the task gets interrupted (e.g., to do garbage collection) there is an issue with rescheduling it? (Do tasks get interrupted for garbage collection or other reasons when in a single-threaded, multi-process environment?)
The only time I'd expect there to be an issue with handling an interruption is if you send a Ctrl-C/SIGINT to a worker. Julia doesn't handle this well at all right now, and it will cause random things to break or hang. A solution is in the works, but it will take some time (and will only show up in a future Julia version).
(Unrelated) There are a couple of places in my code with @sync and @async. Would these cause any issues with Dagger.jl? I've tried removing them, but I haven't noticed any difference in behavior.
They shouldn't cause any problems; I use @sync and Threads.@spawn together frequently when working with Dagger.
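For illustration, here is a minimal sketch of that @sync plus Threads.@spawn pattern. It uses only Base Julia and makes no Dagger calls; the function name `parallel_sums` and the sample data are hypothetical, and the comment marks where Dagger work could go in real code.

```julia
# Sketch only: plain Base Julia, no Dagger calls are made here.
# `parallel_sums` is a hypothetical helper name used for illustration.
function parallel_sums(chunks)
    results = Vector{Float64}(undef, length(chunks))
    # @sync blocks until every task spawned lexically inside it has finished.
    @sync for (i, chunk) in enumerate(chunks)
        Threads.@spawn begin
            # In real code, this body could launch and fetch Dagger work instead.
            # Each task writes to its own slot, so there is no data race.
            results[i] = sum(chunk)
        end
    end
    return results
end

chunks = [collect(1.0:10.0), collect(11.0:20.0), collect(21.0:30.0)]
totals = parallel_sums(chunks)
```

Because @sync only waits for the tasks it encloses, the pattern by itself should not change how work gets scheduled elsewhere.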