
Cycles are possible in the dependency graph #83

Open
arthurp opened this issue May 24, 2021 · 5 comments

@arthurp
Member

arthurp commented May 24, 2021

Cycles are possible because of the trick of allowing a task to be split: any task that depends on the original task then depends on the final segment split away from it. The issue is that since we are "pausing" (really splitting) a task after it creates new tasks and blocks on tasks (possibly including the new ones), the new tasks can potentially have their parent as a dependency. A new task depending on its creator is reasonable, and the creator depending on the new task is reasonable. However, if we do both, we have a cycle.

This breaks some of our guarantees and it's not at all clear to me how to fix it.
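To make the failure concrete, here is a minimal sketch (the task names and graph layout are illustrative, not the real scheduler's data structures): a task T is split into segments T1/T2 when it blocks, a child C declares a dependency on its parent (which resolves to the final segment T2), and T2 depends on C because the parent awaited it, so T2 and C wait on each other forever.

```python
# Toy dependency graph for the split-task scenario described above.
# (Hypothetical task names; not the real scheduler's representation.)
deps = {
    "T1": [],          # first segment of the split task, already run
    "C":  ["T2"],      # child task depends on its parent, i.e. on T2
    "T2": ["C", "T1"], # final segment of the parent blocks on C
}

def has_cycle(graph):
    """Detect a cycle with an iterative three-color DFS."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}
    for start in graph:
        if color[start] != WHITE:
            continue
        stack = [(start, iter(graph[start]))]
        color[start] = GRAY
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if color[nxt] == GRAY:
                    return True        # back edge: cycle found
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                color[node] = BLACK    # all successors explored
                stack.pop()
    return False

print(has_cycle(deps))  # the C <-> T2 edges form a cycle
```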

@arthurp
Member Author

arthurp commented May 24, 2021

@insertinterestingnamehere This is something we may need to talk about, returning to the system semantics. (I'm a bit embarrassed I didn't see this issue. I had a weird bad feeling about splitting tasks last year, but it was a useful feature and I didn't follow through on the feeling.)

@insertinterestingnamehere
Member

Honestly, I'm not 100% sure what the right fix for this is. Part of me is saying that it's okay to let people write a deadlock as long as we provide some way to detect that failure case easily, probably just by finding a way to raise a meaningful error if there are tasks queued with none of them ready. Being able to await tasks you just spawned before doing other stuff is a win for usability even if it complicates the tasking model a bit. Our async/await stuff is a very reasonable answer to some "how do I write this easily" kinds of questions.

If I had to guess at a solution, I'd say we should just disallow child tasks depending on parent tasks. I can't think of any use-cases for it that aren't more naturally covered by other idioms that we do support. For example, it might be useful if you want a bunch of tasks to run after some setup code, but the way we designed that to be handled was to literally just launch the tasks after the setup was run. I don't think any of our current benchmarks/examples have a launched task depend on a parent. OTOH, detecting this might be nontrivial. We'd have to have each task track which task was running when it was originally launched and then crawl the implied stack for each task at launch time to raise an appropriate error. Tracking info about which task launched what seems appropriate for debugging. Crawling that info at launch time sounds expensive.
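A sketch of that ancestor check (the `Task` class and `check_deps` function are hypothetical names for illustration, not the real API): each task records which task was running when it was spawned, and at launch time we crawl that implied spawn stack to reject any dependency on an ancestor.

```python
# Hypothetical sketch of "disallow child tasks depending on parent tasks".
class Task:
    def __init__(self, name, spawned_by=None):
        self.name = name
        self.spawned_by = spawned_by  # task running at spawn time, if any

def check_deps(new_task, deps):
    """Raise if any declared dependency is an ancestor of new_task."""
    ancestors = set()
    t = new_task.spawned_by
    while t is not None:             # crawl the implied spawn stack
        ancestors.add(t)
        t = t.spawned_by
    for d in deps:
        if d in ancestors:
            raise RuntimeError(
                f"task {new_task.name!r} depends on its ancestor {d.name!r}"
            )

root = Task("root")
child = Task("child", spawned_by=root)
check_deps(child, [])                # fine: no dependencies
try:
    check_deps(child, [root])        # child depends on its parent
except RuntimeError as e:
    print(e)
```

As noted above, the crawl makes every launch O(spawn-depth), which is the cost that makes this check feel expensive as an always-on feature.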

@arthurp
Member Author

arthurp commented Jun 4, 2021

I think we might actually already track that, or at least I thought about doing it. The reason was that I wanted to provide fake stacks for exceptions so that you could see the "spawn" stack instead of the real stack used in the worker thread. This spawn stack is important for debugging, and if it's available we could use it for the checking you want.

We could also have a fallback cycle checker that runs if the scheduler runs out of work while tasks still exist, triggering a warning that advises turning on the stack-tracking debugging feature.
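A sketch of that fallback check (the data layout, a mapping from task name to the names it depends on, is an assumption for illustration): when the ready queue is empty but unfinished tasks remain, run Kahn's algorithm; any tasks it cannot drain sit on (or downstream of) a cycle and can be named in the warning.

```python
from collections import deque

def tasks_on_cycles(deps):
    """Return tasks that can never run because of a dependency cycle."""
    indegree = {t: 0 for t in deps}
    dependents = {t: [] for t in deps}
    for t, ds in deps.items():
        for d in ds:
            indegree[t] += 1
            dependents[d].append(t)
    ready = deque(t for t, n in indegree.items() if n == 0)
    while ready:                      # standard Kahn's algorithm
        t = ready.popleft()
        for u in dependents[t]:
            indegree[u] -= 1
            if indegree[u] == 0:
                ready.append(u)
    # anything not drained is part of, or blocked behind, a cycle
    return sorted(t for t, n in indegree.items() if n > 0)

print(tasks_on_cycles({"a": ["b"], "b": ["a"], "c": []}))
```

Since this only runs when the scheduler is already stuck, it adds no cost to the normal execution path.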

@insertinterestingnamehere
Member

> I think we might actually already track that, or at least I thought about doing it. The reason was that I wanted to provide fake stacks for exceptions so that you could see the "spawn" stack instead of the real stack used in the worker thread. This spawn stack is important for debugging, and if it's available we could use it for the checking you want.
>
> We could also have a fallback cycle checker that runs if the scheduler runs out of work while tasks still exist, triggering a warning that advises turning on the stack-tracking debugging feature.

Yah, +1 to both these ideas. More readable stack traces would be a big win. Falling back to cycle detection to raise a meaningful error when deadlock occurs makes sense since we won't pay the cost at runtime.

@insertinterestingnamehere
Member

Another thought on this: this would be very easy to detect in the sequential execution case. We need to add that execution mode for better debugging anyway. During sequential execution we could just raise an error on task creation if any of the specified dependencies haven't already completed.

Given that this kind of deadlock is the result of a user doing something weird, handling it that way should be good enough for this specific issue (though I'm still in favor of better error stacks and better error messages from the scheduler).
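The sequential-mode check could look like this (the `spawn_sequential` function is a hypothetical name for illustration): in sequential execution each task runs to completion at spawn, so every declared dependency must already be finished; anything else is exactly the parent/child cycle this issue describes.

```python
# Sketch of the sequential-execution check described above.
completed = set()

def spawn_sequential(name, deps, body):
    """Run a task inline; reject dependencies that haven't completed."""
    missing = [d for d in deps if d not in completed]
    if missing:
        raise RuntimeError(
            f"task {name!r} depends on unfinished task(s) {missing}: "
            "this would deadlock under parallel execution"
        )
    body()                 # sequential mode: run immediately, inline
    completed.add(name)

spawn_sequential("setup", [], lambda: None)
spawn_sequential("work", ["setup"], lambda: None)  # ok: setup finished
```

This catches the cycle at the exact spawn that creates it, which also gives the user a precise error location.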
