Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow saving bad DAGs #77

Closed
thieman opened this issue May 13, 2014 · 9 comments · Fixed by #89
Closed

Disallow saving bad DAGs #77

thieman opened this issue May 13, 2014 · 9 comments · Fixed by #89

Comments

@thieman
Copy link
Owner

thieman commented May 13, 2014

From a discussion on #60. Currently, it's possible to save a Job when its DAG has cycles. Let's disallow this and update the UI to not be as annoying about it when a cycle is created.

@thieman thieman assigned thieman and unassigned thieman May 25, 2014
@rclough
Copy link
Collaborator

rclough commented May 29, 2014

Do you think the check should happen at the add-dependency level (ie as soon as you create a cycle when adding a dependency, it fails to save it) or when saving the whole DAG? (or both)

@thieman
Copy link
Owner Author

thieman commented May 29, 2014

I did some thinking on this last week and realized that there are a few complications here.

tl;dr: We need to implement a better consistency model before we can disallow saving bad DAGs. I'll go create an issue for that.

First, some facts:

  1. The state of a particular DAG can be stored in three places at any one time: a user's browser, the running dagobahd, and the database. It should be our goal to keep these as consistent as possible.
  2. Currently, DAG updates are done through small atomic updates, e.g. "add a dependency between nodes 1 and 2." There is not currently a way to send a complete DAG through the API and overwrite whatever is in the database.

Atomic updates will not be sufficient if we are going to disallow saving bad DAGs. This is what the interaction will look like:

  1. User loads the DAG in their browser. All three states are consistent.
  2. User adds a dependency that does not invalidate the DAG. This can go through the atomic update path. All three states are consistent.
  3. User adds a dependency that invalidates the DAG. This tries to go through the atomic update path but is rejected by the server. The bad dependency stays in the browser DAG so the user can complete whatever restructuring they are trying to do. The browser now diverges from the server and database. The browser marks its DAG as dirty (divergent).
  4. Updates while the browser DAG is dirty cannot use the atomic updates (since state diverges) and instead must POST the entire browser DAG to the server. This is a very destructive operation. Any update any other user has made in this time will be blown away.

Dagobah's consistency model is already bad, but this was mitigated partially by the nature of atomic updating. I'm not comfortable with full-DAG POSTs unless we have a better consistency model. Once we have a consistency model, this method should be fine.

@rclough
Copy link
Collaborator

rclough commented May 29, 2014

In step 3, what is wrong with removing the dependency when the server rejects it? Or not updating the front end model until the server has confirmed it's validity?

@thieman
Copy link
Owner Author

thieman commented May 29, 2014

You don't want to remove the dependency from the front-end because the user purposefully put it there. She's probably in the process of making some more changes to make the DAG valid again. Refusing to save the incremental changes along the way could be very annoying.

@rclough
Copy link
Collaborator

rclough commented May 29, 2014

I mean, is it that much of a problem to ask them to fix the cycle before allowing them to add that connection? What about turning that arrow red (or something) in the front end?

@thieman
Copy link
Owner Author

thieman commented May 29, 2014

That would be possible, though we would then need to maintain some sort of sync queue in the frontend with changes that have been made in the browser but not successfully saved on the server. We would then need to try to sync everything in the queue whenever something else is changed. This could probably be made easier by having some DAG validation logic in the frontend as well.

This does seem like it might be simpler. I think that approach is safer and could be merged before the consistency model, though I still want to add that in at some point.

@rclough
Copy link
Collaborator

rclough commented May 29, 2014

I just don't see what's so bad about disallowing adding a dependency if there's a cycle, how often would this case happen? And wouldn't immediate feedback be useful?

@thieman
Copy link
Owner Author

thieman commented May 29, 2014

I suppose it won't be all that annoying, since the only action that could cause an inconsistency is adding an edge. It would be more annoying if a node addition/deletion got backed out, but that won't happen. Yeah, we can do it that way. So basically:

  1. User adds an edge in the browser, tries to tell the API to add it.
  2. API freaks out and returns a 400.
  3. Browser tells the user it would invalidate the DAG and deletes the edge.

@rclough
Copy link
Collaborator

rclough commented May 29, 2014

Yeah, that was my thought process. Although #88 is a nice-to-have for potential simultaneous editing.

The other issue is if we do put in a PR for this, if users upgrade, it wont detect any existing cyclic DAGs, but I guess that would mean they're not in use anyway because they won't run with cycles.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants