Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run all paragraphs sequentially #2619

Closed
wants to merge 6 commits into from

Conversation

namanmishra91
Copy link
Contributor

What is this PR for?

This PR introduces "run all" behaviour for a note in a sequential manner. The "Run All paragraphs" button will now run the notebook sequentially. There is a UX change as part of this PR, which is discussed below. Any suggestions to make it better are welcome.
Following is the approach taken in brief:

  • Replace the behaviour of "Run All paragraphs" to "Run All sequentially"
  • Submit paragraphs to the respective scheduler only after the previous one completes execution (via wait-notify mechanism)
  • Communicate the status of the sequential run to client over websocket
  • Since the paragraphs won't be submitted immediately, changing their status is a challenge while keeping the code maintaining it at one place. Therefore, the approach I have taken is to keep the status of the paragrahs as-is and disallow explicit run of individual paragraphs when sequential run is in progress (see GIF for details).

What type of PR is it?

Improvement

Todos

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-2368

How should this be tested?

  • Have a notebook with different interpreter types (e.g. sh, md, spark etc.)
  • Click "Run All paragraphs"
  • The paragraphs should run sequentially
  • While sequential run is in progress, explicit run of paragraphs should not be allowed
  • Correct sequential run state should be communicated to the browser when note is loaded

Screenshots (if appropriate)

sequential_run_all_1

sequential_run_all_2

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? Probably not, but the UX would be different
  • Does this needs documentation? Yes

@zjffdu
Copy link
Contributor

zjffdu commented Oct 11, 2017

@namanmishra91 Thanks for the contribution. But the implementation is a little complicated to me. I think the easiest implementation is just run paragraphs sequentially in backend. Look at the following code.
https://github.com/apache/zeppelin/blob/master/zeppelin-server/src/main/java/org/apache/zeppelin/socket/NotebookServer.java#L1681
persistAndExecuteSingleParagraph run paragraph in non-blocking way. What we need to do it to run it blocking way.

@namanmishra91
Copy link
Contributor Author

@zjffdu

What we need to do it to run it blocking way.

The code does exactly that. It might look complicated overall as a lot of code is related to handling and passing of sequential run state to UI. For backend changes, please check runAllParagraphsSequentially() in NotebookServer.java.

In a gist, sequential run is ultimately performed by SequentialNoteRunner.java which basically runs a paragraph, blocks until it is complete and then submits the next one. If it helps, I can divide this into 2 PRs : one for backend only changes and another for integration with frontend.

@zjffdu
Copy link
Contributor

zjffdu commented Oct 11, 2017

@namanmishra91 I don't mean your PR doesn't do it correctly. I mean you could do it in an easier approach with less code changes. Just modifying persistAndExecuteSingleParagraph to make it run paragraph in a blocking way should be sufficient.

@namanmishra91
Copy link
Contributor Author

namanmishra91 commented Oct 11, 2017

@zjffdu Thanks for the feedback. Directly modifying the persistAndExecuteSingleParagraph method will lead to the current behaviour getting lost completely. The reason I chose to not overwrite the existing code flow was that there were suggestions on the JIRA that we should retain the current (concurrent) behaviour as well. Hence, added a new code path in backend as well as UI.

If we don't need the concurrent functionality anymore, I can change the existing code flow to make this work.

@zjffdu
Copy link
Contributor

zjffdu commented Oct 11, 2017

@namanmishra91 I don't see the current behavior is retained in this PR. Personally I don't think the current behavior needs to be retained. Even we want to keep the current behavior, it should be easy to make persistAndExecuteSingleParagraph to be both blocking and non-blocking (Just add one flag). I don't think it is necessary to invoke any frontend change, backend change should be sufficient.

@zjffdu
Copy link
Contributor

zjffdu commented Oct 13, 2017

ping @namanmishra91

@namanmishra91
Copy link
Contributor Author

@zjffdu Yeah, I can change the implementation to make persistAndExecuteSingleParagraph paragraph blocking. Then we won't need UI side changes for separate event handling etc. However, we will still need other UI changes for paragraph state management because the execution workflow is changing. Just to make sure that we are on the same page, let me take a moment to explain why those will be needed:

Unless we want to change the zeppelin architecture of separate interpreter queues, we need to hold paragraphs to prevent them from being executed. Currently all paragraphs get submitted immediately and their status changes to PENDING. With the above approach, only one paragraph will be in any interpreter's queue at any given point of time; hence the status of yet-to-be executed paragraphs will not get updated and the communication to the user that those paragraphs will run eventually will be lost. This will be a major change in the UX. We need to discuss and think about what should be the best way to handle this but as a preliminary implementation, we need to prevent explicit runs of the paragraphs that are not running yet. This will require maintaining state about whether a run all is in progress and passing this to front-end.

Let me know your thoughts on the above.

@zjffdu
Copy link
Contributor

zjffdu commented Oct 13, 2017

@namanmishra91 I don't understand why it would invoke frontend. Backend know more context about paragraph status and backend could control the workflow of paragraph running easily.

@zjffdu
Copy link
Contributor

zjffdu commented Oct 17, 2017

@namanmishra91 I made another PR #2627 which only change the backend code. Let me know whether this meet your requirement.

@tinkoff-dwh
Copy link
Contributor

#2627 merged

maybe close this PR?

@namanmishra91
Copy link
Contributor Author

Closed as PR 2627 addresses the same thing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants