Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Select complete multi-line statement when running run-in-console action without selection #6515

Merged
merged 19 commits into from Jun 18, 2019

Conversation

@BoPeng
Copy link
Collaborator

@BoPeng BoPeng commented Jun 9, 2019

#4330 (comment)

Code changes

See updated algorithm at the end of this comment.

User-facing changes

Using run-in-console action to run through the cell

See updated screenshots here

Potential risk of executing unintended code

Jason pointed out two potential problems with this PR. Namely there is no indication of what would be executed before the action, and the possibility of executing large trunk of unintended code. The PR has since been updated to closely mimic RStudio's behavior in case of syntax errors, which will only include code before the current line if complete statements are identified from the beginning of the cell and the current line falls into a complete multi-line statement. This has worked pretty well for all cases I have tested.

Potential performance problem:

Each time the algorithm will start from the top of cell, send code to the kernel to find a complete statement that contains the current line. This can be a bit slow for large cells but Jupyter does not encourage such cells anyway.

@jasongrout please have a look.

@jupyterlab-dev-mode
Copy link

@jupyterlab-dev-mode jupyterlab-dev-mode bot commented Jun 9, 2019

Thanks for making a pull request to JupyterLab!

To try out this branch on binder, follow this link: Binder

@BoPeng BoPeng changed the title Select complete statement when running run-in-console action without selection Select complete multi-line statement when running run-in-console action without selection Jun 10, 2019
@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 10, 2019

This seems very convenient. My big concern in the idea here is that this might run out of control. For example, if I accidentally try to execute a non-code line (perhaps just a line of text in a text file that also happens to contain python code), it may suck up the whole file and try to execute it? Or if I have a missing closing paren, will it mistakenly try to execute the entire file from the cursor to the end?

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 10, 2019

It would be great if there were some ui indication of what code was going to be executed before execution. I'm not sure how to do that in an efficient way, though.

@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 10, 2019

I had the same worry. Basically, if the current line is invalid, it is unlikely that adding lines before or after it will make it run. If the current line is part of a huge multi-line function, then yes the entire function will be executed.

For example, if I accidentally try to execute a non-code line (perhaps just a line of text in a text file

At least now run-in-console only works in code cells.

Or if I have a missing closing paren, will it mistakenly try to execute the entire file from the cursor to the end?

No, because adding the previous lines will not make the statement valid, so only the current line will be executed (and an error will be returned).

It would be great if, for example, when I click a line, there is some indication of the boundary of a complete statement. This is however a nice feature by itself and can be added later.

@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 11, 2019

7be4a25 handles empty lines more cleverly. More specifically,

  1. If run-in-console starts from an empty line, it moves down to find the first non-empty line.
  2. If there are empty lines following end of the statement, it moves to the beginning of the non-empty line.

These are consistent with RStudio behaviors.

Copy link
Collaborator Author

@BoPeng BoPeng left a comment

image

It turns out that the problem was with my version of ipython. The newer version uses IPython.core.inputtransformer2.check_complete instead of IPython.core.inputsplitter.InputSplitter.check_complete, which handles the cited case correctly.

packages/notebook-extension/src/index.ts Outdated Show resolved Hide resolved
@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 12, 2019

@jasongrout It turns out that the problem I had with IPython kernel was due to a bug that has been fixed in the latest version of ipython (7.5.0). I have submitted a patch to remove my walk-around. This PR works pretty well for all the kernels I have tested (I have updated the ticket with a screenshot for R) and should make run-in-console much easier to use so I highly recommend its inclusion in 1.0.

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 12, 2019

An option is also to put this behind an option. In general, I tend to favor predictability and lower chance of failure over trying to be smart by default, especially when it comes to guessing user intentions with little chance to recover from small user errors. However, providing an option where the user can opt into something that tries to be smarter great.

@afshin
Copy link
Member

@afshin afshin commented Jun 12, 2019

We discussed this in the weekly call and proposed the following behavior:

  • There could be setting that indicates the number of lines you want to read ahead with this behavior.
  • The default number could be something small, like 10 lines.
  • If a user wants to disable this, it can be set to 0.
  • If a user has a higher risk threshold, the number could be very large.

What do you think?

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 12, 2019

I think that is a nice balance of intelligence vs complexity and the possibility of user error derailing things.

@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 12, 2019

Stepping through code naturally means "stepping by full statements" in any syntax-aware IDE. RStudio does this nicely, so do PyCharm and other Python IDEs. The run-in-console feature has been introduced to sos notebook for a while and the most complaint I have heard was that it did not work as well as RStudio when stepping through code, and I am glad that we are making JLab on par with RStudio in this aspect.

I think the biggest concern of this PR is that it could probe forward and backward too much and execute unintended codes. In an extreme case, one could step through the code and get some code included/executed again when executing a new line. To assess the risk of this I have played with Rstudio more and have some interesting findings.

When the current line is complete by itself even if it is part of a larger complete statement, e.g.

a = c(1,
 2   <- cursor here.
, 3)

Currently this PR will execute 2 by itself, while RStudio will execute the entire statement, regardless of number of lines for the statement.

Very interestingly, the behavior of RStudio will change with/without an invalid statement before the current statement. I I mean, Ctrl-Enter at the 6 line will send only 6 in the case of

a <- c(
 
c = c(
  6 
)

and the complete

c = c(
  6 
)

in the case of

# a <- c(
 
c = c(
  6 
)

Moreover, in the case of

a <- c(
  
c = c(  <- cursor here
  6  ) 

Rstudio will be able to identify c = c( 6), but fail to get the complete statement in the case of

a <- c(
  
c = c(  
  6  ) <- cursor here

while this PR can.

In a similar case, our PR could get

'''
wrong  = [ 
b = '''

when evaluating a cell with code

a = '''
my string
'''
wrong  = [     <- cursor here
b = '''
my another string 
'''

while RStudio would handle it correctly.

My conclusion is that the algorithm for this PR is not ideal and the cited problems could be resolved by a simpler yet safer algorithm that are more careful in search backward

  1. Starting from the first line and try to find complete statements, if a complete statement that contains the current line is found, submit the statement.
  2. Otherwise starting from the current line, search forward for complete statement, submit if found.
  3. Otherwise send the current line.

In this way the previous statements are only included if statements before the current line is valid.

@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 12, 2019

The new algorithm has been implemented, which I believe should be much safer, at least as safe as RStudio.

The last line failed to look backward because the error in the middle:

image

The error in the middle would not cause the inclusion of ''' lines before and after it:

image

Similar error handling in R

image

When there is no error, running at the last line or the second last line correctly get the complete statement.

image

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 18, 2019

A little simplified await construct:

diff --git a/packages/notebook-extension/src/index.ts b/packages/notebook-extension/src/index.ts
index ea848dc91..1cb7c39ad 100644
--- a/packages/notebook-extension/src/index.ts
+++ b/packages/notebook-extension/src/index.ts
@@ -994,14 +994,9 @@ function addCommands(
             code: code
           };
 
-          let completed = false;
-          await current.context.session.kernel
-            .requestIsComplete(msg)
-            .then(reply => {
-              completed = reply.content.status === 'complete';
-            });
+          let reply = await context.session.kernel.requestIsComplete(msg);
 
-          if (completed) {
+          if (reply.content.status === 'complete') {
             if (curLine < lastLine) {
               // we find a block of complete statement containing the current line, great!
               while (

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 18, 2019

I notice that you have the heuristic of stopping when you hit a whitespace line. Is that encoding too much of a syntax assumption about the language?

@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 18, 2019

The empty lines (empty or with only space and tabs) are handled as follows:

  1. If the starting line is empty, the algorithm moves to the first non-empty line.
  2. It includes empty lines when searching for complete statement.
  3. If a complete statement is found and executed, the algorithm moves the cursor to the non-empty line after the statement.

The behavior saves the trouble of sending empty lines to the console (which should be meaningless in any language) and makes it easier to step through code with empty lines. As for syntax assumption, 2 and 3 should be safe, and the 1 could be problematic with valid empty line such as

a = [1,
  <- cursor here.
2]

However, because "valid empty line" is unlikely to be the starting or ending line, JLab will correctly identify the statement and submit

a = [1

2]

after the cursor is moved to the last line. It would send

2]

when statements before the assignment is invalid, but it is still better than sending an empty line to the console.

So I think the behavior should be safe for any language.

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 18, 2019

Thanks. I think this is ready to go in now, and let's try it out. If it turns out to be too smart (i.e., if it is not giving people what is wanted), we can introduce an option to turn it off.

@jasongrout jasongrout merged commit 5f8ca1c into jupyterlab:master Jun 18, 2019
9 checks passed
@jasongrout jasongrout added this to the 1.0 milestone Jun 18, 2019
@BoPeng
Copy link
Collaborator Author

@BoPeng BoPeng commented Jun 18, 2019

Great. Thanks!

@BoPeng BoPeng deleted the run-in-console branch Jun 18, 2019
BoPeng pushed a commit to vatlab/jupyterlab that referenced this issue Jun 18, 2019
@BoPeng BoPeng mentioned this pull request Jun 18, 2019
@beaulucas
Copy link

@beaulucas beaulucas commented Aug 5, 2019

@BoPeng

Sorry if this is ignorant question, but is this update live? I am using run-in-console, and still can't run the multi-line statements.

IPython version: 7.7
jupyter version: 4.4
python env: 3.7.2
jupyterlab version: 1.0.4

Setup on jupyter lab:

{
  "notebook:run-selected-or-current-line-in-console": {
    "command": "notebook:run-in-console",
    "keys": ["Shift G"],
    "selector":  ".jp-Notebook.jp-mod-editMode"
  }
}
# this works
x = 1
# this doesnt work
a = [1,
2]

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Aug 5, 2019

Yes, this PR is in jlab 1.0.4. I just checked and it works for me: putting my cursor on the line a = [1, runs both lines: a = [1, 2].

As an aside, Shift-G for the shortcut? As in you can't type capital G in the code cell? Is this perhaps related? (I use Ctrl G so I can still type a capital G)

@beaulucas
Copy link

@beaulucas beaulucas commented Aug 5, 2019

@jasongrout

Thanks for fast response, I diagnosed the issue on my end. I was launching JupyterLab from a 2.7 environment, and had the impression that user settings were kernel-dependent (1.0.4 is on my 3.7.2 environment). Still getting a hang of the ropes on this. Coming from RStudio.

You make a great point on the Shift remark... Reverting back to CTRL.

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Aug 5, 2019

Great to hear it's working.

Still getting a hang of the ropes on this. Coming from RStudio.

Welcome!

@lock lock bot locked as resolved and limited conversation to collaborators Sep 4, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants