Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error method is noncompliant to Jupyter protocol #149

Open
JMurph2015 opened this issue Jul 11, 2017 · 39 comments
Open

Error method is noncompliant to Jupyter protocol #149

JMurph2015 opened this issue Jul 11, 2017 · 39 comments

Comments

@JMurph2015
Copy link

JMurph2015 commented Jul 11, 2017

I posted this issue on the octave_kernel repo, but after digging into the source, I discovered it may be generic to all kernels based on Metakernel.
So here's the issue description again:
So when a kernel errors out, you should be sending back an execute-reply like this:

{
    #...
    'content':{
        'status':'error',
        'ename':<your_error_name>,
        'evalue':<your_error_data>,
        'traceback':<your_traceback>
    }
    #...
}

That way Jupyter can process that as a proper error and not just assume it is any old STDOUT.
Right now it would be a the easiest (?) fix to just rename what you are already returning to 'traceback' and insert some dummy values for the fields that you can't get easily.

It would be really awesome if this were fixed because it's breaking my use case!

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

Thanks for the report! Can you give a bit more detail on your suggested fix? What do you mean by renaming "what you are already returning"? Any chance of actually making a PR?

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

See also fixes and discussion here: #147

@JMurph2015
Copy link
Author

Oops, I saw that #147 was actually closed, but please take a look at the comments there.

TL;DR Metakernel needs to either modify pexpect some more, or use the fd oriented class of pexpects. This is really really breaking to me, pls help.

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

Ok, let's try to nail down exactly what the issue is, and what should be done. Are you saying that if a kernel prints on stderr, should that be interpreted and returned as an error reply?

@blink1073
Copy link
Contributor

This is a fundamental limitation of using pexpect: pty does not distinguish between stderr and stdout. Using raw fds is a road fraught with peril due to locking. I don't see a way to support this use case.

@JMurph2015
Copy link
Author

That's the idea. At least add that option to Metakernel and flag octave_kernel to use that option. I can't exactly ensure that if any kernel prints to stderr, that it will always be a halting error, but octave at the very least will print r".+ error: .+" if there is a halting error. (sorry that regex may be a little wrong).

@JMurph2015
Copy link
Author

And it's not a use case. It's a fundamental feature of being a Jupyter kernel. (that also happens to be my use case.)

@blink1073
Copy link
Contributor

blink1073 commented Jul 12, 2017

I understand, but it is also a fundamental limitation of talking to a spawned process from python. The bash kernel has the same limitation.

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

It seems like we could at least make any pexpect kernel print out a special identifier that would signal that an error has occurred, right?

@JMurph2015
Copy link
Author

Can Pexpect be modified to distinguish between the two (at least optionally)?

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

I've thought about some kind of API that would allow pexpect send back image data direct through some kind of mark-up codes. Seems like the same could be done to signal an error.

@JMurph2015
Copy link
Author

And I guess the problem is that Metakernel at the moment can't even distinguish if there was an error because it's using Pexpect that presently doesn't give information about which stream some info came from. If it did that, I think the proper place for the rest of the necessary modifications would be in Metakernel.

@blink1073
Copy link
Contributor

@dsblank, it could be possible to thread something like that through. @JMurph2015, the pty module in core python is the one with the limitation.

@JMurph2015
Copy link
Author

hmmm, but Popen processes can have their streams distinguished iirc. Of course you might not be remotely based on it, but how do they do it?

@JMurph2015
Copy link
Author

@blink1073
Copy link
Contributor

Yes, it is easy to read it after the fact. Reading from both stderr and stdout incrementally is the problem.

@blink1073
Copy link
Contributor

Also, without a pty you can't interrupt the Octave process, so you'd be losing that capability of the kernel as well.

@JMurph2015
Copy link
Author

So this may be a python core issue (assuming pty is an absolute dependency) but there shouldn't be a super good reason something generically communicating with another process shouldn't expose all streams (since they are all just file descriptors anyway)

@JMurph2015
Copy link
Author

JMurph2015 commented Jul 12, 2017

/ the cpython pty module is like 170 lines long. It wouldn't be incredibly difficult for someone (that someone might be me today) modify it and make a new module, perhaps superpty (or maybe pity 😆 ) that supports segregating stdout and stderr.
Edit: typo
Update: Would anyone actually use said modifications? Because if Pexpect doesn't at least optionally enable it, then it's kinda pointless for me to make the module.

@JMurph2015
Copy link
Author

Hi!
Bump on that question of whether or not the stack could pull support down from a hypothetical segregated fd pty module.

@blink1073
Copy link
Contributor

That would mean that pexpect would have to depend on an extension module, or we'd also have to replicate much of what pexpect is doing and depend on an extension module. I am 👎 on the idea.

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

It might perhaps depend on how much code we'd have to rely on that was outside of the standard libraries. But like @blink1073 suggests, it would not be good to fork a big chunk of code. Can be done by subclassing? Is it smallish in size? Is it something that python standard library would want to include?

@JMurph2015
Copy link
Author

200 lines of code presently. It's unclear how much of Pexpect would need refactoring. The options I put in are entirely optional arguments, but to actually use segregated streams would definitely change at least a few things on Pexpect's side.

@JMurph2015
Copy link
Author

The pty core module itself is relatively tiny at 170 lines total because it is at could be described as some convenience functions that just garble the stdin, stdout, and stderr file descriptors together.

@dsblank
Copy link
Member

dsblank commented Jul 12, 2017

I'd say make a PR and let's check it out.

@JMurph2015
Copy link
Author

JMurph2015 commented Jul 12, 2017

@blink1073 what are the odds something like a ptyplus module could find a home in pexpect, which would do this segregated reading so that any code pexpect depends on is either in base or in pexpect?

@JMurph2015
Copy link
Author

/ sorry for the continued discussion here, but I'm still formulating who, what, and where these changes are going to be made (best to go in with a plan, right?). I think I can even enable pexpect's standard "read all" mode by using some trickery with piping things written onto a separate stderr to a "unified" fd that combines the three just like old times. The tricky part then is say you tell pexpect to read until EOF on the "unified" fd, then how does one know how much of that came from the stderr fd and how much of that came from the stdout fd. So I'm about halfway to a comprehensive solution. Also there are other ways the reading problem could be addressed, like reading each incrementally and waiting for both to hit EOF or some combination of conditions.

@JMurph2015
Copy link
Author

JMurph2015 commented Jul 12, 2017

Or you could have it read from the unified stream and check if each chunk it reads is the next thing on the top of a stream-specific fd.
edit: clarity

@blink1073
Copy link
Contributor

pexpect would have to be refactored to add an optional handling of stderr if that enhanced module were available. I'm not sure how it would interact with the .expect() method. We'd be expecting a prompt, but would need to handle an out-of-band stderr and potentially bail.

@dsblank
Copy link
Member

dsblank commented Jul 13, 2017

@JMurph2015 Are you sure that this is the right approach? What about a pexpect-based kernel that doesn't write to stderr? It still needs a method to signal an error. I'm still wondering about a special text signal that could signal an error, and even an API for routing text to stderr in a jupyter frontend, even if it didn't come from stderr in the external process.

@JMurph2015
Copy link
Author

@dsblank, I like the idea, but I just don't know how one would be able to get that inserted into a runtime's error processing loop. For example, how would you be able to direct Octave that it should print some special characters every time it hits an error so that Metakernel and/or Octave kernel can pick that up?

@JMurph2015
Copy link
Author

Or, what happens if the REPL segfaults and Octave stops altogether? ( this one may be easier because Pexpect should know if the process crashes altogether.)

@blink1073
Copy link
Contributor

Octave prints error: or syntax error: at the start of the line for errors. Right, we can catch a repl segfault.

@blink1073
Copy link
Contributor

@JMurph2015
Copy link
Author

JMurph2015 commented Jul 13, 2017

@blink1073 Not to get too idealist here, but you technically lose some functionality there because you would then prohibit the first thing a user prints from being something matching the regex
".* error: ($s).*".

That's been exactly my workaround for my project so far, but my whole team agrees that it isn't great to have to tell users "Specific kernels don't process errors out of band, so please refer to our compatibility table and follow those rules whenever using those kernels"

edit: readability

@JMurph2015
Copy link
Author

JMurph2015 commented Jul 14, 2017

I had a pretty decent idea just a few minutes ago that might help us all out. If we called the repl in question with a bash pipe that padded stderr with somewhat long sequences of unprintable characters (that we know), then we could just regex on those sequences later and if they were long, uncommon, and unprintable, then it's extremely unlikely to be something a user cared about seeing, and I can feel confident in calling it an actual error.

@blink1073
Copy link
Contributor

That sounds reasonable.

@dsblank
Copy link
Member

dsblank commented Jul 14, 2017

@JMurph2015 Yes, I was thinking along those lines. If we can make it an API (of sorts) that would be very useful in other situations.

@JMurph2015
Copy link
Author

So the train has kinda left the station for me to do this on my internship time, but I can probably hack around a little and talk about what this should look like on my free time. Maybe we should make an empty PR or something to talk out how this should look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants