Expose a _simpler_ way to get raw object reprs #10647

Open
Carreau opened this Issue Jun 9, 2017 · 27 comments

Owner

Carreau commented Jun 9, 2017

This came up in several contexts: @mpacer for subfigures, and @rgbkrk, who wants list(objects) to have rich reprs.

We can't call _repr_*_ recursively, as it may not be defined (user custom reprs, or repr_mimebundle), and get_ipython().display_formatter.format is a bit verbose/hidden.

I propose a small, exposed wrapper around get_ipython().display_formatter.format that returns the repr of an object. It will likely need some logic to check for re-entrancy (like repr pretty) and avoid recursion.
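A minimal sketch of what such a wrapper could look like, not the prototype itself. The name get_repr, the format_func hook (added only so the guard can be exercised outside a live shell) and the "..." placeholder are assumptions; display_formatter.format is the existing call and returns a (data, metadata) pair.

```python
import threading

_local = threading.local()  # per-thread set of ids currently being formatted


def get_repr(obj, format_func=None):
    """Return (data, metadata) for obj without displaying it.

    format_func is a hypothetical hook for testing; by default the
    running shell's display formatter is used.
    """
    if format_func is None:
        # Assumes an active IPython session; get_ipython() is None otherwise.
        from IPython import get_ipython
        format_func = get_ipython().display_formatter.format

    active = getattr(_local, "active", None)
    if active is None:
        active = _local.active = set()

    if id(obj) in active:
        # Re-entrant call for the same object: stop instead of recursing.
        return {"text/plain": "..."}, {}

    active.add(id(obj))
    try:
        return format_func(obj)
    finally:
        active.discard(id(obj))
```

In a session, `data, metadata = get_repr(obj)` would then give the mimetype-keyed dict; which keys are present depends on which formatters are enabled in that frontend.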

Thoughts ?

Owner

ellisonbg commented Jun 9, 2017

Hmm, fun idea! Will think about it a bit.

Owner

Carreau commented Jun 9, 2017

Ok, so playing with that, we should be careful in the IPython terminal to compute all mimetypes. I have a prototype but so far it does not compute html when in CLI.

Owner

rgbkrk commented Jun 9, 2017

xref: ipython/disp#3

What I was curious about was a top level Container type (mimetype application/jupyter-container+json) which would provide collections of mimebundles.

In[1]: [Image(), Image(), Image()]

Out[1]: [
  IMAGE,
  IMAGE,
  IMAGE
]
Owner

ellisonbg commented Jun 9, 2017

Owner

rgbkrk commented Jun 9, 2017

I'm just throwing this out here:

application/jupyter-container+json

{
  'key': MIMEBUNDLE,
  'key2': [MIMEBUNDLE, MIMEBUNDLE, MIMEBUNDLE]
}

Top level has to be a list or a dict, values can be list, object or mimebundle. As for determining what's an object and what's a mimebundle, we could set a key to true to signify if something is a mimebundle or a pure object. Any ideas here?

Owner

ellisonbg commented Jun 9, 2017

Owner

rgbkrk commented Jun 9, 2017

Admittedly, I typically only need this at the top level of an object, not super deep.

Owner

Carreau commented Jun 9, 2017

I think we should stay away from arbitrary containers and from looking into nested stuff. In the above example, if key2 is a list you can't guess that the values are mimebundles; it could just as easily be a list of lists of mimebundles.

application/jupyter-container+json

{
  'key': MIMEBUNDLE,
  'key2': [MIMEBUNDLE, MIMEBUNDLE, MIMEBUNDLE],
  'key3': {
     'sub/key': MIMEBUNDLE
  }
}

Here, key3 is ambiguous.
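To make the ambiguity concrete, here is a small sketch. The looks_like_mimebundle heuristic and the sample data are hypothetical, written only to show that a structural guess based on "type/subtype" keys cannot tell a mimebundle from a nested container.

```python
import re

# Keys of a mimebundle look like "type/subtype".
MIME_KEY = re.compile(r"^[\w.+-]+/[\w.+-]+$")


def looks_like_mimebundle(value):
    """Heuristic: a non-empty dict whose keys all look like mimetypes."""
    return (
        isinstance(value, dict)
        and bool(value)
        and all(isinstance(k, str) and MIME_KEY.match(k) for k in value)
    )


image = {"image/png": "<base64 data>", "text/plain": "<Image>"}
container = {
    "key": image,
    "key2": [image, image, image],
    "key3": {"sub/key": image},  # meant as a nested container
}

# Both a real mimebundle and the nested container pass the check.
guesses = {name: looks_like_mimebundle(value) for name, value in container.items()}
```

Here guesses["key3"] comes out the same as guesses["key"], even though key3 was meant as a container, so a frontend would need an explicit marker rather than a guess.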

You can also pretty easily define a List class that can nest other reprs:

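class List:
    """Container whose HTML repr nests the HTML reprs of its items."""

    def __init__(self, *objs):
        self.objs = list(objs)

    def _repr_html_(self):
        # get_repr is the prototype helper proposed in this issue, not an
        # existing IPython function; it is used here as returning a tuple
        # whose first element maps mimetypes to reprs.
        rep = ['<ul>']
        for o in self.objs:
            rep.append('<li>'+get_repr(o)[0]['text/html']+'</li>')
        rep.append('</ul>')
        return ''.join(rep)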

Arguably this puts more work on the implementer, and there is some nesting you cannot do, but it works with current frontends and can handle recursion relatively easily.

[Image: screen shot 2017-06-09 at 13 55 07]

I would prefer to investigate that (potentially in disp, and by registering a formatter for list/dict) than to start tweaking the frontends.
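As a rough illustration of the "register a formatter for list" route, here is a sketch under simplifying assumptions. list_html and register are hypothetical names; for_type on the text/html formatter is the existing IPython registration call. For brevity the sketch looks up _repr_html_ directly on each item, which is exactly the shortcut this issue says is not reliable in general, and it does not recurse into nested lists.

```python
from html import escape


def list_html(items):
    """Render a list as <ul>, using each item's HTML repr when it has one."""
    parts = ["<ul>"]
    for item in items:
        # Look the method up on the type so class objects are handled too.
        method = getattr(type(item), "_repr_html_", None)
        body = method(item) if method is not None else None
        if body is None:
            body = escape(repr(item))  # plain-text fallback
        parts.append("<li>" + body + "</li>")
    parts.append("</ul>")
    return "".join(parts)


def register(shell):
    """Attach list_html to the text/html formatter of an IPython shell."""
    shell.display_formatter.formatters["text/html"].for_type(list, list_html)
```

Inside a session this would be enabled with `register(get_ipython())`; frontends that only show text/plain would be unaffected.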

@Carreau Carreau added a commit to Carreau/ipython that referenced this issue Jun 9, 2017

@Carreau Carreau [WIP] Safe method to get the repr of an object.
 - Protect against recursion.
 - Incorrect because of bugs in IPython, in particular the priority of
   formatters is not well respected.

First step toward #10647
8d76941
Owner

Carreau commented Jun 9, 2017

See #10651

The other disadvantage of an application/jupyter-container+json is that most of the time it would be the only available mimetype, so all current frontends would have nothing to display. It would be good to iterate on, though.

Owner

rgbkrk commented Jun 9, 2017

I would prefer to investigate that (potentially in disp, and by registering a formatter for list/dict) than to start tweaking the frontends.

Ok, that sounds like a good exploration.

Owner

ellisonbg commented Jun 10, 2017

Owner

ellisonbg commented Jun 10, 2017

Contributor

mpacer commented Jun 10, 2017 edited

I'll admit I'm confused about this data representation problem. But I still have something to contribute that is a different approach.

I think I found a case that we should have covered in the representational capacity of our native displaying technology. Call it a minimal expressibility constraint.

A minimal expressibility constraint of our mimetype system: be able to natively inline open ai's gym's env.render() operation.

There should be a way for me to write an extension to the gym.core.Env object type to add a display mechanism compatible with our ipython display() operation.

Note: I am not arguing that we should design this API/data-representation to make a scenario that is this complicated easy to display.

However, even if it is complicated it should be something that is straightforward, documented and (most of all) possible to do.

That way, people wouldn't need to invent workarounds such as using matplotlib's plt.imshow:

Should work to do plt.imshow(env.render(mode='rgb_array')) inside of your notebook cell
openai/gym#56

Context: playing around with openai's gym on my vacation and it'd be nice to just wrap that in a display call and inline it into the jupyter notebook.

I would enjoy writing the extension to handle this case eventually. First, I'm going to learn more about the gym library itself. That way I can speak more to what exact representation I could imagine using.

Aside: Since gym lets you play Atari games, one thing that might be cool would be generating data by playing the game and having the moves saved. But that sounds like a much more complicated task. It seems like it should be possible using ipywidgets, based on @SylvainCorlay's & @jasongrout's demos with a video game controller and flight simulator. This would be much simpler to render. Additionally, the input would be fully recordable, making it potential training data for a machine learning algorithm. I don't think that'd be handled by a "mimetype" renderer though.

@jhamrick: wouldn't this be a mechanism to get tutor style training from experts for various games?

Owner

Carreau commented Jun 10, 2017

A minimal expressibility constraint of our mimetype system: be able to natively inline open ai's gym's env.render() operation.

There should be a way for me to write an extension to the gym.core.Env object type to add a display mechanism compatible with our ipython display() operation.

Note: I am not arguing that we should design this API/data-representation to make a scenario that is this complicated easy to display.

However, even if it is complicated it should be something that is straightforward, documented and (most of all) possible to do.

I'm confused. AFAIU it is possible, documented, and not really hard (spark example); of course it might depend on the internals of gym. Also AFAICT, rendering as a widget could be handled by a widget mimetype.

The question of this issue is: if the gym authors provide a _repr_*_, and you as a user provide an alternative _repr_*_ that takes precedence over gym's one, how can you programmatically and reliably get the right one? For instance, if you actually write this widget, it needs to embed the repr of the game and a slider for the timestep. How do you get the game repr in code, without displaying it?
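The precedence problem can be shown with a toy model. HtmlFormatterModel and Game are hypothetical stand-ins, not IPython classes; the lookup order (a function registered for the type first, then the object's own method) is meant to mirror, in simplified form, how a per-mimetype formatter resolves a repr.

```python
class HtmlFormatterModel:
    """Toy model of one mimetype formatter (not IPython's class)."""

    def __init__(self):
        self.type_printers = {}

    def for_type(self, typ, func):
        self.type_printers[typ] = func

    def __call__(self, obj):
        # 1. A function registered for the type (or a base class) wins.
        for klass in type(obj).__mro__:
            if klass in self.type_printers:
                return self.type_printers[klass](obj)
        # 2. Otherwise fall back to the object's own method, if any.
        method = getattr(type(obj), "_repr_html_", None)
        return method(obj) if method is not None else None


class Game:
    def _repr_html_(self):
        return "<i>library repr</i>"


formatter = HtmlFormatterModel()
before = formatter(Game())
formatter.for_type(Game, lambda game: "<b>user repr</b>")
after = formatter(Game())
```

After the registration, calling Game()._repr_html_() directly still returns the library's repr, while going through the formatter returns the user's one. That gap is why code embedding another object's repr needs to ask the formatter rather than the object.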

Owner

ellisonbg commented Jun 11, 2017

Contributor

mpacer commented Jun 11, 2017 edited

wondering if we should define a new MIME type to indicate that a link should be opened in an external iframe. In lab, this could be in the dock panel, in classic notebook a separate tab. Would make it really easy to integrate with things such as gym and tensorboard without trying to cram them into notebook output....
~~ @ellisonbg

↑ THIS. +1

I was just thinking this wrt piping a pyglet interface, since that's what openai gym actually is.

However, I want to maintain the ability to interface as well as record the content, so there's going to have to be some kind of tight coupling. But I don't think it should actually be saved to the notebook document format.

This does seem like a great case for a jupyterlab widget though.

Contributor

mpacer commented Jun 11, 2017

@Carreau It looks like a lot of the renders are via pyglet, so it would require piping a pyglet rendering into a context that the notebook needs to know how to describe. They're not outputting PNGs; you have to trick them into doing it by using a different mechanism. And then it's not nearly so seamless an experience. My point is just, we should know to expect that kind of data there to be playable until it isn't there. You have to do more than just the plt.imshow trick because it's using animations and those are just the side effect files. And they also plug into pygame to make interactive games that can also be monitored and saved in the format that the training steps take.

Contributor

mpacer commented Jun 11, 2017 edited

@Carreau there is another approach that resorts to recording it to a background video, which we could do, but it seems like it's going to be a lot harder to just "display" that kind of code using the standard code pattern. It seems like it will need async conversion to an image sequence, so that the display can accept live data from the ipython process. That process is somehow(‽) generating the png-ified pyglet image in the background, and it has to do so in a non-blocking fashion, or else this won't work well. Additionally, it would need to be able to receive live data from an asyncio/trio event loop managing the interaction with pygame. Pygame is the output method they use when you "play" the game (interactively) instead of "rendering" it (passively).

Owner

Carreau commented Jun 11, 2017

To further hi-jack this thread off to new ideas...wondering if we should
define a new MIME type to indicate that a link should be opened in an
external iframe. In lab, this could be in the dock panel, in classic
notebook a separate tab. Would make it really easy to integrate with things
such as gym and tensorboard without trying to cram them into notebook
output....

That will be problematic in the use case that @mpacer has in a JupyterHub context. You will have to teach the Hub about local services and handle authentication. It feels like shoving something into the spec that it is not completely meant for. I see the use case, but if we make it work we need to think about the ramifications; why not just publish HTML with a link that the user can click?

My point is just, we should know to expect that kind of data there to be playable until it isn't there. You have to do more than just the plt.imshow trick because it's using animations and those are just the side effect files. And they also plug into pygame to make interactive games that can also be monitored and saved in the format that the training steps take.

We can't "just know" about every format on earth (and beyond). AFAICT what you need is to draw pixels on screen, plus back-and-forth communication. This likely means a canvas for efficient updates, and indeed widgets. If it's "live" data it's tricky, but I'm going to assume the training/playing is written partially in multithreaded C, so it does not hold the GIL. Hence "blocking" should be OK.

If it's using PyGame, maybe one of the best things we can do is figure out how to get PyGame to render on a canvas and have something that is not gym-specific. This will likely use widgets, so it should not need any changes to IPython/Jupyter itself.

Owner

rgbkrk commented Jun 11, 2017

wondering if we should define a new MIME type to indicate that a link should be opened in an
external iframe

Is there a way to use target= to target an iframe?

At least with setting <base target="_blank"> on the overall page it guarantees a separate window.
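On the target= question: in plain HTML, a link whose target matches the name attribute of an iframe loads in that iframe. A small sketch that builds such markup as a string follows; link_into_iframe and the "preview" frame name are hypothetical, and whether a given frontend keeps iframes in sanitized output is not something this sketch assumes.

```python
from html import escape


def link_into_iframe(url, label, frame_name="preview"):
    """Return HTML for a link that opens in a named iframe placed after it."""
    href = escape(url, quote=True)
    name = escape(frame_name, quote=True)
    text = escape(label)
    return (
        f'<a href="{href}" target="{name}">{text}</a>'
        f'<iframe name="{name}" width="100%" height="300"></iframe>'
    )
```

The returned string could be published as a text/html output; the link and the iframe only need to agree on the frame name.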

@Carreau Carreau added a commit to Carreau/ipython that referenced this issue Jun 12, 2017

@Carreau Carreau [WIP] Safe method to get the repr of an object.
 - Protect against recursion.
 - Incorrect because of bugs in IPython, in particular the priority of
   formatters is not well respected.

First step toward #10647
f6c192d

@Carreau Carreau added a commit to Carreau/ipython that referenced this issue Jun 12, 2017

@Carreau Carreau [WIP] Safe method to get the repr of an object.
 - Protect against recursion.
 - Incorrect because of bugs in IPython, in particular the priority of
   formatters is not well respected.

First step toward #10647
7b2803f

@Carreau Carreau added a commit to Carreau/ipython that referenced this issue Jun 12, 2017

@Carreau Carreau [WIP] Safe method to get the repr of an object.
 - Protect against recursion.
 - Incorrect because of bugs in IPython, in particular the priority of
   formatters is not well respected.

First step toward #10647
2c5f60e

@Carreau Carreau added a commit to Carreau/ipython that referenced this issue Jun 12, 2017

@Carreau Carreau [WIP] Safe method to get the repr of an object.
 - Protect against recursion.
 - Incorrect because of bugs in IPython, in particular the priority of
   formatters is not well respected.

First step toward #10647
7ebd81e

@Carreau Carreau added a commit to Carreau/ipython that referenced this issue Jun 12, 2017

@Carreau Carreau [WIP] Safe method to get the repr of an object.
 - Protect against recursion.
 - Incorrect because of bugs in IPython, in particular the priority of
   formatters is not well respected.

First step toward #10647
6424e91
Owner

Carreau commented Jun 14, 2017

@ellisonbg, @rgbkrk how open would you be to standardizing some CSS class names across nteract/Notebook/Lab/...? Right now, using pure HTML/CSS, I can get something like the following simply by registering formatters on types.FunctionType, dict, tuple, list, type and requests.Request objects.

[Image: html fold]

In more detail: dict/list/tuple show their delimiters with their content collapsible, recursively. Functions show the normal repr, expandable to show the content of oinfo-request. Types show the normal repr, expandable to show base types, recursively. Getting it to look right needs custom styling, though, and it would be nice to know that some rules are shipped in all frontends. So we would need to standardise a couple of class names.
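A rough sketch of the collapsible container idea in pure HTML, not the formatters used for the screenshot. fold_html and the "repr-fold" class name are hypothetical; the class name stands in for whatever name frontends might agree on, and the folding relies on the standard details/summary elements.

```python
from html import escape

# Opening and closing delimiters shown in the collapsed summary.
DELIMS = {dict: ("{", "}"), list: ("[", "]"), tuple: ("(", ")")}


def fold_html(obj, css_class="repr-fold", _seen=None):
    """Render dict/list/tuple as nested, collapsible HTML; others via repr."""
    seen = set() if _seen is None else _seen
    kind = type(obj)
    if kind not in DELIMS:
        return "<code>" + escape(repr(obj)) + "</code>"
    if id(obj) in seen:
        return "<code>...</code>"  # container already open: break the cycle
    seen.add(id(obj))
    if kind is dict:
        rows = [
            escape(repr(k)) + ": " + fold_html(v, css_class, seen)
            for k, v in obj.items()
        ]
    else:
        rows = [fold_html(v, css_class, seen) for v in obj]
    seen.discard(id(obj))
    opening, closing = DELIMS[kind]
    items = "".join("<li>" + row + "</li>" for row in rows)
    return (
        '<details class="' + css_class + '"><summary>' + opening + "..."
        + closing + "</summary><ul>" + items + "</ul></details>"
    )
```

Without shared CSS rules for the class, each frontend would render the nested lists with its own default styling, which is the motivation for agreeing on a few names.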

Owner

rgbkrk commented Jun 14, 2017

If we standardize css class names, they belong in both the nbformat spec and the jupyter client message spec. As long as they get standardized in there, I'm happy to support them wherever. Are you using <summary> and <description> above?

Owner

Carreau commented Jun 14, 2017

As long as they get standardized in there, I'm happy to support them wherever. Are you using <summary> and <description> above?

Yep.

If we standardize css class names, they belong in both the nbformat spec and the jupyter client message spec

Why in the spec ? We allow any mimetype, and don't require all mimetypes to be present. Can we just have a formal "well, the IPython kernel does this, with these class names, and most of the frontends agree on the meaning"? Agreed it should be documented, but I'm unsure about actually putting it in nbformat.

Owner

rgbkrk commented Jun 14, 2017

Why in the spec ?

The text/html and application/javascript mimetypes have this implicit support / ability from the classic notebook that has been a bit of a pitfall with other frontends (especially as libraries code for it).

Owner

ellisonbg commented Jun 14, 2017

Owner

ellisonbg commented Jun 15, 2017

Owner

Carreau commented Jun 19, 2017

With JupyterLab we are working hard to provide a lot of clarity about what
things are public APIs and which things are not. The approach (how it
turned out) in the classic notebook was to have users directly modify the
DOM and target our CSS classes in extensions. The difficulty with that is
that essentially makes your DOM structure and CSS classes public APIs,
which is overly constraining from a maintenance perspective. With Lab, our
policy is that the DOM structure and CSS classes are entirely private. As
Kyle mentioned, this type of thing also makes it difficult for other
frontends to cover these cases.

Thanks. I was not at all speaking of already existing classes, currently existing DOM structures, or even of DOM structures created by the frontend. The above example cannot be rendered by only a tree renderer that handles JSON. Creating a new mimetype does not work either. And "well-scoped" CSS cannot work cleanly with the current architecture either, as each leaf node would have to ship a duplicate of the 80 lines of CSS fixes to render identically on Lab/nteract/Classic.
