Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Give a more informative error for JSON not serializable. (#269) #273

Merged
merged 8 commits into from
Aug 1, 2018

Conversation

rmarren1
Copy link
Contributor

@rmarren1 rmarren1 commented Jun 17, 2018

Fix for #269
when a non JSON serializable object is returned from a dash callback, the following exception is raised:

dash.exceptions.ReturnValueNotJSONSerializable:
Callback for property `<property>` from component id `<id>`
returned a value `<value>` of type `<type>`,
which is not JSON serializable.

an example app:

import dash
import dash_html_components as html
from dash.dependencies import Input, Output

app = dash.Dash("")

app.layout = html.Div([
    html.Div(id='output'),
    html.Button(id='button')
])


@app.callback(Output('output', 'children'),
              [Input('button', 'n_clicks')])
def test(n_clicks):
    if n_clicks:
        return dash
    return "HELLO WORLD"


app.run_server()

results in:

dash.exceptions.ReturnValueNotJSONSerializable:
Callback for property `children` from component id `output`
returned a value `<module 'dash' from '/home/ryan/Dash/dash/dash/__init__.py'>` of type `module`,
which is not JSON serializable.

when the button is clicked (and returns the dash module rather than a value).

@chriddyp
Copy link
Member

Very nice! One case that I'm interested in is how we can deal with objects that have a non-JSON-serializable object deep inside an object, for example:

html.Div([
    html.Div(lambda: 'test')
])

With this current exception, it would say that "Div is not JSON serializable" which could be misleading. In an ideal case, we would be able to say something like:

The property "children" of a component "Div" is a "function" which is not serializable. This Div is located at:

Div.children[0].children

In general, Dash properties can only be dash components, strings, dictionaries, numbers, None, or lists of those.

Or, if the particular parent object was assigned an ID, like

html.Div([
    html.Div(lambda: 'test', id='some-div')
])

then our error message could be like:

The property "children" on the element "some-div" is not JSON serializable.

Now, I'm not actually sure the best way to go about doing this. One way might be to subclass the plotly.utils.PlotlyJSONEncoder. That class tries to encode an object a bunch of different ways. If it fails all of the ways, it raises an error:
https://github.com/plotly/plotly.py/blob/519e0ab32b2164583f78b51168ce3412c08334e5/plotly/utils.py#L213-L229

It's been a few years since I worked with that class, but I believe that json.dumps traverses the object that you give it and calls cls.default with the nested object
https://github.com/plotly/plotly.py/blob/519e0ab32b2164583f78b51168ce3412c08334e5/plotly/utils.py#L182

So perhaps we could raise a nicer exception within there somehow. I would be OK with our own JSON encoder class that uses that plotly.utils.PlotlyJSONEncoder.

Alternatively, we might be able to traverse the tree ourselves. If the object isn't serializable, then we could walk down the component tree until we find the particular property that isn't serializable. I actually wrote a method that traverses the tree: traverse:

def traverse(self):

which I believe you can call with something like:

for object in layout:
    print(object)

However, I don't think that really gives us the path of the component, it would only give us each individual component in the tree.

Another alternative would be to just raise an exception for the particular property and print the entire component. The user might be able to figure out which component was having the issue if they say both the property that wasn't JSON serializable and if the saw the rest of the properties that weren't serializable. For example, if the issue was something like:

html.Div([
   html.Div(style={'color': 'blue'}, children=lambda: 'test')
])

Then the error message would be like:

The property `children` in the following component is not serializable:
Div(children=<function <lambda> at 0x109181840>, style={'color': 'blue'})

I believe this would be relatively easy with the traverse method or even with subclassing the plotly.utils.PlotlyJSONEncoder and raising a dash.exceptions instead of the default TypeError

Oddly, I can't seem to find any other solution out there.

@rmarren1 rmarren1 changed the title Give a more informative error for JSON not serializable. (#269) [WIP] Give a more informative error for JSON not serializable. (#269) Jun 19, 2018
@rmarren1
Copy link
Contributor Author

rmarren1 commented Jun 19, 2018

I think a good solution would be to edit traverse to recursively build paths, like here:

    def traverse_with_paths(self):
        """Yield each item with its path in the tree."""
        children = getattr(self, 'children', None)
        children_type = type(children).__name__


        # children is just a component
        if isinstance(children, Component):
            yield children_type, children
            for p, t in children.traverse_with_paths():
                yield " -> ".join([children_type, p]), t


        # children is a list of components
        elif isinstance(children, collections.MutableSequence):
            for idx, i in enumerate(children):
                list_path = "{:s} index {:d} (type {:s})".format(
                    children_type,
                    idx,
                    type(i).__name__
                )
                yield list_path, i


                if isinstance(i, Component):
                    for p, t in i.traverse_with_paths():
                        yield " -> ".join([list_path, p]), t

then traverse can just be

    def traverse(self):
        """Yield each item in the tree."""
        for t in self.traverse_with_paths():
            yield t[1]

We can then test if every value in this traversal is one of these types valid = [str, dict, int, float, type(None), Component, dict]

actually, we would need to do something like

def _value_is_valid(val):
            return (
                any([isinstance(val, x) for x in valid]) or
                type(val).__name__ == 'unicode'
            )

to account for that python 2 edge case.

This can give very detailed exceptions, which are constructed here: https://github.com/rmarren1/dash/blob/da65b56ebd532ebf9081424467bf100d3a537f44/dash/dash.py#L481

One example I tried was returning this from a callback:

 html.Div(["Hello", "world", html.Span(["hi", html.Div(["hello", html.P([lambda: 'hi'])])])])

which raises this

dash.exceptions.ReturnValueNotJSONSerializable: 
The callback for property `children` of component `output`
returned a tree with one value having type `function`
which is not JSON serializable.

The value in question is located at

`Div -> list index 2 (type Span) -> list index 1 (type Div) -> list index 1 (type P) -> list index 0 (type function)`

and has string representation

`<function test.<locals>.<lambda> at 0x7f9b0d64c9d8>`.

In general, Dash properties can only be
dash components, strings, dictionaries, numbers, None,
or lists of those.

Also tox was passing locally, so I'll check into what the issue is with CircleCI. I'm going to add more test cases to make sure there arent valid values this is raising on.

@chriddyp
Copy link
Member

😻 that looks SO good @rmarren1 ! This seems like the winning solution.

@plotly/dash - Anyone else like to review?

@nicolaskruchten
Copy link
Contributor

nicolaskruchten commented Jun 21, 2018

Love this! I would recommend putting each path element on its own line rather than separating them with -> as those lines will get looong :)

We could also structure the lines a bit differently like:

The value in question is located at

Div
[2]: Span
[1]: Div
[1]: P
[0]: function

or something? Maybe add in the id of those elements if present?

@rmarren1
Copy link
Contributor Author

It turns out that traverse doesn't yield leaves of the tree which are not of type Component or collections.MutableSequence, which complicates things a bit since an exception will not be thrown for return values like

html.Div(
    html.Div(lambda: 'hi')
)

since the lambda is not wrapped in a list. (Is this the expected behavior of traverse? When I change it to return such elements, test cases fail.)

My quick patch was to check each value in the traversal, and if is a Component and has a child not of type collections.MutableSequence we validate the child.

Also, I changed the output format (thanks @nicolaskruchten). Outputs now read

The callback for property `children` of component `output`
returned a tree with one value having type `function`
which is not JSON serializable.

The value in question is located at
-   Div (id=top-div)
-   Div (id=list-div)
[2] Span 
-   function

and has string representation
`<function change_button.<locals>.<lambda> at 0x7fc67896d6a8>`

In general, Dash properties can only be
dash components, strings, dictionaries, numbers, None,
or lists of those.

I also added special cases for when the bad value is the only value returned or is a list (previously, the error message would talk about trees and print out a tree where there is only one element, this could be confusing). These error messages look like this now:

The callback for property `children` of component `output`
returned a value having type `function`
which is not JSON serializable.

The value in question is either the only value returned,
or is in the top level of the returned list,
and has string representation
`<function change_button.<locals>.<lambda> at 0x7fb2f3ca66a8>`

In general, Dash properties can only be
dash components, strings, dictionaries, numbers, None,
or lists of those.

@rmarren1 rmarren1 changed the title [WIP] Give a more informative error for JSON not serializable. (#269) Give a more informative error for JSON not serializable. (#269) Jun 25, 2018
@nicolaskruchten
Copy link
Contributor

Nice! I would favour the [index] for every element in the list, even when the id is present, so that you could mechanically just walk to the right place in the tree if you wanted do :)

@rmarren1
Copy link
Contributor Author

rmarren1 commented Jun 26, 2018

I think that was just a coincidence in the example I used, which was something like this:

html.Div(
    id='top-div',
    children=html.Div(
        id='list-div',
        children=['hi', 'hello', Span(lambda: 'hi')]
    )
)

If the id is present it should be printed no matter if the component is a single child or a member of a list: https://github.com/plotly/dash/pull/273/files#diff-228c9aefac729977642672aa1416bacaR511.

When an element is a singleton child the prefix is '- ' so that it aligns with the list elements that have prefix '[i] '

I will double check this

@nicolaskruchten
Copy link
Contributor

Right, so what I'm saying is that even if elements are singletons, I think we should show [0] rather than -.

@rmarren1
Copy link
Contributor Author

How could we then differentiate between a singleton element and the first element in a list?

c = html.Div(html.Div(lambda: 'hi'))

would need to be accessed like this:

c.children.children

and

html.Div([html.Div([lambda: 'hi', "valid string"])])

would need to be accessed like this:

c.children[0].children[0]

Showing - instead of [0] could tip off a crawler to use .children rather than .children[0].

I think that would make sense if Dash automatically converted singleton elements into length one lists (which might make coding a bunch of the internals easier).

@nicolaskruchten
Copy link
Contributor

Ah, I see what the problem is. In that case I might suggest [*] just for nicer visual effect scanning down :)

@rmarren1
Copy link
Contributor Author

rmarren1 commented Jul 4, 2018

Error paths now read like this:

[*] Div (id=top-div)
[*] Div (id=list-div)
[2] Span 
[*] function

looks better imo, thanks @nicolaskruchten

@rmarren1
Copy link
Contributor Author

Is this good to merge?

@@ -475,6 +475,110 @@ def _validate_callback(self, output, inputs, state, events):
output.component_id,
output.component_property).replace(' ', ''))

def _validate_callback_output(self, output_value, output):
valid = [str, dict, int, float, type(None), Component]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we JSON serialize with the json.dumps(obj, cls=plotly.utils.PlotlyJSONEncoder), this list actually has a few other items: https://github.com/plotly/plotly.py/blob/6b3a0135977b92b3f7e0be2a1e8b418d995c70e3/plotly/utils.py#L137-L332

So, if there was a component property that accepted a list of numbers, then technically the user could return a numpy array or a dataframe.

The only example that immediately comes to mind is updating the figure property in a callback with the plotly.graph_objs. These objects get serialized because they have a to_plotly_json method.

So, I wonder if instead of validating up-front, we should only run this routine only if our json.dumps(resp, plotly.utils.PlotlyJSONEncoder) call fails.

This would have the other advantage of being faster for the non-error case. If the user returns a huge object (e.g. some of my callbacks in apps return a 2MB nested component), doing this recursive validation might be slow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I am not sure how large of a speed up that would be since validating a large callback output is likely much faster than the network delay of sending that large output, but it definitely makes it easier than crafting a perfect validator (that would need to update every time we want to extend PlotlyJSONEncoder).

@chriddyp
Copy link
Member

I have just one more comment about the placement of the validation call. Otherwise, I'm 👍 with the code.

It would be great if another dash dev could review this so that more people are familiar with this code. @T4rk1n , could you take a look as well?

Copy link
Contributor

@T4rk1n T4rk1n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall.

@@ -44,3 +44,7 @@ class CantHaveMultipleOutputs(CallbackException):

class PreventUpdate(CallbackException):
pass


class ReturnValueNotJSONSerializable(CallbackException):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just me, but I don't like that exception name, it gives too much info on the cause while not giving the true nature of the error, that is the return value of a callback is invalid. I would change it to InvalidCallbackReturnValue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, pushed those changes.

response,
cls=plotly.utils.PlotlyJSONEncoder
)
except TypeError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 perfect

@chriddyp
Copy link
Member

💃 Looks good, let's do this!

@rmarren1 rmarren1 merged commit 30353a9 into plotly:master Aug 1, 2018
rmarren1 added a commit that referenced this pull request Aug 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants