Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshot widget state when syncing #1044

Closed
hainm opened this issue Jan 17, 2017 · 55 comments
Closed

Snapshot widget state when syncing #1044

hainm opened this issue Jan 17, 2017 · 55 comments
Labels
bug resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion.
Milestone

Comments

@hainm
Copy link

hainm commented Jan 17, 2017

Summary

We should snapshot the widget state when syncing state from the frontend to the kernel, since the sync may happen asynchronously because of the throttling.


Original issue

hi

we have simple code in nglview that works fine with ipywidgets 5.2.2.
However, there is no effect if using mode.set(...) in 6.0.0.beta5 version.

one example:

this.model.set('camera_str', JSON.stringify(this.stage.viewer.camera))

will not update camera_str in Python.
https://github.com/arose/nglview/blob/master/js/src/widget_ngl.js#L77-L82

@jasongrout
Copy link
Member

Can you open up the debugger and see if the widget sync message is getting sent over the websocket? I can help you find where this message should be - let me know if you want help finding it.

@jasongrout jasongrout added this to the 6.0 milestone Jan 31, 2017
@hainm
Copy link
Author

hainm commented Jan 31, 2017

let me know if you want help finding it.

yes, please. I am using chrome.

btw, below are screen shots (6.0.0.beta6 vs 5.2.2).
I also clicked the "debug" button in my screen but I think this is not what you meant.

6.0.0.beta6

debug

5.2.2

old

@jasongrout
Copy link
Member

So now go to the 'network' tab in the debug window, and filter to just find websocket (WS) connections. Take a look at the messages going back and forth on the websocket channel.

@jasongrout
Copy link
Member

I tried to reproduce, but ran into errors importing pytraj. Do you have a minimal example not using pytraj?

@hainm
Copy link
Author

hainm commented Jan 31, 2017

either

  • install pytraj
    pip install pytraj

  • or use

# cell 1
import nglview as nv
view = nv.demo()
view

# cell 2
# click the atoms
# then
view.picked

@hainm
Copy link
Author

hainm commented Jan 31, 2017

click

@jasongrout
Copy link
Member

(it may be more helpful to try to debug this on chat, as we can communicate more quickly than here. Please ping me or @SylvainCorlay if you're around on https://gitter.im/ipython/ipywidgets)

@jasongrout
Copy link
Member

preliminary results: it appears that the widget is sending a number of custom messages, which increments the pending_msgs, but it never gets decremented back to 0, so any .touch() after that queues up its sync message instead of sending.

@jasongrout
Copy link
Member

@hainm, does changing this line to this.send instead of this.model.send fix the problem?

https://github.com/arose/nglview/blob/b776e517e459c96c94bbeb39bbeabca968e0ef22/js/src/widget_ngl.js#L185

Basically, the problem ended up being that the frontend sent a message to the backend without setting a callback. The callback is how we keep track of the number of messages in flight. Since the callback wasn't set, the message counter wasn't ever decremented to 0, and that blocked all other outgoing messages. The view send function automatically sets the callbacks, whereas the model send function relies on you setting the appropriate callbacks (such as in https://github.com/ipython/ipywidgets/blob/master/jupyter-js-widgets/src/widget.ts#L694).

Next thing to look at - why doesn't the model.send function set a callback, since the model send function is the one incrementing the pending messages counter. That seems to be a bit of spaghetti code on our end right now.

@hainm
Copy link
Author

hainm commented Feb 1, 2017

hi,

I think the this.model.send is not an issue, it is this.model.set method issue. https://github.com/arose/nglview/blob/b776e517e459c96c94bbeb39bbeabca968e0ef22/js/src/widget_ngl.js#L129

            this.model.set("picked", pd2);
            this.touch();

@jasongrout
Copy link
Member

Can you try it? I'm betting you'll be surprised :)

@hainm
Copy link
Author

hainm commented Feb 1, 2017

it does not fix that. I got new issue if following your change (this.send)

Exception in status msg handler TypeError: Converting circular structure to JSON
    at JSON.stringify (<anonymous>)
    at Object.serialize (main.min.js:22370)
    at Kernel.send_shell_message (main.min.js:23028)
    at Comm.send (main.min.js:22203)
    at DOMWidgetModel.WidgetModel._handle_status (:8869/nbextensions/jupyter-js-widgets/extension.js?v=20170131204412:16669)

@jasongrout
Copy link
Member

oh right, we had that issue too. Hmm...interesting, it's coming from _handle_status? Let me do some more digging tomorrow.

I really think the underlying issue with the sync not getting sent is that one message send that doesn't have callbacks. But let me look at this and see if I can see what the issue is with this.

@jasongrout
Copy link
Member

Oh, right, it's coming from handle_status because that's where the sync message is actually being sent (and the serialization is actually being done).

It's a good thing that we're getting the circular structure error - that's telling us that at least now it's trying to sync to the backend. There's something wrong with the data it's trying to send to the backend, though.

@jasongrout
Copy link
Member

Here is a function that can detect circular references:

var findCircular = function findCircular(obj, ancestors, lineage) {
    ancestors = ancestors || [obj];
	lineage = lineage || [null]
    for (var prop in obj) {
        if (obj.hasOwnProperty(prop)) {
            var value = obj[prop];
            if (value !== null && typeof(value)=="object") {
				var i = ancestors.indexOf(value);
				if (i !== -1) {
					// Circular reference found
					console.log(['circular reference', prop, value, i, ancestors, lineage]);
					var lineageplus = lineage.slice(1);
					lineageplus.push(prop);
					console.log(["paths the same: ['"+lineage.slice(1,i+1).join("']['")+"'] and ['"+ lineageplus.join("']['") + "']"]);
					return false;
				}

                // Step down in the object tree
                // Store value in our ancestor collection
				var a = ancestors.slice();
				var l = lineage.slice();
                a.push(value);
				l.push(prop)
                findCircular(value, a, l);
            }
        }
    }
} 

I modified the NGLViewClass to have this function:

		touch: function() {
			var state = this.model._buffered_state_diff;
			console.log(state);
			widgets.DOMWidgetView.prototype.touch.apply(this, arguments);
			console.log(['.touch() called on '+this.model.id, this.model.msg_buffer]);
			findCircular(this.model.msg_buffer);
			console.log('.touch() ended on '+this.model.id);
		},

That should catch when a model has circular references at the time the model is saved. However, no circular references show up at the time when touch() is called. However, if I put a similar block of code in ipywidgets right when the message in the model's message buffer is actually sent, I do detect circular references, such as these two paths in the object tree pointing to the same object: ['_init_representations']['2']['params']['gidPool']['objectList']['0'] and ['_init_representations']['2']['params']['gidPool']['objectList']['0']['_cp']['structure']. My conjecture is that when .touch() is called, attributes to an object are saved into the message buffer, waiting to be sent until after the custom comm messages have been processed. However, sometime between then and when the properties are sent, the object stored in the message buffer is modified to insert a bunch of self-references.

Does this make sense? Does this seem right to you?

I can think of two things we can do:

  1. right now, comm messages pre-empt state updates. I think that seems unfair.

  2. When a message buffer is modified, we should serialize and deserialize the attributes to create a deep copy.

@hainm
Copy link
Author

hainm commented Feb 1, 2017

Does this make sense? Does this seem right to you?

I tag @arose here with the hope that he can reply to you. I am quite naive with JS :D.

@arose
Copy link

arose commented Feb 1, 2017

It looks like the problem is trying to sync the pickingData object. There are references to NGL's structure object which you definitely do not want to sync. I suggest you copy only the data you need from the pickingData object and for atomProxy and bondProxy object you call .toObject to get a plain javascript object without references to any NGL object.

@hainm
Copy link
Author

hainm commented Feb 1, 2017

@arose: picked in only one example.

even this.model.set("frame", frame); won't update frame in backend. (frame is in integer).

@jasongrout
Copy link
Member

@hainm - the problem is that the sync message is delayed until the in-flight comm messages are resolved. When the comm messages finish being processed, it tries to send all of the sync data that's been building up from multiple this.touch() calls in the code. It appears that at least some of that data starts including things like the above, and tries to serialize and sync. My prediction is that this would happen even if you deleted the this.model.set("frame", frame) line.

We need to do some changes on our end, namely serialize the data right when this.touch() is called, even if it isn't sent at that point. That's a bit tricky because we'll need to correctly merge in further updates that may happen before the message is actually sent. We also should look at whether we want custom comm messages to preempt sync messages.

@arose
Copy link

arose commented Feb 1, 2017

even this.model.set("frame", frame); won't update frame in backend. (frame is in integer).

also raising a circular references error?

@jasongrout
Copy link
Member

also raising a circular references error?

There's a lot more going on behind the scenes than just that one data value. As best as I can guess, it says that it needs to sync some large amount of state, which doesn't contain circular references, and then the state changes to have some circular references before it actually serializes the state to JSON.

@jasongrout
Copy link
Member

It sounds like on the nglView widget's end, we need to be smarter about the data we're actually syncing, like @arose mentions above?

I'll work on the ipywidgets end - making it so that the syncing takes a snapshot of the state when you ask it to sync, and getting the state updates merging to work. It sounds like there will be more work needed to make sure that we are only syncing state that we want to sync on the ngl widget side.

@hainm
Copy link
Author

hainm commented Feb 1, 2017

Thanks. I will take care of nglview.

This would help closing this too #1038

@jasongrout
Copy link
Member

This is going to be a major code refactoring and change, I think better saved for after 6.0.

@jasongrout jasongrout modified the milestones: 6.1, 6.0 Feb 14, 2017
@jasongrout jasongrout added bug and removed needs info labels Feb 14, 2017
@jasongrout
Copy link
Member

https://github.com/arose/nglview/blob/master/nglview/widget.py#L87 - but with _model_module instead of _view_module.

@hainm
Copy link
Author

hainm commented Apr 13, 2017

thanks @jasongrout. After updating nglview code, there is another similar error but I don't know how to fix.

TraitError: The '_model_module' trait of a TrajectoryPlayer instance must be a unicode string, but a value of None <class 'NoneType'> was specified.

TrajectoryPlayer inherits from ipywidgets.widgets.domwidget.DOMWidget

https://github.com/arose/nglview/blob/master/nglview/player.py#L22

@jasongrout
Copy link
Member

jasongrout commented Apr 13, 2017 via email

@hainm
Copy link
Author

hainm commented Apr 13, 2017 via email

@jasongrout
Copy link
Member

Oh, very interesting! My thought in making that change is that there should always be a javascript subclass. Can you describe a bit more about how TrajectoryPlayer works? I see it is syncing a lot of attributes - how are those used in the browser?

@hainm
Copy link
Author

hainm commented Apr 13, 2017

hi,

There is nothing special about this class.
The only reason I let TrajectoryPlayer inherited from DOMWidget is to use observe decorator (very handful).
This class has _view property, which is nglview.NGLWidget, the main widget of nglview. So all communications between backend and broswer will be perform via NGLWidget.

If we ignore TrajectoryPlayer, there is still error below :D
weird

@hainm
Copy link
Author

hainm commented Apr 13, 2017

The only reason I let TrajectoryPlayer inherited from DOMWidget is to use observe decorator (very handful).

If change

TrajectoryPlayer(DOMWidget) to TrajectoryPlayer(object), there will be error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-a2707b8cc497> in <module>()
      1 from nglview.player import TrajectoryPlayer
      2 
----> 3 TrajectoryPlayer(object())

nglview/nglview/player.py in __init__(self, view, step, delay, sync_frame, min_delay)
     67                  min_delay=40):
     68         self._view = view
---> 69         self.step = step
     70         self.sync_frame = sync_frame
     71         self.delay = delay

miniconda3/envs/jupyterlab/lib/python3.5/site-packages/traitlets/traitlets.py in __set__(self, obj, value)
    581             raise TraitError('The "%s" trait is read-only.' % self.name)
    582         else:
--> 583             self.set(obj, value)
    584 
    585     def _validate(self, obj, value):

miniconda3/envs/jupyterlab/lib/python3.5/site-packages/traitlets/traitlets.py in set(self, obj, value)
    555 
    556     def set(self, obj, value):
--> 557         new_value = self._validate(obj, value)
    558         try:
    559             old_value = obj._trait_values[self.name]

miniconda3/envs/jupyterlab/lib/python3.5/site-packages/traitlets/traitlets.py in _validate(self, obj, value)
    588         if hasattr(self, 'validate'):
    589             value = self.validate(obj, value)
--> 590         if obj._cross_validation_lock is False:
    591             value = self._cross_validate(obj, value)
    592         return value

AttributeError: 'TrajectoryPlayer' object has no attribute '_cross_validation_lock'

@jasongrout
Copy link
Member

The only reason I let TrajectoryPlayer inherited from DOMWidget is to use observe decorator (very handful).

Ah. You should inherit from HasTraits:

from traitlets import HasTraits
class TrajectoryPlayer(HasTraits):

That gives you the observe decorator. There's also no need for the .tag(sync=True) on the attributes either.

@hainm
Copy link
Author

hainm commented Apr 13, 2017

Ah. You should inherit from HasTraits:

cools. just tried and ok.

@hainm
Copy link
Author

hainm commented Apr 13, 2017

Per Could not create model error, I blindly added those

(Not sure what' the difference between view and model here)

class NGLWidget(DOMWidget):
    _view_name = Unicode("NGLView").tag(sync=True)
    _view_module = Unicode("nglview-js-widgets").tag(sync=True)
    _model_name = Unicode("NGLView").tag(sync=True)
    _model_module = Unicode("nglview-js-widgets").tag(sync=True)

Got new kind of error:

Could not create model:
Model name	NGLView
Model module	nglview-js-widgets
Model module version	*
ModelType._deserialize_state is not a function

@jasongrout
Copy link
Member

Okay, for now, how about you change it to:

class NGLWidget(DOMWidget):
    _view_name = Unicode("NGLView").tag(sync=True)
    _view_module = Unicode("nglview-js-widgets").tag(sync=True)
    _model_name = Unicode("DOMWidgetModel").tag(sync=True)
    _model_module = Unicode("jupyter-js-widgets").tag(sync=True)

@hainm
Copy link
Author

hainm commented Apr 13, 2017

wow, that actually work. And I confirm that the original issue is solved too. thanks very much.

cheers.

@hainm
Copy link
Author

hainm commented May 20, 2017

Hi @jasongrout The issue appears again.

May be adding regression test for this?

wrong

@jasongrout
Copy link
Member

Hi @jasongrout The issue appears again.

Sorry, this issue was a while ago. What issue appeared again?

@hainm
Copy link
Author

hainm commented May 21, 2017

sorry, I should be clear. This issue appears again: #1044 (comment)

In short: mode.set(...) does not work again.

@jasongrout
Copy link
Member

What errors are you seeing?

@jasongrout
Copy link
Member

jasongrout commented May 21, 2017

From above:

I'll work on the ipywidgets end - making it so that the syncing takes a snapshot of the state when you ask it to sync, and getting the state updates merging to work.

This should be done now.

It sounds like there will be more work needed to make sure that we are only syncing state that we want to sync on the ngl widget side.

Did you look at this, to make sure you aren't trying to sync a circular reference? Last I remember, this ngl view still had an issue: #1044 (comment)

@hainm
Copy link
Author

hainm commented May 22, 2017

Yeah, NGL still has issue with serialization. However, #1270 does resolve the original issue. But the issue appeared again recently.

@hainm
Copy link
Author

hainm commented May 22, 2017

I just checkout the commit in #1270 and try again and the model.set(...) works fine.

@hainm
Copy link
Author

hainm commented May 22, 2017

What errors are you seeing?

strangely there is no error in console.

@jasongrout
Copy link
Member

Can you try checking the "Pause on exceptions" and running the code, to try to see where the error is?

@jasongrout
Copy link
Member

Let's continue conversation on #1375 to track whatever is happening this time.

@github-actions github-actions bot added the resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. label Feb 13, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion.
Projects
None yet
Development

No branches or pull requests

4 participants