Fix Doc.prototype.destroy #204

gkubisa · 2018-04-18T15:47:57Z

It fixes the issue where a document is re-added to a collection after calling destroy, causing a memory leak.

It should also fix #161.

Re not waiting for the unsubscribe callback in destroy, I think it is not necessary because:

unsubscribe cannot fail on the server side (as far as I see),
unsubscribe is performed automatically on disconnect,
the doc is marked as unsubscribed immediately on the client side.

The problem was that unsubscribe re-added the doc to the connection. Now the doc is removed from the connection after unsubscribe. Additionally, we no longer wait for the unsubscribe response before executing the callback. This is consistent with Query; unsubscribe can't fail anyway, and the subscribed state is updated synchronously on the client side.
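To make the ordering problem concrete, here is a minimal sketch using a toy model rather than ShareDB's real classes. ToyConnection, ToyDoc, destroyLeaky and destroyFixed are hypothetical names for illustration only. The sketch assumes that the old code removed the doc before unsubscribing, which is what "now the doc is removed after unsubscribe" implies, and that unsubscribing registers the doc with the connection again.

```javascript
// Toy model only: ToyConnection and ToyDoc are hypothetical stand-ins,
// not ShareDB's actual Connection and Doc.
function ToyConnection() {
  this.docs = {}; // id -> doc; anything left here stays reachable forever
}
ToyConnection.prototype.add = function(doc) {
  this.docs[doc.id] = doc;
};
ToyConnection.prototype.remove = function(doc) {
  delete this.docs[doc.id];
};

function ToyDoc(connection, id) {
  this.connection = connection;
  this.id = id;
  connection.add(this);
}
// Assumed behaviour: unsubscribing registers the doc with the connection again.
ToyDoc.prototype.unsubscribe = function() {
  this.connection.add(this);
};

// Assumed old order: remove first, then unsubscribe puts the doc back.
function destroyLeaky(doc) {
  doc.connection.remove(doc);
  doc.unsubscribe();
}

// Fixed order: unsubscribe first, then remove.
function destroyFixed(doc) {
  doc.unsubscribe();
  doc.connection.remove(doc);
}

var leakyConnection = new ToyConnection();
destroyLeaky(new ToyDoc(leakyConnection, 'a'));

var fixedConnection = new ToyConnection();
destroyFixed(new ToyDoc(fixedConnection, 'a'));

console.log(Object.keys(leakyConnection.docs)); // [ 'a' ]
console.log(Object.keys(fixedConnection.docs)); // []
```

In the leaky variant the connection still references the doc after destroy, so it can never be garbage collected.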

See https://github.com/nodejs/Release

coveralls · 2018-04-18T15:50:22Z

Coverage decreased (-0.2%) to 96.274% when pulling 5e009d1 on Teamwork:fix-doc-destroy into 68bde00 on share:master.

curran · 2018-04-19T08:18:29Z

lib/client/doc.js

@@ -104,10 +104,8 @@ emitter.mixin(Doc);
 Doc.prototype.destroy = function(callback) {
  var doc = this;
  doc.whenNothingPending(function() {
+    if (doc.wantSubscribe) doc.unsubscribe();


IMO the curly brace style is more clear, but it's a rather pedantic point.

Any way to add tests for this?

gkubisa · 2018-04-19T11:42:12Z

Thanks for the review @curran.

I added the braces as you suggested - I also prefer this style but tried to be consistent with the rest of the code base, which very often omits braces.

I also added an assertion to an existing test which failed before my changes and now it passes.

gkubisa · 2018-06-11T12:03:45Z

@nateps @ericyhwang , could you provide some feedback on this PR, or merge it, if it's all good, please?

ericyhwang · 2018-06-14T23:01:56Z

We've got our ~monthly Share PR review meeting tomorrow, so we'll take a look at this and the other couple PRs during the meeting!

(I'll also ask about perhaps switching to shorter, more frequent review sessions, since that'll be easier all around if Nate's schedule can now accommodate it.)

nateps · 2018-06-19T00:09:42Z

Great catch!

I agree that it isn't 100% clear whether we need to wait for the unsubscribe to happen, and we might get away with calling the callback before the unsubscribe is fully effective. As you mentioned, the client calls back when it is disconnected.

Here is the reasoning behind that: Calling back on disconnect is needed to ensure that the callback to unsubscribe is always called. (The unsubscribe callback may be called after acknowledgement from the server if we are connected, in a nextTick if we are disconnected, or at the time of disconnection.) If the client is disconnected, the server won't be able to send response messages and it will clean up the server-side agent responsible for the client. The client will get a new agent if it reconnects. So from the perspective of the client, the unsubscribe is effective immediately after a disconnection.

It is different if the client is connected, since if we call destroy() on a subscribed document, we could continue to get ops from the server until it acknowledges the unsubscribe. Now, if the document has been destroyed, the connection will ignore these ops. But if the document is then created or fetched again, it could get complicated. I'm not 100% sure I know where this would go wrong, but it seems like we should just be conservative and delay the callback until after the unsubscribe is complete to avoid any possibility of complexity arising from not waiting long enough to call back. In every case where I've assumed something should be safe to do and couldn't prove it, that later bit me. ;-)

Roughly, I could imagine something like the following being an issue:

Connection A is subscribed to a doc
Connection B performs an op on a doc. The server commits it
Connection A deletes the doc, then calls destroy immediately
As soon as the delete op is committed, Connection A sends an unsubscribe command
In the callback to destroy, Connection A creates the doc
Connection B's op is sent to Connection A before the unsubscribe or create messages are received by the server
Connection A receives the op from Connection B thinking that it shouldn't be getting any more ops, because it called destroy
Connection A might not be able to OT the op from Connection B against its local data, because the document was created anew. If we had waited until the unsubscribe were complete, then we wouldn't have this race condition.

(This would be an issue with existing ShareDB code as well. Just clarifying why I think we should wait until unsubscribe calls back.)

The above is really complicated, but I think it might be an issue, and there is an easy way to avoid testing fate in this case. Knowing that unsubscribe will always call back and destroy waits until pending operations are complete, I think it is best if we just wait until unsubscribe calls back in all cases before calling the destroy callback. I'm a lot more confident we won't run into any race conditions if we do it that way.

Thus, I recommend the following:

Doc.prototype.destroy = function(callback) {
  var doc = this;
  doc.whenNothingPending(function() {
    if (doc.wantSubscribe) {
      doc.unsubscribe(function() {
        doc.connection._destroyDoc(doc);
        callback();
      });
    } else {
      doc.connection._destroyDoc(doc);
      if (callback) callback();
    }
  });
};

gkubisa · 2018-06-19T12:24:52Z

I think it's ok to wait for unsubscribe in destroy, however, I'm not convinced that it's necessary and it doesn't solve a bigger problem I'll describe below.

Waiting for unsubscribe

First of all, here's a slightly improved version of destroy which handles unsubscribe errors, ensures the callback is always called asynchronously, and always checks whether a callback is specified.

Doc.prototype.destroy = function(callback) {
  var doc = this;
  var sync = true; // indicates if whenNothingPending's callback was executed synchronously
  doc.whenNothingPending(function() {
    if (doc.wantSubscribe) {
      doc.unsubscribe(function(err) {
        if (!err) doc.connection._destroyDoc(doc);
        if (callback) return callback(err);
        if (err) doc.emit('error', err);
      });
    } else {
      doc.connection._destroyDoc(doc);
      if (callback) {
        if (sync) process.nextTick(callback);
        else callback();
      }
    }
  });
  sync = false;
};
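The sync flag above is a general trick for normalising a function that sometimes calls back synchronously and sometimes asynchronously. Here is a minimal standalone sketch of it, assuming Node.js. maybeSync is a hypothetical stand-in for a function like whenNothingPending, and alwaysAsync is a made-up name; neither is part of ShareDB.

```javascript
// Hypothetical helper: calls `done` immediately when nothing is pending,
// or on a later tick when something is. Not ShareDB code.
function maybeSync(pending, done) {
  if (pending) setTimeout(done, 0);
  else done();
}

function alwaysAsync(pending, callback) {
  var sync = true; // true while we are still inside the initial call stack
  maybeSync(pending, function() {
    if (sync) process.nextTick(callback); // defer: the caller never sees a sync callback
    else callback();                      // already asynchronous, call through directly
  });
  sync = false;
}

var log = [];
alwaysAsync(false, function() {
  log.push('callback');
});
log.push('after call');
// At this point log is ['after call']; 'callback' is appended on the next tick.
```

Without the flag, the callback in this example would run before 'after call' is logged, so callers would see different ordering depending on whether anything was pending.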

Why I think it's not necessary

The scenario you outlined above looks like one that OT is designed to handle. From reading the source code, it looks like ShareDB can handle ops that arrive in the wrong order, target an older or newer version of the snapshot, carry conflicting versions, are duplicated, etc. So receiving extraneous or conflicting operations, or missing some operations, is not a problem.

It also looks like extraneous calls to _handleUnsubscribe (resulting from unsubscribe we did not wait for) would not affect the correctness of a new Doc.

Bigger problem

Docs are shared freely between queries and can be retrieved individually. This is efficient and great for performance; however, it might lead to unpredictable behaviour. For example, an application may have several independent components using ShareDB. If any of the components calls unsubscribe or destroy on its Docs, it might affect other components using the same Docs. If components never call unsubscribe, then unused Docs would still use up resources unnecessarily to process updates. If components never call destroy, then unused Docs will still be referenced by Connection and never garbage collected.

Here's a simple scenario in which destroy leads to problems, regardless of waiting for the callback:

Component A: Get Doc 1
Component B: Get Doc 1 (shared with Component A)
Component A: Destroy Doc 1 (Component B is not aware of it)
Component A: Get Doc 1 (new instance, not shared with Component B)

Components A and B now have 2 separate Doc instances but Connection can support only one. As the components use their Docs, they'll surely get some incorrect behaviour.
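The four steps above can be reproduced with a toy registry that keeps one doc instance per id. This is only a sketch: ToyRegistry and its get and destroy methods are hypothetical and much simpler than ShareDB's Connection.

```javascript
// Toy registry, not ShareDB's Connection: at most one doc instance per id.
function ToyRegistry() {
  this.docs = {};
}
ToyRegistry.prototype.get = function(id) {
  if (!this.docs[id]) this.docs[id] = {id: id, destroyed: false};
  return this.docs[id];
};
ToyRegistry.prototype.destroy = function(doc) {
  doc.destroyed = true;
  delete this.docs[doc.id];
};

var registry = new ToyRegistry();
var docA = registry.get('doc1');  // Component A: Get Doc 1
var docB = registry.get('doc1');  // Component B: Get Doc 1 (shared with A)
registry.destroy(docA);           // Component A: Destroy Doc 1 (B is not told)
var docA2 = registry.get('doc1'); // Component A: Get Doc 1 (new instance)

console.log(docA === docB);  // true: originally shared
console.log(docA2 === docB); // false: two instances for the same id
console.log(docB.destroyed); // true: B holds a destroyed doc without knowing
```

After the last step the registry only knows about docA2, so anything routed by id reaches Component A's instance and never Component B's.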

I'm not sure what the best way to solve it is... perhaps ref counting, to know when a Doc is no longer used and can be safely removed from the Connection. For subscriptions we could increment wantSubscribe on subscribe and decrement it on unsubscribe, so that wantSubscribe > 0 means subscribed and wantSubscribe === 0 means unsubscribed.
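A rough sketch of what the counting could look like, purely to illustrate the idea. CountedSubscription and its methods are hypothetical; in ShareDB wantSubscribe is currently a flag on Doc, and nothing below reflects an actual or planned API.

```javascript
// Hypothetical sketch of ref-counted subscriptions; not ShareDB code.
function CountedSubscription() {
  this.wantSubscribe = 0; // number of outstanding subscribe() calls
}
// Returns true only when a real subscribe message would need to be sent.
CountedSubscription.prototype.subscribe = function() {
  this.wantSubscribe++;
  return this.wantSubscribe === 1;
};
// Returns true only when the last user has gone and a real unsubscribe is due.
CountedSubscription.prototype.unsubscribe = function() {
  if (this.wantSubscribe === 0) return false; // unbalanced call, ignore it
  this.wantSubscribe--;
  return this.wantSubscribe === 0;
};
CountedSubscription.prototype.isSubscribed = function() {
  return this.wantSubscribe > 0;
};

var sub = new CountedSubscription();
sub.subscribe();                            // Component A
sub.subscribe();                            // Component B
sub.unsubscribe();                          // Component A is done
var subscribedAfterA = sub.isSubscribed();  // true: B still needs updates
var lastUserLeft = sub.unsubscribe();       // Component B is done
var subscribedAfterB = sub.isSubscribed();  // false
```

The same counter could gate destroy, so that a Doc is only removed from the Connection once every component has released it.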

ericyhwang · 2018-07-11T16:44:46Z

Nate's comments from the PR review meeting:

Philosophy: In Share, server shouldn't be responsible for maintaining client state, that should be all handled by the client.

Not going to do promises any time soon. Code should always be async (e.g. process.nextTick) or always be sync, not both.

Waiting on the callback longer likely won't be an issue, since you generally won't be blocking on destroy. I'm leaning towards doing what is safe.

For this PR: Remove the sync workaround, make a separate PR for making whenNothingPending async consistently.

gkubisa · 2018-07-12T11:37:40Z

I made the requested changes and created a new PR to fix whenNothingPending.

nateps

Thanks for the contribution! Definitely a good fix to make sure we are cleaning up memory properly. 💥

nateps · 2018-07-23T21:20:16Z

lib/client/doc.js

+      doc.unsubscribe(function(err) {
+        if (err) {
+          if (callback) callback(err);
+          else this.emit('error', err);


this should be doc. I'll just go ahead and merge this change and make the fix, since we're close.

gkubisa added 2 commits April 18, 2018 16:31

Update tested nodejs versions in .travis.yml

af84be6

See https://github.com/nodejs/Release

curran approved these changes Apr 19, 2018

Add a test

09edf92

Merge 'upstream/master' into fix-doc-destroy

c15448f

gkubisa mentioned this pull request Jun 26, 2018

Document version fetching #218

Closed

ericyhwang mentioned this pull request Jul 11, 2018

More maintainers required? #163

Closed

gkubisa added 3 commits July 12, 2018 12:24

Update tested nodejs versions

15cdd1d

Make destroy wait for unsubscribe

cfca37f

Simplify the code

5e009d1

gkubisa mentioned this pull request Jul 12, 2018

Make whenNothingPending always async #222

Merged

nateps approved these changes Jul 23, 2018

Merge branch 'master' into fix-doc-destroy

c26c79b

nateps merged commit d255048 into share:master Jul 23, 2018

gkubisa deleted the fix-doc-destroy branch July 24, 2018 07:31
