Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bad interaction between methods and buffering #8088

Merged
merged 4 commits into from Jan 18, 2017

Conversation

mitar
Copy link
Contributor

@mitar mitar commented Nov 22, 2016

So it seems we introduced a bug with #5680.

The issue is the following scenario:

  • you have a publish sending changes to a field a
  • they get buffered
  • at the same time a client runs a method with a stub which changes field b
  • method is simulated and ends
  • buffered changes are applied
  • server-side confirms a method
  • client-side switches to a state after a simulated method call using replace message with the old a value, without that published change being present

Or something like this. It is really tricky race condition when this happens. But the solution is simple. Methods should always run on the latest state of data, without any pending buffered state. So this pull requests applies pending buffered state. In retrospective this looks obvious.

I am unsure how to make a test for this to discover such cases in the future with tests.

cc @nathan-muir

@mitar
Copy link
Contributor Author

mitar commented Nov 22, 2016

And do not ask how much debugging went into this one line code change.

@mitar
Copy link
Contributor Author

mitar commented Nov 22, 2016

I am not sure what you are asking.

@mitar
Copy link
Contributor Author

mitar commented Nov 22, 2016

My current workaround to fix this locally:

saveOriginals = Meteor.connection.__proto__._saveOriginals
Meteor.connection.__proto__._saveOriginals = ->
  @_flushBufferedWrites()
  saveOriginals.apply this, arguments

mitar added a commit to peer/mind that referenced this pull request Nov 22, 2016
@mitar
Copy link
Contributor Author

mitar commented Nov 22, 2016

Hm, method call between reset and quiescence test is failing now.

@mitar
Copy link
Contributor Author

mitar commented Nov 23, 2016

Changing the line to:

    if (!self._waitingForQuiescence())
      self._flushBufferedWrites();

Seems to fix the test, but it is unclear if this is the right thing to do.

@ppotoplyak
Copy link

@mitar what are the symptoms of hitting this issue? Could this cause client side method callbacks to not fire? #7868

@mitar
Copy link
Contributor Author

mitar commented Nov 23, 2016

You can try this small patch and see if it fixes for you. But I do not think it is related.

Symptom here for me was that data on the client collection was old data because. So updates to the collection were overridden. I do not think this happens for methods because methods are not "updates".

@mitar
Copy link
Contributor Author

mitar commented Nov 23, 2016

I added a test.

@mitar
Copy link
Contributor Author

mitar commented Dec 12, 2016

Can we get some eyes on this? @glasser?

@glasser
Copy link
Contributor

glasser commented Dec 12, 2016

Gosh, it's been a long long time since I've spent any time in this codebase. @benjamn do you think I'm still the best person to review this?

@abernix
Copy link
Contributor

abernix commented Jan 17, 2017

I'm not sure I'm the best person to review this either, although the original description and the recently added test (18b5ad2) makes it fairly clear to me. There might be a better fix, but I'd need to do quite a bit more digging to know. I'm curious if we should change the test to use lolex like the livedata stub - buffering data test for precision timing on it, but perhaps it's not necessary.

I can confirm that the test fails without the "fix" code and I've gone ahead and rebased this off devel.

@benjamn benjamn merged commit 7baed43 into meteor:devel Jan 18, 2017
@mitar mitar deleted the fix-buffering branch January 18, 2017 16:33
@benjamn
Copy link
Contributor

benjamn commented Jan 18, 2017

I think self._waitingForQuiescence() is at least reliably correlated with the condition we need to protect against: that self._resetStores is true because we recently reconnected (in which case there shouldn't be any buffered writes to flush anyway). Given the thoroughness of the tests, I agree this change seems safe as-is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants