Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: maintain a ref count for objects sent over remote #17464

Merged
merged 2 commits into from Apr 16, 2019

Conversation

@MarshallOfSound
Copy link
Member

commented Mar 19, 2019

Remote Fix

See #17464 (comment) for explanation

Fixes #17039

Refs: https://chromium-review.googlesource.com/c/chromium/src/+/1514516

This also includes the test change as that is the "repro" for this issue 馃槙

Testing Changes

Previously when this assertion failed all tests that ran after the failed assertion also failed. This ensure that the assertion fails for the test that actually caused the issue but cleans up the left-over windows so that future tests do not fail.

Release Notes

Notes: Fixed race condition where the remote module would sometimes fail to fetch properties of a remote object

@codebytere codebytere changed the title spec: clean up after a failed window count assertion test: clean up after a failed window count assertion Mar 20, 2019

spec: clean up after a failed window count assertion
Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 367088b to 54d20d1 Apr 15, 2019

@MarshallOfSound MarshallOfSound changed the title test: clean up after a failed window count assertion fix: post remote deref message asyncronously Apr 15, 2019

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 54d20d1 to 89f9de3 Apr 15, 2019

@zcbenz

This comment has been minimized.

Copy link
Member

commented Apr 16, 2019

This would trigger another regression. Consider following situation:

  1. remote.getCurrentWindow() to create remote object A for current window
  2. Garbage collection
  3. OnObjectGC called for remote object A, RunDestructor schedule for next tick
  4. remote.getCurrentWindow() to create remote object B for current window
  5. RunDestructor gets called, and the objectRegistry in the main process clears the entry for current window
  6. Any operations on remote object B now ends up with missing remote object error

See #4733 (comment) for more history on the original regression, and #4869 for the fix.

I think this PR fixes the CI failure because it deferred the RunDestructor so the actual problem is now hidden.

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 89f9de3 to f737634 Apr 16, 2019

@MarshallOfSound MarshallOfSound changed the title fix: post remote deref message asyncronously fix: maintain a ref count for objects sent over remote Apr 16, 2019

@MarshallOfSound

This comment has been minimized.

Copy link
Member Author

commented Apr 16, 2019

@zcbenz I have updated the PR with a (IMO) better approach to ensuring that we only ever remove an object from the cache if all sent references have been dereffed.

@MarshallOfSound

This comment has been minimized.

Copy link
Member Author

commented Apr 16, 2019

IMG_20190415_190716

This is the exact race condition we were hitting. In a rare case when a remote object was fetched a second time and a GC occured at the right time (between the main process sending the remote object metadata and the renderer process pulling the proxy out of a weak map) the renderer would tell the main process that the object had been GC'ed but would then on the next tick receive a remote message telling it to use the object it just free'ed.

I've included the commit message below for more info:

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side. The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message. This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from f737634 to cb9aeb9 Apr 16, 2019

@zcbenz

zcbenz approved these changes Apr 16, 2019

Copy link
Member

left a comment

Have ran the code in my mind, seems to be a perfect fix!

@MarshallOfSound

This comment has been minimized.

Copy link
Member Author

commented Apr 16, 2019

Note to self: test failures are because I didn't update https://github.com/electron/electron/blob/master/lib/browser/objects-registry.js#L69

Secondary note to self: typescript would have caught this 馃槅

fix: maintain a ref count for objects sent over remote
Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 6347a68 to 3e0f213 Apr 16, 2019

@MarshallOfSound MarshallOfSound merged commit 78411db into master Apr 16, 2019

15 of 16 checks passed

Backportable? - 5-0-x Backport Failed
Details
Artifact Comparison No Changes
Details
Semantic Pull Request ready to be squashed
Details
WIP Ready for review
Details
appveyor: win-ia32-debug AppVeyor build succeeded
Details
appveyor: win-ia32-testing AppVeyor build succeeded
Details
appveyor: win-ia32-testing-pr AppVeyor build succeeded
Details
appveyor: win-x64-debug AppVeyor build succeeded
Details
appveyor: win-x64-testing AppVeyor build succeeded
Details
appveyor: win-x64-testing-pr AppVeyor build succeeded
Details
build-linux Workflow: build-linux
Details
build-mac Workflow: build-mac
Details
electron-arm-testing Build #20190416.25 succeeded
Details
electron-arm64-testing Build #20190416.24 succeeded
Details
lint Workflow: lint
Details
release-notes Release notes found
@release-clerk

This comment has been minimized.

Copy link

commented Apr 16, 2019

Release Notes Persisted

Fixed race condition where the remote module would sometimes fail to fetch properties of a remote object

@MarshallOfSound MarshallOfSound deleted the clean-up-window-assertion branch Apr 16, 2019

@trop

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2019

I was unable to backport this PR to "5-0-x" cleanly;
you will need to perform this backport manually.

@trop trop bot added needs-manual-bp/5-0-x and removed target/5-0-x labels Apr 16, 2019

MarshallOfSound added a commit that referenced this pull request Apr 16, 2019

fix: maintain a ref count for objects sent over remote (#17464)
* spec: clean up after a failed window count assertion

Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

* fix: maintain a ref count for objects sent over remote

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.
@trop

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2019

A maintainer has manually backported this PR to "5-0-x", please check out #17825

1 similar comment
@trop

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2019

A maintainer has manually backported this PR to "5-0-x", please check out #17825

MarshallOfSound added a commit that referenced this pull request Apr 16, 2019

fix: maintain a ref count for objects sent over remote (#17464) (#17825)
* spec: clean up after a failed window count assertion

Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

* fix: maintain a ref count for objects sent over remote

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.

Kiku-Reise added a commit to Kiku-Reise/electron that referenced this pull request May 16, 2019

fix: maintain a ref count for objects sent over remote (electron#17464)
* spec: clean up after a failed window count assertion

Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

* fix: maintain a ref count for objects sent over remote

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can鈥檛 perform that action at this time.