Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Simperium sync issues #1598

Merged
merged 1 commit into from Oct 1, 2019
Merged

Fix: Simperium sync issues #1598

merged 1 commit into from Oct 1, 2019

Conversation

beaucollins
Copy link
Contributor

@beaucollins beaucollins commented Sep 27, 2019

Status

Ready to merge 🎉


Closes #1579

This PR changes the way that the editor's contents and Simperium interact; it's dependent on the changes made in simperium@1.0.0.

Previously we held content in the editor component and would delay sending those changes to the underlying Simperium client library (node-simperium) until the timeout flushed. This made for awkward interactions when we would received remote updates from the server that happened before that timeout flushed; this led to note corruption.

After this PR we are announcing immediately to the underlying client library when we make changes and telling it (via { sync: false } in noteBucket.update()) that it's okay to wait to synchronize those changes with the server. We still maintain our debounce, but instead of sending the editor component's text on the timeout we more simply tell the underlying library to flush its changes (via noteBucket.touch()). The effect is that we've removed a source of concurrency bugs with those remote updates.

After this PR we should be removing a number of related data loss and note corruption issues all relating to the interplay of queued and received changes.

Testing

See detailed testing instructions in p2XJRt-1jm-p2.
Most testing needs are to open up builds of the app with these changes and to exercise as many update interactions as possible. In #1579, for example, we triggered these updates via the following test:

Open the app on two devices: call one Local and the other Remote

  1. Create a note with contents AC and let it synchronize to both apps/devices
  2. Disconnect Local from the network
  3. On Local add a D to the end of the note to make ACD
  4. With Remote still connected to the network, add a B in the middle to make ABC
  5. After a few seconds (to ensure that Remote synchronizes with the server) re-connect Local
  6. Verify the state of the note on both apps/devices

Before this change we were finding variations of ACDD, ACD, AC, and more incorrect end-states. After this change we should find what we expect: ABCD

@beaucollins beaucollins force-pushed the fix/crossed-wires branch 2 times, most recently from e553a49 to df2c5e3 Compare September 27, 2019 04:32
lib/app.jsx Outdated Show resolved Hide resolved

debug('noteUpdatedRemotely: %O', data);

if (state.selectedNoteId !== noteId || !patch) {
return;
}

// working is the state of the note in the editor
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all handled by node-simperium.

lib/flux/app-state.js Outdated Show resolved Hide resolved
* be.
*
* node-simperium will compare these changes with the changes
* from the server and merge them together.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a wonderful beautiful comment

if (state.selectedNoteId !== noteId) {
return null;
}
return getState().note.data;

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

dmsnell added a commit that referenced this pull request Sep 27, 2019
See #1579, #1598
See Simperium/node-simperium#78
See Simperium/node-simperium#61

When we receive an update from a remote client we have been listening
for it and adjusting our local app state to account for that change.
Unfortunately in cases where we also have local or pending changes we
have been erroneously transforming or _rebasing_ the change before
applying it. The underlying `node-simperium` library has already
performd that transform, however, and when the client application
does that too it leads to double-writes and misapplied patches,
producing note corruption.

In this change we're stopping the rebase operation we have been
performing and that will remove this particular bug from the
application. This change does not solve all of the problem however
because we also have to make sure that the `node-simperium` library is
aware of our local state when it receives those updates.

Further work is taking place in Simperium/node-simperium#61 and #1598 to
close the loop on that but these changes are important enough on their
own to warrant a change.

This will close one bug while opening another but this is a dependent
part of the process in closing the broader issue.
lib/app.jsx Outdated Show resolved Hide resolved
lib/flux/app-state.js Outdated Show resolved Hide resolved
lib/flux/app-state.js Outdated Show resolved Hide resolved
lib/note-detail/index.jsx Outdated Show resolved Hide resolved
@dmsnell dmsnell marked this pull request as ready for review September 30, 2019 16:44
dmsnell added a commit that referenced this pull request Sep 30, 2019
See #1579, #1598
See Simperium/node-simperium#78
See Simperium/node-simperium#61

When we receive an update from a remote client we have been listening
for it and adjusting our local app state to account for that change.
Unfortunately in cases where we also have local or pending changes we
have been erroneously transforming or _rebasing_ the change before
applying it. The underlying `node-simperium` library has already
performd that transform, however, and when the client application
does that too it leads to double-writes and misapplied patches,
producing note corruption.

In this change we're stopping the rebase operation we have been
performing and that will remove this particular bug from the
application. This change does not solve all of the problem however
because we also have to make sure that the `node-simperium` library is
aware of our local state when it receives those updates.

Further work is taking place in Simperium/node-simperium#61 and #1598 to
close the loop on that but these changes are important enough on their
own to warrant a change.

This will close one bug while opening another but this is a dependent
part of the process in closing the broader issue.
@dmsnell dmsnell added this to the 1.10 milestone Oct 1, 2019
@dmsnell dmsnell force-pushed the fix/crossed-wires branch 2 times, most recently from 19f1dae to 79ff256 Compare October 1, 2019 18:09
interact; it's dependent on the changes made in simperium@1.0.0.

Previously we held content in the editor component and would delay
sending those changes to the underlying Simperium client library
(node-simperium) until the timeout flushed. This made for awkward
interactions when we would received remote updates from the server that
happened before that timeout flushed; this led to note corruption.

After this PR we are announcing immediately to the underlying client
library when we make changes and telling it (via { sync: false } in
noteBucket.update()) that it's okay to wait to synchronize those changes
with the server. We still maintain our debounce, but instead of sending
the editor component's text on the timeout we more simply tell the
underlying library to flush its changes (via noteBucket.touch()). The
effect is that we've removed a source of concurrency bugs with those
remote updates.

After this PR we should be removing a number of related data loss and
note corruption issues all relating to the interplay of queued and
received changes.
@dmsnell dmsnell requested a review from a team October 1, 2019 18:44
Copy link
Contributor

@belcherj belcherj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it

@dmsnell dmsnell merged commit 2c1029b into develop Oct 1, 2019
@dmsnell dmsnell deleted the fix/crossed-wires branch October 1, 2019 19:50
@loremattei loremattei modified the milestones: 1.10, 1.9 Oct 7, 2019
dmsnell added a commit that referenced this pull request Oct 8, 2019
See #1614

As part of a broader effort to resolve data-flow issues in the app this PR is a
first step in removing direct mutation where transactional atomic updates
should be occurring.

It's not clear if the existing code is the source of existing defects in the software
and this is part of why the code is problematic; we have created inherent
concurrency flaws that open up extremely-difficult-to-reproduce bugs.

Resolving this may or may not resolve any existing bugs but it will definitely
help guard us from introducing new ones.

---

Previously we have been directly mutating the note object when updating
its content. This may have been an attempt to work around confusing
data-flow issues that thankfully don't exist anymore. We have also been
performing inline checks to make sure that we update the editor's
contents if we receive these updates.

This mutation can lead to concurrency defects which expose themselves as
inconsistent UI state. This breaks our Redux model which assumes that
all UI updates happen atomically.

In this patch we're building a new note object when we update a note
in order to maintain our consistency. In light of #1598 we're also
removing some work-around code that attempted to force consistency when
it didn't exist; that consistency now exists since we're tracking the
underlying Simperium data closely now vs. storing it in separate
places.

When updating checklist items we're forcing a sync so that those changes
will propagate immediately. We don't have a need to debounce those
clicks.
dmsnell added a commit that referenced this pull request Oct 13, 2019
…1634)

See #1614

As part of a broader effort to resolve data-flow issues in the app this PR is a
first step in removing direct mutation where transactional atomic updates
should be occurring.

It's not clear if the existing code is the source of existing defects in the software
and this is part of why the code is problematic; we have created inherent
concurrency flaws that open up extremely-difficult-to-reproduce bugs.

Resolving this may or may not resolve any existing bugs but it will definitely
help guard us from introducing new ones.

---

Previously we have been directly mutating the note object when updating
its content. This may have been an attempt to work around confusing
data-flow issues that thankfully don't exist anymore. We have also been
performing inline checks to make sure that we update the editor's
contents if we receive these updates.

This mutation can lead to concurrency defects which expose themselves as
inconsistent UI state. This breaks our Redux model which assumes that
all UI updates happen atomically.

In this patch we're building a new note object when we update a note
in order to maintain our consistency. In light of #1598 we're also
removing some work-around code that attempted to force consistency when
it didn't exist; that consistency now exists since we're tracking the
underlying Simperium data closely now vs. storing it in separate
places.

When updating checklist items we're forcing a sync so that those changes
will propagate immediately. We don't have a need to debounce those
clicks.
belcherj pushed a commit that referenced this pull request Nov 4, 2019
* Update dependency concurrently to v5 (#1631)

* Update dependency electron-rebuild to v1.8.6 (#1533)

* Update dependency highlight.js to v9.15.10 (#1545)

* Update dependency autoprefixer to v9.6.4 (#1628)

* Update dependency react-overlays to v2 (#1620)

* Update dependency react-dropzone to v10.1.10 (#1630)

* Update dependency eslint-config-prettier to v6.4.0 (#1629)

* Update dependency eslint-plugin-react to v7.16.0 (#1623)

* Update react monorepo to v16.10.2 (#1621)

* Update dependency electron to v4.2.11 (#1483)

* Refactor tag operations to stop directly mutating tag objects (#1638)

See #1614

As part of a broader effort to resolve data-flow issues in the app this PR is a
first step in removing direct mutation where transactional atomic updates
should be occurring.

It's not clear if the existing code is the source of existing defects in the software
and this is part of why the code is problematic; we have created inherent
concurrency flaws that open up extremely-difficult-to-reproduce bugs.

Resolving this may or may not resolve any existing bugs but it will definitely
help guard us from introducing new ones.

---

Previously we have been directly mutating note and tag objects when
editing those tags. This mutation can lead to concurrency defects which
expose themselves as inconsistent UI state. This breaks our Redux model
which assumes that all UI updates happen atomically.

In this patch we're building new note objects and tag objects when we
make these updates in order to maintain our consistency.

---

There should be no significant visual or behavioral changes with this PR. We
are changing code related to removing tags, renaming tags, and
reordering tags.

In testing verify that with separate sessions the updates appear as expected.
Add, reorder, and remove tags to make sure the changes synchronize.

* Refactor updating note content to stop directly mutating note object (#1634)

See #1614

As part of a broader effort to resolve data-flow issues in the app this PR is a
first step in removing direct mutation where transactional atomic updates
should be occurring.

It's not clear if the existing code is the source of existing defects in the software
and this is part of why the code is problematic; we have created inherent
concurrency flaws that open up extremely-difficult-to-reproduce bugs.

Resolving this may or may not resolve any existing bugs but it will definitely
help guard us from introducing new ones.

---

Previously we have been directly mutating the note object when updating
its content. This may have been an attempt to work around confusing
data-flow issues that thankfully don't exist anymore. We have also been
performing inline checks to make sure that we update the editor's
contents if we receive these updates.

This mutation can lead to concurrency defects which expose themselves as
inconsistent UI state. This breaks our Redux model which assumes that
all UI updates happen atomically.

In this patch we're building a new note object when we update a note
in order to maintain our consistency. In light of #1598 we're also
removing some work-around code that attempted to force consistency when
it didn't exist; that consistency now exists since we're tracking the
underlying Simperium data closely now vs. storing it in separate
places.

When updating checklist items we're forcing a sync so that those changes
will propagate immediately. We don't have a need to debounce those
clicks.

* Refactor note tag operation to stop directly mutating note object (#1639)

See #1614

As part of a broader effort to resolve data-flow issues in the app this PR is a
first step in removing direct mutation where transactional atomic updates
should be occurring.

It's not clear if the existing code is the source of existing defects in the software
and this is part of why the code is problematic; we have created inherent
concurrency flaws that open up extremely-difficult-to-reproduce bugs.

Resolving this may or may not resolve any existing bugs but it will definitely
help guard us from introducing new ones.

---

Previously we have been directly mutating note objects when editing
their tags. This mutation can lead to concurrency defects which expose
themselves as inconsistent UI state. This breaks our Redux model which
assumes that all UI updates happen atomically.

In this patch we're building new note objects when we make these updates
in order to maintain our consistency.

---

There should be no significant visual or behavioral changes with this PR. We
are changing code related to removing tags, renaming tags, and
reordering tags.

In testing verify that with separate sessions the updates appear as expected.
Add, reorder, and remove tags to make sure the changes synchronize.

* Fix: Broken oAuth flow (#1627)

We have been experiencing problems when trying to login with the
WordPress.com signin. Something appears to have changed in Electron such
that the older versions of the app still work but newer versions are
failing.

In this patch we're rewriting the authentication flow to simplify it and
prepare ourselves for better handling of the failure cases.

In production we are seeing strange behaviors on failure and some on
success: unending re-requests to `simplenote://auth` which trigger full
CPU load; and no response after authentication.

After this patch we should be able to wrangle in errors and add a
timeout to better communicate when things are failing.

Additionally, the unending loop should be closed due to a replacement of
the old network intercept code with a single simplified model.

We have also been sharing sessions between the main window and the auth
window and also sharing sessions between teach time the auth window
appears.

This leads to leaked cookies and can result in confusing flows, largely
because of the shared cookies.

In this patch we're creating a new `Session` for the auth window every
time we open it. By not including `persist:` in the "partition" name
we're making sure it only exists in memory. By introducing randomness
into its name we're making sure we don't share the same session from
one auth attempt to the next. By freeing the window after close we're
making sure we don't leak memory.

Previously we were able to open the auth window after closing it and
instead of logging in again it would open to the "Accept/Deny" view.
After this change it requires logging in on every attempt. This will
likely be more frustrating but much safer than the previous behavior.

* Make system setting the default theme preference (#1581)

* Make system setting the default theme preference

* Update Release Notes

* Make system the first item in the list

* Update system theme logic

* Add option to hide menu bar (#1215)

Closes #293
Based on #1216

This adds an option to auto-hide the menu bar on Windows/Linux Electron. The option will be in Settings ▸ Display.

screen shot 2019-02-21 at 23 26 17

We have to be careful not to get the user stuck in a situation where they can't bring back the menu bar, so we'll show the Settings button and footer links in the Navigation Bar when auto-hide is enabled.

* Support the unicode bullet as a list item bullet in unordered lists (#1551)

Support the unicode bullet character • as a list item bullet because lists copied from HTML or word documents may contain this character.

* Add release note adding bullet char to lists (#1646)

* Note list: Distinguish loading from empty states (#1650)

Alternative idea to #1649

There are a few times when we boot the app and our list of notes is
empty because we haven't received udpates from the server yet. This
happens on intial app boot, immediately after authorizing, and when
we lose our local copy of the notes from `IndexedDB`.

During these times we're showing that there are no notes in the account
which is misleading. In this patch we're starting with a `null` value
for the notes so that we can distinguish between "there are no notes
in the account" and "we haven yet to determine which notes are in the
account."

In comparison to #1649 we're using a tri-state value for `notes` instead
of introducing an additional flag which must be kept in sync with
`notes`.

* Deps: Update simperium to fix infinite duplication bug

Update `simperium` library to incorporate changes that fixed a defect
where we were holding open old WebSocket connections after signing-out
and signing-in. This defect produced an infinite duplication of changes.

* Update the link to the Release Notes in the updater config (#1675)

Update the link to the Release Notes in the updater config

In (#1582) CHANGELOG.md was renamed to RELEASE-NOTES.txt. This caused the link to the release notes to break.

* In Dev mode, open CHrome Deve Tools in detached mode (#1660)

* Update version in package.json for 1.10.0 release

Added the beta version for 1.10.0-beta1 release

* Fix/only run notes loaded when notes loaded (#1680)

* Only run notesLoaded when notes are indeed loaded

After querying the noteBucket we run notes loaded with an empty notes array. This causes havoc because we rely on notes being null until notes are loaded. This commit adds a check to ensure there is at least one note before running notesLoaded

* Add release notes

* Update RELEASE-NOTES.txt

* Update signing certificate (#1682)

Props to @loremattei for creating these changes

* Bump version to v1.10.0-beta2 (#1684)

* Add version for release 1.10.0
dmsnell added a commit that referenced this pull request Nov 16, 2019
Resolves #1690
Obviates session lock in: #1700, #1704, #1707, #1710, #1720
Resolves bug that uncovered signout/signin issue: #1664
Follows Simperium API change in: #1598, #1599, Simperium/node-simperium#61, Simperium/node-simperium#78

When we fixed some deep and underlying issues in `node-simperium` we
started a chain of operations which had to adjust to that change.
Remember that the core problem was an assumption that after we send out
a change that we could wait until it came back. That assumption was
wrong because changes from other remote clients could come in during
that time period while we're waiting. Simperium/node-simperium#61 fixed
that problem but because it was broken in the past we added some
work-arounds in #706; these work-arounds complicated the flow of data in
the corrected version so we had to remove them and change how we handle
remote updates in the Simplnote side.

A critical step in this flow is letting Simperium know what local
changes we may have in the app buffer. We introduced this step in #1598
but never noticed a typo:

```js
getState().selectedNoteId
```

That reference will always be wrong and so we'll never send the local
contents of the note. It should have been...

```js
getState().appState.selectedNoteId
```

The implication of this change is that we always told Simperium that
there are no local changes and so it would grab the currenty copy in the
bucket, which in our case is in shared `indexedDB` tables, and thus it
would occasionally get updated copies of the bucket that were updated in
another session. This would then appear as though we also made the same
change we were receiving, and thus we would prepare our own patch to
send out and start the cycle.

Further we were only sending local updates in the case that we had the
same note open as was updated in the remote change. However, since we
don't update Simperium with notes that we're not editing we were leaving
other notes behind by sending `null` for them. The outcome of this is
that edits to notes that aren't selected would have a modification in
the way they are applied, if otherwise they would have started the
infinite looping.

After this change we're always sending the current local copy. This is
still somewhat of a work-around but it's one that should reliably cover
over the larger problems in the system. Eventually we have to clear out
the data flow, remove local copies of our data, separate independent
clients or sessions so they don't share buckets, and simplify the
control flow to make things easier to debug.
belcherj pushed a commit that referenced this pull request Nov 18, 2019
Resolves #1690
Obviates session lock in: #1700, #1704, #1707, #1710, #1720
Resolves bug that uncovered signout/signin issue: #1664
Follows Simperium API change in: #1598, #1599, Simperium/node-simperium#61, Simperium/node-simperium#78

When we fixed some deep and underlying issues in `node-simperium` we
started a chain of operations which had to adjust to that change.
Remember that the core problem was an assumption that after we send out
a change that we could wait until it came back. That assumption was
wrong because changes from other remote clients could come in during
that time period while we're waiting. Simperium/node-simperium#61 fixed
that problem but because it was broken in the past we added some
work-arounds in #706; these work-arounds complicated the flow of data in
the corrected version so we had to remove them and change how we handle
remote updates in the Simplnote side.

A critical step in this flow is letting Simperium know what local
changes we may have in the app buffer. We introduced this step in #1598
but never noticed a typo:

```js
getState().selectedNoteId
```

That reference will always be wrong and so we'll never send the local
contents of the note. It should have been...

```js
getState().appState.selectedNoteId
```

The implication of this change is that we always told Simperium that
there are no local changes and so it would grab the currenty copy in the
bucket, which in our case is in shared `indexedDB` tables, and thus it
would occasionally get updated copies of the bucket that were updated in
another session. This would then appear as though we also made the same
change we were receiving, and thus we would prepare our own patch to
send out and start the cycle.

Further we were only sending local updates in the case that we had the
same note open as was updated in the remote change. However, since we
don't update Simperium with notes that we're not editing we were leaving
other notes behind by sending `null` for them. The outcome of this is
that edits to notes that aren't selected would have a modification in
the way they are applied, if otherwise they would have started the
infinite looping.

After this change we're always sending the current local copy. This is
still somewhat of a work-around but it's one that should reliably cover
over the larger problems in the system. Eventually we have to clear out
the data flow, remove local copies of our data, separate independent
clients or sessions so they don't share buckets, and simplify the
control flow to make things easier to debug.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync: diff_match_patch crashes when rebasing
4 participants