Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync Manager: Support for sync interruption #1684

Closed
thomcc opened this issue Aug 29, 2019 · 6 comments
Closed

Sync Manager: Support for sync interruption #1684

thomcc opened this issue Aug 29, 2019 · 6 comments
Assignees

Comments

@thomcc
Copy link
Contributor

thomcc commented Aug 29, 2019

In #1447, I punted on interruption, mainly as it's hard to make logins/places use the correct interrupt handle (Instead, I gave them a dummy interrupt handle). This is probably a problem long term, as sync (probably) needs to be interruptable.

┆Issue is synchronized with this Jira Task
┆Epic: Sync Ecosystem (backlog)
┆Sprint End Date: 2022-02-03

@bendk bendk reopened this Nov 8, 2021
@bendk
Copy link
Contributor

bendk commented Nov 8, 2021

I'm going to try to fix this one alongside #4617

@bendk bendk self-assigned this Nov 8, 2021
@bendk
Copy link
Contributor

bendk commented Nov 8, 2021

Talked with mhammond about this one some more and I think fixing it is a bigger project. I think we can probably do it along side the sync-manager/places work we're doing, but not with #4617.

@bendk bendk removed their assignment Nov 8, 2021
bendk added a commit to bendk/application-services that referenced this issue Nov 8, 2021
- Changed the bookmarks/history sync engines to use an
  Arc<Mutex<PlacesDb>>, which matches the other engines better
- Made these sync engines create their own `SqlInterruptScope`.  The
  interrupt code is still not working, but I think this is the direction
  we want to go for mozilla#1684.  If not, it should be easy to replace.
- Removed the old `PlacesApi.open_sync_connection()` method and replaced it
  with the new `get_sync_connection()` method
- Lots of unit test updates.  I didn't take the time to understand these
  fully.  I just made changes until it compiled, didn't deadlock, and passed.
bendk added a commit to bendk/application-services that referenced this issue Nov 8, 2021
- Changed the bookmarks/history sync engines to use an
  Arc<Mutex<PlacesDb>>, which matches the other engines better
- Made these sync engines create their own `SqlInterruptScope`.  The
  interrupt code is still not working, but I think this is the direction
  we want to go for mozilla#1684.  If not, it should be easy to replace.
- Updated the `PlacesApi` syncing code to use the new sync engines.
  One thing to note here is that the SyncManager has its own system of
  managing the global state, so the fact that those methods update it
  outside of the Mutex lock that the `SyncEngine` takes is not a problem.
- Updated the importer code to use the new `get_sync_connection()`
  method.  The only difference here is that the importer code needs to
  lock the mutex, this replaces the old system where
  `open_sync_connection()` would return an error if a previously
  returned connection was still alive.
- Removed the old `PlacesApi.open_sync_connection()` method and replaced it
  with the new `get_sync_connection()` method
- Lots of unit test updates.  I didn't take the time to understand these
  fully.  I just made changes until it compiled, didn't deadlock, and passed.
bendk added a commit to bendk/application-services that referenced this issue Nov 11, 2021
- Changed the bookmarks/history sync engines to use an
  Arc<Mutex<PlacesDb>>, which matches the other engines better
- Made these sync engines create their own `SqlInterruptScope`.  The
  interrupt code is still not working, but I think this is the direction
  we want to go for mozilla#1684.  If not, it should be easy to replace.
- Updated the `PlacesApi` syncing code to use the new sync engines.
  One thing to note here is that the SyncManager has its own system of
  managing the global state, so the fact that those methods update it
  outside of the Mutex lock that the `SyncEngine` takes is not a problem.
- Updated the importer code to use the new `get_sync_connection()`
  method.  The only difference here is that the importer code needs to
  lock the mutex, this replaces the old system where
  `open_sync_connection()` would return an error if a previously
  returned connection was still alive.
- Removed the old `PlacesApi.open_sync_connection()` method and replaced it
  with the new `get_sync_connection()` method
- Lots of unit test updates.  I didn't take the time to understand these
  fully.  I just made changes until it compiled, didn't deadlock, and passed.
bendk added a commit to bendk/application-services that referenced this issue Nov 15, 2021
- Changed the bookmarks/history sync engines to use an
  Arc<Mutex<PlacesDb>>, which matches the other engines better
- Made these sync engines create their own `SqlInterruptScope`.  The
  interrupt code is still not working, but I think this is the direction
  we want to go for mozilla#1684.  If not, it should be easy to replace.
- Updated the `PlacesApi` syncing code to use the new sync engines.
  One thing to note here is that the SyncManager has its own system of
  managing the global state, so the fact that those methods update it
  outside of the Mutex lock that the `SyncEngine` takes is not a problem.
- Updated the importer code to use the new `get_sync_connection()`
  method.  The only difference here is that the importer code needs to
  lock the mutex, this replaces the old system where
  `open_sync_connection()` would return an error if a previously
  returned connection was still alive.
- Removed the old `PlacesApi.open_sync_connection()` method and replaced it
  with the new `get_sync_connection()` method
- Lots of unit test updates.  I didn't take the time to understand these
  fully.  I just made changes until it compiled, didn't deadlock, and passed.
bendk added a commit to bendk/application-services that referenced this issue Nov 17, 2021
- Changed the bookmarks/history sync engines to use an
  Arc<Mutex<PlacesDb>>, which matches the other engines better
- Made these sync engines create their own `SqlInterruptScope`.  The
  interrupt code is still not working, but I think this is the direction
  we want to go for mozilla#1684.  If not, it should be easy to replace.
- Updated the `PlacesApi` syncing code to use the new sync engines.
  One thing to note here is that the SyncManager has its own system of
  managing the global state, so the fact that those methods update it
  outside of the Mutex lock that the `SyncEngine` takes is not a problem.
- Updated the importer code to use the new `get_sync_connection()`
  method.  The only difference here is that the importer code needs to
  lock the mutex, this replaces the old system where
  `open_sync_connection()` would return an error if a previously
  returned connection was still alive.
- Removed the old `PlacesApi.open_sync_connection()` method and replaced it
  with the new `get_sync_connection()` method
- Lots of unit test updates.  I didn't take the time to understand these
  fully.  I just made changes until it compiled, didn't deadlock, and passed.
lougeniaC64 pushed a commit that referenced this issue Nov 24, 2021
- Changed the bookmarks/history sync engines to use an
  Arc<Mutex<PlacesDb>>, which matches the other engines better
- Made these sync engines create their own `SqlInterruptScope`.  The
  interrupt code is still not working, but I think this is the direction
  we want to go for #1684.  If not, it should be easy to replace.
- Updated the `PlacesApi` syncing code to use the new sync engines.
  One thing to note here is that the SyncManager has its own system of
  managing the global state, so the fact that those methods update it
  outside of the Mutex lock that the `SyncEngine` takes is not a problem.
- Updated the importer code to use the new `get_sync_connection()`
  method.  The only difference here is that the importer code needs to
  lock the mutex, this replaces the old system where
  `open_sync_connection()` would return an error if a previously
  returned connection was still alive.
- Removed the old `PlacesApi.open_sync_connection()` method and replaced it
  with the new `get_sync_connection()` method
- Lots of unit test updates.  I didn't take the time to understand these
  fully.  I just made changes until it compiled, didn't deadlock, and passed.
@bendk
Copy link
Contributor

bendk commented Dec 28, 2021

This one has been tricky for 2 reasons:

  • Most of the components aren't set up for interrupting sqlite. They use a shared connection for syncing and all other operation. This means if we just try to interrupt that connection, it's possible we'll interrupt the wrong operation.
  • It's not totally clear if this is needed. On Android it seems like we probably don't need that since their interruption model is based on simply killing a processes. On iOS it's unclear if this is useful or not.

I think there is a way forward on this one though:

  • We mostly focusing on just getting the sync engines to use the correct SqlInterruptScope. This would allow for the Rust code to be interrupted.
  • We allow sync engines to optionally pass an rusqlite.InterruptHandle to the SyncManager. Then when the sync manager's interrupt method is call, it would interrupt those handles. In the short term, only places would do this. Places tends to have the longest running queries, so at least we have that covered. In the long term, if we decide we need this for other components, we could update their code to have a dedicated sync connection and then hook them up as well.

I think this makes this issue much more tractable since we can ignore the harder issue of trying to interrupt a sql query for components with a shared DB connection

@mhammond
Copy link
Member

  • We allow sync engines to optionally pass an rusqlite.InterruptHandle to the SyncManager.

Can we just add an interrupt or similar methods on the sync engine trait?

But in general, this SGTM.

@bendk
Copy link
Contributor

bendk commented Jan 5, 2022

I didn't add SyncEngine.interrupt() just because places has 2 sync engines that share a connection, so we only need to interrupt 1 of them. Instead I just made top-level function for it, like get_registered_sync_engine().

At some point, we could define a SyncEngineManager trait which has get_registered_sync_engine() and interrupt_current_sync_query() as its methods.

@bendk
Copy link
Contributor

bendk commented Feb 2, 2022

Ignore those last few comments, we decided to go a different way with this one. #4816 adds some support for this by adding a shutdown module in sql-support that allows the consumer code to enter shutdown mode which permanently interrupts all SqlInterruptScopes. We decided that shutdown mode matches the one known use case for this, which is shutting down on desktop. Android doesn't really need syncing interruption support because Android just kiils the process. It's unclear what the best system for iOS would be, but I think shutdown mode could be good. If we ever need to we can also add code to disable shutdown mode, which might be good for suspend/resume support.

There's still a few items left to do though:

  • Add shutdown to support for components other than places
  • Hook up the SqlInterruptScope created for the our SyncEngines to the shutdown code
  • The interrupt.rs and shutdown.rs modules feel like they belong in sql-support. Let's move them to interrupt support.
  • Expose the shutdown method via UniFFI. I think this means adding a UDL file for interrupt-support. But we could also add it to an existing UDL namespace (maybe sync-manager?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants