Promisification #518

Closed
dstillman opened this Issue Aug 7, 2014 · 13 comments

Comments

Projects
None yet
4 participants
@dstillman
Member

dstillman commented Aug 7, 2014

As a prelude to API syncing, in a private branch I've been working on switching the data layer to use asynchronous database access, which is necessary to avoid freezing the UI. (Zotero predates asynchronous database access in Mozilla, but synchronous access on the main thread is now strongly discouraged because of its effect on UI responsiveness.) Unfortunately, due to conflicts between synchronous and asynchronous access, it looks like we'll need to switch over all DB queries in Zotero to be async. More unfortunately, async DB access means returning promises from many core data layer methods, which means rewriting essentially the entire codebase to use promises. This seems like a terrible idea that we'll regret later, but we also haven't come up with a better solution.

This will have fairly major effects on the Zotero client ecosystem, so I'm creating this ticket to share some of the issues involved. I'll be pushing my private async_db branch up to GitHub so that people can start playing with it. (It should go without saying that you shouldn't use this branch with a real database.)

  1. I've switched us from the Q promise library (which we were using in various places) to Bluebird, which based on published tests is much faster than Q and uses much less memory. With ES6 generators, Bluebird is supposedly nearly as fast as callbacks alone.

  2. I'm using ES6 generators and yield almost exclusively, since for data layer code they're the most natural way of using promises, resulting in code very similar to the previous synchronous code. Using ES6 generators means we'll have a Fx26 dependency, which is too bad for Fx24 ESR users, but Fx31 ESR is already available.

  3. I'm rewriting many existing methods instead of renaming everything as _Async. We can't keep synchronous versions around anyway, and having everything named *Async seems silly. The main benefit would be that we'd get more obvious breakage for calls that hadn't been updated, but since we're only getting run-time errors either way, I don't think it's worth the code ugliness. (There are still separate non-async and async DB methods, but I'll be making those use non-_Async names once all the synchronous methods have been replaced.)

  4. All Zotero plugins are going to break. I don't think there's really any pleasant way to deal with this. (As far as I know Mozilla's add-on dependency checking can't help here.) Once we have a firm version number for this release, I think we'll just have to tell plugin authors that they should add in error messages telling people running the old version of the plugin against the new Zotero version to upgrade to a newer version of the plugin and vice versa.

  5. I've experimented with a few different approaches to adding promises, with varying implications for existing code:

To do a more-or-less straight conversion of the existing JS API, we'd have to make nearly everything return a promise. This is what I started off doing, but it would result in a much wider spread of asynchronicity to other parts of the code, with generators and yields and promises everywhere. For example, anything that needed to know if an item was a note would have to become asynchronous, because if (!this.isNote()) { would have to become if (!(yield this.isNote())) {, because isNote() would have to do (yield this.itemTypeID) instead of just this.itemTypeID, because getting the itemTypeID might require loading primary data from the 'items' table, which would have to be an asynchronous database query.

In addition to creating much more overhead and requiring many more changes, this spread of asynchronicity would also mean that there wouldn't be a good way to predict whether a given method was async due to some internal dependency on some other async method, which I think would make it very hard to make changes down the line, since some small new bit of functionality might require making an async call from a previously synchronous method.

We can cut down on the number of promises by adjusting where database access happens. So for the above example, we can just say that any Zotero.Item with a libraryID and id or key will have its primary data loaded (which happens to usually be the case already). That means that Zotero.Item methods will never need to load primary data themselves, which means that anything that only needs primary data (such as itemTypeID) can remain synchronous.

Unfortunately that still leaves a lot of things that are currently lazily loaded. For example, item field data is currently not loaded until there's a getField() call for a non-primary data field. This allows you to do something like:

var item = Zotero.Items.get(123); // primary data loaded if not already
var title = item.getField('title'); // item data loaded if not already
var date = item.getField('date'); // returns without loading data

With promises and generators, that would turn into this:

var item = yield Zotero.Items.get(123);
var title = yield item.getField('title');
var date = yield item.getField('date');

However, that would mean that anything that needed any non-primary item field would have to be asynchronous as well, which would have pretty far-reaching effects.

Unfortunately, the lazy-loading is pretty important for performance. Right now, when first opening Zotero, we load all data necessary to display and sort the middle pane, and for large libraries that can already result in a notable delay. If we loaded all of an item's field data whenever generating a Zotero.Item object, starting Zotero would require loading all field data in the database.

My solution has been to trade some elegance for backwards-compatibility and efficiency, with new public Zotero.Item.loadItemData/loadCreators/loadRelatedItems/etc. methods (which were previously private and lazily called) that have to be called asynchronously before the get* (and similar) methods can be used. An attempt to use a get* method when the associated data isn't loaded throws an error, but the get* methods remain synchronous. Some part of the calling code still has to be promise-based in order to call the load* methods, but everything after that can be synchronous:

var item = yield Zotero.Items.get(123);
try {
    var title = item.getField('title'); // item data not yet loaded; throws an error
}
catch (e) {}
yield item.loadItemData(); // this is a no-op if data is already loaded
var title = item.getField('title');
var date = item.getField('date');

This results in many fewer promises/yields and allows more existing code to remain unchanged. The synchronous methods also only fail if their cached values don't exist, which allows for more granular loading in some situations — for example, right now we load entire columns of field data for the middle pane and stuff them into the objects without triggering full loads.

Along with .load*, various other classes of operations will be predictably async: .save(), .erase(), attachment file operations. Some one-off methods called from only one or two places in the UI can also be made async without causing much trouble.

  1. Debugging can be trickier with promises. For now, Bluebird's long stack traces are enabled, which sometimes produce useful stack traces but also hurt performance and will likely need to be disabled for release. We can perhaps turn them on conditionally when debug logging is enabled.

  2. Along with various known issues mentioned in the megacommit, there will certainly be lots of other broken things. Feel free to create new GitHub issues for any bugs you find, being sure to specify that they're about the async DB branch.

Finally, be aware that I will likely rebase this branch before merging it into master (in order to keep a sane commit history, avoid breaking git bisects, and keep upgrade steps as efficient as possible), so if you base any work on it, be prepared to handle the rewind. Apologies in advance.

@dstillman dstillman added this to the 4.1 milestone Aug 7, 2014

dstillman added a commit that referenced this issue Aug 7, 2014

Async DB megacommit
Promise-based rewrite of most of the codebase, with asynchronous database and file access -- see #518 for details.

WARNING: This includes backwards-incompatible schema changes.

An incomplete list of other changes:

- Schema overhaul
  - Replace main tables with new versions with updated schema
  - Enable real foreign key support and remove previous triggers
  - Don't use NULLs for local libraryID, which broke the UNIQUE index
    preventing object key duplication. All code (Zotero and third-party)
    using NULL for the local library will need to be updated to use 0
    instead (already done for Zotero code)
  - Add 'compatibility' DB version that can be incremented manually to break DB
    compatibility with previous versions. 'userdata' upgrades will no longer
    automatically break compatibility.
  - Demote creators and tags from first-class objects to item properties
- New API syncing properties
  - 'synced'/'version' properties to data objects
  - 'etag' to groups
  - 'version' to libraries
- Create Zotero.DataObject that other objects inherit from
- Consolidate data object loading into Zotero.DataObjects
- Change object reloading so that only the loaded and changed parts of objects are reloaded, instead of reloading all data from the database (with some exceptions, including item primary data)
- Items and collections now have .parentItem and .parentKey properties, replacing item.getSource() and item.getSourceKey()
- New function Zotero.serial(fn), to wrap an async function such that all calls are run serially
- New function Zotero.Utilities.Internal.forEachChunkAsync(arr, chunkSize, func)
- Add tag selector loading message
- Various API and name changes, since everything was breaking anyway

Known broken things:

- Syncing (will be completely rewritten for API syncing)
- Translation architecture (needs promise-based rewrite)
- Duplicates view
- DB integrity check (from schema changes)
- Dragging (may be difficult to fix)

Lots of other big and little things are certainly broken, particularly with the UI, which can be affected by async code in all sorts of subtle ways.

@dstillman dstillman modified the milestone: 5.0 Aug 7, 2014

@fbennett

This comment has been minimized.

Show comment
Hide comment
@fbennett

fbennett Aug 7, 2014

Contributor

Ah, well, it was high time to begin separating out the MLZ changes anyway.

Contributor

fbennett commented Aug 7, 2014

Ah, well, it was high time to begin separating out the MLZ changes anyway.

@jlegewie

This comment has been minimized.

Show comment
Hide comment
@jlegewie

jlegewie Aug 8, 2014

Contributor

Just subscribed. This obviously comes with some pain both for the core and plugin developers. But I imagine that it can be a major improvement for the user experience in terms of responsiveness...

Contributor

jlegewie commented Aug 8, 2014

Just subscribed. This obviously comes with some pain both for the core and plugin developers. But I imagine that it can be a major improvement for the user experience in terms of responsiveness...

@dstillman

This comment has been minimized.

Show comment
Hide comment
@dstillman

dstillman Jan 29, 2015

Member

The bulk of promisification work is now complete (with a few big issues remaining, plus a few minor things that aren't on that list), so I'm renaming the async_db branch to api_syncing, since that's the central work on it going forward. (Really this is just the new master, but I don't want to merge to master until we're sure the DB upgrades steps are set, since a handful of people ill-advisedly try to run that.) I'll post more on zotero-dev when we're closer to a release.

Member

dstillman commented Jan 29, 2015

The bulk of promisification work is now complete (with a few big issues remaining, plus a few minor things that aren't on that list), so I'm renaming the async_db branch to api_syncing, since that's the central work on it going forward. (Really this is just the new master, but I don't want to merge to master until we're sure the DB upgrades steps are set, since a handful of people ill-advisedly try to run that.) I'll post more on zotero-dev when we're closer to a release.

@retorquere

This comment has been minimized.

Show comment
Hide comment
@retorquere

retorquere May 12, 2015

Contributor

Should I also be looking at the async-translators branch?

Contributor

retorquere commented May 12, 2015

Should I also be looking at the async-translators branch?

@dstillman

This comment has been minimized.

Show comment
Hide comment
@dstillman

dstillman May 12, 2015

Member

No, that's been merged. (I just deleted the one on here.) Translation is still broken, but hopefully @simonster will get to that soon.

Member

dstillman commented May 12, 2015

No, that's been merged. (I just deleted the one on here.) Translation is still broken, but hopefully @simonster will get to that soon.

@dstillman

This comment has been minimized.

Show comment
Hide comment
@dstillman

dstillman May 22, 2015

Member

The api_syncing branch is now the master branch, so feel free to follow that for updates as we get closer to 5.0. (Syncing won't work on that branch yet, and it definitely shouldn't be used with a production data directory.)

Member

dstillman commented May 22, 2015

The api_syncing branch is now the master branch, so feel free to follow that for updates as we get closer to 5.0. (Syncing won't work on that branch yet, and it definitely shouldn't be used with a production data directory.)

@retorquere

This comment has been minimized.

Show comment
Hide comment
@retorquere

retorquere May 22, 2015

Contributor

OK, so this is happening. Any news (hopefully positive) on the plugin API changes I requested? And is there a place where I can follow specifically when the translators in this branch work again?

Contributor

retorquere commented May 22, 2015

OK, so this is happening. Any news (hopefully positive) on the plugin API changes I requested? And is there a place where I can follow specifically when the translators in this branch work again?

@jlegewie

This comment has been minimized.

Show comment
Hide comment
@jlegewie

jlegewie May 22, 2015

Contributor

Thanks for the update. What is the timeline for the 5.0 update?

Contributor

jlegewie commented May 22, 2015

Thanks for the update. What is the timeline for the 5.0 update?

@retorquere

This comment has been minimized.

Show comment
Hide comment
@retorquere

retorquere Jun 9, 2015

Contributor

Have custom fields made it into 5.0?

Contributor

retorquere commented Jun 9, 2015

Have custom fields made it into 5.0?

@dstillman

This comment has been minimized.

Show comment
Hide comment
@dstillman

dstillman Jun 9, 2015

Member

No — and they can't, because 4.0 will need to continue to sync for a while and would break with unknown fields. But the idea is for 5.0 to lay the groundwork for it to happen later, after a 4.0 sync cut-off.

Member

dstillman commented Jun 9, 2015

No — and they can't, because 4.0 will need to continue to sync for a while and would break with unknown fields. But the idea is for 5.0 to lay the groundwork for it to happen later, after a 4.0 sync cut-off.

dstillman added a commit that referenced this issue Mar 7, 2016

Deasyncification 🔙 😢
While trying to get translation and citing working with asynchronously
generated data, we realized that drag-and-drop support was going to
be...problematic. Firefox only supports synchronous methods for
providing drag data (unlike, it seems, the DataTransferItem interface
supported by Chrome), which means that we'd need to preload all relevant
data on item selection (bounded by export.quickCopy.dragLimit) and keep
the translate/cite methods synchronous (or maintain two separate
versions).

What we're trying instead is doing what I said in #518 we weren't going
to do: loading most object data on startup and leaving many more
functions synchronous. Essentially, this takes the various load*()
methods described in #518, moves them to startup, and makes them operate
on entire libraries rather than individual objects.

The obvious downside here (other than undoing much of the work of the
last many months) is that it increases startup time, potentially quite a
lot for larger libraries. On my laptop, with a 3,000-item library, this
adds about 3 seconds to startup time. I haven't yet tested with larger
libraries. But I'm hoping that we can optimize this further to reduce
that delay. Among other things, this is loading data for all libraries,
when it should be able to load data only for the library being viewed.
But this is also fundamentally just doing some SELECT queries and
storing the results, so it really shouldn't need to be that slow (though
performance may be bounded a bit here by XPCOM overhead).

If we can make this fast enough, it means that third-party plugins
should be able to remain much closer to their current designs. (Some
things, including saving, will still need to be made asynchronous.)
@jlegewie

This comment has been minimized.

Show comment
Hide comment
@jlegewie

jlegewie Mar 21, 2016

Contributor

An update on this would be great particularly with Deasyncification in daf4a8f. Any idea on timeline? Is deasyncification final? Thanks!

Contributor

jlegewie commented Mar 21, 2016

An update on this would be great particularly with Deasyncification in daf4a8f. Any idea on timeline? Is deasyncification final? Thanks!

@retorquere

This comment has been minimized.

Show comment
Hide comment
@retorquere

retorquere Mar 22, 2016

Contributor

@dstillman I might have relevant experience for the kind of thing described in daf4a8f ... I've just ran few quick tests, and it looks like I can load/save a 3k (OK, OK, 2885) item database in approx 95/100 ms respectively on average. I can't say that the constraints I'm willing to work under fit your case though.

Contributor

retorquere commented Mar 22, 2016

@dstillman I might have relevant experience for the kind of thing described in daf4a8f ... I've just ran few quick tests, and it looks like I can load/save a 3k (OK, OK, 2885) item database in approx 95/100 ms respectively on average. I can't say that the constraints I'm willing to work under fit your case though.

@dstillman

This comment has been minimized.

Show comment
Hide comment
@dstillman

dstillman Mar 22, 2016

Member

@jlegewie I posted to the dev list with an update. Let's follow up there.

@retorquere Feel free to open a separate issue to discuss.

Member

dstillman commented Mar 22, 2016

@jlegewie I posted to the dev list with an update. Let's follow up there.

@retorquere Feel free to open a separate issue to discuss.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment