Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operation hooks for persistent models (intent-based hooks) #403

Merged
merged 3 commits into from Jan 29, 2015

Conversation

bajtos
Copy link
Member

@bajtos bajtos commented Jan 21, 2015

Connect #367, see my comment there for description of this patch.

/to @raymondfeng @ritch please review
/cc @fabien FYI, feel free to chime in too

The New Hook API

This patch introduces a new API for "intent-based" hooks. These hooks are not tied to a particular method (e.g. "find" or "update"). Instead, they are triggered from all methods that execute a particular "intent".

The consumer API is very simple, there is a new method Model.observe(name, observer), where the observer is function observer(context, callback).

Observers are inherited by child models and it is possible to register multiple observers for the same hook.

A list of hooks implemented by this PR follows below.

query

The query hook is triggered whenever a database is queried for models. Observers
may modify the query, e.g. by adding extra restrictions.

Context properties

  • Model - the constructor of the model that will be queried
  • query - the query containing fields where, include, order, etc.

Examples:

Model.observe('query', function logQuery(ctx, next) {
  console.log('Querying %s for %s', ctx.Model.modelName, ctx.query.where);
  next();
});

Model.observe('query', function limitToTenant(ctx, next) {
  ctx.query.where.tenantId = loopback.getCurrentContext().tenantId;
  next();
});

before save

The hook before save is triggered before a model instance is about to be modified (created, updated). The hook is triggered before the validation.

Depending on which method triggered this hook, the context will have one of the following sets of properties.

Full save of a single model

  • Model - the constructor of the model that will be saved
  • instance - the model instance to be saved. The value is an instance of Model class.

Partial update of possibly multiple models

  • Model - the constructor of the model that will be saved
  • where - the where filter describing which instances will be affected
  • data - the (partial) data to apply during the update

Examples:

Model.observe('before save', function updateTimestamp(ctx, next) {
  if (ctx.instance) {
    ctx.instance.updated = new Date();
  } else {
    ctx.data.updated = new Date();
  }
  next();
});

Model.observe('before save', function computePercentage(ctx, next) {
  if (ctx.instance) {
    ctx.instance.percentage = 100 * ctx.instance.part / ctx.instance.total;
  } else if (ctx.data.part && ctx.data.total) {
    ctx.data.percentage = 100 * ctx.data.part / ctx.data.total;
  } else if (ctx.data.part || ctx.data.total) {
    // either report an error or fetch the missing properties from DB
  }
  next();
});

after save

The hook after save is called after a model change was successfully persisted
to the datasource.

Depending on which method triggered this hook, the context will have one of the following
sets of properties.

Single model

  • Model - the constructor of the model that will be saved
  • instance - the model instance that was saved. The value is an instance of Model class
    and contains updated values computed by datastore (e.g. auto-generated id).

Partial update of possibly multiple models

  • Model - the constructor of the model that will be saved
  • where - the where filter describing which instances were queried
  • data - the (partial) data applied during the update

At the moment, this second set is used exclusively by Model.updateAll.

after delete

The hook after delete is triggered after some models were removed from the datasource.

Context properties

  • Model - the constructor of the model that will be queried
  • where - the where filter describing which instances were queried.

@bajtos bajtos added the #review label Jan 21, 2015
callback(err, obj);
if(!err) Model.emit('changed', obj);
var self = this;
Model.notify('before save', { model: obj }, function(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the convention for the event name? before-save, beforeSave or before save?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before save ATM. I am open to other conventions like before-save, save:before or even saving (before) and saved (after). I'd like to avoid beforeSave as it may be confused with the old Model.beforeSave hook.

@raymondfeng
Copy link
Contributor

I like the idea of observe and notify. Naming wise, we probably want to avoid the conflict with Object.observe (ES6) which works at property level.

@bajtos
Copy link
Member Author

bajtos commented Jan 21, 2015

I like the idea of observe and notify. Naming wise, we probably want to avoid the conflict with Object.observe (ES6) which works at property level.

FWIW Object.observe seems to be an ES7 feature.

If the Mozilla doc is correct, observe is a static method of the Object class that is not inherited. Or at least it is not inherited in io.js 1.0.3 by functions (model constructors).

Can you propose a better name? Few ideas from myself: observeOperation, observeAsync.

@raymondfeng
Copy link
Contributor

@bajtos
Copy link
Member Author

bajtos commented Jan 21, 2015

Not at all. Node's async listeners are synchronous functions that are called whenever an async method is invoked. In our case, we have an async (callback-based) function that is called before/after an operation is performed.

@bajtos
Copy link
Member Author

bajtos commented Jan 21, 2015

I am also not very happy with the term "operation", but I could not come with a better name. E.g. "intent" does not make much sense to me. Although I guess observeIntent('before save') is not that bad.

@raymondfeng
Copy link
Contributor

@bajtos
Copy link
Member Author

bajtos commented Jan 21, 2015

The Java conventions specify the names of the lifecycle events, I don't see how it can help us to pick up a better name for Model.observe.

@bajtos bajtos force-pushed the feature/intent-hooks branch 2 times, most recently from 3c727b4 to 5de33f3 Compare January 23, 2015 14:47
@bajtos
Copy link
Member Author

bajtos commented Jan 23, 2015

@ritch @raymondfeng I have finished the first cut of this feature, please review. I'll try to catch you up before the standup to discuss it online.

Good news: The current implementation in Juggler works supports
connector-specific optimized implementations too. We don't have
to change the connectors in order to get the new hooks.

Major change against the proposal: I have omitted before/after
validate hooks. The hook before save is executed before the
validation takes place. The reason for that is that before save
can update the model to be saved and we must ensure that the
modified model is still valid. Now when before save is triggered
before the validation, I don't see any use case for before validate.
The hook after validate was always weird, as it basically means
the same as before save but with the extra possibility that
the valid model will not be saved at the end because the write
DB operation fails.

Critical points to discuss and resolve before this PR can be finished:

1) updateOrCreate and validation (#262)

The current implementation of updateOrCreate does not validate
the model when the connector provides optimized implementation.
When we fix this issue, existing applications that were sending invalid
data to updateOrCreate will stop working.

Does it mean such fix will be a breaking change? If it is, do we want to deal
with the hassle of releasing a new major version of juggler AND loopback,
or are we ok with keeping the current (incorrect) behaviour as far as this PR
is concerned?

2) updateAll and context data

What data should be passed to hooks from updateAll, which basically
does something like:

UPDATE Cars SET Name='Yellow' WHERE Name='Red'

At the moment, both before save and after save has the model-to-be-saved
in the context. Because updateAll does not load model data, we don't have
the model instance at hand.

Fetching the data from the database may be suboptimal:

  • Multiple instances may be affected, thus a lot of data must be fetched.
  • In many cases these data are not used at all, e.g. when the replication
    is disabled.

I am thinking about the following solution:

Implement two flavours of before save and after save hook:
The first flavour contains model that is being updated.
The second flavour contains where and data properties describing
updateAll parameters. Consuments of the hook must be prepared to handle
both flavours.

The benefit of this approach is that full model data can be fetched lazily,
only when there is an observer that needs them.

The drawback is adding more complexity for the developers to deal with, making the hooks more difficult to use.

What do you think? Can you come up with a better solution?

3) Id-based query modified by before delete hook

PersistedModel.prototype.delete performs an id-based DELETE operation.
To keep the hook API as small as possible, this id-based query is converted
to a full where object, which is the same as what PersitedModel.deleteAll
sends.

The problem: a hook observer is allowed to change the where object, expecting
that the juggler will use the modified version when performing the operation.
However, since prototype.delete uses an id only, there isn't a straightforward
way how to apply the modified query.

Possible solutions:

  • Detect the situation when the query was modified and run full deleteAll
    in such case.
  • Add a context flag readonly to signal that the where object should not
    be edited by the hook.
  • Introduce another context flavour, pass the id instead of the where object.
    The reason why I don't like this solution is that it moves most of the
    complexity to the consumer. Even when the consumer does not modify the data,
    it still has to understand both flavours.

My preference is to implement the first solution (switch to deleteAll),
as it is the simplest version from the point of consumers.

Thoughts?

4) Id-based query modified by before load

This is a similar issue like above, but this time for PersistedModel.updateOrCreate.
And it is not possible to switch from id-based to full where-based query, because
the connector API for updateOrCreate does not support that ATM.

However, there is an option of not triggering the before load hook at all.

IMO, the ideal solution is to modify the connector API for updateOrCreate
and let it support where-based query in addition to id-based query.

However, since that is a breaking change, so I am proposing to leave it until
later. As far as this pull request is concerned, the before load event won't
be emitted at all for updateOrCreate.

At the end, we can always add code emitting the hook later in the future,
when we discover a use case that requires it.

Any objections?

Non-critical but still important:

5) prototype.updateAttributes and data modified by hooks

The method needs to detect what propeties were changed
by the hook so that it can apply these changes (but only these changes) to the
sparse data object that holds key-value pairs to update.

Now consider a hook that wants to add a new property that should be changed
by the save operation. The property is not part of the original data,
juggler has to compare the old data version with the new data version
This auto-detection adds too much overhead, especially in the optimistic
cases when nothing is changed.

I am thinking about the following solution: Add context methods
include(propertyName), exclude(propertyName) that can be used by
hook implemetors to signal how to change the sparse data object.

6) findOrCreate and before save hook

The default implementation
provided by DAO detects when the model exists and does not trigger
before save hook in such case. On the other hand,
connector-provided optimized (atomic) versions
don't know in advance whether the model exists or not.

The current implementation is thus inconsistent. Depending on the
connector in use, the hook will or will not be triggered.

Do you consider it a problem?

6) updateOrCreate and after load hook

The default implementation
provided by DAO triggers after load when an existing model will be updated.
Connector-provided optimized (atomic) versions don't know in advance
whether the model exists or not.

The current implementation is thus inconsistent. Depending on the
connector in use, the hook will or will not be triggered.

Do you consider it a problem?

@bajtos
Copy link
Member Author

bajtos commented Jan 23, 2015

@STRML @clarkorz @alFReD-NSH @violet-day @kevinsproles @offlinehacker @arlaneenalra @projectxmaker @bluestaralone
Since you were involved in the discussions related to broken before/after hooks, I am inviting you to review the solution proposed by this PR and let us know what you think.

The idea is to keep the old (broken) hooks in place for now, so that backwards compatibility is preserved. At the same time, there will be a new parallel implementation that will hopefully fix all know flaws.

There isn't much documentation of the proposed solution, except the discussion in #367. Check the unit tests to see the new hooks in action.

Quick overview of the hooks:

  • before load({ query }, next) is called whenever a database is queried to fetch models
  • after load({ model }, next) is called whenever a model was successfully fetched (loaded) from the database
  • before save({ model }, next) is called before a model is saved/updated/created to the database
  • after save({ model }, next) is called after a model was successfully saved/updated/created
  • before delete({ where }, next) is called before some models are deleted
  • after delete({ where }, next) is called after some models were successfully deleted

See my earlier long descriptive comment for the list of exceptions where the implementation differs from the list above. model is a model instance; query is a full query including where, limit, include, etc; where is the "where" part of a query.

I'd like to finish most of this work by Monday/Tuesday, so please don't wait too long with your feedback.

@bajtos
Copy link
Member Author

bajtos commented Jan 23, 2015

Note for reviewers: consider ignoring whitespace changes - https://github.com/strongloop/loopback-datasource-juggler/pull/403/files?w=1

@alFReD-NSH
Copy link
Contributor

Regarding current hooks in this pr, they do cover my use cases and I think the API is good, other stuff in the discussion wouldn't really affect me :)

And if you are looking for a better name than observe, I would suggest subscribe. Seeing notify in here, it reminded me of knockout observables(technically subscribables in this case). In knockout the two methods of subscribe and notifySubscribers are used.

@bajtos
Copy link
Member Author

bajtos commented Jan 26, 2015

@alFReD-NSH Thank you for the comment.

Ad naming conventions, my original naming is based on the Observer design patter. Your proposal subscribe and notifySubscribers is a viable option too.

If we stay with observe, then we should probably rename notify to notifyObservers.

Does anybody have a strong opinion which option is better? subscribe vs. observe?

@bajtos bajtos changed the title [WIP] Intent-based hooks for persistent models Intent-based hooks for persistent models Jan 26, 2015
@bajtos
Copy link
Member Author

bajtos commented Jan 26, 2015

@raymondfeng @ritch I have finished the patch, PTAL. See the PR description field for a short summary of the new hooks.


Notes related to my long comment above and our Friday's discussion:

Items from my comment:

  1. updateOrCreate and validation: out of scope of this PR, see Validators are not triggered on updateOrCreate #262

  2. updateAll and context data: add the second flavour of ctx (where, data)
    to both before save and after save.

  3. Id-based query modified by before delete hook: switch to full deleteAll in such case.

  4. Id-based query modified by before load: original resolution - don't fire the hook is suboptimal,
    I'd rather detect such situation and either switch to unoptimised updateOrCreate.

  5. prototype.updateAttributes and data modified by before save: let's use where + data
    as an alternative to instance and see how far we can get with that implementation.

  6. findOrCreate and before save hook: keep the "inconsistency".

    Depending on the connector in use, the hook before save may not
    be triggered when the model was found and no CREATE is performed.

  7. updateOrCreate and after load hook: we are dropping the "after load" hook (see below).

Extra items

  • Rename "before load" to "query"
  • Remove "after load"
  • All delete methods must call "query" first. Actually I have dropped before delete in favour of `query.
  • Rename ctx.model to ctx.instance (everywhere)
  • It should be possible to replace the whole where/data object and DAO must pick it up.
  • Remove this and fn.apply from the hook infrastructure. Pass the Model inside context. Avoiding fn.call is approx. 2x faster in micro-benchmarks.
  • Should we test that relations methods notify hook observers too? I have omitted this part from the initial implementation, we can always add it later.
  • Add options: { notify: false } argument to some of the DAO methods. I have intentionally not added the options argument to all DAO method since I have already ran out of time.

Rejected:

  • The "ctx" object must be the same in both before/after save hook

    Why not: a single operation triggers multiple hooks, some of them have different properties. E.g. "delete" triggers "query", "after delete".

    The solution for replication of delete-by-query is to convert such query to an id-based query in the connector (or dao.js), see below.

Reworked:

  • Delete hooks. The change replication cannot rely on before/after delete hooks to convert "delete by query" to a list of deleted ids. If we want to support this feature, then we need to implement a delete returning a list of ids at the connector level.

    For the purpose of this pull request, it is enough if the change replication can detect when deleteAll finished and run a full cleanup round. (This is exactly how the current implementation of change-detection works now).

@bajtos
Copy link
Member Author

bajtos commented Jan 27, 2015

@raymondfeng @ritch any comments? May I land this patch?

@raymondfeng
Copy link
Contributor

What intent hooks should we use for the following use cases?

  1. After loading a model instance from database, calculate non-persistent fields whose values depend on the values of persistent fields
  2. Before deleting a record, check if there are references to the target instance
  3. Before loading a record from the database, check if it's available in cache

I'm particularly concerned by the query intent, which is fired for find/update/delete. Developers probably will be confused and don't have a way to implement 2 or 3.

@bajtos
Copy link
Member Author

bajtos commented Jan 27, 2015

After loading a model instance from database, calculate non-persistent fields whose values depend on the values of persistent fields

You can either use the old afterInitialize hook (sync), or we need to implement a new hook. Preferably in a new request.

My initial version had after load hook, but it has some edge cases that are difficult to get right, so I decided to omit it from this PR.

Before deleting a record, check if there are references to the target instance

Ah, good catch. Is the same check applicable in the situations where you are modifying data (foreign keys) too, or is it specific to deletes?

If it's specific to deletes only, then we can add before delete hook which will be invoked after query. Agains, let's keep it out of this PR but perhaps within the original task #367.

Before loading a record from the database, check if it's available in cache

Caching is tricky.

  • Consider find({ where: { name: 'Miroslav' } }). Do you expect to receive a cached version too?
  • Consider unoptimised updateOrCreate that executes findById followed by prototype.save. Are you expecting to use the cached version too?

It seems to me that caching should be applied exclusively to the method findById, in which case you should use method-specific hook (TBD in #397) instead of intent hooks. Perhaps only remote invocations of this method should be cached, in which case you should use remoting hooks.

Here is the caveat that is related to caching and that could prove to be difficult to fix incrementally (in a non-breaking change): the current design does not allow observers to short-circuit the usual flow of execution and provide a custom success response. On the other hand, since intent hooks are shared by multiple methods, each of the method returning different format of the data, I consider this limitation reasonable.

Thoughts?

In general, I'd like to focus this PR on the small core and leave out everything that can be incrementally added later. The patch is already too large for review and will create merge conflicts with at least one other PR, so I don't want to make the situation even worse by waiting until more stuff is crammed in.

@raymondfeng
Copy link
Contributor

I'm fine to merge the PRs as long as other candidate of hooks can be added incrementally. BTW, I'll take the intents into consideration during the spike to clean up/consolidate connectors.

@raymondfeng
Copy link
Contributor

LGTM. Please run the code with a connector such as MySQL and MongoDB to make sure no test failures are created.

@bajtos
Copy link
Member Author

bajtos commented Jan 27, 2015

Please run the code with a connector such as MySQL and MongoDB to make sure no test failures are created.

When I run mysql tests with juggler from npmjs, I see a lot of the following error:

Error: ER_BAD_DB_ERROR: Unknown database 'strongloop'

Where are the instructions for setting up the dev env for the connector?

@raymondfeng
Copy link
Contributor

@bajtos
Copy link
Member Author

bajtos commented Jan 27, 2015

Thanks. The tests run super slow now, but at least they are passing. We should really provide a local env for running the tests, but that's out of scope of this discussion.

I'll verify that MySQL tests pass with the new juggler tomorrow and send a PR to run these tests as a part of MySQL connector test suite.

@bajtos
Copy link
Member Author

bajtos commented Jan 28, 2015

@raymondfeng done. loopbackio/loopback-connector-mysql#71 adds hooks test suite to the MySQL connector and loopbackio/loopback-connector#8 fixes a problem I have discovered along the way.

I had to rework the test suite a bit to make it work with the MySQL connector.

I have also discovered a problem in the design of before save hook for updateOrCreate, I have changed the implementation to the following:

Pass ctx.where + ctx.data instead of ctx.instance whenever the operation may potentially run updateAttributes:

  • atomic update
  • atomic create
  • non-atomic update
    Non-atomic create is the only case when before save is called with ctx.instance.

PTAL at the six new commits and let me know if there is anything else to change before landing this patch.

@aantipov
Copy link

Hi guys!
Just found this discussion.
Can I add my 5 cents to the discussion about methods naming?
In my opinion the talk is about events & subscribing to the events.
I like a lot jquery's style of subscribing to the events: Object.on('SomeEvent', eventHandlerFn) - short, concise & convenient.

@ritch
Copy link
Contributor

ritch commented Jan 28, 2015

I did not get to review the code is thoroughly I would of liked to. But I do not want to hold up landing this any longer.

@ritch
Copy link
Contributor

ritch commented Jan 28, 2015

@bajtos ^^^^

@bajtos
Copy link
Member Author

bajtos commented Jan 28, 2015

One more change: I simplified the tests to reuse the same datasource instance, I hope it will speed up the tests when running against a real database.

@ritch Ok. I'd say it's most important to review the public API, implementation details can be always fixed/improved later.

@aantipov Thank you for chiming in. Methods on and emit are already implemented by EventEmitter, which the Model class inherits from. These events are synchronous, i.e. the handler function does not have a callback argument. For the work made here, we need a new API that will not make it clear that these async hooks are not related to usual sync events.

@raymondfeng
Copy link
Contributor

LGTM

@raymondfeng raymondfeng assigned bajtos and unassigned raymondfeng Jan 28, 2015
Miroslav Bajtoš added 3 commits January 29, 2015 08:41
Implement infrastructure for intent-based hooks.
This patch introduces a new API for "intent-based" hooks. These hooks
are not tied to a particular method (e.g. "find" or "update"). Instead,
they are triggered from all methods that execute a particular "intent".

The consumer API is very simple, there is a new method
Model.observe(name, observer), where the observer is function
observer(context, callback).

Observers are inherited by child models and it is possible to register
multiple observers for the same hook.

List of hooks:

 - query
 - before save
 - after save
 - after delete
@bajtos
Copy link
Member Author

bajtos commented Jan 29, 2015

@slnode test please

bajtos added a commit that referenced this pull request Jan 29, 2015
Intent-based hooks for persistent models
@bajtos bajtos merged commit ce39f8a into master Jan 29, 2015
@bajtos bajtos removed the #review label Jan 29, 2015
@bajtos bajtos deleted the feature/intent-hooks branch January 29, 2015 07:59
@bajtos bajtos changed the title Intent-based hooks for persistent models Operation hooks for persistent models (intent-based hooks) Feb 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants