Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/op metadata support #586

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

maxfortun
Copy link

Reworked the PR(#513) for issue(#512).
Hoping it will make it into the main branch this time around.

@alecgibson
Copy link
Collaborator

Hi @maxfortun just a quick note to say thanks for the PR and I promise I haven't forgotten about it! It's on my todo list, just very busy right now unfortunately.

@coveralls
Copy link

coveralls commented Jan 26, 2023

Coverage Status

Coverage: 97.419% (-0.02%) from 97.437% when pulling 2cb16ee on maxfortun:feature/op_metadata_support into cc0e338 on share:master.

lib/backend.js Outdated
@@ -772,9 +772,11 @@ Backend.prototype._fetchSnapshot = function(collection, id, version, callback) {
var db = this.db;
var backend = this;

var options = {metadata: true};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should potentially consider snapshot metadata in a separate PR. It's more complicated than op metadata, because there's not currently a mechanism to propagate updated snapshot metadata to remote clients (which only receive ops to transform their local data, and don't receive the full snapshot — this is sort of the point of OT in general).

For example, I'd expect a passing test case for this case:

it('updates remote metadata', function(done) {
  backend.use('commit', (context, next) => {
    context.snapshot.m.updated = Date.now();
    next();
  });

  var connection1 = backend.connect();
  var connection2 = backend.connect();

  var doc1 = connection1.get(...);
  var doc2 = connection2.get(...);

  async.series([
    doc1.subscribe.bind(doc1),
    doc1.submitOp.bind(doc1, [{p: ['foo'], oi: 'bar'}]),
    doc2.subscribe.bind(doc2),
    function(next) {
      expect(doc1.metadata.updated).to.equal(doc2.metadata.updated);
      next();
    },
    function(next) {
      doc2.once('op', function() {
        expect(doc1.data).to.eql(doc2.data);
        expect(doc1.metadata.updated).to.equal(doc2.metadata.updated);
        next();
      });
      doc1.submitOp([{p: ['foo'], od: 'bar', oi: 'baz'}], errorHandler(next));
    },
  ], done);
});

lib/client/doc.js Show resolved Hide resolved
lib/submit-request.js Outdated Show resolved Hide resolved
SubmitRequest.prototype._granularMetadataProjection = function() {
var request = this;
var metadataProjection = {};
for (var key in request.opMetadataProjection) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at least add a code comment here that we only support a single level of projection.

Would be extra nice if we updated the docs to include this flag and notes on its use.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment, and updated docs with an example.

lib/backend.js Outdated
@@ -789,7 +791,7 @@ Backend.prototype._fetchSnapshot = function(collection, id, version, callback) {
// - we want to avoid the 'op' middleware, because we later use the 'readSnapshots' middleware in _sanitizeSnapshots
// - we handle the projection in _sanitizeSnapshots
var from = milestoneSnapshot ? milestoneSnapshot.v : 0;
db.getOps(collection, id, from, version, null, function(error, ops) {
db.getOps(collection, id, from, version, options, function(error, ops) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we've put this in the right place? Shouldn't we be fetching op metadata in _getSanitizedOps() (and maybe its bulk counterpart)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that if a snapshot has some saved metadata and we do not pass fields and options to getSnapshot , the metadata will not be retrieved at all. Same with _getSanitizedOps, if we do not pass the options to getOps , the database layer will not retrieve the metadata and there will be nothing to sanitize. I have a very shallow understanding of the intent here, and maybe I am missing something fundamental. I could definitely use your help to understand this better. Thank you!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should basically not touch snapshots at all in this PR. As I mentioned in my other comment, snapshot metadata is a very different beast, and I think we should consider it separately to ops.

I think this belongs in _getSanitizedOps() with the call to backend.db.getOps().

lib/submit-request.js Outdated Show resolved Hide resolved
@@ -1210,5 +1212,189 @@ module.exports = function() {
});
});
});

describe('metadata projection', function() {
it('passed metadata to connect', function(done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this test has anything to do with this PR?

});
});

it('passed metadata to submit', function(done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: I'm not sure this test has anything to do with this PR?

thisConnection.doc = docs[thisConnection.__test_id] = thisConnection.get('dogs', 'fido');

thisConnection.doc.on('op', function(op, source, src, context) {
if (!src || !context) { // If I am the source there is no metadata to check
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we send metadata back to the original client? The server might add context the sender wants (and which remote clients will get).

What's your use-case? Does it work without the original op submitter getting their metadata?

We could add the meta to the op acknowledgement.

});
});

it('concurrent changes', function(done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please simplify this test case to just have two clients? I'm not sure there's a vast amount of benefit in having so many clients; it just makes the test harder to read. Would be nice to get rid of the for loops.

});

this.backend.use('apply', function(request, next) {
Object.assign(request.snapshot.m, request.op.m);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded?

@@ -1210,5 +1212,189 @@ module.exports = function() {
});
});
});

describe('metadata projection', function() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've flexed metadata on ops that were broadcast over pubsub, but ops can also be fetched when submitting an op from a stale document (that isn't subscribed, and is behind the server); can we test that the ops we get in that case also have their metadata correctly set?

This would need us to tweak SubmitRequest.submit().

Also, can we add a test case for an unsubscribed client calling fetch after another remote client has changed the doc: this is where you'll need the backend._getSanitizedOps() to handle metadata projection for us.

@@ -61,6 +67,10 @@ backend.use('apply', (context, next) => {
if (userId !== ownerId) {
return next(new Error('Unauthorized'))
}

// Add op metadata to snapshot before snapshot is stored
Object.assign(context.snapshot.m, context.op.m);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay have discussed with @ericyhwang and we think:

  • we should go back to having a feature flag (sorry about the U-turn!)
  • if the feature flag is enabled, we send all metadata across pubsub
  • if the feature flag is enabled, we also apply the metadata projection in the op middleware, which should handle the cases I asked for on your tests (reiterated here for clarity)

Cases we need to handle:

  • Op submission (the case you're already testing): ops broadcast over pubsub to other subscribers
  • Op fetching: have a stale document that calls doc.fetch() — this will fetch ops through backend._getSanitizedOps()
  • Ops sent back when submitting: have a stale document that calls doc.submitOp() — this retrieves ops through another mechanism in SubmitRequest.submit()

@maxfortun
Copy link
Author

Ping?

@alecgibson
Copy link
Collaborator

@maxfortun sorry you hadn't asked for re-review, so didn't know it was ready to look at again. I'll try to look today.

Copy link
Collaborator

@alecgibson alecgibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed inline, we shouldn't be using agent.custom. We've also not got a feature flag as we previously requested.

I think I've lost sight of what it is this PR was trying to achieve? You wish to attach metadata from a client and broadcast it to other clients? Or you just want to broadcast server-side metadata from the server to receiving clients?

@@ -866,6 +864,12 @@ Doc.prototype.submitOp = function(component, options, callback) {
callback = options;
options = null;
}

// If agent has metadata, append on client level so that both client and backend have access to it
if(this.connection.agent && this.connection.agent.custom && typeof this.connection.agent.custom.metadata === "object") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why we're doing this? The custom object is meant to be a generic data blob for clients to use. Setting a magical property here is surprising and may clash with clients' existing usage.

Also this.connection is only ever available in setups where the client is running on the same machine as the server (as opposed to, say, a client connected remotely over websocket).

@@ -171,7 +171,8 @@ SubmitRequest.prototype.commit = function(callback) {
var op = request.op;
op.c = request.collection;
op.d = request.id;
op.m = undefined;
op.m = request.agent.custom.metadata ? request.op.m : undefined;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed above, we shouldn't be using agent.custom.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants