Change tracking #155

Closed
brooksn opened this Issue Oct 12, 2016 · 6 comments


brooksn commented Oct 12, 2016

A client may be interested in displaying old versions of an object which was modified by the Update activity. This way, Alyssa P. Hacker could save a draft of a blog post to her ActivityPub server, and later open it in another client to continue working. Should the server save all versions of an object?

Collaborator

cwebber commented Oct 13, 2016

Hm, so this is something that I would like, and that would be really nice to have! Though I worry it's going to be hard to get it right in time. So, I'm not sure we should add it right now since it could be an extension, but... it's a fun idea to explore, so let's explore it.

There are probably two ways to go about it, if we were to do it:

  • Simply have a "history" object that has an array of previous versions of the object (either inline, though obviously excluding any of their own history properties, or even better, as just URIs to previous objects). This would be easy, though it means archiving a version of an object "after the fact".
  • Do something much more "progressive"... I recently wrote a blogpost about ideas for an ActivityPub extension that would make things closer to having append-only data structures and immutable objects. It's a fun idea, and would probably involve having most objects have some sort of "Pointer" or "HistoryLog" object type that is the indirect representation of the object, by pointing to the current version of the object and maybe previous versions too. But that's so hard to get right that there's no way to get it done in time for Candidate Recommendation status.

We could try to add the "history" option and mark it as "at risk". The other option seems interesting to me, but seems like a research project. But maybe the best thing would be to make this an extension after ActivityPub is out. Would you be interested in exploring that direction? What do you think?
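
To make the first option concrete, here is a minimal sketch of how a server might archive the previous state on each Update and keep only URIs in a "history" array. Nothing here is specified by ActivityPub: the `history` property, the `apply_update` helper, the in-memory `archive` dict, and the example.org URIs are all assumptions made up for illustration.

```python
import copy


def apply_update(current, changes, archive, version_uri):
    """Archive `current` under `version_uri`, then return the updated object."""
    # Snapshot the old state without its own history, so snapshots never nest.
    snapshot = {k: v for k, v in current.items() if k != "history"}
    archive[version_uri] = copy.deepcopy(snapshot)

    updated = copy.deepcopy(current)
    updated.update(changes)
    # "history" is a hypothetical property: an oldest-first list of snapshot URIs.
    updated["history"] = list(current.get("history", [])) + [version_uri]
    return updated


# Hypothetical usage: two edits to a draft note.
archive = {}
note = {"id": "https://example.org/notes/1", "type": "Note", "content": "First draft"}
note = apply_update(note, {"content": "Second draft"}, archive,
                    "https://example.org/notes/1/v1")
note = apply_update(note, {"content": "Final"}, archive,
                    "https://example.org/notes/1/v2")
```

This is the "after the fact" archiving mentioned above: the old state only gets its own URI at the moment it is replaced.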

brooksn commented Oct 13, 2016

I read your blogpost; that's an interesting concept! But I'm going to try to follow along with your first idea here. I'm also looking at Activity Streams for the first time, so I apologize if I retread ground that's already been covered or misunderstand anything (like collections).

I can think of three uses for tracking old versions.

  1. Get a page of objects, but only the latest version of each object
  2. Get a collection of all the objects that are versions of an earliest ancestor object (e.g. all the activity around a blog post)
  3. Get a directed graph of an object's versions

Feature 1 is already in the protocol, since the Update action instructs the user's server (and notifies her followers' servers) to drop the old object and never use it again.

Features 2 and 3 seem to be identical: the overwriting behavior of the Update action means that the only possible parent of a new object is its most recent version, so an object's Update history cannot branch.

(I'm assuming that collections are defined by the ActivityPub protocol, and that the user does not create ad-hoc collections)

To get feature 2+3, an inline history list on an object would be simple, since it's guaranteed to be 1 level deep, and it doesn't require generating new object IDs. However, that's a lot of baggage to automatically push onto every object by default. But if it's an optional behavior on Update (or another activity), and only some clients used it, an object may accumulate a patchwork version history.

A separate History collection may make sense. Old versions (or the abbreviated key/value changes posted in an Update) would be available to interested clients, but wouldn't clutter up the user's outbox/her followers' inboxes. The objects would have their own ID, but would all reference the same parent post.

(I lifted this line of thinking from tent/tent.io#170, although Tent posts aren't really compatible with Activity Streams IRIs.)
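
A rough sketch of what such a separate History collection could look like follows. `OrderedCollection`, `totalItems`, and `orderedItems` are existing Activity Streams terms; the `versionOf` property, the `/history/N` URI scheme, and the `history_collection` helper are hypothetical names chosen only for this example.

```python
def history_collection(parent_id, old_versions):
    """Build a hypothetical per-object history collection, oldest version first."""
    items = []
    for n, version in enumerate(old_versions, start=1):
        item = dict(version)
        item["id"] = f"{parent_id}/history/{n}"  # each old version gets its own ID
        item["versionOf"] = parent_id            # hypothetical link to the parent post
        items.append(item)
    return {
        "id": f"{parent_id}/history",
        "type": "OrderedCollection",
        "totalItems": len(items),
        "orderedItems": items,
    }


# Hypothetical usage: two superseded versions of one blog post.
post_id = "https://example.org/posts/42"
history = history_collection(post_id, [
    {"type": "Article", "content": "Draft"},
    {"type": "Article", "content": "Draft with typo fixes"},
])
```

Because Update history is linear, an ordered list like this covers both the "collection of versions" and the "graph of versions" use cases.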

brooksn commented Oct 13, 2016

A collection may be more friendly to implementation as an optional extension.

Collaborator

cwebber commented Oct 13, 2016

We could indeed use a collection for the history object, though I think it would make more sense to make it a simple array. I mean, I guess collections make a lot of sense! I'm just thinking about adding activitystreams paging to every object out there and getting a huge headache...

By the way, the more I think about it, the less I think it's a good idea to inline historical objects. It could lead to an explosion if inlined objects themselves inline other objects that inline their history. Yikes! Inlining can already get explosive; imagine how explosive if all history was contained! So if history was done it should very explicitly just be links to objects representing prior state.
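
As an illustration of the "links only" rule, a server could normalize an object before serving it so that any inlined history entries collapse to their URIs. This is only a sketch: the `history` property and the `link_only_history` helper are hypothetical, and it assumes every inlined entry carries an `id`.

```python
def link_only_history(obj):
    """Return a shallow copy of `obj` whose hypothetical "history" holds only URIs."""
    linked = dict(obj)
    linked["history"] = [
        # Inlined entries (dicts) are reduced to their id; plain URIs pass through.
        entry["id"] if isinstance(entry, dict) else entry
        for entry in obj.get("history", [])
    ]
    return linked


# Hypothetical usage: one inlined old version (with its own nested history) and one URI.
note = {
    "id": "https://example.org/notes/1",
    "type": "Note",
    "content": "Final",
    "history": [
        {"id": "https://example.org/notes/1/v1", "type": "Note",
         "content": "First draft", "history": []},
        "https://example.org/notes/1/v2",
    ],
}
flat = link_only_history(note)
```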

Collaborator

cwebber commented Oct 25, 2016

Ok, we talked about this on the call. I'll publish this across two posts.

First of all, Evan suggested another way to do this: have a history property with a collection that lists all the activities that were relevant to this post, from Create to Update to Delete, etc. That way you can "walk through" all the changes that have happened in a familiar format. So that's one more option that could be explored.
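
A small sketch of what "walking through" such an activity log might look like is below. It assumes the log starts with a Create, that an Update carries only the changed keys, and that a Delete leaves a `Tombstone` behind; the `replay` helper and the example.org URI are hypothetical.

```python
def replay(activities):
    """Yield (activity type, object state) after each activity in a history log."""
    state = None
    for activity in activities:
        kind = activity["type"]
        if kind == "Create":
            state = dict(activity["object"])
        elif kind == "Update":
            # Assumption: Update is a partial, key-by-key overwrite.
            state = {**state, **activity["object"]}
        elif kind == "Delete":
            state = {"id": state["id"], "type": "Tombstone"}
        yield kind, state


# Hypothetical usage: the full life of one note.
note_id = "https://example.org/notes/1"
log = [
    {"type": "Create", "object": {"id": note_id, "type": "Note", "content": "Draft"}},
    {"type": "Update", "object": {"id": note_id, "content": "Published"}},
    {"type": "Delete", "object": note_id},
]
states = list(replay(log))
```

Each step builds a fresh dict, so earlier states stay intact and every prior version can be recovered from the log alone.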

Collaborator

cwebber commented Oct 25, 2016

Second point from the meeting: @brooksn, we had a good amount of interest and excitement for this idea on the call. However, we're so close to CR, and as far as I know nobody has implemented this yet, so there's concern about adding it so late.

So, I think we're not getting this in ActivityPub core, or at least not in this revision. However! We are interested in it as an extension. In order for that to happen, we would need some help having it be incubated in an implementation. Would you be interested in helping incubate it @brooksn? It could be cool to explore!
