sharedInbox / siteInbox type endpoint (publicInbox, but not just for public posts) #242

Closed
cwebber opened this Issue Jul 12, 2017 · 22 comments

6 participants
@cwebber
Collaborator

cwebber commented Jul 12, 2017

Currently, the most general way to post an activity in ActivityPub is to post it to a user's inbox endpoint. However, since well known figures with many subscribers would result in many posts to many users at once, we've made an exception for public posts, which may be posted to the publicInbox, which may be shared amongst users on a site.

On the call today, we found that this was not enough for Mastodon. On Mastodon, followers-only posts are common. Gargron gave an example that they have over 12k users, and should every followers-only post result in 12k HTTP requests given that many users are on shared servers?

Gargron suggested that Mastodon will probably reuse the publicInbox endpoint for this purpose. While I personally strongly prefer the delivery to inboxes approach, I think we need to address this. It's clear that Mastodon will do something to that effect in its implementation, so I think we need to get this right in ActivityPub itself, otherwise we could end up in the same space as what's happening in OStatus right now. One could easily see an implementation like Mastodon posting private content to the publicInbox endpoint and expecting servers to filter delivery based on content, and other servers not being aware and unintentionally delivering that information publicly to their users. That would be bad!

So, I think we should rename publicInbox to something like sharedInbox or siteInbox and change its behavior.

  • For posts that are addressed to the special Public collection, the behavior is pretty much the same as currently for publicInbox.
  • For non-public posts, we have two options...
    • Servers can look at the addressing of the post and identify what recipients it should deliver to. We could maybe restrict this to being done for followers-style collections only, or it could also look at the individuals addressed in to, cc, etc. At any rate, a receiving server will have to use its "best knowledge" to determine who a post should be delivered to. This may be less precise; it could fail for users that should have been Block'ed. Errors in the network could also mean that the followers list is somewhat out of sync between servers.
    • Alternately, a sending server could, as a separate part of the message, specify an exact list of recipients relevant to that server. For 12k or even 1M followers, this could be a large post (though it could be done in multiple posts) but it would be less large than 12k individual HTTP POSTs. But it would be much more precise, and more respectful of things like blocklists (which currently we specify not federating across servers to protect users.)
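
To make the second option more concrete, here is a minimal Python sketch of how a sending server might batch an explicit recipient list per receiving server. The function name, the `max_per_post` limit and the example URLs are all hypothetical; nothing here is specified by ActivityPub, and the shape of the actual message is left open.

```python
from collections import defaultdict
from urllib.parse import urlparse


def batch_recipients_by_server(recipients, max_per_post=1000):
    """Group recipient actor URLs by host, then split each group into batches.

    Each returned (host, batch) pair stands for one HTTP POST to that
    host's shared endpoint, carrying an explicit list of recipients.
    """
    by_host = defaultdict(list)
    for actor_url in recipients:
        by_host[urlparse(actor_url).netloc].append(actor_url)

    posts = []
    for host, actors in by_host.items():
        # "it could be done in multiple posts": cap the size of each list.
        for start in range(0, len(actors), max_per_post):
            posts.append((host, actors[start:start + max_per_post]))
    return posts


# Hypothetical follower list: 12,000 followers spread over 3 servers.
followers = [
    f"https://server{i % 3}.example/users/u{i}" for i in range(12_000)
]
posts = batch_recipients_by_server(followers)
print(len(followers), "individual POSTs vs", len(posts), "batched POSTs")
# prints: 12000 individual POSTs vs 12 batched POSTs
```

The batch size trades request count against body size; with real follower distributions the saving depends on how many followers share a server.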
@cwebber

cwebber Jul 12, 2017

Collaborator

If this is only used for delivery to followers, this endpoint can be simplified considerably. All other recipients could be handled by posting to individual inboxes.

@puckipedia

puckipedia Jul 12, 2017

I would vote for specifying each and every recipient, as servers may implement the addressing slightly differently, and a server can't feasibly look through a collection containing e.g. thousands of objects stored remotely. Alternately, in a perfect world, those 12 thousand followers would be spread over e.g. a thousand servers, which would limit the number of recipients per server to far fewer than the total of twelve thousand.

And also, what if a user is mentioned in to and also in the follower collection?

@clacke

clacke Jul 14, 2017

Would it make sense to have this as two different features, publicInbox and sharedPrivateInbox? I'm thinking that maybe some implementation would only be bothered implementing publicInbox. But maybe that's adding complexity for little benefit. Personally I have to admit I was wondering why the shared endpoint was for public posts only.

@puckipedia

puckipedia Jul 14, 2017

I think if ActivityPub had an endpoint that sends objects to many different clients [edit: remote actors], a publicInbox endpoint would be unneeded, I guess? It'd duplicate the functionality provided by the former.

@strugee

strugee Jul 15, 2017

@puckipedia I can't quite parse that sentence? What now about clients?

> Alternately, a sending server could, as a separate part of the message, specify an exact list of recipients relevant to that server. For 12k or even 1M followers, this could be a large post (though it could be done in multiple posts) but it would be less large than 12k individual HTTP POSTs. But it would be much more precise, and more respectful of things like blocklists (which currently we specify not federating across servers to protect users.)

So it appears to me that, ignoring efficiency, this is the better option. It's guaranteed to be precise (even if followers lists get out of sync), and it respects Blocks. So I think the million-dollar question here is, is this good enough? This is just an optimization, after all... if it's good enough I say ship the simplest thing.

Also, since I'm having a hard time keeping this abstract scenario in my head, lemme make sure my notion of the Block problem matches everyone else's:

  1. alice@example.com and chuck@example.com follow bob@foobar.net
  2. bob@foobar.net Blocks chuck@example.com; foobar.net's notion of bob@'s followers is updated but example.com's notion of bob@'s followers isn't because the Block isn't federated
  3. bob@foobar.net posts a note with to: Followers
  4. foobar.net distributes the note to alice@example.com but not chuck@example.com

The problem being that if example.com is responsible for inferring where the note should've been delivered, it'll get it wrong because it doesn't know about the Block (and the subsequent mutation of bob@foobar.net's followers list). Right?
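
The scenario above can be replayed as a small Python sketch. The set-based "views" and the `local_recipients` helper are illustrative assumptions, not anything defined by ActivityPub; they only model each server's idea of who follows bob@foobar.net.

```python
# Each server's view of who follows bob@foobar.net (step 1).
foobar_view = {"alice@example.com", "chuck@example.com"}
example_view = set(foobar_view)  # example.com's copy, in sync at first

# Step 2: bob Blocks chuck. Only foobar.net learns about it,
# because the Block is not federated.
foobar_view.discard("chuck@example.com")


def local_recipients(followers_view, host):
    """Followers in the given view whose account lives on `host`."""
    return {f for f in followers_view if f.endswith("@" + host)}


# Steps 3 and 4, explicit delivery: the sender (foobar.net) picks recipients.
explicit = local_recipients(foobar_view, "example.com")

# Inferred delivery: the receiver (example.com) uses its own stale view.
inferred = local_recipients(example_view, "example.com")

print(sorted(explicit))             # ['alice@example.com']
print(sorted(inferred - explicit))  # ['chuck@example.com']
```

The difference between the two sets is exactly the blocked follower who would still receive the note if the receiving server infers delivery from an out-of-date followers list.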

@puckipedia

puckipedia Jul 15, 2017

woops, with clients I meant 'remote actors' (updated above comment to clarify). And indeed, that's the same notion of the problem I had at least.

@Gargron

Gargron Jul 15, 2017

I am strongly in favour of a sharedInbox endpoint that would respect audience targeting (such as "followers collection"). Listing individual recipients does not scale well. My decision is also largely informed by business logic implemented in Mastodon. The main way of getting a status on someone's home feed is them following the author. That means the huge list of individual targets would be pretty useless. Blocks federate in Mastodon, so that's not the issue.

I am also in favour of this because I do not like the idea that publicInbox should ignore to/cc fields. The logic should be the same regardless of which endpoint is being delivered to, so renaming it to sharedInbox would be more semantic.

Speaking of which, it'd be nice to be able to target a Block activity to such a sharedInbox, without targeting the blocked user. That would feel more right than having to send a Block activity directly to the blocked user.

@cwebber

cwebber Jul 16, 2017

Collaborator

> I am strongly in favour of a sharedInbox endpoint that would respect audience targeting (such as "followers collection"). Listing individual recipients does not scale well. My decision is also largely informed by business logic implemented in Mastodon. The main way of getting a status on someone's home feed is them following the author. That means the huge list of individual targets would be pretty useless. Blocks federate in Mastodon, so that's not the issue.

This is definitely a tricky point here. We're definitely seeing decisions of the backend bleed into desires of the protocol. I guess maybe that's inevitable, but I think this is the first time we've seen it so clearly. We have effectively two approaches here:

  • Explicit delivery, where we really do see objects as being "delivered" to an inbox. You see something because at some point it was delivered to you. These are sequentially ordered containers/collections. Less space efficient but the explicitness probably results in some fewer errors, and means easier tracking of all sorts of collection types. (Pubstrate, Kroeg, Pump.io... this is effectively email-style.)
  • Inferred delivery. You determine who sees what on their stream not as much by actual delivery to an inbox, but by inference based off of who's a target on the message itself, and who's following whom. This means that each server has to keep track of whom they know is in followers/following much more carefully. It's more space efficient but may be more liable to network partition errors, including failure to synchronize collections. (Mastodon, edit: and Diaspora and I'm not sure which of the other social systems, maybe GNU Social?... but this is effectively Twitter style.)

I'm not trying to make a case for either in this post, I'm just trying to document the difference. Unfortunately, it's also pushing some pressure to make a decision in how we implement this, and whatever we do will probably affect the backends of these systems.

@cwebber

cwebber Jul 16, 2017

Collaborator

Going back to this suggestion I made earlier:

> If this is only used for delivery to followers, this endpoint can be simplified considerably. All other recipients could be handled by posting to individual inboxes.

I wonder if this could avoid the explicit vs implicit battle? Use this endpoint for followers only, and use explicit delivery for everything else? What do people think about that?

@Gargron

Gargron Jul 16, 2017

> email-style vs Twitter style

Another point I'd like to add is a distinction between kinds of inbox content that is otherwise not present in ActivityPub. In e-mail, your inbox is stuff people send you personally. In a social network, you have a home feed, which is things you subscribe to passively, and notifications, which is stuff sent to you personally. I don't think any of our current users would be happy about the prospect of anyone having the capacity to insert their post into their home feed.

If you use explicit delivery, there is no way to distinguish between a truly targeted post (notification-worthy) and a passive post to followers (home). So followers URI must be handled imo.
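
As an illustration of the home-versus-notifications distinction described here, the following Python sketch classifies an incoming activity for one local user from its `to`/`cc` addressing. It is not Mastodon's implementation; the `classify` function, its parameters and the example URLs are hypothetical, and letting a direct mention win over a followers-collection match is an assumption of this sketch.

```python
def as_list(value):
    """Normalize a to/cc value (missing, single string, or list) to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def classify(activity, user_id, followed_collections):
    """Return 'notification', 'home', or None for one local user.

    followed_collections: followers-collection URIs of the actors that
    this user is known to follow.
    """
    audience = as_list(activity.get("to")) + as_list(activity.get("cc"))
    if user_id in audience:
        return "notification"  # addressed personally
    if any(target in followed_collections for target in audience):
        return "home"          # passive post from someone the user follows
    return None                # nobody asked for this: do not show it


alice = "https://example.com/users/alice"
bob_followers = "https://foobar.net/users/bob/followers"
follows = {bob_followers}

to_followers = {"type": "Create", "to": [bob_followers]}
mention = {"type": "Create", "to": [alice], "cc": [bob_followers]}
stranger = {"type": "Create", "to": ["https://other.example/users/x/followers"]}

print(classify(to_followers, alice, follows))  # home
print(classify(mention, alice, follows))       # notification
print(classify(stranger, alice, follows))      # None
```

The last case reflects the concern that an arbitrary sender should not be able to insert a post into someone's home feed merely by delivering it.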

@cwebber

cwebber Jul 16, 2017

Collaborator

Even if we do adopt the implicit federated posts endpoint as a compromise, it does introduce the problem that it requires federating Block activities if Block activities are also meant to stop delivery to such a user as a follower, which we even have text in the spec as-is saying you shouldn't do, to protect users...

@jaywink

jaywink Jul 16, 2017

Lots of good IRC discussion about this. AFAICT this is the latest suggestion, and it was mostly met with head-nodding:

23:35 | <cwebber2> | - we should switch publicInbox to sharedInbox; make it for addressing only to followers and individuals on to/cc
23:35 | <cwebber2> | *and*
23:35 | <cwebber2> | - we switch the Block section away from saying SHOULD NOT federate, and instead include an informative note explaining the tradeoffs to doing each

Relevant IRC logs at this point and before: https://chat.indieweb.org/social/2017-07-16#t1500237332450000

@strugee

strugee Jul 16, 2017

Link to the Etherpad from the Mumble call: https://public.etherpad-mozilla.org/p/activitypub-implicit-explit

Someone correct me if I'm wrong but I believe this was the consensus:

  1. sharedInbox is used only for public and followers delivery, everything else goes to individual inboxes
  2. ForceUnfollow (what it sounds like; probably done with {Undo: {Accept {Follow}}} or {Reject {Follow}}) is a separate concept from Ignore (preventing side effects, probably still done with {Block: {Actor}})
  3. Whether to perform ForceUnfollow at the same time as Ignore is left up to implementors, who can then use the "strawman proposal" algorithm at the bottom of the Etherpad. E.g. Mastodon will do this to match existing UI and thus will always use the sharedInbox endpoint.
  4. Need some way to communicate to clients whether or not the implementation does this, so clients can present accurate information to user (i.e. want to avoid a situation where client presents ForceUnfollow and Ignore as separate actions, but the server implicitly ForceUnfollows when the user Ignores). Could be a binary flag on the actor
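
Point 1 of this list can be sketched as a routing function on the sending side. This is a rough Python illustration under stated assumptions: the `delivery_targets` function, the shape of the `actors` lookup table and all example URLs are hypothetical, and treating a Public post like a followers post (fan out to the followers' servers) is a simplification made for the example.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def as_list(value):
    """Normalize a to/cc value (missing, single string, or list) to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def delivery_targets(activity, followers_uri, followers, actors):
    """Return the set of inbox URLs that the activity should be POSTed to.

    followers: actor ids in the sender's followers collection.
    actors: actor id -> {"inbox": url, "sharedInbox": url (optional)}.
    """
    audience = as_list(activity.get("to")) + as_list(activity.get("cc"))
    targets = set()
    # Public and followers delivery goes through sharedInbox where available.
    if PUBLIC in audience or followers_uri in audience:
        for follower in followers:
            profile = actors[follower]
            targets.add(profile.get("sharedInbox") or profile["inbox"])
    # Everything else goes to individual inboxes.
    for address in audience:
        if address in actors:
            targets.add(actors[address]["inbox"])
    return targets


bob_followers = "https://foobar.net/users/bob/followers"
alice = "https://example.com/users/alice"
chuck = "https://example.com/users/chuck"
dave = "https://solo.example/users/dave"
carol = "https://other.example/users/carol"
actors = {
    alice: {"inbox": alice + "/inbox", "sharedInbox": "https://example.com/inbox"},
    chuck: {"inbox": chuck + "/inbox", "sharedInbox": "https://example.com/inbox"},
    dave: {"inbox": dave + "/inbox"},  # this server offers no sharedInbox
    carol: {"inbox": carol + "/inbox", "sharedInbox": "https://other.example/inbox"},
}

note = {"type": "Create", "to": [bob_followers], "cc": [carol]}
targets = delivery_targets(note, bob_followers, [alice, chuck, dave], actors)
print(sorted(targets))
```

Here alice and chuck share one request to their server's sharedInbox, dave falls back to his own inbox, and carol, who is addressed individually rather than as a follower, is reached through her individual inbox.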

@clacke

clacke Jul 17, 2017

> If you use explicit delivery, there is no way to distinguish between a truly targeted post (notification-worthy) and a passive post to followers (home).

Oh! I wasn't paying attention and didn't notice that major and minor inboxes went away on the way to standardization.

But coupling major/minor addressing to delivery efficiency seems backwards.

@Gargron

Gargron Jul 17, 2017

@clacke By major and minor addressing do you mean "to" vs "cc"? Because if so, that is still in the spec.

@cwebber

cwebber Jul 17, 2017

Collaborator

@Gargron major and minor are "specialized" filtered read-only inboxes in pump.io. They're basically filters for your main timeline, filtered to have just the stuff you really want in it, vs the firehose of everything, including every follow and unfollow and like and delete that crosses your timeline.

@clacke major and minor probably won't go away, it'll just be moved to an extension; it's not necessary for the main protocol and is super underspecified as-is. Probably we'll see some different filtered inbox proposals come up in extension-land.

@strugee

strugee Jul 17, 2017

https://github.com/pump-io/pump.io/blob/master/API.md#major-and-minor-feeds for those who want as close as you can get to authoritative info on these feeds.

@clacke

clacke Jul 17, 2017

If to and cc are still in the spec, I don't see that "there is no way to distinguish between a truly targeted post [ . . . ] and a passive post to followers". Delivery mechanism shouldn't affect that.

@cwebber

cwebber Jul 25, 2017

Collaborator

From the meeting:

<eprodrom> PROPOSED: for https://github.com/w3c/activitypub/issues/242, group supports
  renaming publicInbox to sharedInbox and allowing sending to followers only IFF
  implementation support 

... so this is a TODO for me.

@nightpool nightpool referenced this issue Aug 12, 2017

Merged

ActivityPub delivery #4566

16 of 16 tasks complete

cwebber added a commit that referenced this issue Aug 27, 2017

@cwebber

cwebber Aug 27, 2017

Collaborator

sharedInbox is now added to the editor's draft. Additionally, the "IFF implementation support" is already in place, since Mastodon has added sharedInbox, as described in this document, to their implementation, and is ready to roll it out in their next release.

@cwebber

cwebber Aug 27, 2017

Collaborator

I'm not closing this yet because we need help to add sharedInbox to the ActivityStreams vocabulary. I guess that means that publicInbox should be marked as deprecated as well on that document?

@cwebber

cwebber Sep 6, 2017

Collaborator

Oh yeah, sharedInbox was added to the AS2 vocab/context, so we're good!
