Federation relays #7399

Closed
Gargron opened this Issue May 7, 2018 · 23 comments

@Gargron
Member

Gargron commented May 7, 2018

I'm not saying that we're planning to implement this, but it's an interesting idea worth a discussion. One glaring hole of Mastodon UX is bootstrapping a new server with content so it doesn't feel empty.

See: https://github.com/jaywink/social-relay

Application to act as a relay for public posts using the Diaspora protocol. Keeps track of nodes and their subscription preferences, receives payloads and forwards the payloads to subscribers. The aim is to pass public posts around in an efficient way so any new node in the network can quickly subscribe to lots of public activity, without having to wait a long time to create social relationships.

A way to fix #1589

The interesting, and perhaps negative, aspect of this is that the implementation of such a relay server would be completely orthogonal to/independent of ActivityPub. As far as the relay is concerned, it's just passing strings around and managing subscriptions. It's negative in my opinion because there's no standard to fall back on.

@charlag

charlag commented May 7, 2018

I want to note that Pleroma pulls all favourited statuses so they're visible in the public timeline.
It merely mitigates the problem, but I find it related.

To make it clear: do you consider this relay functionality part of Mastodon itself, or are you thinking of a separate relay server?

Also, signature things still work for relayed statuses, right?

@trwnh

Contributor

trwnh commented May 8, 2018

Maybe this could be implemented server-to-server by allowing a server to subscribe to another server? This would hypothetically fetch public posts from that server and then display them in the federated timeline.

The only concerns would be:

  • How would this affect the subscribing server? Since it's optional, admins with limited storage can simply not subscribe to any other servers. But admins that have the space and want to expand federation instantly now have the chance to do exactly that.
  • How would this affect the subscribed server? The large volume of requests to fetch posts might be too much for a small server that is subscribed to by many other servers. This would be greatly reduced with relays, either by a (larger) server that exists only to subscribe to a few (smaller) servers (and then you can subscribe to the relay), or by making subscriptions nest (so that a subscription's subscriptions are also forwarded).
  • What if a malicious admin / community implement full text search and look for people to harass? They can be suspended just like any other misbehaving server.
@TheInventrix

Contributor

TheInventrix commented May 8, 2018

Maybe this could be implemented server-to-server by allowing a server to subscribe to another server?

I think this approach would potentially fit better with the way mastodon federates etc. Conceptually, one could implement it by a kind of "server follow", I think?

Elaborating on that idea, what I'm thinking is have an internal dummy account - one that doesn't have a public page or a home timeline or anything, so it would dump anything it receives that wouldn't go into the public timeline.

Then, when you add another instance to follow, that dummy account follows all the unlocked accounts on that instance, thus feeding all their public posts into your public federated timeline.

Admittedly, this would be significantly more complicated an idea to implement as-is - setting up internal support for a dummy account, creating a task to check the followed instance for new accounts, etc. - but maybe it'll give someone ideas for a simpler halfway point that can still use ActivityPub as a protocol.

@Gargron

Member

Gargron commented May 8, 2018

But consider: as a new server owner, you would have to go and deliberately find all the servers you would want to follow. Compare that to a "subscribe to relay" interface, where there's maybe a default relay address preset, with the ability to change it. You click it, and now you get content from all the servers the relay knows about. That's a lot more content and a lot less manual picking for the new admin.

@Cassolotl

Cassolotl commented May 8, 2018

I think if I were an admin I'd be willing to put in the work to curate a federated timeline that I want, but I can see that someone might want to subscribe to a broader thingmabob and also do less work!

@lambadalambda

lambadalambda commented May 8, 2018

We are planning to have something like this in Pleroma, but just as a way for one server to subscribe to all of the public posts of another instance. Those 'server-followed' servers would then show up as an additional 'meta-local' timeline to make it easier for smaller instances to find new people.

@Gargron

Member

Gargron commented May 8, 2018

That I would implement as an Application-type actor that is the domain itself, which would Announce all public posts. Don't think end-users should be able to view/subscribe to it from the UI though, it should be a behind the scenes thing
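A minimal sketch of what such a domain-level actor might send, assuming plain ActivityStreams 2.0 JSON. The `/actor` path, the `build_announce` helper, and the example URLs are hypothetical and not taken from Mastodon.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_announce(domain: str, status_uri: str, announce_id: str) -> dict:
    """Wrap a public status in an Announce sent by a domain-level actor."""
    actor = f"https://{domain}/actor"  # hypothetical actor URL for the domain
    return {
        "@context": AS_CONTEXT,
        "id": announce_id,
        "type": "Announce",
        "actor": actor,
        "object": status_uri,  # the original status is referenced, not copied
        "to": [AS_PUBLIC],
    }


activity = build_announce(
    "example.social",
    "https://example.social/users/alice/statuses/1",
    "https://example.social/actor#announces/1",
)
print(json.dumps(activity, indent=2))
```

Because the Announce only references the status by URI, a receiving server could still fetch the original from its home server instead of trusting the relayed copy.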

@TheInventrix

Contributor

TheInventrix commented May 8, 2018

That's a good point. But, as the admin of a very small instance, more content with less picking has cons as well as pros. In terms of convenience and rapidly growing your federation, it's solid benefits. But it's also hooking your instance straight up to the firehose, which for a small instance that just wants a kickstart to get hooked into the fediverse is way overkill and can totally flood your server.

I suppose you could deal with that by hooking up to the relay for just ten minutes or so, though...

edit: Another thought I just had: Telling a relay server about your instance is a really good way for people to find YOUR instance, as well as a good way to get information in, so that's another big check for the Pros column.

@TheInventrix

Contributor

TheInventrix commented May 8, 2018

@Cassolotl I think this is an entirely separate use case from having a curated federated timeline. If you really want control over it, then you'd just go find the instances and accounts you want and follow them yourself, you know? This is more for casting a net to broaden your federation than for curation.

@remram44

Contributor

remram44 commented May 8, 2018

I'm not sure something that difficult and integrated into the network is required. What about "subscription lists"?

A Mastodon server could subscribe to a list of instances that should show up in its "federated timeline" (and check periodically whether the list changes, I guess).

Note that the same feature could be implemented for lists of instances (or accounts) to silence. This is similar to the spamlists used in the email network.

@Gargron

Member

Gargron commented May 8, 2018

But it's also hooking your instance straight up to the firehose, which for a small instance that just wants a kickstart to get hooked into the fediverse is way overkill and can totally flood your server.

Subscribing would be an option, not automatic

But yes that's essentially the whole point of the endeavour: a global firehose. It's something some people want, but not others.

I'm not sure something that difficult and integrated into the network is required. What about "subscription lists"?

It doesn't strike me as difficult, in fact, it seems less difficult than subscription lists logic. A relay server is essentially like a stand-alone WebSub hub: subscribe/unsubscribe logic, publish logic, and redistribution logic. Faaairly straightforward.
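To illustrate how small those three pieces can be, here is an in-memory sketch. The `Relay` class and the `deliver` callback are hypothetical stand-ins; a real relay would need signed HTTP delivery, persistence, and retries.

```python
from typing import Callable

Deliver = Callable[[str, str], None]  # (inbox_url, payload) -> None


class Relay:
    def __init__(self, deliver: Deliver) -> None:
        self._subscribers: set[str] = set()
        self._deliver = deliver

    def subscribe(self, inbox_url: str) -> None:
        self._subscribers.add(inbox_url)

    def unsubscribe(self, inbox_url: str) -> None:
        self._subscribers.discard(inbox_url)

    def publish(self, sender_inbox: str, payload: str) -> int:
        """Forward payload to every subscriber except the sender; return the count."""
        targets = sorted(self._subscribers - {sender_inbox})
        for inbox in targets:
            self._deliver(inbox, payload)
        return len(targets)


# Usage: record deliveries in a list instead of doing real HTTP requests.
sent: list[tuple[str, str]] = []
relay = Relay(lambda inbox, payload: sent.append((inbox, payload)))
relay.subscribe("https://a.example/inbox")
relay.subscribe("https://b.example/inbox")
relay.publish("https://a.example/inbox", '{"type": "Note"}')
relay.unsubscribe("https://b.example/inbox")
relay.publish("https://a.example/inbox", '{"type": "Note"}')
```

The relay treats the payload as an opaque string, which matches the "just passing strings around" description at the top of the issue.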

@cwebber

cwebber commented May 8, 2018

I'm not sure it's incompatible with ActivityPub in any way in terms of vanilla ActivityPub, but it complexifies things maybe quite a bit when you consider the surface that must be considered when it comes to checking authentication.

Consider it this way: if posts can be relayed already when signed, in theory you could subscribe to any entity that starts sending posts to your inbox.

But, can they send it in such a way that they can also sign the message with http signatures? The object-level signature wouldn't be so complex, that can be maintained, but the http-level signature is another thing because the entity sending you the message isn't the one that authored it. Maybe this is a problem, maybe not.

But one way to get around that would be to wrap it in a Share, I suppose? Earlier there was some distinction suggested between Shares sent by Groups and Shares sent as a retweet but I don't remember what it was?

@trwnh

Contributor

trwnh commented May 9, 2018

I think in the server-follows-server model, there's still room for a "meta-server" option. I think there needs to be more consideration than just "global firehose" -- which instances are included in that relay? Is it opt-in or opt-out? Does a server admin have to further curate the relay to silence or suspend problematic instances?

In other words, how would a meta-server discover other servers? Is it a spider model? Chosen by an admin? Submitted by other admins? And so on.

@cheesegrits

cheesegrits commented May 9, 2018

I'd like to see a variation of this that relays hashtags. When it comes to discoverability, I think most users are more interested in subjects, rather than random people. Or rather, being able to easily see hashtag content fediverse wide would lead to the interesting (not quite so random) people. The whole point of hashtags is discoverability.

The way I would imagine it working would be that instances stream all local hashtagged toots to the relays it knows, and subscribes to those tags. Whenever anyone searches a hashtag, their instance subscribes to it with any relays it knows. Subscriptions expire after X days with no searches or toots. Maybe an admin option for "subscribe all", to open up the firehose.
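The expiry rule could be sketched roughly like this. The `TagSubscriptions` class and the 7-day value are hypothetical; the comment leaves "X days" open.

```python
from datetime import datetime, timedelta


class TagSubscriptions:
    def __init__(self, ttl: timedelta) -> None:
        self._ttl = ttl
        self._last_used: dict[str, datetime] = {}

    def touch(self, tag: str, now: datetime) -> None:
        """Record a search or local toot for the tag, restarting its timer."""
        self._last_used[tag.lower()] = now

    def expire(self, now: datetime) -> list[str]:
        """Drop tags idle for longer than the TTL and return them."""
        stale = sorted(
            tag for tag, seen in self._last_used.items() if now - seen > self._ttl
        )
        for tag in stale:
            del self._last_used[tag]
        return stale

    def active(self) -> list[str]:
        return sorted(self._last_used)


subs = TagSubscriptions(ttl=timedelta(days=7))  # assumed value for "X days"
start = datetime(2018, 5, 9)
subs.touch("Linux", start)
subs.touch("cats", start + timedelta(days=5))
expired = subs.expire(start + timedelta(days=8))
print(expired, subs.active())
```

An instance would presumably send an unsubscribe to its relays for every tag returned by `expire`.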

@mal0ki

mal0ki commented May 9, 2018

I feel this is relevant to our interests:

image

Also known as the "WHARRGARBL protocol"
https://mastodon.social/@theoutrider/99995120515620656

On a more serious note: I am glad that we're taking some time to discuss it 👍

@jaywink

jaywink commented May 10, 2018

Nice to see interest in the "social relay" concept here on the mastodon side too. It's a common problem, finding content, which is what the relay was designed to solve. At least until something better comes up. To clarify, I'm the author of the relay idea and the current servers running.

I think there is some confusion on what the relay is though. The (very simple) spec allows for a bit more than just being a firehose of all content to everyone. Basically it allows two things:

  • Sending to the relay. This does not require subscribing - just send and the relay will or will not relay it, depending on content and subscriber settings. There are some minimal checks done like verifying the signature in the post against the author public key, decreasing the possibility of spam to the network.

  • Subscribing to content. This does not require sending. Subscribing has three levels, "none" (not subscribed), "all" subscribed to everything (= firehose) and "tags" (subscribing to only certain tags). So, a server whose admin has decided "this server is about linux" can set a list of suitable hashtags that are scanned from the post - and only matching posts are delivered. In Diaspora, admins can also choose to let users influence this list of tags, but that is a server detail outside relay spec.
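The three subscription levels could be applied to an incoming post roughly as follows. The `should_deliver` function and the hashtag regex are illustrative assumptions, not the relay's actual code.

```python
import re

HASHTAG = re.compile(r"#(\w+)")


def should_deliver(scope: str, wanted_tags: set[str], post_text: str) -> bool:
    """Decide whether a post goes to a subscriber with the given scope."""
    if scope == "all":
        return True  # firehose: everything is delivered
    if scope == "tags":
        post_tags = {tag.lower() for tag in HASHTAG.findall(post_text)}
        return bool(post_tags & {tag.lower() for tag in wanted_tags})
    return False  # "none" or anything unrecognised


print(should_deliver("tags", {"linux"}, "New #Linux kernel released"))
```

Tag matching here is case-insensitive, which is a design choice of this sketch.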

Currently the stats on my relay list 153 subscribing servers, of which 68 subscribe only to a list of tags. The number of senders is unknown, since the relay doesn't track those separately.

Discovery of subscribers happens through a separate server list, currently the-federation.info. The relay polls the list every hour and then polls all the servers every hour, looking for a .well-known/x-social-relay document, which looks like this for my server. Any changes to this document on subscriber side will be picked up within an hour.
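A sketch of how a relay might interpret that document once fetched. The field names `subscribe`, `scope` and `tags` are an assumption about the document's shape, and `RelayPreferences` and `parse_preferences` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RelayPreferences:
    subscribe: bool = False
    scope: str = "none"
    tags: set[str] = field(default_factory=set)


def parse_preferences(document: str) -> RelayPreferences:
    """Parse a subscriber's relay document; fall back to 'not subscribed'."""
    try:
        data = json.loads(document)
    except json.JSONDecodeError:
        return RelayPreferences()
    if not isinstance(data, dict) or not data.get("subscribe"):
        return RelayPreferences()
    scope = data.get("scope")
    if scope not in ("all", "tags"):
        return RelayPreferences()
    raw_tags = data.get("tags")
    if scope != "tags" or not isinstance(raw_tags, list):
        raw_tags = []  # tags only matter for the "tags" scope
    return RelayPreferences(True, scope, {str(tag).lower() for tag in raw_tags})


prefs = parse_preferences('{"subscribe": true, "scope": "tags", "tags": ["Linux", "foss"]}')
print(prefs)
```

Treating malformed or missing documents as "not subscribed" keeps a broken server from accidentally receiving the firehose.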

The relay system is very much designed for a "server" based protocol. Even though ActivityPub doesn't have a concept of server, Mastodon does, so I don't see why a similar system would not work. If I had time I would add ActivityPub support to the relay software, simply because nothing prevents it.

The only thing ActivityPub doesn't have is signatures for direct delivery. Thus, only servers who agree to certain rules would be able to receive content via the relay server(s). This is one reason why I pushed for signatures in the SocialWG, to be able to do indirect or delayed delivery while being able to ensure the sender is who they are, like Diaspora allows. With vanilla AP this becomes impossible.

In addition to Diaspora, the relay system has support in Friendica, Hubzilla, Socialhome and GangGo - ie all platforms which use the Diaspora protocol. The relay idea and the relay software is very simple, the most work was coordinating on implementing it cross-platform.

Hope this clarifies some things, interested to see where the discussion goes. There are no big plans for the future of the relay system, but obviously if Mastodon decided to implement something even remotely similar, it would be nice to cooperate, maybe even move the discussion to the SocialCG, if that feels like it makes sense.

One thing that I never got to implement due to time issues was decentralizing the relay network itself. Currently servers decide which relay they send to, and since all relays use the same server list, they still get all the content independent of which relay it was delivered to. But it would be nicer if servers didn't need to decide which relay they would send to. If a relay goes down, they would just choose another one. Because the number of daily posts in the Diaspora protocol network is quite low, approx 2K-4K per day sent to the relay (about 50% of the total, I estimate), one relay has easily handled that amount. In the Mastodon world, a single relay would not make sense.

@annando

annando commented May 10, 2018

Friendica has got some "direct relay" mechanism in the current develop branch. We enhanced the social relay protocol according to this issue: https://github.com/jaywink/social-relay/issues/64

When a Friendica server has activated the "direct relay", it regularly checks all known servers for their relay subscriptions (tags or all content). When a new public item is created, the server then additionally sends the content to these servers according to their relay settings.

This avoids the need for a dedicated relay server and should work with AP as well, since the content is sent from the original server.

Concerning server discovery: Friendica has an API endpoint that exposes all known servers (as Mastodon does). So even a brand new server only needs to know one single other server to query it for a list of known servers (and so on). And since there are sites like http://the-federation.info we can even start with these sites as a source for some known servers.
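The "and so on" step is a breadth-first crawl. In this sketch, `discover_servers` and the `fetch_peers` callback are hypothetical; in practice `fetch_peers` would call a known-servers endpoint such as Mastodon's `/api/v1/instance/peers`.

```python
from collections import deque
from typing import Callable, Iterable


def discover_servers(
    seed: str,
    fetch_peers: Callable[[str], Iterable[str]],
    limit: int = 1000,
) -> list[str]:
    """Breadth-first crawl of known-server lists, starting from one seed."""
    seen = {seed}
    order = [seed]
    queue = deque([seed])
    while queue and len(order) < limit:
        current = queue.popleft()
        try:
            peers = fetch_peers(current)
        except Exception:
            continue  # unreachable or misbehaving servers are skipped
        for peer in peers:
            if peer not in seen and len(order) < limit:
                seen.add(peer)
                order.append(peer)
                queue.append(peer)
    return order


# Usage with an in-memory graph instead of real HTTP requests.
graph = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": [],
}
found = discover_servers("a.example", lambda host: graph[host])
print(found)
```

The `limit` parameter caps the crawl so a new server does not try to walk the entire network in one pass.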

@schmittlauch

schmittlauch commented Jun 26, 2018

One idea that retains decentralisation, at the cost of higher complexity, could be to implement a P2P network in the background based on a Distributed Hash Table (DHT). One could distribute the tags as keys over the DHT (= over the nodes in there), and instances could then selectively follow only the tags they choose.

A downside of this is that it nearly adds a second shadow network in the backend that is not based on ActivityPub, and that the nodes/instances responsible for popular tags get far more load.
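As a rough illustration of mapping tags to responsible nodes, here is a consistent-hashing ring. It is a simplification rather than a full DHT (no routing, replication, or churn handling), and `TagRing` is a hypothetical name.

```python
import hashlib
from bisect import bisect_right


def _position(key: str) -> int:
    """Map a string to a point on the ring using the first 8 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class TagRing:
    def __init__(self, nodes: list[str]) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._ring = sorted((_position(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]

    def node_for(self, tag: str) -> str:
        """Return the first node clockwise from the tag's position."""
        index = bisect_right(self._points, _position(tag.lower())) % len(self._ring)
        return self._ring[index][1]


nodes = ["a.example", "b.example", "c.example"]
ring = TagRing(nodes)
print(ring.node_for("linux"))
```

Every post with a given tag lands on the same node, which is exactly why popular tags would concentrate load there.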

@Gargron

Member

Gargron commented Jun 26, 2018

The thing is, the relay software could totally implement peering in an opaque way if it was needed. Being able to switch out which relay your server connects to would be enough for decentralization. Relays would be very simple software, preferably written in a high-performance language. They could have a completely independent, faster-paced development cycle. That all speaks in favour of having them as separate infrastructure instead of making Mastodon servers into relays.

@jaywink

jaywink commented Jun 27, 2018

@Gargron these were the exact design principles for the relay software I made for the Diaspora protocol. The idea was that it would be lightweight, not tied to any project that was invited to integrate with it (Diaspora, Friendica or Hubzilla) and could be easily decentralized by allowing platforms to choose which relay they want to interact with. I think it's a good way to go. I was also planning at some point to expand it towards ActivityPub, but it doesn't look like I'll have any time.

If you end up building something, it would be beneficial to sync on some design principles, for example regarding how subscribers are discovered, what kind of metadata is exposed, etc - basically the whole external API. This would allow different relay software stacks to operate in the same space. This was kind of the idea from the start with the Diaspora relay that my software would only be a POC - but it ended up remaining and still operates almost two million received/sent payloads later :)

@akihikodaki

Collaborator

akihikodaki commented Jun 28, 2018

I have two questions about its semantics rather than technical details:

Is there any clarification of what "relay" means? It may be problematic if it is to send all known objects, including remote objects, to another server. For example, you may want to subscribe to all objects on a server which has a particular topic you are interested in. However, if the server has a lot of remote follows, the server is able to relay all of them. But it is probably not what you want in this case.

I'm also wondering what your objective is in subscribing to a relay. Is it to fill the federated timeline, or to fill timelines of remote accounts which are not followed yet? If it is to fill the federated timeline it will work well, but it will also export the problem of the federated timeline to small instances; a federated timeline on a mature instance is often messy, including various topics, languages, etc. If it is to fill account timelines, then it may be better to make an HTTP request when necessary.

@jaywink

jaywink commented Jul 1, 2018

@akihikodaki if interested, there is a description of the diaspora relay here: https://github.com/jaywink/social-relay/blob/master/docs/relays.md

For that system, only locally created content should be sent to a relay.

@Perflyst

Perflyst commented Nov 10, 2018

Is there already a list of public federation relays? Also, setting up a relay is not very well documented. Can anyone please link some information about it?

Relays I found:
https://relay.joinmastodon.org/inbox (not approved yet)
https://relay.mastodon.host/inbox (enabled)
https://relay.mastodon.nl/inbox (not approved yet)
