Disable anonymous access to the streaming API #23989
Co-authored-by: Claire <claire.github-309c@sitedethib.com>
This breaks how https://relay.fedi.buzz gets posts. Currently, 1189 small instances are going to miss that content. |
Since the PR is missing any kind of context or justification it feels a bit like small instances are going to have a hard time participating in their communities when fedi.buzz gets shut down. |
Just chiming in to say that fedi.buzz has been a very useful tool in getting more content, and losing it would be pretty bad. I understand that not every instance wants to have their streaming endpoints public, but authorized fetch already solves this and makes the streaming endpoints token only. |
Do we deserve an explanation for this change? |
My server requires relay.fedi.buzz to gather posts from outside our immediate federation. Small instances just cannot obtain the reach that mastodon.social has @ClearlyClaire. It's how totally bot-driven instances like press.coop maximize their instance use. They have literally 0 correspondence with their users. Please think of how federation works, and particularly the GAPS in how federation works, and how this isn't making federation better, just cleaning out the Riff-raff. |
The explanation for why this was removed is due to security and scraping — previously having the streaming API open meant that data was going to unknown locations, not just to users. For instance, someone could have written a service to help evade bans and we'd have no way to restrict them from accessing data. All of the timeline APIs required authentication to consume, see the documentation, streaming was the only unauthenticated way to retrieve data from an instance. Potentially what we could add to streaming is support for:
Currently that's not available / not a tested path, as authentication has only been for users in streaming historically. |
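As a rough illustration of what consuming the stream with a token looks like from the client side, here is a minimal sketch. It assumes the `/api/v1/streaming/public` Server-Sent Events endpoint from the Mastodon documentation and a user access token sent as a Bearer header; `mastodon.example` and the token value are placeholders, and the helper names are made up for this example. It only builds the request and parses event-stream lines, without opening a connection.

```python
from urllib.request import Request


def build_stream_request(base_url, token):
    """Build (but do not send) an authenticated request for the public stream.

    Assumption: the server exposes /api/v1/streaming/public and accepts
    an OAuth access token in the Authorization header.
    """
    return Request(
        base_url.rstrip("/") + "/api/v1/streaming/public",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "text/event-stream",
        },
    )


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of event-stream lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment line, typically a keep-alive
        if line == "":
            # A blank line ends the current event.
            if data:
                yield (event or "message", "\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
```

A caller would pass the request to `urllib.request.urlopen` and feed the decoded response lines into `parse_sse`; without a valid token the server is expected to reject the request once anonymous access is disabled.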
Oh no this will kill federation for my small instance 😱 |
This will have an enormous impact on the large number of small or single-user instances that rely on this to get around federation issues. I feel that Mastodon should be doing more to improve the experience of small servers rather than introducing changes that disproportionately impact them. |
Chiming in here as I understand its current implementation will break relay services. That will really hamper not just single instance deployments, but also make bootstrapping new public instances phenomenally more difficult without a means of discovery. I get the goal here, but the unintended impact will create fediverse islands. Perhaps put this behind an option? Even if enabled by default, a toggle would solve this for us little guys. |
Not true? Public (local/federated) and hashtag timelines do not require authentication, unless instance admins opted out of public preview. |
Any thoughts on how this will affect instances that rely on relays to stay connected to the larger world? It would seem like this is going to hamper that and significantly push us to a much more centralized configuration of servers that are really seeing the bulk of the global timeline. |
Why is the description for this pull request blank? And why is there no official explanation for why this is being added? I feel this is disingenuous to the larger community as a whole, not knowing why going forward federation will be more difficult. |
This streaming API is not required to federate. It is only for "streaming", not "scraping". And there is a native relay feature in the admin panel for that. Ex) |
I would like to chime in here too... The main way I got any followers on my small instances was to look at the Federated feed, and I even advise my new users (of which there are admittedly few) to look at the Federated feed. Is there a particular / specific example of how this API being exposed to the public, with the ability to be scraped, causes a problem? There are absolutely no notes here on the specific security issues that are being solved. Was there a vulnerability? Are the developers looking to create a more centralized architecture? If this will not break federating, can someone explain why? So far no one has addressed the big question of WHY this is a good idea, other than a vague blurb about security. I don't think it's unreasonable to talk in more detail about what this patch will fix, problems it might create, and how we might mitigate those problems. I feel like the author has probably already done that; well, we'd like to know too. I can see that @ClearlyClaire and @Gargron have worked really hard on fixing issues and making things better, and for that I praise them, but a little explanation goes a long way. |
I'm sorry, I have not been a developer in a long time, but I was once. Can you talk more about why this won't be a problem? I have no issue with this being a lot of hubbub about absolutely nothing, but I would like to know more about how this won't break anything. |
There is the native relay feature that is powered by this relay software: which "Announce"s all posts from/to ActivityPub servers, not using the Streaming API, just using ActivityPub. Actually, I didn't realize until now that there is another relay which uses some hacky way to scrape the streaming API... |
How in the world did this get merged in the first place with no description or issue associated with it? I admit I haven't participated in development on mastodon, but surely this is not how development on such a large project should work. Anyway, I'm just chiming in to say that this seems like a terrible change that will disproportionately impact smaller servers. Sure, it won't break "federation" in a technical sense, but now they'll just be cut off. relay.fedi.buzz is crucial, and this will break it. Please at least put this behind an option and make it disabled by default so that you don't break existing servers that rely on it. |
This hacky way is preferred for a lot of single or small instances because we can keep the storage use lower by targeting content and curate our own instances' federation feed by independently choosing to subscribe to particular tags or instances. Sure, we can join a huge relay, but it's not desired if we want to make the federation timeline more relevant to our specific users. Unless there's a built in feature to replace this, the PR will just remove our access or force us into a large relay, both aren't great. And like someone else said above, if an instance is public, the users' public posts are publicly accessible through the web interface so how is this improving security against a banned user? |
Which one, not idly?
Indeed, as I've brought up a couple of times.
I am, and they're breaking it, that's the whole reason I'm here. xD (inb4 'not a real relay')
I can tell you that from use of one as a small server admin, it does solve the problem. You get the narrow slices of Federation that you actually want and your hardware can handle this way.
And I'd be happy to use that 'real' relay. The problem is "when it exists." Which it does not. Yet. |
Isn’t this a permissions problem rather than streaming? Like if you’re posting to Public doesn’t that mean public? Sounds like Unlisted would be the valid permission for these kinds of users, and if the streaming API isn’t doing its job filtering those out, that’s a different issue. Especially if someone is using those #hashtags, I would like to think their goal is a very wide, public conversation, which is precisely what |
https://relay.universeodon.com
I meant a real relay, as per Mastodon terminology and feature set.
Then someone can create it, or contribute to relay.fedi.buzz and add this feature. There is no code change needed on the Mastodon side. |
I feel like I tried to set up https://relay.universeodon.com/ months ago and it never accepted the request. That's fine, I can give it another go. As to the rest, I'm not saying "Mastodon has to do Y," though I will be happy to use it should it be provided in future. I'm saying, "Mastodon shouldn't break the third-party tool that already lets us do Y," particularly if Mastodon chooses to say, "This isn't our problem, go make a third-party tool," given that people already did that, and Mastodon is now intentionally breaking it. (Still a little sore about that. See also third-party moderation tools and clients on Reddit.) |
TL;DR: One site that provides a relay in a very janky way will have an issue if this change persists and nothing changes. This shouldn't be a big deal; people should go contact that dev / admin team to find out if they are going to work around this issue. Yes it helps, but it isn't the only relay out there. We should all just take a moment and chill. On the other side of that, come on devs, please put some notes on why things are happening to avoid this? It should have just said on the pull request that these changes were done for [xyz] and a lot of hassle would have been avoided. I had not seen that relay.fedi.buzz site before, but I was able to populate my site with several (lots) of other relays. I was even unaware that if you go to the URL without the /inbox you can find information about how to join -- which is probably my bad for not reading carefully enough. I understand that relay.fedi.buzz isn't a 'real' relay because relays have specific definitions for the Fediverse, and that makes perfect sense to me. To bring a bit of perspective, there are many here who seem to be under the impression that relay.fedi.buzz is not needed because relay functionality is already built in to Mastodon. I get this, but I decided to run an experiment. I went to relay.fedi.buzz and I decided to take some hashtags that I'm really into (teaching, ttrpg, math, and similar). What I found was that this was a HUGE help in finding new content to participate in. My feed was lit up with those specific tags from lots of different people. I then removed those relays, to see what would happen. My feed's content for those immediately dropped! Am I saying it's critical to keep the streaming API the way it was? No, just because relay.fedi.buzz uses the streaming API now doesn't mean it couldn't use the web API to scrape what's there and distribute it. 
As I understand it, relay.fedi.buzz allows new discoverable posts to be had, but once we interact with someone the ActivityPub protocol tells the instances to communicate directly? My understanding is that if someone on mastodon.social (or another big one) follows me and then I post something, since my Mastodon knows that they've followed me, it'll be sent over to them on mastodon.social (or whatever instance). So, that shouldn't take a time-hit (if I'm correct) from not being a live feed. Thus it should be perfectly fine for relay.fedi.buzz to scrape public data from the HTTP API at an interval and distribute it. Please, correct me if I'm wrong? Many are also forgetting that relay.fedi.buzz has its own admin / developer team, and if Mastodon's streaming server is crashing on large instances because of the streaming code not being as efficient as it needs to be, that needs to get addressed and is far more important to the growth of the Fediverse than one kind of kludgy system. It kind of sounds like this is much ado about nothing. Yeah, there's some uncertainty, but rather than get upset at Mastodon for doing something that really only impacts one kinda janky "relay", one might want to contact the admin of that site and find out how they are going to handle it. I do also think that there should have been some kind of comment on the pull request or its approval which says why it's happening. A lot of this super extended conversation would have been saved if someone had just put the reasons. I also find it a little odd that one of the early comments said this was a security issue, but when pressed on it, it turns out it's a stability and optimization issue? Yeah, some reasoning would have been nice and avoided a lot of this back and forth and people freaking out. I also would like to suggest that admins of small instances look into this site (if they haven't already) which has a list of traditional relays. |
As one of many users of small fediverse instances I implore you to reverse this change. |
Might be making I agree this PR at this moment. But, other ways can be suggested. |
I'm pretty sure you're going to push this through at this point no matter what some of us have to say, but I want to be very explicit on something: You're breaking up existing communities with this, and they have no idea it's coming. I've spent four hours tonight (not including dinner break) finding, following, and talking with everybody I can still find who has been involved in Monsterdon, using a second account I set up last year to mass-follow people in order to help boost our Federated feed. I trolled through thousands of posts to find them - that's after filtering down to the hashtag, of course - and added as many as I could. Despite all this, polls on Mastodon tell us that most people who are into Monsterdon don't generally post. They read and watch, but don't talk. Hell, some of them don't even watch the movie, they just like following along with the chaos of the commentary. Those people are just gone. Our site won't see them if and when they do decide to start talking, or talking again if they talked further back than we happen to have retained in our cache here. And Monsterdon is - or was, before this - a growing event, but we probably won't see new people who show up. We might see some of them, but not most. (Assuming fedi.buzz doesn't implement a new solution, of course. This entire long comment assumes that.) If someone not me hadn't figured out what this change was going to do and sent up a little red flag, I myself wouldn't've known this was coming. Most of the community, a community I personally have helped grow, that I have used to draw people into trying Mastodon ... well, from our point of view, they would've just vanished. Essentially overnight. And we'd've had to work pretty hard to figure out why and try to pick up the pieces. Fortunately, the founder and I follow each other, so I wouldn't've been completely lost, but that's not true of most. And sure, it's just a dumb little monster movie watching event, who cares amirite. 
A bunch of people who have come together to have fun on the internet in a time when that's not easy. Some of us have become Monsterdon friends, even. But how likely is it that we're the only little community built this way? I mean, okay, maybe we are the only one, I don't know. It strikes me as improbable, given how reliant newer small instance admins have been on fedi.buzz over the last year, so I'm pretty confident there are other communities like ours as well. I don't actually know. But I do know this: you have no idea, because you didn't even look. As far as I can tell, you didn't talk to small instances about this at all. And when someone came to you a month ago saying this might be a problem, your response was basically "eh" and doing it anyway. How many other small instances like mine still don't know what's about to happen? How many other little communities are going to find themselves suddenly scattered without explanation? You're not telling anyone this is happening, after all. We had to figure it out on our own, and I didn't do that - someone else did. I just got lucky and saw it. How many instances, how many users are going to see their Home feed just wither, have no idea why, and go, "well, this sucks, I guess everyone's bailing" and walk? I don't know. So given that you're pretty bound and determined to do this, what I'm suggesting is:
This is how you deprecate an API when you know doing so will break functionality people actually use. There needs to be clear communication, they need to know meaningfully before, and they need to know how either to mitigate the issue or migrate to replacement tools. In the corporate environment, that sometimes means years of warning, and at least months, but at this point I'd take a decent number of weeks (>5). Unfortunately, so far, you've given no (0) notice. Absolutely none of this has happened, and there's not one single IT team in the world who would take that from a vendor and go, "eh, that's okay. We can be down for a month while we sort this out." But that's what you're doing to one degree or another to the >1100 small instances using this tool, and who even knows how many small communities. Maybe few. Maybe a lot. Everything else - and there's a lot else - aside, that's just terrible IT practice. |
I am not a member of the Mastodon team and never have been, but to be clear:
This change was made in March. It is not coming; everyone who has deployed the recent security patches has it. I don't believe that anyone involved was aware of the existence of the fedi.buzz relay; if they were, they were unaware of its implementation details. It may be popular in your corners of the Fediverse, but it is not that widely used. The fact that someone was abusing that API for good is not a good reason to leave the user safety issues unaddressed. The correct way forward is for fedi.buzz to implement content ingestion, as intended, as a relay, with appropriate controls so that admins can control their participation in it (e.g. only allow propagation of posts with hashtags in), and to reach out to admins to participate. As it should have done from the beginning. As a fellow developer, it is not believable that the person who implemented relay.fedi.buzz could have implemented it in the way that they did without realising that they were being "clever" in their use of the Mastodon API, and using it in a way never intended. |
Again, not true. See https://docs.joinmastodon.org/methods/streaming/#public or https://docs.joinmastodon.org/methods/streaming/#websocket. Many SSE streaming endpoints and the websocket endpoint do not require auth ( |
I have updated the PR's description with the reason behind the change, and a note on
It was merged to |
Thanks for clarifying your motivations in the PR, but:
As for
|
Due to how that flag is stored, doing so would introduce a lot of complexity in the streaming server implementation.
Considering the limited resources of the team and how much of a headache the streaming server is (see the multiple fixes in multiple patch releases), I think code simplification is definitely worthwhile here. |
I am just another admin of a small server, and for me too, fedi.buzz is the heartbeat that makes our server interesting for users. I can see that this has been an unofficial hack that should be prevented. But it would be extremely important for mastodon (the idea, the network) to create a solution here. I feel increasingly forced to advise newcomers to use mastodon.social if they want to see what is "really" going on, isn't that sad? |
Repeatedly framing this as FediBuzz using the Streaming API in an "unintended" way really rubs me the wrong way. The API documentation is very clear on this being a public endpoint. All of these "caveats" that have been mentioned in this thread are missing from the API documentation. |
While this may have been an "unintended" use of the API, it is impossible to predict how everybody is going to use your API. People will use your API in clever, unexpected ways, because that's just how people work, and that is why you need to be very careful when making API-breaking changes. It was being used by a lot of people to fill a valid need, and now it's being removed, without any workable replacement in place and without any consideration for all of the people it will hurt. I understand there are valid reasons for making this change, but you can minimize the damage it will cause by:
Please consider doing the above to minimize how much legitimate users are impacted by this. |
Instead of brigading here, has anyone asked FediBuzz about fixing their relay? Or why they didn't notice that this change happened for many of the largest servers months ago? |
They try to work around this change by asking people to donate api keys -> https://fedi.buzz/token/donate |
They did not consider this usage to be "broken", and the uproar is because there is currently no viable alternative for what they're doing (other than scraping the web interface feed, which is much worse in every way, hence complaints that this will hurt legitimate users more than bad actors).
This change was merged into main but has not actually been in a patch release. Most large servers are not using nightly builds from main (at least, I would hope not!). |
There is an alternative, it's behaving as an actual relay, processing the ActivityPub activities that are sent to it by all the subscribed servers, which would also let them forward post deletions and edits (which, if my understanding is correct, |
I mean, it was already in a very questionable gray area. There's a defined consent-based relay system they could've used to build their functionality, and they chose not to use it so they didn't have to seek the consent of the servers they were scraping. |
Framing this as "they should've just used the consent-based relay system in the first place" is incredibly disingenuous, and I'm not sure if that's coming from a lack of understanding or somewhere else. You don't need consent to use an unauthenticated, public API; clearly, that's why it exists. (*yes, the developers have since made it clear that this was not intended, but that had not been previously communicated to anybody and entire communities have been built that rely on it) Large servers often do not participate in traditional relays because they simply do not perceive any benefit in receiving posts from smaller instances, and that's not going to change. On the other hand, getting a feed of focused content from larger servers is incredibly useful for smaller ones, and this is going to break that. Saying "well, they should just join a traditional relay with the larger servers!" is nice in theory, but in the real world that simply isn't going to happen because those large servers are never going to join a relay. What about any larger servers who want to make their posts accessible to smaller ones? They cannot conceivably join every relay out there; there are too many out there to keep up with all of them. It's much easier to just let smaller servers pull what they need, which they can do now, but will be broken in 4.2.0. That's why I'm saying that there needs to be an API specifically for this use case before this one is broken. Let servers who really want to isolate themselves from the rest of the fediverse disable it, but "Just become a traditional relay" is not a viable replacement for the real scenarios in which this is being used. |
This is not necessarily true, multiple large servers participate in relays, and I think many of them would have no issue joining such a relay.
Not denying that.
That is a good point.
I think it is. If what has been stated earlier is true, it already has at least 1189 participants that are currently sending them all their public posts. And I don't think large servers would mind joining it if it respected consent and relayed deletes and edits. |
@minneyar It's not disingenuous, it's the whole point. The consent-based relay system existed, and they deliberately went "I don't want to have to get permission, so I'm going to get the data another way". Now that the hole is being plugged, the apparent solution is "mob the dev team to undo a performance and security fix that's been in place for months". @renchap noted that .online participates in a relay, and .social used to and may have dropped it for performance reasons. People seem to have this idea there's a conspiracy against small servers, and use it to justify outrageous positions. A far better question than "will Mastodon revert an important security and performance fix?" would be "will Mastodon ensure its servers are available through some relays to help smaller servers?" because my guess would be the latter is drastically more likely to be a yes. |
Small update on what I said: .social is now back on the universeodon relay, this was a technical glitch.
Many large instances run on nightlies, especially in the past few months, due to some features that were needed for spam and malicious activity fighting. This includes mastodon.online and mastodon.social, as we use those instances to test the new code in the wild while having a way to check for performance and other issues (in addition to gathering very useful user feedback on UI changes) |
There's no technical impediment to relay-to-relay connectivity; the protocol should 'just work'. So theoretically there's no reason that fedi.buzz or another implementation of the same concept couldn't use other relays as a source (though it may need to deduplicate messages) To make their posts more visible, a large server just needs to connect to a relay and publish their posts. If you want to encourage this, offer people finer-grained controls such as not receiving messages from the relay, or only exporting messages with hash tags |
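The deduplication mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: that the same post may arrive from several upstream relays wrapped in different activities, that the wrapped object's `id` (or bare URL) identifies the post, and that a bounded in-memory cache is acceptable. `SeenActivities` and `dedup_key` are hypothetical names, not part of any existing relay.

```python
from collections import OrderedDict


def dedup_key(activity):
    """Return an identifier for the post an activity refers to, if any.

    Assumption: an Announce carries the object as a URL string, while a
    Create embeds the object as a dict with an "id" field.
    """
    obj = activity.get("object")
    if isinstance(obj, str):
        return obj
    if isinstance(obj, dict):
        return obj.get("id")
    return None


class SeenActivities:
    """Bounded record of post identifiers that were already forwarded."""

    def __init__(self, capacity=10_000):
        self._capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, key):
        """Return True the first time a key is seen, False on repeats."""
        if key in self._seen:
            self._seen.move_to_end(key)  # keep recently repeated keys longer
            return False
        self._seen[key] = None
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the oldest entry
        return True
```

A relay using this would forward an incoming activity only when `first_time(dedup_key(activity))` returns True. The bounded cache trades a small chance of re-forwarding very old posts for predictable memory use.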
This would be a nice-to-have but this just brings us back to "potential future features", destroying what is currently there, and not replacing it with something that works until the small servers relying on the current feature are long gone. relay.fedi.buzz has lots of ideas to work around this restriction, which are unfortunately going to be considered 'gray areas' (@defnull, we've talked more on Mastodon about this) because relays just aren't versatile or customizable enough in a way that instance owners can see to trust in them. If you're just adding relays willy nilly as a large instance owner, you're not doing it right. |
Hmm, for a lot of people, it's not even nice in theory because many (even most?) small Mastodon instances simply cannot handle the full flood of, say, mastodon.social. There are people running instances on Raspberries, ffs, and it works. (We might could, I overbuilt this machine intentionally. Probably gonna have to find out.) Reducing that unmanageable flood down to manageable chunks is the critical asset of relay.fedi.buzz. By offering small, relevant tagged slices from a much, much larger data set, even very small instances can participate properly in Federation. It solved a hard problem, and it solved it - for us - pretty well. Which is why, @ocdtrekkie, the existing relay system did not and does not meet the needs of small instances, even if large servers joined them. A third party solution appeared that did meet our needs; we flocked to it; this will break it. We didn't come here to "mob the dev team," we came here to say, "You are breaking critical tools with this change in ways you knew would happen and didn't even tell us." That last part has belatedly been improved a bit, thank you Claire. But it took me coming here and arguing with people for a day to make it happen. It wasn't going to. It was just going to fail. In sum: Existing relay functionality did and does not meet the needs of small instances, particularly not very small ones. This third-party tool did meet our needs, and made our instances' ability to participate in federation much, much better. Communities formed around the functionality this tool enabled; breaking it will break our social functionality, and scatter those communities. As soon as we found out about it, we came to the dev team, as I would think we should. We don't need to keep using this particular tool, but we need something that provides the same basic functionality in order to continue to function well. 
Accordingly, we need the existing tool not to break until a replacement is in place, and we are given a short but reasonable time to switch over to the new tool. In every IT environment I've ever worked or worked with, this would be considered an entirely reasonable ask - one that could even be assumed. And that's all we're asking for here. |
@solarbirdy you do know that a relay doesn't have to wholesale forward absolutely every message it gets to all of its subscribers, right? It can be more intelligent and forward based on hashtags or other logic (or as I've recently discovered, it could even do no forwarding action at all, and instead process things). Also, for what it's worth, I saw your original comment thanks to email notifications, and I don't know if you realise this, but a significant majority of the work in attempting to improve the streaming server has been unpaid labour; my unpaid labour. I reached out to FediBuzz the moment I learned that changes in streaming would break them, just as I reached out to IceCubes regarding another issue that was causing load issues with streaming. Even with this breakage, I'd still advocate for this change as it improves reliability & security for all users, whilst being a regression for only some; we can hopefully find a way forwards that solves the use case you have for BuzzRelay if they're not interested in becoming a proper relay. |
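To make the hashtag-based forwarding idea concrete, here is a minimal sketch of how a relay might pick recipients for an incoming activity. It assumes activities embed their object with a `tag` array containing entries of type `Hashtag` with a `name` such as `#monsterdon` (as Mastodon's posts do), and that each subscriber has registered a set of lowercase tag names. The function names and the `subscriptions` mapping are hypothetical.

```python
def activity_hashtags(activity):
    """Collect lowercase hashtag names from an activity's embedded object."""
    obj = activity.get("object")
    if not isinstance(obj, dict):
        return set()  # e.g. an Announce that only carries the object's URL
    tags = obj.get("tag", [])
    if isinstance(tags, dict):
        tags = [tags]  # a single tag may appear without the enclosing list
    found = set()
    for tag in tags:
        if isinstance(tag, dict) and tag.get("type") == "Hashtag":
            name = str(tag.get("name", "")).lstrip("#").lower()
            if name:
                found.add(name)
    return found


def select_subscribers(activity, subscriptions):
    """Return the inbox URLs whose wanted tags overlap the activity's tags.

    `subscriptions` maps an inbox URL to the set of lowercase tag names
    that subscriber asked for (a hypothetical structure for this sketch).
    """
    found = activity_hashtags(activity)
    return sorted(inbox for inbox, wanted in subscriptions.items() if wanted & found)
```

With this shape, a small instance subscribed only to `monsterdon` would receive posts carrying that tag and nothing else, which is the narrow-slice behaviour discussed earlier in the thread.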
This disables anonymous access to the streaming API to simplify the code and disable an endpoint that we were not aware of any legitimate use of when unauthenticated.
Mastodon itself has only briefly used logged-out streaming and stopped doing so with v2.8.0, released in April 2019. Public timelines were and are still accessible when the server is configured to expose them to logged-out users; they just do not update live, and they weren't doing so either before this PR got merged.
Furthermore, there existed discrepancies in access control between the streaming server and the public timelines, and fixing them would have required making the streaming server code—an already complex and difficult to maintain piece of code—significantly more complex.
We have since been made aware that this breaks relay.fedi.buzz, a third-party service that makes unintended use of the streaming server to ingest posts to relay from servers, including those that do not willfully participate in the relay. We believe such a service can and should work by using the existing relay mechanism for ingesting posts, but we will remain open to considering other options.