-
-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add follower synchronization mechanism #14510
Add follower synchronization mechanism #14510
Conversation
d12297b
to
9337553
Compare
This seems to assume that a complete list of who I'm following is public information, whereas it might only be a partial subset? This could be a privacy problem? Unless it's a hash only for that domain. That could work, though it makes me unhappy because I don't like the assumption of ActivityPub being a domain-oriented system, which I think will create problems long term... |
The I understand your concern, but it is precisely in the context of domain-oriented systems where the issue arises. |
In general I think that a "do I really have the correct state" synchronization system is fine and welcome; I wonder if it can be more general though than just related to followers. The hash part at least is elegant and a good idea. For that component, maybe even better would be to use a bloom filter, so that if a mismatching result is found, one could at least query to see if certain elements are part of the set. I do wonder if there's a more generalized pattern; I've thought for a while that there's something to the Unum pattern that could be followed. |
Yes, I realize I'm pushing more abstract patterns for people who are doing more practical "AP as deployed" approaches (a legitimate criticism). The domain approach could be generalized though: instead of domain, you could have "node"; conceptually, each Actor could list on its profile the "Node" actor that it's associated with corresponding to its hub in the hub-and-spoke model. This then could be the actual node that you would query to see whether or not your followers were in that set. The main problem I guess with that approach is to make it secure you need to have each Node/Hub sign that the actor really does belong to that node/hub. |
928456d
to
0ed5658
Compare
I have thought of that. It might be a good idea. The workflow with this would be, if there's a mismatch, use the provided bloom filter to prune local followers, re-compute the local bloom filter, and if it still doesn't match, perform the HTTP query. This would spare an HTTP query in many (most?) cases, but I am unsure it is worth the computational or code complexity (I'll have a look).
I will have a look.
I'm afraid a lot of security assumptions are already made across most if not all ActivityPub software around the fact that two URLs behind a same domain are managed by the same logical entity. Introducing a concept of managing “Node” might be too late. |
e219597
to
4ea250f
Compare
Alright, I think I have roughly finished a first draft of that feature. Some notes:
|
9cc5e39
to
fc44419
Compare
Re-using existing ActivityStreams vocabulary:
|
Sounds like a good proposal to me, few small things though:
Also why not use authenticated fetching (HTTP Signatures) so that the |
It's the domain part of the handle, so it's the webfinger thing. I have to think a bit more whether that would cause issues with software that use the AP hostname instead. My hunch is that it wouldn't cause issues as long as:
I think something like Last-modified would be kinda error-prone, plus you have to store it and can never re-calculate it. About SHA256, I guess that would be solved by using https://w3c-ccg.github.io/security-vocab/#Digest as I suggested in my last post.
Depending on the user's settings, the followers endpoint can currently list:
Having a third option (only the followers from the requesting domain) tied to HTTP Signatures does not make a lot of sense (you may want to know all the followers). Plus, currently, Mastodon figures out whether a remote user wants to show their followers by checking if they can access that collection's first page. That assumption would be broken by having that third option. Furthermore, periodic profile checking in Mastodon is actually not done for remote accounts which are followed: indeed, Mastodon relies on receiving In my implementation, I found that using a separate endpoint was simply easier. |
1c4c318
to
b4d88bd
Compare
def process_collection_synchronization | ||
raw_params = request.headers['X-AS-Collection-Synchronization'] | ||
return if raw_params.blank? | ||
return if ENV['DISABLE_FOLLOWERS_SYNCHRONIZATION'] == 'true' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd unite this with the line above. Maybe extract to disable_followers_synchronization?
method for easier testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put it on one line, not sure about a different method.
|
||
ActivityPub::PrepareFollowersSynchronizationService.new.call(signed_request_account, params) | ||
rescue Parslet::ParseFailed | ||
Rails.logger.warn 'Error parsing collection synchronization header' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick but I would maybe write "X-AS-Collection-Synchronization" here to be more specific in the error log. Though I also don't remember how we treat errors from "bad input" in other places. Maybe you wouldn't want to fill the warn log level with stuff like this..?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general, Mastodon doesn't print enough stuff in logs, but maybe that should be debug
loglevel rather than warn
. Updated the text itself.
app/services/activitypub/prepare_followers_synchronization_service.rb
Outdated
Show resolved
Hide resolved
@account = Account.find_by(id: account_id) | ||
return true if @account.nil? | ||
|
||
ActivityPub::SynchronizeFollowersService.new.call(@account, url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be compacted to one line by doing Account.find
and rescuing the error like we do in all other workers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pattern I do not like that much as it could silently hide unexpected ActiveRecord::RecordNotFound
exceptions.
a7f31e0
to
1a2330f
Compare
…quests If the remote lists a local follower which we only know has sent a follow request, consider the follow request as accepted instead of sending an Undo.
1a2330f
to
b3d39ef
Compare
- rename X-AS-Collection-Synchronization to Collection-Synchronization - various minor refactoring and code style changes
b3d39ef
to
b0cbb39
Compare
6d5d249
to
cde156c
Compare
f727639
to
262f7f3
Compare
Makes it much easier to be memory-efficient, and avoid sorting discrepancy issues.
262f7f3
to
813ca3f
Compare
6103ed5
to
6df1c31
Compare
* Add support for Gemini urls (mastodon#15013) This PR updates the `valid_url` regex and sanitizer allowlist to provide support for Gemini urls. Closes mastodon#14991 * Removed disabling comments for Style/MethodMissingSuper (mastodon#15014) * Removed disabling comments for Style/MethodMissingSuper * Update rubocop for codeclimate * Add follower synchronization mechanism (mastodon#14510) * Add support for followers synchronization on the receiving end Check the `collectionSynchronization` attribute on `Create` and `Announce` activities and synchronize followers from provided collection if possible. * Add tests for followers synchronization on the receiving end * Add support for follower synchronization on the sender's end * Add tests for the sending end * Switch from AS attributes to HTTP header Replace the custom `collectionSynchronization` ActivityStreams attribute by an HTTP header (`X-AS-Collection-Synchronization`) with the same syntax as the `Signature` header and the following fields: - `collectionId` to specify which collection to synchronize - `digest` for the SHA256 hex-digest of the list of followers known on the receiving instance (where “receiving instance” is determined by accounts sharing the same host name for their ActivityPub actor `id`) - `url` of a collection that should be fetched by the instance actor Internally, move away from the webfinger-based `domain` attribute and use account `uri` prefix to group accounts. * Add environment variable to disable followers synchronization Since the whole mechanism relies on some new preconditions that, in some extremely rare cases, might not be met, add an environment variable (DISABLE_FOLLOWERS_SYNCHRONIZATION) to disable the mechanism altogether and avoid followers being incorrectly removed. The current conditions are: 1. all managed accounts' actor `id` and inbox URL have the same URI scheme and netloc. 2. all accounts whose actor `id` or inbox URL share the same URI scheme and netloc as a managed account must be managed by the same Mastodon instance as well. As far as Mastodon is concerned, breaking those preconditions require extensive configuration changes in the reverse proxy and might also cause other issues. Therefore, this environment variable provides a way out for people with highly unusual configurations, and can be safely ignored for the overwhelming majority of Mastodon administrators. * Only set follower synchronization header on non-public statuses This is to avoid unnecessary computations and allow Follow-related activities to be handled by the usual codepath instead of going through the synchronization mechanism (otherwise, any Follow/Undo/Accept activity would trigger the synchronization mechanism even if processing the activity itself would be enough to re-introduce synchronization) * Change how ActivityPub::SynchronizeFollowersService handles follow requests If the remote lists a local follower which we only know has sent a follow request, consider the follow request as accepted instead of sending an Undo. * Integrate review feeback - rename X-AS-Collection-Synchronization to Collection-Synchronization - various minor refactoring and code style changes * Only select required fields when computing followers_hash * Use actor URI rather than webfinger domain in synchronization endpoint * Change hash computation to be a XOR of individual hashes Makes it much easier to be memory-efficient, and avoid sorting discrepancy issues. * Marginally improve followers_hash computation speed * Further improve hash computation performances by using pluck_each * helm: bump version to 3.2.1 (mastodon#15019) * Bump @testing-library/react from 11.0.4 to 11.1.0 (mastodon#14992) Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 11.0.4 to 11.1.0. - [Release notes](https://github.com/testing-library/react-testing-library/releases) - [Changelog](https://github.com/testing-library/react-testing-library/blob/master/CHANGELOG.md) - [Commits](testing-library/react-testing-library@v11.0.4...v11.1.0) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/core from 7.11.6 to 7.12.3 (mastodon#14993) Bumps [@babel/core](https://github.com/babel/babel/tree/HEAD/packages/babel-core) from 7.11.6 to 7.12.3. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.3/packages/babel-core) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/runtime from 7.11.2 to 7.12.1 (mastodon#14994) Bumps [@babel/runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-runtime) from 7.11.2 to 7.12.1. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.1/packages/babel-runtime) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump webmock from 3.9.1 to 3.9.3 (mastodon#14996) Bumps [webmock](https://github.com/bblimke/webmock) from 3.9.1 to 3.9.3. - [Release notes](https://github.com/bblimke/webmock/releases) - [Changelog](https://github.com/bblimke/webmock/blob/master/CHANGELOG.md) - [Commits](bblimke/webmock@v3.9.1...v3.9.3) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/plugin-transform-react-inline-elements from 7.10.4 to 7.12.1 (mastodon#14998) Bumps [@babel/plugin-transform-react-inline-elements](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-transform-react-inline-elements) from 7.10.4 to 7.12.1. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.1/packages/babel-plugin-transform-react-inline-elements) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump active_record_query_trace from 1.7 to 1.8 (mastodon#14999) Bumps [active_record_query_trace](https://github.com/brunofacca/active-record-query-trace) from 1.7 to 1.8. - [Release notes](https://github.com/brunofacca/active-record-query-trace/releases) - [Changelog](https://github.com/brunofacca/active-record-query-trace/blob/master/HISTORY.md) - [Commits](brunofacca/active-record-query-trace@v1.7...v1.8) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump yargs from 16.0.3 to 16.1.0 (mastodon#15010) Bumps [yargs](https://github.com/yargs/yargs) from 16.0.3 to 16.1.0. - [Release notes](https://github.com/yargs/yargs/releases) - [Changelog](https://github.com/yargs/yargs/blob/master/CHANGELOG.md) - [Commits](yargs/yargs@v16.0.3...v16.1.0) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump oj from 3.10.14 to 3.10.15 (mastodon#15009) Bumps [oj](https://github.com/ohler55/oj) from 3.10.14 to 3.10.15. - [Release notes](https://github.com/ohler55/oj/releases) - [Changelog](https://github.com/ohler55/oj/blob/develop/CHANGELOG.md) - [Commits](ohler55/oj@v3.10.14...v3.10.15) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump tzinfo-data from 1.2020.2 to 1.2020.3 (mastodon#15002) Bumps [tzinfo-data](https://github.com/tzinfo/tzinfo-data) from 1.2020.2 to 1.2020.3. - [Release notes](https://github.com/tzinfo/tzinfo-data/releases) - [Commits](tzinfo/tzinfo-data@v1.2020.2...v1.2020.3) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rubocop from 0.93.0 to 0.93.1 (mastodon#15004) Bumps [rubocop](https://github.com/rubocop-hq/rubocop) from 0.93.0 to 0.93.1. - [Release notes](https://github.com/rubocop-hq/rubocop/releases) - [Changelog](https://github.com/rubocop-hq/rubocop/blob/master/CHANGELOG.md) - [Commits](rubocop/rubocop@v0.93.0...v0.93.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump omniauth-saml from 1.10.2 to 1.10.3 (mastodon#15007) Bumps [omniauth-saml](https://github.com/omniauth/omniauth-saml) from 1.10.2 to 1.10.3. - [Release notes](https://github.com/omniauth/omniauth-saml/releases) - [Changelog](https://github.com/omniauth/omniauth-saml/blob/master/CHANGELOG.md) - [Commits](omniauth/omniauth-saml@v1.10.2...v1.10.3) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @github/webauthn-json from 0.5.6 to 0.5.7 (mastodon#14997) Bumps [@github/webauthn-json](https://github.com/github/webauthn-json) from 0.5.6 to 0.5.7. - [Release notes](https://github.com/github/webauthn-json/releases) - [Commits](github/webauthn-json@v0.5.6...v0.5.7) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/plugin-proposal-decorators from 7.10.5 to 7.12.1 (mastodon#15008) Bumps [@babel/plugin-proposal-decorators](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-proposal-decorators) from 7.10.5 to 7.12.1. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.1/packages/babel-plugin-proposal-decorators) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/preset-react from 7.10.4 to 7.12.1 (mastodon#15006) Bumps [@babel/preset-react](https://github.com/babel/babel/tree/HEAD/packages/babel-preset-react) from 7.10.4 to 7.12.1. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.1/packages/babel-preset-react) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump react from 16.13.1 to 16.14.0 (mastodon#15005) Bumps [react](https://github.com/facebook/react/tree/HEAD/packages/react) from 16.13.1 to 16.14.0. - [Release notes](https://github.com/facebook/react/releases) - [Changelog](https://github.com/facebook/react/blob/master/CHANGELOG.md) - [Commits](https://github.com/facebook/react/commits/v16.14.0/packages/react) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/preset-env from 7.11.5 to 7.12.1 (mastodon#15001) Bumps [@babel/preset-env](https://github.com/babel/babel/tree/HEAD/packages/babel-preset-env) from 7.11.5 to 7.12.1. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.1/packages/babel-preset-env) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump @babel/plugin-transform-runtime from 7.11.5 to 7.12.1 (mastodon#14995) Bumps [@babel/plugin-transform-runtime](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-transform-runtime) from 7.11.5 to 7.12.1. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.12.1/packages/babel-plugin-transform-runtime) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Fix account processing failing because of large collections (mastodon#15027) Fixes mastodon#15025 * Fix followers synchronization mechanism not being triggered on mentions (mastodon#15026) e.g. if someone on an instance that previously had followers gets mentioned in a private toot, before this PR, they would not receive a Collection-Synchronization header and may show the toot to the former followers in addition to the mentioned person. Co-authored-by: Josh Leeb-du Toit <mail@joshleeb.com> Co-authored-by: abcang <abcang1015@gmail.com> Co-authored-by: ThibG <thib@sitedethib.com> Co-authored-by: Alex Dunn <dunn.alex@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Hello👋 how could this be triggered in a self-hosted Mastodon server for a specific account? A mastodon server database I managed got corrupted and I lost some followers on the recovery process on some accounts, now they still follow the accounts on their profiles but they dont show up on the accounts followers section of my server. Could you give me an advise how to recover them? |
Fixes #14480
Everything in here is an ActivityPub extension that is not exactly in ActivityPub's spirit (since it's domain-based, and introduces synchronization mechanisms that are not a thing in AP to my knowledge) but consistent with the assumptions implementations already make (remote actors being able to create objects with any identifier as long as it is on the same domain name as the actor itself, for instance).
While the PR should be in good shape, everything, especially the vocabulary, may change from feedback from other implementors, as the issue is not strictly a Mastodon issue, but a more general one.
Related discussion: https://socialhub.activitypub.rocks/t/proposal-mechanism-for-synchronizing-followers-lists-across-servers/852
Context
There are many ways—such as software bugs (e.g. the one fixed by #14479, Pleroma forgetting about followers a while ago, and probably other bugs we did not catch or may happen in the future), incompatibilities (while initiating and accepting a follow is pretty well-defined in AP, forcefully removing followers is not), prolonged downtimes, instances being brought back from the dead or restored from backups—in which two instances A and B, acting in good faith, may not agree with who from A follows someone on B.
This may cause confusion and frustration as one may think they are followed/follow by someone they are not. Worse: with Mastodon’s follower-only privacy scope, this can get particularly damaging, as receiving remote instances are asked to deliver the message to who they know to follow the poster: if that gets out of sync, then the messages may reach people that the poster expects to not have access to.
Therefore, a mechanism to detect synchronization issues and correct them is needed. Such a mechanism has to take into account that the full list of an account’s followers may not be public information, and should thus only consider followers residing on a specific domain.
Proposal
I propose the addition of an optional
Collection-Synchronization
HTTP header following the same syntax as the Signature header, with the following attributes:collectionId
: must be the sender'sfollowers
collectionurl
: an URL to a partial collection containing the identifiers of the sender's followers residing on the receiver's instance. MUST reside on the same domain as the actor itself, and SHOULD be only accessible with a signed query from the receiver's instancedigest
: hexadecimal representation of the XORed SHA256 digests of each of the identifiers in the partial collectionExample
When delivering a private message to
https://mastodon.social/users/foo/inbox
, the instance running onsocial.sitedethib.com
sets the following HTTP header:Collection-Synchronization: collectionId="https://social.sitedethib.com/users/Thib/followers", url="https://social.sitedethib.com/users/Thib/followers_synchronization", digest="b08ab6951c7d6cc2b91e17ebd9557da7fae02489728e9332fcb3a97748244d50"
The result of querying
https://social.sitedethib.com/users/Thib/followers_synchronization
(when signing the request on behalf of a mastodon.social account):FAQ
Why
partialCollection
, why not re-use the followers collection?Currently, Mastodon lists either all followers in the
followers collection
, or none of them (if the user has opted in to “hide their network”). The first behavior is not suitable for synchronization, because it would be potentially much larger than needed. The second would obviously not work.The behavior we want from
partialCollection
, that is, listing all known followers residing on the requesting instance, is different from those two options. While we could switch to that behavior when receiving a signed request:Wouldn't that leak followers information?
No, if the
partialCollection
requires an HTTP Signature, and only returns followers residing on that instance, it will only return information that the requester is supposed to know about (but might not because of bugs or system failures).Also, the
digest
is only delivered to theinbox
via an HTTP header, unaware applications will simply ignore it, and will not risk relaying it anywhere.Implementation notes
Hashes are cached in Rails' cache, and the cache gets invalidated when followers are created or removed.
The cache key used is
followers_hash:#{account_id}:#{domain}
wheredomain
is the remote domain we are interested in if the account is local, orlocal
if it's a remote account (and we thus are interested in the local followers).The followers synchronization endpoint provided by this PR is separate from the regular followers collection endpoint, and does not expect a parameter for the requested domain. Indeed, a parameter would be redundant as the endpoint requires a signed fetch, and the used domain is retrieved from that signature.
In Mastodon's case, having a separate endpoint actually makes the code less complex and easier to reason about.
TODO