Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds ability to block a pod in Admin-View #8228

Closed
wants to merge 12 commits into from

Conversation

tclaus
Copy link
Member

@tclaus tclaus commented Apr 4, 2021

It uses the already existing 'blocked' information of Pods.

On Adminview a 'blocked/unblock' UI element is added.
A blocked pod is then filtered out in any stream views. No Database element is deleted or changed.

This Feature was requested in Issue #6640 and also discussed/requested in https://discourse.diasporafoundation.org/t/add-better-spam-controls/296

Is this a viable approach?
On https://societas.online this is already rolled out and ready to test.

Block / Unblock
Bildschirmfoto 2021-04-04 um 13 32 20

Added blocked pod count in pod list - header
Bildschirmfoto 2021-04-04 um 13 31 56

Any remarks / ideas or help is appreciated.

@tclaus tclaus changed the title Added ability to block a pod in Admin-View Adds ability to block a pod in Admin-View Apr 9, 2021
@tclaus tclaus force-pushed the disable_pods branch 2 times, most recently from 7bd9d9a to bbc254f Compare April 11, 2021 13:23
Comment on lines +68 to +71
logger.debug("Make blocked query")
Post.left_outer_joins(author: [:pod])
.where(id: post_ids)
.where("(pods.blocked = false or pods.blocked is null)")
Copy link
Member

@jhass jhass Apr 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here's the hard part about this :/ Our main stream query already is notoriously complex and thus slow. A lot of optimization went into it already in order to make it not too terribly slow. Extending it needs to be done very considerate and carefully, ideally doing some performance benchmarking (on big datasets) on it, but at least comparing query plans before and after.

Any of that happened here? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, why do we register posts from blocked pod at all? We could just drop them. That would mean no way back of course but it would be much simpler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhass I checked the queries:
Normal public posts query looks like this:

SELECT  "posts".* FROM "posts" 
    WHERE "posts"."public" = true 
        AND (posts.created_at < '2021-04-11 17:49:52') 
        AND "posts"."type" IN ('StatusMessage', 'Reshare') AND (posts.author_id NOT IN (6,2)) 
    ORDER BY posts.created_at DESC, posts.id DESC LIMIT 15  

This results in this explanation (unsuprisingly):
Bildschirmfoto 2021-04-11 um 20 13 45

Query for Blocked posts:

SELECT  "posts".* FROM "posts" 
    LEFT OUTER JOIN "people" ON "people"."id" = "posts"."author_id" 
    LEFT OUTER JOIN "pods" ON "pods"."id" = "people"."pod_id" 
    WHERE (pods.blocked is null or pods.blocked = false) AND "posts"."public" = true 
        AND (posts.created_at < '2021-04-11 17:51:47') 
        AND "posts"."type" IN ('StatusMessage', 'Reshare') AND (posts.author_id NOT IN (6,2)) 
    ORDER BY posts.created_at DESC, posts.id DESC LIMIT 15 

Explain:
Bildschirmfoto 2021-04-11 um 20 20 22

Its, well- more, but indexed. Is this a real problem?
The real bottleneck and time-consuming is IMHO the tons of extra queries done after this, beginning with 'like_posts_for_stream', Photos, Mentions, polls and so on. (Likes could also be joined to the posts query)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Flaburgan

Actually, why do we register posts from blocked pod at all? We could just drop them. That would mean no way back of course but it would be much simpler.

That was also my first Idea, but consider:
A Podmin might be retain to require the data for legal reasons. (might be an extra function)
A Podmin might only temporary block - and unblock a pod after more investigation.
A Podmins can track what is currently blocked.

My idea was not to delete any data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a pod gets blocked the messages should not only be dropped on federation level, it should also be rejected/retracted for relayables like when the user is blocked. Otherwise users would federate content on their posts which they might want to delete.

Also as @jhass said, the stream query is already complex enough and was a reason for slow streams multiple times, so lets not make that more complex than it needs to be. And filtering/rejecting at federation level would remove the need of adding more stuff to the stream query.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got stuck stopping relay messages complete, here my knowledge about diaspora internal comes to an end.

@SuperTux88 : Do you mean, to delete all posts( and user and comments) from foreign Pods if this gets blocked? Thats quite a hard step. If I get you right, should we consider to 'archive' these deleted posts somewhere? No to recover this, but to have a copy for later. - maybe for legal reasons?

What do you think about the Query explanation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments/likes/poll are relayed through the authors pod, if you hide them from the author, the blocked pod could still spam on their posts, but the author can do nothing against it which would be really bad. So if a pod is blocked, these messages need to be properly rejected (and not just dropped) and especially not just silently being forwarded to other pods and not telling the post author about it.

And you don't need to delete existing entities when blocking a pod (can be an optional step, if a podmin only wants to block future messages or also delete past stuff), but if it's in the database, it should also be shown to the users to give them the options to moderate their posts. So only filtering on incoming message, no filtering at displaying (and therefore also no need to change that query).

If you want to still store stuff for legal reasons take a backup or a screenshot or whatever you need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SuperTux88
In the meantime I have blown up my developer Database to 4Million posts and I see the problem. I agree to reject any (future) requests to this pods and not changing the posts query. (So old entries are still visible).
Has this to be done in the federation-code or still on the diaspora site? Starting points for me?

Another function (or job) may be manual triggered or automatic scheduled to clean up posts, comments etc from such blocked pods.

@@ -23,6 +23,7 @@ de:
no_info: "Zurzeit keine weiteren Informationen verfügbar"
not_available: "nicht verfügbar"
offline_since: "offline seit:"
blocked: "Blockiert"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't edit locale files other than for English, all other files are managed by webtranslateit and will be overwritten.

(also, sidenote, it's probably better if you post screenshots here with English, because not everybody understand German, so you don't need to change anything but English anyway.)

app/models/person.rb Outdated Show resolved Hide resolved
app/models/person.rb Outdated Show resolved Hide resolved
app/models/person.rb Outdated Show resolved Hide resolved
app/models/person.rb Outdated Show resolved Hide resolved
app/models/person.rb Outdated Show resolved Hide resolved
config/initializers/diaspora_federation.rb Outdated Show resolved Hide resolved
config/initializers/diaspora_federation.rb Outdated Show resolved Hide resolved
config/initializers/diaspora_federation.rb Outdated Show resolved Hide resolved
lib/diaspora/fetcher/public.rb Outdated Show resolved Hide resolved
lib/diaspora/fetcher/public.rb Outdated Show resolved Hide resolved
@tclaus
Copy link
Member Author

tclaus commented Jun 12, 2021

@SuperTux88 with ef7d20e is this what you mean "block on federation level"?
Is there another source-line to block "relayables"?

I have rolled out this code since first commit and it works fast (in stream) and reliable. There are no (old and new) posts from blocked pods.

So if a pod is blocked, these messages need to be properly rejected (and not just dropped) and especially not just silently being forwarded to other pods and not telling the post author about it

Do you have a source line that supports me?

Maybe we can think this PR as a tandem to #8249 (Block/Close user) ?

@SuperTux88
Copy link
Member

SuperTux88 commented Jun 12, 2021

@SuperTux88 with ef7d20e is this what you mean "block on federation level"?

Well, yes and no ... it kinda works ... but as far as I know returning nil then fails in the federation-code, which then means it retries 10 times to receive the message. Also, this blocks all messages from a user/pod if you just block there.

A better place for blocking is probably here:

else
persisted = Diaspora::Federation::Receive.perform(entity)
Workers::ReceiveLocal.perform_async(persisted.class.to_s, persisted.id, [recipient_id].compact) if persisted
end

This is slightly later in the receive chain (so it still parses the received message and validates the key) ... but it also gives the advantage to be more flexible what to block and what to do with certain types of messages.

  • It already still allows to receive AccountDeletion and Retraction messages (which shouldn't be blocked ... if a user/pod gets blocked without deleting all their posts, they should still be allowed to control their data and delete it if they wish)
  • It has the parsed message, so we can also add special logic for relayables to send a retraction back to block a user from commenting on local posts

(sorry, I submitted this comment too early)

lib/diaspora/federation/entities.rb Outdated Show resolved Hide resolved
lib/diaspora/federation/entities.rb Outdated Show resolved Hide resolved
lib/diaspora/federation/entities.rb Outdated Show resolved Hide resolved
lib/diaspora/federation/entities.rb Outdated Show resolved Hide resolved
@weex
Copy link

weex commented Aug 24, 2021

Bump on conflicting file

app/models/person.rb Outdated Show resolved Hide resolved
app/models/person.rb Show resolved Hide resolved
app/models/person.rb Show resolved Hide resolved
Copy link
Member

@SuperTux88 SuperTux88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to give some feedback here, as I don't think this is going in the right direction, as it looks like you are just adding blocks everywhere you find something could be blocked, but that's not how this is working. Some places don't make sense, as they never can happen (as for example you can't block your own pod), and some places just shouldn't be blocked, as they would create a mess in the federation and inconsistent states and take away control from the users.

Blocking pods might sound easy at the beginning, but it really isn't, there is a lot going into it.

There are workarounds out there to break federation with some users from other pods ("block" them), but these are workarounds and they do something similar as you did so far, just hard break federation. But that has side-effects, isn't transparent to users and lead to further problems, so they should only be used at own risk if you understand what's happening. But as this PR is something that should be merged in the main code and be used by all podmins, it needs to work robust and also handle all the cases properly, so it can't be done with just "break federation" and ignore all the problems that generates.

So blocking pods needs to work in a as clean as possible way, so it doesn't just create a huge mess. And as you can unblock a pod again, it also needs to continue working when a pod gets unblocked again, so some stuff still needs to continue working, even if it's only in the background.

I think you can probably undo most of the changes (at least those affecting streams and federation), and then add again:

  • If it's a normal entity (like a status message, or a reshare), ignore it in receive.rb (not processing them is fine)
  • If it's a relayable (comment, like and poll participation), and the pod is blocked and your pod is the owner of the root entity, then reject it sending a retraction back (like when the author if a comment is ignored)
  • There are still a lot of messages you don't want to block, as they aren't normal content of a user, for example retractions, and account deletions, so only block it, if it's actual content. For example contact changes (adding/removing to aspects) also still needs to work to not create inconsistent state if the pod gets unblocked again, but you wouldn't want to user to receive notifications about it, as they can't interact with the user when the pod is blocked, so process the entity as normal, but then return nil here to not trigger notifications.
  • You probably also don't want to send your content to those pods anymore (as people from those pods couldn't reply anyway), this can be done by filtering the people here, but again, you can only do that for actual content (posts, comments, ...), other messages still need to be sent to those pods (retractions, account deletions, unshare, ...), otherwise you people from your pod who sent posts/comments to the now blocked pods can't delete them anymore from there, which shouldn't happen.
  • For the frontend you maybe want to block showing existing content, so for example block the profile of a person from a blocked pod (or probably show a message for users from your pod, that that pod is blocked), and block the single post view of a post from person from a blocked pod (show 404), but those are just simple checks when viewing these pages, no filtering of complicated stream queries required.
  • If you really want old posts not to be included in streams anymore, build a way how a podmin can optionally delete all posts from a pod (but that can't be undone when you want to unblock a pod again, so it should be optional if a podmin wants to do that)
  • For better UX, it should be shown when somebody search for a diaspora ID from a blocked pod, that this pod was blocked, instead of just not working, otherwise users are confused and then create bugreports or support requests.
  • Also for better UX, if users already have contacts with people from blocked pods, it should mark those contacts as blocked in the UI and also not let them select those contacts anymore when write private messages. So your users understand what's happening instead of things they expect to work just don't work.

That's basically all that's needed, and all that should be done and not more. There are things that need to work, otherwise you create a huge mess in a federated system like diaspora.

If you have any more questions (or if you think I missed something in the list above, which is totally possible) please discuss them first, as again, this is a complicated topic and not as easy as it might initially look.

Comment on lines +52 to +58
is_blocked: "Der Pod wurde blockiert"
blocked:
one: "Ein Pod wurde blockiert"
other: "<%= count %> Pods wurden blockiert"
total_pods:
one: "Es gibt genau einen Pod."
other: "Es gibt insgesamt <%= count %> Pods."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the line from #8228 (comment), but there are still other lines in here and in other German files.

@@ -383,6 +397,12 @@ def clear_profile!
self
end

def self.diaspora_handle_from_blocked_pod?(diaspora_handle)
host = diaspora_handle.split("@").last
pod = Pod.find_by(host: host)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pods can also have a port, this is not handled here and the port would be part of the host. But I also think you don't even need this method, as you should always get the pod which is linked with the person.

@@ -15,7 +15,7 @@
config.define_callbacks do
on :fetch_person_for_webfinger do |diaspora_id|
person = Person.where(diaspora_handle: diaspora_id, closed_account: false).where.not(owner: nil).first
if person
unless person.nil? || person.pod&.blocked
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so the federation library can get person data when other pods to a webfinger for that person, so this is your own pods people. As you can't block your own pod (as they never have a pod entity) this will always be false and therefore is not needed.

The same goes for the change in :fetch_person_for_hcard below, this is your own pods people, so it never can be blocked.

DiasporaFederation::Discovery::HCard.new(
guid: person.guid,
nickname: person.username,
full_name: "#{person.profile.first_name} #{person.profile.last_name}".strip,
url: AppConfig.pod_uri,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed with the cleanup of legacy federation code as hcards don't have an url anymore, it shouldn't be added back. (probably a merge/rebase error)

Comment on lines +85 to +86
person = Person.find_or_fetch_by_identifier(diaspora_id)
person.public_key unless person.nil? || person.pod&.blocked
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be blocked here. This prevents message of blocked users being validated, so they can't be received. But they need to be received so you can reject relayables, otherwise blocked people can still comment under your post without you knowing it.

Comment on lines 81 to +82
def offline?
Pod.offline_statuses.include?(Pod.statuses[status])
Pod.offline_statuses.include?(Pod.statuses[status]) || blocked
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pod isn't just "offline" if it's blocked. If you want to handle it the same at some places, then check for both, or rename this method to offline_or_blocked? if you really want it to be the same in ALL places where this is used, but then at least it's clear what it does. But I think it shouldn't be handled the same and I think this doesn't work as you think it would work.

Comment on lines +105 to +108
rescue URI::InvalidComponentError
logger.error "Invalid pod host: #{host}"
rescue StandardError => e
logger.error "While updating pod: #{host}, #{e.inspect}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed here as the ConnectionTester already should handle all errors. If you had errors which made it to here, they should be added to the ConnectionTester and handled there with an error reason that matches the error. But this had nothing to do with blocking anyway.

Comment on lines +230 to +234
author = entity.try(:diaspora_handle)
return false if author.present? && Person.diaspora_handle_from_blocked_pod?(author)

root_author = entity.try(:root_diaspora_id)
return false if root_author.present? && Person.diaspora_handle_from_blocked_pod?(root_author)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not working how you think it does. entity isn't a diaspora entity, therefore it doesn't have a diaspora_handle property, but it's a federation entity, where it has an author_id property. Also if you already know the author entity, decide if the pod is blocked by the relations in the database, instead of using Person.diaspora_handle_from_blocked_pod? which tries to parse the pod from the diaspora ID again.

Comment on lines +236 to +238
return false if closed_account?(author)

return false if closed_account?(root_author)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave out closed accounts for now, to not make this PR even more complicated, this is a different topic and might need to be handled differently.

@@ -10,6 +10,8 @@ class FetchWebfinger < Base

def perform(account)
person = Person.find_or_fetch_by_identifier(account)
return if person.nil?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Person.find_or_fetch_by_identifier should never return nil, if a person can't be fetched, then it throws a DiscoveryError.

@tclaus tclaus closed this Mar 27, 2024
@tclaus tclaus deleted the disable_pods branch March 27, 2024 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants