Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New user account type "Channel Relay" #13806

Merged
merged 33 commits into from Jan 17, 2024
Merged

Conversation

annando
Copy link
Collaborator

@annando annando commented Jan 6, 2024

There is now a new account type, named "channel relay". This is used to distribute posts based on channel definitions.

This is most likely not the finished version, but is ready to review. Further changes will then be done on subsequent PRs.

This PR also fixes #13814

@tobiasd tobiasd added this to the 2024.03 milestone Jan 7, 2024
@hoergen

This comment was marked as resolved.

@MrPetovan MrPetovan marked this pull request as draft January 8, 2024 15:12
@MrPetovan
Copy link
Collaborator

This looks like a WIP, please update its status when you're really done. 😅

@annando annando marked this pull request as ready for review January 10, 2024 10:57
Repository owner deleted a comment from hoergen Jan 10, 2024
@AndyHee
Copy link

AndyHee commented Jan 10, 2024

@hoergen you make an important point about the danger of more spam, but I think it was been established by now that this reshare feature can has existed for a very long time.

In fairness, this enhancement (if I understand this correctly) gives better control for the amount[!] of content reshared and gives admins better oversight about which account is automating.

Also I would like to point to a potential use case for this "new" account type:
https://friendica.xyz/display/adf174d5-2365-9ad1-24d6-4d1456654662

Having said this there are two issues that I still would like @annando to address:

  1. Are you planning to retain the feature that allows admins to disable the reshare function? If not, this would be a step backwards, as we would lose the current function that allows admins to disable account automation and consequently controlling the amounts of resources consumed.
  2. Are you planning to extent this work and formalise the other hidden account type that "reshares" post as self, while retaining the admin's option to disable it? This would be an improvement for admins to better monitor levels of automation on their nodes.

@annando
Copy link
Collaborator Author

annando commented Jan 10, 2024

@AndyHee there is an option in the admin settings to enable this feature. It is enabled by default.

@AndyHee
Copy link

AndyHee commented Jan 10, 2024

That's great! So in the future we can still disable the resharing of this "new" account type?

@annando
Copy link
Collaborator Author

annando commented Jan 10, 2024

This switch will disable the possibility to create that account type. And additionally it disables the reshare functionality, which is important when you already have got such a user in your system. Also this resharing is enabled for group accounts as well, since in the discussions someone said that it would be nice to have some kind of discussion group that additionally can automatically reshare content.

@AndyHee
Copy link

AndyHee commented Jan 10, 2024

I see. Currently my on node there are trusted users who are resharing automatically. But any other users cannot do this, because I have disabled this possibility again after each trusted user had set up everything.

Will this also still be possible in future?

@annando
Copy link
Collaborator Author

annando commented Jan 10, 2024

I haven't considered that use case by now, sorry. I have to see, how to do that.

@AndyHee
Copy link

AndyHee commented Jan 10, 2024

I really hope that it would be possible to implement this.

@annando
Copy link
Collaborator Author

annando commented Jan 10, 2024

@MrPetovan This PR should now be finally ready for review.

src/Content/Conversation/Entity/Timeline.php Show resolved Hide resolved
Comment on lines 159 to 161
$this->db->insert('check-full-text-search', ['pid' => getmypid(), 'searchtext' => $searchtext], Database::INSERT_UPDATE);
$result = $this->db->select('check-full-text-search', [], ["`pid` = ? AND MATCH (`searchtext`) AGAINST (? IN BOOLEAN MODE)", getmypid(), $this->escapeKeywords($searchtext)]);
$this->db->delete('check-full-text-search', ['pid' => getmypid()]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the insertion and deletion for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some data to check this type of search.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand:

  • Data is inserted
  • Data is selected
  • Data is deleted

What does this tell us?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$result is important. It will be false when the syntax of the search is invalid. This can happen for example when you use control characters like -, +, ( or ) in a wrong way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're just testing the syntax? You can do it in a single query like this:

SELECT MATCH (`field`) AGAINST (? IN BOOLEAN MODE) FROM `table` LIMIT 1;

No need for insertion nor deletion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, if the system reacts differently, when there is no data in the target table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then pick a table where we expect data!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already changed the code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work as expected even without data?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested that yesterday, then I made the changes.

@MrPetovan
Copy link
Collaborator

Are you still working on it?

@annando
Copy link
Collaborator Author

annando commented Jan 13, 2024

I just added stuff and fixed things, since the PR is still open.

@MrPetovan
Copy link
Collaborator

If you keep adding stuff while it’s in review, it complicates the review. I’d like to know this is the final version.

@annando
Copy link
Collaborator Author

annando commented Jan 14, 2024

I have got some pending changes, but my priority is that this PR can be merged as soon as possible, so I will try keeping my hands still here.

@MrPetovan
Copy link
Collaborator

Thank you, I appreciate it.

Copy link
Collaborator

@MrPetovan MrPetovan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, the messages.po needs refreshing now.

src/Content/Conversation/Entity/Timeline.php Outdated Show resolved Hide resolved
src/Model/Item.php Outdated Show resolved Hide resolved
Comment on lines 192 to 194
$this->db->insert('check-full-text-search', ['pid' => getmypid(), 'searchtext' => $searchtext], Database::INSERT_UPDATE);
$result = $this->inFulltext($search);
$this->db->delete('check-full-text-search', ['pid' => getmypid()]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is all this necessary after you changed the validity test? Is the check-full-text-search table necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Here we don't check the validity. Here we check, if we have got a match.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A match between what exactly? I'm very confused by this code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific function checks if there are any user defined channels that would match to the given search text. This function is called at two places. First it is called for posts that arrived via relay. We don't keep all posts there if not explicitly selected. So we check if the post meets some expectations. One is, that it contains hashtags that users subscribed to. And another check is, that there are user defined channels that would select this post.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can perform the search directly on the channels table, what's the use of check-full-text-search besides checking for search string validity that we got rid of?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. We execute this query before we consider to even store the content.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thoroughly lost by now. where is the search string defined? What it is supposed to match against? Why do we need the check-full-text-search table?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to temporarily store the text of the message to a real table (no temporary tables are allowed) to perform the search. That check-full-text-search has got the process id as primary key, since there can be a lot of parallel access to that table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, I understand now. Isn't it easier to perform the search in a regular expression instead? Seems less complicated than inserting a row and then immediately removing it from the database.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use complex search patterns: https://mariadb.com/kb/en/full-text-index-overview/#in-boolean-mode

This means that I would need to create some parser that emulates that search, just to avoid to insert some data into some table.

src/Core/L10n.php Outdated Show resolved Hide resolved
src/Model/Item.php Outdated Show resolved Hide resolved
src/Protocol/Relay.php Outdated Show resolved Hide resolved
Comment on lines 272 to 280
private function insertCheckFullTextSearch(string $searchtext)
{
$this->db->insert('check-full-text-search', ['pid' => getmypid(), 'searchtext' => $searchtext], Database::INSERT_UPDATE);
}

private function deleteCheckFullTextSearch()
{
$this->db->delete('check-full-text-search', ['pid' => getmypid()]);
}
Copy link
Collaborator

@MrPetovan MrPetovan Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a wrapper function for this that uses a callback as an argument.

Synopsis would be something like this:

function matchThrowaway(string $searchtext, callable $callback) {
    $this->db->insert('check-full-text-search', ['pid' => getmypid(), 'searchtext' => $searchtext], Database::INSERT_UPDATE);

    $return = $callback($searchtext);

    $this->db->delete('check-full-text-search', ['pid' => getmypid()]);

    return $return;
}

Usage example from the match() method:

$result = matchThrowaway($searchtext, function () use ($search) {
    return $this->inFulltext($search);
};)

For the getMatchingChannelUsers() method:

$uids = matchThrowaway($searchtext, function (string $searchtext) use ($users, $owner_id, $reshare_id, $language, $tags, $media_type) {
	$uids = [];

	foreach ($this->select(['uid' => array_column($users, 'uid'), 'publish' => true, 'valid' => true]) as $channel) {
		if (in_array($channel->uid, $uids)) {
			continue;
		}
		if (!empty($channel->circle) && ($channel->circle > 0) && !in_array($channel->uid, $uids)) {
			if (!$this->inCircle($channel->circle, $channel->uid, $owner_id) && !$this->inCircle($channel->circle, $channel->uid, $reshare_id)) {
				continue;
			}
		}
		if (!empty($channel->languages) && !in_array($channel->uid, $uids)) {
			if (!in_array($language, $channel->languages)) {
				continue;
			}
		} elseif (!in_array($language, User::getWantedLanguages($channel->uid))) {
			continue;
		}
		if (!empty($channel->includeTags) && !in_array($channel->uid, $uids)) {
			if (!$this->inTaglist($channel->includeTags, $tags)) {
				continue;
			}
		}
		if (!empty($channel->excludeTags) && !in_array($channel->uid, $uids)) {
			if ($this->inTaglist($channel->excludeTags, $tags)) {
				continue;
			}
		}
		if (!empty($channel->mediaType) && !in_array($channel->uid, $uids)) {
			if (!($channel->mediaType & $media_type)) {
				continue;
			}
		}
		if (!empty($channel->fullTextSearch) && !in_array($channel->uid, $uids)) {
			if (!$this->inFulltext($channel->fullTextSearch)) {
				continue;
			}
		}
		$uids[] = $channel->uid;
		$this->logger->debug('Matching channel found.', ['uid' => $channel->uid, 'label' => $channel->label, 'language' => $language, 'tags' => $tags, 'media_type' => $media_type, 'searchtext' => $searchtext]);
	}

	return $uids;
});

I didn't test this code, you may have to tinker with it a little bit but at least it's clearer about the meaning of the insertion and deletion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ... I don't really see an advantage here. In contrary, for me this decreases the readability.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage is that we group the insert and the delete in a single wrapper function instead of having them independent when there is no additional use for either. The function name makes it clear why we're doing this (a throwaway match) which should clear further confusion about this code like I initially had.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, my first thought about that function name was "what should that name tell me?". It totally confused me. Thanks to your explanation I now understand what you want to express with that name. Also for me this "function as a variable" thing is always hard to read and harder to reproduce. I always have to have a look at other places where this had been implemented to know how to use it.

I'm currently having some other topics that I'm fighting with, also I'm not in a mood for discussions, so I guess I will postpone further work on this PR and will focus on other topics first.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're the boss, boss!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the boss. I just currently don't have enough energy. Also I really should sleep for several hours now but my body refuses to fall asleep ...

@MrPetovan MrPetovan merged commit 28a7884 into friendica:develop Jan 17, 2024
7 checks passed
@annando annando deleted the channel-relay branch February 4, 2024 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Blank network page and DB errors in the log
5 participants