Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spike solution for filtering tests by job group #5291

Closed
wants to merge 1 commit into from

Conversation

kraih
Copy link
Member

@kraih kraih commented Aug 29, 2023

Instead of regexes i've used globs, which are a little more user friendly and less prone to security issues. The implementation is a bit inefficient, since it loads all job groups into memory to filter them on the Perl side. But there will probably never be an openQA instance with enough job groups to justify the complexity of filtering in SQL.

Since this is a spike solution, tests and ui enhancements have been left out. The feature can only be used with query parameters in the URL for now, like /tests/overview?group_glob=*kernel*&not_group_glob=*development*&todo=1. Multiple group_glob and not_group_glob values can be used for more complex filtering.

Progress: https://progress.opensuse.org/issues/119746

Instead of regexes i've used globs, which are a little more user
friendly and less prone to security issues. The implementation is a bit
inefficient, since it loads all job groups into memory to filter them on
the Perl side. But there will probably never be an openQA instance with
enough job groups to justify the complexity of filtering in SQL.

Since this is a spike solution, tests and ui enhancements have been left
out. The feature can only be used with query parameters in the URL for
now, like
`/tests/overview?group_glob=*kernel*&not_group_glob=*development*&todo=1`.
Multiple `group_glob` and `not_group_glob` values can be used for more
complex filtering.

Progress: https://progress.opensuse.org/issues/119746
Copy link
Contributor

@Martchus Martchus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks generally good (for a spike solution, to merge tests would be good).

# use globs to filter job groups, first include all groups that match "group_glob" values, and then exclude those
# that also match "not_group_glob" values
elsif (@$group_glob || @$not_group_glob) {
my @inclusive = map { qr/$_/i } map { glob_to_regex_string($_) } @$group_glob;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe avoid the double map calls (which in Perl likely result in an intermediate array, right?):

Suggested change
my @inclusive = map { qr/$_/i } map { glob_to_regex_string($_) } @$group_glob;
my @inclusive = map { my $regex = glob_to_regex_string($_); qr/$regex/i } @$group_glob;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too cheap to worry about resource usage, the intermediate list costs nothing really. Most readable solution wins here imo.

Comment on lines +400 to +401
$v->optional('group_glob');
$v->optional('not_group_glob');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave out the "_glob" suffix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also have been my preference originally too, but we've already set a precedent here with module_re for regex. Feels kinda wrong now not to have a special suffix for globs too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. As we already have regex for module then can we please reconsider to use group_re with regex here as well? I don't see the security risk as we merely match on job group names, isn't it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perl regexes are very powerful and compiling arbitrary strings from the internet into them has a lot of room for exploitation. For example i could probably use lookahead/behind to create a DoS attack even with small input strings. 🤔 Using globs would be a lot safer and easier to understand by users.

Copy link
Contributor

@Martchus Martchus Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately Perl has no good way of dealing with regexes from users. Unfortunately we already do this. I once wrote the regex_match helper so we at least don't get warnings in our server logs but a DoS attach is likely possible. So I guess using globbing is better here, indeed.

(We can actually always make use of native code and call a library that is more limited. However, that's more work.)

Copy link
Member Author

@kraih kraih Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more information see also https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS. The current Perl regex engine is actually better than most at dealing with such cases, but it still can happen.

Copy link
Member

@okurz okurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder about the example that you mention about /tests/overview. https://progress.opensuse.org/issues/119746 is about /tests. Does this also work for /tests or was that aspect overlooked?

Comment on lines +400 to +401
$v->optional('group_glob');
$v->optional('not_group_glob');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. As we already have regex for module then can we please reconsider to use group_re with regex here as well? I don't see the security risk as we merely match on job group names, isn't it?

@kraih
Copy link
Member Author

kraih commented Sep 1, 2023

I wonder about the example that you mention about /tests/overview. https://progress.opensuse.org/issues/119746 is about /tests. Does this also work for /tests or was that aspect overlooked?

Yes, that aspect was indeed overlooked after the discussions about the feature. Do we really want the group filtering in the /tests context rather than /tests/overview? I didn't even think about /tests because it fits so well into /tests/overview. Unfortunately the code paths are completely separate and probably won't be able to share much.

@kraih
Copy link
Member Author

kraih commented Sep 1, 2023

Yes, that aspect was indeed overlooked after the discussions about the feature. Do we really want the group filtering in the /tests context rather than /tests/overview?

As discussed on Slack, there is interest in having both. So i will look into extending the spike solution.

@mergify
Copy link
Contributor

mergify bot commented Sep 22, 2023

This pull request is now in conflicts. Could you fix it? 🙏

@kraih
Copy link
Member Author

kraih commented Dec 7, 2023

Followup PR has been merged.

@kraih kraih closed this Dec 7, 2023
@Martchus Martchus deleted the k/spike_group_filter branch December 7, 2023 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants