Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize varnish ban regexp #4231

Merged
merged 1 commit into from Apr 25, 2021
Merged

Conversation

bastnic
Copy link
Contributor

@bastnic bastnic commented Apr 20, 2021

Q A
Branch? 2.6
Bug fix? depends on how you treat perf issue
New feature? no
Deprecations? no
License MIT

The current ban regexp is not optimize and killed my varnish server to 100% cpu.

Several reasons, but let's explain how that work. This is a question of

rq/s * ban/s * size of the cache tags header * complexity of the regexp

I've a high number of requests and bans per seconds. The response has a lot of tags.

But I only ban one resource at a time, so the regexp was not that complex at first look.

Varnish stores all ban in a (deduplicated) list. Each cached resource has a pointer to the last ban checked.
If you add bans, when you call a cached resources, it will check against all the new bans and store the last ones.

The number of regexp analysis was high, but I can change my traffic, so I can only adjust the header itself or the regexp.

I looked at the regexp first

((^|,)/pim/v1/variants($|,))|((^|,)/pim/v1/variants/e3615f0c-a957-4a85-88b9-8db30d5c5c02($|,))|((^|,)/pim/v1/models/bebdee7b-fb53-448b-b79e-1bcf0dbcf050($|,))

Step 1: checking the begining of the resources is not useful if tags are only handled by apip.

(/pim/v1/variants($|,))|(/pim/v1/variants/e3615f0c-a957-4a85-88b9-8db30d5c5c02($|,))|(/pim/v1/models/bebdee7b-fb53-448b-b79e-1bcf0dbcf050($|,))

Stepe 2: regrouping the end tag at the end is better.

(/pim/v1/variants|/pim/v1/variants/e3615f0c-a957-4a85-88b9-8db30d5c5c02|/pim/v1/models/bebdee7b-fb53-448b-b79e-1bcf0dbcf050)($|,)

Step 3: remove the prefix from the regexp

(/variants|/variants/e3615f0c-a957-4a85-88b9-8db30d5c5c02|/models/bebdee7b-fb53-448b-b79e-1bcf0dbcf050)($|,)

Step 4: remove the prefix from the source header

Bench:

subject mean diff
benchCurrent 1.612μs 7.91x
benchStep1 0.575μs 2.82x
benchStep2 0.504μs 2.47x
benchStep3 0.492μs 2.41x
benchStep4 0.204μs 1.00x

This PR only goes to step 2.

On my production, I published Step3 today at 12:15:

image

@bastnic bastnic changed the title Safer PropertySchemaRegexRestriction pattern anchor (#4198) Optimize varnish ban regexp Apr 20, 2021
@bastnic bastnic force-pushed the feature/perf-regexp-ban branch 3 times, most recently from 61ba6e7 to 9d29b58 Compare April 20, 2021 14:43
];

yield 'iris to generate a header with exactly the maximum length and a smaller one' => [
37,
['/foo', '/bar', '/baz'],
[
'((^|\,)/foo($|\,))|((^|\,)/bar($|\,))',
'(^|\,)/baz($|\,)',
'(/foo|/bar)($|\,)',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something needs to be changed in the splitting logic: the 2 segments would now fit the 37 characters limit if merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @julienfalque, I'm fixing tests right now.

@bastnic bastnic force-pushed the feature/perf-regexp-ban branch 4 times, most recently from 3f5ab8a to 483632f Compare April 20, 2021 17:26
Copy link
Member

@dunglas dunglas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding support for the xkey module could improve the performance even more.

src/HttpCache/VarnishPurger.php Outdated Show resolved Hide resolved
@bastnic bastnic force-pushed the feature/perf-regexp-ban branch 2 times, most recently from 9a52951 to 5d29b8c Compare April 21, 2021 20:57
@bastnic
Copy link
Contributor Author

bastnic commented Apr 21, 2021

@dunglas FYI xkey is already abandoned by varnish which prefers ykey https://docs.varnish-software.com/varnish-cache-plus/vmods/ykey/.

Migrating to x/ykey is not "free" as you need the paid version of varnish (maybe I'm wrong on this one).

This PR an optimization for what we already provide in apip.

@soyuka
Copy link
Member

soyuka commented Apr 25, 2021

xkey still works with the free varnish and won't be removed soon

…o 100% cpu.

Several reasons, but let's explain how that work. This is a question of

> rq/s * ban/s * size of the cache tags header * complexity of the regexp

I've a high number of requests and bans per seconds. The response has a lot of tags.

But I only ban one ressource at a time, so the regexp was not that complex at first look.

Varnish stores all ban in a (deduplicated) list. Each cached resource has a pointer to the last ban checked.
If you add bans, when you call a cached resources, it will check against all the new bans and store the last ones.

The number of regexp analysis was high, but I can change my traffic, so I can only adjust the header itself or the regexp.

I looked at the regexp first

> ((^|,)\/pim\/v1\/variants($|,))|((^|,)\/pim\/v1\/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02($|,))|((^|,)\/pim\/v1\/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050($|,))

Step 1: checking the begining of the resources is not useful if tags are only handled by apip.

> (\/pim\/v1\/variants($|,))|(\/pim\/v1\/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02($|,))|(\/pim\/v1\/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050($|,))

Stepe 2: regrouping the end tag at the end is better.

> (\/pim\/v1/variants|\/pim\/v1/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02|\/pim\/v1/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050)($|,)

Step 3: remove the prefix from the regexp

> (/variants|\/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02|\/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050)($|,)

Step 4: remove the prefix from the source header

Bench:

+--------------+---------+-------+
| subject      | mean    | diff  |
+--------------+---------+-------+
| benchCurrent | 1.612μs | 7.91x |
| benchStep1   | 0.575μs | 2.82x |
| benchStep2   | 0.504μs | 2.47x |
| benchStep3   | 0.492μs | 2.41x |
| benchStep4   | 0.204μs | 1.00x |
+--------------+---------+-------+

This PR only goes to step 2.
@soyuka soyuka merged commit fe615fb into api-platform:2.6 Apr 25, 2021
@soyuka
Copy link
Member

soyuka commented Apr 25, 2021

thanks @bastnic

@dunglas
Copy link
Member

dunglas commented Apr 26, 2021

Yes xkey isn't maintained by Varnish.com (the company) anymore but is still maintained by the community as part of the Varnish modules project (I recently contributed to it and the patch has been merged).
What I'm suggesting is to support both the current approach and xkey (that have better performance).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants