Fast HTTP match #732

Open
krizhanovsky opened this issue May 13, 2017 · 0 comments
krizhanovsky commented May 13, 2017

HTTP Load balancing

Separated from #76, in particular from #76 (comment). A faster implementation of HTTP field matching is required for HTTP load balancing and filtering. One option is a hash table that allows a quick jump to a rule by its key, where the key is calculated from the string and the ID of the HTTP field. And/or the BNDM with q-grams (BG) algorithm can be used to quickly process many strings with a common prefix.
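The hash table option can be sketched as follows. This is a minimal userspace illustration, not Tempesta FW code: the `field_id` values, `struct rule`, `struct rule_table` and the function names are all hypothetical, and the hash is an FNV-1a style mix chosen only for brevity. The comparison is byte-exact, so a real `Host` matcher would need to normalize case first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field IDs and rule layout; not Tempesta FW's real types. */
enum field_id { FLD_HOST = 1, FLD_URI = 2 };

struct rule {
	enum field_id	field;
	const char	*str;	/* exact-match pattern */
	int		group;	/* index of the target server group */
	struct rule	*next;	/* collision chain */
};

#define HT_BITS	10
#define HT_SIZE	(1u << HT_BITS)

struct rule_table {
	struct rule *buckets[HT_SIZE];
};

/* FNV-1a style hash over the field ID followed by the string bytes. */
static uint32_t
rule_key(enum field_id field, const char *s, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	h = (h ^ (uint32_t)field) * 16777619u;
	for (i = 0; i < len; i++)
		h = (h ^ (unsigned char)s[i]) * 16777619u;
	return h & (HT_SIZE - 1);
}

/* Done once per rule at configuration time. */
static void
rule_add(struct rule_table *t, struct rule *r)
{
	uint32_t k = rule_key(r->field, r->str, strlen(r->str));

	r->next = t->buckets[k];
	t->buckets[k] = r;
}

/* Returns the matching rule or NULL. Only one bucket chain is walked,
 * so the cost does not grow with the rule count while chains stay short. */
static const struct rule *
rule_lookup(const struct rule_table *t, enum field_id field,
	    const char *s, size_t len)
{
	const struct rule *r = t->buckets[rule_key(field, s, len)];

	for (; r; r = r->next)
		if (r->field == field && strlen(r->str) == len
		    && !memcmp(r->str, s, len))
			return r;
	return NULL;
}
```

A hash table like this only covers exact (`eq`) matches; prefix or suffix rules would still need another structure, which is where a multi-pattern algorithm such as BG comes in.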

Issue #76 deals with a massive number of backend servers:

srv_group group_0 { server 127.0.0.1:9090 conns_n=1; }
srv_group group_1 { server 127.0.0.1:9090 conns_n=1; }
srv_group group_2 { server 127.0.0.1:9090 conns_n=1; }
....
srv_group group_999 { server 127.0.0.1:9090 conns_n=1; }

sched_http_rules {
match group_0  hdr_host eq "group-0.com";
match group_1  hdr_host eq "group-1.com";
match group_2  hdr_host eq "group-2.com";
....
match group_999  hdr_host eq "group-999.com";
}

Currently all 1000 or more match rules are checked sequentially. The example is quite realistic for massive hosting installations. The BG algorithm implemented in #901 must be applied to the matching. The matching syntax should probably be adjusted as follows (with #731 in mind):

host == {
    "group-0.com" -> group_0;
    "group-1.com" -> group_1;
    "group-2.com" -> group_2;
}

HTTPtables

Strings matching

The use case from #731 must also be processed in a more efficient way, e.g. using a hash table or a tree:

http_chain {
        mark == {
                2 -> backend_0;
                3 -> backend_1;
                4 -> backend_2;
                5 -> backend_3;
                ....
        }
}
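For integer keys such as `mark`, one simple way to avoid the sequential scan is sketched below. It uses a sorted array searched with `bsearch()` as a stand-in for the tree mentioned above (same O(log n) lookup, simpler code). `struct mark_rule` and the function names are hypothetical and the code is userspace C, not the Tempesta FW implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical rule layout: a mark value mapped to a backend index. */
struct mark_rule {
	uint32_t	mark;
	int		backend;
};

static int
mark_cmp(const void *a, const void *b)
{
	uint32_t x = ((const struct mark_rule *)a)->mark;
	uint32_t y = ((const struct mark_rule *)b)->mark;

	return (x > y) - (x < y);
}

/* Sort once at configuration time. */
static void
mark_rules_prepare(struct mark_rule *rules, size_t n)
{
	qsort(rules, n, sizeof(*rules), mark_cmp);
}

/* O(log n) lookup at run time; returns the backend index or -1. */
static int
mark_lookup(const struct mark_rule *rules, size_t n, uint32_t mark)
{
	struct mark_rule key = { mark, 0 };
	const struct mark_rule *r;

	r = bsearch(&key, rules, n, sizeof(*rules), mark_cmp);
	return r ? r->backend : -1;
}
```

If the mark values are small and dense, as in the example above, a direct array indexed by the mark would be even cheaper; the sorted array is the more general fallback.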

Memory spatial locality

At the moment kzalloc() is used a lot during the configuration phase, so spatial locality at run time can be improved by using more local data structures.
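One possible direction, shown here only as a userspace sketch, is to carve all rules of a chain out of a single contiguous block so that a scan touches adjacent cache lines. `struct arena` and its functions are hypothetical names, and `calloc()` stands in for whatever kernel allocation would actually back the block.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump allocator; a userspace stand-in for kernel allocation. */
struct arena {
	char	*base;
	size_t	used;
	size_t	size;
};

static int
arena_init(struct arena *a, size_t size)
{
	a->base = calloc(1, size);	/* zeroed, like kzalloc() */
	a->used = 0;
	a->size = a->base ? size : 0;
	return a->base ? 0 : -1;
}

/* Hands out consecutive, pointer-aligned chunks of the same block. */
static void *
arena_alloc(struct arena *a, size_t len)
{
	size_t align = sizeof(void *);
	size_t start = (a->used + align - 1) & ~(align - 1);

	if (start > a->size || len > a->size - start)
		return NULL;
	a->used = start + len;
	return a->base + start;
}

/* Releases the whole block at once, e.g. on reconfiguration. */
static void
arena_free(struct arena *a)
{
	free(a->base);
	memset(a, 0, sizeof(*a));
}
```

Rules and their pattern strings allocated back to back this way end up next to each other in memory, whereas separate kzalloc() calls give no such guarantee.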

The chains

Currently HTTPtables sequentially scans all the rules in a chain, which isn't efficient. The first option is to run only one per-header match using multi-pattern matching. Probably, there are also other optimization opportunities.

We need some use cases with large chains to understand the typical workload, i.e. whether there are cases with many patterns for the same header or whether the matchers mostly target different headers.

Generic strings matching

Actually, Tempesta FW is full of multiple-string matching. E.g. the caching policy for content type suffixes is evaluated with a for loop in tfw_capolicy_match(), while a powerful web resource can have a lot of various suffixes: aif, aiff, au, avi, bin, bmp, cab, carb, cct, cdf, class, css, doc, dcr, dtd, gcf, gff, gif, grv, hdml, hqx, ico, ini, jpeg, jpg, js, mov, mp3, nc, pct, ppc, pws, swa, swf, txt, vbs, w32, wav, wbmp, wml, wmlc, wmls, wmlsc, xsd, zip.
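As a rough illustration of replacing such a loop, the sketch below keeps a few of the suffixes above in a sorted table and looks up the extension with `bsearch()`. It does not reproduce tfw_capolicy_match(); `suffixes`, `suffix_cmp()` and `suffix_match()` are hypothetical names, the comparison is case-sensitive, and a sorted table is just one simple alternative to the multi-pattern algorithms discussed in this issue.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A few suffixes from the list above, kept in strcmp() order. */
static const char *const suffixes[] = {
	"avi", "bmp", "css", "gif", "ico", "jpeg", "jpg", "js", "mp3", "zip",
};

/* bsearch() passes the key first and a pointer to the table slot second. */
static int
suffix_cmp(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if the text after the last '.' is in the table, 0 otherwise. */
static int
suffix_match(const char *path)
{
	const char *dot = strrchr(path, '.');

	if (!dot || !dot[1])
		return 0;
	return bsearch(dot + 1, suffixes,
		       sizeof(suffixes) / sizeof(suffixes[0]),
		       sizeof(suffixes[0]), suffix_cmp) != NULL;
}
```

With the full list of roughly 45 suffixes this takes about six string comparisons per lookup instead of up to 45.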

Testing

Functional tests

TBD

Performance

We need a solid estimate of the number of rules and/or chains at which performance degrades significantly.

@krizhanovsky krizhanovsky added this to the 1.0 WebOS milestone May 13, 2017
@krizhanovsky krizhanovsky modified the milestones: backlog, 1.0 Web Operating System Jan 15, 2018
@krizhanovsky krizhanovsky modified the milestones: 0.5 alpha, 1.0 Tempesta OS, 0.10 Kernel-User Space Transport Feb 5, 2018
@krizhanovsky krizhanovsky modified the milestones: 0.10 Kernel-User Space Transport , 0.9 Web server Feb 10, 2018
@krizhanovsky krizhanovsky self-assigned this Mar 31, 2018
@krizhanovsky krizhanovsky mentioned this issue Dec 27, 2021
2 tasks
@krizhanovsky krizhanovsky modified the milestones: 1.3 TBD( Web server & advanced strings), 1.2 TBD Jan 3, 2022
@krizhanovsky krizhanovsky modified the milestones: 1.xx TBD, 1.x: TBD Apr 19, 2023
@krizhanovsky krizhanovsky removed their assignment Nov 12, 2023
@krizhanovsky krizhanovsky modified the milestones: 1.1: TBD, 1.2 - TBD Nov 12, 2023