
Tempesta health statistics bugs #2036

Open

voodam opened this issue Jan 10, 2024 · 4 comments · May be fixed by #2109

voodam commented Jan 10, 2024

Inherited from #1454

Negative values in statistics

On the current master as of the date of this issue, I observe negative values in the statistics when only a couple of requests have been processed by Tempesta (observed for the first line only):

Minimal response time		: -1ms
Average response time		: 0ms
Median  response time		: 0ms
Maximum response time		: 0ms

One more backend server statistics issue, taken from a production server:

root@tempesta-1:/proc/tempesta# cat servers/default/192.168.122.1 
Minimal response time		: 0ms
Average response time		: 93ms
Median  response time		: 1ms
Maximum response time		: 1960ms
Percentiles
50%:	1ms
75%:	1ms
90%:	256ms
95%:	2749ms
99%:	2749ms

Obviously, the maximum response time cannot be smaller than the 95th or 99th percentile.

Hung socket buffers

The same number of socket buffers may stay in the statistics for a relatively long time, which looks fishy (statistics from our web site):

# for i in `seq 1 10`; do grep 'Socket buffers in flight' perfstat ; sleep 1; done
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
Socket buffers in flight		: 49
voodam changed the title from "Tempesta health statistics bugs (inherited from #1454)" to "Tempesta health statistics bugs" on Jan 10, 2024
krizhanovsky modified the milestones: 0.9 - LA, 1.1: TBD on Jan 10, 2024
kingluo added a commit that referenced this issue on Apr 30, 2024
kingluo linked a pull request on Apr 30, 2024 that will close this issue

kingluo commented Apr 30, 2024

  1. Scaled value vs. real value
root@tempesta-1:/proc/tempesta# cat servers/default/192.168.122.1 
Minimal response time		: 0ms
Average response time		: 93ms
Median  response time		: 1ms
Maximum response time		: 1960ms
Percentiles
50%:	1ms
75%:	1ms
90%:	256ms
95%:	2749ms
99%:	2749ms

The maximum response time is the real collected time, but the values shown in the percentiles are not: they are scaled for optimization, so they may be aligned upward to a value larger than the real one.

rtt = rng->ctl[r].begin + (b << rng->ctl[r].order);

For example, if you have two responses with rtt=1960ms, then the stats will be:

$ cat /proc/tempesta/perfstat
Minimal response time           : 1960ms
Average response time           : 1960ms
Median  response time           : 2029ms
Maximum response time           : 1960ms
Percentiles
50%:    2029ms
75%:    2029ms
90%:    2029ms
95%:    2029ms
99%:    2029ms

related logs:

[19900.613276] rtt=1960, centry=2
[19900.613449] rtt=1960, centry=2
[19900.629057] p=3, v=65535,i=64,r=4,b=0
[19900.629245] p=3, v=65535,i=64,r=4,b=0
[19900.629407] p=3, v=2029,i=63,r=3,b=15
[19900.629569] p=3, v=65535,i=64,r=4,b=0
[19900.629732] p=3, v=65535,i=64,r=4,b=0
[19900.629892] p=3, v_min=2029
[19900.630070] p=4, v_min=2029
[19900.630227] p=5, v_min=2029
[19900.630385] p=6, v_min=2029
[19900.630538] p=7, v_min=2029

The debug patch used to produce these logs:

@@ -783,6 +783,7 @@ tfw_apm_prnctl_calc(TfwApmRBuf *rbuf, TfwApmRBCtl *rbctl, TfwPrcntlStats *pstats
    while (p < T_PSZ) {
        int v_min = USHRT_MAX;
        for (i = 0; i < rbuf->rbufsz; i++) {
+       printk("p=%d, v=%u,i=%u,r=%u,b=%u\n", p,st[i].v, st[i].i,st[i].r,st[i].b);
            if (st[i].v < v_min)
                v_min = st[i].v;
        }
@@ -794,8 +795,10 @@ tfw_apm_prnctl_calc(TfwApmRBuf *rbuf, TfwApmRBCtl *rbctl, TfwPrcntlStats *pstats
            cnt += pcntrng->cnt[st[i].r][st[i].b];
            tfw_apm_state_next(pcntrng, &st[i]);
        }
-       for ( ; p < T_PSZ && pval[p] <= cnt; ++p)
+       for ( ; p < T_PSZ && pval[p] <= cnt; ++p) {
+       printk("p=%d, v_min=%u\n", p,v_min);
            pstats->val[p] = v_min;
+       }
@@ -1074,6 +1077,7 @@ __tfw_apm_update(TfwApmData *data, unsigned long jtstamp, unsigned int rtt)
    int centry = (jtstamp / tfw_apm_jtmintrvl) % data->rbuf.rbufsz;
    unsigned long jtmistart = jtstamp - (jtstamp % tfw_apm_jtmintrvl);
    TfwApmUBEnt rtt_data = { .centry = centry, .rtt = rtt };
+       printk("rtt=%d, centry=%d\n", rtt, centry);

    tfw_apm_rbent_checkreset(&data->rbuf.rbent[centry], jtmistart);
    WRITE_ONCE(ubent[jtstamp % ubuf->ubufsz].data, rtt_data.data);

I.e. the percentile calculation reports the reconstructed bucket boundary 2029 as the median, not the real value 1960.
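
For concreteness, plugging the values from the log into the reconstruction formula above reproduces the 2029ms figure. In the sketch below, r=3 and b=15 come from the log line itself (i=63), while begin=109 and order=7 are assumed control values chosen to be consistent with the logged output; they are not taken from the running server.

#include <stdio.h>

int main(void)
{
	/* From the log line "p=3, v=2029,i=63,r=3,b=15": range 3, bucket 15.
	 * begin=109 and order=7 are assumptions consistent with that output. */
	unsigned int begin = 109, order = 7, b = 15;
	unsigned int reported = begin + (b << order);	/* same formula as above */
	unsigned int real_rtt = 1960;			/* the collected sample */

	printf("reported=%ums real=%ums\n", reported, real_rtt);	/* 2029 vs 1960 */
	return 0;
}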

  2. The maximum value is not updated due to a type incompatibility.

if (r_time < rng->min_val) {

unsigned long min_val;

The 0 of unsigned long is cast to 4294967295 of unsigned int, so it is always bigger than any unsigned int value; as a result, the maximum value is never updated while the minimum value keeps going down.
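
A minimal sketch of a type-consistent update that avoids this (illustrative only, with made-up names; this is not the patch from the linked pull request): keep both bounds in the same integer type as the measured response time and update them independently, so the maximum still gets refreshed even while the minimum keeps decreasing.

#include <limits.h>
#include <stdio.h>

/* Illustration only: the struct and function names are made up. */
struct stats_minmax {
	unsigned int min_val;	/* same width as the measured response time */
	unsigned int max_val;
};

static void stats_minmax_update(struct stats_minmax *s, unsigned int r_time)
{
	if (r_time < s->min_val)
		s->min_val = r_time;
	if (r_time > s->max_val)
		s->max_val = r_time;
}

int main(void)
{
	struct stats_minmax s = { .min_val = UINT_MAX, .max_val = 0 };
	unsigned int samples[] = { 5, 3, 1 };	/* the minimum keeps decreasing */

	for (unsigned int i = 0; i < 3; i++)
		stats_minmax_update(&s, samples[i]);

	printf("min=%ums max=%ums\n", s.min_val, s.max_val);	/* min=1ms max=5ms */
	return 0;
}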


kingluo commented May 1, 2024

The "Socket buffers in flight" counter is system-wide, so it changes whenever any program has network activity; in fact, it is non-zero even if Tempesta has not been started. The counter can be read by any simple kernel module:

https://gist.github.com/kingluo/9c7feb606a1348cfb8c942ecb35bddf6#file-hello-c

Also, the counter maintenance seems to be wrong: for example, the counter is not decremented in napi_skb_free_stolen_head() (which is used to drop an skb whose data has already been merged by GRO).

We can verify the counter maintenance with a simple SystemTap script: while the script is running, trigger some random network activity (ping a host, curl a website), then check whether the counter delta is 0.

https://gist.github.com/kingluo/9c7feb606a1348cfb8c942ecb35bddf6#file-skb_stats-stap
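
As a toy illustration of the effect (plain userspace C, nothing Tempesta- or kernel-specific, all names made up): if one release path forgets to decrement the in-flight counter, the counter drifts to a non-zero value and then appears frozen, which is the same pattern as the constant 49 above.

#include <stdio.h>

static long in_flight;

static void skb_alloc(void)       { in_flight++; }
static void skb_free(void)        { in_flight--; }
static void skb_free_stolen(void) { /* the decrement is missing here */ }

int main(void)
{
	for (int i = 0; i < 100; i++) {
		skb_alloc();
		/* assume roughly half of the skbs are released through the
		 * path that forgets the decrement */
		if (i % 2)
			skb_free_stolen();
		else
			skb_free();
	}
	printf("skbs in flight (reported): %ld\n", in_flight);	/* 50, and it stays there */
	return 0;
}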

kingluo added a commit to tempesta-tech/linux-5.10.35-tfw that referenced this issue May 2, 2024
part of tempesta-tech/tempesta#2036

`napi_skb_free_stolen_head()` should decrease skb counter.
krizhanovsky commented

Probably the percentiles bug should be fixed with a probabilistic algorithm like a count-min sketch, and maybe this should be done in #712.


kingluo commented May 6, 2024

The current design principles are:

There are 4 ranges for collecting rtt samples, each with 16 buckets, and an actual rtt value is mapped to a (range, bucket) pair (see the sketch after the table below). Note that the last range may be extended to cover larger values. "Extending" here does not mean growing the array or adding buckets, but adjusting the parameters begin, end and order so that larger values can still be projected into this range.

order:         0      1      2       4
begin..end: |1..16|17..47|48..108|109..349|
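
A small sketch of this mapping, using the begin/end/order values from the table above as example parameters (they are not necessarily the live ones): a sample is placed into the first range whose end is not below it, and the bucket index follows from that range's order.

#include <stdio.h>

#define RANGES	4
#define BUCKETS	16

/* Example parameters taken from the table above. */
struct ctl { unsigned int begin, end, order; };

static const struct ctl ctl[RANGES] = {
	{ 1, 16, 0 }, { 17, 47, 1 }, { 48, 108, 2 }, { 109, 349, 4 },
};

/* Map an rtt to (range, bucket); a sample beyond the last range would trigger
 * the "extension" described above (re-deriving the last range's parameters). */
static int map_rtt(unsigned int rtt, unsigned int *r, unsigned int *b)
{
	if (rtt < ctl[0].begin)
		rtt = ctl[0].begin;
	for (*r = 0; *r < RANGES; ++*r) {
		if (rtt <= ctl[*r].end) {
			*b = (rtt - ctl[*r].begin) >> ctl[*r].order;
			return 0;
		}
	}
	return -1;	/* needs range extension */
}

int main(void)
{
	unsigned int r, b, rtt = 93;

	if (!map_rtt(rtt, &r, &b))
		printf("rtt=%ums -> range=%u bucket=%u (boundary %ums)\n",
		       rtt, r, b, ctl[r].begin + (b << ctl[r].order));
	return 0;
}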

When calculating the percentiles, each percentile boundary is determined from the cumulative bucket counts, scanned from left to right.

The time window and scaling values in the configuration file are used to decide whether the histogram needs to be recalculated, so as to avoid overly frequent updates.

Now we know why the histogram may not match the maximum value: the bucket bounds are a logical approximation, and the approximation may be larger than the real maximum.

So the solution is to use the real maximum as the reported value whenever the approximation is greater than it.

Example:

Minimal response time           : 60ms
Average response time           : 1400ms
Median  response time           : 1133ms
Maximum response time           : 2572ms
Percentiles
50%:    1133ms
75%:    1901ms
90%:    2572ms
95%:    2572ms
99%:    2572ms
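
A minimal sketch of that clamping (illustrative only, not necessarily how the linked pull request implements it): whenever the reconstructed percentile value exceeds the real collected maximum, report the maximum instead; this is what produces the identical 2572ms entries in the top percentiles above.

#include <stdio.h>

/* Clamp an approximated percentile value to the real collected maximum. */
static unsigned int clamp_to_max(unsigned int approx, unsigned int max_val)
{
	return approx > max_val ? max_val : approx;
}

int main(void)
{
	/* Numbers from the earlier two-response example: the approximation is
	 * 2029ms while the real maximum is 1960ms. */
	printf("reported percentile: %ums\n", clamp_to_max(2029, 1960));	/* 1960ms */
	return 0;
}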

kingluo added a commit that referenced this issue May 6, 2024