VRL reference counts #9785

Closed
StephenWakely wants to merge 19 commits into master from stephen/vrl_refcount

@StephenWakely
Contributor

@StephenWakely StephenWakely commented Oct 25, 2021

This is a very rough proof of concept that moves VRL to Rc<RefCell<Value>> to reduce the number of clones needed at runtime.

In testing the results were pretty disappointing, so I would very much appreciate some help looking through this to see if I am missing anything obvious.

I am testing with this config:

[sources.randomvip]
type = "generator"
format = "shuffle"
count = 10000000
lines = [
  '{"message": "{\"noog\": \"nork\"}", "host": "schneider", "file": "zork/12-32.log"}',
  '{"message": "{\"nonk\": \"flork\"}", "host": "zookflook", "file": "zork/34-29.log"}',
]
decoding.codec = "json"
interval = 0.0


[transforms.remap]
type = "remap"
inputs = ["randomvip"]
source = '''
.agent_name = "vector"

parsed, err = parse_json(.message)
if err == null {
    .message = parsed
    .format = "json"
} else {
    .format = "ascii"
}
matches = parse_regex!(.file, r'.*/(?P<num>\d+)-(?P<name>\w+).log')
.origin, err = .host + "/" + matches.name + "/" + matches.num
'''

[sinks.blackhole]
type = "blackhole"
inputs = ["remap"]

Running master, this is giving me a time of:

./target/release/vector -c ../confs/perf.toml  270.23s user 321.31s system 219% cpu 4:29.63 total

Running this branch we get times of:

./vector.rc -c confs/perf.toml  247.08s user 291.09s system 214% cpu 4:10.90 total

Here is a flamegraph from running master:
https://github.com/vectordotdev/vector/blob/stephen/vrl_refcount/flamegraph.master.svg

Here is a flamegraph from running the RC branch:
https://github.com/vectordotdev/vector/blob/stephen/vrl_refcount/flamegraph.rc.svg

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
@netlify

netlify Bot commented Oct 25, 2021

✔️ Deploy Preview for vector-project canceled.

🔨 Explore the source changes: e1c1007

🔍 Inspect the deploy log: https://app.netlify.com/sites/vector-project/deploys/61a8999e5fd53200080b9157

@StephenWakely
Contributor Author

With a simpler config, the timings are similar:

[transforms.remap]
type = "remap"
inputs = ["randomvip"]
source = '''
.agent_name = "vector"

if .message == "thing" {
  .nonk = upcase!(del(.host))
} else if .message == "thung" {
  .noog = downcase!(del(.host))
}
matches = { "name": .message, "num": 2 }
.origin, err = .host + "/" + matches.name + "/" + matches.num
'''

On master:

../vector.master -c ../confs/perfsimple.toml  227.63s user 308.42s system 228% cpu 3:54.54 total

With ref counting:

./target/release/vector -c ../confs/perfsimple.toml  213.28s user 284.74s system 223% cpu 3:42.38 total

@StephenWakely
Contributor Author

Memory profiles seem similar as well.

Master:

[image: memory profile on master]

Reference counted:

[image: memory profile with reference counting]

@StephenWakely
Contributor Author

It was pointed out to me that the Generator source is quite expensive, so I have updated the test config to use the HTTP source, with data generated by https://github.com/blt/lading.

Sending 100,000 records through Vector this way reduced the runtime from 1m20s to 1m02s, albeit measured fairly naively (timed with my watch). This is a much more encouraging benchmark!

@bits-bot

bits-bot commented Nov 3, 2021

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions Bot added domain: core Anything related to core crates i.e. vector-core, core-common, etc domain: rfc domain: transforms Anything related to Vector's transform components domain: vrl Anything related to the Vector Remap Language labels Nov 9, 2021
}
Event::Metric(event) => VrlTarget::Metric(event),
Event::Log(event) => VrlTarget::LogEvent(
SharedValue::from(vrl_core::Value::from(&event.fields)),
Contributor Author

VrlTarget now holds a SharedValue (a wrapper over Rc<RefCell<Value>>). Creating the target copies the event into this SharedValue, which is why the clone is no longer necessary: the original event is left untouched until the end of the process.

  VrlTarget::LogEvent(log, _) => log
      .get(path)
-     .map(|val| val.map(|val| val.clone().into()))
+     .map(|val| val.map(|val| val.clone()))
Contributor Author

The clone here is now a cheap increment of the reference count. We also don't need the into since the value is already stored as a vrl::Value.
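
To illustrate what "a cheap increment of the reference count" means, here is a minimal sketch. It is not code from this branch: the `Value` enum, the `Shared` alias, and both helper functions are hypothetical stand-ins. It contrasts cloning the handle with deep-copying the value behind it.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for a VRL value; the real enum has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bytes(String),
    Integer(i64),
}

// Hypothetical alias for the shared representation discussed above.
pub type Shared = Rc<RefCell<Value>>;

/// Returns a second handle to the same value. Only the reference count
/// changes; the payload (for example the string bytes) is not copied.
pub fn share(value: &Shared) -> Shared {
    Rc::clone(value)
}

/// Returns an independent copy in a new allocation, which is roughly what a
/// plain `Value::clone` costs without reference counting.
pub fn deep_copy(value: &Shared) -> Shared {
    Rc::new(RefCell::new(value.borrow().clone()))
}
```

One consequence worth keeping in mind: because `share` returns a handle to the same cell, a write through one handle is visible through every other handle, whereas a deep copy is fully independent.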

object
.iter()
.map(|(k, v)| (k.clone(), v.borrow().clone().into()))
.collect(),
Contributor Author

There is a cost here since we now need to convert back from a vrl::Value into a Vector Value.

This is one downside of the approach.
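
A rough sketch of where that cost comes from, assuming two simplified value types (`EventValue` and `VrlValue` are hypothetical names, not the real Vector or VRL types): the event is wrapped node by node on the way in, and the shared tree has to be walked and rebuilt on the way out.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;
use std::rc::Rc;

// Hypothetical, simplified stand-in for Vector's event value.
#[derive(Debug, Clone, PartialEq)]
pub enum EventValue {
    Bytes(String),
    Map(BTreeMap<String, EventValue>),
}

// Hypothetical, simplified stand-in for a VRL value with shared children.
#[derive(Debug)]
pub enum VrlValue {
    Bytes(String),
    Object(BTreeMap<String, Rc<RefCell<VrlValue>>>),
}

/// Wraps every node of the event value in `Rc<RefCell<..>>`.
pub fn to_shared(value: &EventValue) -> Rc<RefCell<VrlValue>> {
    let inner = match value {
        EventValue::Bytes(s) => VrlValue::Bytes(s.clone()),
        EventValue::Map(map) => VrlValue::Object(
            map.iter().map(|(k, v)| (k.clone(), to_shared(v))).collect(),
        ),
    };
    Rc::new(RefCell::new(inner))
}

/// Walks the shared tree and rebuilds an owned event value, copying each leaf.
pub fn to_event(value: &Rc<RefCell<VrlValue>>) -> EventValue {
    let guard = value.borrow();
    match &*guard {
        VrlValue::Bytes(s) => EventValue::Bytes(s.clone()),
        VrlValue::Object(map) => {
            EventValue::Map(map.iter().map(|(k, v)| (k.clone(), to_event(v))).collect())
        }
    }
}
```

In this sketch both directions touch every node, so any clones saved inside the program have to outweigh two full traversals per event.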

mod target;

#[derive(Debug, Clone, PartialEq)]
pub struct SharedValue(pub(crate) Rc<RefCell<Value>>);
Contributor Author

@StephenWakely StephenWakely Nov 9, 2021

SharedValue is just a wrapper over Rc<RefCell<Value>>.

Since it is now used everywhere, it is much easier to create a wrapper and add functionality to it than to use Rc<RefCell<Value>> everywhere and deal with the Rc and RefCell directly.

I will add more functionality here over time.
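
As an illustration of the wrapper idea, here is a minimal sketch. The methods shown are assumptions for the example and not necessarily the ones on this branch, and `Value` is a simplified stand-in. The point is that call sites talk to the wrapper and never spell out `Rc` or `RefCell`.

```rust
use std::cell::{Ref, RefCell, RefMut};
use std::rc::Rc;

// Simplified stand-in for the VRL value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Integer(i64),
    Bytes(String),
}

// Same shape as the struct in the diff above; the methods below are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct SharedValue(Rc<RefCell<Value>>);

impl SharedValue {
    /// Shared, read-only access. Panics if the value is mutably borrowed.
    pub fn borrow(&self) -> Ref<'_, Value> {
        self.0.borrow()
    }

    /// Exclusive access. Panics if any other borrow is still alive.
    pub fn borrow_mut(&self) -> RefMut<'_, Value> {
        self.0.borrow_mut()
    }
}

impl From<Value> for SharedValue {
    fn from(value: Value) -> Self {
        SharedValue(Rc::new(RefCell::new(value)))
    }
}
```

Deriving `Clone` on the wrapper gives the cheap handle clone, and deriving `PartialEq` compares the wrapped values rather than the pointers.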

let value = value.try_bytes_utf8_lossy()?;

let with_value = self.with.resolve(ctx)?;
let with_value = with_value.borrow();
Contributor Author

This is a common pattern. Unfortunately it seems we can't add this function to SharedValue like:

pub fn try_bytes_utf8_lossy(&self) -> Result<Cow<'_, str>, crate::value::error::Error> {
        let value = self.borrow();
        value.try_bytes_utf8_lossy()
}

Since that results in an error:

error[E0515]: cannot return value referencing local variable `value`
  --> lib/vrl/compiler/src/shared_value.rs:72:9
   |
72 |         value.try_bytes_utf8_lossy()
   |         -----^^^^^^^^^^^^^^^^^^^^^^^
   |         |
   |         returns a value referencing data owned by the current function
   |         `value` is borrowed here

(It does work for Integer, however, since that is Copy. I will add that.)
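
For reference, a sketch of the shapes that do compile. The method names and the simplified `Value` are hypothetical and not taken from this branch. The borrowed `Cow` cannot outlive the `Ref` guard, so the options are to return a `Copy` value, return an owned `String`, or hand the borrowed data to a closure while the guard is still alive.

```rust
use std::borrow::Cow;
use std::cell::RefCell;
use std::rc::Rc;

// Simplified stand-in for the VRL value type.
#[derive(Debug)]
pub enum Value {
    Bytes(Vec<u8>),
    Integer(i64),
}

pub struct SharedValue(Rc<RefCell<Value>>);

impl SharedValue {
    pub fn new(value: Value) -> Self {
        SharedValue(Rc::new(RefCell::new(value)))
    }

    /// Fine: `i64` is `Copy`, so nothing borrowed from the guard escapes.
    pub fn try_integer(&self) -> Option<i64> {
        let guard = self.0.borrow();
        match &*guard {
            Value::Integer(i) => Some(*i),
            Value::Bytes(_) => None,
        }
    }

    /// Fine, but always allocates: the result is owned.
    pub fn to_string_lossy(&self) -> Option<String> {
        let guard = self.0.borrow();
        match &*guard {
            Value::Bytes(b) => Some(String::from_utf8_lossy(b).into_owned()),
            Value::Integer(_) => None,
        }
    }

    /// Fine, and avoids the copy for valid UTF-8: the closure runs while the
    /// guard is alive, and its result cannot borrow from the `Cow`.
    pub fn with_str_lossy<R>(&self, f: impl FnOnce(Cow<'_, str>) -> R) -> Option<R> {
        let guard = self.0.borrow();
        match &*guard {
            Value::Bytes(b) => Some(f(String::from_utf8_lossy(b))),
            Value::Integer(_) => None,
        }
    }
}
```

The closure form keeps the zero-copy behaviour but changes every call site, which may be part of why borrowing at the call site remained the common pattern.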

};

if segments.peek().is_none() {
return self.borrow_mut().remove_by_segment(segment);
Contributor Author

The code here is probably the hairiest aspect of this PR. There are a lot of borrow and borrow_mut calls around.

With a bit of work it's possible a lot of this could be tidied up so we only have a single borrow_mut at the top, but I think it would still be of great value to fuzz test this code very thoroughly.
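
To make the risk concrete, here is a small sketch (the `append` function is hypothetical, not the path code in this PR) of how two handles to the same cell turn a borrow conflict into a runtime failure. Using `try_borrow_mut` reports the conflict as an error; a plain `borrow_mut` at the same spot would panic.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Appends the contents of `from` onto `into`. If both handles point at the
/// same cell, the shared borrow of `from` is still alive when the exclusive
/// borrow of `into` is requested, so the request fails.
pub fn append(
    into: &Rc<RefCell<Vec<i64>>>,
    from: &Rc<RefCell<Vec<i64>>>,
) -> Result<(), String> {
    // Shared borrow, held until the end of the function.
    let source = from.borrow();
    // `into.borrow_mut()` would panic here when `into` and `from` alias.
    let mut target = into
        .try_borrow_mut()
        .map_err(|e| format!("value is already borrowed: {}", e))?;
    target.extend(source.iter().copied());
    Ok(())
}
```

The compiler cannot see that the two arguments might alias, so this class of bug only shows up with the right input, which is where fuzzing helps.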

@github-actions
Contributor

Soak Test Results

Baseline: 1ec79ef
Comparison: d18d1d7
Total Vector CPUs: 4

What follows is a statistical summary of the soak captures between the SHAs given above. Units are bytes/second/CPU, except for 'skewness' and 'kurtosis'. Higher numbers in 'comparison' are generally better. Higher skewness or kurtosis numbers indicate a lack of consistency in behavior, making predictions of fitness in the field challenging.


datadog_agent_remap_blackhole

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 10.49Mi 10.52Mi 10.52Mi 10.53Mi -0.16 -0.19
comparison 10.04Mi 10.07Mi 10.07Mi 10.07Mi -0.93 0.76

datadog_agent_remap_datadog_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 19.46Mi 19.50Mi 19.51Mi 19.52Mi 0.31 -0.12
comparison 17.27Mi 17.32Mi 17.32Mi 17.33Mi -0.11 -0.85

fluent_elasticsearch

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 54.05Mi 54.25Mi 54.29Mi 54.31Mi -0.53 -0.45
comparison 53.97Mi 54.15Mi 54.17Mi 54.17Mi -0.49 -0.34

fluent_remap_aws_firehose

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 38.38Mi 38.59Mi 38.64Mi 38.66Mi 0.62 -0.36
comparison 42.07Mi 42.24Mi 42.29Mi 42.29Mi 0.15 -0.93

http_pipelines_blackhole

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 0.00 1.05Ki 1.59Ki 1.59Ki 2.08 2.62
comparison 0.00 556.00 556.00 8.31Ki 9.15 96.46

splunk_hec_route_s3

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.34Mi 5.61Mi 5.66Mi 5.69Mi -0.33 -0.45
comparison 0.00 189.20Ki 189.20Ki 189.20Ki 0.42 -1.55

splunk_transforms_splunk3

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 2.35Mi 2.52Mi 2.57Mi 2.60Mi 0.11 -0.58
comparison 0.00 124.77Ki 147.22Ki 149.45Ki 0.36 -0.36

syslog_humio_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 7.16Mi 7.20Mi 7.20Mi 7.20Mi -0.50 -0.75
comparison 6.80Mi 6.85Mi 6.85Mi 6.85Mi -0.44 -0.86

syslog_log2metric_humio_metrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 4.99Mi 5.02Mi 5.02Mi 5.02Mi -0.55 -0.52
comparison 5.11Mi 5.13Mi 5.13Mi 5.14Mi -0.22 -0.78

syslog_log2metric_splunk_hec_metrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.15Mi 5.18Mi 5.18Mi 5.18Mi -0.14 -1.45
comparison 6.05Mi 6.07Mi 6.07Mi 6.07Mi -1.07 1.75

syslog_loki

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 3.88Mi 4.32Mi 4.34Mi 4.34Mi -0.72 -0.51
comparison 3.66Mi 3.91Mi 3.95Mi 3.96Mi -0.31 -1.35

syslog_regex_logs2metric_ddmetrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 3.86Mi 3.87Mi 3.87Mi 3.87Mi -1.31 0.95
comparison 4.42Mi 4.63Mi 4.64Mi 4.64Mi -1.60 1.12

syslog_splunk_hec_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 7.18Mi 7.22Mi 7.22Mi 7.22Mi -0.22 -1.67
comparison 7.01Mi 7.05Mi 7.06Mi 7.06Mi 0.10 -1.19

@github-actions
Contributor

Soak Test Results

Baseline: 1ec79ef
Comparison: 3472115
Total Vector CPUs: 4

What follows is a statistical summary of the soak captures between the SHAs given above. Units are bytes/second/CPU, except for 'skewness' and 'kurtosis'. Higher numbers in 'comparison' are generally better. Higher skewness or kurtosis numbers indicate a lack of consistency in behavior, making predictions of fitness in the field challenging.


datadog_agent_remap_blackhole

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 10.06Mi 10.08Mi 10.09Mi 10.09Mi 0.44 0.92
comparison 10.07Mi 10.09Mi 10.10Mi 10.10Mi 0.11 -0.26

datadog_agent_remap_datadog_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 18.77Mi 19.18Mi 19.21Mi 19.21Mi 0.68 -1.18
comparison 17.35Mi 17.41Mi 17.42Mi 17.42Mi 0.80 -0.26

fluent_elasticsearch

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 54.37Mi 54.62Mi 54.64Mi 54.64Mi -0.58 -0.81
comparison 54.41Mi 54.71Mi 54.79Mi 54.80Mi -0.69 -0.52

fluent_remap_aws_firehose

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 40.01Mi 40.30Mi 40.33Mi 40.34Mi -0.61 -0.74
comparison 42.65Mi 42.87Mi 42.90Mi 42.91Mi -0.36 -0.77

http_pipelines_blackhole

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 0.00 556.00 2.12Ki 2.12Ki 2.10 5.65
comparison 0.00 17.70Ki 17.70Ki 17.70Ki 0.21 -1.82

splunk_hec_route_s3

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.40Mi 5.62Mi 5.70Mi 5.72Mi 0.06 -0.50
comparison 5.39Mi 5.58Mi 5.61Mi 5.63Mi -0.02 -0.84

splunk_transforms_splunk3

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 2.43Mi 2.64Mi 2.69Mi 2.69Mi 0.07 -1.27
comparison 2.78Mi 2.87Mi 2.90Mi 2.90Mi -0.14 -0.19

syslog_humio_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 7.15Mi 7.17Mi 7.17Mi 7.17Mi -0.05 -0.99
comparison 6.93Mi 6.95Mi 6.95Mi 6.95Mi 0.29 -0.95

syslog_log2metric_humio_metrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 4.98Mi 4.99Mi 5.00Mi 5.00Mi 0.11 -0.41
comparison 4.92Mi 4.94Mi 4.94Mi 4.94Mi -0.10 -0.22

syslog_log2metric_splunk_hec_metrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.32Mi 5.34Mi 5.35Mi 5.35Mi 0.10 -0.84
comparison 6.02Mi 6.04Mi 6.04Mi 6.04Mi 0.25 -1.40

syslog_loki

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 4.01Mi 4.29Mi 4.31Mi 4.32Mi -0.18 -0.64
comparison 4.03Mi 4.64Mi 4.69Mi 4.70Mi -0.27 -0.84

syslog_regex_logs2metric_ddmetrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 3.75Mi 3.76Mi 3.76Mi 3.76Mi 0.83 0.91
comparison 4.54Mi 4.59Mi 4.60Mi 4.60Mi -0.38 -0.77

syslog_splunk_hec_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 7.13Mi 7.16Mi 7.16Mi 7.17Mi 0.30 -1.44
comparison 7.06Mi 7.09Mi 7.09Mi 7.09Mi -0.40 -1.17

@github-actions
Contributor

Soak Test Results

Baseline: 9d78719
Comparison: c628fee
Total Vector CPUs: 4

What follows is a statistical summary of the soak captures between the SHAs given above. Units are bytes/second/CPU, except for 'skewness' and 'kurtosis'. Higher numbers in 'comparison' are generally better. Higher skewness or kurtosis numbers indicate a lack of consistency in behavior, making predictions of fitness in the field challenging.


datadog_agent_remap_blackhole

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 9.85Mi 9.87Mi 9.88Mi 9.88Mi -0.24 -0.08
comparison 10.51Mi 10.59Mi 10.59Mi 10.60Mi 0.44 -1.16

datadog_agent_remap_datadog_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 18.08Mi 18.12Mi 18.13Mi 18.13Mi -0.44 0.37
comparison 17.36Mi 17.44Mi 17.46Mi 17.46Mi 0.33 -0.93

fluent_elasticsearch

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 50.94Mi 51.34Mi 51.37Mi 51.39Mi 0.22 -1.42
comparison 51.53Mi 51.69Mi 51.79Mi 51.81Mi 0.59 1.22

fluent_remap_aws_firehose

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 37.01Mi 37.18Mi 37.20Mi 37.20Mi 0.15 -1.10
comparison 42.83Mi 43.01Mi 43.09Mi 43.09Mi 0.72 0.16

http_pipelines_blackhole

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 0.00 3.17Ki 3.17Ki 3.17Ki 0.23 -1.36
comparison 0.00 0.00 2.12Ki 2.12Ki 6.31 38.70

splunk_hec_route_s3

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.32Mi 5.53Mi 5.59Mi 5.59Mi 0.28 -0.42
comparison 5.04Mi 5.26Mi 5.30Mi 5.30Mi -0.04 -0.87

splunk_transforms_splunk3

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 2.37Mi 2.48Mi 2.50Mi 2.52Mi 0.09 -0.28
comparison 2.70Mi 2.92Mi 2.95Mi 2.96Mi -0.45 -1.13

syslog_humio_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 6.89Mi 6.92Mi 6.92Mi 6.92Mi 0.20 -1.18
comparison 6.84Mi 6.88Mi 6.89Mi 6.89Mi -0.26 -1.14

syslog_log2metric_humio_metrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.02Mi 5.04Mi 5.04Mi 5.04Mi 0.63 -0.85
comparison 4.92Mi 4.95Mi 4.95Mi 4.95Mi 0.36 -0.57

syslog_log2metric_splunk_hec_metrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 5.14Mi 5.16Mi 5.16Mi 5.16Mi 0.29 -1.10
comparison 6.01Mi 6.03Mi 6.03Mi 6.03Mi 0.09 -1.16

syslog_loki

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 4.04Mi 4.32Mi 4.41Mi 4.42Mi -0.03 -0.59
comparison 3.63Mi 4.33Mi 4.39Mi 4.39Mi 0.57 -1.34

syslog_regex_logs2metric_ddmetrics

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 3.76Mi 3.84Mi 3.84Mi 3.85Mi -2.45 7.07
comparison 1.97Mi 1.98Mi 1.98Mi 1.98Mi 0.03 -0.55

syslog_splunk_hec_logs

EXPERIMENT VALUE_min VALUE_p90 VALUE_p99 VALUE_max VALUE_skewness VALUE_kurtosis
baseline 7.18Mi 7.33Mi 7.35Mi 7.35Mi 0.96 -0.35
comparison 6.82Mi 6.87Mi 6.88Mi 6.88Mi 0.43 -0.18

@github-actions
Contributor

github-actions Bot commented Dec 2, 2021

Soak Test Results

Baseline: 4cdbab1
Comparison: e1c1007
Total Vector CPUs: 4

Explanation: A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across the SHAs given above. The goal of these tests is to determine, quickly, whether vector performance is changed by a pull request and to what degree. Test units below are bytes/second/CPU, except for 'skewness'. The further 'skewness' is from 0.0, the stronger the indication that vector lacks consistency in behavior, making predictions of fitness in the field challenging.

This table lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparison SHAs, with 95.0% confidence. Negative values mean that baseline is faster; positive values mean that comparison is faster. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded.

experiment Δ mean Δ mean %
fluent_remap_aws_firehose 3.84MiB 9.95
syslog_log2metric_splunk_hec_metrics 647.85KiB 12.28
syslog_regex_logs2metric_ddmetrics 637.51KiB 16.45
syslog_loki 518.7KiB 12.47
splunk_transforms_splunk3 269.91KiB 10.58
Fine details of change detection per experiment.
experiment Δ mean Δ mean % baseline mean baseline stdev baseline outlier percentage comparison mean comparison stdev comparison outlier percentage t-statistic p-value
fluent_remap_aws_firehose 3.84MiB 9.95 38.56MiB 837.19KiB 0.806452 42.4MiB 1.24MiB 0 -49.6017 3.83437e-220
syslog_log2metric_splunk_hec_metrics 647.85KiB 12.28 5.15MiB 87.15KiB 0.273973 5.79MiB 210.29KiB 0 -54.4365 3.33243e-209
syslog_regex_logs2metric_ddmetrics 637.51KiB 16.45 3.79MiB 264.6KiB 0 4.41MiB 280.53KiB 0.275482 -31.0513 5.95064e-134
fluent_elasticsearch 582.72KiB 1.07 53.11MiB 2.7MiB 0 53.68MiB 2.44MiB 0 -2.9769 0.00301026
syslog_loki 518.7KiB 12.47 4.06MiB 279.87KiB 0.275482 4.57MiB 749.72KiB 4.6832 -12.3494 1.82054e-30
splunk_transforms_splunk3 269.91KiB 10.58 2.49MiB 1.14MiB 0.884956 2.76MiB 1.26MiB 1.37363 -2.91487 0.003672
datadog_agent_remap_blackhole 217.12KiB 5.99 3.54MiB 324.06KiB 0.906344 3.75MiB 337.0KiB 0 -8.75821 1.46914e-17
splunk_hec_route_s3 109.71KiB 1.96 5.46MiB 2.05MiB 0 5.57MiB 2.08MiB 0.274725 -0.700013 0.484143
syslog_log2metric_humio_metrics 7.35KiB 0.14 4.97MiB 123.35KiB 0 4.98MiB 188.66KiB 0 -0.621944 0.534205
http_pipelines_blackhole -265.16B -27.71 956.97B 8.71KiB 1.36612 691.81B 12.92KiB 0.273224 0.317815 0.750729
syslog_splunk_hec_logs -136.08KiB -1.92 6.94MiB 249.93KiB 0 6.8MiB 223.74KiB 0 7.60762 9.28562e-14
syslog_humio_logs -348.5KiB -4.77 7.13MiB 73.62KiB 0.276243 6.79MiB 139.68KiB 0 42.0856 3.05304e-174
datadog_agent_remap_datadog_logs -1.36MiB -7.05 19.3MiB 521.21KiB 0 17.94MiB 779.96KiB 0 28.1536 2.51199e-114
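
The columns above can be roughly reproduced from the per-variant summaries. The sketch below is an assumption about the method, since the report does not say which test it uses: it computes the percentage change in mean throughput and a Welch-style t-statistic (baseline minus comparison, which matches the sign convention in the table). The `Summary` struct and function names are hypothetical.

```rust
/// Summary of one variant of an experiment, all throughput values in the same unit.
pub struct Summary {
    pub mean: f64,
    pub stdev: f64,
    pub samples: f64,
}

/// Percentage change of the comparison mean relative to the baseline mean.
pub fn percent_change(baseline: &Summary, comparison: &Summary) -> f64 {
    (comparison.mean - baseline.mean) / baseline.mean * 100.0
}

/// Welch's t-statistic for two samples with unequal variances
/// (assumed here; the soak report does not name the test it runs).
pub fn welch_t(baseline: &Summary, comparison: &Summary) -> f64 {
    let standard_error = (baseline.stdev.powi(2) / baseline.samples
        + comparison.stdev.powi(2) / comparison.samples)
        .sqrt();
    (baseline.mean - comparison.mean) / standard_error
}

/// Mirrors the "discard changes within the threshold" rule from the explanation.
pub fn exceeds_threshold(delta_percent: f64, threshold_percent: f64) -> bool {
    delta_percent.abs() > threshold_percent
}
```

Feeding in the `fluent_remap_aws_firehose` row (baseline 38.56MiB, stdev 837.19KiB, 372 samples; comparison 42.4MiB, stdev 1.24MiB, 367 samples) gives a change of just under 10% and a t-statistic close to -49.6, in line with the reported values given the rounding of the inputs.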
Fine details of each soak run.
(experiment, variant) total samples mean std min average p90 p95 p99 max skewness
('fluent_elasticsearch', 'comparison') 364 53.68MiB 2.44MiB 48.38MiB 53.6MiB 56.81MiB 57.03MiB 57.58MiB 58.26MiB -0.0707949
('fluent_elasticsearch', 'baseline') 361 53.11MiB 2.7MiB 47.88MiB 52.54MiB 56.6MiB 56.91MiB 57.49MiB 58.16MiB 0.0377152
('fluent_remap_aws_firehose', 'comparison') 367 42.4MiB 1.24MiB 39.73MiB 42.26MiB 44.21MiB 44.48MiB 44.9MiB 45.63MiB 0.168281
('fluent_remap_aws_firehose', 'baseline') 372 38.56MiB 837.19KiB 36.28MiB 38.57MiB 39.59MiB 39.84MiB 40.51MiB 41.02MiB -0.0173972
('datadog_agent_remap_datadog_logs', 'baseline') 341 19.3MiB 521.21KiB 18.34MiB 19.26MiB 20.03MiB 20.3MiB 20.49MiB 20.65MiB 0.424361
('datadog_agent_remap_datadog_logs', 'comparison') 368 17.94MiB 779.96KiB 16.51MiB 17.87MiB 18.87MiB 19.01MiB 19.24MiB 19.59MiB 0.00244309
('syslog_humio_logs', 'baseline') 362 7.13MiB 73.62KiB 6.91MiB 7.14MiB 7.22MiB 7.24MiB 7.29MiB 7.32MiB -0.190143
('syslog_splunk_hec_logs', 'baseline') 341 6.94MiB 249.93KiB 6.5MiB 6.84MiB 7.25MiB 7.26MiB 7.3MiB 7.33MiB 0.0853223
('syslog_splunk_hec_logs', 'comparison') 366 6.8MiB 223.74KiB 6.41MiB 6.78MiB 7.1MiB 7.13MiB 7.2MiB 7.24MiB 0.112283
('syslog_humio_logs', 'comparison') 364 6.79MiB 139.68KiB 6.46MiB 6.82MiB 6.96MiB 6.97MiB 7.01MiB 7.02MiB -0.295655
('syslog_log2metric_splunk_hec_metrics', 'comparison') 366 5.79MiB 210.29KiB 5.39MiB 5.82MiB 6.04MiB 6.08MiB 6.11MiB 6.18MiB -0.0151869
('splunk_hec_route_s3', 'comparison') 364 5.57MiB 2.08MiB 335.3KiB 5.57MiB 8.16MiB 8.61MiB 10.44MiB 11.62MiB -0.0763866
('splunk_hec_route_s3', 'baseline') 365 5.46MiB 2.05MiB 976.06KiB 5.39MiB 8.22MiB 8.95MiB 10.01MiB 11.11MiB 0.202169
('syslog_log2metric_splunk_hec_metrics', 'baseline') 365 5.15MiB 87.15KiB 4.9MiB 5.16MiB 5.27MiB 5.29MiB 5.32MiB 5.38MiB -0.158302
('syslog_log2metric_humio_metrics', 'comparison') 364 4.98MiB 188.66KiB 4.63MiB 4.95MiB 5.24MiB 5.26MiB 5.31MiB 5.35MiB 0.103798
('syslog_log2metric_humio_metrics', 'baseline') 364 4.97MiB 123.35KiB 4.73MiB 4.97MiB 5.14MiB 5.17MiB 5.2MiB 5.26MiB 0.0949554
('syslog_loki', 'comparison') 363 4.57MiB 749.72KiB 3.39MiB 4.57MiB 5.52MiB 6.2MiB 6.64MiB 6.79MiB 0.884897
('syslog_regex_logs2metric_ddmetrics', 'comparison') 363 4.41MiB 280.53KiB 3.84MiB 4.38MiB 4.8MiB 4.97MiB 5.08MiB 5.22MiB 0.55653
('syslog_loki', 'baseline') 363 4.06MiB 279.87KiB 3.26MiB 4.05MiB 4.41MiB 4.51MiB 4.7MiB 4.84MiB 0.119418
('syslog_regex_logs2metric_ddmetrics', 'baseline') 342 3.79MiB 264.6KiB 3.37MiB 3.69MiB 4.17MiB 4.22MiB 4.27MiB 4.29MiB 0.446473
('datadog_agent_remap_blackhole', 'comparison') 382 3.75MiB 337.0KiB 2.81MiB 3.75MiB 4.15MiB 4.25MiB 4.51MiB 4.68MiB -0.106987
('datadog_agent_remap_blackhole', 'baseline') 331 3.54MiB 324.06KiB 2.34MiB 3.53MiB 3.97MiB 4.08MiB 4.21MiB 4.42MiB -0.0262966
('splunk_transforms_splunk3', 'comparison') 364 2.76MiB 1.26MiB 333.94KiB 2.62MiB 4.35MiB 5.17MiB 6.15MiB 8.43MiB 0.773842
('splunk_transforms_splunk3', 'baseline') 339 2.49MiB 1.14MiB 91.41KiB 2.38MiB 3.93MiB 4.74MiB 5.41MiB 6.05MiB 0.604085
('http_pipelines_blackhole', 'baseline') 366 956.97B 8.71KiB 0B 0B 0B 0B 42.41KiB 124.71KiB 10.8368
('http_pipelines_blackhole', 'comparison') 366 691.81B 12.92KiB 0B 0B 0B 0B 0B 247.27KiB 19.1311

@StephenWakely
Contributor Author

The soak test results for this PR show only a disappointing improvement in performance from reference counting.

Given the additional code complexity and the potential for runtime panics now that borrow checking has moved to runtime, this PR does not seem valuable enough to be worth pursuing further.

Labels

domain: core Anything related to core crates i.e. vector-core, core-common, etc domain: transforms Anything related to Vector's transform components domain: vrl Anything related to the Vector Remap Language

2 participants