Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(vrl parse_xml): Fix panic in parse_xml vrl function #14479

Merged
merged 4 commits into from
Sep 26, 2022

Conversation

Zettroke
Copy link
Contributor

@Zettroke Zettroke commented Sep 20, 2022

In our company we found a potential panic in parse_xml which crashes vector.
XML payload causing panic

<p><?xml?>text123</p>

Payload was found by https://github.com/stasos24

I added test case for this, but I am not sure about it.

According to spec this is incorrect xml

https://www.w3.org/TR/xml/#sec-prolog-dtd
xml declaration should be first element in the file

But roxmltree parses it without an error
Should we depend on this behavior? If not then test should be removed

@bits-bot
Copy link

bits-bot commented Sep 20, 2022

CLA assistant check
All committers have signed the CLA.

@netlify
Copy link

netlify bot commented Sep 20, 2022

Deploy Preview for vector-project canceled.

Name Link
🔨 Latest commit 04346d3
🔍 Latest deploy log https://app.netlify.com/sites/vector-project/deploys/632a0bfcd286060008cabbb8

@Zettroke Zettroke changed the title fix(parse_xml): Fix a panic in parse_xml vrl function fix(parse_xml): Fix panic in parse_xml vrl function Sep 20, 2022
@github-actions github-actions bot added the domain: vrl Anything related to the Vector Remap Language label Sep 20, 2022
@Zettroke Zettroke changed the title fix(parse_xml): Fix panic in parse_xml vrl function fix(vrl parse_xml): Fix panic in parse_xml vrl function Sep 20, 2022
@github-actions
Copy link

Soak Test Results

Baseline: 14fefee
Comparison: 4d89712
Total Vector CPUs: 4

Explanation

A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core.

The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
syslog_splunk_hec_logs 480.67KiB 2.92 100.00% 16.09MiB 1017.99KiB 20.73KiB 0 0.0617718 16.56MiB 606.93KiB 12.38KiB 0 0.0357845 False False
splunk_hec_route_s3 358.68KiB 1.88 100.00% 18.61MiB 2.27MiB 47.32KiB 0 0.121993 18.96MiB 2.2MiB 45.95KiB 0 0.115927 False False
syslog_loki 64.68KiB 0.44 99.62% 14.21MiB 610.06KiB 12.49KiB 0 0.0419124 14.27MiB 911.97KiB 18.54KiB 0 0.0623769 False False
syslog_regex_logs2metric_ddmetrics 52.03KiB 0.41 99.65% 12.3MiB 637.96KiB 13.0KiB 0 0.0506538 12.35MiB 595.32KiB 12.14KiB 0 0.0470732 False False
http_pipelines_blackhole_acks 4.14KiB 0.34 84.77% 1.21MiB 113.23KiB 2.3KiB 0 0.0916104 1.21MiB 85.81KiB 1.75KiB 0 0.0691895 False False
splunk_hec_to_splunk_hec_logs_acks 21.23KiB 0.09 60.23% 23.74MiB 901.33KiB 18.32KiB 0 0.0370746 23.76MiB 843.26KiB 17.16KiB 0 0.0346557 False False
splunk_hec_to_splunk_hec_logs_noack 6.11KiB 0.03 44.24% 23.83MiB 385.93KiB 7.88KiB 0 0.0158119 23.84MiB 334.11KiB 6.82KiB 0 0.0136851 False False
enterprise_http_to_http -1.56KiB -0.01 16.97% 23.85MiB 251.51KiB 5.13KiB 0 0.0102978 23.84MiB 252.28KiB 5.16KiB 0 0.0103301 False False
splunk_hec_indexer_ack_blackhole -6.46KiB -0.03 20.17% 23.76MiB 861.04KiB 17.52KiB 0 0.0353784 23.76MiB 895.58KiB 18.22KiB 0 0.0368072 False False
file_to_blackhole -50.75KiB -0.05 44.76% 95.35MiB 2.96MiB 61.32KiB 0 0.0310147 95.31MiB 2.86MiB 59.46KiB 0 0.0299796 False False
syslog_humio_logs -7.97KiB -0.05 75.38% 16.34MiB 271.89KiB 5.55KiB 0 0.0162503 16.33MiB 198.29KiB 4.06KiB 0 0.0118569 False False
syslog_log2metric_splunk_hec_metrics -14.0KiB -0.08 46.04% 16.82MiB 761.05KiB 15.5KiB 0 0.0441682 16.81MiB 822.59KiB 16.75KiB 0 0.0477784 False False
datadog_agent_remap_blackhole -54.45KiB -0.09 33.79% 58.69MiB 4.5MiB 93.86KiB 0 0.0766968 58.64MiB 3.93MiB 81.94KiB 0 0.0669886 False False
syslog_log2metric_humio_metrics -22.82KiB -0.18 96.01% 12.66MiB 190.81KiB 3.89KiB 0 0.0147146 12.64MiB 510.46KiB 10.4KiB 0 0.0394342 False False
http_to_http_json -50.08KiB -0.21 99.97% 23.85MiB 334.18KiB 6.82KiB 0 0.0136818 23.8MiB 582.15KiB 11.87KiB 0 0.0238827 False False
http_to_http_noack -61.89KiB -0.25 99.31% 23.83MiB 525.21KiB 10.73KiB 0 0.0215196 23.77MiB 993.06KiB 20.23KiB 0 0.0407926 False False
fluent_elasticsearch -300.91KiB -0.37 100.00% 79.47MiB 52.4KiB 1.06KiB 0 0.000643711 79.18MiB 3.57MiB 73.3KiB 0 0.0450304 False False
http_to_http_acks -98.31KiB -0.56 32.60% 17.18MiB 7.99MiB 166.98KiB 0 0.464932 17.08MiB 7.85MiB 163.47KiB 0 0.459255 True True
http_pipelines_blackhole -16.22KiB -0.97 100.00% 1.64MiB 59.48KiB 1.22KiB 0 0.0355136 1.62MiB 126.94KiB 2.59KiB 0 0.0765306 False False
datadog_agent_remap_blackhole_acks -643.7KiB -1.02 100.00% 61.83MiB 4.22MiB 87.97KiB 0 0.0683123 61.2MiB 3.22MiB 67.24KiB 0 0.0525659 False False
http_pipelines_no_grok_blackhole -149.17KiB -1.34 100.00% 10.9MiB 265.0KiB 5.41KiB 0 0.0237371 10.75MiB 1.07MiB 22.27KiB 0 0.099451 False False
datadog_agent_remap_datadog_logs -1.31MiB -2.17 100.00% 60.23MiB 642.51KiB 13.15KiB 0 0.0104149 58.93MiB 3.81MiB 79.4KiB 0 0.064682 False False
datadog_agent_remap_datadog_logs_acks -1.46MiB -2.33 100.00% 62.41MiB 3.16MiB 65.95KiB 0 0.0505495 60.95MiB 4.33MiB 90.06KiB 0 0.0709729 False False
socket_to_socket_blackhole -1001.85KiB -4.11 100.00% 23.8MiB 363.88KiB 7.43KiB 0 0.0149246 22.83MiB 251.46KiB 5.13KiB 0 0.0107559 False False
http_text_to_http_json -2.22MiB -5.55 100.00% 39.99MiB 767.55KiB 15.67KiB 0 0.0187381 37.77MiB 880.75KiB 17.98KiB 0 0.0227646 False False

Copy link
Member

@fuchsnj fuchsnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. Thanks!

@spencergilbert spencergilbert enabled auto-merge (squash) September 26, 2022 13:40
@spencergilbert spencergilbert merged commit 8b8c645 into vectordotdev:master Sep 26, 2022
@stasos24
Copy link

Still vulnerable to:

<R ><?ŀ?<?ŀ?>

Report:

thread 'vector-worker' panicked at 'internal error: entered unreachable code: shouldn't be other XML nodes', lib/vrl/stdlib/src/parse_xml.rs:378:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'vector-worker' panicked at 'internal error: entered unreachable code: join error or bad poll', src/topology/builder.rs:685:30

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: vrl Anything related to the Vector Remap Language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants