Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Reporting] Fix slow CSV with large max size bytes #120365

Merged

Conversation

jloleysens
Copy link
Contributor

@jloleysens jloleysens commented Dec 3, 2021

Summary

Follow up PR to improve CSV generation. Using the same reporting config as #120309.

Currently, there is a lot of CPU time spent copying Buffers in the concat method that is called on stream _write.

Screenshot 2021-12-03 at 16 56 23

This contribution attempts to update the behaviour to only concat the necessary buffers when we are about to flush. The CPU profile for a CSV export is the following after these changes are applied

Screenshot 2021-12-03 at 16 57 42

How to test

  • Start Kibana + ES on basic and start trial license (start Kibana with node --inspect scripts/kibana --dev if you want to profile)
  • run node scripts/makelogs.js -c 1m --url http://elastic:changeme@localhost:9200 to get larger volumes of data into the cluster
  • Create a new Data View for the logstash-*
  • View the data view in Discover and generate a CSV
  • (Optional) in Chrome, open the node debugger and capture a CPU profile
  • The CSV should complete, but hits a max length limit

Additional notes

Release note

We fixed a performance issue that occurred when generating a large CSV export.

Checklist

Delete any items that are not applicable to this PR.

Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to identify risks that should be tested prior to the change/feature release.

When forming the risk matrix, consider some of the following examples and how they may potentially impact the change:

Risk Probability Severity Mitigation/Notes
We could introduce new performance issues from unforeseen scenarios Med Med The code has been hardened by adding tests and receiving multiple rounds of review. Each changes was also performance tested with 900k rows.

@jloleysens jloleysens changed the title Reporting/fix slow csv large max size bytes [Reporting] Fix slow CSV with large max size bytes Dec 3, 2021
@tsullivan tsullivan self-requested a review December 3, 2021 20:21
@tsullivan
Copy link
Member

tsullivan commented Dec 3, 2021

Could be related to "[Reporting] Streaming CSV download appears to not be working" #119540 (@tsullivan WDYT?)

I think this is a standalone fix.

@tsullivan tsullivan closed this Dec 3, 2021
@tsullivan tsullivan reopened this Dec 3, 2021
@jloleysens
Copy link
Contributor Author

@elasticmachine merge upstream

@jloleysens jloleysens marked this pull request as ready for review December 6, 2021 09:55
@jloleysens jloleysens requested review from a team as code owners December 6, 2021 09:55
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app-services (Team:AppServicesUx)

@jloleysens
Copy link
Contributor Author

thanks @dokmic for the feedback. I think I addressed your points. do you think you could take another look?

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Member

@tsullivan tsullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jloleysens jloleysens merged commit c4edb8c into elastic:main Dec 7, 2021
@jloleysens jloleysens deleted the reporting/fix-slow-csv-large-max-size-bytes branch December 7, 2021 09:15
jloleysens added a commit to jloleysens/kibana that referenced this pull request Dec 7, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
jloleysens added a commit to jloleysens/kibana that referenced this pull request Dec 7, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
jloleysens added a commit to jloleysens/kibana that referenced this pull request Dec 7, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	x-pack/plugins/reporting/server/lib/content_stream.ts
jloleysens added a commit to jloleysens/kibana that referenced this pull request Dec 7, 2021
…-chromium-before-compiling-pdf

* 'main' of github.com:elastic/kibana: (121 commits)
  FTR should use the new kibana_system user (elastic#120436)
  [Lens] Temporarily exclude Mosaic/Waffle from the suggestions list (elastic#120488)
  [Reporting] Fix slow CSV with large max size bytes (elastic#120365)
  [CTI] Threat Intel Card on Overview page needs to accommodate Fleet TI integrations -  (elastic#120459)
  [Dashboard] Added KibanaThemeProvider.  (elastic#120122)
  add more rule_registry unit tests (elastic#120323)
  [Lens] update xpack.lens.pie.smallValuesWarningMessage text (elastic#120478)
  [Fleet] Simplified package policy create API, enriching default values (elastic#119739)
  mock `elastic-apm-node` in `@kbn/test` jest preset (elastic#120324)
  [RAC] Rename occurrences of alert_type to rule_type in Infra (elastic#120455)
  [Security Solutions] Removes tech debt of exporting all from linter rule for timeline plugin (elastic#120437)
  [Workplace Search] Add API Keys view to replace Access tokens (elastic#120147)
  [Security Solutions] Removes tech debt of exporting all from linter rule for cases plugin in the server section (elastic#120411)
  Add support for threat.feed.name (elastic#120250)
  [Rule Registry] Rewrite APM registry rules for Observability (elastic#117740)
  [docs] Fix artifact layout formatting (elastic#119386)
  [Workplace Search] Add a technical preview of connecting GitHub via GitHub apps (elastic#119764)
  add osquery notes for 7.16 (elastic#120407)
  chore(NA): splits types from code on @kbn/docs-utils (elastic#120533)
  [Reporting] Decouple screenshotting plugin from the reporting (elastic#120110)
  ...

# Conflicts:
#	x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.test.ts
#	x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.ts
#	x-pack/plugins/reporting/server/lib/screenshots/observable.test.ts
#	x-pack/plugins/reporting/server/lib/screenshots/observable.ts
#	x-pack/plugins/reporting/server/test_helpers/create_mock_browserdriverfactory.ts
jloleysens added a commit that referenced this pull request Dec 7, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
jloleysens added a commit that referenced this pull request Dec 7, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
jloleysens added a commit that referenced this pull request Dec 7, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	x-pack/plugins/reporting/server/lib/content_stream.ts
Heenawter pushed a commit to Heenawter/kibana that referenced this pull request Dec 8, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
TinLe pushed a commit to TinLe/kibana that referenced this pull request Dec 22, 2021
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
gbamparop pushed a commit to gbamparop/kibana that referenced this pull request Jan 12, 2022
* use Buffer.alloc + .set API instead of .concat

* refactor variable names and actually assign to this.buffer

* ok, looks like an array of buffers could work

* added large comment and refactored some variable name

* fix comment

* refactored logic to deal with an edge case where partial buffers should be added, also throw if bad config is detected

* added new test for detecting when the write stream throws for bad config

* updated logic to not ever call .slice(0), updated the guard for the config error, updated a comment

* refactor totalBytesConsumed -> bytesToFlush

* use the while loop mike wrote

* remove unused variable

* update comment

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants