[ML] Fix counters and percentages for array fields on the Data visualizer page #55209

Merged
merged 4 commits into from
Jan 22, 2020

Contributor

@darnautov darnautov commented Jan 17, 2020

Summary

Fixes #54734.

The data_visualizer endpoints have been using value_count and stats aggregations to get the count of a field in the dataset. For array fields, these aggregations count every value in the array, so we ended up with a count that was higher than the actual number of docs in the dataset.
I've added a filter to the aggregation to retrieve the number of docs that contain the field for both get_field_stats and get_overall_stats.
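
To illustrate the over-count described above, here is a minimal in-memory sketch. It is not the Kibana code: the sample documents, the field name tags, and the helpers countValues and countDocsWithField are assumptions made for illustration. The first helper mimics what a value_count aggregation reports, the second mimics the doc_count of a filter that only matches docs containing the field.

```typescript
// Hypothetical document shape, for illustration only.
type Doc = Record<string, unknown>;

// Mimics value_count: every value is counted, so an array with
// three entries contributes three to the total.
function countValues(docs: Doc[], field: string): number {
  return docs.reduce((sum, doc) => {
    const value = doc[field];
    if (value === undefined || value === null) return sum;
    return sum + (Array.isArray(value) ? value.length : 1);
  }, 0);
}

// Mimics the doc_count of a filter on field existence: each doc is
// counted at most once, no matter how many values it holds.
function countDocsWithField(docs: Doc[], field: string): number {
  return docs.filter((doc) => {
    const value = doc[field];
    if (value === undefined || value === null) return false;
    return !Array.isArray(value) || value.length > 0;
  }).length;
}

// Three sample docs: one array field, one single value, one without the field.
const sampleDocs: Doc[] = [
  { tags: ["a", "b", "c"] },
  { tags: "d" },
  { message: "no tags here" },
];

const valueBasedCount = countValues(sampleDocs, "tags"); // 4 values
const docBasedCount = countDocsWithField(sampleDocs, "tags"); // 2 docs

// Percentage of docs containing the field, computed both ways.
const valueBasedPercent = (valueBasedCount / sampleDocs.length) * 100;
const docBasedPercent = (docBasedCount / sampleDocs.length) * 100;

console.log({ valueBasedCount, docBasedCount, valueBasedPercent, docBasedPercent });
```

With these sample docs the value-based count (4) exceeds the total number of docs (3), which is how a percentage above 100% can appear. The doc-based count (2) cannot exceed the total.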

@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@darnautov
Contributor Author

@elasticmachine merge upstream

@walterra
Contributor

I have a question about the aggregation changes. The previous code added aggregation configurations directly, like aggs[...] = { value_count: { field } };. It looks to me like the new code wraps that in an outer level of e.g. filter and aggs, but then assigns it to the same aggs[...]. I wonder why this doesn't require other changes to how the final request object is constructed/assigned; at least it's not obvious from the PR diff. Can you explain why this still works? :)
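
For readers following the question, the two shapes being compared might look like the sketch below. This is not taken from the PR diff: the helper names, the tags_count key, the inner aggregation name, and the use of an exists query inside filter are all assumptions for illustration.

```typescript
// All names here are hypothetical and only illustrate the shapes discussed.
type AggConfig = Record<string, unknown>;

// Earlier shape: a metric aggregation assigned directly under its name.
function buildValueCountAgg(field: string): AggConfig {
  return { value_count: { field } };
}

// Wrapped shape: a filter (bucket) aggregation with the metric nested
// inside it. It is still a single aggregation config, so it can be
// assigned under the same name in the request's aggs object.
function buildFilteredValueCountAgg(field: string): AggConfig {
  return {
    filter: { exists: { field } }, // assumption: match docs that contain the field
    aggs: {
      value_count: { value_count: { field } },
    },
  };
}

// The request's aggs object is assembled the same way in both cases.
const requestAggs: Record<string, AggConfig> = {};
requestAggs["tags_count"] = buildFilteredValueCountAgg("tags");

// Mocked response fragment for the wrapped shape (not real output).
interface FilteredCountResult {
  doc_count: number; // number of docs matched by the filter
  value_count: { value: number }; // nested metric result
}

const mockedResult: FilteredCountResult = {
  doc_count: 2,
  value_count: { value: 4 },
};

// The doc-level count is read from doc_count rather than from the metric.
function readDocCount(result: FilteredCountResult): number {
  return result.doc_count;
}

console.log(JSON.stringify(requestAggs, null, 2), readDocCount(mockedResult));
```

One plausible reading, since the thread does not record the answer: Elasticsearch accepts a filter aggregation with sub-aggregations wherever a named aggregation is allowed, so the request assembly can stay as it was. The part that has to change is the code reading the response, because the metric result now sits one level deeper and the doc-level count is exposed as doc_count.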

Contributor

@peteharverson peteharverson left a comment

Gave this a good test, and all LGTM

Contributor

@walterra walterra left a comment

Discussed my comment with @peteharverson — LGTM!

@kibanamachine
Contributor

💚 Build Succeeded

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@darnautov darnautov merged commit 5e711e4 into elastic:master Jan 22, 2020
@darnautov darnautov deleted the ML-54734-fix-doc-percentage branch January 22, 2020 06:09
darnautov added a commit to darnautov/kibana that referenced this pull request Jan 22, 2020
…izer page (elastic#55209)

* [ML] update data visualizer endpoint to check doc counts

* [ML] fix mock for cardinality tests

* [ML] use actual field name for agg filtering instead of safeFieldName

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
darnautov added a commit that referenced this pull request Jan 22, 2020
…izer page (#55209) (#55519)

* [ML] update data visualizer endpoint to check doc counts

* [ML] fix mock for cardinality tests

* [ML] use actual field name for agg filtering instead of safeFieldName

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
darnautov added a commit that referenced this pull request Jan 22, 2020
…izer page (#55209) (#55518)

* [ML] update data visualizer endpoint to check doc counts

* [ML] fix mock for cardinality tests

* [ML] use actual field name for agg filtering instead of safeFieldName

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jan 22, 2020
* master: (38 commits)
  [ML] Fix counters and percentages for array fields on the Data visualizer page (elastic#55209)
  [SIEM][Detection Engine] Tags being turned into null
  rules part deux (elastic#55507)
  [DOCS] Add tip for using elasticsearch-certutil http command (elastic#55357)
  [SIEM][Detection Engine] Critical blocker, fixes schema accepting values it should not (elastic#55488)
  [SIEM] Detections create prepackage rules (elastic#55403)
  [Reporting] Convert CSV Export libs to Typescript (elastic#55117)
  [Maps] show field type icons in data driven styling field select (elastic#55166)
  Adds event log for actions and alerting (elastic#45081)
  [SIEM][Detection Engine] Fixes critical blocker where signals on signals are not operating
  [SIEM][Detection Engine] Critical blocker, adds need REST prefix for cloud
  remove incorrect config (elastic#55427)
  Retain pinned filters when loading and clearing saved queries (elastic#54307)
  Resolver zoom, pan, and center controls (elastic#55221)
  Skip failing endpoint saga tests
  [skip-ci] Update migration guide to add rendering service example (elastic#54744)
  [DOCS] Updates to heat map page (elastic#55097)
  [Endpoint] Fix saga to start only after store is created and stopped on app unmount (elastic#55245)
  [Logs UI] Use the correct icons and labels in the feature cont… (elastic#55292)
  [Uptime] Handle locations with names but no geo data (elastic#55234)
  ...
Successfully merging this pull request may close these issues.

[ML] Data visualizer document percentage inside array fields can show over 100%
5 participants