Skip to content

Implementing Histogram Aggregation#111

Merged
eladiw merged 27 commits intodev-vnextfrom
feat/histo
Jan 19, 2022
Merged

Implementing Histogram Aggregation#111
eladiw merged 27 commits intodev-vnextfrom
feat/histo

Conversation

@nebrass
Copy link
Copy Markdown
Contributor

@nebrass nebrass commented Jan 6, 2022

The following changes are proposed:

  • Implementing Histogram Aggregation

Screenshot:
image

@nebrass nebrass requested a review from a team as a code owner January 6, 2022 09:58
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
@nebrass nebrass force-pushed the feat/histo branch 2 times, most recently from fead311 to c4ab0de Compare January 7, 2022 08:44
Comment thread K2Bridge.Tests.End2End/ParallelApiTest.cs Outdated
Comment thread K2Bridge/Factories/BucketFactory.cs Outdated
Comment thread K2Bridge/Factories/BucketFactory.cs
Comment thread K2Bridge/JsonConverters/BucketsCollectionConverter.cs Outdated
Comment thread K2Bridge/JsonConverters/HistogramBucketConverter.cs Outdated
Comment thread K2Bridge/Models/Request/Aggregations/Bucket/Histogram/HistogramAggregation.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge.Tests.UnitTests/Visitors/HistogramAggregationVisitorTests.cs Outdated
Comment thread K2Bridge/JsonConverters/BucketsCollectionConverter.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Copy link
Copy Markdown
Collaborator

@eladiw eladiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments

@nebrass nebrass force-pushed the feat/histo branch 2 times, most recently from 03087f8 to 054c4ab Compare January 10, 2022 16:22
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
@AsafMah
Copy link
Copy Markdown
Contributor

AsafMah commented Jan 11, 2022

image

Your PR still doesn't support "Extended Bounds", that can be enabled by the UI (you have the property, you do nothing with it).
What it does is send empty buckets for the entire range of the extended bounds.

If we consider that a non-important feature we can skip it, just have it documented

@AsafMah
Copy link
Copy Markdown
Contributor

AsafMah commented Jan 11, 2022

You have an off by one on hard bounds -
try "dayOfWeek" with hard bounds of 3 to 4
in kibana it gives you 3 and 4
in kusto it gives you 2,3,4,5

Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Copy link
Copy Markdown
Contributor

@AsafMah AsafMah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
Comment thread K2Bridge/Visitors/HistogramVisitor.cs Outdated
@nebrass nebrass force-pushed the feat/histo branch 3 times, most recently from f85f3b2 to 4cb81aa Compare January 13, 2022 14:40
@nebrass
Copy link
Copy Markdown
Contributor Author

nebrass commented Jan 13, 2022

Hello @AsafMah & @eladiw thank you for your comments and suggestions !
I made some changes to fix the hard_bounds issue
for the extended_bounds I will mention in the wiki that it was not implemented for lack of knowledge about the feature itself..

@nebrass nebrass requested a review from eladiw January 13, 2022 14:55
nebrass and others added 10 commits January 18, 2022 12:08
Co-authored-by: Tamir Kamara <26870601+tamirkamara@users.noreply.github.com>
Co-authored-by: Tamir Kamara <26870601+tamirkamara@users.noreply.github.com>
Co-authored-by: Tamir Kamara <26870601+tamirkamara@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@tamirkamara tamirkamara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query and visitor look good now!
I'll let the others review the rest of the code.

@AsafMah
Copy link
Copy Markdown
Contributor

AsafMah commented Jan 18, 2022

@tamirkamara are you sure?

There is something wrong with the logic of the bins, no matter what options I choose it puts 99.9% of the data in the "0" bin, the rest in the next bin, and all the other bins are emtpy.

Even when entering a manual interval of 500 I got the following query:

let _data = database("kibana").StormEvents ;
let _extdata = _data
| extend ['2%False'] = bin(todouble(['StormSummary'].['TotalDamages']), 7000000);
let _summarizablemetrics = _extdata
| summarize count() by ['2%False']
| order by ['2%False'] asc;
(_summarizablemetrics
| as aggs);
(_data | count | as hitsTotal);
(_data | limit 0 | as hits)

@tamirkamara
Copy link
Copy Markdown
Contributor

I didn't test it myself @AsafMah, only looked at the visitor logic. With the newly added base classes it's not easy to see the exact query that gets generated...
Are you saying it's always seeing 7000000 as the bin size?

@AsafMah
Copy link
Copy Markdown
Contributor

AsafMah commented Jan 18, 2022

I didn't test it myself @AsafMah, only looked at the visitor logic. With the newly added base classes it's not easy to see the exact query that gets generated... Are you saying it's always seeing 7000000 as the bin size?

No, it always sets the bin size so it's basically all in the first bin.
If I set the max bars to 5 I get this:
image
And if I set max bars to 3 I get this:
image

@nebrass have you tested it?

@nebrass
Copy link
Copy Markdown
Contributor Author

nebrass commented Jan 18, 2022

@AsafMah I just made a test and the result is not as expected as you mentioned below:
Capture d’écran 2022-01-18 à 16 56 46

The generated query:

let _data = database("dev").StormEvents | where (['StartTime'] >= todatetime("2007-01-18T15:52:40.4150000Z") and ['StartTime'] <= todatetime("2022-01-18T15:52:40.4150000Z"));
let _extdata = _data
| extend ['2%False'] = bin(['InjuriesDirect'], 500);
let _summarizablemetrics = _extdata
| summarize count() by ['2%False']
| order by ['2%False'] asc;
(_summarizablemetrics | as aggs);
(_data | count | as hitsTotal);
(_data | limit 0 | as hits)

and the resulted data is the same:
Capture d’écran 2022-01-18 à 16 58 24

Comment thread docs/development-container.md Outdated
Copy link
Copy Markdown
Collaborator

@eladiw eladiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job.
lgtm

@eladiw eladiw dismissed AsafMah’s stale review January 18, 2022 16:04

All comments were resolved and discussed

@AsafMah
Copy link
Copy Markdown
Contributor

AsafMah commented Jan 18, 2022

@AsafMah I just made a test and the result is not as expected as you mentioned below: Capture d’écran 2022-01-18 à 16 56 46

The generated query:

let _data = database("dev").StormEvents | where (['StartTime'] >= todatetime("2007-01-18T15:52:40.4150000Z") and ['StartTime'] <= todatetime("2022-01-18T15:52:40.4150000Z"));
let _extdata = _data
| extend ['2%False'] = bin(['InjuriesDirect'], 500);
let _summarizablemetrics = _extdata
| summarize count() by ['2%False']
| order by ['2%False'] asc;
(_summarizablemetrics | as aggs);
(_data | count | as hitsTotal);
(_data | limit 0 | as hits)

and the resulted data is the same: Capture d’écran 2022-01-18 à 16 58 24

Try it with a field with bigger values (DamageCrops)

Copy link
Copy Markdown
Collaborator

@eladiw eladiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing the approval until all comments are resolved to avoid accidental merge

Copy link
Copy Markdown
Contributor

@AsafMah AsafMah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good!
Some minor comments here

Also fix the open comments & merge conflict files and we can merge

Comment thread K2Bridge/Models/Request/Aggregations/Bucket/Histogram/Bounds.cs
Comment thread K2Bridge/Visitors/Aggregations/HistogramAggregationVisitor.cs Outdated
Copy link
Copy Markdown
Contributor

@AsafMah AsafMah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Copy Markdown
Collaborator

@eladiw eladiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@eladiw eladiw merged commit dae0301 into dev-vnext Jan 19, 2022
@eladiw eladiw deleted the feat/histo branch January 19, 2022 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants