[ML] Replace APM error rate table with failed transactions correlations #108441

Merged
merged 26 commits into from Aug 17, 2021

@qn895 qn895 commented Aug 12, 2021

Summary

This PR replaces the current APM error rate table with ML's new failure correlation algorithm

Screen.Recording.2021-08-17.at.10.39.22.mov

Screen Shot 2021-08-17 at 10 45 09

Changes include:

  • Renaming the tab from Failing transactions to Failed transactions (note that the flyout is to be removed in [ML] Move APM Latency Correlations from flyout to transactions page. #107266)
  • Adding normalized score logic (which is used for the impact bar)
  • Removing the score and p value from the table and replacing them with Low, Medium, and High severity thresholds
  • Showing the p values on row hover (which mimics the behavior of the latency table)
  • Some refactoring of the Correlation table, which previously depended on the types of the APM API's significant terms; those will soon be deprecated
  • Help text

Follow ups:

  • Deleting the flyout and moving the failed transactions table to be inline in Transactions details
  • Functional & API integration tests

@qn895 qn895 added :ml v8.0.0 apm:ml Integration between APM and ML v7.15.0 labels Aug 12, 2021
@qn895 qn895 self-assigned this Aug 12, 2021
@qn895 qn895 force-pushed the ml-apm-failure-correlation-in-flyout branch from bb8b23b to 43d7a85 Compare August 13, 2021 03:39
@qn895 qn895 requested a review from sorenlouv August 13, 2021 17:54
Contributor

@walterra walterra left a comment

Had a first go at the code and added some comments. Some of the naming in files and variables still uses the now outdated "failure" terms. The up-to-date name of the feature is Failed transactions correlations. Please have a look at its usage across the PR.

Comment on lines 15 to 17
import type { SelectedSignificantTerm } from '../../../../common/search_strategies/correlations/types';

export type { SelectedSignificantTerm };
Contributor

Not sure this is necessary? Can the other places that use this type also import from common?

Member Author

Fixed here f4ecc66

@@ -170,6 +170,7 @@ export function ErrorCorrelations({ onClose }: Props) {
'xpack.apm.correlations.error.percentageColumnName',
{ defaultMessage: '% of failed transactions' }
)}
// @ts-expect-error: this file to be removed
Contributor

Maybe we can remove this file as part of this PR already?

Member Author

Deleted here e70878c

@@ -0,0 +1,410 @@
/*
Contributor

I suggest naming the file and components without the ml prefix (failure_correlations.tsx). We kept the prefix for 7.14 because the old code was still around. With the previous components planned to go in 7.15, we will no longer need the prefixes.

Member Author

Updated here 1554436

values: FailureCorrelationValue[];
}

export function MlFailureCorrelations({ onClose }: Props) {
Contributor

I suggest naming this FailureCorrelations instead of MlFailureCorrelations.

Member Author

Updated here 1554436

Comment on lines 72 to 84
{
...{
...{
environment,
kuery,
serviceName,
transactionName,
transactionType,
start,
end,
},
},
},
Contributor

Looks like this is unnecessarily deeply nested. I think it could just be the innermost object.

Member Author

Updated here 1554436
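
To illustrate the point of this thread: spreading an object literal into another object literal only produces a shallow copy, so the nested `...{ ...{ } }` wrappers add nothing. The sketch below uses made-up placeholder values for the parameters (the real ones come from the component's props and URL state, which are not shown here).

```typescript
// Placeholder values, assumed purely for illustration.
const environment = 'production';
const kuery = '';
const serviceName = 'example-service';
const transactionName = 'GET /api';
const transactionType = 'request';
const start = '2021-08-16T00:00:00.000Z';
const end = '2021-08-17T00:00:00.000Z';

// The shape flagged in the review: two redundant spread wrappers.
const nested = {
  ...{
    ...{
      environment,
      kuery,
      serviceName,
      transactionName,
      transactionType,
      start,
      end,
    },
  },
};

// The innermost object on its own carries exactly the same keys and values.
const flat = {
  environment,
  kuery,
  serviceName,
  transactionName,
  transactionType,
  start,
  end,
};

const sameShape = JSON.stringify(nested) === JSON.stringify(flat);
console.log(sameShape); // true
```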

import { FailureCorrelationImpactThreshold } from '../../../../../common/search_strategies/failure_correlations/types';
import { FAILURE_CORRELATION_IMPACT_THRESHOLD } from '../../../../../common/search_strategies/failure_correlations/constants';

export function getFailureCorrelationImpactLabel(
Contributor

Quick win? Would be great to cover this with jest tests as part of this PR.

Member Author

Added unit tests for this here c6940c1
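
For readers without the diff at hand, here is a rough sketch of what a p-value-to-label helper of this kind could look like. The function name `getImpactLabel` and the cutoffs `1e-6` and `0.001` are assumptions made for illustration; only the exclusive upper bound of 0.02 is hinted at by the commit list in this PR. The real values live in FAILURE_CORRELATION_IMPACT_THRESHOLD, which is not shown in this thread.

```typescript
type ImpactLabel = 'Low' | 'Medium' | 'High';

// Hypothetical cutoffs, NOT taken from the PR (except the 0.02 upper bound,
// which the commit list suggests is exclusive).
const HIGH_BELOW = 1e-6;
const MEDIUM_BELOW = 0.001;
const LOW_BELOW = 0.02;

// Maps a p-value to a severity label. Returns null when the value is missing,
// not a number, negative, or not below the upper bound.
function getImpactLabel(pValue: number | null | undefined): ImpactLabel | null {
  if (pValue === null || pValue === undefined || Number.isNaN(pValue)) {
    return null;
  }
  if (pValue < 0) return null;
  if (pValue < HIGH_BELOW) return 'High';
  if (pValue < MEDIUM_BELOW) return 'Medium';
  if (pValue < LOW_BELOW) return 'Low';
  return null;
}
```

A jest test for such a helper would mostly pin down the boundaries, for example that a p-value of exactly 0.02 yields no label while 0.019 yields Low.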

@@ -11,7 +11,7 @@ import { euiStyled } from '../../../../../../../src/plugins/kibana_react/common'
import { Maybe } from '../../../../typings/common';

interface Props {
- items: Array<Maybe<React.ReactElement>>;
+ items: Array<Maybe<React.ReactElement | string>>;
Contributor

I'm not sure we should change this component as part of this PR, since it is used in other places too. Can you try to adapt the props we pass in instead? E.g. change 'text' to <>{text}</>.

Member Author

Updated here 1554436

try {
const results = await Promise.allSettled(
batches[i].map((fieldName) =>
fetchFailureCorrelationPValues(esClient, params!, fieldName)
Contributor

I suggest trying to get rid of the non-null assertion (!).

Member Author

Updated here 1554436
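
One common way to drop a non-null assertion like `params!` is to narrow the value once, before the batch is mapped, and pass the narrowed value into the callback. The sketch below is an assumption about how that could look, not the code from commit 1554436; `FetchParams`, `describeFetch`, and `planBatch` are hypothetical stand-ins for the real search service types and for fetchFailureCorrelationPValues.

```typescript
// Hypothetical, simplified parameter type.
interface FetchParams {
  index: string;
}

// Hypothetical stand-in for the real fetch call; it only builds a description
// so the example stays synchronous and self-contained.
function describeFetch(params: FetchParams, fieldName: string): string {
  return `${params.index}:${fieldName}`;
}

function planBatch(params: FetchParams | undefined, batch: string[]): string[] {
  // Guard once, up front, instead of asserting `params!` at each call site.
  if (params === undefined) {
    return [];
  }
  // Binding to a const keeps the narrowed type inside the arrow function.
  const definedParams: FetchParams = params;
  return batch.map((fieldName) => describeFetch(definedParams, fieldName));
}

console.log(planBatch({ index: 'my-index' }, ['host.name', 'service.version']));
console.log(planBatch(undefined, ['host.name'])); // []
```

The same guard works unchanged in front of a `Promise.allSettled(...)` call; whether to return early or throw when the parameters are missing depends on what the caller expects.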

score: number;
}>).buckets.map((b) => {
const score = b.score;
const normalizedScore =
Contributor

It would be great to have a comment here that describes what this is trying to achieve, based on TomV's description of the approach.

Member Author

Updated here 1554436

)
);

// Register APM error correlations strategy
Contributor

Suggested change
- // Register APM error correlations strategy
+ // Register APM failed transactions correlations strategy

Member Author

Updated here 1554436

@qn895 qn895 marked this pull request as ready for review August 16, 2021 13:43
@qn895 qn895 requested a review from a team as a code owner August 16, 2021 13:43
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@botelastic botelastic bot added the Team:APM All issues that need APM UI Team support label Aug 16, 2021
@elasticmachine
Contributor

Pinging @elastic/apm-ui (Team:apm)

services: { data },
} = useKibana<ApmPluginStartDeps>();

const [error, setError] = useState<Error>();
Contributor

For the similar code for latency correlations we decided to reduce the number of required state handlers and maintain a single state object, you can see how it was done here: 1284687

Member Author

@qn895 qn895 Aug 17, 2021

@walterra Thanks for the heads up. I've also refactored this to use the state handler, so it is more in line with the Latency correlations changes here
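
As a rough illustration of the "single state object" idea discussed in this thread: instead of one `useState` per value (error, running flag, progress, and so on), all fetch-related values live in one object and updates are merged in as partials. The field names and the `applyUpdate` helper below are assumptions for the sketch and are not taken from the referenced commit.

```typescript
// Hypothetical shape of the combined fetch state.
interface CorrelationsFetchState {
  error?: string;
  isRunning: boolean;
  progress: number; // fraction between 0 and 1
}

const initialFetchState: CorrelationsFetchState = {
  isRunning: false,
  progress: 0,
};

// Merges a partial update into the previous state without mutating it.
function applyUpdate(
  prev: CorrelationsFetchState,
  update: Partial<CorrelationsFetchState>
): CorrelationsFetchState {
  return { ...prev, ...update };
}

let fetchState = initialFetchState;
fetchState = applyUpdate(fetchState, { isRunning: true });
fetchState = applyUpdate(fetchState, { progress: 0.5 });
fetchState = applyUpdate(fetchState, { isRunning: false, progress: 1 });
console.log(fetchState);
```

In a React component the same merge would typically sit inside a functional state update, so that each partial is applied to the latest state rather than to a captured copy.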

Contributor

@smith smith left a comment

Looks good overall. I noticed a couple of changes that are needed.

},
},
];
const tableColumns: Array<EuiBasicTableColumn<T>> = columns ?? [];
Contributor

If columns is already a required prop with this type, do we need this assignment or the ?? [] at all?

Member Author

Updated here bcec466

button={
<HelpPopoverButton
onClick={() => {
setIsPopoverOpen(!isPopoverOpen);
Contributor

You'll want to use the function form of this to ensure you have the correct state:

Suggested change
- setIsPopoverOpen(!isPopoverOpen);
+ setIsPopoverOpen((prevIsPopoverOpen) => !prevIsPopoverOpen);

Member Author

Updated here bcec466
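
The reason for the function form can be shown without React at all. The sketch below uses a deliberately simplified, hypothetical state holder (`createState`) that queues updates and applies them later; it is an illustration of the stale-value problem, not React's implementation. Two toggles computed from the same captured value collapse into one, while two functional updates each see the latest value.

```typescript
type Updater<S> = S | ((prev: S) => S);

// Simplified stand-in for a batched state setter. Assumption for illustration only.
function createState<S>(initial: S) {
  let committed = initial;
  const queue: Array<Updater<S>> = [];
  return {
    get: (): S => committed,
    set: (update: Updater<S>): void => {
      queue.push(update);
    },
    // Applies all queued updates in order, like a batched re-render would.
    flush: (): void => {
      for (const update of queue.splice(0)) {
        committed =
          typeof update === 'function'
            ? (update as (prev: S) => S)(committed)
            : update;
      }
    },
  };
}

// Value form: both calls use the value captured before either was applied.
const direct = createState(false);
const captured = direct.get();
direct.set(!captured);
direct.set(!captured);
direct.flush();
console.log(direct.get()); // true (only one toggle took effect)

// Function form: each update receives the result of the previous one.
const functional = createState(false);
functional.set((prev) => !prev);
functional.set((prev) => !prev);
functional.flush();
console.log(functional.get()); // false (toggled twice)
```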

async function fetchErrorCorrelations() {
let params: SearchServiceFetchParams = {
...searchServiceParams,
index: 'apm-*',
Contributor

We shouldn't be hard-coding this index. It looks like we're not below, so I think we should be doing the same here.

Member Author

This was originally just a default fallback, but you're right, it's better not to hard-code it at all. Updated here bcec466

@smith
Contributor

smith commented Aug 17, 2021

I'm not seeing any data on this tab when I run this locally against edge-oblt. Is there anywhere on here I should be able to see data? I wasn't expecting it to be empty everywhere, but that's what it looks like.

@qn895 qn895 requested a review from smith August 17, 2021 15:40
Contributor

@alvarezmelissa87 alvarezmelissa87 left a comment

LGTM ⚡

@qn895
Member Author

qn895 commented Aug 17, 2021

Did some rearrangement here and moved the Beta badge from the tab title into the tab content so it's more in line with the Latency correlations tab.

Screen.Recording.2021-08-17.at.13.17.30.mov

Also changed this to show p values to 3 significant figures instead of labelling them < 0.001, because the exact value can be useful information.
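
A small sketch of the difference between the two display styles, assuming the formatting is done with the standard `Number.prototype.toPrecision`. The helper names are hypothetical and the actual formatting code in the PR may differ.

```typescript
// Earlier style (as described above): collapse very small values into one label.
function formatPValueCapped(pValue: number): string {
  return pValue < 0.001 ? '< 0.001' : pValue.toPrecision(3);
}

// Newer style: always show 3 significant figures. toPrecision switches to
// exponential notation for very small numbers.
function formatPValue(pValue: number): string {
  return pValue.toPrecision(3);
}

console.log(formatPValueCapped(0.000123456)); // "< 0.001"
console.log(formatPValue(0.000123456)); // "0.000123"
console.log(formatPValue(1.234e-7)); // "1.23e-7"
```

With the capped style, every strongly significant term shows the same label; with significant figures, rows can still be compared against each other on hover.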

Contributor

@walterra walterra left a comment

Tested, and the latest changes LGTM.

@qn895 qn895 added the auto-backport Deprecated: Automatically backport this PR after it's merged label Aug 17, 2021
@qn895 qn895 enabled auto-merge (squash) August 17, 2021 20:23
@qn895 qn895 disabled auto-merge August 17, 2021 20:35
@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| apm | 1608 | 1611 | +3 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| apm | 4.4MB | 4.4MB | +856.0B |

cc @qn895

@qn895 qn895 merged commit 09e8cfd into elastic:master Aug 17, 2021
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Aug 17, 2021
…ns (elastic#108441)

* [ML] Refactor with new table

* [ML] Fix types, rename var

* [ML] Remove duplicate action columns

* [ML] Finish renaming for consistency

* [ML] Add failure correlations help popover

* [ML] Add failure correlations help popover

* [ML] Extend correlation help

* Update message

* [ML] Delete old legacy correlations pages

* [ML] Address comments, rename

* [ML] Revert deletion of latency_correlations.tsx

* [ML] Add unit test for getFailedTransactionsCorrelationImpactLabel

* [ML] Rename & fix types

* [ML] Fix logic to note include 0.02 threshold

* [ML] Refactor to use state handler

* [ML] Fix hardcoded index, columns, popover

* [ML] Replace failed transaction tab

* [ML] Fix unused translations

* [ML] Delete empty files

* [ML] Move beta badge to be inside tab content

Co-authored-by: lcawl <lcawley@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
@kibanamachine
Contributor

💚 Backport successful

Status Branch Result
7.x

This backport PR will be merged automatically after passing CI.

kibanamachine added a commit that referenced this pull request Aug 17, 2021
…ns (#108441) (#109000)

Co-authored-by: Quynh Nguyen <43350163+qn895@users.noreply.github.com>
Co-authored-by: lcawl <lcawley@elastic.co>
@qn895 qn895 deleted the ml-apm-failure-correlation-in-flyout branch December 21, 2021 18:20
Labels
apm:correlations apm:ml Integration between APM and ML auto-backport Deprecated: Automatically backport this PR after it's merged :ml release_note:enhancement Team:APM All issues that need APM UI Team support v7.15.0 v8.0.0