-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[APM] Service maps anomaly detection status in popover #65217
[APM] Service maps anomaly detection status in popover #65217
Conversation
get the actual vs typical values necessary to generate a proper anomaly description
The main difference with the existing query was that i filtered by |
@ogupte Great to see this in 🙌 I have some immediate feedback on the link and the value display;
If it's possible while you're in there, could you please remove the framework badge? #64278 - it will make the content of the popover a little easier on the eyes. |
- changes anomaly score display to integer formatting - update link test to 'View anomalies'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ML changes LGTM
x-pack/plugins/apm/public/components/app/ServiceMap/Popover/Contents.tsx
Show resolved
Hide resolved
sort: [ | ||
{ | ||
record_score: { | ||
order: 'desc' as const | ||
} | ||
} | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort: [ | |
{ | |
record_score: { | |
order: 'desc' as const | |
} | |
} | |
], | |
sort: [{ record_score: { order: 'desc' as const } }], |
term: { | ||
result_type: 'record' | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
-
shouldn't the query be limited by a couple of things (environment*, date range etc.)? Was this left out intentionally in the previous PR (cc @smith )?
-
What is
record_score
exactly? Is the the record within a time range? Or the highest score ever recorded? We can only use it if it's the highest score within the time range we are querying for
*
it doesn't seem like ML jobs take environment into account (only service name and transaction type). This seems like a big deal if i'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we can make sure to add a date range filter, but i don't think we can filter by environment currently since it's not retained in the ml index.
- this is a case where
record_score
seems appropriate to use after reading https://www.elastic.co/blog/machine-learning-anomaly-scoring-elasticsearch-how-it-works last paragraph
@ogupte Not sure if this is feasible or not, but I opened a design issue for adding a message when anomaly detection job is not available for a given service, indicating to the user that they can create them in the Service details view. #65244 This is is to mitigate any confusion and because we cannot supply users with a flow to enable it from within the Service maps UI. |
…nomalies detected - removes unecessary service framework name in service map queries - adds date range filter for anomaly detection
// ensure all elements get latest data properties | ||
elements.forEach(elementDefinition => { | ||
const el = cy.getElementById(elementDefinition.data.id as string); | ||
el.data(elementDefinition.data); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was necessary to add since the data
of each node wasn't being updated when existing elements were re-added. Before adding this, the displayed anomaly data was stale after updating date ranges.
💚 Build SucceededHistory
To update your PR or re-run it, just comment with: |
* Adds ML status to the service maps popover. Modifies the query a bit to get the actual vs typical values necessary to generate a proper anomaly description * fixed failures and updated tests * component clean up * makes the ML link open in a new window * - Closes elastic#64278 by removing the framework badge. - changes anomaly score display to integer formatting - update link test to 'View anomalies' * - Closes elastic#65244 by displaying a message for services without anomalies detected - removes unecessary service framework name in service map queries - adds date range filter for anomaly detection
* Adds ML status to the service maps popover. Modifies the query a bit to get the actual vs typical values necessary to generate a proper anomaly description * fixed failures and updated tests * component clean up * makes the ML link open in a new window * - Closes #64278 by removing the framework badge. - changes anomaly score display to integer formatting - update link test to 'View anomalies' * - Closes #65244 by displaying a message for services without anomalies detected - removes unecessary service framework name in service map queries - adds date range filter for anomaly detection
if (serviceAnomalies) { | ||
const maxScore = serviceAnomalies.max_score; | ||
return { | ||
...service, | ||
max_score: maxScore, | ||
severity: getSeverity(maxScore), | ||
actual_value: serviceAnomalies.actual_value, | ||
typical_value: serviceAnomalies.typical_value, | ||
job_id: serviceAnomalies.job_id | ||
}; | ||
} | ||
return service; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the conditional here since serviceAnomalies
is not optional:
const servicesDataWithAnomalies = servicesData.map(service => {
const serviceAnomalies = anomaliesMap[service[SERVICE_NAME]];
const maxScore = serviceAnomalies.max_score;
return {
...service,
max_score: maxScore,
severity: getSeverity(maxScore),
actual_value: serviceAnomalies.actual_value,
typical_value: serviceAnomalies.typical_value,
job_id: serviceAnomalies.job_id
};
});
children | ||
}) => { | ||
const jobId = getMlJobId(serviceName, transactionType); | ||
export const MLJobLink: React.FC<Props> = props => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, are there any advantages to having a union over making some of the props optional?
interface Props {
serviceName: string;
transactionType?: string;
jobId?: string;
external?: boolean;
}
export const MLJobLink: React.FC<Props> = props => {
const jobId = props.jobId
? props.jobId
: getMlJobId(props.serviceName, props.transactionType);
const maxScore = selectedNodeData.max_score; | ||
const actualValue = selectedNodeData.actual_value; | ||
const typicalValue = selectedNodeData.typical_value; | ||
const jobId = selectedNodeData.job_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all any
because selectedNodeData
is vaguely typed. Currently:
selectedNodeData: cytoscape.NodeDataDefinition;
Should be changed to something like:
selectedNodeData: ServiceNode;
where ServiceNode
an intersection type:
type ServiceNode = cytoscape.NodeDataDefinition & TypeWithAnomalyDetectionFields
I expect TypeWithAnomalyDetectionFields
is defined somewhere on the backend like:
interface TypeWithAnomalyDetectionFields {
severity?: string;
maxScore?: string;
...
}
(or perhaps it can be inferred from a ReturnType)
@@ -129,38 +124,32 @@ async function getServicesData(options: IEnvOptions) { | |||
[SERVICE_NAME]: bucket.key as string, | |||
[AGENT_NAME]: | |||
(bucket.agent_name.buckets[0]?.key as string | undefined) || '', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like a hack to default agent name to an empty string (''
). What's wrong with making it optional and handle that explicitly downstream?
}; | ||
}) || [] | ||
); | ||
} | ||
|
||
function getAnomaliesData(options: IEnvOptions) { | ||
const { client } = options.setup; | ||
const { start, end, client } = options.setup; | ||
const rangeQuery = { range: rangeFilter(start, end, 'timestamp') }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that extracting rangeQuery
from the es query improves the readability. Additionally, what about just writing the query out without the helper?
{
range: { timestamp: { gte: start, lte: end, format: 'epoch_millis' } }
}
I think the helper is best if it's kept super simple, and only solves one case. if we extend it and try to make it more generic it loses its value.
Tested:
|
Partially addresses #64205 and closes #64278 & #65244.
Adds ML status to the service maps popover. Modifies the query a bit to get the actual vs typical values necessary to generate a proper anomaly description
![service-maps-ml-popover-4](https://user-images.githubusercontent.com/1967266/81093341-cfd11680-8eb6-11ea-98ec-004b60e07359.gif)