-
Notifications
You must be signed in to change notification settings - Fork 8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[APM] Migrate tx latency chart and group stats to rollups/service metrics #153162
[APM] Migrate tx latency chart and group stats to rollups/service metrics #153162
Conversation
Pinging @elastic/apm-ui (Team:APM) |
…-ref HEAD~1..HEAD --fix'
@@ -15,6 +15,7 @@ export const isAlertDetailsEnabledPerApp = ( | |||
alert: TopAlert | null, | |||
config: ConfigSchema | null | |||
): boolean => { | |||
console.log(alert, config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only file change that we own, so after removing it there is no need for review from AO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! missed this
…w-rollups-latency-group-stats
…w-rollups-latency-group-stats
export type ApmTransactionDocumentType = | ||
| ApmDocumentType.TransactionMetric | ||
| ApmDocumentType.TransactionEvent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about service transaction metric?
export type ApmTransactionDocumentType = | |
| ApmDocumentType.TransactionMetric | |
| ApmDocumentType.TransactionEvent; | |
export type ApmTransactionDocumentType = | |
| ApmDocumentType.ServiceTransactionMetric | |
| ApmDocumentType.TransactionMetric | |
| ApmDocumentType.TransactionEvent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we add ApmDocumentType.ServiceTransactionMetric
to this one, it becomes in ApmServiceTransactionDocumentType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it's about making sure that the metric contains certain data. ie, service-instance specific data (like host) will not be available on transaction metrics.
import { | ||
LatencyAggregationType, | ||
latencyAggregationTypeRt, | ||
} from '../../../common/latency_aggregation_types'; | ||
import { getApmEventClient } from '../../lib/helpers/get_apm_event_client'; | ||
import { getSearchTransactionsEvents } from '../../lib/helpers/transactions'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR but afaict getSearchTransactionsEvents
should be renamed getSearchAggregatedTransactions
. It was a little confusing debugging this when that function is named the exact opposite of what it actually does.
import { getSearchTransactionsEvents } from '../../lib/helpers/transactions'; | |
import { getSearchAggregatedTransactions } from '../../lib/helpers/transactions'; |
Btw. I don't think we should use the term "aggregated transactions" anymore. I'd prefer to use the terms "transaction metrics" and "service transaction metrics" instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll eventually remove it anyway.
@@ -65,6 +69,9 @@ export function LatencyChart({ height, kuery }: Props) { | |||
useTransactionLatencyChartsFetcher({ | |||
kuery, | |||
environment, | |||
transactionName: | |||
'transactionName' in query ? query.transactionName : null, | |||
latencyAggregationType: getLatencyAggregationType(latencyAggregationType), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use io-ts to validate that latencyAggregationType
is indeed of type LatencyAggregationType
?
Similar to what we do for sortDirection
:
sortDirection: t.union([t.literal('asc'), t.literal('desc')]), |
latencyAggregationType: getLatencyAggregationType(latencyAggregationType), | |
latencyAggregationType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getLatencyAggregationType
also returns a default value if it's undefined.
serviceName={serviceName} | ||
start={historicalRange.start} | ||
end={historicalRange.end} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this fix related to this this PR and the move to service transaction metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a cosmetic change, it doesn't change the behaviour of the component.
|
||
const response = await apmEventClient.search( | ||
'get_service_transaction_groups', | ||
{ | ||
apm: { | ||
events: [ | ||
getProcessorEventForTransactions(searchAggregatedTransactions), | ||
sources: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should eventually replace events
with sources
everywhere - even for the queries that don't support rollups or service metrics. It adds complexity to have to separate ways to define this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed! I'm adding some additional support in a follow-up PR.
@@ -99,9 +101,6 @@ export async function getServiceTransactionGroups({ | |||
], | |||
}, | |||
}, | |||
...getDocumentTypeFilterForTransactions( | |||
searchAggregatedTransactions | |||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer needed because you apply the filter in APMEventClient
, right? We should aim to do this consistently everywhere. I'm sure someone will use events
and then forget to add the filter explicitly because they expect it to be added automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep!
start, | ||
end, | ||
kuery, | ||
numBuckets: 20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaict numBuckets
is used to calculate bucketSizeInSeconds
- but this is not needed for the table. Can it be omitted?
numBuckets: 20, | |
numBuckets: 20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transaction group table has spark plots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this is for the main statistics (which doesn't include spark plots afaik)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good point. I think it should still be there though, 1 bucket might lead to a bigger error boundary (e.g. selecting the 1 hour bucket for 1.5 hours will lead to skewed statistics).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. It still seems odd to specify number of buckets when the expected return value is a single metric number (as opposed to a timeseries) but no biggie.
x-pack/plugins/apm/public/components/shared/transactions_table/index.tsx
Outdated
Show resolved
Hide resolved
x-pack/plugins/apm/public/components/app/service_inventory/index.tsx
Outdated
Show resolved
Hide resolved
return ( | ||
<TimeRangeMetadataContextProvider | ||
uiSettings={uiSettings} | ||
useSpanName={isOperationView} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something still feels odd about this option (useSpanName
). I don't understand why we have to special-case service destination metrics like this - isn't it just another source similar to transaction metrics and transaction events?
I think the response of GET /internal/apm/time_range_metadata
shows the inconsistency:
{
"isUsingServiceDestinationMetrics": true,
"sources": [
{
"documentType": "serviceTransactionMetric",
"rollupInterval": "1m",
"hasDocs": true
},
{
"documentType": "serviceTransactionMetric",
"rollupInterval": "10m",
"hasDocs": true
},
{
"documentType": "serviceTransactionMetric",
"rollupInterval": "60m",
"hasDocs": true
},
{
"documentType": "transactionMetric",
"rollupInterval": "1m",
"hasDocs": true
},
{
"documentType": "transactionMetric",
"rollupInterval": "10m",
"hasDocs": true
},
{
"documentType": "transactionMetric",
"rollupInterval": "60m",
"hasDocs": true
},
{
"documentType": "transactionEvent",
"rollupInterval": "none",
"hasDocs": true
}
]
}
Shouldn't the service destination metrics be part of sources, like:
{
"sources": [
{
"documentType": "serviceDestinationMetrics",
"rollupInterval": "none",
"hasDocs": true
},
{
"documentType": "spanEvents",
"rollupInterval": "none",
"hasDocs": true
},
// ...
]
}
Apart from making things more consistent, we're now able to have views where one component uses serviceDestinationMetrics
and another uses spanEvents
similar to what we do with transactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In earlier versions span.name
was not available on span metrics. That was a fairly recent change. But we have a couple of views that were added after, and require span.name
to be present. If span.name
is not available on span metrics, we fall back to span events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then what about something like this?
{
"sources": [
{
"documentType": "spanMetricsWithName",
"rollupInterval": "none",
"hasDocs": false
},
{
"documentType": "spanMetrics",
"rollupInterval": "none",
"hasDocs": true
},
{
"documentType": "spanEvents",
"rollupInterval": "none",
"hasDocs": true
},
// ...
]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've considered that, also in the context of transaction metrics with transaction.duration.summary
, but I don't think different document types for schema changes will scale. I would rather return some metadata with the document type, e.g. a version or a meta
property. But I will tackle that when I implement support for transaction.duration.summary
on tx metrics (which might not include changes to service destination metrics as it makes sense to tackle that separately).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather return some metadata with the document type, e.g. a version or a meta property.
That makes sense. Not a blocker but it would be cleaner with a "unified" model for this. The useSpanName
seems like an afterthought.
|
@@ -18,7 +18,7 @@ export function RouterProvider({ | |||
}: { | |||
router: Router<RouteMap>; | |||
history: History; | |||
children: React.ReactNode; | |||
children?: React.ReactNode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, no need for it to be required.
export type ApmTransactionDocumentType = | ||
| ApmDocumentType.TransactionMetric | ||
| ApmDocumentType.TransactionEvent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we add ApmDocumentType.ServiceTransactionMetric
to this one, it becomes in ApmServiceTransactionDocumentType
…/index.tsx Co-authored-by: Søren Louv-Jansen <sorenlouv@gmail.com>
…ex.tsx Co-authored-by: Søren Louv-Jansen <sorenlouv@gmail.com>
This has always been the behaviour, it's not changing. |
…w-rollups-latency-group-stats
….com:dgieselaar/kibana into service-overview-rollups-latency-group-stats
💚 Build Succeeded
Metrics [docs]Async chunks
Unknown metric groupsESLint disabled line counts
Total ESLint disabled count
History
To update your PR or re-run it, just comment with: |
Migrates the transaction latency chart and the transaction group statistics table to use service overview metrics/rollups when appropriate.
Part of #152693 and #146683 (the work is split up to prevent a huge PR).