
[APM] Progressive fetching (experimental) #127598

Merged
merged 34 commits into from
Apr 21, 2022

Conversation

dgieselaar
Member

@dgieselaar dgieselaar commented Mar 14, 2022

This implements progressive fetching for the API endpoints used for the service inventory and the trace inventory. Here's how it works:

  • It's off by default. The user can select a low/medium/high sampling rate if they want to turn it on. The lower the sampling rate, the fewer documents are selected for inclusion in the aggregation.
  • If it's enabled, two requests are made to the API endpoint: one with a probability < 1, depending on the sampling rate, and one with a probability of 1 (unsampled). The probability value is passed to the random_sampler aggregation, which now wraps the aggregations we use to calculate the statistics for the service inventory/trace inventory.
  • The data that is available with the highest sampling rate will be shown on screen.
  • Once any data is in, remove the loading state from the tables so the user can interact with the elements.

Closes #126593.
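The two-phase flow described above can be sketched roughly as follows. This is an illustrative TypeScript sketch; the function names and the rate-to-probability mapping are assumptions, not the actual Kibana code:

```typescript
// Illustrative sketch of the two-phase fetch, NOT the actual Kibana code.
// The rate-to-probability mapping below is a made-up example.
type SamplingRate = 'off' | 'low' | 'medium' | 'high';

const probabilities: Record<SamplingRate, number> = {
  off: 1, // unsampled: every document is considered
  low: 0.001,
  medium: 0.01,
  high: 0.1,
};

// Wrap the statistics aggregations in a random_sampler agg when sampling
// is on; with a probability of 1 the wrapper is skipped entirely.
function withRandomSampler(
  aggs: Record<string, unknown>,
  probability: number
): Record<string, unknown> {
  if (probability >= 1) {
    return aggs;
  }
  return {
    sample: {
      random_sampler: { probability },
      aggs,
    },
  };
}
```

Per the PR description, the UI would issue `withRandomSampler(aggs, probabilities[rate])` and `withRandomSampler(aggs, 1)` in parallel and render whichever response has arrived with the highest sampling rate.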

@kibana-ci
Collaborator

kibana-ci commented Mar 14, 2022

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] Default CI Group #1 / APM API tests basic apm_mappings_only_8.0.0 Services APIs when data is loaded compare error rate value between service inventory, error rate chart, service inventory and transactions apis "before all" hook for "returns same avg error rate value for Transaction-based and Metric-based data"
  • [job] [logs] Default CI Group #10 / APM specs correlations latency correlations space with no features disabled sets the timePicker to return data

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id before after diff
apm 1150 1151 +1

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
observability 355 356 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
apm 2.8MB 2.8MB +495.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
observability 87.4KB 87.6KB +128.0B
Unknown metric groups

API count

id before after diff
observability 358 359 +1

ESLint disabled in files

id      before  after  diff
apm     15      14     -1
uptime  7       6      -1
total                  -2

ESLint disabled line counts

id      before  after  diff
apm     88      90     +2
uptime  48      42     -6
total                  -4

Total ESLint disabled count

id      before  after  diff
apm     103     104    +1
uptime  55      48     -7
total                  -6

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Comment on lines +96 to +97
random_sampler: {
  probability,
Member

This should probably be seeded by session. Otherwise, on every refresh, the initial data fetched could be different. Seeding allows for consistency between page refreshes.

Member Author

aye, any idea what that should/can be? can it be something like "apm-app"?

Member

Integer number only. So, an integer hash of a string is ok.
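A hypothetical sketch of what that could look like: derive a 32-bit integer seed from a string such as "apm-app" plus a session identifier. The hash function and the way the seed is attached are assumptions for illustration, not the code that was committed.

```typescript
// Hypothetical sketch: derive a stable 32-bit integer seed from a string,
// e.g. "apm-app" plus a session id, so random_sampler picks the same
// subset of documents across page refreshes.
function stringToSeed(input: string): number {
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    // multiply-and-add, coerced into the signed 32-bit range with `| 0`
    hash = (hash * 31 + input.charCodeAt(i)) | 0;
  }
  return hash;
}

// The seed would then accompany the probability in the aggregation body:
const randomSamplerAgg = {
  random_sampler: {
    probability: 0.01,
    seed: stringToSeed('apm-app'),
  },
};
```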

Member

@dgieselaar just confirming that for anything that is a visualization, the progressive fetching is seeded for a user's session :)

Member Author

Do you mean that this is handled in Elasticsearch?

Member

@dgieselaar no, it is not. Elasticsearch doesn't know about kibana user's sessions.

If the random_sampler is used for visualizations, it should be seeded. Otherwise a different subset of the data is used on every search call, which would be jarring as the visualization will subtly jump around.

Member Author

Dang, too bad, because I forgot about it. I'll create a follow-up issue. I don't think this is required for an experimental feature though.

Member Author

@@ -38,7 +38,7 @@ import { eventMetadataRouteRepository } from '../event_metadata/route';
import { suggestionsRouteRepository } from '../suggestions/route';
import { agentKeysRouteRepository } from '../agent_keys/route';

-const getTypedGlobalApmServerRouteRepository = () => {
+function getTypedGlobalApmServerRouteRepository() {
Member

I don't know why you changed this but thank you!! :D

Member Author

that makes two of us!

@lizozom
Contributor

lizozom commented Mar 15, 2022

This is a really exciting idea and I think that multiple teams will greatly benefit from this.

If this POC is successful, we could implement this as a search strategy where the randomly sampled response is emitted with isPartial: true and the full one is emitted with isPartial: false.

The search service API already supports emitting multiple responses, and it would be up to the consumer to decide whether they want to render the partial result or wait for the final response to render.
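The suggested shape could look something like this. The types and function names are assumptions for the sketch; the real Kibana search service types differ:

```typescript
// Assumed shapes for a sketch of the suggested search strategy;
// the real Kibana search service types differ.
interface SearchResponse<T> {
  isPartial: boolean;
  data: T;
}

// Emit the randomly sampled response first (partial), then the full one.
// The consumer decides whether to render the partial result or wait.
async function* progressiveSearch<T>(
  sampledFetch: () => Promise<T>,
  unsampledFetch: () => Promise<T>
): AsyncGenerator<SearchResponse<T>> {
  yield { isPartial: true, data: await sampledFetch() };
  yield { isPartial: false, data: await unsampledFetch() };
}
```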

@@ -10,4 +10,5 @@ export const maxSuggestions = 'observability:maxSuggestions';
export const enableComparisonByDefault = 'observability:enableComparisonByDefault';
export const enableInfrastructureView = 'observability:enableInfrastructureView';
export const defaultApmServiceEnvironment = 'observability:apmDefaultServiceEnvironment';
export const enableRandomSampling = 'observability:enableRandomSampling';
Contributor

Might be good to add an apm prefix as discussed before for other settings

@dgieselaar
Member Author

We pulled this from 8.2 after running out of time. We ran into various things, all related to the random_sampler aggregation being a relatively new feature:

  • The Observability test clusters and the ES image used in APM Integration Testing are not easily updated. The former is updated every one or two weeks, but it's a manual process. APM Integration Testing uses images from the unified release build, but that was red for a week during the development of the feature.
  • There were three consecutive ES bugs in the random_sampler aggregation. All were quickly fixed by Ben, but due to the aforementioned issue w/ working with the latest ES builds, it slowed down implementation of the feature on our side.
  • The Kibana QA team recently added a new option for the functional tests, where backwards compatibility for cross-cluster search is enforced. There is currently no way to disable this flag, so we cannot write any API tests for this feature.
  • The implementation unconditionally wrapped everything in a random_sampler agg, assuming it's a no-op. This made all the API tests fail. We could have probably fixed this, but it's a significant code change and given the time until FF, it's better to pull this change entirely.

@dgieselaar dgieselaar added release_note:enhancement Team:APM All issues that need APM UI Team support labels Apr 12, 2022
services: {
  terms: {
    field: SERVICE_NAME,
sampled: {
Contributor

for the sake of consistency can we use either sample or sampled? 🙏

Member Author

done! changed sampled to sample.

@kpatticha
Contributor

this is super exciting 🥳

it might not be in the scope of the current PR but I've noticed that the detailed_statistics request is fired only when both the sampled and unsampled service inventory fetches are completed.

The downside is that we won't see a performance improvement for rendering the sparklines; it might even be a bit slower because it depends on 2 requests.

@dgieselaar do you think is possible to decouple the dependency?

@dgieselaar
Member Author

@dgieselaar do you think is possible to decouple the dependency?

There's a possibility of the service names changing after the unsampled request comes in. I'd like to avoid triggering more than two requests for detailed_statistics.

The downside is that we won't see a performance improvement for rendering the sparklines; it might even be a bit slower because it depends on 2 requests.

There will still be a performance improvement compared to today - we already block on main_statistics before we fetch detailed_statistics, and detailed_statistics will use the same progressive loading technique, so the sampled data for detailed_statistics will show up earlier, and the unsampled data should show up at roughly the same time as today.
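That ordering could be sketched like this. The function names and the probabilities are hypothetical; only the sequencing reflects the comment above:

```typescript
// Hypothetical sketch of the fetch ordering described above.
// main_statistics runs sampled and unsampled in parallel; detailed_statistics
// only starts once the unsampled service names are final, and then uses the
// same progressive (sampled + unsampled) technique itself.
async function loadInventory(
  fetchMainStats: (probability: number) => Promise<string[]>,
  fetchDetailedStats: (
    serviceNames: string[],
    probability: number
  ) => Promise<unknown>
): Promise<unknown[]> {
  const [, serviceNames] = await Promise.all([
    fetchMainStats(0.1), // sampled: rendered as soon as it arrives
    fetchMainStats(1),   // unsampled: yields the final list of service names
  ]);
  return Promise.all([
    fetchDetailedStats(serviceNames, 0.1), // sampled sparklines show up early
    fetchDetailedStats(serviceNames, 1),   // unsampled arrives about as late as today
  ]);
}
```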

progressiveLoadingQuality
);

const sampledFetch = useFetcher(
Contributor

It would be nice to have some explanation about the difference between sampledFetch and unsampledFetch

Member Author

Do you mean about the concept of sampling? IMHO the variable names are pretty descriptive as-is, but yes, they require the reader to understand what sampling is. But explaining that is going to be a long comment here 😄

Contributor

Or at least something that says when we should use useProgressiveFetcher over useFetcher.

@dgieselaar
Member Author

@cauemarcondes when you use "request changes", can you be explicit about the changes you're requesting? IMHO it should be reserved for blockers and I don't really see any.


const unsampledFetch = useFetcher(
  (regularCallApmApi) => {
    return callback(clientWithProbability(regularCallApmApi, 1));
Contributor

Do you think it would be clearer if you use ProgressiveLoadingQuality here instead of 1?

Suggested change:
- return callback(clientWithProbability(regularCallApmApi, 1));
+ return callback(clientWithProbability(regularCallApmApi, ProgressiveLoadingQuality.off));

Contributor

@cauemarcondes cauemarcondes left a comment

LGTM very nice! 👏🏻

@cauemarcondes
Contributor

@cauemarcondes when you use "request changes", can you be explicit about the changes you're requesting? IMHO it should be reserved for blockers and I don't really see any.

Yeah I shouldn't have selected "request changes" my bad.

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id             before  after  diff
apm            1185    1186   +1
observability  390     391    +1
total                         +2

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
observability 366 370 +4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
apm 2.8MB 2.8MB +640.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
observability 91.4KB 91.9KB +459.0B
Unknown metric groups

API count

id before after diff
observability 369 373 +4

ESLint disabled line counts

id before after diff
apm 89 92 +3

Total ESLint disabled count

id before after diff
apm 104 107 +3


@dgieselaar dgieselaar merged commit 7af6915 into elastic:main Apr 21, 2022
@dgieselaar dgieselaar deleted the use-progressive-fetcher branch April 21, 2022 10:11
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Apr 21, 2022
@kibanamachine
Contributor

⚪ Backport skipped

The pull request was not backported as there were no branches to backport to. If this is a mistake, please apply the desired version labels or run the backport tool manually.

Manual backport

To create the backport manually run:

node scripts/backport --pr 127598

Questions?

Please refer to the Backport tool documentation

dmlemeshko pushed a commit to dmlemeshko/kibana that referenced this pull request May 5, 2022
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
kertal pushed a commit to kertal/kibana that referenced this pull request May 24, 2022
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Labels
auto-backport Deprecated: Automatically backport this PR after it's merged backport:skip This commit does not require backporting release_note:enhancement Team:APM All issues that need APM UI Team support v8.3.0
Development

Successfully merging this pull request may close these issues.

[APM] Add experimental probability setting for aggregations