Handling huge cardinality of page load transaction names #56

hmdhk · 2018-09-21T08:11:19Z

A little bit of background on the issue: we used to include the page URL (without the query string) as the transaction name by default, and we had the same problem. Even without the query string, a lot of applications simply include ids in their URLs.

Currently the default transaction name is unknown rather than the page URL, so all transactions are grouped under unknown by default. There is an API that lets users set the initial page load transaction name, and setting it to the page URL is probably the easiest way to name the page. This can create a large number of transaction names.
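
To make the problem concrete, here is a minimal sketch of naming page loads by URL with the query string removed. `stripQueryString` is a hypothetical helper and the example URLs are made up; neither is part of the agent.

```typescript
// Hypothetical helper (not part of the agent API): drop the query string
// and fragment from a page URL so it can be used as a transaction name.
function stripQueryString(pageUrl: string): string {
  const cut = pageUrl.search(/[?#]/);
  return cut === -1 ? pageUrl : pageUrl.slice(0, cut);
}

// Made-up example URLs. Both visits to product 1002 collapse into one
// name, but product 1001 still gets a separate name because the id is
// part of the path.
const exampleNames: string[] = [
  "https://example.com/products/1001?ref=home",
  "https://example.com/products/1002?ref=search",
  "https://example.com/products/1002#reviews",
].map(stripQueryString);
```

Every distinct id in the path still produces its own transaction name, which is where the cardinality comes from.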

Some solutions:

- Having a URL pattern config option that we use to set the transaction name from the page URL
- Implementing heuristics on the agent that try to detect ids in the URL (I've made a POC)
- Implementing a grouping algorithm on the Kibana side
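
As a rough illustration of the heuristic option, the sketch below replaces id-like path segments with `*`. This is not the POC mentioned above; the rules for what counts as an id (all digits, a UUID, or a long hex string) and the names `looksLikeId` and `groupPath` are assumptions made for the example.

```typescript
// Assumed heuristic, not the author's POC: a path segment is treated as an
// id if it is all digits, a UUID, or a hexadecimal string of 16+ characters.
function looksLikeId(segment: string): boolean {
  return (
    /^\d+$/.test(segment) ||
    /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(segment) ||
    /^[0-9a-f]{16,}$/i.test(segment)
  );
}

// Replace id-like segments of a URL path with "*" so that pages differing
// only by id share one transaction name.
function groupPath(path: string): string {
  return path
    .split("/")
    .map((segment) => (looksLikeId(segment) ? "*" : segment))
    .join("/");
}
```

With these rules, `/users/42/orders/3f2504e0-4f89-11d3-9a0c-0305e82c3301` becomes `/users/*/orders/*`, while `/about` is left alone. A heuristic like this will miss human-readable slugs and will also collapse purely numeric segments that are not ids, such as a year in `/blog/2018`.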

alvarolobato · 2018-09-21T10:34:55Z

I think we could take a two-step approach here:

1. Improve the API by adding a method, in addition to setInitialPageLoadName(), that takes the URL directly and strips the query string. We could instead provide a helper method or tell the user how to do it, but I would prefer having a specific function for it.
2. Provide a way to initialize a list of matching patterns that the user defines in order to strip the parameters embedded in the URL. I would try to use simple matching patterns and avoid using regex for simplicity. The matching could be done either in the agent or on the server, but a single match per page load in the agent does not seem like a big deal, and it potentially saves resources on the server.
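
A minimal sketch of what regex-free pattern matching could look like. The pattern syntax (`*` stands for exactly one non-empty path segment), the function names, and the fallback argument are all assumptions for illustration, not an existing agent or server option.

```typescript
// Hypothetical pattern syntax: "*" stands for exactly one non-empty path
// segment. Matching is a plain string comparison per segment, no regex.
function matchesPattern(path: string, pattern: string): boolean {
  const pathParts = path.split("/");
  const patternParts = pattern.split("/");
  if (pathParts.length !== patternParts.length) {
    return false;
  }
  return patternParts.every(
    (part, i) => (part === "*" && pathParts[i] !== "") || part === pathParts[i]
  );
}

// Return the first user-defined pattern that matches the path, or the
// fallback name when none does.
function transactionNameFor(
  path: string,
  patterns: string[],
  fallback: string
): string {
  for (const pattern of patterns) {
    if (matchesPattern(path, pattern)) {
      return pattern;
    }
  }
  return fallback;
}
```

For example, with the patterns `/users/*` and `/users/*/orders/*`, the path `/users/42/orders/7` would be named `/users/*/orders/*`, and an unmatched path such as `/about` would get the fallback. This is a single pass over a short list per page load, which fits the point about the cost in the agent being small.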

Also, the default behaviour could be changed, probably to step 1, i.e. a name based on the URL. In my opinion that is better than the current unknown.
cc. @roncohen

sorenlouv · 2018-09-21T12:41:23Z

> I would try to use simple matching patterns and avoid using regex for simplicity.

I like minimatch for these use cases:
https://github.com/isaacs/minimatch#minimatch

alvarolobato · 2018-12-04T13:16:02Z

Related to elastic/kibana#26544

hmdhk · 2020-06-10T10:32:43Z

We had a meeting around sampling and high cardinality:

We discussed storage and network traffic reduction:
- For storage, we should look into aggregation and trimming data (e.g. removing spans for older transactions).
- For network traffic:
  - We still need to have sampling in some form or another.
  - We discussed providing config options to let the user decide which transactions are important (this can be provided through central config).
  - Another idea is to crawl the website, discover the URLs, and let the user choose in the UI.

High cardinality issue:
- We will provide a config option to let the user specify the URL pattern (this can be configured in central config or in apm-server) -> issue
- We discussed a heuristic-based solution (POC).
- We also discussed using machine learning to categorise URL sections (I will do a POC on this).
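
To illustrate how discovered URLs could be turned into pattern suggestions for the user, here is a sketch based on a plain frequency count. It is not machine learning and not the POC from the meeting; `suggestPatterns` and the `maxDistinct` threshold are assumptions for the example.

```typescript
// Assumed approach: among paths with the same number of segments, a segment
// position is considered variable when it takes more than `maxDistinct`
// different values. Variable positions are replaced with "*".
function suggestPatterns(paths: string[], maxDistinct: number): string[] {
  // distinct[segmentCount][position] is a null-prototype object whose keys
  // are the values seen at that position.
  const distinct: { [segmentCount: number]: { [value: string]: boolean }[] } = {};
  const split = paths.map((p) => p.split("/"));

  for (const parts of split) {
    const segmentCount = parts.length;
    if (!distinct[segmentCount]) {
      distinct[segmentCount] = parts.map(() => Object.create(null));
    }
    parts.forEach((value, i) => {
      distinct[segmentCount][i][value] = true;
    });
  }

  // Build one pattern per path and keep each pattern once, in first-seen order.
  const seen: { [pattern: string]: boolean } = Object.create(null);
  const patterns: string[] = [];
  for (const parts of split) {
    const counts = distinct[parts.length];
    const pattern = parts
      .map((value, i) =>
        Object.keys(counts[i]).length > maxDistinct ? "*" : value
      )
      .join("/");
    if (!seen[pattern]) {
      seen[pattern] = true;
      patterns.push(pattern);
    }
  }
  return patterns;
}
```

Given `/users/1/profile`, `/users/2/profile`, `/users/3/profile`, `/about`, and `/pricing` with `maxDistinct` set to 2, this suggests `/users/*/profile`, `/about`, and `/pricing`. The threshold has to be larger than the number of genuinely distinct static pages at a given position, otherwise those pages would be collapsed as well, which is one reason to let the user confirm the suggestions in the UI.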

cc @axw , @drewpost @vigneshshanmugam

hmdhk added the [zube]: Inbox label Sep 21, 2018

hmdhk mentioned this issue Sep 21, 2018

Provide a guide to setting initial page load transaction name #58

Closed

hmdhk added [zube]: Backlog and removed [zube]: Inbox labels Nov 14, 2018

sorenlouv mentioned this issue May 25, 2020

[APM] Show warning in UI when cardinality of transaction.name exceeds threshold elastic/kibana#67273

Closed

hmdhk added this to the Next milestone Jun 10, 2020

hmdhk self-assigned this Jun 10, 2020

hmdhk mentioned this issue Jun 10, 2020

Provide an option to configure the url pattern for RUM data elastic/apm-server#3868

Open

vigneshshanmugam mentioned this issue Jun 26, 2020

feat(rum): categorize transactions based on current url #827

Merged

vigneshshanmugam removed the [zube]: Backlog label Aug 13, 2020

vigneshshanmugam modified the milestones: Next, Backlog Aug 13, 2020

hmdhk added the enhancement label Oct 7, 2020
