
[Application Usage] Better SO management #87840

Closed
afharo opened this issue Jan 11, 2021 · 8 comments · Fixed by #94923
Labels
Feature:Telemetry Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@afharo
Member

afharo commented Jan 11, 2021

Now that SO's incrementCounter allows atomic custom increases (and for multiple fields at once), we can get rid of the transactional documents.
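To make the idea concrete, here is a minimal in-memory sketch of what "atomic custom increases for multiple fields at once" could look like. It is not the saved objects API itself: the `InMemoryCounterStore` class, the `application_usage_daily` type name, the document id format, and the `numberOfClicks` / `minutesOnScreen` field names are all assumptions made for illustration, and the real `incrementCounter` signature should be checked against the Kibana source.

```typescript
// Hypothetical in-memory stand-in for a saved objects store; not Kibana's real API.
interface CounterField {
  fieldName: string;
  incrementBy: number;
}

class InMemoryCounterStore {
  private docs = new Map<string, Record<string, number>>();

  // Applies every increment to one document in a single synchronous step,
  // creating the document on first use.
  incrementCounter(type: string, id: string, fields: CounterField[]): Record<string, number> {
    const key = `${type}:${id}`;
    const doc = this.docs.get(key) ?? {};
    for (const { fieldName, incrementBy } of fields) {
      doc[fieldName] = (doc[fieldName] ?? 0) + incrementBy;
    }
    this.docs.set(key, doc);
    return { ...doc };
  }
}

// Each client report updates the daily document directly;
// no transactional document is written in between.
const counterStore = new InMemoryCounterStore();
counterStore.incrementCounter('application_usage_daily', 'dashboards:2021-01-11', [
  { fieldName: 'numberOfClicks', incrementBy: 3 },
  { fieldName: 'minutesOnScreen', incrementBy: 2.5 },
]);
const dailyDoc = counterStore.incrementCounter('application_usage_daily', 'dashboards:2021-01-11', [
  { fieldName: 'numberOfClicks', incrementBy: 1 },
  { fieldName: 'minutesOnScreen', incrementBy: 0.5 },
]);
```

In this sketch both reports land in the same daily document, which is the property that would make the intermediate transactional documents unnecessary.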

Maybe we can keep the rollup logic throughout the rest of the 7.x series to make sure we roll up the existing transactional documents before removing them completely.

Alternatively, we could explore a new way of reporting this type of usage. Similar to how UI Counters work, we could only provide the deltas for the last 3 days. That would require a common API on the server-side to unify their logic (#81645)

@afharo afharo added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Telemetry labels Jan 11, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@Bamieh
Member

Bamieh commented Jan 29, 2021

It would be nice to have a bulk delete api for savedobjects too.

@pgayvallet
Contributor

> It would be nice to have a bulk delete api for savedobjects too.

tbh, I'm surprised we don't have this yet. However, given the technical implications, it could probably benefit from its own issue.

@afharo
Member Author

afharo commented Feb 1, 2021

++ @pgayvallet I think it already exists: #30503

@pgayvallet
Contributor

@alexfrancoeur due to improvements in the saved objects API, we no longer need the transactional documents we were using for application usage, as we will be able to directly increment the daily application usage doc when receiving reports from the client-side.

Now the question is: is it acceptable to totally drop support for the transaction documents?

To understand the implications: daily and total documents for application usage were added in 7.9. This means that if we were to totally drop support for transactional documents, any customer upgrading directly from 7.8 or older to 7.13 or higher would 'lose' their application usage data, as we would not have the opportunity to convert/roll the transactional documents into daily aggregations.

The other option would be to still perform the improvement by directly writing to daily instead of transactional, but keeping the transactional->daily rollup mechanism in 7.x to avoid this loss of data when migrating from/to the specified versions. Technically, this is slightly more complicated, as it would cause divergences between the master and 7.x branches, which we would like to avoid, if possible.

Note that in both cases, the transactional support will be removed in 8.0, meaning that customers migrating directly from 7.8 or older to 8.0 or higher will lose their application usage data. So the question is 'only' whether preserving the application usage data when upgrading from 7.8 or older to a 7.x release of 7.13 or higher is mandatory.

cc @afharo: did I say anything wrong here 😅 ?

@alexfrancoeur

Generally, if transactional support will be going away anyway, I'd prefer to make any drastic changes sooner rather than later. That being said, I'm not totally sure I understand the implications for analysis of moving off of transactional documents. Today, the end result for analysis is cluster-wide time spent in, and clicks into, applications over a few given timeframes in a cluster snapshot. We are leveraging application usage telemetry to identify adoption of applications, so there may be a larger impact that we should discuss.

> any customer upgrading directly from 7.8 or older to 7.13 or higher would 'lose' their application usage data, as we would not have the opportunity to convert/roll the transactional documents into daily aggregations

For analysis, we only have aggregate counts, so if individual transactions are being captured, I don't believe we ever see them. When you say lose the data, do you mean a reset? Meaning, the transactional documents go away, and we start incrementing from 0 at the point of upgrade? I'd like to make this transition as seamless as possible. If we're talking about resetting the counters to 0 on an upgrade, I wonder if we can "fill in" some of these gaps on ingest somehow.

I think it might be worth a quick live sync and include a few folks not on this issue to truly understand impact before making this decision. Happy to set something up.

cc: @thesmallestduck

@afharo
Member Author

afharo commented Mar 18, 2021

Hi @alexfrancoeur, I tried to draw a diagram of the differences across versions:

[Image: Untitled drawing]

In 7.7 & 7.8, we only had transactional documents (each containing 3 minutes' worth of data from a user's browser). The collector would then read all of them and aggregate them into the last 7/30/90 days and total counters. This strategy caused an explosion in the number of SOs. That explosion led some heavily used clusters to fail to report these metrics because they had more than 10k transactional documents (on top of other side effects, like migrations taking too long or import/export issues).
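As a rough illustration of the 7.7/7.8 collector behaviour described above, the sketch below reads every transactional document and folds it into 7/30/90-day and total counters. The `TransactionalDoc` shape, the field names, and the `aggregateByApp` helper are assumptions for illustration only (and only clicks are tracked, for brevity); this is not the actual collector code.

```typescript
// Hypothetical shape of one transactional report; field names are illustrative.
interface TransactionalDoc {
  appId: string;
  timestamp: string; // ISO 8601, UTC
  numberOfClicks: number;
}

interface WindowTotals {
  clicksTotal: number;
  clicks7Days: number;
  clicks30Days: number;
  clicks90Days: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Reads every document once, so the cost grows with the number of transactional docs.
function aggregateByApp(docs: TransactionalDoc[], now: Date): Map<string, WindowTotals> {
  const result = new Map<string, WindowTotals>();
  for (const doc of docs) {
    const ageMs = now.getTime() - new Date(doc.timestamp).getTime();
    const totals: WindowTotals = result.get(doc.appId) ?? {
      clicksTotal: 0,
      clicks7Days: 0,
      clicks30Days: 0,
      clicks90Days: 0,
    };
    totals.clicksTotal += doc.numberOfClicks;
    if (ageMs <= 7 * DAY_MS) totals.clicks7Days += doc.numberOfClicks;
    if (ageMs <= 30 * DAY_MS) totals.clicks30Days += doc.numberOfClicks;
    if (ageMs <= 90 * DAY_MS) totals.clicks90Days += doc.numberOfClicks;
    result.set(doc.appId, totals);
  }
  return result;
}

const sampleDocs: TransactionalDoc[] = [
  { appId: 'dashboards', timestamp: '2021-03-17T00:00:00Z', numberOfClicks: 2 },
  { appId: 'dashboards', timestamp: '2021-03-01T00:00:00Z', numberOfClicks: 5 },
  { appId: 'dashboards', timestamp: '2020-11-01T00:00:00Z', numberOfClicks: 10 },
];
const windowTotals = aggregateByApp(sampleDocs, new Date('2021-03-18T00:00:00Z'));
```

Because every report has to be loaded before it can be counted, a cluster with more transactional documents than a single query can return would under-report or fail, which matches the 10k problem described above.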

In 7.9, due to the explosion in the number of SOs, we added rollup logic that runs every 30 minutes to pre-aggregate the transactional documents into daily objects, reducing the number of transactional SOs to at most 10 docs per user. We were still using transactional docs to avoid concurrency issues when editing the daily counters.
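The rollup step can be sketched as follows, under stated assumptions: the `UsageReport` and `DailyDoc` shapes, the `appId:day` key format, and the `rollUpToDaily` helper are hypothetical, and draining an array stands in for deleting saved objects. It is an illustration of the idea, not the actual 7.9 implementation.

```typescript
// Hypothetical document shapes; field names are illustrative.
interface UsageReport {
  appId: string;
  timestamp: string; // ISO 8601, UTC
  numberOfClicks: number;
}

interface DailyDoc {
  appId: string;
  day: string; // YYYY-MM-DD
  numberOfClicks: number;
}

// Folds all pending transactional reports into per-app, per-day documents,
// then leaves the transactional list empty (stand-in for deleting those SOs).
function rollUpToDaily(transactional: UsageReport[], daily: Map<string, DailyDoc>): void {
  const pending = transactional.splice(0); // drain the list in place
  for (const report of pending) {
    const day = report.timestamp.slice(0, 10); // date part of the ISO timestamp
    const key = `${report.appId}:${day}`;
    const existing = daily.get(key) ?? { appId: report.appId, day, numberOfClicks: 0 };
    existing.numberOfClicks += report.numberOfClicks;
    daily.set(key, existing);
  }
}

const pendingReports: UsageReport[] = [
  { appId: 'dashboards', timestamp: '2021-03-17T10:00:00Z', numberOfClicks: 2 },
  { appId: 'dashboards', timestamp: '2021-03-17T10:03:00Z', numberOfClicks: 1 },
  { appId: 'discover', timestamp: '2021-03-18T09:00:00Z', numberOfClicks: 4 },
];
const dailyDocs = new Map<string, DailyDoc>();
rollUpToDaily(pendingReports, dailyDocs);
```

Running a function like this on a timer keeps the number of transactional documents bounded by whatever arrives between two runs, while the daily documents grow by at most one per application per day.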

Now that we have APIs that allow us to do so, we are considering atomically increasing the counters of the daily objects without relying on the rollup logic. So we won't generate any transactional documents anymore.

The question is: should we keep the rollup logic to migrate the transactional documents to daily for the clusters that will upgrade to 7.13+?
For those already in 7.9+, the number of transactional documents will be close to 0, so, when they upgrade, the impact of not using them anymore should be minimal. However, if a customer upgrades from 7.7/7.8 to 7.13, that would result in a full reset of the counters for those clusters.
As @pgayvallet mentioned, we wouldn't like to maintain this logic forever. So 8.0 would remove the rollup logic anyway: meaning that if customers upgrade from 7.7/7.8 to 8.0, the reset will happen anyway.

I hope I shed some light 😇

@pgayvallet
Contributor

FWIW, as it didn't impact much of the code anyway, I preserved the transactional documents in #94923. The transactional->daily rolling is done only once, at startup, to preserve the BWC during migration from older versions.

kibana-core [DEPRECATED] automation moved this from 7.13 to Done (7.13) Mar 26, 2021
5 participants