[Azure Logs] Set enabled: false as default for all data streams #4373
🚀 Benchmarks report
Data stream | Previous EPS | New EPS | Diff (%) | Result |
---|---|---|---|---|
springcloudlogs | 3816.79 | 2049.18 | -1767.61 (-46.31%) | 💔 |
activitylogs | 1280.41 | 623.05 | -657.36 (-51.34%) | 💔 |
application_gateway | 2074.69 | 1278.77 | -795.92 (-38.36%) | 💔 |
auditlogs | 2207.51 | 1180.64 | -1026.87 (-46.52%) | 💔 |
firewall_logs | 2004.01 | 1459.85 | -544.16 (-27.15%) | 💔 |
identity_protection | 3267.97 | 1700.68 | -1567.29 (-47.96%) | 💔 |
platformlogs | 4545.45 | 2132.2 | -2413.25 (-53.09%) | 💔 |
provisioning | 2777.78 | 2008.03 | -769.75 (-27.71%) | 💔 |
signinlogs | 1776.2 | 957.85 | -818.35 (-46.07%) | 💔 |
To see the full report comment with /test benchmark fullreport
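For anyone double-checking the table, the Diff (%) column appears to be the absolute EPS change followed by the change relative to the previous EPS. A minimal sketch under that assumption (the helper name `eps_diff` is hypothetical and not part of the benchmark tooling):

```python
from typing import Tuple


def eps_diff(previous_eps: float, new_eps: float) -> Tuple[float, float]:
    """Return (absolute change, percent change relative to previous_eps)."""
    absolute = new_eps - previous_eps
    percent = absolute / previous_eps * 100
    return round(absolute, 2), round(percent, 2)


# Values taken from the springcloudlogs row of the table.
absolute, percent = eps_diff(3816.79, 2049.18)
print(f"{absolute} ({percent}%)")
```

Run against the other rows, the same formula should reproduce the reported figures if the assumption holds.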
Have we evaluated what will happen to users upgrading to this version? Will this significantly alter the behaviour of the current package?
+1 on @endorama's comment. This is a great change; let's make sure it doesn't break existing users.
IIRC I tested this scenario. I will rerun the upgrade scenario and document it in the description.
@endorama @kaiyan-sheng, I tested upgrading Azure Logs integration from version 1.3.0 to version 1.5.1. In the following video, I:
CleanShot.2022-10-04.at.17.17.04.mp4
Looks good to me! Thank you for testing and the video! Could you add a changelog to this please?
Force-pushed from 4c234a7 to a666258.
I have found only one downside. When I install an "individual" integration (in the example below, Azure Active Directory) instead of the "collective" one (Azure Logs), the data stream is disabled. This is both not great and okay.
Video: CleanShot.2022-10-04.at.17.31.43.mp4
@zmoog I think it's okay in this case because there are different data streams (audit, provisioning, sign-in...) under Azure Active Directory for users to choose from. If I understand correctly, we recommend that users use different event hubs for these log types under Azure Active Directory, right?
Yep, the idea is to invite users to enable only the data streams they use. We also have other integrations with just one data stream. The data stream would also be disabled by default in a single-data-stream integration. Do you see this as a problem?
Each Here's an example from Azure AD where:
The end goal is to avoid the scenario where users install the Azure Logs integration and export log categories for only 2-3 of the data streams. But, since the default setting is "enabled", they end up having one
Today we already have 10 data streams in the Azure Logs integration. We already have new ones planned, so this inefficiency and the potential problems will probably worsen over time.
In the latest Azure Logs doc update, we expanded the event hub section with advice on properly setting up the integration. High-volume deployments would probably benefit from a dedicated event hub. I plan to further expand this content.
Let me know what you think! 🙇
My opinion is having false as the default to prevent users from using one
Yeah, I think the trade-off is worth it too. Thank you, @kaiyan-sheng, for taking some time to think about this! 🙇
The Azure Logs package has a growing number of data streams. When users install the collective Azure Logs integration, we end up in a situation where Filebeat spawns a lot of "azureeventhub" inputs behind the scenes, all receiving data from the same event hub. Many (data streams) to one (event hub) is not a good default: each input receives a copy of each message published on the event hub. A better default is to have all inputs disabled. In this case, Filebeat spawns a new input only if the user explicitly enables the data stream.
Force-pushed from a666258 to d065c0d.
What does this PR do?
Set enabled: false as the default status for all Azure Logs data streams.
Why?
When users install the collective Azure Logs integration, all data streams start as enabled. With this default setting, we end up in a situation where Filebeat spawns a lot of "azureeventhub" inputs behind the scenes: one for each existing data stream, all receiving data from the same event hub.
Many (data streams) to one (event hub) is not a good default.
In this case, each input receives a copy of each message published on the event hub. Some data streams have pipelines that can filter incoming logs by category (sign-in, audit), but others will try to ingest any log category (generic event hub and platform logs).
A better default is to have all inputs disabled.
In this case, Filebeat spawns a new input only if the user explicitly enables the data stream, so no unintended input runs for no reason.
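The difference between the two defaults can be illustrated with a toy model. This is only a sketch, not Filebeat code: it assumes one input per enabled data stream on a single shared event hub, with every input receiving every message. The function name `deliveries` and the sample messages are illustrative.

```python
from typing import Dict, List


def deliveries(enabled: Dict[str, bool], messages: List[str]) -> Dict[str, int]:
    """Count messages delivered to each input on one shared event hub.

    Toy model: one input per enabled data stream, and every input
    receives a copy of every message published on the event hub.
    """
    return {name: len(messages) for name, is_on in enabled.items() if is_on}


streams = ["activitylogs", "auditlogs", "signinlogs", "platformlogs"]
messages = ["sign-in event", "audit event", "sign-in event"]

# Old default: every data stream enabled, so every input gets every message.
all_on = deliveries({s: True for s in streams}, messages)

# New default: everything disabled; the user enables only what they export.
opt_in = {s: False for s in streams}
opt_in["signinlogs"] = True
only_needed = deliveries(opt_in, messages)

print(sum(all_on.values()))       # total copies handled with the old default
print(sum(only_needed.values()))  # total copies handled with the new default
```

In this model the total work grows with the number of enabled data streams, which is why the gap widens as the package adds more of them.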
In addition, since the Azure Logs version 1.4.1 update, we have strongly recommended using one event hub per log group (a set of related log categories).
Here's the recommended setup using multiple event hubs, where each input receives only the intended log categories:
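As a rough illustration of that layout, the sketch below gives each data stream its own event hub, so an input only ever sees the messages published to the hub it reads from. The event hub names and the `received` helper are hypothetical and only serve the example; they are not names used by the integration.

```python
from typing import Dict, List, Tuple

# Hypothetical event hub names: one event hub per log group.
EVENT_HUB_FOR_STREAM: Dict[str, str] = {
    "signinlogs": "eh-signin",
    "auditlogs": "eh-audit",
    "activitylogs": "eh-activity",
}


def received(stream: str, published: List[Tuple[str, str]]) -> List[str]:
    """Return the messages an input sees when it reads only its own event hub."""
    hub = EVENT_HUB_FOR_STREAM[stream]
    return [message for target_hub, message in published if target_hub == hub]


# Each tuple is (event hub the message was exported to, message).
published = [
    ("eh-signin", "sign-in event"),
    ("eh-audit", "audit event"),
    ("eh-activity", "activity event"),
]

for stream in EVENT_HUB_FOR_STREAM:
    print(stream, received(stream, published))
```

Compared with the shared event hub case, no input has to discard or misparse log categories that were never meant for it.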
Checklist
changelog.yml file.