-
Notifications
You must be signed in to change notification settings - Fork 429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Azure] Add Monitor App Service Logs data stream #4818
Conversation
6e193f5
to
cb8230e
Compare
cb8230e
to
66b0442
Compare
🌐 Coverage report
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This first iteration on AppServiceAuditLogs
and AppServiceHTTPLogs
looks promising.
I added a few comments I'd love to hear you about.
storage_account_container: {{storage_account_container}} | ||
{{else}} | ||
{{#if eventhub}} | ||
storage_account_container: filebeat-firewalllogs-{{eventhub}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this firewalllogs
is a leftover, isn't it?
processors: | ||
- rename: | ||
field: azure.appservices.ResourceId | ||
target_field: azure.resource.id | ||
ignore_missing: true | ||
- rename: | ||
field: azure.appservices.Category | ||
target_field: azure.appservices.category | ||
- rename: | ||
field: azure.appservices.OperationName | ||
target_field: azure.appservices.operation_name | ||
- rename: | ||
field: azure.appservices.Properties.Protocol | ||
target_field: azure.appservices.properties.protocol | ||
- rename: | ||
field: azure.appservices.Properties.User | ||
target_field: azure.appservices.properties.user | ||
- rename: | ||
field: azure.appservices.Properties.UserAddress | ||
target_field: azure.appservices.properties.user_address | ||
- rename: | ||
field: azure.appservices.Properties.UserDisplayName | ||
target_field: azure.appservices.properties.user_display_name | ||
- convert: | ||
field: azure.appservices.properties.user_address | ||
type: ip | ||
ignore_missing: true | ||
- remove: | ||
field: | ||
- azure.appservices.Properties | ||
ignore_missing: true | ||
on_failure: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By removing the azure.appservices.Properties
field, we risk dropping sub-fields that we are unaware of or Azure may add in the future.
What if we rename Properties
into properties
first and then rename the other fields? We won't lose unexpected field this way.
- name: azure.appservices | ||
type: group | ||
fields: | ||
- name: category | ||
type: keyword | ||
description: The category of the operation. | ||
- name: operation_name | ||
type: keyword | ||
description: The operation name. | ||
- name: properties | ||
type: group | ||
fields: | ||
- name: protocol | ||
type: keyword | ||
description: Authentication protocol. | ||
- name: user | ||
type: keyword | ||
description: Username used for publishing access. | ||
- name: user_address | ||
type: ip | ||
description: Client IP address of the publishing user. | ||
- name: user_display_name | ||
type: keyword | ||
description: Email address of a user in case publishing was authorized via AAD authentication. | ||
- name: client_ip | ||
type: ip | ||
description: IP address of the client. | ||
- name: computer_name | ||
type: keyword | ||
description: The name of the server on which the log file entry was generated. | ||
- name: cookie | ||
type: keyword | ||
description: Cookie on HTTP request. | ||
- name: cs_bytes | ||
type: long | ||
description: Number of bytes received by server. | ||
- name: cs_host | ||
type: keyword | ||
description: Host name header on HTTP request. | ||
- name: cs_method | ||
type: keyword | ||
description: The request HTTP verb. | ||
- name: cs_uri_query | ||
type: keyword | ||
description: URI query on HTTP request. | ||
- name: cs_uri_stem | ||
type: keyword | ||
description: The target of the request. | ||
- name: cs_username | ||
type: keyword | ||
description: The name of the authenticated user on HTTP request. | ||
- name: referer | ||
type: keyword | ||
description: The site that the user last visited. This site provided a link to the current site. | ||
- name: result | ||
type: keyword | ||
description: Success / Failure of HTTP request. | ||
- name: s_port | ||
type: keyword | ||
description: Server port number. | ||
- name: sc_bytes | ||
type: long | ||
description: Number of bytes sent by server. | ||
- name: sc_status | ||
type: long | ||
description: HTTP status code. | ||
- name: sc_substatus | ||
type: keyword | ||
description: Substatus error code on HTTP request. | ||
- name: sc_win32status | ||
type: keyword | ||
description: Windows status code on HTTP request. | ||
- name: time_taken | ||
type: long | ||
description: Time taken by HTTP request in milliseconds. | ||
- name: user_agent | ||
type: keyword | ||
description: User agent on HTTP request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about grouping the fields in this list by log categories?
Adding comments to visually separate the shared ones from those belonging to a specific log category can be helpful.
What log categories are you working on right now? |
Let me know which log categories you are working on, and I'll try to pick at least one or two of the other ones to find the sample logs. |
Currently, I have samples for the 4 log categories marked and will update this PR to contain the yaml for the other 2. |
Hey @lucianpy, how is the log hunting going? Please let me know if you need help with the log collection, testing, or field/pipeline building! |
Hey @zmoog, no new progress as of yet. Worst case I will just push the 4 categories that I do have. Still want to try to get them this week. If not succesful, I will ask for your help on Monday. |
Hey @lucianpy, can I help with something? |
Hey @zmoog, could you take a look at It probably needs some IP rules set up and then just an access to the application by a restricted IP for example might trigger it. |
As an update here, I now have all of the 6 log categories that we need. Unfortunately, Currently looking at a fix for this issue. |
One thing that needs to be decided is what to do in case of same name fields for different log categories. A solution that I found is to modify the fields for each log category, such as Also, updated the PR with the document providing instructions on how to collect all the log categories! |
Quick update after the sync over zoom in the last few days to discuss the following two topics.
We decided to try to map the source document into a destination field as an object. For example, if the source has an IP address only, we can use a destination field like this: {
"ip": {
"address": "10.1.0.1"
}
} Instead, if it has an IP address and a port, we can use: {
"ip": {
"address": "10.1.0.1",
"port": 8080
}
} This is only an example, not the recommended names 😄 , I suggest exploring what other integration are doing and the reference names and structure in the Elastic Common Schema (ECS).
We need to reproduce this issue to share our findings with the Azure folks to understand this issue better. Suppose this turns out to be something we can't avoid. In that case, one possible approach is the update the |
@lucianpy how is it going with the newlines issue? Are you and Gabriel able to reproduce the issue again? |
@zmoog we were not able to reproduce the issue so I will add the sanitization just in case it ever happens again. |
To be noted here that there's a pretty big discrepancy between the fields I was able to fetch for Console Logs, Platform Logs and App Logs and the azure official documentation. Initially I thought there might be some optional fields that only come up in specific instances but some fields are not in the documentation at all, like Wasn't able to fetch any different variation of logs than those I have here. What do you think about this issue @zmoog? |
I found I can only sanitize the event for appservices in integrations, since this is where the issue was found. The advantage of this is not modifying the azure-eventhub code directly in beats. What do you think @zmoog? |
I had similar experiences in other log categories. The docs do not seem 100% accurate, and on average there are ~5% of the fields that are different (present, not present, or with a slightly different name). I usually link to the docs, but take the log samples as a de-factor spec. |
There is one main issue in trying to "patch" the invalid JSON string in the pipeline vs. the input. The diagnostic setting sends data to the event hub in batches, so the input receives something like this:
In this case, we have two records. When the input fails to parse the JSON, it packs the whole In a case like this, the pipeline receives one document with two records, and I am not sure if it is possible to create two documents out of it. My understaning is that the pipeline in:out ratio is 1:1. In this example, we have two records, but it is common to have ten or more records in a batch. |
I believe we need to tackle this at the input level with a PR on Beats, but it’s better to double-check; let's ask other team members and hear their advice. (We're asking offline, we'll update this conversation afterwards) |
As an update here, the consensus of the team is to takle this directly in Beats. Currently waiting on the sanitization PR there to merge before moving forward. |
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as |
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as |
Hi! This PR has been stale for a while and we're going to close it as part of our cleanup procedure. We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team. Feel free to re-open this PR if you think it should stay open and is worth rebasing. Thank you for your contribution! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
{{#if eventhub}} | ||
storage_account_container: filebeat-firewalllogs-{{eventhub}} | ||
{{/if}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove it? We need a default value, don't you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During a chat with @lucian-ioan, @gpop63, and @devamanv, we agreed that the best way forward for this App Service integration is the following:
- Move the App Services integration from Azure Logs into an independent integration package.
- Release App Services as:
- maturity level: experimental
- version: v0.0.1
- no dashboards
This approach allow us to:
- Add a metrics data stream in the App Services package so that we can have everything about App Services in a single package.
- Have the option to add a dedicated data stream for each log category instead of using a single data stream as of today.
- Collect user feedback to adjust field mapping, ingest pipelines, and dashboards.
@lalit-satapathy @rameshelastic @SubhrataK let us know if you think we should take a different route. We plan to move forward with this in the current sprint.
8a82037
to
df9ea0c
Compare
267a8ee
to
b0c3324
Compare
Package azure_app_service - 0.0.1 containing this change is available at https://epr.elastic.co/search?package=azure_app_service |
* standalone app service integration * default value for storage_account_container * add code owner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this PR do?
Adds App Service data stream with support for the following log categories:
Checklist
changelog.yml
file.Author's Checklist
Document providing instruction on how to collect each log category:
https://docs.google.com/document/d/1AJREQvAE2mc1QU19YKgZ4QJB6OHsyw8fy_WlLRYJV74
How to test this PR locally
Related issues
-Relates elastic/obs-infraobs-team#826
Screenshots