Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove index audit trail in favor of filebeat #29881

Closed
elasticmachine opened this issue Mar 19, 2018 · 12 comments
Closed

Remove index audit trail in favor of filebeat #29881

elasticmachine opened this issue Mar 19, 2018 · 12 comments
Assignees
Labels

Comments

@elasticmachine
Copy link
Collaborator

elasticmachine commented Mar 19, 2018

Original comment by @albertzaharovits:

Right now Security has the capability to index the audit events trail on the local or a remote cluster.
I want to moot the removal of this in favor of filebeat.
The standing logging audit trail will remain the only auditing option. It could possibly be improved with a syslog appender.
This way, logging audit trail will handle the durability of the audit events, and we will leave search-ability to the user to do.

Given the overlapping of functionality with filebeat, the index auditing is more of a convenience functionality. More importantly, the asynchronicity of enqueuing audit events, shipping over the network and then indexing will make the failure-state-when-auditing-unavailable feature impractical in this case. Another feature, namely integrity-verification will also be tricky to implement, although, arguably, there's no clear solution for logging too (possibly ship events to syslog or Event Viewer and let them handle the integrity feature).
Another inconvenience is that we are indexing all the event's fields with fixed mappings, and the user cannot change this. The audit trail corpus is beefy, we should allow for indexing only a subset of fields with custom mappings.
Spice all this with the problem of permission around the audit index which is hard to get right in the remote cluster case.

Index audit trail has grown into a toy functionality which a security conscious user (presumably ones that enable audit logging in the first place) will want to stay away from.

TODO: In the process we ought to make audit events JSON docs.

CC @elastic/es-security
pinging @clintongormley as this is a xstack venture

@elasticmachine
Copy link
Collaborator Author

Original comment by @jaymode:

I want to moot the removal of this in favor of filebeat.

Let's make sure we take care of our users; we will need to create a filebeat module so that there is an easier transition for users of this feature. I did not see that stated anywhere in the issue.

It could possibly be improved with a syslog appender.

Let's keep this discussion separate.

TODO: In the process we ought to make audit events JSON docs.

I do not see how/why we need to do this? The logfile needs to be human readable IMO and I do not find JSON docs to be as readable as a log message.

Index audit trail has grown into a toy functionality which a security conscious user (presumably ones that enable audit logging in the first place) will want to stay away from.

I think most users may be unaware of the limitations with the index audit implementation; this is something we see and know by looking at the implementation.

@elasticmachine
Copy link
Collaborator Author

Original comment by @tvernum:

TODO: In the process we ought to make audit events JSON docs.

I do not see how/why we need to do this? The logfile needs to be human readable IMO and I do not find JSON docs to be as readable as a log message.

We've generally taken the view (with strong support from the ingest team) that we need to move to structured logging in order to facilitate reliable log ingest.

The audit log is already somewhat structured, since it print key/values with defined separators, but it's not quite there. For example, actions with embedded [ characters (which exist) get messy within the action=[{}] output.

I would favour us having a fully structured output (even if it's just key-value pairs) and if we support that, then newline-delimetered JSON output would be quite simple to produce and would make the ingest module easier to build and maintain.

It doesn't have to be JSON, but it does need to be:

  • well defined & unambiguous
  • easy to generate
  • easy to parse
  • somewhat readable

And we're not doing XML! 😈

@elasticmachine
Copy link
Collaborator Author

Original comment by @tvernum:

Also to consider: How does emit_request_body play into this?

It is currently handled differently between index and logfile on the expectation that indices have more disk space available than logs.

But if we do this, then there is only 1 output method and we probably need to make emit_request_body be more configurable (e.g. have it take a list of event types?)

@elasticmachine
Copy link
Collaborator Author

Original comment by @jaymode:

My concerns about JSON may be unfounded but agree that structure is important and its a well known format.

It is currently handled differently between index and logfile on the expectation that indices have more disk space available than logs.

Actually, disk space was not the deciding factor for these; it was more about readability and being able to parse through the information in a log file. For the index, we have search capabilities and didn't care how many documents we output by default, since elasticsearch "you know, for search" :)

But if we do this, then there is only 1 output method and we probably need to make emit_request_body be more configurable (e.g. have it take a list of event types?)

I am on the fence about this one. Until we have the ability to associate entries with one another, it might be necessary to know the request body in order to correlate various audit entries with one another.

@elasticmachine
Copy link
Collaborator Author

Original comment by @clintongormley:

We've generally taken the view (with strong support from the ingest team) that we need to move to structured logging in order to facilitate reliable log ingest.

The audit log could be a good place to test out a new structured log format.

We should collaborate with the ES team to discuss how to go about the filebeat module. /cc @andrewkroh @tsg

@elasticmachine
Copy link
Collaborator Author

Original comment by @tvernum:

Re emit_request_body

My concern is that for some users the logfile is going to be the final output, and for some users the logfile is just a mechanism to feed events into Beats and on to their audit cluster.
Those have different needs, as evidenced by the fact that emit_request_body today does different things in "logfile" to "index".

We probably need to do something to distiguish between users who want the log file to be usable in its own right (with less noise), and those who want to capture everything in order to send it off to ES.

That might mean a compact vs full output, of more tweakable options, or 2 output formats ("human" vs "json"), but when we plan the details, we'll need to bear in mind that we will have at least 2 different sets of consumers of the audit file (humans and beats).

@albertzaharovits
Copy link
Contributor

Those have different needs, as evidenced by the fact that emit_request_body today does different things in "logfile" to "index".

@tvernum It looks to me that both audit trails use the same method to print the request content:

public static String restRequestContent(RestRequest request) {

So I don't think that there are two formats?

In addition I think we should keep request as they are received, escaped and sanitized if only because requests can be malformed. I don't think we can get this field to a state where it will be indexed and searchable.
Maybe we can add another field which prints valid, parsed queries? Maybe we can make this field searchable?

@tvernum
Copy link
Contributor

tvernum commented Jun 4, 2018

So I don't think that there are two formats?

Compare:

The file based format (intentionally) only honours emit_request_body for some event types.
The index based format respects it across all event types.

If we were to have a single, file based output format, then which of those two approaches should it follow? Presumably it would include all bodies, because a replacement for the index output ought to maintain full feature parity. But that would mean we are throwing out the reasons for not including all event bodies in the file output.

I believe we need something that controls that behaviour. It could be two different output types (that both produce files, just with different formats & levels of detail), or it could be a setting that allows the administrator to control which events have their bodies emitted, but I don't want us to go with either of the two simplistic options:

  • include all bodies, and flood the supposedly human-readble logs with a mass of content
  • include only some bodies, and reduce functionality compared to the existing index output

@jaymode
Copy link
Member

jaymode commented Jun 26, 2018

@albertzaharovits has opened a issue to discuss the structured format #31046

albertzaharovits added a commit that referenced this issue Jan 24, 2019
This PR deprecates the index audit output.
In general, the problem with the index audit output is that event indexing
can be slower than the rate with which audit events are generated,
especially during the daily rollovers or the rolling cluster upgrades.
In this situation audit events will be lost which is a terrible failure situation
for an audit system.
Besides of the settings under the `xpack.security.audit.index` namespace, the `xpack.security.audit.outputs` setting has also been deprecated and will be
removed in 7. Although explicitly configuring the logfile output does not touch
any deprecation bits, this setting is made redundant in 7 so this PR deprecates
it as well.

Relates #29881
albertzaharovits added a commit that referenced this issue Jan 24, 2019
This commit removes the Index Audit Output type, following its deprecation
in 6.7 by 8765a31. It also adds the migration notice (settings notice).

In general, the problem with the index audit output is that event indexing
can be slower than the rate with which audit events are generated,
especially during the daily rollovers or the rolling cluster upgrades.
In this situation audit events will be lost which is a terrible failure situation
for an audit system.
Besides of the settings under the `xpack.security.audit.index` namespace, the
`xpack.security.audit.outputs` setting has also been deprecated and will be
removed in 7. Although explicitly configuring the logfile output does not touch
any deprecation bits, this setting is made redundant in 7 so this PR deprecates
it as well.

Relates #29881
@albertzaharovits
Copy link
Contributor

The ES work has been accomplished across multiple PR, most recently #37671 and #37707 .
Beats work is still on progress elastic/beats#8852 as of this writing.

@willemdh
Copy link

Can we still index the .security_audit_log to a separate monitoring cluster (with license)? In what index is the audit data written? Not Filebeat I hope?

@tvernum
Copy link
Contributor

tvernum commented Apr 1, 2019

@willemdh At Elastic we prefer to keep GitHub issues focused on the issues relating to developing a feature, and leave questions about how to use it to our forums.

If you would like more information about how to ingest audit events into a remote cluster, please open a new topic on the forum. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants