Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EDR Workflows] Adjust Sentinel One Cloud Funnel mappings to support Analyzer - entity_id #8608

Merged
merged 14 commits into from
Dec 8, 2023

Conversation

tomsonpl
Copy link
Contributor

@tomsonpl tomsonpl commented Nov 29, 2023

Proposed commit message:

[sentinel_one_cloud_funnel] `process.*` changes for Analyzer (#8608)

In order to support Analyzer:
- Populate the `process.entity_id` field.
- For process creation events, populate `process.*` fields with
  target/new process information rather than source/parent information.

Earlier title: [EDR Workflows] Adjust Sentinel One Cloud Funnel mappings to support Analyzer - entity_id
Earlier description:

This PR adds mapping to process.entity_id and process.parent.entity_id in default.yml.

Moreover it overwrites process.parent with src.process data, and process with tgt.process for process-pipeline.yml

There was a suggestion that we might actually also map event.type more dynamically, because now it's just set to [info]. But this can be done in the future :)

Kibana changes:
elastic/kibana#170829
Closes:
https://github.com/elastic/security-team/issues/7999

@tomsonpl tomsonpl added the Integration:sentinel_one_cloud_funnel SentinelOne Cloud Funnel label Nov 29, 2023
@tomsonpl tomsonpl self-assigned this Nov 29, 2023
@elasticmachine
Copy link

elasticmachine commented Nov 29, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-12-07T18:04:22.853+0000

  • Duration: 21 min 36 sec

Test stats 🧪

Test Results
Failed 0
Passed 24
Skipped 0
Total 24

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented Nov 29, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (1/1) 💚
Files 100.0% (14/14) 💚
Classes 100.0% (14/14) 💚
Methods 95.312% (61/64)
Lines 90.055% (4084/4535)
Conditionals 100.0% (0/0) 💚

@elasticmachine
Copy link

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@tomsonpl
Copy link
Contributor Author

tomsonpl commented Dec 1, 2023

@elastic/security-external-integrations Hey, I think it's everything that I needed to change in order to get analyzer to work with s1 data. Could somebody please take a look at this? Thanks! 🙇

@chrisberkhout chrisberkhout self-requested a review December 1, 2023 17:04
@chrisberkhout
Copy link
Contributor

Hey @tomsonpl,

I'm taking a look at this. Let me write out what I've got so far.

Here's a summary the new field handling in the pipelines:

default.yml:1542:     sentinel_one_cloud_funnel.event.src.process.parent.uid --copy--> process.parent.entity_id
default.yml:1871:     sentinel_one_cloud_funnel.event.src.process.uid        --copy--> process.entity_id
default.yml:2534:     json.tgt.process.uid                                   -rename-> sentinel_one_cloud_funnel.event.tgt.process.uid

default.yml:2639:     pipeline-process.yml (runs if sentinel_one_cloud_funnel.event.category contains "process" and not "cross")

pipeline-process:284: sentinel_one_cloud_funnel.event.tgt.process.name       --copy--> process.name
pipeline-process:342: sentinel_one_cloud_funnel.event.tgt.process.uid        --copy--> process.entity_id
pipeline-process:346: sentinel_one_cloud_funnel.event.src.process.uid        --copy--> process.parent.entity_id

Background information about incoming process information and ECS field population

This data stream includes many types of events, and all seem to have information about a source process that comes in as src.process and src.process.parent, which is used to populate the ECS fields process.* and process.parent.*.

There are some event categories, including process (and cross_process) that also have information about a target process, arriving in tgt.process. For example, with "event.category": "process", "event.type": "ProcessCreation", the src.process fields describe the parent process that is creating a new process, and the tgt fields describe the child process being created.

Background information about enabling the Analyzer

Reading https://github.com/elastic/security-team/issues/7999#issuecomment-1814748178, it sounds like the Analyzer expects a document for each node in the tree, with process.entity_id, process.parent.entity_id and process.name populated.

There's some discussion about missing root or leaf nodes of the process tree because the source and target fields appear in a single document, and there's a choice to make about which to use. If there's a ProcessCreation event for every process of interest, there will be a document with target fields for that process. We may miss out on PID 1 or other processes that we don't observe the creation of, but using the target fields seems like the better choice.

These changes

With the changes in this PR, the default.yml pipeline will, for all event types, set process.entity_id and process.parent.entity_id using source process information. It also does a rename to make target process information available in sentinel_one_cloud_funnel.event.tgt.process.uid.

Then in the pipeline-process.yml pipeline, the process.name, process.entity_id and process.parent.entity_id fields have their values (with source process information) overwritten with the target process information.

My only concern here is that the process.name, process.entity_id and process.parent.entity_id values from the target process and being mixed with other process.* fields that have source process values. Perhaps the process pipeline should remove those? That way the process.* fields would always refer to one and the same process: either the process associated with an event like network activity or file access, or for process events the target process itself.

@tomsonpl
Copy link
Contributor Author

tomsonpl commented Dec 4, 2023

Hey @chrisberkhout, thanks for the in-depth review! You're totally right here, I feel terrible that I missed all these... :P

So I am thinking now, that we can take it slow and change the rest of the fields for process event.category or even the process that is also of processCreation event.type. However that sounds like that it might require overwriting all the process.* and process.parent.* fields.

What in your opinion would be the best approach?

  1. Remove setting the values from default.yml and set them separately in ALL the others yml files?
  2. Set the default as it is right now, and just overwrite the process.yml?
  3. Same as 2, however do this only for ProcessCreation ?

Thanks!

@chrisberkhout
Copy link
Contributor

What in your opinion would be the best approach?
1. Remove setting the values from default.yml and set them separately in ALL the others yml files?
2. Set the default as it is right now, and just overwrite the process.yml?
3. Same as 2, however do this only for ProcessCreation ?

Approach 1 seems like a lot of duplication.

Doing it just for the ProcessCreation event type like in approach 3 is a good idea. Maybe it'll apply to others, but this seems like the critical one for now, and I think it's okay to start with one of them.

It should be clear to the reader that
"process.* is populated using sentinel_one_cloud_funnel.event..XXX.process.*"
I think doing it with a condition in the default pipeline, or doing it with a default there and an override elsewhere are both good approaches.

The tgt.process.* data has less information than src.process.*, but I'm not worried about populating less process.* fields because all of it is still available in the src fields. Also, this integration is still considered "Beta".

@elasticmachine
Copy link

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@tomsonpl tomsonpl marked this pull request as draft December 4, 2023 16:18
@tomsonpl
Copy link
Contributor Author

tomsonpl commented Dec 4, 2023

@chrisberkhout I am changing this to draft while I'll be working on the adjustments so you don't get too many notifications. I am not getting the new -set value, like if the if didn't meet criteria. I'll do some more debugging here, and ask you again for the review, ok ? :)
However, if you feel like it, feel free to check already if I am going a good path ;) Thanks!

field: process.thread.id
copy_from: sentinel_one_cloud_funnel.event.tgt.process.tid
ignore_empty_value: true
if: ctx.sentinel_one_cloud_funnel?.event?.type == 'Process Creation'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below, I think the condition would be

if: ctx.sentinel_one_cloud_funnel.event.type == 'ProcessCreation'
  1. Null safe operators (?.) probably aren't needed, given they aren't used in the surrounding code.
  2. The ProcessCreation value (no space) can be seen in data_stream/event/_dev/test/pipeline/test-process.log

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one was strange, because the data in the original event was with a space between Process Creation. And when I adjusted the code to check for space - it worked.

However, I decided to overwrite all process fields like this, not only ProcessCreation, because it couldn't create an event for the parent sometimes. I am not sure what the issue was, but I had an event for tgt.process.uid, but no event for src.process.uid (parent). Now with the changes that all process are overwritten - it works as expected.
Do you have any concerns about it?
Thanks @chrisberkhout :elasticheart:

@tomsonpl tomsonpl marked this pull request as ready for review December 5, 2023 10:05
Comment on lines -69 to -83
- convert:
field: json.src.process.tid
tag: 'convert_json_src_process_tid'
target_field: sentinel_one_cloud_funnel.event.src.process.tid
type: long
ignore_missing: true
if: ctx.json?.src?.process?.tid != ''
on_failure:
- append:
field: error.message
value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
- set:
field: process.thread.id
copy_from: sentinel_one_cloud_funnel.event.src.process.tid
ignore_empty_value: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved these things for the src process to default.yml

Comment on lines 84 to 87
- set:
field: process.thread.id
copy_from: sentinel_one_cloud_funnel.event.tgt.process.tid
ignore_empty_value: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved these things for the tgt process down to after the process -> process.parent rename.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I mixed that up and removed it.

I'll add it back in an additional commit. It doesn't change the test output, since the field isn't in the input.

Comment on lines -92 to -96
- append:
field: process.user.name
value: '{{{sentinel_one_cloud_funnel.event.tgt.process.user.name}}}'
allow_duplicates: false
if: ctx.sentinel_one_cloud_funnel?.event?.tgt?.process?.user?.name != null
Copy link
Contributor

@chrisberkhout chrisberkhout Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this down to after the process -> process.parent rename

Comment on lines 288 to 291
- set:
field: process.name
copy_from: sentinel_one_cloud_funnel.event.tgt.process.name
ignore_empty_value: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this down to after the process -> process.parent rename

Comment on lines 292 to 302
- convert:
field: json.tgt.process.pid
tag: 'convert_json_tgt_process_pid'
target_field: sentinel_one_cloud_funnel.event.tgt.process.pid
ignore_missing: true
type: long
if: ctx.json?.tgt?.process?.pid != ''
on_failure:
- append:
field: error.message
value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this into the default.yml file. It's got type: string there, because the sentinel_one_cloud_funnel.event.tgt.process.pid field has type keyword.

Comment on lines 321 to 374
- append:
field: process.user.name
value: '{{{sentinel_one_cloud_funnel.event.tgt.process.user.name}}}'
allow_duplicates: false
if: ctx.sentinel_one_cloud_funnel?.event?.tgt?.process?.user?.name != null
- set:
field: process.pid
copy_from: sentinel_one_cloud_funnel.event.tgt.process.pid
field: process.name
copy_from: sentinel_one_cloud_funnel.event.tgt.process.name
ignore_empty_value: true
- convert:
field: json.tgt.process.pid
tag: 'convert_json_tgt_process_pid'
target_field: sentinel_one_cloud_funnel.event.tgt.process.pid
field: sentinel_one_cloud_funnel.event.tgt.process.pid
target_field: process.pid
type: long
ignore_missing: true
type: string
if: ctx.json?.tgt?.process?.pid != ''
on_failure:
- append:
field: error.message
value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this part the diff is tricky to read, but it should just be doing the things described above.

@tomsonpl
Copy link
Contributor Author

tomsonpl commented Dec 6, 2023

Thanks @chrisberkhout for the changes, I tested locally and everything works as expected. Big thank you for your help and time :)

@tomsonpl tomsonpl merged commit 4df37b3 into main Dec 8, 2023
4 checks passed
@elasticmachine
Copy link

Package sentinel_one_cloud_funnel - 0.10.0 containing this change is available at https://epr.elastic.co/search?package=sentinel_one_cloud_funnel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants