Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[santa] update ingest pipeline to match new log format #3347

Merged
merged 6 commits into from May 18, 2022

Conversation

DamiaPoquet
Copy link
Contributor

@DamiaPoquet DamiaPoquet commented May 12, 2022

What does this PR do?

The log output for Santa changed on versions 0.9.19, v0.9.20, and 0.9.21, making this integration fundamentally broken since then. This PR addresses the new log format parsing.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

  • Check log parsing
  • ECS 8.2 doesn't contains a field for process pidversion

How to test this PR locally

Related issues

I didn't find any related issue, although it seems to be broken since 2017.

@elasticmachine
Copy link

elasticmachine commented May 12, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-05-17T08:33:25.434+0000

  • Duration: 15 min 32 sec

Test stats 🧪

Test Results
Failed 0
Passed 13
Skipped 0
Total 13

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@DamiaPoquet DamiaPoquet changed the title Update Santa integration to work with current new format [santa] update ingest pipeline to match new log format May 12, 2022
@DamiaPoquet DamiaPoquet marked this pull request as ready for review May 12, 2022 15:40
@DamiaPoquet DamiaPoquet requested a review from a team as a code owner May 12, 2022 15:40
@elasticmachine
Copy link

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

Copy link
Contributor

@efd6 efd6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave the original tests in place while adding the new ones.

@efd6
Copy link
Contributor

efd6 commented May 16, 2022

/test

@DamiaPoquet
Copy link
Contributor Author

Can we leave the original tests in place while adding the new ones.

Is there any use case for preserving those old log samples? If I leave the original ones, the tests will fail.

@DamiaPoquet DamiaPoquet requested a review from efd6 May 16, 2022 10:08
@efd6
Copy link
Contributor

efd6 commented May 16, 2022

Do we know that these are no users still on the old version? If we are going to drop support for the older log format then this should really be a breaking change and marked as such.

The preferred option though would be to retain support for the old format while adding the new.

@DamiaPoquet
Copy link
Contributor Author

In my opinion, it doesn't make sense to support a pre-release output format for Santa. All the GA versions are outputting logs in the new format.

I'll mark this release as a breaking change.

@andrewkroh
Copy link
Member

/test

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

packages/santa/changelog.yml Outdated Show resolved Hide resolved
packages/santa/changelog.yml Outdated Show resolved Hide resolved
@elasticmachine
Copy link

elasticmachine commented May 16, 2022

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (1/1) 💚
Files 100.0% (1/1) 💚 3.589
Classes 100.0% (1/1) 💚 3.589
Methods 100.0% (12/12) 💚 11.714
Lines 91.0% (91/100) 👍 1.968
Conditionals 100.0% (0/0) 💚

Comment on lines 122 to 123
"pid": 72012,
"pidversion": 1097765,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

process.pidversion is not an allowed ECS field, so it should be placed in santa, but we could construct a process.entity_id with it and process.pid, something like "{{{process.pid}}}-{{[process.pidversion}}}". @andrewkroh WDYT? We don't have any examples of it being used anywhere else AFAICS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea. I am looking for some documentation on what pidversion in the Santa logs means just to confirm that it makes sense to use it as part of process.entity_id.

Copy link
Contributor

@efd6 efd6 May 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I could find it's just another pid, but from the kernel to prevent pid collision on pid rollover, e.g. https://www.exploit-db.com/exploits/46648. The Apple documentation is also not wildly helpful https://developer.apple.com/documentation/kernel/proc_persona_info/1560796-pidversion, the source helps https://github.com/apple/darwin-xnu/blob/2ff845c2e033bd0ff64b5b6aa6063a1f8f65aa32/bsd/kern/kern_exec.c#L6215-L6216.

This in the context of this gives the best answer I could find.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the field name to santa.pidversion while the ECS is not ready yet to support it.

@efd6
Copy link
Contributor

efd6 commented May 17, 2022

/test

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We could add a process.entity_id independently.

I was curious how Elastic Endpoint generates its process.entity_id to see if we could replicate that here. The code is

    (void)StringLib::FormatDuplicate(str, "%s-%llu-%llu.%llu", Cache::GetEndpointID().c_str(),
        (unsigned long long)pid, (unsigned long long)ts.tv_sec, (unsigned long long)ts.tv_nsec);
    (void)Titan_Encoding_toBase64(str.c_str(), str.size(), (void **)&(pBase64), &(base64Size));

But given that the start time we have is coming from the logger timestamp and lacks nanosecond precision, I don't think we can. So the only thing I would say is that if we do add an entity_id we should consider incorporating the agent ID.

@efd6
Copy link
Contributor

efd6 commented May 17, 2022

But given that the start time we have is coming from the logger timestamp and lacks nanosecond precision, I don't think we can. So the only thing I would say is that if we do add an entity_id we should consider incorporating the agent ID.

SGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Integration:Santa
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants