Add process minimum age to filter out short lived processes if needed #194

Open
wants to merge 4 commits into main

Conversation

corneliu-calciu

Added a minimum process age requirement for a process to be considered for discovery.

The new process detection logic ignores a new process until its age reaches a minimum value; for example, we may set a 3-second threshold as part of the monitoring configuration.
A process is considered suitable for monitoring once it passes that configurable limit.
This filters out the noise of short-lived processes.

Configuration example:
discovery:
  exclude_services:
    - k8s_namespace: (istio-system|kube-node-lease)
  services:
    - k8s_namespace: (acmefitness)
      min_process_age: 5s
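
For illustration, here is a minimal sketch (not the code from this PR) of how such an age gate could work in Go; procCandidate, oldEnough, and the 5-second value mirror the configuration above, and all names are hypothetical:

package main

import (
	"fmt"
	"time"
)

// procCandidate is a hypothetical stand-in for a process reported by the watcher.
type procCandidate struct {
	pid       int32
	startTime time.Time
}

// oldEnough reports whether the process has been alive for at least the
// configured minimum age (e.g. min_process_age: 5s) and is therefore
// eligible for instrumentation.
func oldEnough(p procCandidate, minProcessAge time.Duration) bool {
	return time.Since(p.startTime) >= minProcessAge
}

func main() {
	minAge := 5 * time.Second
	p := procCandidate{pid: 1234, startTime: time.Now().Add(-2 * time.Second)}
	if oldEnough(p, minAge) {
		fmt.Println("instrument process", p.pid)
	} else {
		fmt.Println("process too young, will re-check later:", p.pid)
	}
}

In this sketch, a process that fails the check is only reported as too young; it is not discarded permanently and could be re-evaluated on a later scan.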

Moved the PR from grafana/beyla#2066

@mariomac
Contributor

Thanks for your contribution @corneliu-calciu !

I have one question: what happens if a long-lived process starts while Beyla is running? Beyla will get notified of that new process but detect it as too young, so will it discard it?

@corneliu-calciu
Author

Hi Mario

This is exactly the scenario I want to address. The Beyla pods are already running, and a new process is created on a node.
Now we can specify the minimum life span a process needs before we accept and instrument it.
A large number of processes created and destroyed within a 1-2 second interval is just noise, especially in a Kubernetes environment where we expect long-lived micro-services.
This capability can also mitigate another problem; please see bug grafana/beyla#2046.
When a new replica is created for a Kubernetes deployment, the process filtering logic sometimes fails to recognize the inner processes of the container, because the container is still in its initialization phase, where some new processes are being created and some initialization processes will exit soon.

Best Regards
Corneliu

@grcevski
Contributor

left a comment:


This is looking good already; I left a few comments that I think we should address. Thanks for working on this!!!

corneliu-calciu requested a review from grcevski on June 20, 2025 13:14

codecov bot commented Jun 20, 2025

Codecov Report

Attention: Patch coverage is 77.77778% with 6 lines in your changes missing coverage. Please review.

Project coverage is 63.37%. Comparing base (445d756) to head (8f6f86b).
Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
pkg/components/discover/watcher_proc.go 77.77% 3 Missing and 3 partials ⚠️

❗ There is a different number of reports uploaded between BASE (445d756) and HEAD (8f6f86b).

HEAD has 1 upload less than BASE
Flag BASE (445d756) HEAD (8f6f86b)
integration-test 1 0
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #194      +/-   ##
==========================================
- Coverage   70.45%   63.37%   -7.08%     
==========================================
  Files         178      177       -1     
  Lines       19172    19172              
==========================================
- Hits        13507    12151    -1356     
- Misses       4909     6338    +1429     
+ Partials      756      683      -73     
Flag Coverage Δ
integration-test ?
integration-test-arm 35.89% <66.66%> (-0.06%) ⬇️
oats-test 36.43% <55.55%> (+0.23%) ⬆️
unittests 46.15% <77.77%> (+0.11%) ⬆️

Flags with carried forward coverage won't be shown.


@grcevski
Contributor

I looked at the test failures; I think we need to extend our default timeout interval, since it now takes a bit longer for the services to be instrumented.

These functions:

func waitForTestComponentsSub(t *testing.T, url, subpath string) {
	waitForTestComponentsSubWithTime(t, url, subpath, 1)
}

func waitForTestComponentsSubStatus(t *testing.T, url, subpath string, status int) {
	waitForTestComponentsSubWithTimeAndCode(t, url, subpath, status, 1)
}

in test_utils.go (integration tests). Right now we wait one minute; I think we should make that a constant and change it to 2 minutes.
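
As a sketch of that change (assuming the existing ...WithTime helpers take the timeout in minutes as their last argument; the constant name below is just an example):

// testWaitTimeoutMinutes is an illustrative name for the proposed constant.
const testWaitTimeoutMinutes = 2

func waitForTestComponentsSub(t *testing.T, url, subpath string) {
	waitForTestComponentsSubWithTime(t, url, subpath, testWaitTimeoutMinutes)
}

func waitForTestComponentsSubStatus(t *testing.T, url, subpath string, status int) {
	waitForTestComponentsSubWithTimeAndCode(t, url, subpath, status, testWaitTimeoutMinutes)
}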

@grcevski
Contributor

Hm, it looks like the problem is somewhere else. I'm going to take a look myself, but I can't before Tuesday next week.
