Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove logger outside of WriteJSONKeys #35920

Merged
merged 17 commits into from
Jul 12, 2023
Merged

Conversation

alexsapran
Copy link
Contributor

@alexsapran alexsapran commented Jun 27, 2023

What does this PR do?

This change is removing the creation of the logger object inside of the WriteJSONKeys method.

That method is executed for each event causing the logger to initialize objects for each event.

Additionally adding a benchmark that show the impact of the change, for a specific scenario.

goos: darwin
goarch: amd64
pkg: github.com/elastic/beats/v7/libbeat/common/jsontransform
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
                 │  base.out   │             logger.out             │
                 │   sec/op    │   sec/op     vs base               │
WriteJSONKeys-16   3.202µ ± 9%   2.589µ ± 3%  -19.13% (p=0.000 n=8)

                 │   base.out   │             logger.out              │
                 │     B/op     │     B/op      vs base               │
WriteJSONKeys-16   2.820Ki ± 0%   2.344Ki ± 0%  -16.90% (p=0.000 n=8)

                 │  base.out  │            logger.out             │
                 │ allocs/op  │ allocs/op   vs base               │
WriteJSONKeys-16   23.00 ± 0%   16.00 ± 0%  -30.43% (p=0.000 n=8)

Signed-off-by: Alexandros, Sapranidis alexandros@elastic.co

Why is it important?

Under certain conditions for Filebeat, but not only, but the logger was also consuming overall a lot of memory. That memory then had to be reclaimed by the GC, so reducing the logger memory, actually reduces the gc work.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
    - [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  1. Build and trigger filebeat to load from a file and at the end capture the heap pprof.
  2. Observe the heap memory allocation for the logp.NewLogger

This commit adds a benchmarks for the method WriteJSONKeys

Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
This commit moves the initialization of the logger outside of the method
WriteJSONKeys and saves memory allocations.

Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
@alexsapran alexsapran requested a review from a team as a code owner June 27, 2023 12:28
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 27, 2023
@botelastic
Copy link

botelastic bot commented Jun 27, 2023

This pull request doesn't have a Team:<team> label.

@mergify
Copy link
Contributor

mergify bot commented Jun 27, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @alexsapran? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@alexsapran alexsapran added the backport-skip Skip notification from the automated backport with mergify label Jun 27, 2023
@amitkanfer
Copy link
Collaborator

🚀

@elasticmachine
Copy link
Collaborator

elasticmachine commented Jun 27, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-07-12T09:52:08.596+0000

  • Duration: 71 min 44 sec

Test stats 🧪

Test Results
Failed 0
Passed 27511
Skipped 2029
Total 29540

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@@ -35,11 +35,11 @@ var (
// ErrInvalidTimestamp is returned when parsing of a @timestamp field fails.
// Supported formats: ISO8601, RFC3339
ErrInvalidTimestamp = errors.New("failed to parse @timestamp, unknown format")
logger = logp.NewLogger("jsonhelper")
Copy link
Member

@andrewkroh andrewkroh Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this logger still work? This becomes an order of initialization problem. The logp package is not yet initialized at this point so you get a no-op logger returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When tested in the unit/benchmark tests, it was properly initialized.
I can move this into a func init() on this package or see if we propagate it as part of the function parameters from the caller.

Copy link
Member

@andrewkroh andrewkroh Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was properly initialized.

It will be initialized, but I would argue not properly. Moving it to an init() will not fix the issue since that still happens before the logp package receives its configuration through logp.Configure(). The godoc for NewLogger calls this out:

// NewLogger returns a new Logger labeled with the name of the selector. This
// should never be used from any global contexts, otherwise you will receive a
// no-op Logger. This is because the logp package needs to be initialized first.

If you apply this diff and run the test again, nothing gets logged.

go test -v -run TestWriteJSONKeys

diff --git a/libbeat/common/jsontransform/jsonhelper.go b/libbeat/common/jsontransform/jsonhelper.go
index 935eac90a7..c1e9a5d05b 100644
--- a/libbeat/common/jsontransform/jsonhelper.go
+++ b/libbeat/common/jsontransform/jsonhelper.go
@@ -40,6 +40,7 @@ var (
 
 // WriteJSONKeys writes the json keys to the given event based on the overwriteKeys option and the addErrKey
 func WriteJSONKeys(event *beat.Event, keys map[string]interface{}, expandKeys, overwriteKeys, addErrKey bool) {
+	logger.Error("NEVER SHOWN")
 	if expandKeys {
 		if err := expandFields(keys); err != nil {
 			logger.Errorf("JSON: failed to expand fields: %s", err)
diff --git a/libbeat/common/jsontransform/jsonhelper_test.go b/libbeat/common/jsontransform/jsonhelper_test.go
index d45ca5c3a3..9f43c1dbb3 100644
--- a/libbeat/common/jsontransform/jsonhelper_test.go
+++ b/libbeat/common/jsontransform/jsonhelper_test.go
@@ -24,10 +24,13 @@ import (
 	"github.com/stretchr/testify/require"
 
 	"github.com/elastic/beats/v7/libbeat/beat"
+	"github.com/elastic/elastic-agent-libs/logp"
 	"github.com/elastic/elastic-agent-libs/mapstr"
 )
 
 func TestWriteJSONKeys(t *testing.T) {
+	logp.TestingSetup()
+
 	now := time.Now()
 	now = now.Round(time.Second)

Solutions

  1. One solution would be to defer initialization of the logger until the time of first use. Like
var (
	pkgLogger         *logp.Logger
	pkgLoggerInitOnce sync.Once
)

func logger() *logp.Logger {
	pkgLoggerInitOnce.Do(func() {
		pkgLogger = logp.NewLogger("jsontransform")
	})
	return pkgLogger
}

func WriteJSONKeys(...) {
    logger().Errorf("JSON: failed to expand fields: %s", err)
}
  1. Remove logging from this function. I kind of find it annoying that it does any logging. There is already error handling present and that is via the addErrKey feature. As a user, if I care about errors then I should be using addErrKey==true.

    This highlights another inefficiency. When addErrKey==false the code is always executing createJSONError(err.Error()) which then goes completely unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would opt for the solution 2. It makes more sense to not require any logging inside that function and if we need to, we can log in the caller.

The createJSONError(err.Error()) aspect, I noticed as well, but I was not 100% sure the reasoning behind that so I was kind of reluctant to touch in this changeset.

The error is present in the output of the event. The logger here is kind
of redundant.

Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
@alexsapran alexsapran changed the title Move logger outside of WriteJSONKeys Remove logger outside of WriteJSONKeys Jul 10, 2023
andrewkroh
andrewkroh previously approved these changes Jul 10, 2023
Copy link
Collaborator

@amitkanfer amitkanfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also fix the redundant call to createJSONError if addErrKey is false
we can change SetErrorWithOption to receive a string and call createJSONError only if addErrKey is true. Behavior will stay the same.

This change tryies to simplify the writing of error in an Event.

Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
libbeat/common/jsontransform/jsonhelper.go Outdated Show resolved Hide resolved
libbeat/common/jsontransform/jsonhelper.go Outdated Show resolved Hide resolved
Updated the function SetErrorWithOption in order to accomodate the
addition of the data and field which is something that it was used in
the processors.json_fields

Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
@andrewkroh andrewkroh dismissed their stale review July 11, 2023 13:34

There are more new changes to review.

Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
Signed-off-by: Alexandros, Sapranidis <alexandros@elastic.co>
e.Fields["error"] = jsonErr
errorField := mapstr.M{"message": message, "type": "json"}
if data != "" {
_, _ = errorField.Put("data", data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_, _ = errorField.Put("data", data)
errorField["data"] = data

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But upon closer inspection, I think we should not add these two fields into the error object because they are not part of ECS error. Perhaps find a way to incorporate them into the error.message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried to keep compatibility with the current working code from

https://github.com/elastic/beats/pull/35920/files#diff-cbf4d8fa4d091014e5fe2a57fac3fa48f8361400b3cb55c2cff307b0a544f6abL125-R125

If we need to change this, we might want to do it as a followup work outside of the work on this PR, WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I missed that they were already present.

libbeat/common/jsontransform/jsonhelper_test.go Outdated Show resolved Hide resolved
libbeat/beat/event.go Outdated Show resolved Hide resolved
libbeat/beat/event.go Outdated Show resolved Hide resolved
libbeat/beat/event.go Outdated Show resolved Hide resolved
libbeat/common/jsontransform/jsonhelper_test.go Outdated Show resolved Hide resolved
jlind23 and others added 4 commits July 12, 2023 07:03
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
@jlind23
Copy link
Collaborator

jlind23 commented Jul 12, 2023

@alexsapran could you please move this over the line? I would happy to make it land as soon as possible.

@alexsapran alexsapran merged commit 536019b into elastic:main Jul 12, 2023
85 checks passed
Scholar-Li pushed a commit to Scholar-Li/beats that referenced this pull request Feb 5, 2024
This commit removes the logger from the WriteJSONKeys which is executed as part of the hot path of the Event and was allocating a new logger for each event.
This has an impact on the memory allocation of the function and because it's executed for each event, we see a direct benefit in the overall memory allocations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Skip notification from the automated backport with mergify enhancement libbeat needs_team Indicates that the issue/PR needs a Team:* label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants