Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create copies of net.IP and net.HardwareAddr when decoding IE values #330

Conversation

antoninbas
Copy link
Member

When using the provided slice (provided by the collector) directly, the garbage collector will not be able to release the underlying array while references to the net objects exist. The underlying array is allocated when reading the message from the TCP connection and can be 100s of bytes in size. The collector will typically run aggregation and keep references to the data record and its IEs for some time. During that time, we cannot release the packet buffer, while new packets keep coming in. To avoid the issue, we create a new slice, with a new underlying array, and the packet buffer can be releases as soon as all IEs are instantiated.

When using the provided slice (provided by the collector) directly, the
garbage collector will not be able to release the underlying array while
references to the net objects exist. The underlying array is allocated
when reading the message from the TCP connection and can be 100s of
bytes in size. The collector will typically run aggregation and keep
references to the data record and its IEs for some time. During that
time, we cannot release the packet buffer, while new packets keep coming
in. To avoid the issue, we create a new slice, with a new underlying
array, and the packet buffer can be releases as soon as all IEs are
instantiated.

Signed-off-by: Antonin Bas <abas@vmware.com>
@antoninbas antoninbas force-pushed the copy-net-addresses-in-DecodeAndCreateInfoElementWithValue branch from c6e2cf9 to 77ae683 Compare October 4, 2023 20:41
@codecov
Copy link

codecov bot commented Oct 4, 2023

Codecov Report

Merging #330 (7d8029e) into main (a9e36d6) will decrease coverage by 0.14%.
The diff coverage is 40.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #330      +/-   ##
==========================================
- Coverage   73.53%   73.39%   -0.14%     
==========================================
  Files          19       19              
  Lines        2800     2808       +8     
==========================================
+ Hits         2059     2061       +2     
- Misses        574      578       +4     
- Partials      167      169       +2     
Flag Coverage Δ
integration-tests 51.53% <ø> (ø)
unit-tests 72.47% <40.00%> (-0.14%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
pkg/entities/ie.go 51.85% <40.00%> (-0.55%) ⬇️

@antoninbas
Copy link
Member Author

Before the change:

=== RUN   TestTCPCollectingProcess_ReceiveDataRecordsMemoryUsage
I1004 13:18:40.576899   58679 tcp.go:39] Start TCP collecting process on 127.0.0.1:56807
    process_test.go:168: Data packet length: 33
I1004 13:18:40.713559   58679 process.go:121] stopping the collecting process
E1004 13:18:40.713533   58679 tcp.go:79] "Error when retrieving message length" err="read tcp 127.0.0.1:56807->127.0.0.1:56809: use of closed network connection"
    process_test.go:195: Live objects: 4432
    process_test.go:196: Bytes of allocated heap objects: 514984

After the change:

=== RUN   TestTCPCollectingProcess_ReceiveDataRecordsMemoryUsage
I1004 13:40:25.811470   62567 tcp.go:39] Start TCP collecting process on 127.0.0.1:56963
    process_test.go:169: Data packet length: 33
I1004 13:40:25.941870   62567 process.go:121] stopping the collecting process
    process_test.go:197: Live objects: 4427
    process_test.go:198: Bytes of allocated heap objects: 481208

The difference is roughly 33KB (1,000 packets x 33B per packet).
In a real deployment with 100s connections per second and data records that can be as large as 1000B (think Pods with labels), the difference can be quite significant.


const numRecords = 1000

ies := make([]entities.InfoElementWithValue, 0, numRecords)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apology if I didn't fully understand the test. Is there a need to collect and validate the sourceIPv4Address ie values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll write a comment. If these are not collected / used after calling runtime.ReadMemStats, everything will be GC'd.

Signed-off-by: Antonin Bas <abas@vmware.com>
Copy link
Contributor

@heanlan heanlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas merged commit c5e0992 into vmware:main Oct 4, 2023
7 of 9 checks passed
heanlan pushed a commit to heanlan/go-ipfix that referenced this pull request Dec 7, 2023
…mware#330)

When using the provided slice (provided by the collector) directly, the
garbage collector will not be able to release the underlying array while
references to the net objects exist. The underlying array is allocated
when reading the message from the TCP connection and can be 100s of
bytes in size. The collector will typically run aggregation and keep
references to the data record and its IEs for some time. During that
time, we cannot release the packet buffer, while new packets keep coming
in. To avoid the issue, we create a new slice, with a new underlying
array, and the packet buffer can be releases as soon as all IEs are
instantiated.

Signed-off-by: Antonin Bas <abas@vmware.com>
heanlan pushed a commit that referenced this pull request Dec 7, 2023
…330)

When using the provided slice (provided by the collector) directly, the
garbage collector will not be able to release the underlying array while
references to the net objects exist. The underlying array is allocated
when reading the message from the TCP connection and can be 100s of
bytes in size. The collector will typically run aggregation and keep
references to the data record and its IEs for some time. During that
time, we cannot release the packet buffer, while new packets keep coming
in. To avoid the issue, we create a new slice, with a new underlying
array, and the packet buffer can be releases as soon as all IEs are
instantiated.

Signed-off-by: Antonin Bas <abas@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants