Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Only keep HTTP headers we want to keep #254

Merged
merged 2 commits into from Oct 2, 2023

Conversation

MikeGoldsmith
Copy link
Contributor

Which problem is this PR solving?

When processing HTTP request/response pairs, the header structs contain a lot of information we do not care about. From profiling data, we can see a large portion of the agent's live bytes are coming from textproto.readMimeHeader(), where headers are created. These bytes are held until the request/response pair is processed.

This PR adds extracts a copy of the header entries we care about and replaces the Header struct on the request/response, meaning the previous larger Header struct can be GC'd earlier and not wait for pairing to complete.

Short description of the changes

  • Add new extractHeader function that creates a new Header and copies header key/values we want to keep
    • Initially we only keep "User-Agent" - more can be added in the future
  • Update HTTP request and response parse funcs to use new extractHeader utility
  • Add test to verify behaviour of extractHeader

How to verify that this has the expected result

The header that's part of the HTTP request / response now only contains entries we care about and allows the expensive original Header to be cleaned up earlier.

@MikeGoldsmith MikeGoldsmith added the type: bug Something isn't working label Sep 29, 2023
@MikeGoldsmith MikeGoldsmith self-assigned this Sep 29, 2023
@MikeGoldsmith MikeGoldsmith requested a review from a team as a code owner September 29, 2023 10:28
@@ -123,3 +127,23 @@ func (reader *tcpReader) processEvent(requestId int64, entry *entry) {
DstIp: reader.dstIp,
}
}

var headersToExtract = []string{
"User-Agent",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's really the only header you use right now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for now. We were conscious to not over subscribe to too many as headers can be very sensitive.

Copy link
Contributor

@JamieDanielson JamieDanielson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have everything we need

@MikeGoldsmith MikeGoldsmith merged commit 2cdf976 into main Oct 2, 2023
3 checks passed
@MikeGoldsmith MikeGoldsmith deleted the mike/extract-http-headers branch October 2, 2023 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP request/response header objects are holding onto a lot of live bytes
3 participants