feat: Add configurable list of HTTP headers to extract (#277)
## Which problem is this PR solving?
The current list of HTTP headers that are extracted for request/response
pairs is limited to just `User-Agent`.

This PR adds a configuration option to provide a custom set of headers
that will be extracted from each HTTP request/response pair. The agent looks
in the new `HTTP_HEADERS` env var for a comma-separated list of headers
to extract in place of the default set of headers.

- Closes #215 

## Short description of the changes
- Move default headers to extract from the HTTP parser to config
- Add `HTTPHeadersToExtract` to the config struct, which is populated from
the `HTTP_HEADERS` env var when it is set and from the default set
otherwise
- Add new `LookupEnvAsStringSlice` func to utils that is used to get the
list of headers
- Update the HTTP parser to take the list of headers to extract as an
argument to `newHttpParser`
- Add unit tests to config and http parser
- Add new env var to README

## How to verify that this has the expected result
When configuring the agent, set `HTTP_HEADERS` to a comma-separated list
of headers. Those headers are then extracted from HTTP requests and
responses in place of the default set.

---------

Co-authored-by: Jamie Danielson <jamieedanielson@gmail.com>
Co-authored-by: Robb Kidd <robbkidd@honeycomb.io>
3 people committed Oct 13, 2023
1 parent e25f2fe commit 5cc0ff5
Showing 7 changed files with 91 additions and 30 deletions.
17 changes: 10 additions & 7 deletions README.md
@@ -46,14 +46,17 @@ kubectl create secret generic honeycomb --from-literal=api-key=$HONEYCOMB_API_KE
 The network agent can be configured using the following environment variables.

 | Environment Variable | Description | Default | Required? |
-| ------------------------- | ---------------------------------------------------------------------------------------- | -------------------------- | --------- |
+|---------------------------|------------------------------------------------------------------------------------------|----------------------------|-----------|
 | `HONEYCOMB_API_KEY` | The Honeycomb API key used when sending events | `` (empty) | **Yes** |
 | `HONEYCOMB_API_ENDPOINT` | The endpoint to send events to | `https://api.honeycomb.io` | No |
 | `HONEYCOMB_DATASET` | Dataset where network events are stored | `hny-network-agent` | No |
 | `HONEYCOMB_STATS_DATASET` | Dataset where operational statistics for the network agent are stored | `hny-network-agent-stats` | No |
 | `LOG_LEVEL` | The log level to use when printing logs to console | `INFO` | No |
 | `DEBUG` | Runs the agent in debug mode including enabling a profiling endpoint using Debug Address | `false` | No |
 | `DEBUG_ADDRESS` | The endpoint to listen to when running the profile endpoint | `localhost:6060` | No |
+| `HTTP_HEADERS` | Case-sensitive, comma separated list of headers to be recorded from requests/responses† | `User-Agent` | No |
+
+†: When providing an override of a list of values, you must provide all values including any defaults.

 ### Run

@@ -67,12 +70,12 @@ Alternative options for configuration and running can be found in [Deploying the

 ## Supported Platforms

-| Platform | Supported |
-| ---------------------------------------------------------------------| ------------------------------------- |
-| [AKS](https://azure.microsoft.com/en-gb/products/kubernetes-service) | Supported ✅ |
-| [EKS](https://aws.amazon.com/eks/) | Self-managed hosts ✅ <br> Fargate ❌ |
-| [GKE](https://cloud.google.com/kubernetes-engine) | Standard cluster ✅ <br> AutoPilot ❌ |
-| Self-hosted | Ubuntu ✅ |
+| Platform | Supported |
+|----------------------------------------------------------------------|-------------------------------------|
+| [AKS](https://azure.microsoft.com/en-gb/products/kubernetes-service) | Supported ✅ |
+| [EKS](https://aws.amazon.com/eks/) | Self-managed hosts ✅ <br> Fargate ❌ |
+| [GKE](https://cloud.google.com/kubernetes-engine) | Standard cluster ✅ <br> AutoPilot ❌ |
+| Self-hosted | Ubuntu ✅ |

 ### Requirements

20 changes: 9 additions & 11 deletions assemblers/http_parser.go
@@ -8,12 +8,14 @@ import (

 // httpParser parses HTTP requests and responses
 type httpParser struct {
-	matcher *httpMatcher
+	matcher          *httpMatcher
+	headersToExtract []string
 }

-func newHttpParser() *httpParser {
+func newHttpParser(headersToExtract []string) *httpParser {
 	return &httpParser{
-		matcher: newRequestResponseMatcher(),
+		matcher:          newRequestResponseMatcher(),
+		headersToExtract: headersToExtract,
 	}
 }

@@ -28,7 +30,7 @@ func (parser *httpParser) parse(stream *tcpStream, requestId int64, timestamp ti
 			return false, err
 		}
 		// We only care about a few headers, so recreate the header with just the ones we need
-		req.Header = extractHeaders(req.Header)
+		req.Header = parser.extractHeaders(req.Header)
 		// We don't need the body, so just close it if set
 		if req.Body != nil {
 			req.Body.Close()
@@ -54,7 +56,7 @@ func (parser *httpParser) parse(stream *tcpStream, requestId int64, timestamp ti
 			return false, err
 		}
 		// We only care about a few headers, so recreate the header with just the ones we need
-		res.Header = extractHeaders(res.Header)
+		res.Header = parser.extractHeaders(res.Header)
 		// We don't need the body, so just close it if set
 		if res.Body != nil {
 			res.Body.Close()
@@ -78,19 +80,15 @@ func (parser *httpParser) parse(stream *tcpStream, requestId int64, timestamp ti
 	return true, nil
 }

-var headersToExtract = []string{
-	"User-Agent",
-}
-
 // extractHeaders returns a new http.Header object with only specified headers from the original.
 // The original request/response header contains a lot of stuff we don't really care about
 // and stays in memory until the request/response pair is processed
-func extractHeaders(header http.Header) http.Header {
+func (parser *httpParser) extractHeaders(header http.Header) http.Header {
 	cleanHeader := http.Header{}
 	if header == nil {
 		return cleanHeader
 	}
-	for _, headerName := range headersToExtract {
+	for _, headerName := range parser.headersToExtract {
 		if headerValue := header.Get(headerName); headerValue != "" {
 			cleanHeader.Set(headerName, headerValue)
 		}
37 changes: 26 additions & 11 deletions assemblers/http_parser_test.go
@@ -9,37 +9,52 @@ import (

 func TestExtractHeader(t *testing.T) {
 	testCases := []struct {
-		name     string
-		header   http.Header
-		expected http.Header
+		name             string
+		headersToExtract []string
+		header           http.Header
+		expected         http.Header
 	}{
 		{
-			name:     "nil header",
-			header:   nil,
-			expected: http.Header{},
+			name:             "nil header",
+			headersToExtract: nil,
+			header:           nil,
+			expected:         http.Header{},
 		},
 		{
-			name:     "empty header",
-			header:   http.Header{},
-			expected: http.Header{},
+			name:             "empty header",
+			headersToExtract: nil,
+			header:           http.Header{},
+			expected:         http.Header{},
 		},
 		{
-			name: "only extracts headers we want to keep",
+			name:             "only extracts headers we want to keep",
+			headersToExtract: []string{"User-Agent", "X-Test"},
 			header: http.Header{
 				"Accept":     []string{"test"},
 				"Host":       []string{"test"},
 				"Cookie":     []string{"test"},
 				"User-Agent": []string{"test"},
+				"X-Test":     []string{"test"},
 			},
 			expected: http.Header{
 				"User-Agent": []string{"test"},
+				"X-Test":     []string{"test"},
 			},
 		},
+		{
+			name:             "header names are case-sensitive",
+			headersToExtract: []string{"X-TEST"},
+			header: http.Header{
+				"x-test": []string{"test"},
+			},
+			expected: http.Header{},
+		},
 	}

 	for _, tc := range testCases {
 		t.Run(tc.name, func(t *testing.T) {
-			result := extractHeaders(tc.header)
+			parser := newHttpParser(tc.headersToExtract)
+			result := parser.extractHeaders(tc.header)
 			assert.Equal(t, tc.expected, result)
 		})
 	}
2 changes: 1 addition & 1 deletion assemblers/tcp_stream.go
@@ -51,7 +51,7 @@ func NewTcpStream(net gopacket.Flow, transport gopacket.Flow, config config.Conf
 		dstPort: transport.Dst().String(),
 		buffer:  bufio.NewReader(bytes.NewReader(nil)),
 		parsers: []parser{
-			newHttpParser(),
+			newHttpParser(config.HTTPHeadersToExtract),
 		},
 	}
 }
17 changes: 17 additions & 0 deletions config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,9 @@ type Config struct {

// Include the request URL in the event.
IncludeRequestURL bool

// The list of HTTP headers to extract from a HTTP request/response.
HTTPHeadersToExtract []string
}

// NewConfig returns a new Config struct.
Expand Down Expand Up @@ -145,6 +148,7 @@ func NewConfig() Config {
AgentPodName: utils.LookupEnvOrString("AGENT_POD_NAME", ""),
AdditionalAttributes: utils.LookupEnvAsStringMap("ADDITIONAL_ATTRIBUTES"),
IncludeRequestURL: utils.LookupEnvOrBool("INCLUDE_REQUEST_URL", true),
HTTPHeadersToExtract: getHTTPHeadersToExtract(),
}
}

Expand Down Expand Up @@ -236,3 +240,16 @@ func (c *Config) Validate() error {
// returns nil if no errors in slice
return errors.Join(e...)
}

var defaultHeadersToExtract = []string{
"User-Agent",
}

// getHTTPHeadersToExtract returns the list of HTTP headers to extract from a HTTP request/response
// based on a user-defined list in HTTP_HEADERS, or the default headers if no list is given.
func getHTTPHeadersToExtract() []string {
if headers, found := utils.LookupEnvAsStringSlice("HTTP_HEADERS"); found {
return headers
}
return defaultHeadersToExtract
}
10 changes: 10 additions & 0 deletions config/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@ func TestEnvVars(t *testing.T) {
t.Setenv("AGENT_POD_NAME", "pod_name")
t.Setenv("ADDITIONAL_ATTRIBUTES", "key1=value1,key2=value2")
t.Setenv("INCLUDE_REQUEST_URL", "false")
t.Setenv("HTTP_HEADERS", "header1,header2")

config := NewConfig()
assert.Equal(t, "1234567890123456789012", config.APIKey)
Expand All @@ -82,6 +83,14 @@ func TestEnvVars(t *testing.T) {
assert.Equal(t, "pod_name", config.AgentPodName)
assert.Equal(t, map[string]string{"key1": "value1", "key2": "value2"}, config.AdditionalAttributes)
assert.Equal(t, false, config.IncludeRequestURL)
assert.Equal(t, []string{"header1", "header2"}, config.HTTPHeadersToExtract)
}

func TestEmptyHeadersEnvVar(t *testing.T) {
t.Setenv("HTTP_HEADERS", "")

config := NewConfig()
assert.Equal(t, []string{}, config.HTTPHeadersToExtract)
}

func TestEnvVarsDefault(t *testing.T) {
Expand All @@ -106,6 +115,7 @@ func TestEnvVarsDefault(t *testing.T) {
assert.Equal(t, "", config.AgentPodName)
assert.Equal(t, map[string]string{}, config.AdditionalAttributes)
assert.Equal(t, true, config.IncludeRequestURL)
assert.Equal(t, []string{"User-Agent"}, config.HTTPHeadersToExtract)
}

func Test_Config_buildBpfFilter(t *testing.T) {
Expand Down
18 changes: 18 additions & 0 deletions utils/env.go
@@ -41,3 +41,21 @@ func LookupEnvAsStringMap(key string) map[string]string {
 	}
 	return values
 }
+
+// LookupEnvAsStringSlice returns a slice of strings from the environment variable with the given key
+// and a boolean indicating if the key was present
+// values are comma separated
+// Example: value1,value2,value3
+func LookupEnvAsStringSlice(key string) ([]string, bool) {
+	values := []string{}
+	env, found := os.LookupEnv(key)
+	if !found {
+		return values, false
+	}
+	for _, value := range strings.Split(env, ",") {
+		if value != "" {
+			values = append(values, value)
+		}
+	}
+	return values, true
+}
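
The splitting rules used by `LookupEnvAsStringSlice` have a few edge cases that are easy to miss. The sketch below isolates just the splitting step in a hypothetical `splitList` helper (not part of the commit) to show them: empty entries are dropped, an empty string yields an empty slice, and surrounding whitespace is not trimmed.

```go
package main

import (
	"fmt"
	"strings"
)

// splitList is a hypothetical helper that mirrors the splitting step:
// comma separated, empty entries skipped, no whitespace trimming.
func splitList(raw string) []string {
	values := []string{}
	for _, value := range strings.Split(raw, ",") {
		if value != "" {
			values = append(values, value)
		}
	}
	return values
}

func main() {
	fmt.Printf("%q\n", splitList("User-Agent,,X-Test,")) // ["User-Agent" "X-Test"]
	fmt.Printf("%q\n", splitList(""))                    // []
	// A space after the comma becomes part of the header name.
	fmt.Printf("%q\n", splitList("User-Agent, X-Test")) // ["User-Agent" " X-Test"]
}
```

Given the last case, it is probably safest to write the list without spaces around the commas, since a name with a leading space would not be expected to match any header.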
