-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add syslog batching poc implementation #491
base: main
Are you sure you want to change the base?
Changes from all commits
2d2556e
5170ba2
48a13d8
f206257
35104c5
21666c8
3c9934f
519dda6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
package syslog | ||
|
||
import ( | ||
"bytes" | ||
"crypto/tls" | ||
"time" | ||
|
||
"code.cloudfoundry.org/go-loggregator/v9/rpc/loggregator_v2" | ||
metrics "code.cloudfoundry.org/go-metric-registry" | ||
"code.cloudfoundry.org/loggregator-agent-release/src/pkg/egress" | ||
) | ||
|
||
const BATCHSIZE = 256 * 1024 | ||
|
||
type HTTPSBatchWriter struct { | ||
HTTPSWriter | ||
msgs chan []byte | ||
batchSize int | ||
sendInterval time.Duration | ||
egrMsgCount float64 | ||
} | ||
|
||
func NewHTTPSBatchWriter( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you please add tests for this writer. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some tests are already added, do we need tests for the things already done by the httpsWriter (Error handling and the likes, which is anyways the same?) |
||
binding *URLBinding, | ||
netConf NetworkTimeoutConfig, | ||
tlsConf *tls.Config, | ||
egressMetric metrics.Counter, | ||
c *Converter, | ||
) egress.WriteCloser { | ||
client := httpClient(netConf, tlsConf) | ||
binding.URL.Scheme = "https" // reset the scheme for usage to a valid http scheme | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What's the purpose of changing the scheme here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the scheme is used to differentiate between the different endpoints (https and https-batched) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does mutating the scheme here affect the metrics emitted? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will check for the metrics being changed here. |
||
BatchWriter := &HTTPSBatchWriter{ | ||
HTTPSWriter: HTTPSWriter{ | ||
url: binding.URL, | ||
appID: binding.AppID, | ||
hostname: binding.Hostname, | ||
client: client, | ||
egressMetric: egressMetric, | ||
syslogConverter: c, | ||
}, | ||
batchSize: BATCHSIZE, | ||
sendInterval: 1 * time.Second, | ||
egrMsgCount: 0, | ||
msgs: make(chan []byte), | ||
} | ||
go BatchWriter.startSender() | ||
return BatchWriter | ||
} | ||
|
||
// Modified Write function | ||
func (w *HTTPSBatchWriter) Write(env *loggregator_v2.Envelope) error { | ||
msgs, err := w.syslogConverter.ToRFC5424(env, w.hostname) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
for _, msg := range msgs { | ||
w.msgs <- msg | ||
} | ||
return nil | ||
} | ||
|
||
func (w *HTTPSBatchWriter) startSender() { | ||
t := time.NewTimer(w.sendInterval) | ||
|
||
var msgBatch bytes.Buffer | ||
var msgCount float64 | ||
for { | ||
nicklas-dohrn marked this conversation as resolved.
Show resolved
Hide resolved
|
||
select { | ||
case msg := <-w.msgs: | ||
msgBatch.Write(msg) | ||
msgCount++ | ||
if msgBatch.Len() >= w.batchSize { | ||
w.sendHttpRequest(msgBatch.Bytes(), msgCount) | ||
msgBatch.Reset() | ||
msgCount = 0 | ||
t.Reset(w.sendInterval) | ||
} | ||
case <-t.C: | ||
if msgBatch.Len() > 0 { | ||
w.sendHttpRequest(msgBatch.Bytes(), msgCount) | ||
msgBatch.Reset() | ||
msgCount = 0 | ||
} | ||
t.Reset(w.sendInterval) | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,144 @@ | ||
package syslog_test | ||
|
||
import ( | ||
"bytes" | ||
"crypto/tls" | ||
"io" | ||
"net/http" | ||
"net/http/httptest" | ||
"time" | ||
|
||
"code.cloudfoundry.org/go-loggregator/v9/rfc5424" | ||
"code.cloudfoundry.org/go-loggregator/v9/rpc/loggregator_v2" | ||
metricsHelpers "code.cloudfoundry.org/go-metric-registry/testhelpers" | ||
"code.cloudfoundry.org/loggregator-agent-release/src/pkg/egress" | ||
"code.cloudfoundry.org/loggregator-agent-release/src/pkg/egress/syslog" | ||
. "github.com/onsi/ginkgo/v2" | ||
. "github.com/onsi/gomega" | ||
) | ||
|
||
var string_to_1024_chars = "saljdflajsdssdfsdfljkfkajafjajlköflkjöjaklgljksdjlakljkflkjweljklkwjejlkfekljwlkjefjklwjklsdajkljklwerlkaskldgjksakjekjwrjkljasdjkgfkljwejklrkjlklasdkjlsadjlfjlkadfljkajklsdfjklslkdfjkllkjasdjkflsdlakfjklasldfkjlasdjfkjlsadlfjklaljsafjlslkjawjklerkjljklasjkdfjklwerjljalsdjkflwerjlkwejlkarjklalkklfsdjlfhkjsdfkhsewhkjjasdjfkhwkejrkjahjefkhkasdjhfkashfkjwehfkksadfjaskfkhjdshjfhewkjhasdfjdajskfjwehkfajkankaskjdfasdjhfkkjhjjkasdfjhkjahksdf" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I find the name here a little confusing. The string is ~440 bytes in length. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes this might be confusing. |
||
|
||
var _ = Describe("HTTPS_batch_testing", func() { | ||
var ( | ||
netConf syslog.NetworkTimeoutConfig | ||
skipSSLTLSConfig = &tls.Config{ | ||
InsecureSkipVerify: true, //nolint:gosec | ||
} | ||
c = syslog.NewConverter() | ||
drain *SpyDrain | ||
b *syslog.URLBinding | ||
writer egress.WriteCloser | ||
) | ||
string_to_1024_chars += string_to_1024_chars | ||
nicklas-dohrn marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
BeforeEach(func() { | ||
drain = newBatchMockDrain(200) | ||
b = buildURLBinding( | ||
drain.URL, | ||
"test-app-id", | ||
"test-hostname", | ||
) | ||
writer = syslog.NewHTTPSBatchWriter( | ||
b, | ||
netConf, | ||
skipSSLTLSConfig, | ||
&metricsHelpers.SpyMetric{}, | ||
c, | ||
) | ||
}) | ||
|
||
It("testing simple appending of one log", func() { | ||
env1 := buildLogEnvelope("APP", "1", "message 1", loggregator_v2.Log_OUT) | ||
Expect(writer.Write(env1)).To(Succeed()) | ||
env2 := buildLogEnvelope("APP", "2", "message 2", loggregator_v2.Log_OUT) | ||
Expect(writer.Write(env2)).To(Succeed()) | ||
time.Sleep(2 * time.Second) | ||
|
||
Expect(drain.messages).To(HaveLen(2)) | ||
expected := &rfc5424.Message{ | ||
AppName: "test-app-id", | ||
Hostname: "test-hostname", | ||
Priority: rfc5424.Priority(14), | ||
ProcessID: "[APP/1]", | ||
Message: []byte("message 1\n"), | ||
} | ||
Expect(drain.messages[0].AppName).To(Equal(expected.AppName)) | ||
Expect(drain.messages[0].Hostname).To(Equal(expected.Hostname)) | ||
Expect(drain.messages[0].Priority).To(BeEquivalentTo(expected.Priority)) | ||
Expect(drain.messages[0].ProcessID).To(Equal(expected.ProcessID)) | ||
Expect(drain.messages[0].Message).To(Equal(expected.Message)) | ||
expected = &rfc5424.Message{ | ||
AppName: "test-app-id", | ||
Hostname: "test-hostname", | ||
Priority: rfc5424.Priority(14), | ||
ProcessID: "[APP/2]", | ||
Message: []byte("message 2\n"), | ||
} | ||
Expect(drain.messages[1].AppName).To(Equal(expected.AppName)) | ||
Expect(drain.messages[1].Hostname).To(Equal(expected.Hostname)) | ||
Expect(drain.messages[1].Priority).To(BeEquivalentTo(expected.Priority)) | ||
Expect(drain.messages[1].ProcessID).To(Equal(expected.ProcessID)) | ||
Expect(drain.messages[1].Message).To(Equal(expected.Message)) | ||
}) | ||
|
||
It("test early dispatch on high message load", func() { | ||
env1 := buildLogEnvelope("APP", "1", "string to get log to 1024 characters:"+string_to_1024_chars, loggregator_v2.Log_OUT) | ||
for i := 0; i < 300; i++ { | ||
writer.Write(env1) | ||
} | ||
time.Sleep(100 * time.Millisecond) | ||
Expect(drain.messages).To(HaveLen(256)) | ||
}) | ||
|
||
It("test batch dispatching with all logs in a given timeframe", func() { | ||
env1 := buildLogEnvelope("APP", "1", "string to get log to 1024 characters:"+string_to_1024_chars, loggregator_v2.Log_OUT) | ||
for i := 0; i < 10; i++ { | ||
writer.Write(env1) | ||
time.Sleep(99 * time.Millisecond) | ||
} | ||
Expect(drain.messages).To(HaveLen(0)) | ||
time.Sleep(100 * time.Millisecond) | ||
Expect(drain.messages).To(HaveLen(10)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In general in these tests can we use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These are not final yet. |
||
}) | ||
|
||
It("probabilistic test for race condition", func() { | ||
env1 := buildLogEnvelope("APP", "1", "string to get log to 1024 characters:"+string_to_1024_chars, loggregator_v2.Log_OUT) | ||
for i := 0; i < 10; i++ { | ||
writer.Write(env1) | ||
time.Sleep(99 * time.Millisecond) | ||
} | ||
time.Sleep(100 * time.Millisecond) | ||
Expect(drain.messages).To(HaveLen(10)) | ||
}) | ||
}) | ||
|
||
func newBatchMockDrain(status int) *SpyDrain { | ||
drain := &SpyDrain{} | ||
handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
|
||
body, err := io.ReadAll(r.Body) | ||
Expect(err).ToNot(HaveOccurred()) | ||
defer r.Body.Close() | ||
|
||
println(body) | ||
|
||
message := &rfc5424.Message{} | ||
|
||
messages := bytes.SplitAfter(body, []byte("\n")) | ||
for _, raw := range messages { | ||
if bytes.Equal(raw, []byte("")) { | ||
continue | ||
} | ||
message = &rfc5424.Message{} | ||
err = message.UnmarshalBinary(raw) | ||
Expect(err).ToNot(HaveOccurred()) | ||
drain.messages = append(drain.messages, message) | ||
drain.headers = append(drain.headers, r.Header) | ||
} | ||
w.WriteHeader(status) | ||
}) | ||
server := httptest.NewTLSServer(handler) | ||
drain.Server = server | ||
return drain | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We understand the goal of the HTTPSBatchWriter to be:
❓ Is that right?
Currently,
HTTPSBatchWriter
andTriggerTimer
doesn't appear to do that. We think that code actually results in the following behaviour:➡️ Can you please have a look at adjusting this code.
Here are some examples of ways we've done batching in the past:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first thing:
Yes your understanding seems right on what we try to do.
I have pushed a newer version, that should comply with the envisioned behaviour.
Tests for the new version are also added, confirming the implementation complying with the wanted behaviour.
I am currently looking through your pointers to see if I can leverage some of the implementations shown there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the code proposed:
The implementation for the signal batcher uses slices and an append structure.
It has information of how long the batch is going to be, speeding up the execution.
In contrast, the syslog batching feature will not know beforehand, how long batches are going to be,
Giving the edge of speed to Byte.buffers:
https://stackoverflow.com/questions/39319024/builtin-append-vs-bytes-buffer-write
Also, the actual buffering code is pretty short, and the expected type for sending data with the http client used is Byte[], so the choice still seems obvious to me.