rework msgpack buffers #2300
dd-trace-core/src/main/java/datadog/trace/common/metrics/SerializingMetricWriter.java
@@ -25,4 +29,24 @@ int traceCount() {
abstract void writeTo(WritableByteChannel channel) throws IOException;

abstract RequestBody toRequest();
the bleeding of msgpack into here seems a reasonable tradeoff for not needing to store the header in the buffer itself, which makes MsgPackWriter more generic.
👍
Would it make more sense to have these as static methods on a msgpack specific class?
this effectively is a msgpack specific class and will be for the foreseeable future
return allocationFreeUTF8Encode(s);
}

private int allocationFreeUTF8Encode(CharSequence s) {
UTF8BytesString obviated the need for these sorts of shenanigans, which are actually a lot slower than calling getBytes() because every charAt() and every put(byte) is bounds-checked. The more we use UTF8BytesString, the less we'll allocate in serialisation.
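A minimal sketch of the idea behind this comment, not the tracer's actual class: a wrapper that encodes a string to UTF-8 once and then copies the cached bytes with a single bulk put, so repeated serialisation neither re-encodes nor allocates. `CachedUtf8` and its method names are hypothetical and only illustrate the technique.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; the real UTF8BytesString may differ in detail.
final class CachedUtf8 {
  private final String string;
  private final byte[] utf8; // encoded once, reused on every write

  CachedUtf8(String string) {
    this.string = string;
    this.utf8 = string.getBytes(StandardCharsets.UTF_8);
  }

  // Length of the UTF-8 encoding, known without re-encoding.
  int encodedLength() {
    return utf8.length;
  }

  // One bulk copy into the target, instead of a charAt() and put(byte) per character.
  void transferTo(ByteBuffer buffer) {
    buffer.put(utf8);
  }

  @Override
  public String toString() {
    return string;
  }
}
```

The encoding cost is paid at construction, so the approach only helps for strings that are serialised more than once, such as tag names.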
31e5ffa to 80c8826
@@ -1,150 +1,3 @@
package datadog.trace.core.serialization;

import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString;
This existed to share flushing logic between the abandoned protobuf writer and the msgpack writer. We won't be trying protobuf again, and the flushing has been moved into the buffer implementation.
@@ -391,9 +392,8 @@ class DDAgentApiTest extends DDSpecification {
}

Payload prepareTraces(String agentVersion, List<List<DDSpan>> traces) {
This sort of thing is in too many places and made this change too difficult. I've patched up all of these, but this needs refactoring.
@@ -200,7 +201,7 @@ class DDAgentWriterCombinedTest extends DDSpecification {
when:
def mapper = agentVersion.equals("v0.5/traces") ? new TraceMapperV0_5() : new TraceMapperV0_4()
int traceSize = calculateSize(minimalTrace, mapper)
int maxedPayloadTraceCount = ((int) ((mapper.messageBufferSize() - 5) / traceSize))
5 was the space reserved for a header.
why no longer needed?
because there is no header in the buffer any more; it's added when writing the bytes out to the network
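A rough sketch of what adding the header at write time could look like. It assumes the 5-byte header is a msgpack array32 header (marker byte 0xdd followed by a 4-byte big-endian element count); the thread only says that 5 bytes were reserved, so that format is an assumption, and `HeaderAtWriteTime` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical sketch: the body buffer holds only serialized traces, and the
// header is built when the payload is written out.
final class HeaderAtWriteTime {
  private static final byte ARRAY32 = (byte) 0xdd; // msgpack array 32 marker

  // Builds the 5-byte header: marker plus big-endian element count.
  static ByteBuffer arrayHeader(int traceCount) {
    ByteBuffer header = ByteBuffer.allocate(5); // ByteBuffer is big-endian by default
    header.put(ARRAY32).putInt(traceCount);
    header.flip();
    return header;
  }

  // Writes the header first, then the body, looping until both are drained.
  static void writeTo(WritableByteChannel channel, int traceCount, ByteBuffer body)
      throws IOException {
    ByteBuffer header = arrayHeader(traceCount);
    while (header.hasRemaining()) {
      channel.write(header);
    }
    while (body.hasRemaining()) {
      channel.write(body);
    }
  }
}
```

Because the count is only needed when the payload leaves the process, the whole body buffer is available for traces, which is why the test no longer subtracts 5.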
@@ -204,6 +205,13 @@ class TraceMapperV04PayloadTest extends DDSpecification {

@Override
int write(ByteBuffer src) {
if (captured.remaining() < src.remaining()) {
letting this grow allows the test to explore more cases, rather than rejecting the output, but the increase in heap usage needs to be monitored.
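For illustration, a capturing channel that grows instead of rejecting output might look like the following. This is a sketch in Java rather than the Groovy test itself; `GrowingCapture`, the doubling policy, and the accessor names are assumptions, and only the `captured.remaining() < src.remaining()` check comes from the diff.

```java
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical test helper: captures everything written to it, growing on demand.
final class GrowingCapture implements WritableByteChannel {
  private ByteBuffer captured;

  GrowingCapture(int initialCapacity) {
    this.captured = ByteBuffer.allocate(initialCapacity);
  }

  @Override
  public int write(ByteBuffer src) {
    if (captured.remaining() < src.remaining()) {
      // Grow to at least the required size, doubling to amortise the copies.
      int required = captured.position() + src.remaining();
      ByteBuffer larger = ByteBuffer.allocate(Math.max(required, captured.capacity() * 2));
      captured.flip();
      larger.put(captured);
      captured = larger;
    }
    int written = src.remaining();
    captured.put(src);
    return written;
  }

  // Number of bytes captured so far.
  int size() {
    return captured.position();
  }

  // Current backing capacity, useful for watching heap growth in tests.
  int capacity() {
    return captured.capacity();
  }

  @Override
  public boolean isOpen() {
    return true;
  }

  @Override
  public void close() {}
}
```

Exposing the capacity gives a test a cheap way to keep an eye on the heap growth the comment warns about.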
@@ -32,49 +33,6 @@ import static org.msgpack.core.MessageFormat.UINT8

class TraceMapperV05PayloadTest extends DDSpecification {


def "dictionary overflow causes a flush"() {
The dictionary doesn't flush any more. Keeping two isolated callbacks in sync, each triggered when a fixed capacity was reached, made this code extremely hard to reason about.
@@ -98,7 +56,9 @@ class TraceMapperV05PayloadTest extends DDSpecification {
UUID.randomUUID().toString(),
false))
int traceSize = calculateSize(repeatedTrace)
int tracesRequiredToOverflowBody = (traceMapper.messageBufferSize() + traceSize - 1) / traceSize
// 30KB body
no need to use MBs of data to check there's a flush when the message buffer is full
@@ -53,6 +53,12 @@ public void transferTo(ByteBuffer buffer) {
buffer.put(utf8Bytes);
}

/** Writes the UTF8 encoding of the wrapped {@code String}. */
public byte[] getUtf8Bytes() {
Ideally a naked reference to this byte[] should be avoided
4273061 to 19afa9c
dd-trace-core/src/main/java/datadog/trace/core/serialization/msgpack/MsgPackWriter.java
19afa9c to 192a872
192a872 to 1a49110
88c1de3 to 9e9cd97
… the ripple effects of not having encapsulated this properly
9e9cd97 to 40701bb
b7a3471 to 8935fe6
8935fe6 to 22101e0
dd-trace-core/src/main/java/datadog/trace/core/serialization/GrowableBuffer.java
22101e0 to 0263eb8
public SerializingMetricWriter(WellKnownTags wellKnownTags, Sink sink) {
this.wellKnownTags = wellKnownTags;
this.writer = new MsgPackWriter(sink, ByteBuffer.allocate(1 << 20), EnumSet.of(SINGLE_MESSAGE));
this.buffer = new GrowableBuffer(512 << 10);
In the doc for GrowableBuffer you say only use if bounded... what is it that makes metrics limited in size?
the metrics points are stored in a bounded LRU cache
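To illustrate why a bounded LRU cache puts an implicit cap on the serialized size, here is a generic sketch built on `java.util.LinkedHashMap`. It is not the tracer's cache; `BoundedLru` is a hypothetical name. With at most `maxEntries` points, and assuming each serialized point has a bounded size, the buffer that serializes them cannot grow without limit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded LRU map; not the tracer's implementation.
final class BoundedLru<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;

  private final int maxEntries;

  BoundedLru(int maxEntries) {
    super(16, 0.75f, true); // access order: the least recently used entry is eldest
    this.maxEntries = maxEntries;
  }

  // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > maxEntries;
  }
}
```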
The main change here is that buffering is abstracted away from the format writer(s?) so that all the format writer needs to be aware of is the format. There is a new abstraction called StreamingBuffer with two flavours:
FlushingBuffer - fixed size, flushes when full, this is how we have been sending traces for months. Traces are awkward because they are large and the format is length prefixed, which means we need to tell the agent how many traces to expect before sending the traces, which forbids streaming.
GrowableBuffer - resizes its buffer when necessary, should only be used when there is an implicit limit on its growth in practice, e.g. for the serialized string table in the v0.5 trace format.
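The two flavours described above can be sketched roughly as follows. The interface and class names are prefixed with `Sketch` to make clear they are hypothetical; the real StreamingBuffer API in the PR is likely richer, and the `Consumer<ByteBuffer>` sink and doubling growth policy are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Hypothetical shapes named after the PR description; the real API may differ.
interface SketchStreamingBuffer {
  void put(byte[] bytes);
}

// Fixed size: hands its contents to a sink whenever the next message will not fit.
final class SketchFlushingBuffer implements SketchStreamingBuffer {
  private final ByteBuffer buffer;
  private final Consumer<ByteBuffer> sink;

  SketchFlushingBuffer(int capacity, Consumer<ByteBuffer> sink) {
    this.buffer = ByteBuffer.allocate(capacity);
    this.sink = sink;
  }

  @Override
  public void put(byte[] bytes) {
    if (bytes.length > buffer.capacity()) {
      throw new IllegalArgumentException("message larger than buffer");
    }
    if (bytes.length > buffer.remaining()) {
      flush();
    }
    buffer.put(bytes);
  }

  // The sink must consume the buffer before returning, because it is reused.
  void flush() {
    buffer.flip();
    sink.accept(buffer);
    buffer.clear();
  }
}

// Resizes when necessary: only safe when something else bounds its growth.
final class SketchGrowableBuffer implements SketchStreamingBuffer {
  private ByteBuffer buffer;

  SketchGrowableBuffer(int initialCapacity) {
    this.buffer = ByteBuffer.allocate(initialCapacity);
  }

  @Override
  public void put(byte[] bytes) {
    if (bytes.length > buffer.remaining()) {
      int required = buffer.position() + bytes.length;
      ByteBuffer larger = ByteBuffer.allocate(Math.max(required, buffer.capacity() * 2));
      buffer.flip();
      larger.put(buffer);
      buffer = larger;
    }
    buffer.put(bytes);
  }

  int size() {
    return buffer.position();
  }
}
```

A format writer that depends only on the interface does not need to know which flavour it has been given, which is the separation the description is aiming for.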