Streaming interface #4
Comments
Link to the new stream interface: https://github.com/google/brotli/blob/a6881eb3090d9d0c08e3985e6a22b7a47cecfbab/dec/decode.h#L140
Function signature:
BrotliResult BrotliDecompressStream(size_t* available_in,
                                    const uint8_t** next_in,
                                    size_t* available_out,
                                    uint8_t** next_out,
                                    size_t* total_out,
                                    BrotliState* s); |
Link to example usage of the new stream interface (in code for reference):
BrotliResult BrotliDecompressBuffer(size_t encoded_size,
                                    const uint8_t* encoded_buffer,
                                    size_t* decoded_size,
                                    uint8_t* decoded_buffer) {
  BrotliState s;
  BrotliResult result;
  size_t total_out = 0;
  size_t available_in = encoded_size;
  const uint8_t* next_in = encoded_buffer;
  size_t available_out = *decoded_size;
  uint8_t* next_out = decoded_buffer;
  BrotliStateInit(&s);
  result = BrotliDecompressStream(&available_in, &next_in, &available_out,
                                  &next_out, &total_out, &s);
  *decoded_size = total_out;
  BrotliStateCleanup(&s);
  if (result != BROTLI_RESULT_SUCCESS) {
    result = BROTLI_RESULT_ERROR;
  }
  return result;
} |
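For reference, here is a rough Go sketch of how a caller might drive a streaming decoder shaped like `BrotliDecompressStream` with small, fixed-size buffers. Everything in it is an assumption for illustration: `streamDecoder`, `copyDecoder`, `decodeAll` and `decodeString` are hypothetical names, the result codes are only modeled on `BrotliResult`, and `copyDecoder` is a pass-through stand-in rather than a real brotli binding.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

// result is modeled on BrotliResult; the names here are hypothetical.
type result int

const (
	resultError result = iota
	resultSuccess
	resultNeedsMoreInput
	resultNeedsMoreOutput
)

// streamDecoder is a hypothetical Go-side view of a streaming decoder:
// it consumes bytes from in, writes bytes to out, and reports progress.
type streamDecoder interface {
	decode(in, out []byte) (consumed, produced int, res result)
}

// copyDecoder is a stand-in that "decodes" by copying a known number of bytes.
type copyDecoder struct{ remaining int }

func (d *copyDecoder) decode(in, out []byte) (int, int, result) {
	if len(in) > d.remaining {
		in = in[:d.remaining]
	}
	n := copy(out, in)
	d.remaining -= n
	switch {
	case d.remaining == 0:
		return n, n, resultSuccess
	case n < len(in):
		return n, n, resultNeedsMoreOutput // output buffer filled up first
	default:
		return n, n, resultNeedsMoreInput // all input consumed
	}
}

// decodeAll pumps src through dec into dst until the decoder reports success.
// It assumes the decoder has consumed all input when it asks for more.
func decodeAll(dec streamDecoder, src io.Reader, dst io.Writer) error {
	inBuf := make([]byte, 4)  // deliberately tiny to exercise the loop
	outBuf := make([]byte, 3) // deliberately tiny to exercise the loop
	var in []byte
	for {
		consumed, produced, res := dec.decode(in, outBuf)
		in = in[consumed:]
		if _, err := dst.Write(outBuf[:produced]); err != nil {
			return err
		}
		switch res {
		case resultSuccess:
			return nil
		case resultNeedsMoreOutput:
			continue // outBuf was drained above, so just call again
		case resultNeedsMoreInput:
			n, err := src.Read(inBuf)
			if n == 0 && err != nil {
				return io.ErrUnexpectedEOF // stream ended before the decoder finished
			}
			in = inBuf[:n]
		default:
			return errors.New("decoder reported an error")
		}
	}
}

// decodeString is a small helper for trying the loop on in-memory data.
func decodeString(s string, total int) (string, error) {
	var out bytes.Buffer
	err := decodeAll(&copyDecoder{remaining: total}, strings.NewReader(s), &out)
	return out.String(), err
}

func main() {
	got, err := decodeString("0123456789", 10)
	fmt.Println(got, err)
}
```

A real binding would replace `copyDecoder` with a cgo call and keep the same refill/drain loop; whether that loop lives behind an `io.Reader` is a separate design choice.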
Ah, it seems there's still no C-only interface for the streaming encoder. Got it. |
Ah, I was excited there for a moment :) I think wrapping the Encoder should be reasonably straightforward, but my C++ is a bit rusty so I've been putting it off. |
Me too! But let's get it working anyway
Alright let's give it a go then :) |
This looks like a good starting place: http://stackoverflow.com/a/1721230/37416 |
Also, finalizers are probably necessary to avoid memory leaks: https://golang.org/pkg/runtime/#SetFinalizer |
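A minimal sketch of the finalizer idea, assuming a wrapper type that owns some natively allocated state. `nativeHandle`, `Encoder` and `NewEncoder` are hypothetical names, and the handle is a plain Go struct standing in for memory that would really be freed through cgo. The explicit `Close` stays the primary way to release the resource; the finalizer is only a safety net, since the runtime does not guarantee when (or whether) it runs.

```go
package main

import (
	"fmt"
	"runtime"
)

// nativeHandle stands in for C-allocated state (hypothetical).
type nativeHandle struct{ freed bool }

func (h *nativeHandle) free() { h.freed = true }

// Encoder is a hypothetical wrapper that owns a native handle.
type Encoder struct {
	h *nativeHandle
}

// NewEncoder allocates the handle and registers a finalizer as a fallback
// in case the caller forgets to call Close.
func NewEncoder() *Encoder {
	e := &Encoder{h: &nativeHandle{}}
	runtime.SetFinalizer(e, func(e *Encoder) { e.Close() })
	return e
}

// Close releases the handle. It is safe to call more than once.
func (e *Encoder) Close() error {
	if e.h == nil {
		return nil
	}
	e.h.free()
	e.h = nil
	runtime.SetFinalizer(e, nil) // nothing left for the finalizer to do
	return nil
}

func main() {
	e := NewEncoder()
	h := e.h
	e.Close()
	fmt.Println("freed:", h.freed)
}
```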
Yup, although this is from back in '09, possibly from before Go built the .c/.h files found in the same directory as a cgo-using .go file? It should be even simpler now.
👍 keeping that in mind |
Hmm might have to rename 'BrotliParams' to 'CBrotliParams' or something, otherwise |
Copying this here for reference — minus the Brotli In/Out classes, it uses the encoding API I'm going for (CopyInputToRingBuffer + WriteBrotliData) int BrotliCompressWithCustomDictionary(size_t dictsize, const uint8_t* dict,
BrotliParams params,
BrotliIn* in, BrotliOut* out) {
size_t in_bytes = 0;
size_t out_bytes = 0;
uint8_t* output;
bool final_block = false;
BrotliCompressor compressor(params);
if (dictsize != 0) compressor.BrotliSetCustomDictionary(dictsize, dict);
while (!final_block) {
in_bytes = CopyOneBlockToRingBuffer(in, &compressor);
final_block = in_bytes == 0 || BrotliInIsFinished(in);
out_bytes = 0;
if (!compressor.WriteBrotliData(final_block,
/* force_flush = */ false,
&out_bytes, &output)) {
return false;
}
if (out_bytes > 0 && !out->Write(output, out_bytes)) {
return false;
}
}
return true;
} |
BrotliParams should be fine; I am already interfacing with that struct as C.struct_BrotliParams from Go. |
cf. itchio@652b90e to see what I mean — from
And before my changes, both contained a (conflicting) declaration for |
Ignore me, I forgot that the original Params object was a C++ struct |
@kothar I think we can probably still do that (I definitely understand the appeal of being able to just copy-paste a bit of the brotli header and hand-modify it as little as possible), I'll definitely look into it. edit: it's probably as easy as having As of itchio@888315f the enc streaming interface seems to work well, still looking into finalizers + documenting it properly, then I'll work on the decoder interface I suppose |
Realizing now my streaming test isn't so good, because it only copies to ringbuffer once (4.1MB of sample data < 4.3MB ring buffer), adjusting with a smaller lgwin. |
@kothar Is there any reason why 11 is the default quality? And that it's also used in the encoding test? The upstream tests use 1, 6, 9, 11. I was thinking of using a lower quality for the streaming encode test because ~6s on my 2014 MBP is a bit of a steep price to pay for such a test :( lowering to q=2 brings the entire test down to 0.13s |
My contribution: stream decompression: #6 |
That appeared to be default in the BrotliParams initialiser, so I left it as is. Feel free to use something else for testing. I imagine most people compressing with Brotli will want something like level 9 for most stuff, or 11 if they are pre-compressing archives. |
@kothar in my limited experience with brotli compression, 9 and 10/11 are completely different orders of magnitude, so even changing to 9 would be a nice speedup imho |
Ah, having an |
It looks like it should be fairly easy to wrap your compressor with a Writer. I guess you'd need a small buffer which gets flushed through the compressor when full, and a final block when the stream is closed. |
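A sketch of that buffering scheme, under stated assumptions: `blockCompressor`, `Writer`, `NewWriter`, `tagCompressor` and `frame` are hypothetical names, and `tagCompressor` merely brackets each block so the block boundaries are visible. It is not the brotli encoder.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// blockCompressor is a hypothetical interface over a block-based encoder.
type blockCompressor interface {
	compressBlock(block []byte, final bool) ([]byte, error)
}

// Writer buffers input and pushes one block at a time through the compressor.
type Writer struct {
	c      blockCompressor
	dst    io.Writer
	buf    []byte // pending input; cap(buf) is the block size
	closed bool
}

func NewWriter(dst io.Writer, c blockCompressor, blockSize int) *Writer {
	if blockSize < 1 {
		blockSize = 1 // avoid a zero-capacity buffer that could never fill
	}
	return &Writer{c: c, dst: dst, buf: make([]byte, 0, blockSize)}
}

// Write copies p into the buffer and flushes each time the buffer fills up.
func (w *Writer) Write(p []byte) (int, error) {
	if w.closed {
		return 0, errors.New("write after close")
	}
	written := 0
	for len(p) > 0 {
		n := copy(w.buf[len(w.buf):cap(w.buf)], p)
		w.buf = w.buf[:len(w.buf)+n]
		p = p[n:]
		written += n
		if len(w.buf) == cap(w.buf) {
			if err := w.flush(false); err != nil {
				return written, err
			}
		}
	}
	return written, nil
}

// flush sends the buffered bytes through the compressor and resets the buffer.
func (w *Writer) flush(final bool) error {
	out, err := w.c.compressBlock(w.buf, final)
	w.buf = w.buf[:0]
	if err != nil {
		return err
	}
	_, err = w.dst.Write(out)
	return err
}

// Close emits whatever is left as the final block.
func (w *Writer) Close() error {
	if w.closed {
		return nil
	}
	w.closed = true
	return w.flush(true)
}

// tagCompressor is a stand-in: it wraps each block in brackets and marks
// the final block with '!'.
type tagCompressor struct{}

func (tagCompressor) compressBlock(b []byte, final bool) ([]byte, error) {
	out := append([]byte{'['}, b...)
	out = append(out, ']')
	if final {
		out = append(out, '!')
	}
	return out, nil
}

// frame runs input through a Writer and returns what reached the destination.
func frame(input string, blockSize int) string {
	var dst bytes.Buffer
	w := NewWriter(&dst, tagCompressor{}, blockSize)
	w.Write([]byte(input))
	w.Close()
	return dst.String()
}

func main() {
	fmt.Println(frame("abcdefgh", 3))
}
```

Note that closing right after a full block produces an empty final block in this sketch; whether the real encoder should do the same is left open here.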
Regarding SliceHeader - I was advised never to use it! https://groups.google.com/d/msg/golang-nuts/_qu3u9HDVK8/crR7aNkRCwAJ The advice seems to be to only allocate buffers on the Go side, and then pass them as an argument for writing to them. |
Ah, I didn't know that SliceHeader was to be avoided. Looking at |
Well the compressor has its own (ring) buffer, so we can just keep track of how much we've already copied into it, and as soon as we reach the input block size (or the BrotliWriter is closed), then we call What would be the purpose of the final block? I see
|
Re
What do you think? |
Answering my own question, I think the |
Encode parallel uses a buffer of 2x input + 500 as a temporary output buffer for each block, and falls back to an uncompressed block if it runs out of space. That seems to be because each input byte could theoretically be encoded as 2 bytes, plus the metablock header. TBH, I can't think of a reason to even allocate 2x input in that case, since anything larger than the input + metablock header isn't worth compressing. https://github.com/kothar/brotli-go/blob/master/enc/encode_parallel.cc#L154 |
Right, but if I'm using brotli as part of, say, a network protocol, and for some one-in-a-thousand cases the compressed output is larger than the uncompressed version I'd still want the pipeline to remain working instead of it randomly throwing an error and me having to code a backup |
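One way to keep the pipeline working in that case is to fall back to a stored block whenever compression does not help, so the output is never more than a fixed overhead larger than the input. The sketch below illustrates the idea with the standard library's `compress/flate` as a stand-in for brotli; the one-byte tag framing and the names `encodeBlock`, `decodeBlock`, `tagStored` and `tagDeflate` are assumptions made for this example, not part of brotli or brotli-go.

```go
package main

import (
	"bytes"
	"compress/flate"
	"errors"
	"fmt"
	"io"
)

// Hypothetical one-byte tags marking how a block was encoded.
const (
	tagStored  byte = 0
	tagDeflate byte = 1
)

// encodeBlock compresses block, but stores it verbatim when compression
// would not make it smaller. The result is at most len(block)+1 bytes.
func encodeBlock(block []byte) ([]byte, error) {
	var buf bytes.Buffer
	buf.WriteByte(tagDeflate)
	zw, err := flate.NewWriter(&buf, flate.BestCompression)
	if err != nil {
		return nil, err
	}
	if _, err := zw.Write(block); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	if buf.Len()-1 >= len(block) {
		// Compression did not help: emit the block uncompressed instead.
		out := make([]byte, 0, len(block)+1)
		out = append(out, tagStored)
		return append(out, block...), nil
	}
	return buf.Bytes(), nil
}

// decodeBlock reverses encodeBlock by dispatching on the tag byte.
func decodeBlock(enc []byte) ([]byte, error) {
	if len(enc) == 0 {
		return nil, errors.New("empty block")
	}
	switch enc[0] {
	case tagStored:
		return append([]byte(nil), enc[1:]...), nil
	case tagDeflate:
		zr := flate.NewReader(bytes.NewReader(enc[1:]))
		defer zr.Close()
		return io.ReadAll(zr)
	default:
		return nil, fmt.Errorf("unknown block tag %d", enc[0])
	}
}

func main() {
	small, _ := encodeBlock([]byte("a"))
	big, _ := encodeBlock(make([]byte, 1000))
	fmt.Println(len(small), small[0], len(big), big[0])
}
```

With this kind of fallback the caller never has to handle an "output larger than buffer" error: a buffer of input size plus the framing overhead is always enough.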
Note to self: drop the edit: Hmm but then it'd be |
cf. #7 btw (just so the PR and the issue are cross-referenced) |
closing as PR is available |
I'm currently looking into supporting BrotliDecompressStream and its compress equivalent, and I had a simple question: the README mentions the need to wrap some C++ structures, but (as of 10 days ago), BrotliDecompressStream & friends seem to be usable purely from the C interface?
It looks relatively easy to do but I'm a cgo noob so it might take me a day or two.
(unsafe.)Pointers welcome!
P.S: do tell me if this issue is unwelcome noise on the issue tracker, in which case I'll try & get it working in my corner and come back with a PR later — just thought I'd deduplicate work in case someone is already on it!