-
Notifications
You must be signed in to change notification settings - Fork 12
feat(NODE-6540): Add c++ zstd compression API #30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
21 commits
Select commit
Hold shift + click to select a range
3290ded
all changes - initial POC
baileympearson 3a3e127
misc last changes
baileympearson 224bb06
lint
baileympearson 46509f4
remove extra stuff from test
baileympearson 695581c
include gaurds
baileympearson 7b1e0e6
use builtin types
baileympearson 6dc43ad
fix typo
baileympearson 95b9629
fix compilation in CI
baileympearson 0d90e9e
comments'
baileympearson bb88035
Compression namespace, no implementation in header files
baileympearson 251003c
migrate away from creating promise in C++
baileympearson 04030a0
other comments
baileympearson e795b4b
simplify install in test
baileympearson 8d6c390
fix lint
baileympearson 2b4dcd8
fix compilation
baileympearson 3c5448f
last comment
baileympearson 0e59111
last comment
baileympearson 1abda0f
fix lnt
baileympearson 5375216
changes
baileympearson f6863e6
comments
baileympearson aba799c
Anna"s comment
baileympearson File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
name: Lint | ||
|
||
on: | ||
push: | ||
branches: [ "main" ] | ||
pull_request: | ||
branches: [ "main" ] | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
|
||
name: ${{ matrix.lint-target }} | ||
strategy: | ||
matrix: | ||
lint-target: ["c++", "typescript"] | ||
|
||
steps: | ||
- uses: actions/checkout@v4 | ||
|
||
- name: Use Node.js LTS | ||
uses: actions/setup-node@v4 | ||
with: | ||
node-version: 'lts/*' | ||
cache: 'npm' | ||
|
||
- name: Install dependencies | ||
shell: bash | ||
run: npm i --ignore-scripts | ||
|
||
- if: matrix.lint-target == 'c++' | ||
shell: bash | ||
run: | | ||
npm run check:clang-format | ||
- if: matrix.lint-target == 'typescript' | ||
shell: bash | ||
run: | | ||
npm run check:eslint |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,3 +18,4 @@ node_modules | |
build | ||
|
||
npm-debug.log | ||
deps |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
#include "compression.h" | ||
|
||
std::vector<uint8_t> mongodb_zstd::compress(const std::vector<uint8_t>& data, | ||
size_t compression_level) { | ||
size_t output_buffer_size = ZSTD_compressBound(data.size()); | ||
std::vector<uint8_t> output(output_buffer_size); | ||
|
||
size_t result_code = | ||
ZSTD_compress(output.data(), output.size(), data.data(), data.size(), compression_level); | ||
|
||
if (ZSTD_isError(result_code)) { | ||
throw std::runtime_error(ZSTD_getErrorName(result_code)); | ||
} | ||
|
||
output.resize(result_code); | ||
|
||
return output; | ||
} | ||
|
||
std::vector<uint8_t> mongodb_zstd::decompress(const std::vector<uint8_t>& compressed) { | ||
std::vector<uint8_t> decompressed; | ||
|
||
using DCTX_Deleter = void (*)(ZSTD_DCtx*); | ||
|
||
std::unique_ptr<ZSTD_DCtx, DCTX_Deleter> decompression_context( | ||
ZSTD_createDCtx(), [](ZSTD_DCtx* ctx) { ZSTD_freeDCtx(ctx); }); | ||
|
||
ZSTD_inBuffer input = {compressed.data(), compressed.size(), 0}; | ||
std::vector<uint8_t> output_buffer(ZSTD_DStreamOutSize()); | ||
ZSTD_outBuffer output = {output_buffer.data(), output_buffer.size(), 0}; | ||
|
||
// Source: https://facebook.github.io/zstd/zstd_manual.html#Chapter9 | ||
// | ||
// Use ZSTD_decompressStream() repetitively to consume your input. | ||
// The function will update both `pos` fields. | ||
// If `input.pos < input.size`, some input has not been consumed. | ||
// It's up to the caller to present again remaining data. | ||
// The function tries to flush all data decoded immediately, respecting output buffer size. | ||
// If `output.pos < output.size`, decoder has flushed everything it could. | ||
// But if `output.pos == output.size`, there might be some data left within internal buffers., | ||
// In which case, call ZSTD_decompressStream() again to flush whatever remains in the buffer. | ||
// Note : with no additional input provided, amount of data flushed is necessarily <= | ||
// ZSTD_BLOCKSIZE_MAX. | ||
// @return : 0 when a frame is completely decoded and fully flushed, | ||
// or an error code, which can be tested using ZSTD_isError(), | ||
// or any other value > 0, which means there is still some decoding or flushing to do to | ||
// complete current frame : | ||
// the return value is a suggested next input size (just a hint | ||
// for better latency) that will never request more than the | ||
// remaining frame size. | ||
auto inputRemains = [](const ZSTD_inBuffer& input) { return input.pos < input.size; }; | ||
auto isOutputBufferFlushed = [](const ZSTD_outBuffer& output) { | ||
return output.pos < output.size; | ||
}; | ||
|
||
while (inputRemains(input) || !isOutputBufferFlushed(output)) { | ||
size_t const ret = ZSTD_decompressStream(decompression_context.get(), &output, &input); | ||
if (ZSTD_isError(ret)) { | ||
throw std::runtime_error(ZSTD_getErrorName(ret)); | ||
} | ||
|
||
size_t decompressed_size = decompressed.size(); | ||
decompressed.resize(decompressed_size + output.pos); | ||
std::copy(output_buffer.data(), | ||
output_buffer.data() + output.pos, | ||
decompressed.data() + decompressed_size); | ||
|
||
// move the position back go 0, to indicate that we are ready for more data | ||
output.pos = 0; | ||
} | ||
|
||
return decompressed; | ||
} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
#ifndef MONGODB_ZSTD_COMPRESSION | ||
#define MONGODB_ZSTD_COMPRESSION | ||
|
||
#include <exception> | ||
#include <vector> | ||
|
||
#include "compression_worker.h" | ||
#include "zstd.h" | ||
|
||
namespace mongodb_zstd { | ||
std::vector<uint8_t> compress(const std::vector<uint8_t>& data, size_t compression_level); | ||
std::vector<uint8_t> decompress(const std::vector<uint8_t>& data); | ||
} // namespace mongodb_zstd | ||
|
||
#endif |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
#ifndef MONGODB_ZSTD_COMPRESSION_WORKER_H | ||
#define MONGODB_ZSTD_COMPRESSION_WORKER_H | ||
#include <napi.h> | ||
|
||
#include <optional> | ||
#include <variant> | ||
|
||
using namespace Napi; | ||
addaleax marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
namespace mongodb_zstd { | ||
/** | ||
* @brief An asynchronous Napi::Worker that can be with any function that produces | ||
* CompressionResults. | ||
* */ | ||
class CompressionWorker final : public AsyncWorker { | ||
public: | ||
CompressionWorker(const Function& callback, std::function<std::vector<uint8_t>()> worker) | ||
: AsyncWorker{callback, "compression worker"}, m_worker(worker), m_result{} {} | ||
|
||
protected: | ||
void Execute() final { | ||
W-A-James marked this conversation as resolved.
Show resolved
Hide resolved
|
||
m_result = m_worker(); | ||
} | ||
|
||
void OnOK() final { | ||
if (!m_result.has_value()) { | ||
Callback().Call({Error::New(Env(), | ||
"zstd runtime - async worker finished without " | ||
"a compression or decompression result.") | ||
.Value()}); | ||
return; | ||
} | ||
addaleax marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
const std::vector<uint8_t>& data = m_result.value(); | ||
Buffer result = Buffer<uint8_t>::Copy(Env(), data.data(), data.size()); | ||
|
||
Callback().Call({Env().Undefined(), result}); | ||
} | ||
|
||
private: | ||
std::function<std::vector<uint8_t>()> m_worker; | ||
std::optional<std::vector<uint8_t>> m_result; | ||
}; | ||
|
||
} // namespace mongodb_zstd | ||
#endif |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,60 @@ | ||
#include "zstd.h" | ||
|
||
#include <napi.h> | ||
|
||
#include <vector> | ||
|
||
#include "compression.h" | ||
#include "compression_worker.h" | ||
|
||
using namespace Napi; | ||
|
||
Napi::String Compress(const Napi::CallbackInfo& info) { | ||
auto string = Napi::String::New(info.Env(), "compress()"); | ||
return string; | ||
namespace mongodb_zstd { | ||
void Compress(const CallbackInfo& info) { | ||
// Argument handling happens in JS | ||
if (info.Length() != 3) { | ||
const char* error_message = "Expected three arguments."; | ||
throw TypeError::New(info.Env(), error_message); | ||
} | ||
|
||
Uint8Array to_compress = info[0].As<Uint8Array>(); | ||
std::vector<uint8_t> data(to_compress.Data(), to_compress.Data() + to_compress.ElementLength()); | ||
|
||
size_t compression_level = static_cast<size_t>(info[1].ToNumber().Int32Value()); | ||
Function callback = info[2].As<Function>(); | ||
|
||
CompressionWorker* worker = | ||
new CompressionWorker(callback, [data = std::move(data), compression_level] { | ||
return mongodb_zstd::compress(data, compression_level); | ||
}); | ||
|
||
worker->Queue(); | ||
} | ||
Napi::String Decompress(const Napi::CallbackInfo& info) { | ||
auto string = Napi::String::New(info.Env(), "decompress()"); | ||
return string; | ||
|
||
void Decompress(const CallbackInfo& info) { | ||
// Argument handling happens in JS | ||
if (info.Length() != 2) { | ||
const char* error_message = "Expected two argument."; | ||
throw TypeError::New(info.Env(), error_message); | ||
} | ||
|
||
Uint8Array compressed_data = info[0].As<Uint8Array>(); | ||
std::vector<uint8_t> data(compressed_data.Data(), | ||
compressed_data.Data() + compressed_data.ElementLength()); | ||
Function callback = info[1].As<Function>(); | ||
|
||
CompressionWorker* worker = new CompressionWorker( | ||
callback, [data = std::move(data)] { return mongodb_zstd::decompress(data); }); | ||
|
||
worker->Queue(); | ||
} | ||
|
||
Napi::Object Init(Napi::Env env, Napi::Object exports) { | ||
exports.Set(Napi::String::New(env, "compress"), Napi::Function::New(env, Compress)); | ||
exports.Set(Napi::String::New(env, "decompress"), Napi::Function::New(env, Decompress)); | ||
Object Init(Env env, Object exports) { | ||
exports.Set(String::New(env, "compress"), Function::New(env, Compress)); | ||
exports.Set(String::New(env, "decompress"), Function::New(env, Decompress)); | ||
return exports; | ||
} | ||
|
||
NODE_API_MODULE(zstd, Init) | ||
|
||
} // namespace mongodb_zstd |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
#!/bin/sh | ||
set -o xtrace | ||
|
||
clean_deps() { | ||
rm -rf deps | ||
} | ||
|
||
download_zstd() { | ||
rm -rf deps | ||
mkdir -p deps/zstd | ||
|
||
curl -L "https://github.com/facebook/zstd/releases/download/v1.5.6/zstd-1.5.6.tar.gz" \ | ||
| tar -zxf - -C deps/zstd --strip-components 1 | ||
} | ||
|
||
build_zstd() { | ||
export MACOSX_DEPLOYMENT_TARGET=11 | ||
cd deps/zstd/build/cmake | ||
|
||
cmake . | ||
make | ||
} | ||
|
||
clean_deps | ||
download_zstd | ||
build_zstd |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.