feat: implement Apache Avro encoder with schema validation #42
Co-authored-by: streamich <9773803+streamich@users.noreply.github.com>
src/avro/AvroEncoder.ts
Outdated
    this.writer.ensureCapacity(4);
    this.writer.view.setFloat32(this.writer.x, float, true); // little-endian
    this.writer.move(4);
De-reference this.writer only once:
const writer = this.writer;
Do it everywhere in the code: de-reference pointers only once, where possible. Performance is the number one feature of this library.
Done. Optimized all performance-critical methods to cache this.writer to local variables to reduce property access overhead. Commit d6842c4
Implemented throughout the codebase. All writer access patterns in hot paths now use local variable caching for better performance. Commit d6842c4
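A minimal sketch of the pattern requested in this thread, applied to the float write from the diff. `MiniWriter` and `FloatEncoderSketch` are hypothetical stand-ins, reduced to the `ensureCapacity`, `view`, `x`, and `move` members that appear in the reviewed snippet; the library's real Writer may differ.

```typescript
// Hypothetical stand-in for the library's Writer; only the members used in
// the reviewed snippet are modeled here.
class MiniWriter {
  public uint8 = new Uint8Array(16);
  public view = new DataView(this.uint8.buffer);
  public x = 0;

  // Grows the buffer when fewer than `capacity` bytes remain.
  public ensureCapacity(capacity: number): void {
    if (this.x + capacity <= this.uint8.length) return;
    const grown = new Uint8Array(Math.max(this.uint8.length * 2, this.x + capacity));
    grown.set(this.uint8);
    this.uint8 = grown;
    this.view = new DataView(grown.buffer);
  }

  // Advances the write cursor.
  public move(length: number): void {
    this.x += length;
  }
}

class FloatEncoderSketch {
  constructor(public readonly writer: MiniWriter) {}

  public writeFloat(float: number): void {
    const writer = this.writer; // De-reference once, then use the local.
    writer.ensureCapacity(4);
    writer.view.setFloat32(writer.x, float, true); // little-endian
    writer.move(4);
  }
}

const floatEncoder = new FloatEncoderSketch(new MiniWriter());
floatEncoder.writeFloat(1.5);
// 1.5 as a 32-bit float is 0x3fc00000, so the little-endian bytes are 00 00 c0 3f.
const floatBytes = Array.from(floatEncoder.writer.uint8.subarray(0, floatEncoder.writer.x));
```

Note that `writer.view` is read after `ensureCapacity`, because growing the buffer replaces the view in this sketch.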
src/avro/AvroEncoder.ts
Outdated
     * Writes an Avro string value with UTF-8 encoding and length prefix.
     */
    public writeStr(str: string): void {
      const bytes = new TextEncoder().encode(str);
DO NOT use TextEncoder; it is very slow. The Writer has fast string encoding methods. Alternatively, see other codec implementations.
Replaced TextEncoder with Writer's native utf8() method for much faster string encoding. The implementation now reserves space for the length prefix, writes the string, then fills in the actual byte count. Commit d6842c4
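The reserve, write, then back-fill approach described in this reply can be sketched as follows. This is an illustration under stated assumptions, not the committed code: `writeUtf8` is a hand-rolled loop standing in for the Writer's `utf8()` method (whose signature is not shown in this thread), and `encodeAvroStr` and `MAX_PREFIX` are hypothetical names. Avro encodes a string as a zigzag varint byte length followed by the UTF-8 bytes.

```typescript
// Upper bound on the size of a varint-encoded length prefix in this sketch.
const MAX_PREFIX = 5;

// Writes `str` as UTF-8 into `out` starting at `x`; returns the bytes written.
// Lone surrogates are emitted as 3-byte sequences rather than replaced.
function writeUtf8(out: Uint8Array, x: number, str: string): number {
  const start = x;
  const length = str.length;
  for (let i = 0; i < length; i++) {
    const cp = str.codePointAt(i) as number;
    if (cp < 0x80) {
      out[x++] = cp;
    } else if (cp < 0x800) {
      out[x++] = 0xc0 | (cp >> 6);
      out[x++] = 0x80 | (cp & 0x3f);
    } else if (cp < 0x10000) {
      out[x++] = 0xe0 | (cp >> 12);
      out[x++] = 0x80 | ((cp >> 6) & 0x3f);
      out[x++] = 0x80 | (cp & 0x3f);
    } else {
      i++; // The code point used two UTF-16 code units.
      out[x++] = 0xf0 | (cp >> 18);
      out[x++] = 0x80 | ((cp >> 12) & 0x3f);
      out[x++] = 0x80 | ((cp >> 6) & 0x3f);
      out[x++] = 0x80 | (cp & 0x3f);
    }
  }
  return x - start;
}

// Encodes an Avro string: zigzag varint byte length, then the UTF-8 bytes.
function encodeAvroStr(str: string): Uint8Array {
  // At most 3 bytes are needed per UTF-16 code unit.
  const out = new Uint8Array(MAX_PREFIX + str.length * 3);
  // 1. Reserve space for the prefix and write the payload after it.
  const byteLength = writeUtf8(out, MAX_PREFIX, str);
  // 2. The byte count is now known; zigzag maps a non-negative n to 2n.
  let zigzag = byteLength * 2;
  let size = 1;
  for (let v = zigzag; v > 0x7f; v = Math.floor(v / 128)) size++;
  // 3. Fill the prefix in directly before the payload.
  const begin = MAX_PREFIX - size;
  let x = begin;
  while (zigzag > 0x7f) {
    out[x++] = (zigzag & 0x7f) | 0x80;
    zigzag = Math.floor(zigzag / 128);
  }
  out[x] = zigzag;
  return out.subarray(begin, MAX_PREFIX + byteLength);
}

// "foo" is 3 bytes long, so the prefix is zigzag(3) = 6: bytes 06 66 6f 6f.
const fooBytes = Array.from(encodeAvroStr('foo'));
```

Right-aligning the prefix against the payload avoids shifting the string bytes when the prefix turns out shorter than the reserved space; the cost is a few unused leading bytes in the scratch buffer.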
src/avro/AvroEncoder.ts
Outdated
    public writeObj(obj: Record<string, unknown>): void {
      const entries = Object.entries(obj);
      this.writeVarIntUnsigned(entries.length);
      for (const [key, value] of entries) {
Do not use a for-of loop; use a plain for loop, it is faster.
Converted all for-of loops to plain for loops throughout both AvroEncoder and AvroSchemaEncoder for better performance. Commit d6842c4
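A small sketch of the loop conversion discussed in this thread. `forEachEntry` is a hypothetical helper, not part of the library. It iterates `Object.keys` with an index instead of destructuring `Object.entries` in a for-of loop, which also avoids allocating a two-element array per entry; how much this helps depends on the JavaScript engine.

```typescript
// Visits every own enumerable key/value pair using a plain indexed loop.
function forEachEntry(obj: Record<string, unknown>, visit: (key: string, value: unknown) => void): void {
  const keys = Object.keys(obj);
  const length = keys.length; // Read the length once, outside the loop.
  for (let i = 0; i < length; i++) {
    const key = keys[i];
    visit(key, obj[key]);
  }
}

// Usage: collect the pairs in iteration order.
const seenEntries: string[] = [];
forEachEntry({a: 1, b: 'x'}, (key, value) => {
  seenEntries.push(key + '=' + String(value));
});
```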
    /**
     * Writes an ASCII string (same as regular string in Avro)
     */
    public writeAsciiStr(str: string): void {
The Writer instance should have a way to write .ascii() strings, see other code for reference.
Updated writeAsciiStr to use Writer's ascii() method for faster ASCII string encoding instead of the generic UTF-8 path. Commit d6842c4
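To illustrate why an ASCII path can be simpler than the generic UTF-8 path: every ASCII character occupies exactly one byte, so the byte length equals `str.length` and the length prefix can be written up front with no reservation or back-filling. The sketch below is an assumption-laden illustration; `encodeAvroAsciiStr` is a hypothetical name, and the per-character loop stands in for the Writer's `ascii()` method, whose implementation is not shown in this thread.

```typescript
// Encodes an ASCII-only string as an Avro string: zigzag varint length, then
// one byte per character. The caller must guarantee the input is ASCII;
// other characters would be truncated to their low 8 bits.
function encodeAvroAsciiStr(str: string): Uint8Array {
  const length = str.length;
  const out = new Uint8Array(5 + length); // 5 bytes is enough for the prefix.
  let x = 0;
  let zigzag = length * 2; // Zigzag maps a non-negative n to 2n.
  while (zigzag > 0x7f) {
    out[x++] = (zigzag & 0x7f) | 0x80;
    zigzag = Math.floor(zigzag / 128);
  }
  out[x++] = zigzag;
  for (let i = 0; i < length; i++) out[x++] = str.charCodeAt(i);
  return out.subarray(0, x);
}

const asciiBytes = Array.from(encodeAvroAsciiStr('foo'));
```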
src/avro/AvroSchemaEncoder.ts
Outdated
    /**
     * Writes a null value with schema validation.
     */
    public writeNull(schema: AvroSchema): void {
In these primitive encoding methods, we should specify the specific schema type, no? In this case AvroNullSchema.
Added specific AvroNullSchema type to the writeNull method signature and imported the type for better type safety. Commit d6842c4
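A sketch of what the narrower signature buys. The schema interfaces below are simplified, hypothetical shapes for illustration; the library's actual `AvroNullSchema` definition is not shown in this thread and may carry more fields. With a specific parameter type, passing the wrong schema becomes a compile-time error instead of a runtime check.

```typescript
// Hypothetical, simplified schema shapes for illustration only.
interface AvroNullSchema {
  type: 'null';
}
interface AvroBooleanSchema {
  type: 'boolean';
}

class SchemaEncoderSketch {
  public readonly bytes: number[] = [];

  // Accepts only the null schema; a boolean schema fails to type-check.
  public writeNull(schema: AvroNullSchema): void {
    // Runtime guard for untyped JavaScript callers.
    if (schema.type !== 'null') throw new Error('INVALID_SCHEMA');
    // Avro encodes null as zero bytes, so nothing is written.
  }

  public writeBoolean(schema: AvroBooleanSchema, value: boolean): void {
    if (schema.type !== 'boolean') throw new Error('INVALID_SCHEMA');
    this.bytes.push(value ? 1 : 0); // Avro booleans are a single byte, 0 or 1.
  }
}

const typedEncoder = new SchemaEncoderSketch();
typedEncoder.writeNull({type: 'null'});
typedEncoder.writeBoolean({type: 'boolean'}, true);
// typedEncoder.writeNull({type: 'boolean'}); // Rejected by the compiler.
```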
streamich left a comment
When done, make sure tests pass, linter and formatter pass.
All checks now pass: Fixed formatting issues in the Avro implementation files. Commit 6b9bd54
🎉 This PR is included in version 1.6.0 🎉
Your semantic-release bot 📦🚀
This PR implements a complete Apache Avro binary encoder for the json-pack library, following the Apache Avro 1.12.0 specification.
Implementation
The implementation includes three main components:
1. AvroSchemaValidator
A comprehensive schema validator that validates Avro schemas and ensures values conform to their schemas:
2. AvroEncoder
A basic Avro binary encoder implementing the BinaryJsonEncoder interface for encoding values without schema validation:
3. AvroSchemaEncoder
A schema-aware encoder that validates values against schemas before encoding:
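The validate-then-encode flow described for the schema-aware encoder might look roughly like the sketch below. It covers only two primitive types, and `PrimitiveSchema`, `validateValue`, and `encodeValue` are hypothetical names rather than this PR's API. Avro ints are written as zigzag-encoded varints and booleans as a single byte.

```typescript
// Hypothetical schema subset for illustration.
type PrimitiveSchema = {type: 'boolean'} | {type: 'int'};

// Returns true when `value` conforms to `schema`.
function validateValue(schema: PrimitiveSchema, value: unknown): boolean {
  switch (schema.type) {
    case 'boolean':
      return typeof value === 'boolean';
    case 'int':
      return (
        typeof value === 'number' &&
        Number.isInteger(value) &&
        value >= -2147483648 &&
        value <= 2147483647
      );
    default:
      return false;
  }
}

// Validates first, then encodes; throws on a schema mismatch.
function encodeValue(schema: PrimitiveSchema, value: unknown): number[] {
  if (!validateValue(schema, value)) throw new Error('INVALID_VALUE');
  if (schema.type === 'boolean') return [value ? 1 : 0];
  // Zigzag maps signed ints to unsigned: 0, -1, 1, -2 become 0, 1, 2, 3.
  const int = value as number;
  let zigzag = ((int << 1) ^ (int >> 31)) >>> 0;
  const out: number[] = [];
  while (zigzag > 0x7f) {
    out.push((zigzag & 0x7f) | 0x80); // Low 7 bits plus a continuation flag.
    zigzag >>>= 7;
  }
  out.push(zigzag);
  return out;
}

const sixtyFour = encodeValue({type: 'int'}, 64); // bytes 80 01
```

Calling `encodeValue({type: 'int'}, 'x')` throws before any bytes are produced, which is the point of validating ahead of encoding.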
Key Features
Testing
Added 112 comprehensive test cases covering:
All existing tests continue to pass, ensuring no regressions.
Fixes #41.