@jerry-024 jerry-024 commented Oct 13, 2025

Purpose

Add support for the blob type, including blob write and read.

Tests

  • blob_test.BlobTest
  • blob_test.BlobEndToEndTest

API and Format

Documentation

@jerry-024 jerry-024 force-pushed the python_support_blob_type branch 2 times, most recently from ed4c2fb to a4de958 Compare October 14, 2025 02:34
@jerry-024 jerry-024 force-pushed the python_support_blob_type branch from a4de958 to d0eadd9 Compare October 14, 2025 02:40
@jerry-024 jerry-024 changed the title support blob [python] support blob type and blob write and read Oct 14, 2025
@jerry-024 jerry-024 requested a review from Copilot October 14, 2025 02:54
@jerry-024 jerry-024 marked this pull request as draft October 14, 2025 02:55
Copilot AI left a comment

Pull Request Overview

This PR adds support for the BLOB (Binary Large Object) data type to the Python Paimon library, covering data structures, I/O operations, and format handling.

  • Implements BLOB data type with BlobData and BlobRef classes for in-memory and reference-based storage
  • Adds blob-specific file format writer and reader with compression and indexing
  • Integrates BLOB support into existing serialization, type conversion, and file I/O systems
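
To make the in-memory versus reference-based distinction concrete, here is a minimal sketch. It is not the PR's actual `BlobData`/`BlobRef` API: the class names `Blob`, `InMemoryBlob`, and `FileRangeBlob`, the `to_bytes` method, and the `(path, offset, length)` reference layout are all hypothetical, chosen only to illustrate the idea.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Blob(ABC):
    """Hypothetical common interface: every blob can produce its bytes."""

    @abstractmethod
    def to_bytes(self) -> bytes:
        ...


@dataclass(frozen=True)
class InMemoryBlob(Blob):
    """Holds the payload directly in memory (assumed analogue of BlobData)."""

    data: bytes

    def to_bytes(self) -> bytes:
        return self.data


@dataclass(frozen=True)
class FileRangeBlob(Blob):
    """Points at a byte range in a file and reads it lazily
    (assumed analogue of BlobRef; the real reference format may differ)."""

    path: str
    offset: int
    length: int

    def to_bytes(self) -> bytes:
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            return f.read(self.length)


if __name__ == "__main__":
    # Write a small file, then read a slice of it through a reference.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"headerPAYLOADtail")
    try:
        ref = FileRangeBlob(tmp.name, offset=6, length=7)
        assert ref.to_bytes() == InMemoryBlob(b"PAYLOAD").to_bytes()
    finally:
        os.remove(tmp.name)
```

The point of the shared interface is that callers such as a row serializer can treat both kinds uniformly, while the reference variant avoids loading large payloads until they are needed.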

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| blob.py | Core BLOB data structures and interface definitions |
| blob_format_writer.py | Writer for Paimon's blob file format with compression |
| format_blob_reader.py | Reader for blob files with decompression and indexing |
| generic_row.py | Serialization support for BLOB fields |
| data_types.py | Type system integration for BLOB |
| file_io.py | File I/O operations for blob format |
| delta_varint_compressor.py | Compression utility for blob index data |
| core_options.py | Configuration constant for blob format |
| split_read.py | Integration of blob reader into split reading |
| blob_test.py | Comprehensive test suite for all blob functionality |
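
For readers unfamiliar with the technique behind `delta_varint_compressor.py`, the following is a generic sketch of delta plus varint encoding for a list of integers such as blob lengths. It illustrates the general idea only. The function names, the zigzag mapping for negative deltas, and the byte layout are assumptions and may not match the PR's implementation or Paimon's on-disk format.

```python
from typing import List


def _zigzag(n: int) -> int:
    # Map signed to unsigned so small negatives stay small: 0,-1,1,-2 -> 0,1,2,3
    return n * 2 if n >= 0 else -n * 2 - 1


def _unzigzag(z: int) -> int:
    return z >> 1 if z % 2 == 0 else -((z + 1) >> 1)


def compress(values: List[int]) -> bytes:
    """Store each value as the varint-encoded difference from its predecessor."""
    out = bytearray()
    prev = 0
    for v in values:
        z = _zigzag(v - prev)
        prev = v
        # Varint: 7 payload bits per byte, high bit set means "more bytes follow".
        while z >= 0x80:
            out.append((z & 0x7F) | 0x80)
            z >>= 7
        out.append(z)
    return bytes(out)


def decompress(data: bytes) -> List[int]:
    """Invert compress(): decode varints, undo zigzag, accumulate deltas."""
    values: List[int] = []
    prev = 0
    acc = 0
    shift = 0
    for byte in data:
        acc |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
            continue
        prev += _unzigzag(acc)
        values.append(prev)
        acc = 0
        shift = 0
    if shift:
        raise ValueError("truncated varint at end of input")
    return values
```

Delta encoding pays off when neighbouring values are close together, because the differences are small and each one fits in a single varint byte.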

@jerry-024 jerry-024 force-pushed the python_support_blob_type branch from 25f7cfe to 9a9ce9c Compare October 14, 2025 03:17
@jerry-024 jerry-024 requested a review from Copilot October 14, 2025 03:17
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.


Comment on lines 62 to 63
bin_length = self.position - previous_pos + 12
self.lengths.append(bin_length)
Copilot AI Oct 14, 2025

The magic number 12 represents the combined size of length (8 bytes) and CRC (4 bytes) fields. Consider defining this as a named constant like METADATA_SIZE = 12 to improve code clarity and maintainability.
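
A minimal sketch of what this suggestion could look like. Only the 8-byte length plus 4-byte CRC breakdown comes from the comment above. The `RecordWriter` class, the trailer field order, the little-endian byte order, and the use of `zlib.crc32` are assumptions for illustration and may differ from the actual writer.

```python
import io
import struct
import zlib

LENGTH_FIELD_SIZE = 8  # bytes used by the length field
CRC_FIELD_SIZE = 4     # bytes used by the CRC field
METADATA_SIZE = LENGTH_FIELD_SIZE + CRC_FIELD_SIZE  # replaces the literal 12


class RecordWriter:
    """Hypothetical writer: payload followed by a (length, crc) trailer."""

    def __init__(self, out):
        self.out = out
        self.position = 0
        self.lengths = []

    def write(self, payload: bytes) -> None:
        previous_pos = self.position
        self.out.write(payload)
        self.position += len(payload)

        # Same arithmetic as the reviewed lines, with the constant named.
        bin_length = self.position - previous_pos + METADATA_SIZE
        self.lengths.append(bin_length)

        # Assumed trailer layout: signed 64-bit length, unsigned 32-bit CRC.
        self.out.write(struct.pack("<qI", bin_length, zlib.crc32(payload)))
        self.position += METADATA_SIZE


if __name__ == "__main__":
    buf = io.BytesIO()
    writer = RecordWriter(buf)
    writer.write(b"hello")
    print(writer.lengths)  # one entry: payload size plus METADATA_SIZE
```

Deriving `METADATA_SIZE` from the two field sizes keeps the constant correct if either field width changes.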

@jerry-024 jerry-024 requested a review from Copilot October 14, 2025 03:22
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.


@jerry-024 jerry-024 force-pushed the python_support_blob_type branch from ac5df56 to 23b5e5c Compare October 14, 2025 06:23
@jerry-024 jerry-024 requested a review from Copilot October 14, 2025 06:32
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.


@jerry-024 jerry-024 marked this pull request as ready for review October 14, 2025 06:53
@jerry-024 jerry-024 force-pushed the python_support_blob_type branch from 87e4747 to 297d764 Compare October 14, 2025 06:55
@jerry-024 jerry-024 force-pushed the python_support_blob_type branch from b9c511e to 5f47706 Compare October 14, 2025 07:20
@jerry-024 jerry-024 force-pushed the python_support_blob_type branch from 5f47706 to d70eca7 Compare October 14, 2025 07:28
@leaves12138 leaves12138 left a comment

+1

@leaves12138 leaves12138 merged commit 8b542c5 into apache:master Oct 14, 2025
3 of 4 checks passed
@jerry-024 jerry-024 deleted the python_support_blob_type branch October 14, 2025 07:48
@jerry-024 jerry-024 restored the python_support_blob_type branch October 16, 2025 02:22
@jerry-024 jerry-024 deleted the python_support_blob_type branch October 16, 2025 02:24