
docs: document read_blobs for blob payload reads #7530

Merged
prrao87 merged 4 commits into main from xuanwo/docs-blob-read-blobs on Jun 30, 2026

@Xuanwo Xuanwo commented Jun 30, 2026

Collaborator

Summary

This PR updates the blob documentation to make read_blobs the recommended path for workflows that need complete blob payloads in memory.

It also clarifies when to use take_blobs for lazy file-like access and when to use scanner(..., blob_handling="all_binary") for Arrow table scans.

Context

Blob users were likely to discover take_blobs first and then manually parallelize BlobFile.readall() calls for batch loaders. Lance already has read_blobs, which plans and executes batched materialized blob reads, but the guide did not surface it as the primary API for full-payload reads.
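For illustration, the manual pattern described above might look roughly like the sketch below. It is not taken from the Lance guide: `StandInBlobFile` and `load_payloads` are hypothetical names, and the stand-in class only mimics a file-like handle with a `readall()` method so the example runs without a dataset. It is shown to make clear what kind of hand-rolled loader code a batched materialized read is meant to replace.

```python
from concurrent.futures import ThreadPoolExecutor


class StandInBlobFile:
    """Hypothetical stand-in for a lazy, file-like blob handle (not the Lance class)."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload

    def readall(self) -> bytes:
        # A real handle would perform I/O here; the stand-in just returns its bytes.
        return self._payload


def load_payloads(blob_files, max_workers=4):
    """Read every handle fully, in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(lambda f: f.readall(), blob_files))


handles = [StandInBlobFile(b"a" * 3), StandInBlobFile(b"bc"), StandInBlobFile(b"")]
payloads = load_payloads(handles)
```

Each caller has to pick a worker count and manage ordering itself, which is the boilerplate the PR aims to steer users away from for full-payload reads.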

Validation

  • git diff --check
  • uv run make lint partially completed: Ruff format and Ruff lint passed.
  • Pyright did not complete locally on macOS because the project installs tensorflow only through the Linux tests dependency marker; the remaining errors are existing unresolved tensorflow imports in python/lance/arrow.py, python/lance/dependencies.py, and python/tests/test_arrow.py.
  • make install built the local pylance extension successfully, then failed at pre-commit install because this checkout inherits an existing core.hooksPath configuration.

@github-actions github-actions (bot) added the A-python (Python bindings), A-docs (Documentation), and documentation (Improvements or additions to documentation) labels Jun 30, 2026
@Xuanwo Xuanwo marked this pull request as ready for review June 30, 2026 11:58

@prrao87 prrao87 left a comment

Contributor

Fixed some styling nits to make it read better, but thanks, LGTM!

@prrao87 prrao87 merged commit 1ba84bc into main Jun 30, 2026
14 checks passed
@prrao87 prrao87 deleted the xuanwo/docs-blob-read-blobs branch June 30, 2026 14:16