When retrieving data from S3, account for chunking #1085

horsto · 2023-05-12T06:00:42Z

There seems to be an issue with retrieving larger ("external") files from S3-compatible buckets. I documented the error here: #1083. I also raised it as an issue in the minio-py repository: minio/minio-py#1280.

The problem is that the minio API can return files in chunks, so that .data cannot be used directly. Instead, iterating over .stream() and concatenating the resulting bytes seems to work in all cases. This behavior is already correctly implemented in fget() (datajoint-python/datajoint/s3.py, line 88 in ee95f02):

for d in data.stream(1 << 16):
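To illustrate the stream-and-concatenate approach described above, here is a minimal sketch. FakeResponse and read_all are hypothetical names used only for illustration; FakeResponse merely imitates a response object that exposes stream(amt), and this is not the code in datajoint/s3.py.

```python
import io


class FakeResponse:
    """Hypothetical stand-in for a response object that exposes stream(amt)."""

    def __init__(self, payload: bytes):
        self._body = io.BytesIO(payload)

    def stream(self, amt: int = 1 << 16):
        # Yield the body in pieces of at most `amt` bytes, like a chunked download.
        while True:
            chunk = self._body.read(amt)
            if not chunk:
                break
            yield chunk


def read_all(response, chunk_size: int = 1 << 16) -> bytes:
    """Concatenate every chunk yielded by response.stream() into one bytes object."""
    return b"".join(response.stream(chunk_size))


# A 256 KiB payload spans several 64 KiB chunks but is reassembled intact.
payload = bytes(range(256)) * 1024
assert read_all(FakeResponse(payload)) == payload
```

Joining the chunks with b"".join avoids repeatedly copying a growing bytes object, which matters for large files.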

horsto · 2023-05-12T19:37:20Z

This PR changes the get() method for S3 (external) objects to read via .stream instead of .data, which accommodates chunking of larger files (as opposed to assuming that the whole file is loaded at once).
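A change along these lines could look roughly like the sketch below. get_bytes is a hypothetical helper, not the actual get() in datajoint; it assumes a client whose get_object(bucket, name) returns a response with stream(), close() and release_conn(), as the minio Python client does. Releasing the connection in a finally block is an addition of this sketch and is not taken from the PR.

```python
def get_bytes(client, bucket: str, name: str, chunk_size: int = 1 << 16) -> bytes:
    """Hypothetical helper: fetch an object and return its full contents as bytes."""
    response = client.get_object(bucket, name)
    try:
        # Read chunk by chunk instead of relying on response.data.
        return b"".join(response.stream(chunk_size))
    finally:
        # Hand the connection back to the pool even if streaming fails.
        response.close()
        response.release_conn()
```

Usage, assuming an already configured client and hypothetical bucket and object names: blob = get_bytes(client, "my-bucket", "path/to/object").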

dimitri-yatsenko · 2023-05-15T13:13:17Z

@horsto Would you merge this PR: horsto#1?

horsto · 2023-05-15T13:29:37Z

Done!

When retrieving data from S3, account for chunking

13cb2be

dimitri-yatsenko approved these changes May 12, 2023

A-Baji added 2 commits May 12, 2023 21:13

Merge branch 's3_chunked' of https://github.com/horsto/datajoint-python…

8d5d6cc

… into horst-changelog

update changelog

2fce9ae

Merge pull request #1 from A-Baji/horst-changelog

d50fce2

pull from upstream and update changelog

dimitri-yatsenko approved these changes May 15, 2023

Merge branch 'master' into s3_chunked

1a4103a

dimitri-yatsenko merged commit 97e34a6 into datajoint:master May 15, 2023
6 checks passed
