Description
Currently Blob.Size computes the blob size by loading the blob content into memory. This is not ideal for repositories with very large objects. We have scenarios where some objects are several GB in size and we hit an OutOfMemoryException while reading the blob content. We would like to use Blob.Size to screen out such items in our processing, but Blob.Size has the same issue.
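For context, the native libgit2 object database exposes a header-only read, git_odb_read_header(), which reports an object's size and type without inflating its content. Below is a minimal sketch of using it from C; the repository path and object id are placeholders, and older libgit2 releases spell the type enum git_otype rather than git_object_t.

```c
#include <stdio.h>
#include <git2.h>

int main(void)
{
    git_libgit2_init();

    git_repository *repo = NULL;
    git_odb *odb = NULL;
    git_oid oid;
    size_t size = 0;
    git_object_t type;

    /* Placeholder path and id for illustration only. */
    if (git_repository_open(&repo, "/path/to/repo") < 0 ||
        git_repository_odb(&odb, repo) < 0 ||
        git_oid_fromstr(&oid, "aabbccddeeff00112233445566778899aabbccdd") < 0)
        goto cleanup;

    /* Reads only the object header (type + size); the blob body is never
     * decompressed into memory, so multi-GB blobs are safe to query. */
    if (git_odb_read_header(&size, &type, odb, &oid) == 0)
        printf("blob size: %zu bytes\n", size);

cleanup:
    git_odb_free(odb);
    git_repository_free(repo);
    git_libgit2_shutdown();
    return 0;
}
```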
IMHO Blob.Size should not load the entire content into memory. I had a discussion with @ethomson on this earlier; below is an excerpt. While I believe that changing the Blob.Size implementation is the cleaner fix (two different APIs for computing blob size would be confusing for consumers), I am also worried that it might affect checkout performance. Please let me know your suggestions.
[@ethomson]: If you really only want the size of a file, then we should be able to get that without loading the whole thing into RAM. (Though upon further reflection, I don't think that Blob.Size is the place to do this. Getting only the size, and never doing anything more with the blob, is a pretty rare use case. Generally speaking, you get the Blob because you want to act upon it, and Blob.Size not loading the whole object would penalize people who call Blob.Size first before calling Blob.Contents, which is a pretty compelling use case.)
[@ravindp]: I am not sure why, or by how much, it would be slower to call Blob.Contents after Blob.Size. If Blob.Size is an inexpensive call, then I believe the effective time would remain almost the same; the cost would just move from Blob.Size to Blob.Contents.
[@ethomson]: Worth exploring. I agree that it shouldn't add a whole lot: the worst case is to open all the .idx files, binary-search through them for the ID being requested, and open the packfile that corresponds to the .idx once it is found. I just want to avoid adding a lot of overhead to something like checkout, which does this with great frequency.
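To make the cost question concrete: the worst-case lookup described above (scan the .idx files, binary-search for the id, open the matching packfile) is paid on every object read, so a size-first / contents-later pattern pays that lookup twice but inflates the blob only once. A hedged sketch of that pattern against the native odb, reusing handles obtained as in the previous snippet (the size limit and helper name are purely illustrative):

```c
#include <stdio.h>
#include <git2.h>

/* Illustrative "size first, contents later" pattern from the discussion.
 * The pack-index lookup happens in both calls, but the blob content is
 * decompressed only in the second one. */
static int filter_then_read(git_odb *odb, const git_oid *oid, size_t limit)
{
    size_t size = 0;
    git_object_t type;

    /* Cheap screening step: header only, no content in memory. */
    if (git_odb_read_header(&size, &type, odb, oid) < 0)
        return -1;

    if (size > limit) {
        printf("skipping large blob (%zu bytes)\n", size);
        return 0;
    }

    /* Full read only for objects that passed the size filter. */
    git_odb_object *obj = NULL;
    if (git_odb_read(&obj, odb, oid) < 0)
        return -1;

    printf("read %zu bytes of content\n", git_odb_object_size(obj));
    git_odb_object_free(obj);
    return 0;
}
```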