Support large BSONs by using Int64 in case Int32 size range is exceeded. #74

Open
racinmat opened this issue Aug 4, 2020 · 2 comments

Comments

Contributor

racinmat commented Aug 4, 2020

Proposal for how to fix #67.
Based on the discussion at https://julialang.slack.com/archives/C67TK21LJ/p1596027666030200
Test whether the data fits in the Int32 range and use Int32 if so; otherwise use the 64-bit range. #67 (comment) describes where specifically the cast happens.

Rationale behind this: BSON.jl already violates the BSON specification by allowing arbitrary types to be saved.
This follows the same idea: as long as the user makes sure the input data follows the BSON specification, the output is compliant with the specification, but it also allows persisting data that doesn't follow the BSON specification if that's required.
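
To make the idea concrete, here is a minimal, language-neutral sketch in Python (not BSON.jl code). It assumes the size in question is a little-endian length prefix, as in the BSON document header; the name `encode_length` is hypothetical. How a reader would tell a 4-byte prefix from an 8-byte one is not settled in this issue, so the sketch covers only the writer side.

```python
import struct

INT32_MAX = 2**31 - 1  # largest length a spec-compliant int32 prefix can hold


def encode_length(n):
    """Encode a byte length as int32 when it fits, otherwise as int64.

    Hypothetical helper: the 8-byte branch is the non-standard extension
    proposed here and is not part of the BSON specification.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    if n <= INT32_MAX:
        return struct.pack("<i", n)  # 4 bytes, little-endian, spec-compliant
    return struct.pack("<q", n)      # 8 bytes, little-endian, non-standard
```

With this shape, any input small enough for Int32 produces exactly the same bytes as before, so only oversized data ends up in the non-compliant form.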

rofinn commented Oct 23, 2020

An alternative approach, which might scale better, would be to partition large vectors/strings into multi-part objects and even break up large docs into multi-part files. This would allow other loaders to still load and combine the data as necessary. I'd argue that violating the core BSON spec should be avoided and workarounds should remain backward compatible in some form.
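
A rough sketch of the partitioning idea, in Python for illustration only (not BSON.jl code). The field names `part`, `of`, and `data` and the helpers `split_parts` and `join_parts` are assumptions made up for this example; no such layout is defined in this thread or in the BSON spec.

```python
def split_parts(data, max_part):
    """Split a byte string into numbered parts of at most max_part bytes."""
    if max_part <= 0:
        raise ValueError("max_part must be positive")
    # Ceiling division; an empty payload still yields one (empty) part.
    total = max(1, -(-len(data) // max_part))
    return [
        {"part": i, "of": total, "data": data[i * max_part:(i + 1) * max_part]}
        for i in range(total)
    ]


def join_parts(parts):
    """Reassemble the payload from its parts, given in any order."""
    ordered = sorted(parts, key=lambda p: p["part"])
    if not ordered or len(ordered) != ordered[0]["of"]:
        raise ValueError("missing or extra parts")
    return b"".join(p["data"] for p in ordered)
```

Each part stays small enough to be stored as an ordinary spec-compliant value, and a loader that does not know about the scheme can still read the individual parts.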

rofinn commented Oct 23, 2020

Alternatively, Preferences.jl could perhaps be used to allow projects to decide whether they want strict BSON files?
