Feature Request: cache chunk data #15
Comments
This is exactly what this package is for and what all the implemented methods for reduction, broadcast, etc. are doing. Did you run into a case where you still get problems with this package?
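For context, here is a minimal sketch of what "handled by the package" means, following the `DiskArray(hdf5_dataset)` snippet used later in this thread (the variable names are placeholders, not part of the original discussion):

```julia
# Assumed setup, mirroring the snippet used later in this thread.
A = DiskArray(hdf5_dataset)

# Reductions over A are evaluated chunk by chunk, so each chunk is
# read from disk only once rather than once per element.
total = sum(A)
m = maximum(abs, A)
```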
Will this still be efficient? Each read still has overhead. In my use case, it's hard to formulate the above complex operations using broadcast.
No, this example would not be efficient. However, something like

```julia
A = DiskArray(hdf5_dataset)
broadcast(A) do a
    complex_operations_for(a)
end
```

would be efficient and be done chunk by chunk. If you really insist on avoiding broadcast and writing

```julia
A = DiskArray(hdf5_dataset)
for i in eachindex(A)
    complex_operations_for(A[i])
end
```

then: I have actually thought about implementing this, but I think there might be many problems with the implementation. In addition, I have not yet found a real-world use case that I could not express with the broadcast-like constructs mentioned above.
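If scalar-style iteration really is needed, a possible middle ground (not from the original discussion) is to loop over chunks explicitly with DiskArrays' `eachchunk`, so each chunk is read only once; a minimal sketch, assuming `A` and `complex_operations_for` from the snippets above:

```julia
using DiskArrays: eachchunk

for c in eachchunk(A)       # c is a tuple of index ranges describing one chunk
    chunk = A[c...]         # one disk read per chunk
    for x in chunk          # the inner loop runs entirely in memory
        complex_operations_for(x)
    end
end
```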
Another way to approach this would of course be to use https://github.com/JuliaCollections/LRUCache.jl, at the cost of a cache lookup on every access.
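A minimal sketch of that idea (assuming LRUCache.jl's `LRU` and `get!`; the key/value types, the cache size, and the `read_chunk` helper are illustrative assumptions, not part of the original comment):

```julia
using LRUCache

# Keep at most 16 decoded chunks in memory at once (size chosen arbitrarily here).
const CHUNK_CACHE = LRU{Any,Any}(maxsize = 16)

# Hypothetical helper: fetch a chunk through the cache, reading from disk only on a miss.
function read_chunk(A, c)
    get!(CHUNK_CACHE, c) do
        A[c...]
    end
end
```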
It will be more efficient to cache the current chunk in use when using a `for` loop to iterate over the data, because each read has a non-negligible overhead (e.g. HDF5.jl).
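To make the request concrete, here is a rough editorial sketch (not an existing DiskArrays API) of caching the most recently read chunk so that a scalar-indexing loop only touches the disk when it crosses a chunk boundary; `chunk_containing`, `CurrentChunkCache`, and `cached_getindex` are names invented for this illustration:

```julia
using DiskArrays: eachchunk

# Illustrative helper: the tuple of index ranges of the chunk holding Cartesian index I.
# (A linear scan over the chunk grid; a real implementation would compute this directly.)
function chunk_containing(A, I::CartesianIndex)
    for c in eachchunk(A)
        all(map((r, j) -> j in r, c, Tuple(I))) && return c
    end
    error("index $I is not covered by any chunk of A")
end

# Hypothetical single-chunk cache: keep the most recently read chunk in memory so
# that successive reads inside the same chunk do not go back to disk.
mutable struct CurrentChunkCache{T}
    parent::T
    key::Any    # ranges of the cached chunk, `nothing` if empty
    data::Any   # in-memory copy of that chunk
end
CurrentChunkCache(A) = CurrentChunkCache(A, nothing, nothing)

function cached_getindex(c::CurrentChunkCache, I::CartesianIndex)
    key = chunk_containing(c.parent, I)
    if key !== c.key
        c.key = key
        c.data = c.parent[key...]   # one disk read per chunk change
    end
    local_idx = map((r, j) -> j - first(r) + 1, key, Tuple(I))
    return c.data[local_idx...]
end
```

With such a wrapper, calling `cached_getindex(cache, I)` inside the `for` loop above would read each chunk from disk once instead of once per element, which is the behaviour this feature request asks for.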