What about using a seed with "dual layout" i.e. row- and col- oriented? #19
Hey @mikejiang, @gfinak, @raphg,
Yes, the seed class is mandated. This is how you actually implement the backend: by implementing a seed class. When you implement an
About implementing a backend with dual layouts: For this to play well with DelayedArray, at least 2 things are needed:
If you're going to implement a seed class for HDF5 dual layout, you should probably avoid starting from scratch. It's going to be easier to define the new class on top of the HDF5ArraySeed class e.g. with something like this:
It feels to me that the approach would be the same if you were going to implement a seed class for tiledb dual layout (except that AFAIK there is no TileDbSeed class yet so you would need to start by implementing that). Implementing a dual layout seed might actually be done in a more generic way e.g. with something like:
with a validity method that checks that the seeds stored in the
@mikejiang Is it ok to close this?
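For illustration, a generic dual-layout seed along these lines could be sketched in plain R as follows. This is only a minimal sketch, not code from this thread: the class name DualLayoutSeed, the slot names row_seed and col_seed, and the helper extract_from_dual() are all hypothetical, plain matrices stand in for on-disk seeds, and the rule used to pick a layout is an assumption. In a real backend the selection logic would presumably live in the seed's extract_array() method.

```r
library(methods)

# Hypothetical generic dual-layout seed: two seeds holding the same data,
# one meant to be row-oriented and one column-oriented on disk.
setClass("DualLayoutSeed",
    representation(row_seed = "ANY", col_seed = "ANY"),
    validity = function(object) {
        # Both seeds must describe the same array.
        if (!identical(dim(object@row_seed), dim(object@col_seed)))
            return("'row_seed' and 'col_seed' must have the same dimensions")
        TRUE
    }
)

# The dual seed reports the dimensions shared by its two seeds.
setMethod("dim", "DualLayoutSeed", function(x) dim(x@row_seed))

# Hypothetical helper: pick the seed whose layout suits the request.
# Assumed rule: when no more rows than columns are requested, read from
# the row-oriented copy, otherwise from the column-oriented copy.
extract_from_dual <- function(x, i = NULL, j = NULL) {
    if (is.null(i)) i <- seq_len(dim(x)[1L])
    if (is.null(j)) j <- seq_len(dim(x)[2L])
    seed <- if (length(i) <= length(j)) x@row_seed else x@col_seed
    seed[i, j, drop = FALSE]
}

# Stand-in data: the same in-memory matrix plays both roles.
m <- matrix(1:12, nrow = 3)
seed <- new("DualLayoutSeed", row_seed = m, col_seed = m)
```

Calling extract_from_dual(seed, i = 2L) reads one full row and so goes through row_seed, while new("DualLayoutSeed", row_seed = m, col_seed = t(m)) fails the validity check because the two seeds disagree on their dimensions.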
I'm opening an issue to facilitate the discussion about this. The discussion started with Mike's following email (May 12, 2018):
Herve,
I am in the process of implementing a delayedArray backend and have some design questions for you. In vignette
"In theory, it should be possible to implement a DelayedArray backend for any file format that has the capability to store array data with fast random access",
which I understand as the minimum requirement for extract_array method (used for random indexing [i,j]) and dims, dimnames slots.
However, I am not quite sure what is the purpose of seed class which contains the filepath of the actual data store. I thought the DelayedArray should be agnostic about the physical storage information of the backend.
If it is mandated, then my seed object will contain at least two file paths, which represent different data layout (both row and column oriented storage) of the same matrix. Is that going to play well with the DelayedArray?
Also, I am envisioning our extended array type will be a generic disk-based class, which allows arbitrary file format (h5, tiledb, etc.) to be used in this type of hybrid data layout framework.
Hopefully you can give me some clarifications on these. Thanks a lot!
Mike