Further granularity for JLD #65

Closed
davidavdav opened this issue May 30, 2016 · 11 comments

@davidavdav

Hello,

I understand JLD is distinguished from HDF5, which is great. In my own packages I rely on JLD to store specific types (e.g., a GaussianMixtures.GMM). I believe JLD already handles most of what I want, but sometimes I'd like to call a constructor-specific hook for loading or saving. How should I handle this in the FileIO infrastructure?

To be more concrete, suppose I have a type like:

type T1
  compact::Matrix
  derived::Vector{Matrix}
  function T1(c::Matrix)
    d = some_heavyish_precomputation(c)
    new(c, d)
  end
end

Then I'd like to save only the compact field and, upon loading, call something like T1(load(stream)["t1.c"]). It would be nice if this could all be accomplished just by loading a JLD-encoded file using FileIO magic.

How would I go about that?

---david

@timholy
Member

timholy commented May 31, 2016

I think you're looking for a custom serializer: https://github.com/JuliaIO/JLD.jl/blob/master/doc/jld.md#custom-serialization. You can define JLD.readas and JLD.writeas methods for your types in your packages. (That makes JLD a dependency, of course.)
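As a rough sketch of how that suggestion could apply to the T1 example from the question: the code below assumes JLD.jl is installed and uses current Julia `struct` syntax rather than the `type` keyword used earlier in the thread. `T1Serializer` and the body of `some_heavyish_precomputation` are hypothetical stand-ins made up for illustration; only `JLD.writeas` and `JLD.readas` come from the linked documentation.

```julia
using JLD  # assumption: JLD.jl is installed

# Hypothetical stand-in for the expensive precomputation in the question.
some_heavyish_precomputation(c::Matrix{Float64}) = [k * c for k in 1:3]

struct T1
    compact::Matrix{Float64}
    derived::Vector{Matrix{Float64}}
    T1(c::Matrix{Float64}) = new(c, some_heavyish_precomputation(c))
end

# Hypothetical on-disk form: holds only the compact field.
struct T1Serializer
    compact::Matrix{Float64}
end

JLD.writeas(x::T1) = T1Serializer(x.compact)   # applied by JLD before writing
JLD.readas(s::T1Serializer) = T1(s.compact)    # applied by JLD after reading

filename = joinpath(mktempdir(), "t1.jld")
save(filename, "t1", T1([1.0 2.0; 3.0 4.0]))
t1 = load(filename, "t1")   # goes through readas, so the constructor rebuilds `derived`
```

With this in place the file stores only the compact matrix, and the derived field is recomputed each time the object is read back.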

Since I don't think this is a FileIO issue, I'm closing this (but if you disagree, I can reopen).

timholy closed this as completed May 31, 2016
@davidavdav
Author

Thanks,

This is more or less what I was looking for. I had missed that part in the docs... There still might be a small relation to FileIO: In the documented example, an obj2 = load(filename) will still result in an object of type MyVectors5Serializer, so the FileIO magic goes as far as running JLD.jldopen() I suppose, but not the associated JLD.readas(). At least, this is what happened for my type. Is this right?

Would there be a way of cascading the FileIO magic to readas?

@timholy
Member

timholy commented May 31, 2016

There still might be a small relation to FileIO: In the documented example, an obj2 = load(filename) will still result in an object of type MyVectors5Serializer

It shouldn't, if the package that defines the readas method has been loaded. If you can observe this behavior in a reproducible example, then it's a bug that should be filed to JLD.jl.

@timholy
Member

timholy commented May 31, 2016

Relevant to the previous point: see also the part in the docs about addrequire.

@davidavdav
Author

Yes, thanks, I did have the addrequire in the jldopen(filename, "w") do ... end in my test. But perhaps the test didn't give the right result because of module-reloading issues. I'll try to do a clean test now.

@davidavdav
Author

Ah, I found the reason why it does not do exactly what I want, and why I was looking for a FileIO solution.

In the documented example, things work if I say something like

obj4 = FileIO.load(filename, "somedata")

but what I was looking for was a way to be able to say

obj3 = FileIO.load(filename)

which would need additional magic to recognize files that require MyModule and have "somedata" as their root dataset (a better name would be MyVector5 in such a case).

The parallel with FileIO would be that most standard formats have "unnamed" objects as well, e.g., :WAV just has the audio. I am looking for a way to save objects from my types in a JLD container, namelessly.

@timholy
Member

timholy commented Jun 1, 2016

You'd probably need to define a new file extension, and most likely give it a special magic number.

@davidavdav
Author

Is there a way to add additional magic after string(magic_base, f.version) in JLD, so that the file will be both :JLD and the special :MyVector5 file type? I assume I can always re-write the header by hand.

This is assuming that FileIO will go for the longer magic match.

@timholy
Member

timholy commented Jun 1, 2016

I don't think that FileIO can (in the central repository) reasonably support "custom JLD content" as separate file types; after all, there are an infinite number of possible combinations of variable names, types, etc., and so you might define 3 different ones, I might define 5 different ones, etc. Only if there is some standard in widespread use would we consider such a thing.

However, of course you can add this locally. But perhaps an easier option would be to write a function called myload which checks to see if it's a JLD file and, if so, asks whether the content is one of your "standard" formats. If not, myload can call load.
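A minimal sketch of such a myload wrapper, assuming JLD.jl is installed. The convention that a "standard" file holds exactly one dataset named "GMM" is hypothetical, chosen only for illustration. Rather than inspecting the file format first, this version simply calls load and checks what came back, which is simpler but reads the whole file either way.

```julia
using JLD  # assumption: JLD.jl is installed; it re-exports FileIO's load and save

# Hypothetical convention: a "standard" file holds exactly one dataset with this name.
const STANDARD_NAME = "GMM"

function myload(filename::AbstractString)
    data = load(filename)   # FileIO picks the loader for the file format
    # Called without a dataset name, the JLD loader returns a Dict of name => value.
    if data isa AbstractDict && length(data) == 1 && haskey(data, STANDARD_NAME)
        return data[STANDARD_NAME]   # unwrap the single "nameless" object
    end
    return data   # anything else: behave exactly like load
end
```

A call like myload("model.jld") would then return the stored object directly for files following the convention, and fall back to the plain load result for everything else.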

@davidavdav
Author

Yes, that makes sense.

It is funny, though, that I tend to save objects (GaussianMixtures.GMM, IVectors.IExtractor) in a JLD under their type name. Another interface that appears to make sense is load(file, ::Type), because in the code that calls load you usually know what the resulting type should be. This makes it a little more explicit.
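To make the idea concrete, here is a sketch of what such an interface might look like as a small helper on top of JLD. The name `loadas` and the convention of storing each object under its bare type name are assumptions for illustration; nothing like this exists in FileIO or JLD.

```julia
using JLD  # assumption: JLD.jl is installed

# Hypothetical helper, not part of FileIO or JLD.
function loadas(filename::AbstractString, ::Type{T}) where {T}
    name = string(nameof(T))          # e.g. "GMM" for GaussianMixtures.GMM
    return load(filename, name)::T    # errors if the dataset is missing or has another type
end
```

The type assertion on the return value is what makes the expected type explicit at the call site.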

@timholy
Member

timholy commented Jun 3, 2016

Right, but other people might save 20 different objects of the same type to a single file, using a different pathname for each.
