Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make conversion from custom serialization work even when original type cannot be understood. #468

Merged
merged 9 commits into from
Jul 10, 2023

Conversation

KnutAM
Copy link
Contributor

@KnutAM KnutAM commented Apr 24, 2023

Attempt for solving: #288 (comment)

See example usage in the comment below and temporarily included script files

It seems to work and tests pass locally, but it feels like this cannot be general.
I never even had to overload the readas function...

@codecov
Copy link

codecov bot commented Apr 24, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.49 🎉

Comparison is base (30dd578) 84.04% compared to head (9411e83) 84.53%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #468      +/-   ##
==========================================
+ Coverage   84.04%   84.53%   +0.49%     
==========================================
  Files          31       31              
  Lines        4142     4145       +3     
==========================================
+ Hits         3481     3504      +23     
+ Misses        661      641      -20     
Impacted Files Coverage Δ
src/data/custom_serialization.jl 92.00% <ø> (ø)
src/data/reconstructing_datatypes.jl 82.65% <100.00%> (+5.98%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@KnutAM KnutAM changed the title Kam/readas Make conversion from custom serialization work even when original type cannot be understood. Apr 24, 2023
@KnutAM
Copy link
Contributor Author

KnutAM commented Apr 24, 2023

UPDATED: There were some mistakes in the example so it wasn't really relevant.
With the new changes, the approach seems to work as expected.
Specifically, when readas hasn't been defined by the user, there should be no change to master.

The "includeme.jl" file:

using JLD2

# Some way to define an unknown function
struct UndefinedFunction <:Function
    fun
end
(f::UndefinedFunction)(args...; kwargs...) = throw(ErrorException("The function $(f.fun) is not defined"))

# Foo has all required definitions to be reloaded back to an unknown function
struct Foo{F<:Function}
    fun::F
end
struct FooSerialization
    fun
end

JLD2.readas(::Type{<:FooSerialization}) = Foo
JLD2.writeas(::Type{<:Foo}) = FooSerialization
Base.convert(::Type{<:FooSerialization}, f::Foo) = FooSerialization(f.fun)
function Base.convert(::Type{<:Foo}, f::FooSerialization)
    isa(f.fun, Function) && return Foo(f.fun)
    return Foo(UndefinedFunction(f.fun))
end

# We do not define JLD2.readas for Bar: It should remain unconverted
struct Bar{F<:Function}
    fun::F
end
struct BarSerialization
    fun
end

JLD2.writeas(::Type{<:Bar}) = BarSerialization
Base.convert(::Type{BarSerialization}, f::Bar) = BarSerialization(f.fun)
Base.convert(::Type{Bar}, f::BarSerialization) = Bar(f.fun)
function Base.convert(::Type{<:Bar}, f::BarSerialization)
    isa(f.fun, Function) && return Bar(f.fun)
    return Bar(UndefinedFunction(f.fun))
end

Session 1:

include("includeme.jl")
jldsave("sin.jld2"; foo=Foo(sin))
jldsave("tmp.jld2"; foo=Foo(x->x^2))
jldsave("bar.jld2"; bar=Bar(x->x^2))

Session 2:

include("includeme.jl")
s = jldopen("sin.jld2", "r"); ss=s["foo"]; close(s); ss
a = jldopen("tmp.jld2", "r"); aa=a["foo"]; close(a); aa
b = jldopen("bar.jld2", "r"); bb=b["bar"]; close(b); bb
@show ss
@show aa
@show bb

In the REPL after session 2:

julia> include("session2.jl");
┌ Warning: type Main.#2#3 does not exist in workspace; reconstructing
└ @ JLD2 C:\Users\meyer\.julia\dev\JLD2\src\data\reconstructing_datatypes.jl:411
┌ Warning: custom serialization of Bar{Main.#4#5} encountered, but the type does not exist in the workspace; the data will be read unconverted
└ @ JLD2 C:\Users\meyer\.julia\dev\JLD2\src\data\reconstructing_datatypes.jl:70
┌ Warning: type Main.#4#5 does not exist in workspace; reconstructing
└ @ JLD2 C:\Users\meyer\.julia\dev\JLD2\src\data\reconstructing_datatypes.jl:411
ss = Foo{typeof(sin)}(sin)
aa = Foo{UndefinedFunction}(UndefinedFunction(JLD2.ReconstructedTypes.var"##Main.#2#3#314"()))
bb = BarSerialization(JLD2.ReconstructedTypes.var"##Main.#4#5#315"())

@JonasIsensee
Copy link
Collaborator

this looks good to me. Thanks for your work! :)

I'll say again that this probably shouldn't be a permanent solution but it also doesn't look like it would harm anything.
If you would like, you're welcome to bump the patch version, and add a changelog. If not, I'll get around to it in a few days.

@KnutAM
Copy link
Contributor Author

KnutAM commented Apr 26, 2023

Great! I'll try to add some tests and do that when I have some time.

Would you like me to also add some documentation explaining how/when to use this feature?

@JonasIsensee
Copy link
Collaborator

Great! I'll try to add some tests and do that when I have some time.

Would you like me to also add some documentation explaining how/when to use this feature?

That would be excellent!
But please do put an Experimental in front ;)

@KnutAM KnutAM marked this pull request as ready for review April 26, 2023 18:59
@KnutAM
Copy link
Contributor Author

KnutAM commented Apr 27, 2023

I don't understand why the invalidation fails, the only diff from passing to failing is using a type-assert https://github.com/JuliaIO/JLD2.jl/compare/21cf579..18bfa02#diff-4ff9a5fb7264e7d305d989c07cc06a047e83cbcf591f75d8bb675558ea216970R59 ?

Nightly seems unrelated to me and fails on master too.

@JonasIsensee
Copy link
Collaborator

I don't understand why the invalidation fails, the only diff from passing to failing is using a type-assert https://github.com/JuliaIO/JLD2.jl/compare/21cf579..18bfa02#diff-4ff9a5fb7264e7d305d989c07cc06a047e83cbcf591f75d8bb675558ea216970R59 ?

Nightly seems unrelated to me and fails on master too.

To me, it looks more like an unrelated error in the CI run rather than an actual invalidation change.
I've restarted the job.

@KnutAM
Copy link
Contributor Author

KnutAM commented Apr 28, 2023

I've restarted the job.

Great!

@KnutAM
Copy link
Contributor Author

KnutAM commented Apr 28, 2023

For future reference:

Not intending any of this to be updated in this pr, just a note of what is possible with this pr.

After tip from @fredrikekre I tried using Serialization from stdlib to write/save the function, and that seems to work!

With the readas function, one can do

using Serialization

struct Foo{F<:Function}
    fun::F
end
struct FooSerialization
    io
end

JLD2.writeas(::Type{<:Foo}) = FooSerialization
JLD2.readas(::Type{<:FooSerialization}) = Foo

function Base.convert(::Type{<:FooSerialization}, f::Foo)
    io = IOBuffer()
    serialize(io, f.fun)
    return FooSerialization(io)
end

function Base.convert(::Type{<:Foo}, f::FooSerialization)
    seek(f.io, 0)
    fun = deserialize(f.io)
    return Foo(fun)
end

saving as jldsave("foo.jld2"; foo=Foo(x->x^2)) and reading gives

getfoo(file) = jldopen(file) do io; io["foo"]; end
foo=getfoo("foo.jld2")
# Output: Foo{Serialization.__deserialized_types__.var"#11#12"}(Serialization.__deserialized_types__.var"#11#12"())
foo.fun
# Output: #11 (generic function with 1 method)
foo.fun(2)
# Output: 4

Not tested a lot, but could potentially provide workarounds for some of #208, #175, #36, and #37 (And if there is a failure, it should occur during convert in loading, so that when reopening the file this function can be redefined to handle potential issues with deserialization.

@KnutAM
Copy link
Contributor Author

KnutAM commented May 11, 2023

@JonasIsensee do you want any more changes before this can be merged?

@JonasIsensee
Copy link
Collaborator

I'm sorry for leaving this unanswered for so long. Was busy and forgot.

Thank you for your contribution!

@JonasIsensee JonasIsensee merged commit 32aae76 into JuliaIO:master Jul 10, 2023
11 of 14 checks passed
@KnutAM KnutAM deleted the kam/readas branch August 15, 2023 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants