Lazy default children #25

@DrChainsaw

Following up on my story from the Discourse thread: I was a bit bored, so I cooked up a solution which seems to work for my use case.

I don't know if this is clean and robust enough to be worthwhile to add, but here it is in case anyone finds it useful.

To summarize the use case: sometimes a file contains multiple items which are useful to view as separate Files in the file tree, but it is not possible to know in advance exactly which items it contains. One example is a text log where each line is printed by some part of a program and different parts print different things, something like this:

> cat some.log
thisorthat_func:234 par1=324, par2=happy,...
thisorthat_func:234 par1=12, par2=sad,...
someother_func:78 x=34.4, z=11,...
thisorthat_func:234 par1=32, par2=sad,...
...

One might want to see this as

some.log/
├─ thisorthat_func234 DataFrame(par1, par2,...)
└─ someother_func78 DataFrame(x, z,...)
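
To make the grouping concrete, here is a minimal sketch in plain Julia (no FileTrees involved). The helper name `group_log_lines` is hypothetical, the sample lines are the ones above with the elided `...` parts left out, and each group is kept as a vector of `Dict`s rather than a DataFrame to stay dependency-free:

```julia
# Hypothetical helper: group log lines by their "func:line" prefix.
# Each line is assumed to look like "name:lineno key=value, key=value".
function group_log_lines(lines)
    groups = Dict{String,Vector{Dict{Symbol,String}}}()
    for line in lines
        prefix, rest = split(line, ' '; limit=2)
        fields = Dict{Symbol,String}()
        for kv in split(rest, ", ")
            k, v = split(kv, '='; limit=2)
            fields[Symbol(k)] = v
        end
        # Drop the colon so the prefix can serve as a child name
        childname = replace(prefix, ":" => "")
        push!(get!(groups, childname, Dict{Symbol,String}[]), fields)
    end
    return groups
end

lines = [
    "thisorthat_func:234 par1=324, par2=happy",
    "thisorthat_func:234 par1=12, par2=sad",
    "someother_func:78 x=34.4, z=11",
    "thisorthat_func:234 par1=32, par2=sad",
]
groups = group_log_lines(lines)
```

The point is that the set of keys in `groups` is only known after the file has been read, which is what makes it awkward to represent as children in a lazy tree.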

It is often roughly known what could be in there, so it makes some sense to have default children. Here I use a dict which maps child names to their values for easier grokking, but it should be possible to abstract this to a generic function. It is also hardwired to assume laziness for the sake of brevity.

defaultchildren(names, val=s -> NoValue()) = f -> defaultchildren(f, names, val)
function defaultchildren(f::Union{File,FileTree}, names, val) 
    v = f[]
    v.cache = true
    maketree(name(f) => defaultchild.(names, Ref(v), val))
end
defaultchild(n, v, val) = (name = n, value = FileTrees.maybe_lazy(d -> get(d, Symbol(n), val(n)))(v))
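
Stripped of the laziness, the core of `defaultchild` is just a dictionary lookup with a fallback. A minimal eager sketch in plain Julia, where `Absent` is a hypothetical stand-in for `NoValue` and `lookup` is an illustrative name, not part of FileTrees:

```julia
# Hypothetical stand-in for FileTrees' NoValue, to keep this self-contained
struct Absent end

# Look up child name `n` in `d`; fall back to `default(n)` when the key is missing
function lookup(d, n, default = _ -> Absent())
    return get(d, Symbol(n), default(n))
end

d = Dict(:a => 1, :b => 2, :c => 3)
children = [(name = n, value = lookup(d, n)) for n in ["a", "b", "z"]]
```

Here `"z"` was guessed but is not in the dict, so its value becomes the `Absent()` placeholder instead of throwing a `KeyError`.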

Here is an example:

julia> ft = maketree("1" => [(name="2", value=lazy(() -> Dict(:a => 1, :b => 2, :c => 3))())])
1/
└─ 2 (Thunk(#21, ()))

julia> map(defaultchildren(["a", "b", "c"]), ft; dirs=false)
1/
└─ 2/
   ├─ a (Thunk(#11, (Thunk(#21, ...),)))
   ├─ b (Thunk(#11, (Thunk(#21, ...),)))
   └─ c (Thunk(#11, (Thunk(#21, ...),)))

julia> ftd = map(defaultchildren(["a", "b", "c"]), ft; dirs=false)
1/
└─ 2/
   ├─ a (Thunk(#11, (Thunk(#21, ...),)))
   ├─ b (Thunk(#11, (Thunk(#21, ...),)))
   └─ c (Thunk(#11, (Thunk(#21, ...),)))

julia> reducevalues(+, ftd[r"(a|c)"]) |> exec
4

So far so good, but the big drawback is that any values produced by the creation operation which we did not guess in advance will never be seen:

julia> ftd = map(defaultchildren(["a", "b"]), ft; dirs=false)
1/
└─ 2/
   ├─ a (Thunk(#11, (Thunk(#21, ...),)))
   └─ b (Thunk(#11, (Thunk(#21, ...),)))


julia> reducevalues(+, ftd[r"(a|c)"]) |> exec
1

Another issue is that one might not want to have the default children, only the ones which actually materialized.

Both of these are addressed by the following extension:

# Slight redefinition of defaultchildren:
function defaultchildren(f::Union{File,FileTree}, names, val) 
    v = f[]
    v.cache = true
    maketree((name=name(f), value=lazy(fixmeup)(f)) => defaultchild.(names, Ref(v), val))
end

# This should obviously have a better name
struct FixMeUp{T,F}
    v::T
    rmchild::F
end
fixmeup(f::File, rmchild=f -> f isa File && f[] isa NoValue) = FixMeUp(f[], rmchild)
function FileTrees.FileTree(parent::Union{FileTree,Nothing}, myname::String, children::Vector{T}, value::FixMeUp) where T
    d = exec(value.v)

    # Create children for keys in the dict which did not have default children
    newchildren = [File(nothing, string(dk), dv) for (dk, dv) in d if string(dk) ∉ name.(children)]

    # Remove (typically default) children which we don't want (e.g. have NoValue) 
    cf = filter(!value.rmchild, vcat(newchildren, children))
    return FileTree(parent, myname, cf)
end
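
The add-then-filter step in that constructor can be sketched on its own with plain named tuples. This is only an illustration under assumptions: `Absent` again stands in for `NoValue`, `reconcile` is a hypothetical name, and nothing here touches FileTrees or thunks:

```julia
# Hypothetical stand-in for FileTrees' NoValue
struct Absent end

# Merge the materialized dict `d` with the guessed default children:
# add a child for every key nobody guessed, then drop unwanted children.
function reconcile(d::AbstractDict, defaults; rmchild = c -> c.value isa Absent)
    known = [c.name for c in defaults]
    newchildren = [(name = string(k), value = v) for (k, v) in d if string(k) ∉ known]
    return filter(!rmchild, vcat(newchildren, defaults))
end

d = Dict(:a => 1, :b => 2, :c => 3)
# Defaults as they would look after evaluation: "y" and "z" were guessed but are absent
defaults = [(name = n, value = get(d, Symbol(n), Absent())) for n in ["a", "b", "y", "z"]]
result = reconcile(d, defaults)
```

New children are placed before the defaults, which is why an unguessed key ends up listed first.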

Now what happens? Check it out:

julia> ftd = map(defaultchildren(["a", "b", "y", "z"]), ft; dirs=false)
1/
└─ 2/ (Thunk(fixmeup, (File(1\2),)))
   ├─ a (Thunk(#11, (Thunk(#29, ...),)))
   ├─ b (Thunk(#11, (Thunk(#29, ...),)))
   ├─ y (Thunk(#11, (Thunk(#29, ...),)))
   └─ z (Thunk(#11, (Thunk(#29, ...),)))

# Default is to remove all Files with NoValue
julia> ftd |> exec
1/
└─ 2/
   ├─ c (Int64)
   ├─ a (Int64)
   └─ b (Int64)

# Ok, it is impossible to know whether c will match or not before exec
julia> reducevalues(+, ftd[r"(a|b|c)"]) |> exec
3

# So one has to remember to do this
julia> reducevalues(+, ftd[r"(a|b|c)"] |> exec)
6

# Much the same applies to lazy mappings, but since one knows which default children were put in, this should not be surprising
# FixMeUp could be added to things to ignore in mapvalues
julia> ftm = mapvalues(x -> x isa FixMeUp ? x : 10x, ftd)
1/
└─ 2/ (Thunk(#35, (Thunk(fixmeup, ...),)))
   ├─ a (Thunk(#35, (Thunk(#11, ...),)))
   ├─ b (Thunk(#35, (Thunk(#11, ...),)))
   ├─ y (Thunk(#35, (Thunk(#11, ...),)))
   └─ z (Thunk(#35, (Thunk(#11, ...),)))

# Note: c was not multiplied 
julia> reducevalues(+, ftm[r"(a|b|c)"] |> exec)
33
