A more type focused interface #19

rofinn · 2017-04-10T17:40:17Z

Hi,

Flux looks really cool and I like the simplicity of the API. I'm not sure if this is the right platform for this discussion, but it would be nice if some of the magic parts of Flux were a little easier to follow. I think it would help if there were fewer macros and better use of types and multiple dispatch. Some examples include:

Limiting the scope of @net to only providing syntactic sugar, rather than including other behaviour that isn't clearly explained in the docs (e.g., is there an easy way to reuse outputs more than once outside of the @net syntax?). More explicit use of the model type hierarchy would help, as it is unclear, at least from the documentation, that @net is also subtyping Flux.Model.
Dispatching on a backend type for model compilation rather than including and evaluating module code might provide a clearer public API for implementing new backends, and allow customization of application code based on those backend types.

abstract Backend

type MXNet <: Backend end

function compile(backend::MXNet, m::Flux.Model)
    # this is where all the scary inspection of your model code is done in order to 
    # compile it into an MXModel

    # Return MXModel
end

function execute!(m::MXModel)
    # ...
end

I'm still reading through the code, so I apologize if I'm missing some of the requirements that necessitate the heavy use of macros and code evaluation.

MikeInnes · 2017-04-17T14:53:52Z

Yes, great to have this discussion here, thanks for opening.

I agree completely about the black-boxy-ness of @net, and I'd like to rework the docs to present things in a simpler way. Really, the core feature is being able to write an annotated definition like @net f(xs) = xs .* xs and have it run on a backend like TensorFlow. This process is necessarily somewhat magical, but it has a single, conceptually simple purpose, while the rest of Flux (training etc) should be very boring Julia code.

Currently @net type ... adds some orthogonal sugar on top of that. Some of it is pretty superfluous, like <: Flux.Model. The right approach is probably to explain the basics without it, then show where it adds extra convenience.

RE backends. The code eval stuff is really only a lazy-loading technique; if we simply included the backend files it would behave the same but force a dependency on both backend libs, which is unreasonable for the user. But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.

That aside, compile(::MXNet, x) is the same as compilemx(x) (which is equivalent to the current situation), unless you want to implement compile(::Any, x::Foo). I can't think of a use case for this and suspect the lazy loading is really the core issue here, but I'm happy to be corrected. Generally, if folks want to implement other backends or anything related, I'm more than happy to provide examples or fix things up as needed.

> is there an easy way to reuse outputs more than once outside of the @net syntax?

Can you elaborate on this question?

rofinn · 2017-04-26T23:32:48Z

> The right approach is probably to explain the basics without it, then show where it adds extra convenience.

I think that would help clarify things a lot. If I have time I'll try and work through some of the examples in the docs without @net to try and help compare the two options.

> But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.

Yeah, I've also run into issues when wanting lazy loading in Julia. I know packages like Extern.jl aim to address this in a more general way by placing your code in a macro block without changing it, but it's essentially doing the same thing.

> ... compile(::MXNet, x) is the same as compilemx(x) (which is equivalent to the current situation), unless you want to implement compile(::Any, x::Foo)...

I mostly just thought that compile(::Flux.Backend, ::Flux.Model) provides a cleaner API, so folks can write code that compiles models independently of which backend they're using. For example, if I have a project that compiles a bunch of different models, it would be nice if switching backends were as simple as creating a different Flux.Backend type at the beginning of my program. All my functions that compile models could then take a Flux.Backend and call compile(backend, model) without caring which backend it is (rather than needing to change every compilemx(model) line to compiletf(model)).
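The call site described above can be sketched as follows, in current Julia syntax. This is only an illustration under stated assumptions: `Backend`, `MXNetBackend`, `TensorFlowBackend`, `ToyModel`, `CompiledModel`, and `compile_all` are hypothetical names, not part of Flux, and the `compile` methods merely tag the model instead of doing any real compilation.

```julia
# Hypothetical stand-ins; none of these types exist in Flux.
abstract type Backend end
struct MXNetBackend <: Backend end
struct TensorFlowBackend <: Backend end

struct ToyModel
    name::String
end

# What a compiled model might carry: the source model and a backend label.
struct CompiledModel
    source::ToyModel
    target::Symbol
end

# One method per backend; a real implementation would build a backend graph here.
compile(::MXNetBackend, m::ToyModel) = CompiledModel(m, :mxnet)
compile(::TensorFlowBackend, m::ToyModel) = CompiledModel(m, :tensorflow)

# Application code only depends on the abstract Backend type.
compile_all(backend::Backend, models) = [compile(backend, m) for m in models]

models = [ToyModel("encoder"), ToyModel("decoder")]
backend = MXNetBackend()   # the only line to change when switching backends
compiled = compile_all(backend, models)
```

Swapping `MXNetBackend()` for `TensorFlowBackend()` changes which `compile` method runs without touching `compile_all`.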

> is there an easy way to reuse outputs more than once outside of the @net syntax?

> Can you elaborate on this question?

Sorry, I was just referring to the docs where it says

> For simple networks Chain is completely fine, although the @net version is more powerful as we can (for example) reuse the output l1 more than once.

but it doesn't really explain how @net can reuse the output. I think this is another case where showing how you might achieve similar behaviour without using @net would help clarify what @net is doing and better demonstrate why it's so useful.

MikeInnes · 2017-04-27T15:32:24Z

> I mostly just thought that compile(::Flux.Backend, ::Flux.Model) provides a cleaner API ...

This is an interesting use case I hadn't thought about as much. Still though, I think it works out the same either way; you can think of the compiletf function itself as representing the backend, but you just have to use call in place of compile:

global backend = compiletf
# later on
backend(mymodel)

One thing I could see coming up, though, is that we might want to provide a vocabulary over backends beyond compile. For example you could imagine hypothetical initialise(::MXNet) functions. I can't think of any immediate use case for this but it'd be a good reason to represent backends explicitly.
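To make the contrast concrete, here is a rough sketch of a backend represented as a function versus a backend represented as a type. Everything in it is an assumption for illustration: `compiletf_stub` and `compilemx_stub` stand in for the real compile functions, and `AbstractBackend`, `TF`, `MX`, `compile_with`, and the `initialise` methods are hypothetical names that do not exist in Flux.

```julia
# Stand-in compile functions; the real ones would build backend models.
compiletf_stub(model) = (:tensorflow, model)
compilemx_stub(model) = (:mxnet, model)

# Function-as-backend: enough when compiling is the only operation needed.
backend_fn = compiletf_stub
fn_result = backend_fn("mymodel")

# Type-as-backend: further operations can dispatch on the same value.
abstract type AbstractBackend end
struct TF <: AbstractBackend end
struct MX <: AbstractBackend end

compile_with(::TF, model) = compiletf_stub(model)
compile_with(::MX, model) = compilemx_stub(model)

# Hypothetical second verb, in the spirit of initialise(::MXNet).
initialise(::TF) = "tensorflow session ready"
initialise(::MX) = "mxnet context ready"

backend = MX()
status = initialise(backend)
result = compile_with(backend, "mymodel")
```

With a single verb the two styles are interchangeable; the type only starts to pay off once a second function such as `initialise` needs to know which backend is in use.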

> but it doesn't really explain how @net can reuse the output.

Ah ok, so this is just unclear writing on my part. Chain is function composition like |> or ∘; instead of writing h(x) = g(f(x)) you can write h = g ∘ f. This is convenient, but limited; you can't define this version of h in the same way:

function h(x)
  temp = f(x)
  return temp, g(temp)
end

This is all I meant by reusing outputs; storing and reusing the output of f more than once, rather than piping it straight into g.

It's possible to create more function combinators that can express this kind of logic, but much more intuitive to just write it out with normal Julia syntax; that's the power of @net.
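The difference can be seen with plain Julia functions, no Flux required. In this small sketch, `f` and `g` are arbitrary illustrative functions standing in for layers, and `h_composed` and `h_reuse` are names made up for the example.

```julia
# Illustrative functions; f and g stand in for layers.
f(x) = x .+ 1
g(x) = 2 .* x

# Pure composition: the intermediate value f(x) is passed to g and not kept.
h_composed = g ∘ f

# Ordinary function: the intermediate value is named, so it can be used twice.
function h_reuse(x)
    temp = f(x)
    return temp, g(temp)
end

x = [1, 2, 3]
composed_out = h_composed(x)
temp_out, final_out = h_reuse(x)
```

Here `h_reuse` returns both the intermediate and the final result, which `g ∘ f` alone cannot express.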

MikeInnes · 2017-05-04T17:31:42Z

I've just reworked the docs along these lines, so hopefully it should be easier to follow. I also added some internal docs, and you can check out how I've made the test code generic across backends here.

I'll close this for now, hopefully that clears some things up, but please do let me know if anything else needs clarifying – or any other feedback is welcome.

MikeInnes closed this as completed May 4, 2017

mboratko mentioned this issue Sep 17, 2018

Adjoint for CuArrays crashes on GPU (passing and using non-bitstype argument) #401

Closed

wpeguero mentioned this issue Jan 1, 2023

Method Error when using Flux.withgradient #2148

Closed
