using Zygote #669

Merged
merged 92 commits into master on Sep 11, 2019


@MikeInnes
Member

commented Mar 8, 2019

Otherwise known as "break all the things". This will be a huge change so I'm beginning to prepare now, even though Zygote is still a couple of months off from being really ready. Do not try this at home (yet) – this branch is eventually aimed at beta testers, but isn't even ready for that yet.

The idea is to break as little code as possible, which means supporting the current Params API; but I also want to start prototyping the nicer things discussed in #628 and other issues.

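Blocking issues:

  • Get the tests passing.
  • Check tests on GPU.
  • Rewrite all the docs.
  • Cache invalidation (jrevels/Cassette.jl#6).
  • Moving over adjoints (FluxML/Zygote.jl#81).
  • General Zygote robustness.
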
Nice to have:

  • Robust nested AD (may not be a blocker if one can still use Tracker with Flux).
  • Zygote support for modules / globals as discussed in #628, along with #637.
  • Better train/test mode as in #643.

If you're the kind of person who ignores triangular road signs, you can try this with

]add Flux#zygote Zygote#master
@MikeInnes

Member Author

commented Apr 4, 2019

I had initially started on #666 and #637 in this branch, but that's turning into a big project, so I've stripped the commits for now (preserved on mji/step). As excited as I am to get rid of Params, I think the right move is to shelve that for now and focus on the core AD issues.

@staticfloat

Contributor

commented Apr 9, 2019

In my own experiments trying to use Zygote with Flux, I first do model = mapleaves(Flux.data, model), then define my own Zygote-based update step:

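# No gradient for this leaf or field: nothing to update.
zyg_update!(opt, model, updates::Nothing) = nothing

# Array leaf: hand off to Flux's optimiser (ADAM here) to get the step Δ,
# then apply it to the array in place.
function zyg_update!(opt, model::AbstractArray, updates::AbstractArray)
    Δ = Flux.Optimise.update!(opt, model, updates)
    return model .-= Δ
end

# Anything else: recurse into each field, pairing it with the matching
# field of the gradient. A model with no fields simply falls through.
function zyg_update!(opt, model, updates)
    for field_idx in 1:nfields(model)
        zyg_update!(opt, getfield(model, field_idx), getfield(updates, field_idx))
    end
    return model
end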

Things actually work fairly well, except that BatchNorm freaks out, complaining about mutating arrays. To work around this, I am using my own BatchNorm implementation, re-architected to work with Zygote. Not sure if that is the direction you want to go with this, Mike, but it worked well for us on TPU. I will note, anecdotally, that my convolution- and batchnorm-heavy workload (a convolutional autoencoder for large images) uses ~20% less memory with Zygote than with Tracker.
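
For readers who want to see the recursive, field-by-field update above in isolation, here is a minimal sketch that runs without Flux or Zygote. It assumes the gradient arrives as a structure mirroring the model's fields, with `nothing` where a field has no gradient, which is how Zygote reports gradients for structs. The name `sgd_update!`, the step size, and the toy model are made up for illustration, and plain gradient descent stands in for the ADAM call in the original snippet.

```julia
# Hypothetical helper, not part of Flux or Zygote: a plain gradient-descent
# version of the recursive update, so that it runs without any packages.

# No gradient for this leaf or field: leave it untouched.
sgd_update!(η, model, grads::Nothing) = model

# Array leaf: take one in-place descent step.
function sgd_update!(η, model::AbstractArray, grads::AbstractArray)
    model .-= η .* grads
    return model
end

# Anything else: walk the fields, pairing each with the matching gradient field.
function sgd_update!(η, model, grads)
    for i in 1:nfields(model)
        sgd_update!(η, getfield(model, i), getfield(grads, i))
    end
    return model
end

# Toy "model" and a gradient with the same layout (values invented for the example).
model = (W = [1.0 2.0; 3.0 4.0], b = [0.5, 0.5], σ = identity)
grads = (W = [1.0 1.0; 1.0 1.0], b = [1.0, 1.0], σ = nothing)

sgd_update!(0.1, model, grads)
```

The arrays are updated in place, so `model.W` and `model.b` change even though the NamedTuple itself is immutable, while the `σ` field is skipped because its gradient is `nothing`.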

@staticfloat

Contributor

commented May 3, 2019

I wanted to use this with the new NNlib overhaul, so I rebased this branch on top of master; I'm not certain I did everything right, but sf/zygote_updated contains my rebased version. Mike, if you like it, you can just force-push it to #zygote and keep on working, or you can just do the rebase yourself.

MikeInnes and others added 17 commits Aug 19, 2019
Dhairya Gandhi
@@ -3,25 +3,25 @@
Consider a [simple linear regression](../models/basics.md). We create some dummy data, calculate a loss, and backpropagate to calculate gradients for the parameters `W` and `b`.

```julia
-using Flux, Flux.Tracker
+using Flux, Flux.Zygote

@MikeInnes

MikeInnes Sep 11, 2019

Author Member

@dhairyagandhi96 Flux already exports gradient, so this may not be necessary

@dhairyagandhi96
Dhairya Gandhi
Co-Authored-By: Mike J Innes <mike.j.innes@gmail.com>
@MikeInnes MikeInnes marked this pull request as ready for review Sep 11, 2019
@MikeInnes

Member Author

commented Sep 11, 2019

bors r+

bors bot added a commit that referenced this pull request Sep 11, 2019
Merge #669
669: using Zygote r=MikeInnes a=MikeInnes

Otherwise known as "break all the things". This will be a huge change so I'm beginning to prepare now, even though Zygote is still a couple of months off from being really ready. **Do not try this at home** (yet) – this branch is eventually aimed at beta testers, but isn't even ready for that yet.

The idea is to break as little code as possible, which means supporting the current `Params` API; but I also want to start prototyping the nicer things discussed in #628 and other issues.

Blocking issues:

* [x] Get the tests passing.
* [x] Check tests on GPU.
* [x] Rewrite all the docs.
* [x] Cache invalidation (jrevels/Cassette.jl#6).
* [x] Moving over adjoints (FluxML/Zygote.jl#81).
* [x] General Zygote robustness.

Nice to have:

* [ ] Robust nested AD (may not be a blocker if one can still use Tracker with Flux).
* [x] Zygote support for modules / globals as discussed in #628, along with #637.
* [x] Better train/test mode as in #643.

If you're the kind of person who ignores triangular road signs, you can try this with

```julia
]add Flux#zygote Zygote#master
```

Co-authored-by: Mike J Innes <mike.j.innes@gmail.com>
Co-authored-by: Elliot Saba <staticfloat@gmail.com>
Co-authored-by: thebhatman <manjunathbhat9920@gmail.com>
@MikeInnes

Member Author

commented Sep 11, 2019

There seem to be some issues with our GPU CI, so I'm just merging.

@MikeInnes MikeInnes merged commit bdeb9c6 into master Sep 11, 2019
1 of 3 checks passed
continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error
bors: Running
continuous-integration/travis-ci/pr: The Travis CI build passed
@bors

Contributor

commented Sep 11, 2019

Build failed
