
Crash on 0.7.0-rc3.0 #2

Closed

awf opened this issue Aug 15, 2018 · 10 comments

Comments

@awf commented Aug 15, 2018

The attached .jl file produces the attached log when run with
julia .\zg-logsumexp.jl

Apologies, I haven't tested with 1.0. The Zygote package is added from master in this repo.

zg.log

using Test
using Printf
using Zygote
using BenchmarkTools

# Numerically stable log-sum-exp: shift by the maximum before exponentiating.
function logsumexp(x::Array{Float64,1})
  A = maximum(x)
  ema = exp.(x .- A)
  sema = sum(ema)
  log(sema) + A
end

r = rand(5);
@test logsumexp(r) ≈ log(sum(exp.(r))) atol=1.0e-8
@printf("logsumexp: %f = %f\n", logsumexp(r), log(sum(exp.(r))));

# logsumexp returning both the value and the gradient
# (the "Jacobian" of a scalar-valued function is its gradient vector).
function logsumexp_both(x::Array{Float64,1})
  A = maximum(x)
  ema = exp.(x .- A)
  sema = sum(ema)
  l = log(sema) + A
  Jacobian = ema ./ sema
  return (l, Jacobian)
end

x = rand(100)

"Timing logsumexp_both" |> println
@btime logsumexp_both(x)

"Timing zygote" |> println
@btime Zygote.gradient(logsumexp, x)

@jekbradbury commented Aug 15, 2018

Zygote requires (edit: now merely works faster on) a Julia built from source including Mike's patches; I have a Docker image with such a build here: docker pull gcr.io/personal-204923/julia:zygote

@MikeInnes (Member)

Ideally we shouldn't be crashing Julia's codegen here, as opposed to just throwing an error, but this is probably easily fixed by adding the right gradient definition (we have very few of those right now; * and + on scalars is enough to check that the chain rule works :)). I'll take a look shortly.
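
As a minimal sketch of the kind of chain-rule check described above, using only scalar * and + (the functions f and g here are illustrative, not part of Zygote):

using Zygote

f(x) = 3x + 2
g(x) = f(x) * x          # g(x) = 3x^2 + 2x, so g'(x) = 6x + 2

Zygote.gradient(g, 1.0)  # chain rule should give (8.0,)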

@awf (Author) commented Aug 16, 2018

Just to confirm, in case it's useful: this also happens on James's image.

@baggepinnen (Contributor) commented Aug 18, 2018

I'm experiencing a segfault with the following code on Julia v0.7 / Zygote v0.1. I can't tell whether it's related to this issue:

using ForwardDiff, Zygote
using LinearAlgebra  # for norm

w = randn(3,3)
x = randn(3)
f(w,x) = w*x

# Returns a loss closure; it contains nested differentiation of the model
# with respect to the input. (Note: the closure's `w` shadows the outer `w`
# captured by ∇xf.)
function loss(w,x)
    ∇xf(x) = ForwardDiff.jacobian(x->f(w,x), x)
    w -> sum(abs2.(w)) + norm(∇xf(x))
end

l = loss(w,x)
g = Zygote.gradient(l,w)

@ngphuoc commented Aug 21, 2018

I modified the example in the README to add a loss function, and it crashed with a segfault too:

using Zygote: @grad, Params, gradient

W, b = rand(2, 3), rand(2);

predict(x) = W*x .+ b;

loss(x,y) = sum(abs2, y .- predict(x))

g = gradient(() -> sum(loss([1,2,3], [1, 1])), Params([W, b]))

g[W], g[b]

@MikeInnes (Member)

I added a simple gradient definition for maximum, which fixes the original example. I really need to bring over all the definitions that Flux has.

I also just realised that our broadcast grad is not type-stable, so this benchmark comes out a bit slower than it needs to be.
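
For the curious, here is a sketch of what such a gradient definition for maximum might look like, written with the @grad macro imported earlier in this thread (the exact macro name and signature vary across Zygote versions, so treat this as illustrative rather than Zygote's actual definition):

using Zygote: @grad

# Only the largest element affects the output, so the pullback scatters
# the incoming sensitivity Δ into a one-hot vector at the argmax.
@grad function maximum(x::AbstractVector)
    i = argmax(x)
    maximum(x), Δ -> begin
        g = zero(x)
        g[i] = Δ
        (g,)
    end
end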

@MikeInnes (Member)

On current release Julia (I'll remove any need for a source build this week, though there'll still be a flag for type-system abuse):

(v1.0) pkg> add Zygote#master IRTools#master

julia> using Zygote, BenchmarkTools

julia> function logsumexp(x::Array{Float64,1})
         A = maximum(x)
         ema = exp.(x .- A)
         sema = sum(ema)
         log(sema) + A
       end
logsumexp (generic function with 1 method)

julia> const x = rand(100);

julia> @btime logsumexp(x)
  1.515 μs (1 allocation: 896 bytes)
5.119135839482861

julia> @btime Zygote.gradient(logsumexp, x);
  4.882 μs (35 allocations: 7.61 KiB)

Almost all of the time is spent in broadcast; I imagine there's still more we could do to be clever here, though it's hard to see how we'd match the hand-written version without special knowledge.

@chriselrod commented Sep 11, 2018

The overhead for small problems still seems high (so my tests with StaticArrays aren't fast yet), but...

julia> @btime logsumexp_both(x);
  960.263 ns (3 allocations: 1.78 KiB)

julia> @btime Zygote.gradient(logsumexp, x);
  2.737 μs (35 allocations: 7.61 KiB)

julia> @btime ReverseDiff.gradient!($results, $compiled_tape, $inputs);
  15.900 μs (0 allocations: 0 bytes)

julia> @btime ForwardDiff.gradient!($res, logsumexp, x, $cfg);
  25.158 μs (10 allocations: 87.50 KiB)

This was on regular 1.0.

This is awesome to see, because it is (a) significantly faster, and (b) Zygote didn't require me to redefine logsumexp to be generic.

@MikeInnes (Member) commented Sep 11, 2018

Nice!

I did notice one of the earlier StaticArrays examples not inferring properly. Feel free to open a new issue for anything like that.
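
One quick, Zygote-agnostic way to check for inference problems like that is @code_warntype (with logsumexp and x as defined above):

using InteractiveUtils  # provides @code_warntype outside the REPL
@code_warntype Zygote.gradient(logsumexp, x)  # Any/Union-typed entries flag inference failures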

@awf (Author) commented Sep 13, 2018

Very nice. Note that my initial code wasn't quite comparing apples to apples: logsumexp_both computes the function value too. For timing, we should compare Zygote.gradient to:

# Hand-written gradient of logsumexp (value computation omitted).
function grad_logsumexp(x::Array{Float64,1})
  A = maximum(x)
  ema = exp.(x .- A)
  sema = sum(ema)
  # l = log(sema) + A
  return ema ./ sema
end
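
An apples-to-apples timing would then look something like this (with x and logsumexp defined as above; numbers will vary by machine):

@btime grad_logsumexp(x)
@btime Zygote.gradient(logsumexp, x)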

bors bot pushed a commit that referenced this issue on Apr 22, 2020: "Merge upstream changes."