# Flux.jl Basics
Date: 17.8.2021

References
- https://fluxml.ai/Flux.jl/stable/models/basics/

In [1]:
using Pkg
using Flux
using Plots
Pkg.status()

      Status `~/Dropbox/01code/julia/intro_julia/Flux/Project.toml`
  [052768ef] CUDA v3.3.5 ⚲
  [587475ba] Flux v0.12.6 ⚲
  [1902f260] Knet v1.4.8 ⚲
  [91a5bcdd] Plots v1.20.0 ⚲
  [c3e4b0f8] Pluto v0.15.1 ⚲
  [92933f4c] ProgressMeter v1.7.1


In [2]:
VERSION

v"1.6.0"

## Computing gradients
Let's look at the docstring of the function we will use:

In [3]:
?gradient

search: gradient ColorGradient



```
gradient(f, args...)
```

Returns a tuple containing `∂f/∂x` for each argument `x`, the derivative (for scalar `x`) or the gradient.

`f(args...)` must be a real number, see [`jacobian`](@ref) for array output.

See also [`withgradient`](@ref) to keep the value `f(args...)`, and [`pullback`](@ref) for value and back-propagator.

```jldoctest; setup=:(using Zygote)
julia> gradient(*, 2.0, 3.0, 5.0)
(15.0, 10.0, 6.0)

julia> gradient(x -> sum(abs2,x), [7.0, 11.0, 13.0])
([14.0, 22.0, 26.0],)

julia> gradient([7, 11], 0, 1) do x, y, d
         p = size(x, d)
         sum(x.^p .+ y)
       end
([14.0, 22.0], 2, nothing)
```

---

```
gradient(() -> loss(), ps::Params) -> Grads
```

Gradient with implicit parameters. Takes a zero-argument function, and returns a dictionary-like container, whose keys are arrays `x in ps`.

```jldoctest; setup=:(using Zygote)
julia> x = [1 2 3; 4 5 6]; y = [7, 8]; z = [1, 10, 100];

julia> g = gradient(Params([x, y])) do
         sum(x .* y .* z')
       end
Grads(...)

julia> g[x]
2×3 Matrix{Int64}:
 7  70  700
 8  80  800

julia> haskey(g, z)  # only x and y are parameters
false
```


### Example 1: A simple case
$$
f(x) = 3x^2 + 2x + 1
$$

$$
\mathrm{grad} f(x) = 6x + 2
$$

$$
\mathrm{grad} \ \mathrm{grad} f(x) = 6
$$

In [4]:
f(x) = 3x^2 + 2x + 1
df(x) = Flux.gradient(f, x)[1] # the first element of the tuple returned

df (generic function with 1 method)

In [5]:
df(2)

14.0

In [6]:
d2f(x) = Flux.gradient(df, x)[1]

d2f (generic function with 1 method)

In [7]:
d2f(2)

6.0
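
As a sanity check, the derivative from automatic differentiation can be compared with a numerical approximation. The following is a minimal sketch, not part of the original notebook: the helper `central_diff` and the step size `h` are illustrative assumptions.

```julia
using Flux

f(x) = 3x^2 + 2x + 1
df(x) = Flux.gradient(f, x)[1]

# Central finite difference. `central_diff` and `h` are hypothetical
# names chosen for this illustration, not part of Flux.
central_diff(g, x; h=1e-5) = (g(x + h) - g(x - h)) / (2h)

x0 = 2.0
ad = df(x0)               # derivative from automatic differentiation
fd = central_diff(f, x0)  # numerical approximation of the same derivative
println((ad, fd))
```

Both values should agree with the analytic result $6x + 2$ at $x = 2$, up to floating-point rounding in the finite-difference estimate.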

### Example 2: Passing multiple arguments to `gradient`
$$
f(x,y) = \sum_i (x_i - y_i)^2
$$

In [8]:
f(x, y) = sum((x .- y).^2)

f (generic function with 2 methods)

In [9]:
Flux.gradient(f, [2, 1], [2, 0])

([0, 2], [0, -2])
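
The result can be checked by hand. Differentiating the sum term by term gives

$$
\frac{\partial f}{\partial x_i} = 2(x_i - y_i), \qquad \frac{\partial f}{\partial y_i} = -2(x_i - y_i)
$$

With $x = (2, 1)$ and $y = (2, 0)$ we have $x - y = (0, 1)$, so the two gradients are $(0, 2)$ and $(0, -2)$, matching the tuple returned above.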

### Example 3: Passing many parameters to `gradient`
Use `Flux.params()` to pass a collection of parameters.

In [10]:
x = [2, 1]
y = [2, 0]
parameters = Flux.params(x,y)

Params([[2, 1], [2, 0]])

In [11]:
gs = Flux.gradient(parameters) do
    f(x,y)
end
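# The do-block is the zero-argument function passed to `gradient`; the result
# holds the gradient of f(x, y) with respect to each array in `parameters`.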

Grads(...)

In [12]:
gs[x]

2-element Vector{Int64}:
 0
 2

In [13]:
gs[y]

2-element Vector{Int64}:
  0
 -2
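
The implicit-parameter form should give the same numbers as passing the arguments explicitly, as in Example 2. The sketch below compares the two styles. It assumes a Flux version that still supports implicit parameters (such as the v0.12.6 listed above), and it uses floating-point arrays and the name `ps` purely for illustration.

```julia
using Flux

f(x, y) = sum((x .- y).^2)
x = [2.0, 1.0]
y = [2.0, 0.0]

# Explicit style: one gradient per positional argument.
gx, gy = Flux.gradient(f, x, y)

# Implicit style: gradients are looked up by the parameter arrays themselves.
ps = Flux.params(x, y)
gs = Flux.gradient(() -> f(x, y), ps)

# Params is iterable, so every gradient can be visited in a loop.
for p in ps
    println(p, " => ", gs[p])
end
```

The implicit style is convenient when a model has many parameter arrays, since the loss function does not need to list them all as arguments.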

## Building a simple linear regression model

In [14]:
W = rand(2, 5)
b = rand(2)
predict(x) = W*x .+ b

predict (generic function with 1 method)

In [15]:
function loss(x, y)
    ŷ = predict(x)
    sum((y .- ŷ).^2)
end

loss (generic function with 1 method)

In [16]:
x, y = rand(5), rand(2) # Dummy data
loss(x, y)

5.489018517233237
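
Written out, the model and loss defined above are

$$
\hat{y} = Wx + b, \qquad L(W, b) = \sum_i (y_i - \hat{y}_i)^2
$$

Differentiating with respect to the parameters gives the gradients that automatic differentiation is expected to reproduce:

$$
\frac{\partial L}{\partial W} = -2\,(y - \hat{y})\,x^\top, \qquad \frac{\partial L}{\partial b} = -2\,(y - \hat{y})
$$

Here $\partial L / \partial W$ has the same $2 \times 5$ shape as $W$, and $\partial L / \partial b$ has the same length as $b$.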

### Improve the prediction

In [17]:
parameters = Flux.params(W, b)

Params([[0.28195219904777935 0.924176700144618 … 0.16621267215193924 0.4950159182891165; 0.7723926663698144 0.6907842903983845 … 0.13320574997558343 0.5858352376259912], [0.49407665816278556, 0.7098413976911151]])

In [18]:
gs = Flux.gradient(() -> loss(x, y), parameters) 

Grads(...)
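
One way to use these gradients to improve the prediction is a single gradient descent step, sketched below. The step size `η = 0.1` is an assumed value, and the block redefines the model so that it runs on its own. It relies on the implicit-parameter API (`Flux.params`) available in the Flux v0.12.6 environment listed above.

```julia
using Flux

W = rand(2, 5)
b = rand(2)
predict(x) = W*x .+ b
loss(x, y) = sum((y .- predict(x)).^2)

x, y = rand(5), rand(2)  # dummy data, as in the cells above

η = 0.1  # learning rate (assumed value for illustration)

before = loss(x, y)
gs = Flux.gradient(() -> loss(x, y), Flux.params(W, b))

# Move each parameter a small step against its gradient, in place.
W .-= η .* gs[W]
b .-= η .* gs[b]

after = loss(x, y)
println((before, after))
```

Because `W` and `b` are updated in place, `predict` and `loss` pick up the new values automatically; repeating the gradient and update steps in a loop would continue to reduce the loss on this data point.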