# Controlling a Trebuchet

![trebuchet](https://fluxml.ai/assets/2019-03-05-dp-vs-rl/trebuchet-basic.gif)

A trebuchet throws a mass at a target. The mass must be released at a
suitable angle and velocity so that it lands on the target. The release
velocity is determined by the trebuchet's counterweight. Given the
environmental conditions, we need to predict the release angle and the
counterweight.

* **Input:** wind speed, target distance
* **Output:** release angle, counterweight

![overview](https://fluxml.ai/assets/2019-03-05-dp-vs-rl/trebuchet-flow.png)

In [2]:
using Flux
using Random
using Plots
plotlyjs()
import Trebuchet
import Zygote

# linear interpolation helper
lerp(x, lo, hi) = x*(hi-lo)+lo

# Zygote.ignore currently has an issue; a hand-written equivalent works as a workaround.
# For updates, see https://github.com/FluxML/Zygote.jl/issues/677
gradient_ignore(f) = f()
Zygote.@adjoint gradient_ignore(f) = gradient_ignore(f), _ -> nothing
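
The `lerp` helper above maps a value in `[0, 1]` linearly onto the interval `[lo, hi]`. A minimal sketch of how it behaves, using the target range of 20 to 100 that appears later in the training section:

```julia
# same definition as in the cell above, repeated so the snippet runs on its own
lerp(x, lo, hi) = x * (hi - lo) + lo

@show lerp(0.0, 20, 100)  # lower end of the interval
@show lerp(0.5, 20, 100)  # midpoint
@show lerp(1.0, 20, 100)  # upper end of the interval
```

Feeding it `rand()`, which is uniform on `[0, 1)`, therefore gives a uniform sample from `[lo, hi)`.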

In [3]:
function visualize_trebuchet(;target=100, wind_speed=1.0, release_angle=45, weight=98.09)  # default values from TrebuchetState
    # `simulate` mutates the state, so we encapsulate it in our own function
    release_angle = Trebuchet.deg2rad(release_angle)
    state = Trebuchet.TrebuchetState(wind_speed=wind_speed, release_angle=release_angle, weight=weight)
    Trebuchet.simulate(state)  # mutates `state`; by Julia convention it would be named `simulate!`
    Trebuchet.visualise(state, target)
end 

function shoot_trebuchet(;wind_speed=1.0, release_angle=45, weight=98.09)
    release_angle = Trebuchet.deg2rad(release_angle)
    state = Trebuchet.TrebuchetState(;wind_speed=wind_speed, release_angle=release_angle, weight=weight)
    weight > 0 || return 0.0
    Trebuchet.simulate(state)
    Trebuchet.endDist(state)
end

shoot_trebuchet (generic function with 1 method)

In [19]:
visualize_trebuchet(target=50)

In [20]:
shoot_trebuchet()

73.87618208052133

## Create Model

In [33]:
Random.seed!(0)
model = Chain(Dense(2, 16, σ),
              Dense(16, 64, σ),
              Dense(64, 16, σ),
              Dense(16, 2)) |> f32
θ = params(model)

Params([Float32[0.31680438 -0.41162738; 0.47479916 -0.13068382; … ; 0.03846823 0.46813533; 0.4688731 -0.32207933], Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], Float32[0.05972081 0.21486466 … -0.2639364 -0.07606439; 0.106699176 0.03548042 … -0.16236916 0.15217024; … ; 0.25762793 0.24063167 … 0.09960718 -0.13336718; 0.2398328 -0.012614466 … 0.06286959 -0.13437746], Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], Float32[0.049149316 0.23821789 … 0.16744229 -0.07608868; 0.021947928 0.16212758 … -0.107711814 -0.097368784; … ; -0.049132533 -0.038270485 … -0.0883247 -0.25962076; -0.10204916 -0.2655446 … -0.17327274 0.1272158], Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], Float32[-0.06311837 -0.38680297 … 0.44780564 0.5259007; -0.3301765 -0.43905833 … 0.10982446 0.057081543], Float32[0.0, 0.0]])

## Create Loss

The Trebuchet simulation uses `try`/`catch` internally, which Zygote's default reverse-mode differentiation does not yet support. We therefore wrap the shot in `Zygote.forwarddiff`, which tells Zygote to differentiate that call in forward mode instead.
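
To illustrate the pattern in isolation, here is a minimal sketch that assumes only that Zygote is installed. The function `guarded_norm` is a hypothetical stand-in for the simulator: it contains a `try`/`catch` block, and `Zygote.forwarddiff` is asked to differentiate it in forward mode inside an ordinary `gradient` call.

```julia
using Zygote

# hypothetical stand-in for a simulator that uses try/catch internally
function guarded_norm(v)
    try
        sqrt(sum(abs2, v))
    catch
        0.0
    end
end

# the outer gradient call is reverse mode; the wrapped call is differentiated in forward mode
grads = gradient([3.0, 4.0]) do v
    Zygote.forwarddiff(guarded_norm, v)
end

# analytically, the gradient of the Euclidean norm at (3, 4) is (0.6, 0.8)
println(grads[1])
```

Forward mode is a reasonable choice here because the wrapped function has only a few inputs (three in the trebuchet case: wind speed, release angle, weight).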

In [34]:
function aim(wind, target)
  angle, weight = model([wind, target])
  angle = σ(angle)*90
  weight = weight + 200
  (release_angle=angle, weight=weight)
end

function visualize_model(;wind_speed=1.0, target=100)
    release_angle, weight = aim(wind_speed, target)
    visualize_trebuchet(target=target, wind_speed=wind_speed, release_angle=release_angle, weight=weight)
end

function shoot_model(;wind_speed=1.0, target=100)
    release_angle, weight = aim(wind_speed, target)
    # shoot_trebuchet uses array mutation internally, which Zygote's reverse mode does not yet support;
    # forward mode handles everything, including array mutation and try/catch,
    # so we mark the call accordingly
    Zygote.forwarddiff([wind_speed, release_angle, weight]) do (wind_speed, release_angle, weight)
        shoot_trebuchet(wind_speed=wind_speed, release_angle=release_angle, weight=weight)
    end
end

shoot_model (generic function with 1 method)
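
The `aim` function above turns the two raw network outputs into physical quantities: the sigmoid squashes the first output into the open interval (0, 90) degrees, and the second is shifted by 200 so that an untrained network starts near a plausible counterweight. A minimal sketch of just that mapping, without the network (`sigmoid` and `to_physical` are hypothetical names introduced for illustration):

```julia
# plain logistic function, standing in for Flux's σ
sigmoid(x) = 1 / (1 + exp(-x))

# same post-processing as in `aim`, applied to raw numbers instead of model outputs
function to_physical(raw_angle, raw_weight)
    release_angle = sigmoid(raw_angle) * 90  # always strictly between 0 and 90 degrees
    weight = raw_weight + 200                # unconstrained, centered on 200
    (release_angle = release_angle, weight = weight)
end

for raw in (-10.0, 0.0, 10.0)
    println(raw, " => ", to_physical(raw, raw))
end
```

Note that the weight is not bounded below, which is why `shoot_trebuchet` returns `0.0` for non-positive weights.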

In [35]:
visualize_model(wind_speed=1.0, target=50)

In [36]:
shoot_model(wind_speed=1.0, target=50)

82.14732721935346

## Train

In [42]:
target_min, target_max = 20, 100  # range of target distances
wind_speed_mean = 5  # scale (standard deviation) of the wind speed; despite the name, neither a mean nor a maximum

random_target() = (
    wind_speed = randn() * wind_speed_mean,
    target = lerp(rand(), target_min, target_max)
)

random_target (generic function with 1 method)
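
To see what distribution the training data follows, the sampler can be reproduced with the standard library alone. The sketch below uses a hypothetical `sample_target` with the same constants as keyword defaults. Wind speeds are normally distributed around zero, so negative values occur (read here as wind in the opposite direction, which is an assumption), while targets are uniform on the target range.

```julia
using Random
using Statistics

lerp(x, lo, hi) = x * (hi - lo) + lo

# hypothetical stand-alone version of `random_target`
sample_target(; wind_scale = 5, target_min = 20, target_max = 100) = (
    wind_speed = randn() * wind_scale,
    target = lerp(rand(), target_min, target_max),
)

Random.seed!(0)
samples = [sample_target() for _ in 1:10_000]
winds = [s.wind_speed for s in samples]
targets = [s.target for s in samples]

println("wind speed mean: ", round(mean(winds), digits = 2))
println("wind speed std:  ", round(std(winds), digits = 2))
println("target extrema:  ", extrema(targets))
```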

In [43]:
losses = Float64[]
iterations = Int[]
i = 0

0

In [55]:
optimizer = ADAM()
try
    while true
        i += 1
        wind_speed, target = random_target()
        ∇θ = gradient(θ) do
            hit = shoot_model(wind_speed=wind_speed, target=target)
            loss = (hit - target)^2
            gradient_ignore() do
                if i % 100 == 0
                    push!(losses, loss)
                    push!(iterations, i)
                    plot(iterations, losses, show = :inline, yscale = :log10,
                        label = "square-loss", xlabel = "#iteration", ylabel="loss (log10 scale)")
                end
            end
            loss
        end
        Flux.update!(optimizer, θ, ∇θ)
    end
    
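catch e
    if e isa InterruptException
        # interrupting the kernel stops training; show a shot at a random target
        visualize_model(;random_target()...)
    else
        rethrow()  # do not silently swallow real errors
    end
end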
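
The loop above follows one idea: run the simulator, measure the squared miss distance, and step the parameters along the negative gradient. The sketch below shows the same idea on a deliberately simplified problem. It is not the Trebuchet.jl simulator: it assumes an idealized drag-free projectile with fixed launch speed, optimizes a single release angle instead of network weights, and uses a hand-written derivative with plain gradient descent instead of Zygote and ADAM. All names in it are hypothetical.

```julia
const GRAVITY = 9.81

# range of an ideal projectile launched at angle θ (radians) with speed v
ideal_range(θ, v) = v^2 * sin(2θ) / GRAVITY

# derivative of the range with respect to θ
ideal_range_grad(θ, v) = 2 * v^2 * cos(2θ) / GRAVITY

# gradient descent on the squared miss distance (ideal_range - target)^2
function fit_angle(target; v = 40.0, θ = deg2rad(20.0), lr = 5e-6, steps = 200)
    for _ in 1:steps
        miss = ideal_range(θ, v) - target
        θ -= lr * 2 * miss * ideal_range_grad(θ, v)  # chain rule on the squared loss
    end
    θ
end

θ_fit = fit_angle(100.0)
println("release angle (deg): ", rad2deg(θ_fit))
println("landing distance:    ", ideal_range(θ_fit, 40.0))
```

The notebook's version differs in that the gradient flows through the simulator into the network weights, so a single trained model can aim at any target and wind speed rather than being re-optimized per shot.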

In [59]:
visualize_model(;random_target()...)

# Thank you

For more details and further examples, see the original blog post: https://fluxml.ai/2019/03/05/dp-vs-rl.html.

Or ask me directly at s.sahm@reply.de.