# Welcome! #
## Enzyme and Zygote Performance Shootout ##

### Project Information ###

If you're interested in a project synopsis and the performance benchmarks we'll be measuring, please take a look at our project documentation.

### What Now? ###

Let's start by setting up Enzyme and Zygote in our environment so we can test some differentiation. Our primary benchmark is the wall-clock time of a differentiation operation, which we measure with Julia's `@time` macro. We then compare these times, along with other benchmarks such as memory allocation, to see how quickly each package differentiates.

#### Import the Packages ####

In [1]:
using Pkg

Pkg.add("Enzyme")
Pkg.add("Zygote")

In [2]:
using Zygote
using Enzyme
using Plots
using Statistics
using Printf

#### What Next? ####

Now that the packages have been added to the environment, we can start testing them out. First, a quick demonstration of the `@time` macro we will be using.

In [3]:
function fib(n)
    if n <= 1
        return 1
    else
        return fib(n - 1) + fib(n - 2)
    end
end
    
@time fib(20)

  0.000039 seconds


10946

#### Functions ####

For this shootout we will differentiate functions in service of rootfinding problems. Specifically, we will test how efficiently each package supplies derivatives to five rootfinding algorithms: Halley's, Golbabai-Javidi's, Newton's, Noor's, and Zhanlav's methods.

The update rules are as follows:

**Halley's Method**

$x_{n + 1} = x_{n} - \frac{2f(x_n)f^{\prime}(x_n)}{2f^{\prime}(x_n)^2 - f(x_n)f^{\prime \prime}(x_n)}$

**Golbabai-Javidi Method**

$x_{n + 1} = x_{n} - \frac{f(x_n)}{f^{\prime}(x_n)} - \frac{f(x_n)f^{\prime \prime}(x_n)}{2(f^{\prime \prime \prime}(x_n) - f(x_n)f^{\prime}(x_n)f^{\prime \prime}(x_n))}$

**Newton's Method**

$x_{n + 1} = x_{n} - \frac{f(x_n)}{f^{\prime}(x_n)}$

**Noor's Method**

$y_n = x_n - \frac{f(x_n)}{f^{\prime}(x_n)}$

$x_{n + 1} = x_{n} - \frac{f(x_n)}{f^{\prime}(x_n)} + (\frac{f(x_n)}{f^{\prime}(x_n)})\frac{f^{\prime}(y_n)}{f^{\prime}(x_n)}$

**Zhanlav's Method**

$z_n = y_n - \frac{f(y_n)}{f^{\prime}(y_n)}$

$q_n = z_n - \frac{f(z_n)}{f^{\prime}(y_n)}$

$y_{n + 1} = z_n - \frac{f(z_n) + f(q_n)}{f^{\prime}(y_n)}$

Before implementing these rootfinding algorithms, we demonstrate basic differentiation on a few simple functions. Like the timing test, this introduces the calls we will be using and how they work.

We will run these tests with both Enzyme and Zygote on the three functions defined below.

#### Test Functions ####

$f(x) = 5x^{10}$

$g(x) = 3x^3(\cos(x) - 10x)$

$h(x) = e^{\frac{5x}{2}\left(\sin(x)^{e^x}\right)}$

#### Zygote Method ####

Zygote is an automatic differentiation package written specifically for Julia; derivatives are computed with its `gradient` function. We hypothesize that this tool will be better optimized for Julia, and we test this below by timing some basic differentiation!

#### Enzyme Method ####

Enzyme is another automatic differentiation tool. It is not specific to Julia: it works at the LLVM level, so it can serve many different languages, and the Enzyme package used here provides its Julia interface. This package uses the `autodiff` function for computing derivatives. Given that the tool is not built specifically for Julia, we hypothesize that these calls will be significantly less optimized.

#### What are we testing? ####

For all three functions we will be timing the differentiation speed of both packages at an x-value of $2$.

#### Function One ####

In [4]:
f(x) = 5x^10
x = 2
@time zyg = gradient(x -> f(x), x)
@time enz = autodiff(f, Active(x))
@show zyg
@show enz

 24.521242 seconds (28.55 M allocations: 1.645 GiB, 5.67% gc time, 100.01% compilation time)
 17.295862 seconds (19.28 M allocations: 1.081 GiB, 4.87% gc time, 0.03% compilation time)
zyg = (25600.0,)
enz = (25600.0,)


(25600.0,)

#### Function Two ####

In [5]:
gf(x) = 3x^3 * (cos(x) - 10x)
x = 2
@time zyg = gradient(x -> gf(x), x)
@time enz = autodiff(gf, Active(x))
@show zyg
@show enz

  0.340222 seconds (495.14 k allocations: 28.543 MiB, 15.17% gc time, 99.93% compilation time)
  4.420047 seconds (6.26 M allocations: 362.785 MiB, 5.36% gc time, 37.76% compilation time)
zyg = (-996.8044243595134,)
enz = (-996.8044243595134,)


(-996.8044243595134,)

#### Function Three ####

In [6]:
hf(x) = exp((5x / 2) * (sin(x)^(exp(x))))
x = 2
@time zyg = gradient(x -> hf(x), x)
@time enz = autodiff(hf, Active(x))
@show zyg
@show enz

  0.327365 seconds (649.23 k allocations: 40.224 MiB, 99.93% compilation time)
  2.500848 seconds (3.32 M allocations: 188.145 MiB, 4.19% gc time, 89.03% compilation time)
zyg = (-105.63101149036947,)
enz = (-105.63101149036945,)


(-105.63101149036945,)
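
As a cross-check that does not depend on either package, the recorded values can be compared against derivatives worked out by hand. The sketch below is only an illustration: the names `df_exact`, `dg_exact`, and `dh_exact` are hypothetical helpers, not part of the notebook, and the formulas come from applying the product and chain rules to the three test functions.

```julia
# Hand-derived derivatives of the three test functions (hypothetical helpers,
# written for this check only).
df_exact(x) = 50x^9                                   # d/dx of 5x^10
dg_exact(x) = 9x^2 * cos(x) - 3x^3 * sin(x) - 120x^3  # d/dx of 3x^3 * (cos(x) - 10x)

function dh_exact(x)
    s  = sin(x)^exp(x)                         # sin(x)^(e^x)
    ds = s * exp(x) * (log(sin(x)) + cot(x))   # its derivative, valid where sin(x) > 0
    u  = (5x / 2) * s                          # exponent of h
    du = (5 / 2) * s + (5x / 2) * ds           # product rule
    return exp(u) * du                         # chain rule: h' = e^u * u'
end

x_check = 2.0
@show df_exact(x_check)
@show dg_exact(x_check)
@show dh_exact(x_check)
```

If the hand-derived values agree with both `zyg` and `enz` to within floating-point rounding, that is some evidence that the small discrepancy in the third function is rounding rather than a difference in method.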

#### What are the results? ####

Both packages produce essentially the same derivatives: the results are identical for the first two functions and differ only in the last printed digit (an absolute difference of about $2 \times 10^{-14}$) for the third. From this small test we can reasonably conclude that the two packages have similar accuracy. The timings are harder to interpret. Zygote was slower on the first function (about 24.5 s versus 17.3 s) but roughly 8 to 13 times faster on the other two, and `@time` reports that compilation accounts for a large share of most of these measurements, so a single call says little about the cost of differentiation itself. It is worth repeating the computations and comparing the respective means, variances, standard deviations, and medians. Memory allocation follows the same pattern: Enzyme allocated less on the first function and several times more on the other two, while generally reporting a smaller compilation share. A single run makes it hard to gauge which is better, so the same repeated-sampling approach is useful here too.
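
The `@time` outputs above attribute much of each measurement to compilation. One common way to separate the two costs is to time the same call twice and treat the first call as a warm-up. The sketch below uses only Base Julia; `square_sum` is a hypothetical stand-in workload, not one of the functions being benchmarked.

```julia
# Hypothetical stand-in workload; any function could be used here.
square_sum(n) = sum(i^2 for i in 1:n)

# The first call triggers JIT compilation of square_sum for an Int argument.
first_call = @elapsed square_sum(1_000)

# The second call reuses the compiled code, so it reflects run time only.
second_call = @elapsed square_sum(1_000)

@show first_call
@show second_call
```

The same pattern should carry over to the differentiation calls: run each one once before the timed loop, then collect samples from the later calls.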

#### Data Science ####

Let's collect a larger sample of differentiation tests to better understand which package performs better. Below we define a `for` loop that calls each differentiation function $100$ times and records the time and memory allocated by every call. These results are then passed to a summary function, `dataSci`, which uses Julia's `Statistics` functions to compute the mean, variance, standard deviation, and median of each sample. For the tests, we will once again differentiate at $x = 2$.

In [42]:
function dataSci(timeEnz, sizeEnz, timeZyg, sizeZyg)
    timeEnzMean = mean(timeEnz)
    sizeEnzMean = mean(sizeEnz)
    timeZygMean = mean(timeZyg)
    sizeZygMean = mean(sizeZyg)
    
    timeEnzVar = var(timeEnz)
    sizeEnzVar = var(sizeEnz)
    timeZygVar = var(timeZyg)
    sizeZygVar = var(sizeZyg)
    
    timeEnzStd = timeEnzVar ^ (1 / 2)
    sizeEnzStd = sizeEnzVar ^ (1 / 2)
    timeZygStd = timeZygVar ^ (1 / 2)
    sizeZygStd = sizeZygVar ^ (1 / 2)
    
    timeEnzMed = median(timeEnz)
    sizeEnzMed = median(sizeEnz)
    timeZygMed = median(timeZyg)
    sizeZygMed = median(sizeZyg)
    
    @printf "Enzyme Timing Results\n"
    
    @show timeEnzMean
    @show timeEnzVar
    @show timeEnzStd
    @show timeEnzMed
    
    @printf "Enzyme Memory Allocation Results\n"
    
    @show sizeEnzMean
    @show sizeEnzVar
    @show sizeEnzStd
    @show sizeEnzMed
    
    @printf "Zygote Timing Results\n"
    
    @show timeZygMean
    @show timeZygVar
    @show timeZygStd
    @show timeZygMed
    
    @printf "Zygote Memory Allocation Results\n"
    
    @show sizeZygMean
    @show sizeZygVar
    @show sizeZygStd
    @show sizeZygMed
end

dataSci (generic function with 1 method)
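
`dataSci` obtains each standard deviation as the square root of the corresponding variance. The `Statistics` standard library also provides `std`, which returns the same quantity directly. A minimal sketch of a more compact summary, where `summarize` is a hypothetical helper and the sample is made up for illustration:

```julia
using Statistics

# Hypothetical helper: summary statistics for one sample, returned as a
# NamedTuple instead of being printed.
summarize(v) = (mean = mean(v), var = var(v), std = std(v), median = median(v))

sample = [1.0, 2.0, 3.0, 4.0]   # made-up data, not a benchmark result
stats = summarize(sample)

@show stats
@show stats.std ≈ sqrt(stats.var)   # std agrees with the square root of var
```

Returning the values rather than printing them makes it easier to compare the two packages programmatically later on.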

#### Function One ####

In [13]:
timesEnz1 = []
timesZyg1 = []
sizesEnz1 = []
sizesZyg1 = []

for i in 1:100
    
    f(x) = 5x^10 # Defined inside the loop on purpose: when we defined it outside the loop,
                 # the timed calls did not work, and we have not yet worked out why.
    
    valZyg, tZyg, sizeZyg = @timed gradient(x -> f(x), x)

    push!(timesZyg1, tZyg)
    push!(sizesZyg1, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed autodiff(f, Active(x))

    push!(timesEnz1, tEnz)
    push!(sizesEnz1, sizeEnz)

end

dataSci(timesEnz1, sizesEnz1, timesZyg1, sizesZyg1)

timeEnzMean = 0.09267851182999999
timeEnzVar = 0.002728936459141259
timeEnzStd = 0.05223922337804476
timeEnzMed = 0.083599264
sizeEnzMean = 5.5191422e6
sizeEnzVar = 4.909829171717173e7
sizeEnzStd = 7007.017319599811
sizeEnzMed = 5.518323e6
timeZygMean = 0.04129968049000002
timeZygVar = 0.00023591764811405283
timeZygStd = 0.01535961093628523
timeZygMed = 0.036597703999999995
sizeZygMean = 3.64246005e6
sizeZygVar = 1.7131734902499979e9
sizeZygStd = 41390.49999999997
sizeZygMed = 3.638321e6


3.638321e6

#### Function Two ####

In [8]:
timesEnz2 = []
timesZyg2 = []
sizesEnz2 = []
sizesZyg2 = []

for i in 1:100
    
    gf(x) = 3x^3 * (cos(x) - 10x) # Defined inside the loop on purpose: when we defined it outside the loop,
                                  # the timed calls did not work, and we have not yet worked out why.
    
    valZyg, tZyg, sizeZyg = @timed gradient(x -> gf(x), x)

    push!(timesZyg2, tZyg)
    push!(sizesZyg2, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed autodiff(gf, Active(x))

    push!(timesEnz2, tEnz)
    push!(sizesEnz2, sizeEnz)

end

dataSci(timesEnz2, sizesEnz2, timesZyg2, sizesZyg2)

timeEnzMean = 0.20542909254999991
timeEnzVar = 0.008447287654177004
timeEnzStd = 0.09190912715381973
timeEnzMed = 0.1756100965
sizeEnzMean = 6.91564444e6
sizeEnzVar = 540688.0064646468
sizeEnzStd = 735.3149029257103
sizeEnzMed = 6.915419e6
timeZygMean = 0.08563981077
timeZygVar = 0.00987911842820892
timeZygStd = 0.09939375447284865
timeZygMed = 0.0536120865
sizeZygMean = 6.02282301e6
sizeZygVar = 1.71847542667667e9
sizeZygStd = 41454.4982683022
sizeZygMed = 6.018635e6


6.018635e6

#### Function Three ####

In [14]:
timesEnz3 = []
timesZyg3 = []
sizesEnz3 = []
sizesZyg3 = []

for i in 1:100
    
    hf(x) = exp((5x / 2) * (sin(x)^(exp(x)))) # Defined inside the loop on purpose: when we defined it outside the loop,
                                              # the timed calls did not work, and we have not yet worked out why.
    
    valZyg, tZyg, sizeZyg = @timed gradient(x -> hf(x), x)

    push!(timesZyg3, tZyg)
    push!(sizesZyg3, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed autodiff(hf, Active(x))

    push!(timesEnz3, tEnz)
    push!(sizesEnz3, sizeEnz)

end

dataSci(timesEnz3, sizesEnz3, timesZyg3, sizesZyg3)

timeEnzMean = 0.22160837142
timeEnzVar = 0.007148965185897377
timeEnzStd = 0.08455155342095955
timeEnzMed = 0.1992649405
sizeEnzMean = 8.28255332e6
sizeEnzVar = 4.9230193674343474e7
sizeEnzStd = 7016.423139630582
sizeEnzMed = 8.281633e6
timeZygMean = 0.09240678975000005
timeZygVar = 0.004842086234737579
timeZygStd = 0.06958510066628903
timeZygMed = 0.0739500035
sizeZygMean = 6.94505553e6
sizeZygVar = 1.7102608380899878e9
sizeZygStd = 41355.29999999985
sizeZygMed = 6.94092e6


6.94092e6

#### Implications? ####

As the outputs above show, Zygote is on average roughly two to three times as fast as Enzyme (2.2 to 2.4 times by mean, 2.3 to 3.3 times by median) and allocates less memory: Enzyme allocated about 15% to 50% more, depending on the function. These tests are far from exhaustive, and considerably more could be done to establish which package is more optimal, but in our case Zygote clearly wins. This is consistent with our hypothesis that Zygote, being designed specifically for Julia, would be better optimized.
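
The comparison can be read directly off the recorded means. The short calculation below copies the mean times (seconds) and mean allocations (bytes) from the three `dataSci` outputs above and computes how much slower and how much more memory-hungry Enzyme was in each case; the variable names are chosen here for illustration.

```julia
# Means copied from the three dataSci outputs above, in function order 1, 2, 3.
time_enz = [0.09267851182999999, 0.20542909254999991, 0.22160837142]
time_zyg = [0.04129968049000002, 0.08563981077, 0.09240678975000005]
size_enz = [5.5191422e6, 6.91564444e6, 8.28255332e6]
size_zyg = [3.64246005e6, 6.02282301e6, 6.94505553e6]

slowdown  = time_enz ./ time_zyg                 # Enzyme mean time / Zygote mean time
extra_mem = 100 .* (size_enz ./ size_zyg .- 1)   # percent more bytes allocated by Enzyme

@show round.(slowdown; digits=2)
@show round.(extra_mem; digits=1)
```

By mean time, Enzyme comes out a little over twice as slow on all three functions, while the extra memory ranges from roughly 15% on the second function to roughly 50% on the first.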

#### Practical Application ####

Theory is great and all, but what's a simple application for differentiation? There's a wide range of applications, from gradient descent to modeling physical processes, but for our project we decided that some simple rootfinding algorithms would be useful. Below are implementations of these algorithms using Enzyme and Zygote, which produce a more noticeable time and memory difference between the two packages.

# Newton's with Enzyme #

In [9]:
# testing Newton's with Enzyme

function newtonE(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Enzyme.autodiff(f, Active(x)))
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - fx / fpx
    end  
end

f(x) = cos(x) - x
newtonE(f, 1; tol=1e-15, verbose=true)

(0.7390851332151607, 0.0, 5)

# Newton's with Zygote #

In [8]:
# testing Newton's with Zygote

function newtonZ(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Zygote.gradient(f, x))
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - fx / fpx
    end  
end

f(x) = cos(x) - x
newtonZ(f, 1; tol=1e-15, verbose=true)

(0.7390851332151607, 0.0, 5)

# Testing Newton's #

In [43]:
timesEnz = []
timesZyg = []
sizesEnz = []
sizesZyg = []

for i in 1:2
    
    func1() = newtonZ(f, 1; tol=1e-15, verbose=true)
    func2() = newtonE(f, 1; tol=1e-15, verbose=true)
    
    valZyg, tZyg, sizeZyg = @timed func1()

    push!(timesZyg, tZyg)
    push!(sizesZyg, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed func2()

    push!(timesEnz, tEnz)
    push!(sizesEnz, sizeEnz)

end

dataSci(timesEnz, sizesEnz, timesZyg, sizesZyg)

Enzyme Timing Results
timeEnzMean = 0.0174871175
timeEnzVar = 0.0006112476058558004
timeEnzStd = 0.024723422211655903
timeEnzMed = 0.0174871175
Enzyme Memory Allocation Results
sizeEnzMean = 3.534181e6
sizeEnzVar = 2.4976573302258e13
sizeEnzStd = 4.997656781158346e6
sizeEnzMed = 3.534181e6
Zygote Timing Results
timeZygMean = 0.028452244
timeZygVar = 0.0016182894132599221
timeZygStd = 0.040227968047863445
timeZygMed = 0.028452244
Zygote Memory Allocation Results
sizeZygMean = 6.538421e6
sizeZygVar = 8.5496040021618e13
sizeZygStd = 9.246406870867083e6
sizeZygMed = 6.538421e6


6.538421e6

# Halley's with Enzyme #

In [18]:
# halley's method with Enzyme

function halleyE(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Enzyme.autodiff(f, Active(x)))
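        # NOTE: this differentiates f at the point fpx, i.e. it evaluates f'(f'(x)),
        # which is not the second derivative f''(x) that Halley's update calls for.
        fppx = first(Enzyme.autodiff(f, Active(fpx)))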
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - (2 * fx * fpx) / (2 * fpx^2 - fx * fppx)
    end  
end

f(x) = cos(x) - x
halleyE(f, 1; tol=1e-15, verbose=true)

(0.7390851332151607, 0.0, 5)

# Halley's with Zygote #

In [19]:
# halley's method with Zygote

function halleyZ(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Zygote.gradient(f, x))
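        # NOTE: this differentiates f at the point fpx, i.e. it evaluates f'(f'(x)),
        # which is not the second derivative f''(x) that Halley's update calls for.
        fppx = first(Zygote.gradient(f, fpx))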
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - (2 * fx * fpx) / (2 * fpx^2 - fx * fppx)
    end  
end

f(x) = cos(x) - x
halleyZ(f, 1; tol=1e-15, verbose=true)

(0.7390851332151607, 0.0, 5)
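
Halley's update needs the second derivative $f^{\prime \prime}(x_n)$. As a reference point that does not depend on either AD package, the sketch below runs the same update with hand-written derivatives of $f(x) = \cos(x) - x$. Everything suffixed `_ref` is a hypothetical name introduced here, and the hand-written derivatives are an assumption standing in for the AD calls.

```julia
# Hand-written derivatives of f(x) = cos(x) - x stand in for the AD calls
# (an assumption for illustration; the notebook obtains derivatives via AD).
f_ref(x)   = cos(x) - x
fp_ref(x)  = -sin(x) - 1    # f'(x)
fpp_ref(x) = -cos(x)        # f''(x)

function halley_ref(f, fp, fpp, x0; tol=1e-12, maxiter=100)
    x = float(x0)
    for k in 1:maxiter
        fx = f(x)
        abs(fx) < tol && return x, fx, k   # converged: stop before computing derivatives
        fpx, fppx = fp(x), fpp(x)
        x -= (2 * fx * fpx) / (2 * fpx^2 - fx * fppx)
    end
    error("halley_ref did not converge in $maxiter iterations")
end

root, residual, iters = halley_ref(f_ref, fp_ref, fpp_ref, 1)
@show root residual iters
```

Comparing the iteration count from this reference against the AD-based versions is a quick way to confirm that the second derivative being supplied is the intended one.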

# Testing Halley's #

In [20]:
timesEnz = []
timesZyg = []
sizesEnz = []
sizesZyg = []

for i in 1:2
    
    func1() = halleyZ(f, 1; tol=1e-15, verbose=true)
    func2() = halleyE(f, 1; tol=1e-15, verbose=true)
    
    valZyg, tZyg, sizeZyg = @timed func1()

    push!(timesZyg, tZyg)
    push!(sizesZyg, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed func2()

    push!(timesEnz, tEnz)
    push!(sizesEnz, sizeEnz)

end

dataSci(timesEnz, sizesEnz, timesZyg, sizesZyg)

timeEnzMean = 0.1737424335
timeEnzVar = 0.060363269941324
timeEnzStd = 0.24568937694032275
timeEnzMed = 0.1737424335
sizeEnzMean = 7.32284e6
sizeEnzVar = 1.07236723743872e14
sizeEnzStd = 1.0355516585080244e7
sizeEnzMed = 7.32284e6
timeZygMean = 0.033445165000000006
timeZygVar = 0.0022317276541686084
timeZygStd = 0.04724116482654305
timeZygMed = 0.033445165000000006
sizeZygMean = 6.775958e6
sizeZygVar = 9.1821142477512e13
sizeZygStd = 9.582334917832501e6
sizeZygMed = 6.775958e6


6.775958e6

# Golbabai-Javidi's with Enzyme #

In [21]:
# Golbabai-Javidi's method with Enzyme

function GJ_E(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Enzyme.autodiff(f, Active(x)))
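        # NOTE: these calls evaluate f' at the points fpx and fppx, i.e. f'(f'(x)) and
        # f'(f'(f'(x))), rather than the higher derivatives f''(x) and f'''(x).
        fppx = first(Enzyme.autodiff(f, Active(fpx)))
        fpppx = first(Enzyme.autodiff(f, Active(fppx)))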
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - (fx / fpx) - ((fx * fppx) / (2 * (fpppx - (fx * fpx * fppx))))
    end  
end

f(x) = cos(x) - x
GJ_E(f, 1; tol=1e-15, verbose=true)

(0.739085133215161, -6.661338147750939e-16, 8)

# Golbabai-Javidi's with Zygote #

In [22]:
# Golbabai-Javidi's method with Zygote

function GJ_Z(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Zygote.gradient(f, x))
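        # NOTE: these calls evaluate f' at the points fpx and fppx, i.e. f'(f'(x)) and
        # f'(f'(f'(x))), rather than the higher derivatives f''(x) and f'''(x).
        fppx = first(Zygote.gradient(f, fpx))
        fpppx = first(Zygote.gradient(f, fppx))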
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - (fx / fpx) - ((fx * fppx) / (2 * (fpppx - (fx * fpx * fppx))))
    end  
end

f(x) = cos(x) - x
GJ_Z(f, 1; tol=1e-15, verbose=true)

(0.739085133215161, -6.661338147750939e-16, 8)

# Testing Golbabai-Javidi's #

In [23]:
timesEnz = []
timesZyg = []
sizesEnz = []
sizesZyg = []

for i in 1:2
    
    func1() = GJ_Z(f, 1; tol=1e-15, verbose=true)
    func2() = GJ_E(f, 1; tol=1e-15, verbose=true)
    
    valZyg, tZyg, sizeZyg = @timed func1()

    push!(timesZyg, tZyg)
    push!(sizesZyg, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed func2()

    push!(timesEnz, tEnz)
    push!(sizesEnz, sizeEnz)

end

dataSci(timesEnz, sizesEnz, timesZyg, sizesZyg)

timeEnzMean = 0.1092619915
timeEnzVar = 0.02387052216230604
timeEnzStd = 0.15450088078165133
timeEnzMed = 0.1092619915
sizeEnzMean = 7.4284765e6
sizeEnzVar = 1.103464609065845e14
sizeEnzStd = 1.0504592372223899e7
sizeEnzMed = 7.4284765e6
timeZygMean = 0.0327535895
timeZygVar = 0.0021443713577055244
timeZygStd = 0.046307357489987745
timeZygMed = 0.0327535895
sizeZygMean = 6.805598e6
sizeZygVar = 9.2626230559752e13
sizeZygStd = 9.624252207821239e6
sizeZygMed = 6.805598e6


6.805598e6

# Noor's with Enzyme #

In [24]:
# Noor's method with Enzyme

function noorE(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Enzyme.autodiff(f, Active(x)))
        y = x - fx/fpx
        fpy = first(Enzyme.autodiff(f, Active(y)))
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - (fx / fpx) + (fx / fpx) * (fpy / fpx)
    end  
end

f(x) = x * x
noorE(f, 1; tol=1e-15, verbose=true)

(2.391867219711847e-8, 5.72102879673208e-16, 62)

# Noor's with Zygote #

In [25]:
# Noor's method with Zygote

function noorZ(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:100 # max number of iterations
        fx = f(x)
        fpx = first(Zygote.gradient(f, x))
        y = x - fx/fpx
        fpy = first(Zygote.gradient(f, y))
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = x - (fx / fpx) + (fx / fpx) * (fpy / fpx)
    end  
end

f(x) = x * x
noorZ(f, 1; tol=1e-15, verbose=true)

(2.391867219711847e-8, 5.72102879673208e-16, 62)

# Testing Noor's #

In [26]:
timesEnz = []
timesZyg = []
sizesEnz = []
sizesZyg = []

for i in 1:2
    
    func1() = noorZ(f, 1; tol=1e-15, verbose=true)
    func2() = noorE(f, 1; tol=1e-15, verbose=true)
    
    valZyg, tZyg, sizeZyg = @timed func1()

    push!(timesZyg, tZyg)
    push!(sizesZyg, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed func2()

    push!(timesEnz, tEnz)
    push!(sizesEnz, sizeEnz)

end

dataSci(timesEnz, sizesEnz, timesZyg, sizesZyg)

timeEnzMean = 0.1190844425
timeEnzVar = 0.028321606111369513
timeEnzStd = 0.16829024366067544
timeEnzMed = 0.1190844425
sizeEnzMean = 5.5767465e6
sizeEnzVar = 6.21509589759645e13
sizeEnzStd = 7.88358795067097e6
sizeEnzMed = 5.5767465e6
timeZygMean = 0.030482049
timeZygVar = 0.0018577122470677622
timeZygStd = 0.043101186144557116
timeZygMed = 0.030482049
sizeZygMean = 5.244526e6
sizeZygVar = 5.5005406934408e13
sizeZygStd = 7.416563013580347e6
sizeZygMed = 5.244526e6


5.244526e6

# Zhanlav's with Enzyme #

In [27]:
# Zhanlav's method with Enzyme

function zhanlavE(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:1000 # max number of iterations
        
        fx = f(x)
        fpx = first(Enzyme.autodiff(f, Active(x)))
        
        z = x - fx/fpx
        fz = f(z)
        fpz = first(Enzyme.autodiff(f, Active(z)))

        q = z - fz/fpz
        fq = f(q)
        
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = z - (fz + fq)/fpx
    end  
end

f(x) = cos(x) - x
zhanlavE(f, 1; tol=1e-15, verbose=true)

(0.7390851332151607, 0.0, 4)

# Zhanlav's with Zygote #

In [28]:
# Zhanlav's method with Zygote

function zhanlavZ(f, x0; tol=1e-8, verbose=false)
    x = x0
    for k in 1:1000 # max number of iterations
        
        fx = f(x)
        fpx = first(Zygote.gradient(f, x))
        
        z = x - fx/fpx
        fz = f(z)
        fpz = first(Zygote.gradient(f, z))
        
        q = z - fz/fpz
        fq = f(q)
        
        if abs(fx) < tol
            return x, fx, k
        end
        x = z - (fz + fq)/fpx
    end  
end

f(x) = cos(x) - x
zhanlavZ(f, 1; tol=1e-15, verbose=true)

(0.7390851332151607, 0.0, 4)

# Testing Zhanlav's #

In [29]:
timesEnz = []
timesZyg = []
sizesEnz = []
sizesZyg = []

for i in 1:2
    
    func1() = zhanlavZ(f, 1; tol=1e-15, verbose=true)
    func2() = zhanlavE(f, 1; tol=1e-15, verbose=true)
    
    valZyg, tZyg, sizeZyg = @timed func1()

    push!(timesZyg, tZyg)
    push!(sizesZyg, sizeZyg)

    valEnz, tEnz, sizeEnz = @timed func2()

    push!(timesEnz, tEnz)
    push!(sizesEnz, sizeEnz)

end

dataSci(timesEnz, sizesEnz, timesZyg, sizesZyg)

timeEnzMean = 0.1159427135
timeEnzVar = 0.026880790437600482
timeEnzStd = 0.1639536228254822
timeEnzMed = 0.1159427135
sizeEnzMean = 7.3470685e6
sizeEnzVar = 1.079484866627445e14
sizeEnzStd = 1.0389826113210196e7
sizeEnzMed = 7.3470685e6
timeZygMean = 0.037009609
timeZygVar = 0.0027380449921848315
timeZygStd = 0.05232633172872747
timeZygMed = 0.037009609
sizeZygMean = 6.786782e6
sizeZygVar = 9.2114738974728e13
sizeZygStd = 9.597642365431627e6
sizeZygMed = 6.786782e6


6.786782e6

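# Rootfinding Results #

For four of the five methods (Halley's, Golbabai-Javidi's, Noor's, and Zhanlav's), Zygote is quicker, by a factor of roughly three to five in mean time, and allocates slightly less memory (Enzyme allocated about 6% to 9% more). Newton's method is the exception: there Enzyme was faster (a mean of about 0.017 s versus 0.028 s) and allocated roughly half as much memory. Each of these tests consists of only two runs, so the figures are rough, but apart from Newton's method the root-finding results support our findings in the initial tests and our aforementioned conclusions.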
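
A note on reading the two-run statistics above: in every root-finding output, the standard deviation is about $1.41$ times the mean. For two samples $a$ and $b$ this ratio has a direct interpretation, which follows from the definitions of the sample mean $m$ and the sample standard deviation $s$ (with the $n - 1$ divisor that Julia's `var` uses by default, which equals $1$ here):

$m = \frac{a + b}{2}, \qquad s = \sqrt{(a - m)^2 + (b - m)^2} = \frac{|a - b|}{\sqrt{2}}, \qquad \frac{s}{m} = \sqrt{2}\,\frac{|a - b|}{a + b}$

The ratio approaches $\sqrt{2} \approx 1.414$ only when one of the two samples is negligible next to the other. The recorded statistics therefore suggest that nearly all of the measured time and memory in each test came from a single run, most plausibly the first, which includes compilation.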