# OBJECTIVE: Submit a performance analysis of a self-implemented physics textbook function or constant using Julia benchmarking tools.

# KR1: Implemented (customized) at least one math/physics textbook function, or constant (prefer those that involve a sum or a loop) in Julia. Discuss its importance in Physics. See Resources in the webpage.

In [1]:
using Pkg
Pkg.add("DifferentialEquations")
Pkg.add("PyPlot")
Pkg.add("Plots")

using Plots
using DifferentialEquations


    Updating registry at `C:\Users\Janelle\.julia\registries\General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
  No Changes to `C:\Users\Janelle\Project.toml`
  No Changes to `C:\Users\Janelle\Manifest.toml`
Precompiling project...
  ✓ StrideArraysCore
  ✓ ForwardDiff
  ✓ SciMLBase
  ✓ Graphs
  ✓ Polyester
  ✓ PreallocationTools
  ✓ NLSolversBase
  ✓ VertexSafeGraphs
  ✓ Distributions
  ✓ FastBroadcast
  ✓ SimpleNonlinearSolve
  ✓ SparseDiffTools
  ✓ LineSearches
  ✓ VectorizationBase
  ✓ NLsolve
  ✓ SL

LoadError: syntax: use "x^y" instead of "x**y" for exponentiation, and "x..." instead of "**x" for splatting.

In [85]:
Pkg.add("SpecialFunctions")


   Resolving package versions...
    Updating `C:\Users\Janelle\Project.toml`
  [276daf66] + SpecialFunctions v2.1.7
  No Changes to `C:\Users\Janelle\Manifest.toml`


The math/physics textbook function that I chose to implement is the error function. The error function, also known as erf, is a special function that is often used in probability, statistics, and partial differential equations. The error function is given by

$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \sum \limits_{n=0}^{\infty} (-1)^{n} \frac{x^{2n+1}}{n!\,(2n+1)}$.

The error function is central to the theory of normally distributed random variables: the probability that such a variable falls within a given interval is expressed through erf, so the function and its approximations are used to estimate how likely or unlikely a result is. The error function also appears in solutions of the heat equation when the boundary conditions are specified by the Heaviside step function.
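
Two standard textbook relations make these roles concrete. They are quoted here as illustrations and are not part of the original notebook; the symbols $\mu$, $\sigma$, $u_0$, and $D$ (mean, standard deviation, step height, and diffusivity) are introduced only for this purpose. For a normal random variable $X$ with mean $\mu$ and standard deviation $\sigma$, the probability of landing within a distance $a \ge 0$ of the mean is

$$P(|X-\mu| \le a) = \operatorname{erf}\!\left(\frac{a}{\sigma\sqrt{2}}\right),$$

and for the heat equation $\partial_t u = D\,\partial_x^2 u$ on the real line with a step initial profile $u(x,0) = u_0 H(x)$, the solution for $t > 0$ is

$$u(x,t) = \frac{u_0}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{2\sqrt{Dt}}\right)\right].$$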



In [1]:
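function erf1(x)
    K = 2.0/sqrt(pi)       # prefactor 2/sqrt(pi)
    start = 0.0            # running partial sum of the series
    plusminus = 1          # sign factor of the current term
    power = x              # x^(2(n-1)+1), starting at x^1
    factorial = 1.0        # (n-1)!, starting at 0! = 1
    for n = 1:100
        start = start + power*plusminus/(factorial*(2.0*(n-1)+1.0))
        power = power*x*x
        factorial = factorial*n
        # This update evaluates to -(plusminus^2), so the sign is -1 for every
        # term after the first. A strictly alternating sign would need
        # plusminus = -plusminus.
        plusminus *= plusminus*(-1)
    end
    return K*start
end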

using SpecialFunctions
x=0.1
y1=erf1(x)
y=erf(x)
d= ((y-y1)/y)*100
print("Given x=$x
    Using implemented error function:$y1 
    using existing special function:$y
    Deviation:$d % ")

Given x=0.1
    Using implemented error function:0.11246065924950271 
    using existing special function:0.1124629160182849
    Deviation:0.0020066781674200227 % 

# KR2: Compare the performance (accuracy) of the implemented function in comparison with the existing special functions within Julia

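For x = 0.1, the implemented error function agrees with the existing `erf` from SpecialFunctions to about five significant figures, a percent deviation of about 0.0020 %. This residual is not rounding error. The update `plusminus*=plusminus*(-1)` leaves the sign at -1 for every term after the first, so the code sums $x - x^3/3 - x^5/10 - \dots$ instead of the alternating series. The resulting difference is dominated by twice the $x^5/10$ term, which is small at x = 0.1 but grows quickly with |x|.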
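
As a minimal sketch of a strictly alternating version, the series can be built by updating one signed term per iteration. The name `erf_series` and the keyword `nterms` are illustrative choices, not part of the original notebook, and the reference values in the comments are the standard tabulated values of erf rather than new measurements.

```julia
# Maclaurin series for erf(x), truncated after `nterms` terms.
# `erf_series` and `nterms` are hypothetical names used for illustration.
function erf_series(x::Real; nterms::Integer = 100)
    term = float(x)            # n = 0 term: (-1)^0 * x^1 / 0!
    total = term               # divided by (2*0 + 1) = 1
    for n in 1:nterms-1
        term *= -x^2 / n       # now (-1)^n * x^(2n+1) / n!
        total += term / (2n + 1)
    end
    return 2 / sqrt(pi) * total
end

y_small = erf_series(0.1)      # compare with erf(0.1) from SpecialFunctions
y_one   = erf_series(1.0)      # compare with erf(1.0)
```

Updating the signed term by the ratio `-x^2 / n` avoids keeping separate power, factorial, and sign variables, so no single factor can overflow before the division.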

# KR3: Successful loading of the BenchmarkTools module. May need to add it first via the Pkg or REPL package mode


In [2]:
using BenchmarkTools

In [5]:
using SpecialFunctions
x=0.1
@btime y1=erf1(x)
@btime y=erf(x)

  530.688 ns (1 allocation: 16 bytes)
  40.369 ns (1 allocation: 16 bytes)


0.1124629160182849
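
Both timings above were taken with `x` as a non-constant global, which likely accounts for the `1 allocation: 16 bytes` on each line. BenchmarkTools supports interpolating such values with `$`, so that the benchmark measures the function call rather than the global lookup. The sketch below assumes BenchmarkTools is installed; `erf_series` is a hypothetical stand-in for the notebook's series function, and no timings are quoted because they depend on the machine.

```julia
using BenchmarkTools

# Illustrative stand-in for the notebook's series function (hypothetical name).
function erf_series(x; nterms = 100)
    term = float(x)
    total = term
    for n in 1:nterms-1
        term *= -x^2 / n
        total += term / (2n + 1)
    end
    return 2 / sqrt(pi) * total
end

x = 0.1
t_global = @belapsed erf_series(x)  seconds=1   # x is read as an untyped global
t_interp = @belapsed erf_series($x) seconds=1   # the value of x is interpolated

println("global: ", t_global * 1e9, " ns, interpolated: ", t_interp * 1e9, " ns")
```

`@belapsed` returns the minimum time in seconds, the same statistic that `@btime` prints.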

In [9]:
x=0.2
@benchmark y1=erf1(x)

BenchmarkTools.Trial: 10000 samples with 287 evaluations.
 Range (min … max):  269.686 ns …   5.623 μs  ┊ GC (min … max): 0.00% … 93.60%
 Time  (median):     319.164 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   419.320 ns ± 188.852 ns  ┊ GC (mean ± σ):  0.13% ±  0.94%

 [histogram of sample times; truncated in the source]

# KR4: Itemized differences between @time, @btime, @benchmark and other @time-like macros. Nice if the situations when they are best applied are mentioned

@time macro: <br>
-part of Base; evaluates the provided expression once and returns its value <br>
-prints the elapsed time, which includes compilation time the first time a function is called <br>
-prints the number of allocations and the amount of memory allocated while running the code <br>

@btime macro: <br>
-provided by BenchmarkTools; returns the value of the expression being evaluated <br>
-executes the code many times and reports the minimum time <br>
-prints the number of allocations and the amount of memory allocated while running the code <br>

@benchmark: <br>
-provided by BenchmarkTools; accepts any expression, not only function calls <br>
-evaluates the expression many times and returns a Trial object that summarizes the minimum, median, mean, and maximum times, the standard deviation, and the share of time spent in garbage collection <br>
-also reports a memory estimate and a histogram of the sample times <br>

For quick sanity checks, the @btime macro is usually used, since it is essentially a convenient version of the @benchmark macro whose output is similar to that of Julia's built-in @time macro. @time is best suited to one-off measurements of long-running code, while @benchmark is the better choice when the spread of the timings matters.
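
Among the other @time-like macros, Base also provides `@elapsed`, `@allocated`, and `@timed`, which return their measurements instead of printing them. The following sketch uses only Base; `work` and `min_time` are hypothetical helpers written for this illustration, and `min_time` only imitates the "keep the minimum over many runs" idea behind @btime rather than reproducing BenchmarkTools.

```julia
# Base-only sketch; `work` and `min_time` are hypothetical helper names.
work(n) = sum(sqrt(i) for i in 1:n)

work(10)                        # warm-up call so compilation is not timed below

t     = @elapsed work(10^6)     # elapsed seconds as a Float64, nothing printed
b     = @allocated work(10^6)   # bytes allocated while evaluating the expression
stats = @timed work(10^6)       # named tuple with value, time, bytes, gctime

# Repeat a measurement and keep the smallest elapsed time.
function min_time(f, arg; samples = 50)
    best = Inf
    for _ in 1:samples
        t0 = time_ns()
        f(arg)
        best = min(best, (time_ns() - t0) / 1e9)
    end
    return best
end

tmin = min_time(work, 10^6)
println("single run: ", t, " s, minimum of 50 runs: ", tmin, " s")
```

The minimum is less sensitive to background activity on the machine than a single run, which is why it is the usual figure to quote for small functions.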


#  KR5-6: Identified demonstrated useful features within the Profiler module of Julia. Features must be explained why useful for your case. A discussion of the performance of the implemented function above.


In [10]:
using Profile

x=0.2
@profile y1=erf1(x)

0.2226303675942337

In [11]:
Profile.print()

Overhead ╎ [+additional indent] Count File:Line; Function
 ╎7 @Base\task.jl:484; (::IJulia.var"#15#18")()
 ╎ 7 @IJulia\src\eventloop.jl:8; eventloop(socket::ZMQ.Socket)
 ╎  7 @Base\essentials.jl:726; invokelatest
 ╎   7 @Base\essentials.jl:729; #invokelatest#2
 ╎    7 ...rc\execute_request.jl:67; execute_request(socket::ZMQ.Socke...
 ╎     7 ...c\SoftGlobalScope.jl:65; softscope_include_string(m::Modu...
 ╎    ╎ 7 @Base\loading.jl:1428; include_string(mapexpr::typeof...
6╎    ╎  7 @Base\boot.jl:368; eval
 ╎    ╎   1 ...mpiler\typeinfer.jl:996; typeinf_ext_toplevel(mi::Core....
 ╎    ╎    1 ...piler\typeinfer.jl:1000; typeinf_ext_toplevel(interp:...
 ╎    ╎     1 ...piler\typeinfer.jl:965; typeinf_ext(interp::Core.Com...
 ╎    ╎    ╎ 1 ...inferencestate.jl:284; Core.Compiler.InferenceState...
 ╎    ╎    ╎  1 ...iler\utilities.jl:131; retrieve_code_info
Total snapshots: 7. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task


Each line of this display represents a particular spot (line number) in the code. Indentation is used to indicate the nested sequence of function calls, with more-indented lines being deeper in the sequence of calls. In each line, the first "field" is the number of backtraces (samples) taken at this line or in any functions executed by this line. The second field is the file name and line number and the third field is the function name.
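
In the output above, all 7 snapshots fall in `eval` and type-inference frames and none lands in `erf1`, which is expected when a single call takes well under a microsecond. A common workaround is to profile many calls at once so that the sampler has something to catch. The sketch below assumes this approach; `erf_series` and `sum_over` are hypothetical names, and the printed counts will differ from machine to machine.

```julia
using Profile

# Illustrative stand-in for the notebook's series function (hypothetical name).
function erf_series(x; nterms = 100)
    term = float(x)
    total = term
    for n in 1:nterms-1
        term *= -x^2 / n
        total += term / (2n + 1)
    end
    return 2 / sqrt(pi) * total
end

# Hypothetical driver: evaluate f over many inputs so the run is long enough to sample.
function sum_over(f, xs)
    acc = 0.0
    for x in xs
        acc += f(x)
    end
    return acc
end

xs = range(0.0, 1.0; length = 10^6)
sum_over(erf_series, xs)                 # warm-up so compilation is not profiled
Profile.clear()                          # discard samples from earlier runs
total = @profile sum_over(erf_series, xs)
Profile.print(format = :flat, sortedby = :count, mincount = 5)
```

The flat format with `sortedby = :count` lists the most frequently sampled lines last, which makes it easier to see which statement inside the loop dominates.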

In [13]:
using ProfileView
ProfileView.view()

From the session slides: <br>
The horizontal axis represents the amount of time (more precisely, the number of backtraces) spent at each line. The row at which the single long bar breaks up into multiple different-colored bars corresponds to the execution of different lines from profile_test, the example function profiled in the slides. The fact that they are all positioned on top of the lower peach-colored bar means that all of these lines are called by the same "parent" function. Within a block of code, the bars are sorted in order of increasing line number, which makes them easier to compare with the source code.

It is also worth noting that red is (by default) a special color: it is reserved for function calls that have to be resolved at run-time. Because run-time dispatch (aka, dynamic dispatch, run-time method lookup, or a virtual call) often has a significant impact on performance, ProfileView highlights the problematic call in red. It's worth noting that some red is unavoidable; for example, the REPL can't predict in advance the return types from what users type at the prompt, and so the bottom eval call is red. Red bars are problematic only when they account for a sizable fraction of the top of a call stack, as only in such cases are they likely to be the source of a significant performance bottleneck.
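
Run-time dispatch of the kind highlighted in red often originates from a function whose return type cannot be inferred. One way to check for this without the GUI is `@inferred` from the Test standard library. The sketch below uses two hypothetical toy functions, `stable` and `unstable`, chosen only to show the contrast.

```julia
using Test

stable(x)   = x > 0 ? x : zero(x)   # always returns the type of x
unstable(x) = x > 0 ? x : 0         # Float64 or Int for a Float64 argument

v = @inferred stable(1.5)           # passes: the return type is inferred as Float64

# @inferred throws when the inferred return type is not the concrete type returned.
failed = try
    @inferred unstable(1.5)
    false
catch
    true
end

println("stable value: ", v, ", unstable rejected: ", failed)
```

The same check can be applied to the series function with a `Float64` argument before looking for red bars in the flame graph.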