# Session 2: Fast machine representation

In this session, we cover the use of fast `Numbers` in Julia.
Particularly:[^1]
- [ ] Analyze the floating point layout or architecture used by your Julia installation.
- [ ] Demonstrate tradeoff between runtime speed and over- or underflow checks in [number representations in Julia](https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/).
- [ ] Show how much the `@fastmath` macro speeds up computation while trading off some level of accuracy. The `sum_diff()` function in the main book reference may be replicated for this purpose.

----
[^1]: Covers Chapter 5 of Sengupta, _Julia High Performance, 2nd Ed._ (Packt Publishing, 2019).

In [1]:
using Pkg;
Pkg.activate(".");
Pkg.add([
     "Plots"
    ,"BenchmarkTools"
]);

using Plots, BenchmarkTools;

[32m[1m  Activating[22m[39m project at `~/Documents/GitHub/Phys215-202324-2/02-Performance`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `~/Documents/GitHub/Phys215-202324-2/02-Performance/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Documents/GitHub/Phys215-202324-2/02-Performance/Manifest.toml`


In [2]:
include("Phys215Tools.jl") #insert pre-typed tool functions, fast and dirty style

printfloatbits (generic function with 2 methods)

In [3]:
? floatbits

search: floatbits



No documentation found.

`floatbits` is a `Function`.

```
# 2 methods for generic function "floatbits" from Main:
 [1] floatbits(x::Float32)
     @ ~/Documents/GitHub/Phys215-202324-2/02-Performance/Phys215Tools.jl:7
 [2] floatbits(x::Float64)
     @ ~/Documents/GitHub/Phys215-202324-2/02-Performance/Phys215Tools.jl:2
```


## Fast numbers in Julia

> Integers in Julia are stored as system integers.... The `Int` type alias represents the actual integer type used by the system. `Int32` for 32-bit machines; `Int64` for 64-bit machines.[^2]

----
[^2]: Sengupta, _Julia High Performance, 2nd Ed._ (Packt Publishing, 2019).

## Machine `WORD_SIZE` and representation

- For Bash-like CLIs: use `uname -m` to examine the processor type of your machine.
    - The command `uname -a` provides `a`ll the relevant machine information.
- The default integer representation depends on the machine word size.

In [4]:
; uname -vpm

Darwin Kernel Version 23.3.0: Wed Dec 20 21:28:58 PST 2023; root:xnu-10002.81.5~7/RELEASE_X86_64 x86_64 i386


**Note** that the leading semicolon switches the Julia prompt to shell mode, so the rest of the line is passed to the system shell (bash here).
You may need to modify the command for non-bash CLIs.
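
For a CLI where `uname` is not available, the same information can be read from Julia itself. The following is a minimal sketch using constants from the standard `Sys` module; the printed values depend on the machine, so no output is shown.

```julia
# Julia-native alternatives to `uname`, usable from any shell or OS.
@show Sys.KERNEL      # operating system kernel name (a Symbol)
@show Sys.ARCH        # processor architecture (a Symbol)
@show Sys.MACHINE     # machine triplet (a String)
@show Sys.CPU_NAME    # CPU model name as known to Julia
@show Sys.WORD_SIZE   # word size in bits
```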

### System `WORD_SIZE`

- `Sys.WORD_SIZE` gives the size (_in bits_) of the default `Int` of the installed Julia.
- Check out `? sizeof` for the corresponding size in bytes.
- The `Sys.` prefix scopes the name to the `Sys` module (namespace).

In [5]:
@show Sys.WORD_SIZE;
println( "The current machine uses $(Sys.WORD_SIZE) bits for the default Int type." )

Sys.WORD_SIZE = 64
The current machine uses 64 bits for the default Int type.
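
The relation between the word size (in bits) and `sizeof()` (in bytes) can be checked directly. This is a small sketch; the variable name `bits_per_byte` is introduced here only for illustration.

```julia
bits_per_byte = 8
@show Sys.WORD_SIZE                       # bits in a machine word
@show sizeof(Int) * bits_per_byte         # bits in the default integer type
@show sizeof(Int) * bits_per_byte == Sys.WORD_SIZE
@show sizeof(Float64) * bits_per_byte     # Float64 always has 64 bits
```

The default floating-point type is `Float64` on both 32-bit and 64-bit machines; only the default `Int` follows the word size.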


### Use `sizeof()` for byte size

- One bit = 1 two-state unit in physical memory
- One byte = 8 bits, 2^8 states in physical memory

In [6]:
? sizeof

search: sizeof



```
sizeof(T::DataType)
sizeof(obj)
```

Size, in bytes, of the canonical binary representation of the given `DataType` `T`, if any. Or the size, in bytes, of object `obj` if it is not a `DataType`.

See also [`Base.summarysize`](@ref).

# Examples

```jldoctest
julia> sizeof(Float32)
4

julia> sizeof(ComplexF64)
16

julia> sizeof(1.0)
8

julia> sizeof(collect(1.0:10.0))
80

julia> struct StructWithPadding
           x::Int64
           flag::Bool
       end

julia> sizeof(StructWithPadding) # not the sum of `sizeof` of fields due to padding
16

julia> sizeof(Int64) + sizeof(Bool) # different from above
9
```

If `DataType` `T` does not have a specific size, an error is thrown.

```jldoctest
julia> sizeof(AbstractArray)
ERROR: Abstract type AbstractArray does not have a definite size.
Stacktrace:
[...]
```

---

```
sizeof(str::AbstractString)
```

Size, in bytes, of the string `str`. Equal to the number of code units in `str` multiplied by the size, in bytes, of one code unit in `str`.

# Examples

```jldoctest
julia> sizeof("")
0

julia> sizeof("∀")
3
```


### Check `sizeof()` for different `Int` types

- `Int` uses the machine default integer size
- Bigger integers may be used, up to 128 bits (16 bytes, 2^128 physical states total)

In [7]:
@show sizeof(Int); # uses machine's default integer representation
@show sizeof(Int32);
@show sizeof(Int64);
@show sizeof(Int128);

sizeof(Int) = 8
sizeof(Int32) = 4
sizeof(Int64) = 8
sizeof(Int128) = 16
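
The byte sizes above fix the range of each signed type: with N bits, one of which carries the sign, the range is [-2^(N-1), 2^(N-1) - 1]. A short sketch that tabulates this with `typemin()` and `typemax()`:

```julia
# Bit width and representable range of the signed integer types.
for T in (Int8, Int16, Int32, Int64, Int128)
    nbits = 8 * sizeof(T)
    println(rpad(string(T), 7), " bits = ", lpad(string(nbits), 3),
            "  range = [", typemin(T), ", ", typemax(T), "]")
end
```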


## Machine bit representation of `Int`s

- Similar to base-10 representation for whole numbers
- Applicable only for whole numbers
- Different scheme used for numbers with fractional part: floating-point representation

### Algorithm for finding bit representation

- [Divide by two method](https://en.wikipedia.org/wiki/Binary_number#Decimal_to_binary)
- 💡 Each division yields one bit; the last remainder becomes the most significant bit

### Sample code

In [8]:
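function sbit(n::Integer)
    bits = ""                    # accumulated bit string, initially empty
    while n != 0
        r = rem(n, 2)            # remainder: the next bit, least significant first
        n = div(n, 2)            # integer quotient carried to the next step
        bits = string(r) * bits  # prepend, so the last remainder ends up leftmost
        @show n, r
    end
    return bits                  # note: returns "" when n == 0
end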

sbit (generic function with 1 method)

In [9]:
@show sbit(25);

(n, r) = (12, 1)
(n, r) = (6, 0)
(n, r) = (3, 0)
(n, r) = (1, 1)
(n, r) = (0, 1)
sbit(25) = "11001"


### Checking with the `parse()` function

In [10]:
n = 25;
@show s = sbit(n);
@show parse(Int,s;base=2) == n;

(n, r) = (12, 1)
(n, r) = (6, 0)
(n, r) = (3, 0)
(n, r) = (1, 1)
(n, r) = (0, 1)
s = sbit(n) = "11001"
parse(Int, s; base = 2) == n = true
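
The hand-written conversion can also be cross-checked against two built-in routes, `string(n, base = 2)` and `digits(n, base = 2)`. This is a sketch; the names `s_string` and `s_digits` are introduced here for illustration.

```julia
n = 25
s_string = string(n, base = 2)                  # binary string without leading zeros
s_digits = join(reverse(digits(n, base = 2)))   # digits() lists the least significant bit first
@show s_string
@show s_digits
@show s_string == s_digits
@show parse(Int, s_string; base = 2) == n
```

Note that `digits()` returns the remainders in the order the divide-by-two method produces them, which is why the vector is reversed before joining.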


### Native function for `bitstring()`
- `bitstring()` function exists within Julia.

In [11]:
? bitstring

search: bitstring SubstitutionString



```
bitstring(n)
```

A string giving the literal bit representation of a primitive type.

See also [`count_ones`](@ref), [`count_zeros`](@ref), [`digits`](@ref).

# Examples

```jldoctest
julia> bitstring(Int32(4))
"00000000000000000000000000000100"

julia> bitstring(2.2)
"0100000000000001100110011001100110011001100110011001100110011010"
```


In [12]:
n08 = Int8(25);
n16 = Int16(25);
n32 = Int32(25);
n64 = Int64(25);
@show n08;
@show typeof(n64);
@show bitstring(n08);
@show bitstring(n16);
@show bitstring(n32);
@show bitstring(n64);

n08 = 25
typeof(n64) = Int64
bitstring(n08) = "00011001"
bitstring(n16) = "0000000000011001"
bitstring(n32) = "00000000000000000000000000011001"
bitstring(n64) = "0000000000000000000000000000000000000000000000000000000000011001"


In [101]:
@show typemax(Int8);
@show typemax(Int32)
@show typemax(Int64);
@show typemax(Int128);

127

## Review: Floating-point representation (IEEE 754 standards)

- Not all numbers are perfectly represented in machines
- Limitations of the binary representation result in under- and overflows
- Floating-point representation in base 2 is used for real numbers
- Machine representation is covered by [the IEEE Standard for Floating-Point Arithmetic (IEEE 754)](https://en.wikipedia.org/wiki/IEEE_754)
- An illustration is found in the [GeeksForGeeks page (:warning: with paid ads)](https://www.geeksforgeeks.org/ieee-standard-754-floating-point-numbers/).

## Mem size and allocation scheme

Simple `Int` type and `FloatX` type.

In [13]:
println( bitstring(3) );
@show length( bitstring(3) );

0000000000000000000000000000000000000000000000000000000000000011
length(bitstring(3)) = 64


In [14]:
println( bitstring(3.0) );
@show length(bitstring(3.0));

0100000000001000000000000000000000000000000000000000000000000000
length(bitstring(3.0)) = 64


In [15]:
println( bitstring(3) );
println( bitstring(3.0) );

0000000000000000000000000000000000000000000000000000000000000011
0100000000001000000000000000000000000000000000000000000000000000


- 💡 Same length; different information.
- ☝ Same value; different representation (data type).
- 📖 Standard binary representation for `Int` types but not for `Float` types.

### IEEE 754 floating-point representation standard

- 📖 Check out [the IEEE Standard for Floating-Point Arithmetic (IEEE 754)](https://en.wikipedia.org/wiki/IEEE_754) for the bit assignment for `Float64`
- `binary64 := [s:1][e:11][d:52]`
    - `s` sign bit, `e` exponent bits, `d` significand digits ($b_n$, $n=0,\ldots,51$)
    - effective significand precision: 53 bits (the leading `1.` is implicit and not stored)
    - representation:
      $$(-1)^s(1.b_{51}b_{50}\ldots b_{0})_2 \times 2^{e-1023}$$
      where $e$ is the unsigned integer encoded by the `e` bits.
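
The `binary64` layout can be verified by splitting a `Float64` into its three fields and rebuilding the value from the formula. The sketch below assumes only `reinterpret()` from Base; the helper names `decode_float64` and `rebuild_normal` are hypothetical and are not part of `Phys215Tools.jl`.

```julia
# Hypothetical helper: split a Float64 into its IEEE 754 fields.
function decode_float64(x::Float64)
    u = reinterpret(UInt64, x)           # raw 64-bit pattern
    s = Int(u >> 63)                     # 1 sign bit
    e = Int((u >> 52) & 0x7ff)           # 11 exponent bits, biased by 1023
    f = u & 0x000f_ffff_ffff_ffff        # 52 stored significand bits
    return s, e, f
end

# Rebuild a normal number: (-1)^s * (1 + f/2^52) * 2^(e - 1023)
rebuild_normal(s, e, f) = (-1.0)^s * (1 + f / 2.0^52) * 2.0^(e - 1023)

s, e, f = decode_float64(3.0)
@show s, e - 1023, f
@show rebuild_normal(s, e, f) == 3.0
```

The rebuild formula applies to normal numbers only ($0 < e < 2047$); zeros, subnormals, `Inf` and `NaN` use the special encodings.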

### IEEE 754 special cases

- **Subnormal numbers** that fill the underflow gap between zero and the smallest normal number, `floatmin(Float64)`
    - Activated when $e=0$ and the significand bits are not all zero.
    - representation:
          $$(-1)^s(0.b_{51}b_{50}\ldots b_{0})_2 \times 2^{1-1023}$$
    - Default values used
    - Significand allowed to be less than unity
- Other special numbers are found in [this wiki page](https://en.wikipedia.org/wiki/Double-precision_floating-point_format#Double-precision_examples).
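
The two ends of the subnormal range can be inspected with built-in functions. This is a minimal sketch; `ldexp(1.0, k)` is used to form the exact power $2^k$.

```julia
@show floatmin(Float64)                             # smallest positive normal number
@show floatmin(Float64) == ldexp(1.0, -1022)        # equals 2^(1-1023)
@show nextfloat(0.0)                                # smallest positive subnormal number
@show nextfloat(0.0) == ldexp(1.0, -1074)           # only bit b_0 set: 2^(-52) * 2^(-1022)
@show issubnormal(prevfloat(floatmin(Float64)))     # just below the normal range
@show issubnormal(floatmin(Float64))                # the boundary itself is normal
```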

#### Some special `Float`-type values (in Julia)

In [29]:
println( floatbits(NaN) );
println( floatbits(Inf) );
println( floatbits(-0.0) );
println( floatbits(0.0) );

0 | 11111111111 | 1000000000000000000000000000000000000000000000000000
0 | 11111111111 | 0000000000000000000000000000000000000000000000000000
1 | 00000000000 | 0000000000000000000000000000000000000000000000000000
0 | 00000000000 | 0000000000000000000000000000000000000000000000000000
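
These special bit patterns also behave differently in arithmetic and comparisons. A short sketch using only Base predicates:

```julia
@show isnan(NaN), NaN == NaN             # NaN is not equal to anything, itself included
@show isinf(Inf), Inf > floatmax(Float64)
@show signbit(-0.0), signbit(0.0)        # the sign bit is the only difference between the zeros
@show 1 / -0.0, 1 / 0.0                  # the sign of zero survives division
```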


## Julia `Numbers` and related functions

- Basic functions provided to allow analysis of `Numbers` representation
- [A range of primitive numeric representations available](https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/)
- Also [support for arithmetic that requires arbitrary precision](https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/#Arbitrary-Precision-Arithmetic).

### Check out different `Int` types

- Default is the machine word size, `Sys.WORD_SIZE`.
- Try out different `Int`s.

In [16]:
@show bitstring(Int8(126)); # [-2^7, 2^7-1] (one sign bit, leftmost)
@show bitstring(Int16(126));
@show bitstring(Int32(126));

bitstring(Int8(126)) = "01111110"
bitstring(Int16(126)) = "0000000001111110"
bitstring(Int32(126)) = "00000000000000000000000001111110"


- 💡 Different length; same information.
- ☝ Same value; different representation (data type), within _supertype_.

### Subnormal and dynamic range

- Gap between zero and the smallest normal number representable
- Filled with _subnormal_ numbers, which IEEE 754 defines by dropping the implicit leading 1 of the significand.

### Smallest normal number representable

- $1.0 \times 2^{(1-1023)} \approx 2.2250738585072014 \times 10^{-308}$

In [51]:
x = 2.5e-308;
@show x;
@show issubnormal( x );
println( floatbits(x) );

println("");
x = 0.01x;
@show x
@show issubnormal( x );
println( floatbits(x) );

x = 2.5e-308
issubnormal(x) = false
0 | 00000000001 | 0001111110100001100000101100010000001100011000001101

x = 2.5e-310
issubnormal(x) = true
0 | 00000000000 | 0000001011100000010101011100100110100011111101101100


### The `nextfloat()` to a subnormal Number

- Ask: `? nextfloat`

In [67]:
x = 2.0^(-1023);
xnext = nextfloat(x);
@show x;
@show xnext;
@show issubnormal(xnext);
@show xnext-x;
println( floatbits(x) )
println( floatbits(xnext) )
println( floatbits(xnext-x) )

x = 1.1125369292536007e-308
xnext = 1.112536929253601e-308
issubnormal(xnext) = true
xnext - x = 5.0e-324
0 | 00000000000 | 1000000000000000000000000000000000000000000000000000
0 | 00000000000 | 1000000000000000000000000000000000000000000000000001
0 | 00000000000 | 0000000000000000000000000000000000000000000000000001


### Try `nextfloat()` to a (normal) Number

- Ask: `? nextfloat`

In [88]:
x = 2.0^(4);
xnext = nextfloat(x);
@show x;
@show issubnormal(x);
@show xnext;
@show issubnormal(xnext);
@show xnext-x;
println( floatbits(x) )
println( floatbits(xnext) )
println( floatbits(xnext-x) )

x = 16.0
issubnormal(x) = false
xnext = 16.000000000000004
issubnormal(xnext) = false
xnext - x = 3.552713678800501e-15
0 | 10000000011 | 0000000000000000000000000000000000000000000000000000
0 | 10000000011 | 0000000000000000000000000000000000000000000000000001
0 | 01111001111 | 0000000000000000000000000000000000000000000000000000


### The machine `eps()`

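- Floating-point `Numbers` are discrete, not continuous
- `eps(x) = nextfloat(x) - x` for positive, finite `x`
- Spacing between `x` and the next larger representable Number (default: `x = 1.0`); an increment of `eps(x)/2` or less can be rounded away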

In [93]:
x₀ = 1.0
@show x₀;
@show eps(x₀);
ϵ = 1e-10*rand();
@show ϵ;
@show ϵ + x₀;
@show x₀ + eps(x₀)/2;

x₀ = 1.0
eps(x₀) = 2.220446049250313e-16
ϵ = 5.8018444091574694e-11
ϵ + x₀ = 1.0000000000580185
x₀ + eps(x₀) / 2 = 1.0
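
Since the spacing grows with the magnitude of `x`, it helps to look at `eps(x)` at several scales. The values of `x` below are arbitrary choices for illustration.

```julia
for x in (1.0, 16.0, 1.0e10, 1.0e16)
    println("x = ", x, "  eps(x) = ", eps(x), "  eps(x)/x = ", eps(x) / x)
end
@show eps(16.0) == 16 * eps(1.0)     # spacing scales exactly with powers of two
@show 1.0e16 + 1.0 == 1.0e16         # eps(1.0e16) is 2.0, so adding 1.0 is lost
```

The absolute spacing changes by many orders of magnitude, while the relative spacing `eps(x)/x` stays near `eps(1.0)`.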


### Truncation error

- `@assert (x + y) - x == y` is not always `true`
- Take care with equality conditions
- Even for zeros: `-0.0 == 0.0` is `true`, but `-0.0 === 0.0` is `false`, since the two zeros differ in the sign bit

In [116]:
ϵ_plus1 = ϵ + x₀;
@show ϵ_plus1 - x₀;
ϵPrime = ϵ_plus1 - x₀;
@show ϵPrime;
@show ϵPrime - ϵ;
println( floatbits(ϵ) );
println( floatbits(ϵPrime) );
println( floatbits( nextfloat(ϵPrime) ) );
@show -0.0 === 0.0
@show -0.0 == 0.0

ϵ_plus1 - x₀ = 5.801847891007128e-11
ϵPrime = 5.801847891007128e-11
ϵPrime - ϵ = 3.481849658637908e-17
0 | 01111011100 | 1111111001010101111010111110110110110010011000111101
0 | 01111011100 | 1111111001010110000000000000000000000000000000000000
0 | 01111011100 | 1111111001010110000000000000000000000000000000000001
-0.0 === 0.0 = false
-0.0 == 0.0 = true


true
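
One common way around exact equality is to compare with a tolerance tied to the spacing at the largest operand. The sketch below uses fixed values `x = 1.0` and `y = 1.0e-10` (chosen here for reproducibility, unlike the random `ϵ` above) together with `isapprox()`.

```julia
x, y = 1.0, 1.0e-10
z = (x + y) - x
@show z == y                            # false: the low-order bits of y were rounded away in x + y
@show abs(z - y) <= eps(x) / 2          # the error is bounded by half the spacing near x
@show isapprox(z, y; atol = eps(x))     # tolerance set by the largest operand
```

The default relative tolerance of `isapprox()` is measured against `z` and `y` themselves, so an absolute tolerance based on `eps(x)` is the safer choice when a small number has passed through a sum with a much larger one.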

## `Numbers` overflow

- !! Julia does not check for overflow by default: native `Int` arithmetic wraps around silently, and `Float` arithmetic saturates to `Inf`.
- Physical constants[^3] span many orders of magnitude, yet fit well within the `Float64` range (about $10^{\pm 308}$)
    - Avogadro number $N_A \sim 10^{23}$
    - Speed of light $c \sim 10^8$
    - Planck constant $\hbar \sim 10^{-34}$
    - Cosmological constant $\Lambda \sim 10^{-52}$
- Constant coefficients can be removed by using normalized variables in the governing equations
 
[^3]: [physical constants](https://en.wikipedia.org/wiki/List_of_physical_constants)
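
When an overflow check is wanted, it has to be requested explicitly. A minimal sketch, assuming the `Base.Checked` module of the standard library; the variable `result` is introduced here for illustration.

```julia
nMax = typemax(Int64)
@show nMax + 1 == typemin(Int64)     # default arithmetic wraps around silently

# checked_add performs the same addition but throws on overflow.
result = try
    Base.Checked.checked_add(nMax, 1)
catch err
    err isa OverflowError ? "OverflowError caught" : rethrow()
end
@show result

@show widen(nMax) + 1                # widen() promotes Int64 to Int128 before adding
```

The checked version pays for the extra test on every operation, which is the speed cost that the unchecked default avoids.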

### Dynamic Range

- Know the expected maximum and minimum values required
- Choose the appropriate data type **for the dynamic range**.
- Normalize where applicable, from fundamental level (e.g. working equations)
- Minimize under- and overflows : Avoid boundaries between subnormal and normal `Numbers`

### `Float`s and overflows

In [4]:
xMax = typemax(Float64);
xPrv = prevfloat(xMax);
@show xMax;
@show xPrv;
@show Inf + 1.0;
printfloatbits(xMax);
printfloatbits(xPrv);
printfloatbits(xMax + 1.0);

xMax = Inf
xPrv = 1.7976931348623157e308
Inf + 1.0 = Inf
0 | 11111111111 | 0000000000000000000000000000000000000000000000000000
0 | 11111111110 | 1111111111111111111111111111111111111111111111111111
0 | 11111111111 | 0000000000000000000000000000000000000000000000000000


### Counting with `Int`s is better than with `Float`s

- Check if your loop requires unit increments (steps of exactly 1)

In [16]:
@show n = maxintfloat(Float64,Int64);
@show prevfloat( typemax(Float64) );
@show float(typemax(Int64));
@show float(typemax(Int32));

n = maxintfloat(Float64, Int64) = 9.007199254740992e15
prevfloat(typemax(Float64)) = 1.7976931348623157e308
float(typemax(Int64)) = 9.223372036854776e18
float(typemax(Int32)) = 2.147483647e9


### `Float` types have a unit-change detection limit

- At and beyond `maxintfloat()`, `eps(x)` exceeds 1, so adding 1 to a `Float` no longer changes it

In [28]:
@show typeof(n)
@show (n+1) - n
printfloatbits(n)
@show parse(Int,"10000110100";base=2)-1023;
@show ( prevfloat(n)+1 ) - prevfloat(n);

typeof(n) = Float64
(n + 1) - n = 0.0
0 | 10000110100 | 0000000000000000000000000000000000000000000000000000
parse(Int, "10000110100"; base = 2) - 1023 = 53
(prevfloat(n) + 1) - prevfloat(n) = 1.0
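
The same limit is reached much sooner with `Float32`, where `maxintfloat(Float32)` is 2^24. The sketch below uses a hypothetical helper `count_up` to show a floating-point loop counter stalling while an integer counter stays exact.

```julia
# Hypothetical helper: count up in steps of one using type T.
function count_up(::Type{T}, nsteps) where {T}
    c = zero(T)
    for _ in 1:nsteps
        c += one(T)
    end
    return c
end

nsteps = 2^24 + 10
@show maxintfloat(Float32)
@show count_up(Float32, nsteps)   # stalls at 2^24: adding 1 is rounded away
@show count_up(Int32, nsteps)     # exact
```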


## Solving over- and underflows

- Using `Big` `Numbers`[^4]
- Making `Numbers` go `big()`

[^4]: For floating-point representation, there's [GNU Multiple-Precision Floating-point computations with correct Rounding](https://www.mpfr.org), and for integers there's [the GMP or GNU Multiple Precision Arithmetic Library](https://gmplib.org).

#### `BigInt` is `big(n::Int)`

In [50]:
nMax = typemax(Int);
@show typeof(nMax);
@show nMax;
@show nMax + 1;
@show sizeof(nMax);

nBig = big( typemax(Int) );
@show typeof(nBig)
@show nBig;
@show nBig + 1;
@show sizeof(nBig)
println( "Sized up by $( sizeof(nBig) / sizeof(nMax) ) times!" )

typeof(nMax) = Int64
nMax = 9223372036854775807
nMax + 1 = -9223372036854775808
sizeof(nMax) = 8
typeof(nBig) = BigInt
nBig = 9223372036854775807
nBig + 1 = 9223372036854775808
sizeof(nBig) = 16
Sized up by 2.0 times!


#### `BigFloat` is `big(x::Float64)`

In [49]:
xMax = prevfloat( typemax(Float64) );
@show typeof(xMax);
@show xMax;
@show xMax + 1;
@show sizeof(xMax);

xBig = big( prevfloat( typemax(Float64) ) );
@show typeof(xBig)
@show xBig;
@show xBig + 1;

@show sizeof(xBig);
println( "Sized up by $( sizeof(xBig) / sizeof(xMax) ) times!" )

typeof(xMax) = Float64
xMax = 1.7976931348623157e308
xMax + 1 = 1.7976931348623157e308
sizeof(xMax) = 8
typeof(xBig) = BigFloat
xBig = 1.797693134862315708145274237317043567980705675258449965989174768031572607800285e+308
xBig + 1 = 1.797693134862315708145274237317043567980705675258449965989174768031572607800285e+308
sizeof(xBig) = 40
Sized up by 5.0 times!
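
Note that `xBig + 1` printed the same digits as `xBig`: a `BigFloat` has a finite significand as well, 256 bits unless changed. A sketch of raising the precision locally with `setprecision()`; the value 1100 is an arbitrary choice above the roughly 1024 bits needed here.

```julia
xBig = big(prevfloat(typemax(Float64)))
@show precision(BigFloat)          # significand bits currently in use
@show xBig + 1 == xBig             # true at 256 bits: the +1 falls below the spacing near 2^1024

# The do-block form changes the precision only inside the block.
setprecision(BigFloat, 1100) do
    y = big(prevfloat(typemax(Float64)))
    @show y + 1 == y               # false: enough bits to resolve the +1
end
```

Higher precision costs more memory and compute time per operation, so it is best limited to the parts of a calculation that need it.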


## Speed always against convenience

- Compute time is needed for checking over- and underflows.
- Allocation time is needed for at least doubling the variable size.

In [12]:
m = rand(Int32);
n = rand(Int32);

markBig = @benchmark $(BigInt(m)) + $(BigInt(n));
mark32 = @benchmark $(Int32(m)) + $(Int32(n));
mark64 = @benchmark $(Int64(m)) + $(Int64(n));
mark128 = @benchmark $(Int128(m)) + $(Int128(n));

In [9]:
markBig

BenchmarkTools.Trial: 10000 samples with 979 evaluations.
 Range (min … max):   59.586 ns … 170.306 μs  ┊ GC (min … max):  0.00% … 47.72%
 Time  (median):      69.434 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   159.228 ns ±   3.507 μs  ┊ GC (mean ± σ):  23.96% ±  1.11%

In [14]:
mark128

BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.258 ns … 42.566 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.278 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.350 ns ±  1.195 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

In [13]:
@show median(markBig.times);
@show median(mark32.times);
@show median(mark64.times);
@show median(mark128.times);

median(markBig.times) = 70.35234215885947
median(mark32.times) = 2.057
median(mark64.times) = 2.05
median(mark128.times) = 2.278


In [15]:
x = rand(Float16);
y = rand(Float16);

markBigF = @benchmark $(BigFloat(x)) + $(BigFloat(y));
markF16 = @benchmark $(Float16(x)) + $(Float16(y));
markF32 = @benchmark $(Float32(x)) + $(Float32(y));
markF64 = @benchmark $(Float64(x)) + $(Float64(y));

In [16]:
@show median(markBigF.times);
@show median(markF16.times);
@show median(markF32.times);
@show median(markF64.times);

median(markBigF.times) = 40.48588709677419
median(markF16.times) = 2.769
median(markF32.times) = 2.109
median(markF64.times) = 2.054


In [17]:
markBigF

BenchmarkTools.Trial: 10000 samples with 992 evaluations.
 Range (min … max):  37.685 ns …  2.568 μs  ┊ GC (min … max): 0.00% … 96.72%
 Time  (median):     40.486 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.041 ns ± 80.256 ns  ┊ GC (mean ± σ):  7.41% ±  4.19%

In [18]:
markF64

BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.034 ns … 262.129 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.054 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.151 ns ±   2.851 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

## Trading performance for accuracy

- `@fastmath` switches to faster floating-point operations that may violate strict IEEE 754 semantics
    - analogous to the `-ffast-math` option in `clang` or `gcc` compilers
    - `-fast` option in some Fortran compilers
    - `-Ox` additional optimization options for some `clang` and Fortran compilers
- Non-standard algorithms can speed up computation
- Speed vs accuracy

### Sample integration-like function

- Involves repeating mathematical operations
- Cumulative or collective algorithms
- Passing of data between processes (functions or methods)

### Usual implementation of a `sum_diff()`

In [43]:
function sum_diff(x)
    n = length(x)
    d = 1/(n-1)
    s = zero( eltype(x) ) #ensure type stability
    s = s + (x[2] - x[1]) / d
    for nn in 2:(n-1)
        s = s + (x[nn] - x[nn-1]) / (2*d)
    end
    s = s + (x[n] - x[n-1]) / d
    return s
end

sum_diff (generic function with 1 method)

In [49]:
x = rand(2_000);

s0 = sum_diff(x);
@show s0;

s0 = 1106.840504653594


In [50]:
mark0 = @benchmark s0 = sum_diff($x);
mark0

BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.238 μs …  45.336 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.370 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.417 μs ± 762.132 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

### Quick and dirty but `@fastmath`

In [53]:
function sum_diff_fast(x)
    n = length(x)
    d = 1/(n-1)
    s = zero( eltype(x) ) #ensure type stability
    @fastmath s = s + (x[2] - x[1]) / d   # fast and dirty
    @fastmath for nn in 2:(n-1)           # fast and dirty
        s = s + (x[nn] - x[nn-1]) / (2*d)
    end
    @fastmath s = s + (x[n] - x[n-1]) / d # fast and dirty
    return s
end

sum_diff_fast (generic function with 1 method)

#### Compare results and `@benchmark`s

In [57]:
x = rand(2_000);
s0 = sum_diff(x);
s1 = sum_diff_fast(x);
@show s0;
@show s1;
@show abs(s1-s0);
@show eps(s0);

s0 = 408.7318703763275
s1 = 408.73187037632266
abs(s1 - s0) = 4.831690603168681e-12
eps(s0) = 5.684341886080802e-14
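
The difference between `s0` and `s1` does not say which of the two is closer to the true value. One way to find out is to repeat the sum in `BigFloat` and use it as a reference. The sketch below uses a simplified accumulation loop of the same shape, with hypothetical names `acc` and `acc_fast` and deterministic input data; it is not the `sum_diff()` of the book.

```julia
# Hypothetical accumulation loop, strict IEEE arithmetic.
function acc(x)
    s = zero(eltype(x))
    for k in 2:length(x)
        s = s + (x[k] - x[k-1]) * x[k]
    end
    return s
end

# Same loop, with @fastmath allowing reordered floating-point operations.
function acc_fast(x)
    s = zero(eltype(x))
    @fastmath for k in 2:length(x)
        s = s + (x[k] - x[k-1]) * x[k]
    end
    return s
end

x = [sin(0.37 * k) for k in 1:2_000]   # deterministic test data
ref = acc(big.(x))                     # BigFloat reference from the same inputs
@show Float64(abs(acc(x) - ref))       # error of the strict version
@show Float64(abs(acc_fast(x) - ref))  # error of the @fastmath version
@show eps(Float64(ref))                # spacing of Float64 at the result
```

Both errors are expected to be a small multiple of `eps()` at the result; which one is smaller depends on the data and on how the compiler reorders the operations.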


In [40]:
mark1 = @benchmark s1 = sum_diff_fast($x);
mark1

BenchmarkTools.Trial: 10000 samples with 714 evaluations.
 Range (min … max):  174.690 ns … 846.989 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     190.377 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   196.163 ns ±  27.601 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

In [58]:
println("Speedup of $( median(mark0.times) / median(mark1.times) ) times.")

Speedup of 12.447786045952743 times.


# Fin

- [X] Analyzed the floating point layout or architecture used by your Julia installation.
- [X] Demonstrated tradeoff between runtime speed and over- or underflow checks in [number representations in Julia](https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/).
- [X] Shown how much the `@fastmath` macro speeds up computation while trading off some level of accuracy. The `sum_diff()` function in the main book reference may be replicated for this purpose.