# Using Immutables for Efficiency

1. When the CPU runs an instruction, it operates on values held in registers. A CPU has very few of these; they are roughly what the `%1`s and `%2`s correspond to in the output of `@code_llvm`. If a value is not in a register, the CPU has to fetch it from memory, and this is SLOW!!!

2. The CPU first looks in the L1 cache, then the L2 cache, then main memory, and finally swap space. The L1 and L2 caches are small (on the order of kilobytes to a few megabytes), but hitting them often can be orders of magnitude faster than going to main memory just as often.

3. The computer optimistically brings data from main memory into the caches when you access a chunk of it. Hence, if you access data that is contiguous in memory, the neighbouring values get brought into the cache asynchronously and your program will be really fast.
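As a rough illustration of point 3, the sketch below sums the same matrix in two loop orders. Julia stores arrays column-major, so the first function walks memory contiguously while the second jumps a whole column length on every step. The function names are made up for this example, and the code uses current Julia 1.x syntax.

```julia
# Julia arrays are column-major: A[i, j] and A[i+1, j] are neighbours in memory.

# Inner loop runs down a column, so consecutive reads are adjacent in memory.
function sum_contiguous(A::Matrix{Float64})
    s = 0.0
    for j in 1:size(A, 2), i in 1:size(A, 1)   # i is the inner loop
        s += A[i, j]
    end
    return s
end

# Inner loop runs along a row, so each read is size(A, 1) elements from the last.
function sum_strided(A::Matrix{Float64})
    s = 0.0
    for i in 1:size(A, 1), j in 1:size(A, 2)   # j is the inner loop
        s += A[i, j]
    end
    return s
end

A = ones(1000, 1000)
sum_contiguous(A); sum_strided(A)   # warm-up calls so compilation is not timed
println("contiguous: ", @elapsed(sum_contiguous(A)), " s")
println("strided:    ", @elapsed(sum_strided(A)), " s")
```

Both functions return the same sum. The contiguous version is expected to be faster on most machines, but the size of the gap depends on the cache sizes and the matrix dimensions, so no timings are quoted here.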

## Memory layout of an array of types

In [1]:
type Typ
    x::Int16
    y::Int16
end

The size of `Typ` is 4 bytes.

In [2]:
sizeof(Typ(2,2))

4

In [3]:
@time typ_arr = [Typ(i%127,i%127) for i=1:10^6];

  0.044205 seconds (1.02 M allocations: 23.743 MB, 21.78% gc time)


Notice the allocations: roughly one per element. Then notice that the array is 2x bigger than it should be!!

In [4]:
sizeof(typ_arr)

8000000

In [5]:
sizeof(typ_arr) / 10^6 # bytes per object

8.0

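This is because mutable objects are **passed by reference**!! The array stores a pointer to each object (8 bytes on a 64-bit machine) rather than the 4 bytes of data itself: the objects are being "boxed".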
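A small sketch of where the 8 bytes come from, written in current Julia 1.x syntax (where `type` is spelled `mutable struct`). `MutPoint` is a hypothetical stand-in for `Typ`. The array holds one pointer per element, so its size tracks the pointer width of the machine rather than the size of the fields.

```julia
# Hypothetical stand-in for `Typ`, in Julia 1.x syntax.
mutable struct MutPoint
    x::Int16
    y::Int16
end

mut_arr = [MutPoint(i, i) for i in 1:4]

# The two fields add up to 4 bytes ...
field_bytes = 2 * sizeof(Int16)

# ... but each array slot is a pointer to a separately allocated object.
slot_bytes = sizeof(mut_arr) ÷ length(mut_arr)

println("fields: ", field_bytes, " bytes, array slot: ", slot_bytes, " bytes")
println("pointer size on this machine: ", sizeof(Ptr{Cvoid}), " bytes")
```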

In [6]:
Base.:+(a::Typ, b::Typ) = Typ(a.x+b.x, a.y+b.y)

(The `+` method above is needed for `sum` later on.) Passing by reference is what makes the following possible:

In [7]:
function someone_else_doing_something_else(a::Typ)
    a.x = 42
end

someone_else_doing_something_else(typ_arr[3])
typ_arr[3]

Typ(42,3)

`sum` could also have been much more efficient: here it allocates about once per element.

In [8]:
@time sum(typ_arr)

  0.086339 seconds (1.02 M allocations: 16.020 MB, 25.38% gc time)


Typ(19820,19781)

## Memory layout of an array of Immutables

In [9]:
immutable Imm
    x::Int16
    y::Int16
end

In [10]:
sizeof(Imm(2,2))

4

In [11]:
@time imm_arr = [Imm(i%127,i%127) for i=1:10^6];

  0.060103 seconds (18.44 k allocations: 4.649 MB)


In [12]:
sizeof(imm_arr)

4000000

**Seems correct!**

Since immutables can never be changed, their value _is_ their identity, so the compiler can **pass them by value**.
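To make "their value is their identity" concrete, here is a sketch in current Julia 1.x syntax (`struct` for `immutable`, `mutable struct` for `type`). `ImmPoint` and `MutPoint` are hypothetical names. Two immutables with equal fields are indistinguishable under `===`, and the array stores the fields inline, which the `reinterpret` call makes visible.

```julia
struct ImmPoint
    x::Int16
    y::Int16
end

mutable struct MutPoint
    x::Int16
    y::Int16
end

# Immutables with equal fields cannot be told apart; mutables are distinct objects.
println(ImmPoint(1, 2) === ImmPoint(1, 2))   # true
println(MutPoint(1, 2) === MutPoint(1, 2))   # false

# Plain-data immutables are stored inline, with no per-element pointer.
println(isbitstype(ImmPoint))   # true
println(isbitstype(MutPoint))   # false

imm_pts = [ImmPoint(1, 2), ImmPoint(3, 4)]
println(sizeof(imm_pts))        # 8: two elements of 4 bytes each

# Viewing the same memory as Int16 shows the fields sitting side by side.
flat = collect(reinterpret(Int16, imm_pts))
println(flat)                   # Int16[1, 2, 3, 4]
```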

In [13]:
Base.:+(a::Imm, b::Imm) = Imm(a.x+b.x, a.y+b.y)

In [14]:
@time sum(imm_arr)

Summing the immutable array should not need a heap allocation per element, the same as adding Float64 *values*:

In [15]:
x = rand(10^6)

@time sum(x)

  0.033010 seconds (13.57 k allocations: 635.952 KB)


499541.1418369091

The compiler can do this optimization because it knows someone else won't be changing the insides of the `Imm` object!

In [16]:
function someone_else_doing_something_else(a::Imm)
    a.x = 42 # This is not allowed!!
end

someone_else_doing_something_else(imm_arr[3])

LoadError: type is immutable
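Since a field of an immutable cannot be assigned, one common pattern is to build a new value and store it back into the array slot. A sketch in current Julia 1.x syntax, with `ImmPoint` and `with_x` as hypothetical names:

```julia
struct ImmPoint
    x::Int16
    y::Int16
end

# Return a copy of `p` with a different `x`; `p` itself is left untouched.
with_x(p::ImmPoint, x) = ImmPoint(x, p.y)

pts = [ImmPoint(i, i) for i in 1:5]

# The array itself is still mutable, so a slot can be overwritten with a new value.
pts[3] = with_x(pts[3], 42)

println(pts[3].x, " ", pts[3].y)   # 42 3
```

The replacement is an ordinary 4-byte store into the array, so this should not reintroduce boxing.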

If you don't know the types of the fields of an immutable type in advance, you can tack on a type parameter.

For example:

In [17]:
immutable ImmParam{T}
    x::T
    y::T
end

In [18]:
sizeof(ImmParam{Int128}) # sizeof also works on the type itself

32

In [19]:
sizeof(ImmParam{Int8})

2

In [20]:
ImmParam{Int8} == ImmParam{Int64}

false

In [21]:
ImmParam(1.0,2.0) # Julia can automatically infer this

ImmParam{Float64}(1.0,2.0)

In [22]:
ImmParam(1,2)

ImmParam{Int64}(1,2)

In [23]:
ImmParam(1.0,2)

LoadError: MethodError: no method matching ImmParam{T}(::Float64, ::Int64)
Closest candidates are:
  ImmParam{T}{T}(::T, ::T) at In[17]:2
  ImmParam{T}{T}(::Any) at sysimg.jl:53
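If mixed arguments should be accepted, one option (an addition for illustration, not something the notebook itself does) is an outer constructor that promotes both arguments to a common type first. The sketch uses current Julia 1.x syntax and a hypothetical type `Pair2` so that it does not clash with `ImmParam`:

```julia
struct Pair2{T}
    x::T
    y::T
end

# Fallback for arguments of different types: promote them to a common type,
# then dispatch to the default constructor Pair2(::T, ::T).
Pair2(x, y) = Pair2(promote(x, y)...)

p = Pair2(1.0, 2)    # the Int 2 is promoted to 2.0
println(typeof(p.x), " ", typeof(p.y))   # Float64 Float64
```

Arguments that already share a type still go through the default constructor, because `Pair2(::T, ::T)` is the more specific method.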

### And it is packed tightly!

In [24]:
@time imm_par_array_int16 = [ImmParam{Int16}(2,3) for i = 1:10^6];

  0.046420 seconds (20.36 k allocations: 4.710 MB)


In [25]:
sizeof(imm_par_array_int16)

4000000

In [26]:
@time imm_par_array_int8 = [ImmParam{Int8}(2,3) for i = 1:10^6];

  0.095234 seconds (20.35 k allocations: 2.803 MB)


In [27]:
sizeof(imm_par_array_int8)

2000000

In [53]:
@time imm_par_array_cplx = [ImmParam(2+3im,3+2im) for i = 1:10^6];

  0.066036 seconds (13.98 k allocations: 31.117 MB, 2.43% gc time)


In [54]:
sizeof(imm_par_array_cplx)

32000000

In [56]:
Base.:+(a::ImmParam, b::ImmParam) = ImmParam(a.x+b.x, a.y+b.y)



In [62]:
@time sum(imm_par_array_cplx)

  0.043371 seconds (5 allocations: 208 bytes)


ImmParam{Complex{Int64}}(2000000 + 3000000im,3000000 + 2000000im)

## But be careful! Vectors of Heterogeneous types force boxing!

In [75]:
["xyzabc", 1+2im, 1, 1.0]

4-element Array{Any,1}:
   "xyzabc"
 1+2im     
  1        
  1.0      

In [68]:
[ImmParam(UInt8(1),UInt8(1)), ImmParam(1.0,1.0)] |> sizeof

16

In [72]:
@time heter_arr = [i%2 == 0 ? ImmParam(UInt8(1),UInt8(1)) : ImmParam(1.0,1.0) for i = 1:10^6]

  0.341083 seconds (3.03 M allocations: 85.310 MB, 50.88% gc time)


1000000-element Array{ImmParam,1}:
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ⋮                         
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)
 ImmParam{Float64}(1.0,1.0)
 ImmParam{UInt8}(0x01,0x01)

In [74]:
@time sum(heter_arr)

  0.267709 seconds (1.00 M allocations: 30.518 MB, 85.09% gc time)


ImmParam{Float64}(1.0e6,1.0e6)
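One way to spot this ahead of time is to look at the element type of the array: if it is not concrete, the elements cannot be stored inline. A sketch in current Julia 1.x syntax, with `Pair2` as a hypothetical stand-in for `ImmParam`. Converting everything to one parameter (here `Float64`, an arbitrary choice for the example) restores the packed layout:

```julia
struct Pair2{T}
    x::T
    y::T
end

mixed = [Pair2(UInt8(1), UInt8(1)), Pair2(1.0, 1.0)]
println(isconcretetype(eltype(mixed)))             # false: elements are boxed
println(sizeof(mixed) == 2 * sizeof(Ptr{Cvoid}))   # true: one pointer per element

# Pair2{Float64}(...) converts its arguments, so both elements share one concrete type.
uniform = [Pair2{Float64}(UInt8(1), UInt8(1)), Pair2(1.0, 1.0)]
println(isconcretetype(eltype(uniform)))           # true: elements are stored inline
println(sizeof(uniform))                           # 32: two elements of 16 bytes each
```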

## Summary

- Use immutables wherever you consider something to be a *value*. Use `type` when something is a *state*.
- Never create a large array of mutable objects! Each one is heap-allocated, which kills performance and gives the GC a hard time.
- Parameterize the type if the field types need to vary, and keep the elements of an array to a single concrete type to avoid boxing.