# Performant Code

## Learning Objectives 
- Understand basic strategies for writing high-performance Julia code 
- Learn about type stability and why it's important for performance 
- Recognise the impact of global variables on performance and how to avoid them 
- Appreciate the benefit of using built-in functions and vectorised operations for optimisation 
- Choose appropriate data structures for a task to improve performance 
- Understand memory management techniques (like avoiding unnecessary copies) to write efficient code 
- Measure running time and memory allocation of code and identify bottlenecks using simple tools 

One of Julia's major appeals is performance. You can often write high-level Julia code that runs nearly as fast as code written in lower-level languages. To fully unlock this performance, however, it helps to be aware of a few tips and practices. This lesson introduces some key concepts: type stability, avoiding globals, using efficient approaches, and basic profiling and timing.

## Overview of Performance Tuning Strategies 
Performance tuning in Julia often comes down to writing code that is easy for the compiler to optimise, which includes: 
- Ensuring computations are type-stable: the types of variables don't change unpredictably. 
- Avoiding global variables in tight loops or computations: use functions to encapsulate logic. 
- Utilising Julia's vectorised (broadcast) operations and built-in functions, which are highly optimised, instead of manually looping in an inefficient way. 
- Choosing the right data structure, e.g. using arrays, tuples, and dictionaries appropriately. 
- Reducing memory allocations when possible, for example, by modifying data in place or using views for subarrays instead of making copies. 
- Measuring and profiling to find where the time is actually being spent, so you can optimise where it matters.

## Efficient Julia Code 

### Type Stability 

A function is type-stable if the type of its output can be determined from the types of its inputs, **without having to run the function**. In other words: 
- given inputs of known types, 
- the compiler can predict what **type** the result will be.

Example of a type-stable function: 

```Julia
function add(x::Float64, y::Float64)
    return x + y
end
```
For the function above:
- If you give two `Float64`s, the result is always a `Float64`.
- The compiler **knows this immediately**.

Example of a type-unstable function: 
```Julia
function maybe_add(x, y)
    if rand() > 0.5
        return x + y
    else
        return string(x, y)
    end
end
```
- Sometimes it returns a **number**, sometimes a **String**. 
- The compiler **can't predict** the output type just from the input types. 
- This forces Julia to **insert expensive type checks** at runtime. 

This raises the question: **does specifying types in function signatures improve performance?** The answer is **no, not necessarily.**

Specifying types such as: 

```Julia 
function myfun(x::Float64)
    ...
end
```

Annotating the argument in this way will not automatically make the code faster. The reason is that Julia **compiles specialised versions of functions for the types it sees anyway**, even if you didn't specify types.
What matters for performance is **what happens inside the function**: are the types predictable?

**You can write a function with no type annotations** but still make it very fast if the output type is **predictable**. 

The rule of thumb is:
- Specifying types at the function input is good for readability, documentation, and dispatch (choosing which method is called).
- Type stability inside the function is crucial for performance.
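
As a small illustration of this rule of thumb, the sketch below defines a function with no type annotations and asks the compiler which return type it infers for two different argument types. The name `double_plus_one` is made up for this example; `Base.return_types` is a reflection helper that reports inferred return types without running the function.

```Julia
# Hypothetical example function: no type annotations at all.
double_plus_one(x) = 2x + one(x)

# Ask the compiler what it infers for different argument types.
# Julia specialises the function separately for each of them.
inferred_float = only(Base.return_types(double_plus_one, (Float64,)))
inferred_int   = only(Base.return_types(double_plus_one, (Int,)))

println("Float64 argument -> ", inferred_float)
println("Int argument     -> ", inferred_int)
```

In both cases the inferred type is concrete, even though the signature says nothing about types.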

To check whether a function is type-stable, you can use: 
```Julia 
@code_warntype your_function(args)
```

The output highlights types in **yellow** (or **red**) where Julia cannot predict a single concrete type.

#### Type-Stable Function: `add`

```Julia
function add(x::Float64, y::Float64)
    return x + y
end

@code_warntype add(1.0, 2.0)
```

```
MethodInstance for add(::Float64, ::Float64)
  from add(x::Float64, y::Float64) @ Main In[1]:1
Arguments
  #self#::Core.Const(add)
  x::Float64
  y::Float64
Body::Float64
1 ─ %1 = (x + y)::Float64
└──      return %1
```



Above, you can see a type-stable function `add`. The output from `@code_warntype` shows that the result will be a `Float64`; all the variable types are clearly inferred, and there are no concerning red or yellow types. Given two `Float64` arguments, the output is predictably a `Float64`. Julia can compile optimised machine code with no dynamic checks, which is the ideal case for performance. 

#### Mildly Type-Unstable Function: `maybe_add`

```Julia
function maybe_add(x, y)
    if rand() > 0.5
        return x + y
    else
        return string(x, y)
    end
end

@code_warntype maybe_add(1.0, 2.0)
```

```
MethodInstance for maybe_add(::Float64, ::Float64)
  from maybe_add(x, y) @ Main In[2]:1
Arguments
  #self#::Core.Const(maybe_add)
  x::Float64
  y::Float64
Body::Union{Float64, String}
1 ─ %1 = Main.rand()::Float64
│   %2 = (%1 > 0.5)::Bool
└──      goto #3 if not %2
2 ─ %4 = (x + y)::Float64
└──      return %4
3 ─ %6 = Main.string(x, y)::String
└──      return %6
```



Above, you can see a mildly type-unstable function `maybe_add`. In the coloured notebook output, the return type `Body::Union{Float64, String}` is highlighted in yellow, because Julia knows the function might return one of two types. The function is **type-unstable** because it may return either a `Float64` or a `String` depending on a random condition. Any code that uses the result must handle both possibilities at runtime, which introduces extra branching and, in less favourable cases, dynamic dispatch, hurting performance compared to type-stable code. 
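
A less contrived source of the same kind of instability is returning a literal of a different type from one branch. The following sketch (the function names are illustrative and not part of the lesson) shows the problem and one possible fix using `zero(x)`, which returns a zero of the same type as `x`.

```Julia
# Unstable: returns x (e.g. a Float64) in one branch and the Int literal 0 in the other.
clamp_neg_unstable(x) = x > 0 ? x : 0

# Stable: zero(x) has the same type as x, so both branches agree.
clamp_neg_stable(x) = x > 0 ? x : zero(x)

# Inspect what the compiler infers for a Float64 argument.
unstable_type = only(Base.return_types(clamp_neg_unstable, (Float64,)))
stable_type   = only(Base.return_types(clamp_neg_stable, (Float64,)))

println("unstable version infers: ", unstable_type)
println("stable version infers:   ", stable_type)
```

The unstable version is inferred as a union of two types, while the stable version is inferred as a single concrete type.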

#### Severely Type-Unstable Function: `bad_sum`

```Julia
function bad_sum(arr)
    result = 0
    for x in arr
        result = result + x
    end
    return result
end

@code_warntype bad_sum([1, 2.0, "3"])
```

```
MethodInstance for bad_sum(::Vector{Any})
  from bad_sum(arr) @ Main In[3]:1
Arguments
  #self#::Core.Const(bad_sum)
  arr::Vector{Any}
Locals
  @_3::Union{Nothing, Tuple{Any, Int64}}
  result::Any
  x::Any
Body::Any
1 ─       (result = 0)
│   %2  = arr::Vector{Any}
│         (@_3 = Base.iterate(%2))
│   %4  = (@_3 === nothing)::Bool
│   %5  = Base.not_int(%4)::Bool
└──       goto #4 if not %5
2 ┄ %7  = @_3::Tuple{Any, Int64}
│         (x = Core.getfield(%7, 1))
│   %9  = Core.getfield(%7, 2)::Int64
│         (result = result + x)
│         (@_3 = Base.iterate(%2, %9))
│   %12 = (@_3 === nothing)::Bool
│   %13 = Base.not_int(%12)::Bool
...
```

Above, you can see a severely type-unstable function, `bad_sum`. In the coloured notebook output, `result`, `x` and the return type are all highlighted in red as `Any`. The input array has mixed types: `Int`, `Float64`, and `String`, so it is stored as a `Vector{Any}`. (`@code_warntype` only inspects inferred types; actually calling `bad_sum` on this array would throw a `MethodError`, because `+` is not defined between a number and a `String`.) As a result: 
- Julia cannot infer the type of `result` or `x`.
- The loop performs **dynamic dispatch** on every iteration, meaning the exact method used is determined at runtime, not at compile time. 
- Intermediate values are boxed, meaning they are stored on the heap with additional type information rather than directly in registers. As a result, performance is dramatically worse due to slower memory access, dynamic type checks, and increased garbage collection overhead. 
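
One way to avoid these problems is to keep the array's element type concrete and to start the accumulator with that same type. The sketch below is a possible rewrite, not the lesson's own solution; `stable_sum` is a hypothetical name.

```Julia
# Sketch of a type-stable sum over a numeric vector.
function stable_sum(arr::AbstractVector{T}) where {T<:Number}
    result = zero(T)          # accumulator starts with the element type
    for x in arr
        result += x
    end
    return result
end

mixed   = [1, 2.0, "3"]       # Vector{Any}: no common concrete element type
numbers = [1, 2.0, 3]         # Vector{Float64}: Julia promotes Int and Float64

println(eltype(mixed), " vs ", eltype(numbers))
println(stable_sum(numbers))
```

Because `numbers` has a concrete element type, the compiler knows the type of `x` and `result` in every iteration.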

### Avoiding Global Variables 

Julia's functions are JIT-compiled and optimised when you first call them. However, if you work in global scope (for example, running a loop at the top level of a script or notebook that refers to global variables), the compiler has a much harder job optimising, because a non-constant global variable can change type or value at any time.

**Always try to put performance-critical code inside functions**. Then, call those functions from the global scope. This way, the code is compiled in a local scope where the types of variables are known. 

For example, instead of: 

```Julia 
# Avoid this for performance-critical code
numbers = rand(1000)
total = 0
for x in numbers
    total += x
end
total
```

You would want to write: 

```Julia 
function sum_array(arr)
    total = 0.0
    for x in arr
        total += x
    end
    return total
end

numbers = rand(1000)
total = sum_array(numbers)
```

Inside `sum_array`, `total` is local and of a stable type (`Float64`), and `x` will be the type of the array elements (`Float64`). The compiler can optimise this well. In the global loop version, `total` is a non-constant global variable whose type the compiler cannot rely on (here it even changes from `Int` to `Float64` on the first iteration), so optimisation is limited. 

If you are using global constants (like a configuration value that doesn't change), declare them as `const` in global scope to help performance, e.g. `const PI = 3.14`. 
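
The effect of `const` can be seen by asking the compiler what it infers. In the sketch below, `SCALE_FACTOR`, `offset`, `scale` and `shift` are hypothetical names chosen for illustration; the second `shift` method shows the alternative of passing the value as an argument instead of reading a global.

```Julia
# Hypothetical configuration value; `const` promises its type will not change.
const SCALE_FACTOR = 2.5

# A non-constant global, for comparison: its type could change at any time.
offset = 1.0

# Reads only the constant global, so the result type can be inferred.
scale(x) = SCALE_FACTOR * x

# Reads the non-constant global, so the compiler cannot rely on its type.
shift(x) = x + offset

# Passing the value as an argument avoids the global altogether.
shift(x, off) = x + off

println(only(Base.return_types(scale, (Float64,))))
println(only(Base.return_types(shift, (Float64,))))
println(only(Base.return_types(shift, (Float64, Float64))))
```

Comparing the three printed types shows which versions the compiler can fully specialise.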

### Utilising Built-In Functions and Vectorised Operations 

Julia's standard library and well-known packages have many optimised routines. Most are written in carefully tuned Julia, while dense linear algebra such as `A * B` calls optimised BLAS/LAPACK libraries, which may use multi-threading. Examples include `sum`, `maximum`, linear algebra operations (`A * B`) and sorting (`sort`). 
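
The short sketch below shows a few built-in reductions and the dot (broadcast) syntax in use. The variable names are arbitrary; every function used is part of Base Julia.

```Julia
x = collect(1.0:5.0)

# Built-in reductions: no hand-written loop needed.
total     = sum(x)
largest   = maximum(x)
sum_of_sq = sum(abs2, x)      # applies abs2 to each element while summing

# Broadcasting with dots applies an operation element-wise;
# chained dotted operations are fused into a single loop.
y = 2 .* x .+ 1

# Writing into an existing array with .= avoids allocating a new one.
z = similar(x)
z .= sqrt.(x) .+ x

println((total, largest, sum_of_sq))
println(y)
```

Passing a function to `sum`, as in `sum(abs2, x)`, avoids building a temporary array of squares first.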

### Using Appropriate Data Structures 
Some key considerations when determining which data structure to use include: 
- If you need random access to elements by index and the collection will grow or shrink, use a `Vector` (a one-dimensional `Array`). 
- If you need to look up values by keys, use a `Dict` instead of searching through an array each time. 
- If you have a small, fixed set of values of heterogeneous types, a `Tuple` can be helpful. Tuples are immutable, and the types of their elements are part of the tuple's type, which makes them very efficient for specific uses, like returning multiple values from a function. 
- If you need stack or queue behaviour, you can still use arrays (with `push!` and `pop!` for a stack, and `push!` and `popfirst!` for a queue). 
- If you have large boolean arrays, consider `BitVector`, which stores each value in a single bit and is therefore memory efficient. 
- For mathematical operations, native numeric types (`Int`, `Float64`) are faster than arbitrary-precision or rational types, so only use `BigInt`, `BigFloat` and `Rational` when needed. 
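
A few of these choices can be sketched in code. The data and the helper `extremes` below are invented for illustration only.

```Julia
people = ["alice", "bob", "carol"]
ages   = [31, 27, 45]

# Linear search: scans the array on every lookup.
idx = findfirst(==("bob"), people)
age_by_search = ages[idx]

# Dict: build once, then each lookup is a hash-table access.
age_of = Dict(zip(people, ages))
age_by_dict = age_of["bob"]

# Tuple: a cheap way to return several values from a function.
extremes(v) = (minimum(v), maximum(v))
lo, hi = extremes(ages)

# Vector as a stack (push!/pop!) and as a queue (push!/popfirst!).
stack = Int[]; push!(stack, 1); push!(stack, 2)
top = pop!(stack)             # last in, first out
queue = Int[]; push!(queue, 1); push!(queue, 2)
front = popfirst!(queue)      # first in, first out

println((age_by_search, age_by_dict, lo, hi, top, front))
```

For three entries the difference between the search and the `Dict` is negligible; the `Dict` pays off when lookups are repeated many times over a large collection.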

### Memory Management and Avoiding Unnecessary Allocation 

Excessive memory allocation can slow down code (due to both allocation overhead and garbage collection); some strategies to reduce the impact include: 
- Reuse arrays instead of creating new ones in a loop. For instance, if you need to collect results in an array inside a loop, consider allocating it outside and filling it in each iteration. 
- Use **views** to refer to a subarray without copying. `view(A, 1:10)` gives a "window" into array `A` from index 1 to 10 without allocating a new array for that section. 
- Use in-place operations if possible. Many functions have an in-place form (often with a `!` at the end of their name by convention). For example, `sort!(array)` will sort an array in place (no new array created, just rearranging). `push!` and `append!` modify arrays rather than making new ones. 
- Avoid converting types repeatedly inside loops. If you need a value of a particular type, convert once outside or ensure your data is already in the desired type before the loop.
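
The following sketch contrasts a copy with a view and shows a buffer being reused inside a loop. The helper `fill_and_total!` is a hypothetical function written for this example.

```Julia
A = collect(1.0:10.0)

# Slicing copies: changes to the copy do not affect A.
c = A[1:3]
c[1] = -1.0

# A view shares memory with A: changes write through.
v = view(A, 1:3)              # equivalently: @view A[1:3]
v[2] = 20.0

# Refill a preallocated buffer in place instead of creating a new array each time.
function fill_and_total!(buffer, values, n)
    total = 0.0
    for k in 1:n
        buffer .= k .* values # reuses `buffer`; no new array per iteration
        total += sum(buffer)
    end
    return total
end

buffer = zeros(3)
running_total = fill_and_total!(buffer, v, 4)

# In-place sort rearranges the existing array instead of returning a new one.
B = [3, 1, 2]
sort!(B)

println(A[1:3], " ", running_total, " ", B)
```

After running this, `A[1]` is unchanged by the edit to the copy, while `A[2]` reflects the edit made through the view.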

## Measuring Performance 

Julia provides some simple macros to measure execution time and memory: 
- `@time expression` runs the expression once and prints the time taken and memory allocated. 
- `@time` is good for a quick check, but the first run of a function includes compilation time. Run it twice to see the actual execution time after compilation. 
- `@benchmark` from the `BenchmarkTools.jl` package (needs `using BenchmarkTools`) provides more rigorous timing that runs multiple times and gives statistics, avoiding compilation cost influence.
- `@timed` will return timing info programmatically. 
- `@allocated` returns just the number of bytes allocated by running an expression. 
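
As a minimal sketch of the Base macros in this list (the function `work` is a made-up workload, and the numbers printed will differ from machine to machine):

```Julia
# Hypothetical workload: allocate n random numbers and sum their squares.
work(n) = sum(abs2, rand(n))

work(10)                             # warm-up call so compilation is not measured

@time work(1_000_000)                # prints time and allocations

stats = @timed work(1_000_000)       # same information, returned as a value
println("seconds: ", stats.time, ", bytes: ", stats.bytes)

bytes = @allocated work(1_000_000)   # only the number of bytes allocated
println("allocated: ", bytes, " bytes")
```

With `BenchmarkTools.jl` installed, `@benchmark work(1_000_000)` would repeat the measurement many times and report statistics instead of a single run.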

```Julia
function custom_sum(arr)
    s = 0
    for i in 1:length(arr)
        s += arr[i]
    end
    return s
end

data = rand(100000000)

println("Running inefficient_sum...")
@time total1 = custom_sum(data)

println("\n Running built-in sum...")
@time total3 = sum(data)
```

```
Running inefficient_sum...
  0.484978 seconds (3.98 k allocations: 269.438 KiB, 0.62% compilation time)

 Running built-in sum...
  0.021193 seconds (33.21 k allocations: 2.215 MiB, 40.67% compilation time)

4.9997776831760295e7
```

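These numbers are only illustrative; actual times depend on your machine, and both measurements include some compilation time because each function was being called for the first time. The key point is that the built-in `sum(data)` is highly optimised and comfortably beats the manual loop in `custom_sum` (labelled `inefficient_sum` in the printed output). Note that `custom_sum` also starts its accumulator as the integer `0`, so `s` changes from `Int` to `Float64` inside the loop, a small type instability of the kind discussed earlier that likely adds to the gap.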

#### Profiling for Bottlenecks 

If you have a complex program and you want to see where it spends time, you can use the built-in profiler: 

```Julia 
using Profile
@profile my_long_running_function()
```

Then use `Profile.print()` or the `ProfileView.jl` package to analyse the results. Profiling tells you which functions or lines are taking the most time. For simpler cases, you might not need this level of detail.
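
A complete, minimal profiling session might look like the sketch below. The function `busy_work` is a hypothetical stand-in for a long-running function, and the keyword arguments passed to `Profile.print` are just one reasonable choice for a compact report.

```Julia
using Profile

# Hypothetical workload standing in for a long-running function.
function busy_work(n)
    acc = 0.0
    for i in 1:n
        acc += sqrt(i) * sin(i)
    end
    return acc
end

busy_work(10)                    # compile first so the profile shows run time
Profile.clear()                  # discard samples from earlier runs
@profile busy_work(20_000_000)   # collect samples while the function runs

# Flat report, sorted by sample count, hiding rarely sampled lines.
Profile.print(format = :flat, sortedby = :count, mincount = 5)
```

Lines with the highest counts are where the program spent most of its time while being sampled.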

## Exercise 1: Analysing the Performance of Code 

Given the three functions below, use what we've discussed so far about type stability, allocations, and performance to understand **why they perform differently**.

```Julia 
function method_1(N)
    arr = Int[]  
    for i in 1:N
        push!(arr, i)      
    end
    return sum(arr)        
end

function method_2(N)
    arr = collect(1:N)    
    return sum(arr)
end


function method_3(N)
    return N*(N+1)÷2
end
```
