# Chapter 25: Designing for Performance

Performance is a feature. Users expect applications to be responsive, fast, and resource‑efficient. In C#, writing high‑performance code often means understanding how memory is managed, minimizing allocations, and using modern language features that give you fine‑grained control. This chapter focuses on techniques and tools to design and measure performance in your C# applications.

You'll learn:

- The role of **`Span<T>`** and **`Memory<T>`** for working with contiguous memory without allocations.
- **`ref struct`** and **`ref returns`** – how to safely return references and avoid copies.
- Why **`StringBuilder`** is essential for string concatenation in loops.
- **Benchmarking** with **BenchmarkDotNet** to measure performance accurately.
- **Profiling** – identifying bottlenecks in your application.
- Other performance tips: object pooling, avoiding LINQ overhead in hot paths, and using `ArrayPool`.

By the end, you'll have a toolkit for writing high‑performance C# code and the ability to measure and validate your improvements.

---

## 25.1 Understanding Performance in .NET

Performance in .NET involves several factors:

- **CPU time** – how much work the processor does.
- **Memory allocations** – creating objects on the heap, which eventually triggers garbage collection.
- **Latency** – time waiting for I/O or other operations.
- **Throughput** – number of operations per second.

The garbage collector (GC) is a major consideration. Frequent allocations lead to more GC collections, which can pause your application. Writing high‑performance code often means **reducing allocations** and **reusing objects** where possible.
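
To make the cost of allocations concrete, the sketch below compares a loop that allocates a fresh array on every iteration with one that reuses a single array. It relies on `GC.GetAllocatedBytesForCurrentThread()`; the `AllocationProbe` class and its method names are hypothetical, chosen only for illustration, and exact byte counts will vary by runtime.

```csharp
using System;

public static class AllocationProbe
{
    // Keeps each new array reachable so the allocation cannot be optimized away.
    private static int[] _sink;

    // Returns the number of bytes allocated on the current thread while 'action' runs.
    public static long Measure(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long AllocateFreshArrays(int iterations)
    {
        return Measure(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                _sink = new int[1024]; // a new 4 KB array on every iteration
            }
        });
    }

    public static long ReuseOneArray(int iterations)
    {
        int[] buffer = new int[1024]; // allocated once, outside the measured region
        return Measure(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                buffer[0] = i; // no allocation inside the loop
            }
        });
    }
}
```

Calling `AllocationProbe.AllocateFreshArrays(1000)` should report at least 4 MB, while `AllocationProbe.ReuseOneArray(1000)` should report close to zero.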

---

## 25.2 `Span<T>` and `Memory<T>`

**`Span<T>`** and **`Memory<T>`** enable high-performance memory manipulation without extra allocations. They provide a type-safe view over contiguous memory, which can be an array, a slice of an array, stack memory, or even unmanaged memory.

### `Span<T>`

`Span<T>` is a **ref struct**: it can only live on the stack, never on the heap. A span is essentially a reference plus a length, which makes it very cheap to create and pass around. You can use it to work with slices of arrays, strings (as `ReadOnlySpan<char>`), or native buffers.

```csharp
int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Span<int> span = numbers.AsSpan();
Span<int> slice = span.Slice(2, 3); // { 3, 4, 5 }
slice[0] = 30; // modifies the original array
```

You can also create a `Span<T>` over stack-allocated memory:

```csharp
Span<byte> buffer = stackalloc byte[256]; // stack allocation – fast!
```

`Span<T>` is perfect for performance‑critical paths because it avoids copying and heap allocations.
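
As an illustration of avoiding copies, the following sketch sums comma-separated integers by slicing a `ReadOnlySpan<char>` instead of calling `Split` or `Substring`. The `SpanParsing` class and `SumCsv` method are hypothetical names; the sketch assumes well-formed input, and an empty field such as `"1,,2"` would make `int.Parse` throw.

```csharp
using System;

public static class SpanParsing
{
    // Sums comma-separated integers such as "10,20,30" without allocating substrings.
    public static int SumCsv(ReadOnlySpan<char> text)
    {
        int sum = 0;
        while (!text.IsEmpty)
        {
            int comma = text.IndexOf(',');

            // Slice out the current field; slicing creates a new view, not a copy.
            ReadOnlySpan<char> field = comma < 0 ? text : text.Slice(0, comma);
            sum += int.Parse(field);

            // Advance past the separator, or finish if this was the last field.
            text = comma < 0 ? ReadOnlySpan<char>.Empty : text.Slice(comma + 1);
        }
        return sum;
    }
}
```

A `string` converts implicitly to `ReadOnlySpan<char>`, so the method can be called as `SpanParsing.SumCsv("10,20,30")`.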

### `Memory<T>`

`Memory<T>` is similar to `Span<T>`, but it can live on the heap. It's useful for asynchronous methods because it doesn't have the stack‑only restriction. You can think of it as a holder for a span that can be stored in a field or used with `async` methods.

```csharp
Memory<int> memory = numbers.AsMemory();
// later, you can get a span from it in a synchronous context:
Span<int> span = memory.Span;
```

### When to Use

- Use `Span<T>` for synchronous, performance‑sensitive operations that work with contiguous memory.
- Use `Memory<T>` when you need to store the reference or use it across `await` boundaries.
- Both are allocation‑free when slicing.
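
The division of labor between the two types can be sketched as follows: an `async` method holds a `Memory<byte>` across `await`, then hands a `Span<byte>` to a synchronous helper for the actual work. The class and method names here are hypothetical, and the 4096-byte buffer size is an arbitrary assumption.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class MemoryAsyncExample
{
    // Counts how many bytes in the stream equal 'target'.
    public static async Task<int> CountByteAsync(Stream stream, byte target)
    {
        Memory<byte> buffer = new byte[4096]; // Memory<T> may be kept across awaits
        int total = 0;
        int read;
        while ((read = await stream.ReadAsync(buffer)) > 0)
        {
            // The span is only used inside a synchronous helper,
            // so it never has to survive an await.
            total += CountInSpan(buffer.Span.Slice(0, read), target);
        }
        return total;
    }

    private static int CountInSpan(ReadOnlySpan<byte> data, byte target)
    {
        int count = 0;
        foreach (byte b in data)
        {
            if (b == target) count++;
        }
        return count;
    }
}
```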

---

## 25.3 `ref struct` and `ref Returns`

### `ref struct`

A `ref struct` is a struct that the compiler guarantees will only ever live on the stack. It cannot be boxed, stored in a field of a class or an ordinary struct, or captured by a lambda, so it never ends up on the heap. `Span<T>` itself is a `ref struct`. You can define your own:

```csharp
public ref struct MyRefStruct
{
    public int Value;
    // The struct itself can never be boxed or stored on the heap
}
```

This is useful for lightweight wrappers that should never escape the stack.
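
A slightly more realistic sketch of such a wrapper is a tokenizer that walks a span of characters. Because it stores a `ReadOnlySpan<char>` in a field, it has to be a `ref struct`. The names `SpanTokenizer`, `TryNext`, and `SpanTokenizerDemo` are hypothetical.

```csharp
using System;

// A stack-only tokenizer that splits a span of characters on a separator.
public ref struct SpanTokenizer
{
    private ReadOnlySpan<char> _remaining; // span fields are only legal in a ref struct
    private readonly char _separator;
    private bool _done;

    public SpanTokenizer(ReadOnlySpan<char> text, char separator)
    {
        _remaining = text;
        _separator = separator;
        _done = false;
    }

    // Produces the next token; returns false once all tokens have been handed out.
    public bool TryNext(out ReadOnlySpan<char> token)
    {
        if (_done)
        {
            token = default;
            return false;
        }

        int index = _remaining.IndexOf(_separator);
        if (index < 0)
        {
            token = _remaining; // last token: everything that is left
            _remaining = default;
            _done = true;
        }
        else
        {
            token = _remaining.Slice(0, index);
            _remaining = _remaining.Slice(index + 1);
        }
        return true;
    }
}

public static class SpanTokenizerDemo
{
    // Counts tokens without allocating any substrings.
    public static int CountTokens(string text, char separator)
    {
        var tokenizer = new SpanTokenizer(text, separator);
        int count = 0;
        while (tokenizer.TryNext(out _)) count++;
        return count;
    }
}
```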

### `ref Returns`

You can return a reference to a variable from a method, allowing the caller to modify the original variable directly. This can avoid copying large structs.

```csharp
public ref int FindMax(int[] numbers)
{
    int maxIndex = 0;
    for (int i = 1; i < numbers.Length; i++)
    {
        if (numbers[i] > numbers[maxIndex])
            maxIndex = i;
    }
    return ref numbers[maxIndex]; // return reference to the element
}

// Usage
int[] arr = { 1, 5, 3, 9, 2 };
ref int max = ref FindMax(arr);
max = 100; // modifies the array element
Console.WriteLine(arr[3]); // 100
```

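`ref returns` are useful in high-performance scenarios where you want to avoid copying and directly manipulate data. Note that the caller must also declare a `ref` local, as in `ref int max = ref FindMax(arr)`; without it, the returned value is copied into an ordinary variable.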
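
The benefit is clearer with a larger struct. The sketch below uses a hypothetical 48-byte `Particle` struct and a hypothetical `ParticleSystem` class to contrast access by reference with access by copy.

```csharp
using System;

public struct Particle
{
    public double X, Y, Z;
    public double VelocityX, VelocityY, VelocityZ; // six doubles = 48 bytes
}

public class ParticleSystem
{
    private readonly Particle[] _particles;

    public ParticleSystem(int count) => _particles = new Particle[count];

    // Returns a reference to the stored element instead of a copy.
    public ref Particle GetByRef(int index) => ref _particles[index];

    // Returns a copy; changes to the result do not affect the array.
    public Particle GetCopy(int index) => _particles[index];
}

public static class RefReturnDemo
{
    public static (double ViaRef, double ViaCopy) Run()
    {
        var system = new ParticleSystem(2);

        ref Particle p = ref system.GetByRef(0); // 'ref' on both sides keeps the alias
        p.X = 1.5;                               // writes into the array

        Particle copy = system.GetByRef(1);      // no 'ref': the value is copied
        copy.X = 9.0;                            // array element 1 is unchanged

        return (system.GetCopy(0).X, system.GetCopy(1).X);
    }
}
```

`RefReturnDemo.Run()` returns `(1.5, 0)`: the write through the `ref` local reached the array, while the write to the copy did not.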

---

## 25.4 `StringBuilder` – Efficient String Concatenation

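Strings are immutable in C#. When you concatenate strings with `+`, a new string is allocated and the existing characters are copied into it each time. In a loop, this makes the total work grow quadratically with the final length and creates heavy GC pressure.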

```csharp
string result = "";
for (int i = 0; i < 1000; i++)
{
    result += i.ToString(); // allocates a new string each iteration!
}
```

**`StringBuilder`** maintains a mutable buffer and appends without creating intermediate strings.

```csharp
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1000; i++)
{
    sb.Append(i);
}
string result = sb.ToString();
```

Always use `StringBuilder` when building strings in a loop or when the number of concatenations is unknown and potentially large.
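
When the final size can be estimated, passing an initial capacity to the constructor spares the builder some internal buffer growth. The sketch below is a minimal example; the `BuildCsv` name is hypothetical and the estimate of about four characters per entry is an assumption.

```csharp
using System;
using System.Text;

public static class StringBuilderExample
{
    // Joins the numbers 0..count-1 with commas, e.g. "0,1,2".
    public static string BuildCsv(int count)
    {
        // Rough capacity guess (assumption: about 4 characters per entry).
        // The builder still grows automatically if the guess is too small.
        var sb = new StringBuilder(capacity: Math.Max(16, count * 4));
        for (int i = 0; i < count; i++)
        {
            if (i > 0) sb.Append(',');
            sb.Append(i); // appends the number directly into the builder
        }
        return sb.ToString();
    }
}
```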

---

## 25.5 Benchmarking with BenchmarkDotNet

To know whether a performance optimization actually helps, you need to **measure**. Guessing leads to wasted effort. **BenchmarkDotNet** is the industry‑standard library for benchmarking .NET code. It runs your code multiple times, warms up, performs statistical analysis, and outputs clear results.

### Setting Up BenchmarkDotNet

Add the NuGet package to a project (usually a separate console app):

```bash
dotnet add package BenchmarkDotNet
```

Create a benchmark class:

```csharp
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // required for the Allocated column in the results
public class StringBenchmarks
{
    private int[] numbers;

    [GlobalSetup]
    public void Setup()
    {
        numbers = Enumerable.Range(0, 1000).ToArray();
    }

    [Benchmark]
    public string StringConcatenation()
    {
        string result = "";
        for (int i = 0; i < numbers.Length; i++)
        {
            result += numbers[i].ToString();
        }
        return result;
    }

    // Named StringBuilderAppend so the method does not share a name with the StringBuilder type.
    [Benchmark]
    public string StringBuilderAppend()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < numbers.Length; i++)
        {
            sb.Append(numbers[i]);
        }
        return sb.ToString();
    }
}

class Program
{
    static void Main()
    {
        BenchmarkRunner.Run<StringBenchmarks>();
    }
}
```

Run the project in **Release** mode (not Debug), for example with `dotnet run -c Release`. BenchmarkDotNet will execute and produce a table like the following (the figures are illustrative placeholders, not measured results):

| Method | Mean | Error | StdDev | Allocated |
|--------|------|-------|--------|-----------|
| StringConcatenation | 1.234 ms | ... | ... | 500 KB |
| StringBuilderAppend | 0.005 ms | ... | ... | 50 KB |

The results clearly show which method is faster and how much memory it allocates.

### Key Attributes

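- `[Benchmark]` – marks a method to benchmark.
- `[GlobalSetup]` – runs once per benchmark method, before its iterations start (for setup).
- `[IterationSetup]` – runs before each iteration.
- `[Params(100, 1000)]` – applied to a public field or property; the benchmarks run once for each listed value.
- `[MemoryDiagnoser]` – applied to the benchmark class; adds allocation data such as the `Allocated` column.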

BenchmarkDotNet is essential for making data‑driven performance decisions.
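
The attributes above can be combined as in the following sketch, which assumes the BenchmarkDotNet package is referenced. The class and method names are hypothetical. `Baseline = true` asks BenchmarkDotNet to report the other methods relative to the marked one.

```csharp
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser] // adds allocation data to the result table
public class ConcatBenchmarks
{
    [Params(100, 1000)] // every benchmark runs once per value
    public int Count { get; set; }

    private int[] _numbers;

    [GlobalSetup]
    public void Setup() => _numbers = Enumerable.Range(0, Count).ToArray();

    [Benchmark(Baseline = true)] // other rows are reported relative to this one
    public string PlusOperator()
    {
        string result = "";
        foreach (int n in _numbers) result += n.ToString();
        return result;
    }

    [Benchmark]
    public string Builder()
    {
        var sb = new StringBuilder();
        foreach (int n in _numbers) sb.Append(n);
        return sb.ToString();
    }
}
```

Run it from a console project with `BenchmarkRunner.Run<ConcatBenchmarks>()` in Release mode, as in the earlier example.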

---

## 25.6 Profiling to Find Bottlenecks

Benchmarking measures isolated code. **Profiling** shows you where your application spends time and memory in a real run. Visual Studio includes built-in profiling tools (the Diagnostic Tools window and the Performance Profiler). There are also standalone tools such as JetBrains dotTrace and dotMemory, and Microsoft's PerfView.

Common profiling steps:

1. **CPU Profiling** – find methods that consume the most CPU time.
2. **Memory Profiling** – see allocations, garbage collection frequency, and memory leaks.
3. **Database Profiling** – if using EF Core, log SQL queries to find slow ones.

Always profile before optimizing. Fix the biggest bottlenecks first (the 80/20 rule).
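
When a full profiler is not at hand, a rough first look at a suspicious region can be taken with `Stopwatch` and `GC.CollectionCount`. The sketch below is a minimal, hypothetical helper (`RegionTimer` is not a library type). It is no substitute for a profiler, and the collection count covers the whole process, not only the measured code.

```csharp
using System;
using System.Diagnostics;

public static class RegionTimer
{
    // Runs 'region' once and reports wall-clock time plus the number of
    // gen-0 collections that happened in the process during the run.
    public static (TimeSpan Elapsed, int Gen0Collections) Measure(Action region)
    {
        int gen0Before = GC.CollectionCount(0);
        var stopwatch = Stopwatch.StartNew();

        region();

        stopwatch.Stop();
        return (stopwatch.Elapsed, GC.CollectionCount(0) - gen0Before);
    }
}
```

A call such as `RegionTimer.Measure(() => ProcessOrders())`, where `ProcessOrders` stands for whatever code is under suspicion, gives a quick indication of whether the region is slow, allocation-heavy, or both.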

---

## 25.7 Additional Performance Techniques

### Object Pooling

Creating and destroying many objects (e.g., buffers, connections) can be expensive. An object pool reuses instances. .NET provides `ArrayPool<T>` (in `System.Buffers`) for arrays and `ObjectPool<T>` in the `Microsoft.Extensions.ObjectPool` package.

```csharp
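// Requires: using System.Buffers;
// Rent may return an array larger than requested, and its contents are not cleared.
byte[] buffer = ArrayPool<byte>.Shared.Rent(1024);
try
{
    // use only the first 1024 bytes of buffer
}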
finally
{
    ArrayPool<byte>.Shared.Return(buffer);
}
```
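
A fuller sketch of the same pattern reads a stream through a rented buffer. The `PooledBufferExample` class and `SumBytes` method are hypothetical, and the 4096-byte request is an arbitrary choice.

```csharp
using System;
using System.Buffers;
using System.IO;

public static class PooledBufferExample
{
    // Sums every byte in the stream using a rented buffer instead of a new array per call.
    public static long SumBytes(Stream stream)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            long sum = 0;
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first 'read' bytes are valid data; the rest of the
                // rented array may hold leftovers from a previous user.
                foreach (byte b in buffer.AsSpan(0, read))
                {
                    sum += b;
                }
            }
            return sum;
        }
        finally
        {
            // Always return the buffer, even if reading throws.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```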

### Avoiding LINQ Overhead in Hot Paths

LINQ is expressive but can allocate enumerators, delegates, and closures. In performance-critical loops, prefer `for` or `foreach` with direct access.

```csharp
// Assumes 'data' is an int[] and that System.Linq is imported.

// LINQ version (allocates an iterator)
int linqSum = data.Where(x => x > 0).Sum();

// Manual loop (no allocation)
int loopSum = 0;
for (int i = 0; i < data.Length; i++)
{
    if (data[i] > 0) loopSum += data[i];
}
```

### Structs vs. Classes

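Structs are value types: they are stored inline wherever they are declared, whether that is on the stack for locals, inside the containing object for fields, or directly in an array's storage. This can reduce GC pressure, but copying large structs is expensive. Use structs for small, immutable data (e.g., `Point`, `Color`). Measure to decide.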
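
The difference in GC pressure can be sketched by filling an array with a small struct and then with an equivalent class. All type and method names below are hypothetical, and the exact byte counts depend on the runtime and platform.

```csharp
using System;

public readonly struct PointStruct
{
    public PointStruct(int x, int y) { X = x; Y = y; }
    public int X { get; }
    public int Y { get; }
}

public sealed class PointClass
{
    public PointClass(int x, int y) { X = x; Y = y; }
    public int X { get; }
    public int Y { get; }
}

public static class StructVsClass
{
    // One allocation: the array stores every struct value inline.
    public static long BytesForStructArray(int count)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        var points = new PointStruct[count];
        for (int i = 0; i < count; i++) points[i] = new PointStruct(i, i);
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(points);
        return after - before;
    }

    // count + 1 allocations: the array of references plus one object per element.
    public static long BytesForClassArray(int count)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        var points = new PointClass[count];
        for (int i = 0; i < count; i++) points[i] = new PointClass(i, i);
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(points);
        return after - before;
    }
}
```

For the same `count`, `BytesForClassArray` should report noticeably more than `BytesForStructArray`, because every class instance carries its own object header and needs a separate allocation.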

### Using `unsafe` Code and Pointers

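For extreme performance, C# allows `unsafe` code with pointers. This bypasses type safety and bounds checks, and it requires enabling `AllowUnsafeBlocks` in the project. Use it only when absolutely necessary and after proving the benefit with benchmarks; `Span<T>` often delivers similar speed without giving up safety.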

---

## 25.8 Putting It All Together: Optimizing a String Processing Method

Let's take a method that extracts numbers from a string and see how we can optimize it step by step.

Original (naive):

```csharp
public List<int> ExtractNumbers(string input)
{
    var result = new List<int>();
    foreach (char c in input)
    {
        if (char.IsDigit(c))
        {
            result.Add(int.Parse(c.ToString()));
        }
    }
    return result;
}
```

Problems:
- `c.ToString()` allocates a new string each time.
- `int.Parse` on a single char is overkill.
- The `List<int>` starts small and reallocates its backing array as it grows.

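Optimized version using `Span<T>` and avoiding intermediate allocations:

```csharp
// Requires: using System; using System.Buffers;
public int[] ExtractNumbersFast(string input)
{
    // Collect the digits in a scratch buffer: on the stack for short inputs,
    // rented from the pool for long ones (a large stackalloc can overflow the stack).
    const int StackLimit = 256;
    char[] rented = null;
    Span<char> digits = input.Length <= StackLimit
        ? stackalloc char[StackLimit]
        : (rented = ArrayPool<char>.Shared.Rent(input.Length));
    try
    {
        int count = 0;
        foreach (char c in input)
        {
            if (c >= '0' && c <= '9')
            {
                digits[count++] = c;
            }
        }

        // The digit count is now known, so the result array is sized exactly.
        int[] result = new int[count];
        for (int i = 0; i < count; i++)
        {
            result[i] = digits[i] - '0'; // convert char to int without parsing
        }
        return result;
    }
    finally
    {
        if (rented != null) ArrayPool<char>.Shared.Return(rented);
    }
}
```

This version:
- Uses a stack-allocated scratch buffer for short inputs and a pooled array for long ones, so the only new heap allocation is the result array.
- Checks for ASCII digits directly and uses character arithmetic instead of parsing.
- Returns an exact-size array. To avoid even that allocation, the method could instead write into a caller-supplied `Span<int>`.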
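
One way to avoid allocating the result array at all is to let the caller supply the destination buffer. The following is a minimal sketch under that assumption. `DigitExtractor`, `ExtractDigits`, and `SumDigits` are hypothetical names, and the 16-element stack buffer in `SumDigits` is an arbitrary limit chosen for the example.

```csharp
using System;

public static class DigitExtractor
{
    // Writes each ASCII digit found in 'input' into 'destination' as an int and
    // returns how many were written. Stops early if 'destination' is full.
    public static int ExtractDigits(ReadOnlySpan<char> input, Span<int> destination)
    {
        int count = 0;
        foreach (char c in input)
        {
            if (c >= '0' && c <= '9')
            {
                if (count == destination.Length) break;
                destination[count++] = c - '0';
            }
        }
        return count;
    }

    // Example caller: sums up to the first 16 digits without any heap allocation.
    public static int SumDigits(string input)
    {
        Span<int> buffer = stackalloc int[16]; // small, fixed-size stack buffer
        int written = ExtractDigits(input, buffer);

        int sum = 0;
        foreach (int digit in buffer.Slice(0, written)) sum += digit;
        return sum;
    }
}
```

The trade-off is that the caller now owns the sizing decision: a destination that is too small silently truncates the output in this sketch, so a production version would need to report or handle that case.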

Benchmark with BenchmarkDotNet to confirm improvement.

---

## 25.9 Best Practices

1. **Measure Before Optimizing** – use BenchmarkDotNet and profilers to find real bottlenecks.
2. **Minimize Allocations** – allocations lead to GC pressure. Reuse buffers, use `Span<T>`, and prefer structs for small data.
3. **Use `StringBuilder` for Dynamic Concatenation** – never use `+` in a loop.
4. **Prefer `for` over LINQ in Hot Paths** – LINQ adds overhead.
5. **Use `ArrayPool` for Temporary Buffers** – avoid repeated allocations.
6. **Consider Using `Span<T>` and `Memory<T>`** for modern, allocation‑free memory manipulation.
7. **Avoid Locking in Hot Paths** – use concurrent collections or lock‑free techniques.
8. **Make Decisions Based on Data** – don't guess; benchmark.
9. **Write Clear Code First, Then Optimize** – readability matters. Only optimize where necessary.
10. **Stay Updated** – each .NET version brings performance improvements.
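
As a small illustration of item 7, the sketch below increments a shared counter from many threads, once with a `lock` and once with `Interlocked.Increment`. Both produce the same result; which one is faster in a given workload is something to benchmark, not assume. The class and method names are hypothetical.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterExample
{
    // Every increment takes the same lock, so threads queue up behind each other.
    public static long CountWithLock(int iterations)
    {
        long total = 0;
        object gate = new object();
        Parallel.For(0, iterations, _ =>
        {
            lock (gate) { total++; }
        });
        return total;
    }

    // Interlocked performs the increment atomically without taking a lock.
    public static long CountWithInterlocked(int iterations)
    {
        long total = 0;
        Parallel.For(0, iterations, _ => Interlocked.Increment(ref total));
        return total;
    }
}
```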

---

## 25.10 Chapter Summary

In this chapter, you've learned how to design for performance in C#:

- **`Span<T>`** and **`Memory<T>`** enable allocation‑free memory manipulation.
- **`ref struct`** ensures stack‑only allocation, avoiding heap overhead.
- **`ref returns`** let you avoid copying large structs.
- **`StringBuilder`** is essential for efficient string concatenation.
- **BenchmarkDotNet** provides accurate performance measurements.
- **Profiling** helps identify real bottlenecks.
- Additional techniques like object pooling and avoiding LINQ overhead can further improve performance.

Performance is a continuous journey. With these tools and techniques, you can write C# code that is both correct and fast.

In the next chapter, **Key Design Patterns in C#**, we'll explore common design patterns that solve recurring problems in software architecture. You'll learn how to implement patterns like Singleton, Factory, Strategy, and Observer in C#, and understand when to use them.

**Exercises:**

1. Write a method that reverses a string's characters into a caller-supplied `Span<char>` without any heap allocation (use `stackalloc` for the destination when calling it). Benchmark it against a naive approach that builds a `new string`.
2. Use BenchmarkDotNet to compare `StringBuilder` with regular concatenation for building a string of 10,000 numbers. Observe the allocation differences.
3. Profile a simple application that reads a large file and processes its lines. Identify the hottest method and try to optimize it.
4. Implement a simple object pool for `byte[]` buffers using `ConcurrentBag`. Measure the performance improvement over allocating new arrays each time.
5. Experiment with `ref returns`: create a method that returns a reference to the largest element in an array, and modify it through the reference.

Now, get ready to master design patterns in Chapter 26!

