# Chapter 24: Concurrency and Parallelism

Modern applications often need to perform multiple operations simultaneously to improve responsiveness and throughput. Whether it's processing large datasets, handling multiple client requests, or keeping a UI responsive while performing background work, understanding concurrency and parallelism is essential.

In this chapter, you'll learn:

- The difference between **concurrency** and **parallelism**.
- The **threading model** in .NET and how to work with threads.
- The **Task Parallel Library (TPL)** – `Parallel` class, PLINQ, and `Task` for data and task parallelism.
- **Concurrent collections** (`ConcurrentDictionary`, `ConcurrentQueue`, etc.) for safe access from multiple threads.
- How to avoid common pitfalls: **race conditions**, **deadlocks**, and **thread safety** issues.
- When to use `lock`, `Interlocked`, and other synchronization primitives.
- A practical example: parallel image processing.
- Best practices for concurrent programming.

By the end, you'll be able to write efficient, safe multi‑threaded code that scales with your hardware.

---

## 24.1 Concurrency vs. Parallelism

These terms are often confused:

- **Concurrency** is about dealing with many things at once. It's the composition of independently executing tasks. Concurrency can be achieved on a single‑core machine by time‑slicing (tasks take turns). The goal is often responsiveness (e.g., a UI thread remains responsive while background work is done).

- **Parallelism** is about doing many things at once. It requires multiple cores or processors. The goal is performance – completing a large task faster by splitting it into smaller chunks that run simultaneously.

In .NET, you can achieve concurrency with asynchronous programming (`async`/`await`) and parallelism with the Task Parallel Library (TPL). This chapter focuses on parallelism.

---

## 24.2 The .NET Threading Model

A **thread** is the smallest unit of execution within a process. The operating system schedules threads on available cores.

### Creating Threads Manually (Legacy)

You can create a thread directly using the `Thread` class:

```csharp
Thread thread = new Thread(() =>
{
    Console.WriteLine("Running on a separate thread");
});
thread.Start();
```

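However, manual thread management is cumbersome, and creating a dedicated thread for every small piece of work is expensive because each thread reserves its own stack and takes time to start. The `ThreadPool` avoids that cost by reusing a pool of worker threads, but the Task Parallel Library is the recommended approach.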

### Thread Pool

The thread pool manages a set of threads that can be reused for multiple tasks. You can queue work items:

```csharp
ThreadPool.QueueUserWorkItem(state =>
{
    Console.WriteLine("Work from thread pool");
});
```

But the TPL (`Task`) provides a richer, more convenient abstraction over the thread pool.
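As a rough illustration of what the richer abstraction buys you, the sketch below queues work with `Task.Run`, gets a typed result back, and waits for completion, none of which `QueueUserWorkItem` offers directly. `SumTo` is a hypothetical helper introduced only for this example.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical CPU-bound helper, used only for illustration.
static long SumTo(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
}

// Task.Run queues each delegate to the thread pool and returns a Task<long>.
Task<long> first = Task.Run(() => SumTo(1_000));
Task<long> second = Task.Run(() => SumTo(2_000));

// Block until both tasks finish; an exception thrown inside either delegate
// surfaces here as an AggregateException instead of being silently lost.
Task.WaitAll(first, second);

Console.WriteLine($"{first.Result}, {second.Result}"); // prints "500500, 2001000"
```

In asynchronous code you would typically `await Task.WhenAll(first, second)` instead of blocking with `Task.WaitAll`.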

---

## 24.3 The Task Parallel Library (TPL)

The TPL is a set of public types and APIs in the `System.Threading.Tasks` namespace. Its purpose is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications.

### `Parallel` Class

The `Parallel` class provides static methods for parallel loops and regions.

#### `Parallel.For`

```csharp
int[] data = Enumerable.Range(0, 100).ToArray();
Parallel.For(0, data.Length, i =>
{
    data[i] = data[i] * data[i]; // stand-in for CPU‑intensive work
});
```

This loop partitions the range and executes iterations in parallel. The degree of parallelism is automatically managed.
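If the automatic choice is not what you want (for example, to leave cores free for other work), you can cap it with `ParallelOptions.MaxDegreeOfParallelism`. The following is a minimal sketch; the limit of two workers is an arbitrary value chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

int[] data = Enumerable.Range(0, 100).ToArray();

// Cap the loop at two concurrent workers; the default (-1) lets the runtime decide.
var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

Parallel.For(0, data.Length, options, i =>
{
    // Each iteration writes only to its own slot, so no synchronization is needed.
    data[i] = data[i] * data[i];
});

Console.WriteLine(data[9]); // prints 81
```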

#### `Parallel.ForEach`

```csharp
var numbers = Enumerable.Range(0, 100).ToList();
Parallel.ForEach(numbers, n =>
{
    Console.WriteLine($"Processing {n} on thread {Thread.CurrentThread.ManagedThreadId}");
});
```

#### `Parallel.Invoke`

Executes multiple actions in parallel:

```csharp
Parallel.Invoke(
    () => DoWork1(),
    () => DoWork2(),
    () => DoWork3()
);
```

**Important:** The work inside parallel loops should be **CPU‑bound** (not I/O bound) and independent (no shared state modifications without synchronization). If you need to combine results, use thread‑safe collections or locking.

### PLINQ (Parallel LINQ)

PLINQ is a parallel implementation of LINQ to Objects. You can turn an in‑memory LINQ query into a parallel query by calling `AsParallel()`.

```csharp
var numbers = Enumerable.Range(1, 10000000);
var evenSquares = numbers.AsParallel()
                         .Where(n => n % 2 == 0)
                         .Select(n => (long)n * n) // cast to long: n * n overflows int for large n
                         .ToArray();
```

PLINQ automatically partitions the data and executes the query in parallel. You can control the degree of parallelism with `WithDegreeOfParallelism()`.

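PLINQ is great for data‑parallel operations on collections. By default it does not guarantee that results come back in source order; `AsOrdered()` restores ordering, but use it only when needed, as the extra buffering and sorting can reduce the benefit of parallelism.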
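The two knobs mentioned above can be combined in one query. This is a small sketch rather than a tuning recommendation; the degree of parallelism of four is an arbitrary assumption.

```csharp
using System;
using System.Linq;

int[] firstFive = Enumerable.Range(1, 1_000)
    .AsParallel()
    .AsOrdered()                   // must directly follow AsParallel(); keeps source order
    .WithDegreeOfParallelism(4)    // use at most four concurrent tasks
    .Where(n => n % 2 == 0)
    .Select(n => n * n)
    .Take(5)                       // with ordering, these are the first five even squares
    .ToArray();

Console.WriteLine(string.Join(", ", firstFive)); // prints "4, 16, 36, 64, 100"
```

Without `AsOrdered()`, `Take(5)` would still return five even squares, but not necessarily the first five.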

---

## 24.4 Concurrent Collections

When multiple threads access a shared collection, you must ensure thread safety. The standard collections (`List<T>`, `Dictionary<TKey, TValue>`) are **not** thread‑safe. If one thread modifies a collection while another enumerates it, exceptions or data corruption can occur.

The `System.Collections.Concurrent` namespace provides thread‑safe collections designed for concurrent access:

| Collection | Description |
|------------|-------------|
| `ConcurrentDictionary<TKey, TValue>` | Thread‑safe dictionary, optimized for concurrent reads and writes. |
| `ConcurrentQueue<T>` | Thread‑safe FIFO queue. |
| `ConcurrentStack<T>` | Thread‑safe LIFO stack. |
| `ConcurrentBag<T>` | Unordered collection for scenarios where order doesn't matter. |
| `BlockingCollection<T>` | Bounded or unbounded collection that blocks producers/consumers. |

### Example: Using `ConcurrentDictionary`

```csharp
var dict = new ConcurrentDictionary<string, int>();

// Add or update safely
dict.AddOrUpdate("key", 1, (k, old) => old + 1);

// Try get
if (dict.TryGetValue("key", out int value))
{
    Console.WriteLine(value);
}
```
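To see `AddOrUpdate` under actual contention, here is a sketch that counts words from several threads at once. The word list is made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

string[] words = { "apple", "pear", "apple", "fig", "pear", "apple" };
var counts = new ConcurrentDictionary<string, int>();

Parallel.ForEach(words, word =>
{
    // Inserts 1 for a new key; otherwise applies the update delegate.
    // The delegate may run more than once under contention, so keep it free of side effects.
    counts.AddOrUpdate(word, 1, (_, current) => current + 1);
});

Console.WriteLine($"apple={counts["apple"]}, pear={counts["pear"]}, fig={counts["fig"]}");
// prints "apple=3, pear=2, fig=1"
```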

### Example: Producer‑Consumer with `BlockingCollection`

```csharp
var queue = new BlockingCollection<int>(boundedCapacity: 5);

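// Producer task
Task producer = Task.Run(() =>
{
    for (int i = 0; i < 10; i++)
    {
        queue.Add(i); // blocks while the queue already holds 5 items
        Console.WriteLine($"Produced {i}");
    }
    queue.CompleteAdding();
});

// Consumer task
Task consumer = Task.Run(() =>
{
    foreach (var item in queue.GetConsumingEnumerable())
    {
        Console.WriteLine($"Consumed {item}");
        Thread.Sleep(100); // simulate work
    }
});

// Wait for both tasks so the process does not exit while work is still queued
Task.WaitAll(producer, consumer);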
```

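`GetConsumingEnumerable` blocks while the queue is empty and ends the loop once `CompleteAdding` has been called and the remaining items have been consumed. By default, `BlockingCollection<T>` stores its items in a `ConcurrentQueue<T>`, which gives it FIFO ordering.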

---

## 24.5 Synchronization Primitives

Even with concurrent collections, you may need to coordinate access to shared resources. The `lock` statement is the simplest synchronization tool.

### `lock`

```csharp
private readonly object _lock = new object();
private int _counter;

public void Increment()
{
    lock (_lock)
    {
        _counter++;
    }
}
```

`lock` ensures that only one thread can execute the enclosed code at a time. Always lock on a private, readonly object; never lock on `this` or a public field.

### `Interlocked`

For simple atomic operations on integers, `Interlocked` provides faster, lock‑free methods:

```csharp
Interlocked.Increment(ref _counter);
Interlocked.Add(ref _counter, 5);
Interlocked.Exchange(ref _counter, 10);
Interlocked.CompareExchange(ref _counter, 100, 50); // if _counter == 50, set to 100
```
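`CompareExchange` is most useful inside a retry loop, which lets you build atomic operations that `Interlocked` does not provide directly. The sketch below tracks a running maximum without a lock; the `candidate` formula is a made‑up stand‑in for a real per‑item measurement.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int max = 0;

Parallel.For(0, 1_000, i =>
{
    int candidate = (i * 37) % 1_000; // hypothetical measurement; covers 0..999 exactly once
    int observed;
    do
    {
        observed = Volatile.Read(ref max);
        if (candidate <= observed) return; // current max is already at least as large
        // Store candidate only if max still equals the value we just read;
        // otherwise another thread changed it and we retry.
    } while (Interlocked.CompareExchange(ref max, candidate, observed) != observed);
});

Console.WriteLine(max); // prints 999
```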

### `Monitor`, `Mutex`, `Semaphore`, `ReaderWriterLockSlim`

These offer more advanced synchronization scenarios. For example:

- `ReaderWriterLockSlim` allows multiple concurrent readers but exclusive writers.
- `Semaphore` limits the number of threads accessing a resource.
- `Mutex` can be named, which lets it synchronize threads across processes.

Use them when `lock` or `Interlocked` is insufficient.
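As one example of these primitives, the following sketch guards a plain `Dictionary` with `ReaderWriterLockSlim`, so that reads can overlap while writes are exclusive. The `Read` and `Write` helpers and the cache contents are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var cache = new Dictionary<int, string>();
using var cacheLock = new ReaderWriterLockSlim();

void Write(int key, string value)
{
    cacheLock.EnterWriteLock();           // exclusive: blocks readers and other writers
    try { cache[key] = value; }
    finally { cacheLock.ExitWriteLock(); }
}

string Read(int key)
{
    cacheLock.EnterReadLock();            // shared: many readers may hold this at once
    try { return cache.TryGetValue(key, out var value) ? value : "(missing)"; }
    finally { cacheLock.ExitReadLock(); }
}

Write(1, "one");

Parallel.For(0, 100, i =>
{
    if (i % 10 == 0) Write(i, $"item {i}"); // occasional writes
    else Read(1);                            // frequent reads
});

Console.WriteLine(cache.Count); // prints 11 (key 1 plus keys 0, 10, ..., 90)
```

This pattern tends to pay off only when reads heavily outnumber writes; for a simple key/value cache, `ConcurrentDictionary` is usually the easier choice.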

---

## 24.6 Common Pitfalls: Race Conditions and Deadlocks

### Race Condition

A **race condition** occurs when the outcome depends on the non‑deterministic timing of threads. Example:

```csharp
int counter = 0;
Task.Run(() => counter++);
Task.Run(() => counter++);
```

The final value could be 1 or 2 because `counter++` is not atomic. Always synchronize access or use `Interlocked`.
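With only two increments the lost update is rare, so a larger loop makes the effect easier to observe. This sketch runs an unsynchronized counter and an `Interlocked` counter side by side; the iteration count is arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int unsafeCounter = 0;
int safeCounter = 0;

Parallel.For(0, 100_000, _ =>
{
    unsafeCounter++;                         // read-modify-write: concurrent updates can be lost
    Interlocked.Increment(ref safeCounter);  // atomic: every increment is counted
});

// safeCounter is always 100000; unsafeCounter is often lower and varies between runs.
Console.WriteLine($"unsafe: {unsafeCounter}, safe: {safeCounter}");
```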

### Deadlock

A **deadlock** occurs when two or more threads are waiting for each other to release resources, and none can proceed. Classic example:

```csharp
object lock1 = new object();
object lock2 = new object();

Task.Run(() =>
{
    lock (lock1)
    {
        Thread.Sleep(100); // simulate work
        lock (lock2) { } // waits for lock2
    }
});

Task.Run(() =>
{
    lock (lock2)
    {
        Thread.Sleep(100);
        lock (lock1) { } // waits for lock1
    }
});
```

Each task holds one lock while waiting for the other, so neither can proceed and both wait forever. To avoid deadlocks:

- Acquire locks in a consistent order.
- Use `Monitor.TryEnter` with a timeout.
- Avoid holding locks while calling external code.
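The first two guidelines can be combined in a small helper. This is a sketch under assumptions: `TryWithBothLocks` is a hypothetical name, and the one‑second timeout is arbitrary. Both tasks take `lock1` before `lock2`, and each acquisition gives up after the timeout instead of waiting indefinitely.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

object lock1 = new object();
object lock2 = new object();
int shared = 0;

// Hypothetical helper: runs the action while holding both locks,
// or returns false if either lock cannot be taken within the timeout.
bool TryWithBothLocks(Action action, TimeSpan timeout)
{
    bool gotFirst = false, gotSecond = false;
    try
    {
        Monitor.TryEnter(lock1, timeout, ref gotFirst);   // always lock1 first...
        if (!gotFirst) return false;
        Monitor.TryEnter(lock2, timeout, ref gotSecond);  // ...then lock2
        if (!gotSecond) return false;
        action();
        return true;
    }
    finally
    {
        // Release in reverse order, and only what was actually acquired.
        if (gotSecond) Monitor.Exit(lock2);
        if (gotFirst) Monitor.Exit(lock1);
    }
}

Task<bool> t1 = Task.Run(() => TryWithBothLocks(() => shared++, TimeSpan.FromSeconds(1)));
Task<bool> t2 = Task.Run(() => TryWithBothLocks(() => shared++, TimeSpan.FromSeconds(1)));
Task.WaitAll(t1, t2);

Console.WriteLine($"{t1.Result}, {t2.Result}, shared = {shared}");
```

When a call returns `false`, the caller decides whether to retry, back off, or report the failure; the important point is that the thread is no longer stuck.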

---

## 24.7 Putting It All Together: Parallel Image Processing

Let's simulate processing a batch of images. We'll use `Parallel.ForEach` to process multiple images concurrently, with a thread‑safe collection to store results.

```csharp
using System.Collections.Concurrent;

public class ImageProcessor
{
    public void ProcessImages(string[] imagePaths)
    {
        var results = new ConcurrentBag<string>();

        Parallel.ForEach(imagePaths, path =>
        {
            // Simulate CPU‑intensive image processing
            string result = ProcessImage(path);
            results.Add(result);
            Console.WriteLine($"Processed {path} on thread {Thread.CurrentThread.ManagedThreadId}");
        });

        Console.WriteLine($"Total processed: {results.Count}");
    }

    private string ProcessImage(string path)
    {
        // Simulate work
        Thread.Sleep(100);
        return $"Processed {path}";
    }
}
```

If we needed to collect metrics, we could use `Interlocked`:

```csharp
int _processedCount = 0;

Parallel.ForEach(imagePaths, path =>
{
    ProcessImage(path);
    Interlocked.Increment(ref _processedCount);
});
```

For more advanced scenarios, consider using `Parallel.For` with thread‑local data to reduce contention.
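The thread‑local overload of `Parallel.For` takes three delegates: one that creates a per‑task accumulator, the loop body, and one that merges each accumulator into the shared result. The sketch below sums an array this way, so the shared `total` is touched once per task rather than once per element. The input data is invented for the example.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int[] values = Enumerable.Range(1, 1_000).ToArray();
long total = 0;

Parallel.For(0, values.Length,
    () => 0L,                                             // localInit: each task starts at zero
    (i, loopState, subtotal) => subtotal + values[i],     // body: updates only the local subtotal
    subtotal => Interlocked.Add(ref total, subtotal));    // localFinally: one atomic merge per task

Console.WriteLine(total); // prints 500500
```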

---

## 24.8 Best Practices for Concurrent Programming

### 1. Prefer Higher‑Level Abstractions

Use `Parallel` loops, PLINQ, and `Task` instead of manual thread management. They are optimized and handle partitioning, load balancing, and cancellation.

### 2. Keep Work Independent

Ensure tasks in parallel loops do not interfere. If they must share data, use concurrent collections or synchronization.

### 3. Avoid Shared Mutable State

Shared mutable state is the root of many threading bugs. Try to design your algorithms to avoid it – e.g., by using immutable data or functional transformations.

### 4. Use `lock` Sparingly and Correctly

Locking reduces parallelism. Keep critical sections short. Always lock on a private object, never on a publicly accessible object or a string.

### 5. Be Aware of Thread Safety

Know which types are thread‑safe. For instance, a `Random` instance is not thread‑safe; give each thread its own via `ThreadLocal<Random>`, or on .NET 6 and later use the thread‑safe `Random.Shared`.
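A minimal sketch of the per‑thread approach follows. It assumes a modern .NET runtime, where each parameterless `new Random()` gets its own seed; the die‑roll check is only there to give the loop something observable to do.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Each thread lazily creates its own Random the first time it reads Value.
using var rng = new ThreadLocal<Random>(() => new Random());
int outOfRange = 0;

Parallel.For(0, 10_000, _ =>
{
    int roll = rng.Value.Next(1, 7); // die roll: lower bound inclusive, upper bound exclusive
    if (roll < 1 || roll > 6) Interlocked.Increment(ref outOfRange);
});

Console.WriteLine(outOfRange); // prints 0
```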

### 6. Handle Cancellation

Support cancellation in parallel operations by passing a `CancellationToken`.

```csharp
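using var cts = new CancellationTokenSource();
var options = new ParallelOptions { CancellationToken = cts.Token };
try
{
    // The loop stops starting new iterations once cts.Cancel() is called.
    Parallel.For(0, 100, options, i => { /* process item i */ });
}
catch (OperationCanceledException)
{
    // The loop was cancelled before all iterations ran.
}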
```

### 7. Don't Assume Parallelism Always Improves Performance

Parallelism adds overhead. For small datasets or trivial work, a sequential loop may be faster. Measure with `Stopwatch`.
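A quick way to check is to time both versions on your own data. The sketch below deliberately uses a tiny workload, where the parallel version frequently loses; the actual timings depend on your machine, so no expected numbers are shown.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

int[] input = Enumerable.Range(0, 1_000).ToArray();
int[] sequential = new int[input.Length];
int[] parallel = new int[input.Length];

var sw = Stopwatch.StartNew();
for (int i = 0; i < input.Length; i++) sequential[i] = input[i] * 2;
sw.Stop();
TimeSpan sequentialTime = sw.Elapsed;

sw.Restart();
Parallel.For(0, input.Length, i => parallel[i] = input[i] * 2);
sw.Stop();
TimeSpan parallelTime = sw.Elapsed;

Console.WriteLine($"Sequential: {sequentialTime.TotalMilliseconds:F3} ms");
Console.WriteLine($"Parallel:   {parallelTime.TotalMilliseconds:F3} ms");
```

A single run like this is noisy (JIT compilation and thread pool warm‑up both affect the first measurement), so repeat the measurement several times before drawing conclusions.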

### 8. Watch Out for Thread Pool Starvation

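Thread pool threads are a shared, limited resource. If many of them are tied up by long‑running or blocking work started with `Task.Run`, other work items wait in the queue. For I/O‑bound work, prefer `async`/`await`, which frees the thread while waiting. For CPU‑bound work, `Parallel` or `Task.Run` is appropriate; for a single very long‑running job, consider `TaskCreationOptions.LongRunning`, which hints that the work should not occupy a pool thread.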

### 9. Test Thoroughly Under Load

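Concurrency bugs are often hard to reproduce. Stress‑test with many concurrent operations (for example, a large `Parallel.For` or many `Task.Run` calls), and temporarily insert delays such as `Thread.Sleep` around suspected critical sections to widen the timing window and expose race conditions.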

### 10. Consider Using Immutable Data Structures

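Immutable objects are inherently thread‑safe and can be shared without locks. C# records with positional or `init`‑only properties are a convenient way to define them, though a record is only as immutable as the members it holds: a record with a `List<T>` property can still be mutated through that list.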

---

## 24.9 Chapter Summary

In this chapter, you've learned how to write parallel and concurrent code in C#:

- **Concurrency** is about managing multiple tasks, while **parallelism** is about executing tasks simultaneously on multiple cores.
- The **Task Parallel Library (TPL)** provides high‑level abstractions like `Parallel.For` and `Parallel.ForEach`.
- **PLINQ** offers a declarative way to parallelize LINQ queries.
- **Concurrent collections** (`ConcurrentDictionary`, `ConcurrentQueue`, etc.) allow safe access from multiple threads.
- **Synchronization primitives** (`lock`, `Interlocked`) protect shared data.
- You must avoid **race conditions** and **deadlocks** by careful design.
- A practical example demonstrated parallel image processing.
- Best practices guide you to write efficient, safe concurrent code.

With these tools, you can leverage multi‑core processors to boost performance and build responsive applications.

In the next chapter, **Designing for Performance**, we'll explore techniques for writing high‑performance C# code, including `Span<T>`, `ref struct`, benchmarking, and profiling. You'll learn how to identify bottlenecks and optimize memory usage.

**Exercises:**

1. Write a program that uses `Parallel.For` to compute the sum of squares of a large array. Compare the time with a sequential loop.
2. Implement a producer‑consumer pattern using `BlockingCollection<T>`. Have one producer generate numbers and two consumers process them (e.g., compute prime factors).
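3. Create a simple web crawler that downloads multiple pages concurrently with `HttpClient`. Because downloads are I/O‑bound, compare `Parallel.ForEach` with an asynchronous approach such as `Parallel.ForEachAsync` or `Task.WhenAll`. Be mindful of thread safety.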
4. Simulate a bank transfer system where multiple threads transfer money between accounts. Use `lock` to prevent race conditions and ensure consistency.
5. Use PLINQ to find all prime numbers up to 10 million. Compare performance with and without `AsParallel()`.

Now, prepare to dive into performance optimization in Chapter 25!