| Concept                | What to Use                                    |
| ---------------------- | ---------------------------------------------- |
| Mid formula            | `mid = start + (end - start) // 2`             |
| Base condition         | `if start > end: return`                       |
| When to stop splitting | When segment is invalid (start > end)          |
| Minimum segment        | When `start == end` → one element, still valid |
| Left half              | `your_function(arr, start, mid)`               |
| Right half             | `your_function(arr, mid + 1, end)`             |

In [6]:
max_num = float('-inf')
min_num = float('inf')

In [1]:
# Example combine functions
def combine_min(a, b):
    return min(a, b)

def combine_max(a, b):
    return max(a, b)

def divide_and_conquer(arr, start, end, combine, empty=None):
    # Base condition (very important): the segment is invalid
    if start > end:
        return empty  # pass something meaningful, e.g. float('inf') when finding a minimum

    # Single element: nothing left to split
    if start == end:
        return arr[start]

    # Calculate mid safely
    mid = start + (end - start) // 2

    # Recursively process left and right halves
    left_result = divide_and_conquer(arr, start, mid, combine, empty)
    right_result = divide_and_conquer(arr, mid + 1, end, combine, empty)

    # Combine results; `combine` depends on your use case (e.g. combine_min or combine_max)
    result = combine(left_result, right_result)

    return result


In [2]:
def find_min(arr, start, end):
    if start > end:
        return float('inf')

    if start == end:
        return arr[start]

    mid = start + (end - start) // 2

    left_min = find_min(arr, start, mid)
    right_min = find_min(arr, mid + 1, end)

    return min(left_min, right_min)


In [3]:
def find_max(arr, start, end):
    if start > end:
        return float('-inf')

    if start == end:
        return arr[start]

    mid = start + (end - start) // 2

    left_max = find_max(arr, start, mid)
    right_max = find_max(arr, mid + 1, end)

    return max(left_max, right_max)


In [4]:
def binary_search(arr, start, end, target):
    if start > end:
        return -1

    mid = start + (end - start) // 2

    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search(arr, mid + 1, end, target)
    else:
        return binary_search(arr, start, mid - 1, target)


🔢 Total Comparisons: Naïve vs. Divide & Conquer
🔴 Naïve Linear Approach:
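Traverse the array once, starting with the first element as both the current min and the current max:

- Compare each remaining element to the current min → 1 comparison

- Compare each remaining element to the current max → 1 comparison

For $n$ elements → total $2(n-1)$ comparisons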



✅ Tournament Method (Divide & Conquer):
If the array size is $n = 2^k$, the number of comparisons becomes:

$$\text{Comparisons} = \frac{3n}{2} - 2$$

🧠 Why?
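A pair needs only 1 comparison: the smaller element is the min and the larger is the max.

When merging two parts, 1 comparison for min and 1 for max = 2 comparisons per merge.

So $T(2) = 1$ and $T(n) = 2\,T(n/2) + 2$, which solves to $T(n) = 3n/2 - 2$. The saving comes from the pairs at the bottom of the recursion: each pair costs 1 comparison, where the linear scan spends 2 per element.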





| n (Power of 2) | Naïve (2n−2) | Tournament (3n/2 − 2) |
| -------------- | ------------ | --------------------- |
| 2              | 2            | 1                     |
| 4              | 6            | 4                     |
| 8              | 14           | 10                    |
| 16             | 30           | 22                    |
| 32             | 62           | 46                    |
| 64             | 126          | 94                    |
| 128            | 254          | 190                   |
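
The table can be checked empirically by counting comparisons. The sketch below is illustrative only: the function names `naive_min_max` and `tournament_min_max` are made up for this example, and the counters assume every `<` or `>` between two elements costs exactly one comparison.

```python
def naive_min_max(arr):
    """Linear scan; returns (min, max, comparisons)."""
    comparisons = 0
    lo = hi = arr[0]
    for x in arr[1:]:
        comparisons += 1          # compare against current min
        if x < lo:
            lo = x
        comparisons += 1          # compare against current max
        if x > hi:
            hi = x
    return lo, hi, comparisons


def tournament_min_max(arr, start, end):
    """Divide and conquer; returns (min, max, comparisons)."""
    if start == end:              # one element: no comparison needed
        return arr[start], arr[start], 0
    if end - start == 1:          # a pair: a single comparison settles both
        if arr[start] < arr[end]:
            return arr[start], arr[end], 1
        return arr[end], arr[start], 1

    mid = start + (end - start) // 2
    lmin, lmax, lcount = tournament_min_max(arr, start, mid)
    rmin, rmax, rcount = tournament_min_max(arr, mid + 1, end)

    # Merging costs two comparisons: one for the min, one for the max
    lo = lmin if lmin < rmin else rmin
    hi = lmax if lmax > rmax else rmax
    return lo, hi, lcount + rcount + 2


for n in (2, 4, 8, 16, 32, 64, 128):
    data = list(range(n, 0, -1))
    naive = naive_min_max(data)[2]
    tournament = tournament_min_max(data, 0, n - 1)[2]
    print(n, naive, tournament)
```

Each printed row should match the corresponding row of the table. For sizes that are not powers of two, the midpoint split can need slightly more than $3n/2 - 2$ comparisons.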



🧠 Summary:
Reducing comparisons helps when:

🔁 Comparisons are expensive or involve complex logic.

⚡ You're optimizing for time, energy, or efficiency.

📈 You're handling massive datasets or care about performance benchmarks.

In [None]:
def get_min_max_recursive(array, start, end):
    """
    Recursively finds the minimum and maximum values in the array between indices start and end.

    Args:
    - array (List[int]): The input array of integers.
    - start (int): Starting index of the segment.
    - end (int): Ending index of the segment.

    Returns:
    - Tuple[int, int]: A tuple containing (minimum value, maximum value).
    """
    
    # Base Case 1: Empty range
    if start > end:
        return float('inf'), float('-inf')
    
    # Base Case 2: Only one element in the range
    if start == end:
        return array[start], array[end]
    
    # Base Case 3: Only two elements — directly compare
    if end - start == 1:
        if array[start] > array[end]:
            return array[end], array[start]
        else:
            return array[start], array[end]
    
    # Recursive Case: Divide the array into two halves
    mid = start + (end - start) // 2
    
    # Recurse on the left half
    left_min, left_max = get_min_max_recursive(array, start, mid)
    
    # Recurse on the right half
    right_min, right_max = get_min_max_recursive(array, mid + 1, end)
    
    # Combine the results from both halves
    overall_min = min(left_min, right_min)
    overall_max = max(left_max, right_max)
    
    return overall_min, overall_max


🔧 Core Hardware Components Involved

| Component                       | Purpose                                     |
| ------------------------------- | ------------------------------------------- |
| **Registers**                   | Hold operands and results                   |
| **ALU** (Arithmetic Logic Unit) | Performs arithmetic and logical operations  |
| **Control Unit**                | Decodes instructions and controls data flow |
| **Buses**                       | Move data between registers and ALU         |
| **Clock**                       | Times everything                            |



Here’s what happens when you do `c = a + b`:

1. **Fetch**: CPU fetches the instruction from memory (using Program Counter)

2. **Decode**: Control Unit decodes instruction, identifies it's ADD

3. **Register Read**: a and b are loaded from registers

4. **ALU Activation**: ALU performs binary addition using full-adder circuits

5. **Writeback**: Result is stored in register c

6. **PC incremented**: Move to next instruction

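With pipelining, these stages overlap across instructions: a single ADD still passes through every stage, but the ALU step itself takes 1 cycle and the core can typically complete about one such instruction per cycle.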
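
The six steps can be mimicked with a toy register machine. This is a rough sketch, not a model of any real CPU: the `registers` dictionary, the tuple-based instruction format, and the `step` function are all invented for illustration.

```python
# Hypothetical toy machine: three named registers and one instruction in memory
registers = {"a": 5, "b": 7, "c": 0}
memory = [("ADD", "c", "a", "b")]    # encodes c = a + b
pc = 0                               # program counter


def step(pc):
    """Run one instruction and return the next program counter."""
    instruction = memory[pc]                      # 1. Fetch
    opcode, dest, src1, src2 = instruction        # 2. Decode
    x, y = registers[src1], registers[src2]       # 3. Register read
    if opcode == "ADD":                           # 4. ALU activation
        result = x + y
    else:
        raise ValueError(f"unknown opcode: {opcode}")
    registers[dest] = result                      # 5. Writeback
    return pc + 1                                 # 6. PC incremented


pc = step(pc)
print(registers["c"], pc)  # 12 1
```

A real core performs these stages in hardware and overlaps them across instructions; the sequential Python function only shows the order of the steps.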



⚡ Performance Notes


| Operation              | Typical Latency |
| ---------------------- | --------------- |
| Logical (AND, OR, NOT) | 1 cycle         |
| Add/Subtract           | 1 cycle         |
| Multiply               | 3–5 cycles      |
| Divide                 | 10–40+ cycles   |



💡 Summary


| Operation   | Hardware Used                        | Speed          | Notes                 |
| ----------- | ------------------------------------ | -------------- | --------------------- |
| Add         | Full adders                          | Fast (1 cycle) | Simple logic gates    |
| Subtract    | Full adders + NOT                    | Fast           | Two's complement      |
| Multiply    | Shifters + adders or multiplier unit | Slower         | Optimized in hardware |
| Divide      | Dedicated divider logic              | Slow           | Many cycles           |
| Logical Ops | Logic gates (AND, OR, XOR)           | Fastest        | Performed bitwise     |



🔧 CPU Core Components (Simplified)
Each CPU core generally consists of:

- ALU (Arithmetic Logic Unit): Performs integer arithmetic and bitwise logical operations.

- FPU (Floating Point Unit): Performs floating point (decimal) operations.

- CU (Control Unit): Decodes instructions and orchestrates the data/control signals.

- Registers: Small storage areas inside the CPU.

- L1 Cache: Fastest memory per core.


So if you have a 4-core CPU, you typically have:

1.  At least 4 ALUs (one per core; superscalar cores usually contain several)

2.  4 FPUs (some CPU types omit them)

3.  4 CUs, one per core, rather than a single CU shared across cores


---

🧮 What Happens During Arithmetic/Logical Operations?
Let’s go instruction-by-instruction at low level (micro-architecture):

1. **Addition/Subtraction (ADD, SUB)**

Executed by: ALU

Mechanism:
- Performed using full adders inside the ALU.
- Operates on binary representation.
- Uses two's complement for subtraction.
- Instruction: `ADD R1, R2` → `R1 = R1 + R2`

✅ Very fast — usually single cycle.
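
To make the full-adder and two's-complement points concrete, here is a small bit-level sketch. It assumes an 8-bit register width (`WIDTH` is an arbitrary choice for the example) and uses a ripple-carry chain, the simplest adder layout; real ALUs generally use faster carry schemes.

```python
WIDTH = 8  # assumed register width for this sketch


def full_adder(a, b, carry_in):
    """One-bit full adder built from XOR, AND and OR gates."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_add(x, y, carry_in=0):
    """Add two WIDTH-bit values one bit at a time; the result wraps at WIDTH bits."""
    result = 0
    carry = carry_in
    for i in range(WIDTH):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


def ripple_sub(x, y):
    """Compute x - y as x + NOT(y) + 1, i.e. two's-complement subtraction."""
    mask = (1 << WIDTH) - 1
    return ripple_add(x, ~y & mask, carry_in=1)


print(ripple_add(23, 42))  # 65
print(ripple_sub(42, 23))  # 19
print(ripple_sub(5, 7))    # 254, the 8-bit pattern for -2
```

Subtraction reuses the adder unchanged: inverting the second operand and feeding a carry-in of 1 is all the extra hardware it needs.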


2. **Multiplication (MUL)**

Executed by: ALU or dedicated multiplier unit inside the ALU.

Mechanism:

- Hardware-level implementation of shift-and-add algorithms.

- Modern CPUs use Booth's algorithm or Wallace Tree Multipliers.

- Larger operand multiplications may be pipelined.

⚠️ Slower than addition; typically takes multiple cycles, but pipelined in modern CPUs.
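
The basic shift-and-add idea can be sketched in a few lines. This is the plain textbook loop, not Booth's algorithm or a Wallace tree, and the function name `shift_and_add_multiply` is made up for the example; it handles non-negative integers only.

```python
def shift_and_add_multiply(a, b):
    """Multiply two non-negative integers using only shifts and additions."""
    product = 0
    while b:
        if b & 1:          # lowest bit of the multiplier is set
            product += a   # add the current shifted multiplicand
        a <<= 1            # multiplicand moves one bit position left
        b >>= 1            # consume one multiplier bit
    return product


print(shift_and_add_multiply(13, 11))  # 143
```

The loop runs once per multiplier bit, which hints at why a naive multiplier needs more cycles than an adder; hardware multipliers reduce this by adding many partial products in parallel.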


3. **Division (DIV)**

Executed by: ALU or a dedicated divider circuit

Mechanism:

- Uses restoring/non-restoring division algorithms.
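- Compilers and programmers often avoid division where they can (for example, by replacing division by a constant with a multiply and shift) because it is slow.
- For floating-point: uses FPU.

⚠️ Slowest of all: can take tens of cycles. That's why division is avoided in critical loops.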
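
A minimal sketch of restoring division for unsigned integers is shown below. The name `restoring_divide` and the 8-bit default `width` are assumptions for the example, and the dividend must fit in `width` bits.

```python
def restoring_divide(dividend, divisor, width=8):
    """Unsigned restoring division; returns (quotient, remainder).

    Assumes 0 <= dividend < 2**width and divisor > 0.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor                 # trial subtraction
        if remainder < 0:
            remainder += divisor             # restore: the subtraction did not fit
            quotient = quotient << 1         # quotient bit is 0
        else:
            quotient = (quotient << 1) | 1   # quotient bit is 1
    return quotient, remainder


print(restoring_divide(100, 7))  # (14, 2)
```

Each quotient bit depends on the remainder left by the previous step, so the iterations cannot be overlapped the way partial products can in multiplication. That dependency is one reason division latency is so much higher.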

4. **Logical Operations (AND, OR, NOT, XOR)**

Executed by: ALU

Mechanism:

- Simple bitwise gate-level operations.
- Fastest — just electrical switching (transistors).
- `AND R1, R2` means bitwise AND of the contents of R1 and R2.

✅ Extremely fast, often 1 cycle.


---

🧪 TL;DR

| Operation  | Unit       | Latency        | Notes                      |
| ---------- | ---------- | -------------- | -------------------------- |
| ADD/SUB    | ALU        | 1 cycle        | Fast, simple logic         |
| MUL        | ALU (mult) | \~3–10 cycles  | Complex logic              |
| DIV        | Divider    | \~10–30 cycles | Very slow, often avoided   |
| AND/OR/XOR | ALU        | 1 cycle        | Bitwise, fastest           |
| CMP        | ALU        | 1 cycle        | Just a subtract + flag set |
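
The CMP row ("just a subtract + flag set") can be illustrated with a short sketch. The flag names `ZF`, `SF` and `CF` follow common x86-style naming and are an assumption here, as are the function name `cmp_flags` and the 8-bit width; operands are treated as unsigned values.

```python
def cmp_flags(a, b, width=8):
    """Subtract b from a, discard the result, and keep only the flags."""
    mask = (1 << width) - 1
    diff = (a - b) & mask                    # same subtraction the ALU does for SUB
    return {
        "ZF": diff == 0,                     # zero flag: operands are equal
        "SF": bool(diff >> (width - 1)),     # sign flag: top bit of the result
        "CF": a < b,                         # carry/borrow flag: unsigned a < b
    }


print(cmp_flags(5, 5))  # {'ZF': True, 'SF': False, 'CF': False}
print(cmp_flags(3, 5))  # {'ZF': False, 'SF': True, 'CF': True}
```

A conditional branch then reads these flags, which is why each comparison counted in the min/max analysis maps to roughly one cheap ALU operation.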






**🧠 What exactly does the FPU do?**
An FPU is a separate execution unit that handles:

- Floating-point addition, subtraction, multiplication, division

- Square root, trigonometric operations, exponentiation, etc.

- Conversions between integer and float

- IEEE 754 compliance (handling NaNs, rounding modes, exceptions, etc.)




**⚙️ How is FPU different from ALU?**


| Feature          | ALU (Integer Unit)               | FPU (Floating-Point Unit)                     |
| ---------------- | -------------------------------- | --------------------------------------------- |
| Operates on      | Integers                         | Floating-point numbers                        |
| Format           | Binary                           | IEEE 754 (sign, exponent, mantissa)           |
| Complexity       | Lower                            | Higher (needs normalization, rounding, etc.)  |
| Size/Area in CPU | Smaller                          | Larger circuitry                              |
| Instructions     | `add`, `sub`, `and`, `or`, `xor` | `fadd`, `fsub`, `fmul`, `fdiv`, `fsqrt`, etc. |
| Hardware example | Ripple Carry Adder, Logic Gates  | Floating-point Adders, Shifters, Comparators  |



**🧮 Example: Adding Two Floats**
Say: 1.5 + 2.75
Internally:

1. Convert both to IEEE 754.

2. Align exponents (bit shifting).

3. Add mantissas.

4. Normalize result.

5. Round to nearest representable value.

6. Store back as IEEE 754.

An ALU isn’t built to do all of this, but an FPU is.
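
The steps for `1.5 + 2.75` can be traced in software. The sketch below uses Python's standard `struct` module to read the single-precision bit fields; the helper names `decode_float32` and `add_positive_normals` are invented for the example, and the adder is deliberately simplified: positive normal numbers only, truncation instead of proper rounding, no overflow handling.

```python
import struct


def decode_float32(x):
    """Split a value into IEEE 754 single-precision (sign, exponent, fraction) fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def add_positive_normals(x, y):
    """Add two positive normal float32 values step by step (simplified sketch)."""
    _, ex, fx = decode_float32(x)            # 1. convert to IEEE 754 fields
    _, ey, fy = decode_float32(y)
    mx = fx | (1 << 23)                      # restore the implicit leading 1
    my = fy | (1 << 23)

    # 2. Align exponents: shift the mantissa with the smaller exponent right
    if ex < ey:
        mx >>= ey - ex
        ex = ey
    else:
        my >>= ex - ey

    m = mx + my                              # 3. add mantissas

    # 4. Normalize: a carry into bit 24 means shift right and bump the exponent
    if m >> 24:
        m >>= 1                              # 5. bits shifted out are truncated here
        ex += 1

    bits = (ex << 23) | (m & 0x7FFFFF)       # 6. pack back into IEEE 754
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(decode_float32(1.5))               # (0, 127, 4194304)
print(decode_float32(2.75))              # (0, 128, 3145728)
print(add_positive_normals(1.5, 2.75))   # 4.25
```

Both inputs and the sum are exactly representable, so truncation loses nothing in this case; a real FPU also handles signs, rounding modes, subnormals, infinities and NaNs.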




**🏎 Real-World Importance**
- Scientific calculations, graphics, simulations, machine learning → heavily use FPU

- CPUs have multiple execution pipelines: integer (ALU), floating-point (FPU), vector (SIMD).

- FPU enables parallel execution of float operations without blocking integer ops.


**Bonus: What if no FPU?**
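- Early CPUs (e.g., Intel 80386) had no built-in FPU; floating-point math ran as software routines (very slow).

- Optional math co-processors, such as the Intel 80387 for the 80386 or the 80487 for the 486SX, added hardware floating point.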

- Today, FPUs are standard in desktop/server/mobile CPUs.

- Tiny embedded CPUs (like old 8-bit MCUs) might still lack FPUs to save power/space.




