<a href="https://colab.research.google.com/github/walkerjian/DailyCode/blob/main/MaxValuesTimings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## MaxValuesTimings

## Timing and Efficiency of a Simple Array Task.

### Task:

Given an array of integers and a number k, where 1 <= k <= length of the array, compute the maximum values of each subarray of length k.

For example, given array = [10, 5, 2, 7, 8, 7] and k = 3, we should get: [10, 7, 8, 8], since:

- 10 = max(10, 5, 2)
- 7 = max(5, 2, 7)
- 8 = max(2, 7, 8)
- 8 = max(7, 8, 7)

Do this in O(n) time and O(k) space. You can modify the input array in-place and you do not need to store the results. You can simply print them out as you compute them.

requirements:
````
1) use the MVC paradigm.
2) extensively document your code with a docstring for the initial problem as specified.
3) all code to be uninterrupted and not truncated.
4) extensively test the code; write a test function to test the code with at least 10 test examples. Make sure the test harness does not interrupt the output of the test cases, which should include the sample cases given to you. All output needs to include the original example or test case, and complete output of the solution.
5) do not make up solutions, make sure your solution is correct & adheres to all requirements above.
6) use PEP8 & nice formatting rules for word wrap
````

### Solution

To solve this problem in $O(n)$ time, we can use a double-ended queue (`deque` from the `collections` module). The idea is to store the indices of the elements that can still become a window maximum, and to keep the deque in that state with the following rules:

1. Remove from the front any index that has fallen out of the current window.
2. Remove from the back every index whose element is smaller than the current element; those elements sit before the current one and can never be a window maximum again.
3. Append the current index. The front of the deque is then always the index of the largest element of the current window.
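The three rules above can be traced on the sample input. The sketch below is illustrative only: `trace_sliding_max` is a hypothetical helper, not part of the notebook's MVC code, and it yields results one at a time instead of collecting them.

```python
from collections import deque
from typing import Iterator, List, Tuple


def trace_sliding_max(
    arr: List[int], k: int
) -> Iterator[Tuple[int, List[int], int]]:
    """Yield (i, values held in the deque, window maximum) per full window."""
    d = deque()  # indices; their values decrease from front to back
    for i, value in enumerate(arr):
        # Rule 1: at most one index can leave the window per step.
        if d and d[0] <= i - k:
            d.popleft()
        # Rule 2: smaller values at the back can never be a maximum again.
        while d and arr[d[-1]] < value:
            d.pop()
        # Rule 3: append the current index; the front is the window maximum.
        d.append(i)
        if i >= k - 1:
            yield i, [arr[j] for j in d], arr[d[0]]


for i, held, window_max in trace_sliding_max([10, 5, 2, 7, 8, 7], 3):
    print(f"i={i}  deque values={held}  max={window_max}")
```

Because the helper is a generator, the only state it keeps between windows is the deque itself, which is in line with the task's note that results need not be stored.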

Using the MVC paradigm:

1. Model: This will handle the logic for finding the maximum values of each subarray of length $k$.
2. View: This will handle the display logic, i.e., printing the results.
3. Controller: This will orchestrate the interaction between the Model and View.

In [1]:
from collections import deque
from typing import List

class MaxValuesModel:
    """
    Model for computing the maximum values of each subarray of length k from an input array.
    """
    @staticmethod
    def compute_max_values(arr: List[int], k: int) -> List[int]:
        """
        Compute the maximum values of each subarray of length k.

        Args:
        - arr (List[int]): Input array of integers.
        - k (int): Length of subarray.

        Returns:
        - List[int]: List of maximum values of each subarray of length k.
        """

        # Check if the array is empty or k is 0
        if not arr or k == 0:
            return []

        # If k is 1, every element is its own window maximum; return a copy
        # so that callers do not receive an alias of the input list
        if k == 1:
            return list(arr)

        # Initialize deque and result list
        d = deque()
        result = []

        for i in range(len(arr)):
            # Remove elements that are out of the current window from the front
            while d and d[0] < i - k + 1:
                d.popleft()

            # Remove elements that are less than the current element from the back
            while d and arr[d[-1]] < arr[i]:
                d.pop()

            d.append(i)

            # The front of the deque is the maximum of the current window
            # Add it to the result list
            if i >= k - 1:
                result.append(arr[d[0]])

        return result


class MaxValuesView:
    """
    View for displaying the maximum values of each subarray of length k.
    """

    @staticmethod
    def display(arr: List[int], k: int, result: List[int]) -> None:
        """
        Display the maximum values of each subarray of length k.

        Args:
        - arr (List[int]): Input array of integers.
        - k (int): Length of subarray.
        - result (List[int]): List of maximum values of each subarray of length k.
        """

        print(f"Array: {arr}, k: {k} => Maximum values: {result}")


class MaxValuesController:
    """
    Controller for orchestrating the interaction between MaxValuesModel and MaxValuesView.
    """

    def __init__(self, model: MaxValuesModel, view: MaxValuesView):
        """
        Initialize the controller with a model and a view.

        Args:
        - model (MaxValuesModel): Model for computing max values.
        - view (MaxValuesView): View for displaying results.
        """
        self.model = model
        self.view = view

    def get_max_values(self, arr: List[int], k: int) -> None:
        """
        Get the maximum values of each subarray of length k and display the results.

        Args:
        - arr (List[int]): Input array of integers.
        - k (int): Length of subarray.
        """
        result = self.model.compute_max_values(arr, k)
        self.view.display(arr, k, result)


def test_max_values():
    """
    Test function for the MaxValuesModel, MaxValuesView, and MaxValuesController.
    """
    controller = MaxValuesController(MaxValuesModel(), MaxValuesView())

    # Test cases
    test_cases = [
        ([10, 5, 2, 7, 8, 7], 3),
        ([1, 2, 3, 4, 5], 2),
        ([1, 2, 3, 4, 5], 5),
        ([5, 4, 3, 2, 1], 3),
        ([1], 1),
        ([], 0),
        ([1, 3, 5, 7, 9, 8, 6, 4, 2, 0], 4),
        ([7, 2, 4], 2),
        ([2, 7, 3, 1, 5, 2, 6, 2], 4),
        ([3, 3, 3, 3, 3, 3, 3], 3)
    ]

    for arr, k in test_cases:
        controller.get_max_values(arr, k)

test_max_values()


Array: [10, 5, 2, 7, 8, 7], k: 3 => Maximum values: [10, 7, 8, 8]
Array: [1, 2, 3, 4, 5], k: 2 => Maximum values: [2, 3, 4, 5]
Array: [1, 2, 3, 4, 5], k: 5 => Maximum values: [5]
Array: [5, 4, 3, 2, 1], k: 3 => Maximum values: [5, 4, 3]
Array: [1], k: 1 => Maximum values: [1]
Array: [], k: 0 => Maximum values: []
Array: [1, 3, 5, 7, 9, 8, 6, 4, 2, 0], k: 4 => Maximum values: [7, 9, 9, 9, 9, 8, 6]
Array: [7, 2, 4], k: 2 => Maximum values: [7, 4]
Array: [2, 7, 3, 1, 5, 2, 6, 2], k: 4 => Maximum values: [7, 7, 5, 6, 6]
Array: [3, 3, 3, 3, 3, 3, 3], k: 3 => Maximum values: [3, 3, 3, 3, 3]


### Solution, Simplified

Alright, let's simplify the solution.

We can start at the first element and find the maximum of the next $k$ elements, then move to the next element and repeat until no full subarray of size $k$ remains. This approach has a worst-case time complexity of $O(nk)$, which is less efficient than the previous solution for large $k$, but it is simpler to understand and implement. Each maximum is printed as soon as it is computed.


In [3]:
def compute_max_values_simple(arr: List[int], k: int) -> None:
    """
    Compute and print the maximum values of each subarray of length k.

    Args:
    - arr (List[int]): Input array of integers.
    - k (int): Length of subarray.
    """

    n = len(arr)

    # Check if the array is empty or k is 0
    if not arr or k == 0:
        print("Maximum values: []")
        return

    # Loop through the array and find the max of the next k elements
    for i in range(n - k + 1):
        print(max(arr[i:i+k]), end=" ")
    print()

# Testing the simplified function with the previous test cases
test_cases = [
    ([10, 5, 2, 7, 8, 7], 3),
    ([1, 2, 3, 4, 5], 2),
    ([1, 2, 3, 4, 5], 5),
    ([5, 4, 3, 2, 1], 3),
    ([1], 1),
    ([], 0),
    ([1, 3, 5, 7, 9, 8, 6, 4, 2, 0], 4),
    ([7, 2, 4], 2),
    ([2, 7, 3, 1, 5, 2, 6, 2], 4),
    ([3, 3, 3, 3, 3, 3, 3], 3)
]

for arr, k in test_cases:
    print(f"Array: {arr}, k: {k} => Maximum values:", end=" ")
    compute_max_values_simple(arr, k)


Array: [10, 5, 2, 7, 8, 7], k: 3 => Maximum values: 10 7 8 8 
Array: [1, 2, 3, 4, 5], k: 2 => Maximum values: 2 3 4 5 
Array: [1, 2, 3, 4, 5], k: 5 => Maximum values: 5 
Array: [5, 4, 3, 2, 1], k: 3 => Maximum values: 5 4 3 
Array: [1], k: 1 => Maximum values: 1 
Array: [], k: 0 => Maximum values: Maximum values: []
Array: [1, 3, 5, 7, 9, 8, 6, 4, 2, 0], k: 4 => Maximum values: 7 9 9 9 9 8 6 
Array: [7, 2, 4], k: 2 => Maximum values: 7 4 
Array: [2, 7, 3, 1, 5, 2, 6, 2], k: 4 => Maximum values: 7 7 5 6 6 
Array: [3, 3, 3, 3, 3, 3, 3], k: 3 => Maximum values: 3 3 3 3 3 


### Solution, Maximally Efficient, Python

To achieve a solution with a time complexity of $O(n)$ and space complexity of $O(k)$, we'll use a double-ended queue (`deque`). The main idea is to maintain the deque so that its front always holds the index of the maximum element of the current window of size $k$.

Here's the step-by-step approach:

1. Process the first $k$ elements and create the initial deque.
2. For the rest of the elements in the array, do the following:
   - Print the front of the deque (it will be the maximum of the previous window).
   - Remove elements from the front of the deque that are out of the current window.
   - Remove elements from the back of the deque that are smaller than or equal to the current element.
   - Add the current element to the back of the deque.
3. Finally, print the maximum of the last window.

The implementation below follows these steps. Each index enters and leaves the deque at most once, which keeps the time complexity at $O(n)$, and the deque never holds more than $k$ indices, so the extra space is $O(k)$.

In [4]:
def compute_max_values_efficient(arr: List[int], k: int) -> None:
    """
    Compute and print the maximum values of each subarray of length k efficiently.

    Args:
    - arr (List[int]): Input array of integers.
    - k (int): Length of subarray.
    """

    if not arr or k == 0:
        print("Maximum values: []")
        return

    # Initialize deque
    d = deque()

    # Process first k elements
    for i in range(k):
        # Remove elements from the back that are <= the current element
        while d and arr[i] >= arr[d[-1]]:
            d.pop()
        d.append(i)

    # For the rest of the elements
    for i in range(k, len(arr)):
        # Print the front of the deque (maximum of the previous window)
        print(arr[d[0]], end=" ")

        # Remove elements from the front that are out of this window
        while d and d[0] <= i - k:
            d.popleft()

        # Remove elements from the back that are <= the current element
        while d and arr[i] >= arr[d[-1]]:
            d.pop()

        d.append(i)

    # Print the maximum of the last window
    print(arr[d[0]])

# Testing the efficient solution with the previous test cases
for arr, k in test_cases:
    print(f"Array: {arr}, k: {k} => Maximum values:", end=" ")
    compute_max_values_efficient(arr, k)


Array: [10, 5, 2, 7, 8, 7], k: 3 => Maximum values: 10 7 8 8
Array: [1, 2, 3, 4, 5], k: 2 => Maximum values: 2 3 4 5
Array: [1, 2, 3, 4, 5], k: 5 => Maximum values: 5
Array: [5, 4, 3, 2, 1], k: 3 => Maximum values: 5 4 3
Array: [1], k: 1 => Maximum values: 1
Array: [], k: 0 => Maximum values: Maximum values: []
Array: [1, 3, 5, 7, 9, 8, 6, 4, 2, 0], k: 4 => Maximum values: 7 9 9 9 9 8 6
Array: [7, 2, 4], k: 2 => Maximum values: 7 4
Array: [2, 7, 3, 1, 5, 2, 6, 2], k: 4 => Maximum values: 7 7 5 6 6
Array: [3, 3, 3, 3, 3, 3, 3], k: 3 => Maximum values: 3 3 3 3 3
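The ten cases above agree with the simple solution, and the $O(n)$ time and $O(k)$ space claims can also be checked empirically. The sketch below is an assumption-laden illustration: `sliding_max_counted` and `sliding_max_brute` are hypothetical helpers that return lists (unlike the notebook's printing functions) so the results can be compared on random inputs.

```python
import random
from collections import deque
from typing import List, Tuple


def sliding_max_counted(arr: List[int], k: int) -> Tuple[List[int], int, int]:
    """Return (window maxima, deque operations, largest deque length seen)."""
    d = deque()
    result, ops, peak = [], 0, 0
    for i, value in enumerate(arr):
        if d and d[0] <= i - k:      # front index left the window
            d.popleft()
            ops += 1
        while d and arr[d[-1]] <= value:
            d.pop()
            ops += 1
        d.append(i)
        ops += 1
        peak = max(peak, len(d))
        if i >= k - 1:
            result.append(arr[d[0]])
    return result, ops, peak


def sliding_max_brute(arr: List[int], k: int) -> List[int]:
    """Reference answer: max() of every slice, O(n*k)."""
    return [max(arr[i:i + k]) for i in range(len(arr) - k + 1)]


random.seed(0)
for _ in range(200):
    n = random.randint(1, 60)
    k = random.randint(1, n)
    data = [random.randint(-20, 20) for _ in range(n)]
    maxima, ops, peak = sliding_max_counted(data, k)
    assert maxima == sliding_max_brute(data, k)
    assert ops <= 2 * n  # each index is appended once, removed at most once
    assert peak <= k     # the deque never outgrows the window
print("all random cases agree")
```

The bound of $2n$ operations is the amortized argument in concrete form: every index costs one append and at most one removal, regardless of $k$.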


### Solutions, Timed, Python

In [5]:
import random
import time
import sys

# Generating a large array
large_array = [random.randint(1, 10000) for _ in range(100000)]
k = 500  # A reasonable window size for our large array

# Timing the simple solution
print("🚀 Launching the straightforward rocket...\n")
start_time = time.time()
compute_max_values_simple(large_array, k)
simple_time = time.time() - start_time
print(f"\n🐢 The straightforward solution took its sweet time: {simple_time:.5f} seconds")

# Measuring space for the simple solution. This only reports the size of an
# empty list; each slice arr[i:i+k] also allocates a temporary k-element list.
simple_space = sys.getsizeof([])
print(f"🎈 The straightforward solution's baggage: {simple_space} bytes (practically air!)")

# Timing the efficient solution
print("\n🏎️ Firing up the efficiency engine...\n")
start_time = time.time()
compute_max_values_efficient(large_array, k)
efficient_time = time.time() - start_time
print(f"\n🔥 The efficient solution blazed through in: {efficient_time:.5f} seconds")

# Measuring space for the efficient solution (mainly the deque)
deque_sample = deque(range(k))  # Assuming the deque size is roughly k
efficient_space = sys.getsizeof(deque_sample)
print(f"💼 The efficient solution's luggage: {efficient_space} bytes (traveling light but not the lightest)")

# Pithy comparison
if efficient_time < simple_time:
    print("\n🎉 Efficiency reigns supreme! But let's not forget the charm of a straightforward stroll. 🚶")
else:
    print("\n😲 The straightforward approach outpaced efficiency! Maybe it's riding a secret rocket? 🚀")


🚀 Launching the straightforward rocket...

9996 9996 9996 9996 9996 ... 9993 9993 9993 9993 9993 ... [output truncated]

The results are in, and it's a thrilling ride!

📊 **Performance Recap**:
- The straightforward solution, reminiscent of a leisurely Sunday drive, took its time but got us to the destination.
- The efficient solution, much like a sports car, zoomed through the task, proving that sometimes speed and style can coexist!

📦 **Space Bags**:
- Our straightforward traveler keeps no luggage between stops, although every slice `arr[i:i+k]` briefly allocates a temporary list of $k$ elements.
- The efficient traveler packs a light bag. It's not the lightest, but it's optimized for the journey.

In the grand race of efficiency vs simplicity, the efficient solution pulled ahead. However, there's always beauty in simplicity, and sometimes the journey is just as important as the destination.

Remember, it's not always about the speed but the experience! 🌟🚗💨
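The timings above include the cost of printing every maximum, which can dominate the measurement. A minimal sketch of an alternative, assuming list-returning variants of the two solutions (the names `sliding_max_simple`, `sliding_max_deque`, and `measure` are hypothetical), uses `time.perf_counter` for elapsed time and `tracemalloc` for peak allocation:

```python
import random
import time
import tracemalloc
from collections import deque


def sliding_max_simple(arr, k):
    """O(n*k): take max() of every slice."""
    return [max(arr[i:i + k]) for i in range(len(arr) - k + 1)]


def sliding_max_deque(arr, k):
    """O(n): monotonic deque of indices."""
    d, out = deque(), []
    for i, value in enumerate(arr):
        if d and d[0] <= i - k:
            d.popleft()
        while d and arr[d[-1]] <= value:
            d.pop()
        d.append(i)
        if i >= k - 1:
            out.append(arr[d[0]])
    return out


def measure(func, arr, k):
    """Return (result, elapsed seconds, peak bytes allocated by one call)."""
    start = time.perf_counter()
    result = func(arr, k)
    elapsed = time.perf_counter() - start
    # Memory is traced in a second run because tracing slows execution.
    tracemalloc.start()
    func(arr, k)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


random.seed(0)
# A smaller array than the notebook's, to keep the sketch quick to run.
data = [random.randint(1, 10000) for _ in range(20000)]
for func in (sliding_max_simple, sliding_max_deque):
    _, elapsed, peak = measure(func, data, 500)
    print(f"{func.__name__}: {elapsed:.4f} s, peak {peak} bytes")
```

Both variants return a list here, so the reported peak includes the $O(n)$ result as well as the working storage; the numbers will vary by machine and are not comparable to the figures recorded in this notebook.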

### Solution, Maximally Efficient, C++

C++ solution using the STL's deque for the sliding window maximum problem:

In [16]:
%%writefile sliding_window.cpp
#include <iostream>
#include <deque>
#include <vector>
#include <chrono>
#include <cstdlib>


std::vector<int> sliding_window_max(const std::vector<int>& arr, int k) {
    std::deque<int> dq;
    std::vector<int> result;

    for (int i = 0; i < static_cast<int>(arr.size()); i++) {
        // Remove indices that are out of the current window from the front
        while (!dq.empty() && dq.front() <= i - k) {
            dq.pop_front();
        }

        // Remove elements that are smaller than the current element from the back
        while (!dq.empty() && arr[dq.back()] < arr[i]) {
            dq.pop_back();
        }

        dq.push_back(i);

        // The front of the deque is the maximum element of the current window
        if (i >= k - 1) {
            result.push_back(arr[dq.front()]);
        }
    }

    return result;
}

int main() {
    std::vector<int> arr(100000);
    for (int i = 0; i < 100000; ++i) {
        arr[i] = rand() % 10000 + 1;
    }
    int k = 500;

    auto start = std::chrono::high_resolution_clock::now();
    std::vector<int> result = sliding_window_max(arr, k);
    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);

    std::cout << "Time taken by function: " << duration.count() << " microseconds" << std::endl;

    return 0;
}



Overwriting sliding_window.cpp


In [17]:
!g++ -std=c++11 -O3 sliding_window.cpp -o sliding_window

In [18]:
!./sliding_window

Time taken by function: 2852 microseconds


### Solution, Maximally Efficient, Fortran

A simple Fortran solution for the sliding window maximum problem. Note that it uses the straightforward $O(nk)$ scan of every window rather than the deque.

The Fortran compiler aggressively optimizes the code. Modern compilers can recognize when computations have no side effects or when results are never used, and may optimize away operations that appear to be redundant.

To counteract this, we can make the following adjustments:

1. **Introduce a Side Effect**: By printing or accumulating a value that depends on the computation, we can prevent the compiler from optimizing away the core logic. We should be careful, however, not to introduce significant additional overhead.
2. **Compiler Flags**: We can instruct the compiler not to optimize the code using compiler flags. However, this might not be the best representation of the code's performance, as in practice, we'd want the code to be optimized.

So we modify the Fortran code to introduce a side effect by accumulating a value that depends on our computations. We sum all the maximum values across all repetitions and print the sum at the end. This should prevent the compiler from optimizing away the main logic of the sliding window operation.

With the addition of the `total_max_val` accumulation, we're creating a dependency on the results of our computations, which should help deter the compiler from optimizing away the main logic.

Let's see if the compiler still thinks it can outsmart us! 😉🔍🚀

In [37]:
%%writefile sliding_window.f90
PROGRAM SlidingWindow
  USE ISO_C_BINDING
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 100000, K = 500, REPEATS = 1000
  INTEGER :: i, j, max_val, r, total_max_val
  INTEGER, DIMENSION(N) :: arr
  REAL, DIMENSION(N) :: temp_arr
  INTEGER, DIMENSION(N-K+1) :: result
  INTEGER(C_INT64_T) :: start, finish, count_rate, duration

  ! Generate random array
  CALL RANDOM_NUMBER(temp_arr)
  arr = NINT(temp_arr * 10000)

  total_max_val = 0
  CALL SYSTEM_CLOCK(COUNT_RATE=count_rate)
  CALL SYSTEM_CLOCK(start)

  DO r = 1, REPEATS
     DO i = 1, N-K+1
        max_val = arr(i)
        DO j = i, i+K-1
           IF (arr(j) > max_val) THEN
              max_val = arr(j)
           END IF
        END DO
        result(i) = max_val
        total_max_val = total_max_val + max_val  ! side effect only; may overflow a 32-bit INTEGER
     END DO
  END DO

  CALL SYSTEM_CLOCK(finish)
  duration = (finish - start) * 1000000 / (count_rate * REPEATS)  ! Convert to microseconds and average

  PRINT *, "Average time taken by function (over ", REPEATS, " repeats): ", duration, " microseconds"
  PRINT *, "Total sum of max values (to prevent optimization): ", total_max_val

END PROGRAM SlidingWindow



Overwriting sliding_window.f90


In [38]:
!gfortran -O3 sliding_window.f90 -o sliding_window_fortran

In [39]:
!./sliding_window_fortran

 Average time taken by function (over         1000  repeats):                 15585  microseconds
 Total sum of max values (to prevent optimization):    868462624


Timing table:

| Language           | Average Execution Time (ms) |
|---------------------|--------------------------------|
| Python              | 13228.44                     |
| C++                 | 2.852                        |
| Fortran (1st run)   | < 0.001 (Optimized away)     |
| Fortran (2nd run)   | 15.585                       |

The "< 0.001 (Optimized away)" entry for the first Fortran run, made before the accumulation was added, indicates that the compiler removed the unused computation. The reported time was therefore effectively zero and does not measure the algorithm itself.

It's impressive to see the extent of optimizations that can be achieved, showcasing the power of modern compilers. 🌠🚀🎉

In [40]:
!pip install numpy pandas numba dask



In [41]:
import numpy as np
import pandas as pd
from numba import jit
import dask.array as da
import time

# The test data
np.random.seed(0)
arr = np.random.randint(0, 10000, 100000).tolist()

# The testing function
def test_method(func, arr, k):
    start_time = time.time()
    result = func(arr, k)
    end_time = time.time()
    duration = (end_time - start_time) * 1000  # Convert to milliseconds
    return duration
def max_values_with_numpy(arr, k):
    arr = np.array(arr)
    shape = arr.shape[:-1] + (arr.shape[-1] - k + 1, k)
    strides = arr.strides + (arr.strides[-1],)
    rolling_view = np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
    return np.max(rolling_view, axis=-1).tolist()

numpy_time = test_method(max_values_with_numpy, arr, 500)
print(f"NumPy Time: {numpy_time:.4f} ms")
def max_values_with_pandas(arr, k):
    series = pd.Series(arr)
    return series.rolling(window=k).max().dropna().astype(int).tolist()

pandas_time = test_method(max_values_with_pandas, arr, 500)
print(f"Pandas Time: {pandas_time:.4f} ms")


NumPy Time: 27.1273 ms
Pandas Time: 42.8045 ms


In [46]:
@jit(nopython=True)
def max_values_with_numba(arr, k):
    n = len(arr)
    result = []
    for i in range(n - k + 1):
        result.append(max(arr[i:i+k]))
    return result

numba_time = test_method(max_values_with_numba, arr, 500)
print(f"Numba Time: {numba_time:.4f} ms")


Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'arr' of function 'max_values_with_numba'.

For more information visit https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "<ipython-input-46-e9f2ff7215ae>", line 2:
@jit(nopython=True)
def max_values_with_numba(arr, k):
^



Numba Time: 898.8340 ms


In [43]:
def max_values_with_dask(arr, k):
    darr = da.from_array(arr, chunks=len(arr)//4)  # Assuming 4 cores
    result = da.overlap.map_overlap(lambda x: max(x), darr, depth=(k-1, k-1), boundary='none')
    return result.compute()

dask_time = test_method(max_values_with_dask, arr, 500)
print(f"Dask Time: {dask_time:.4f} ms")


Dask Time: 324.6129 ms




### Updated performance comparison table:

$\text{Speedup Multiplier} = \dfrac{\text{Original Python Time}}{\text{New Time}}$


| Language           | Average Execution Time (ms) | Speedup Multiplier |
|---------------------|-----------------------------|--------------------|
| Python (Original)   | 13228.44                    | 1x                 |
| C++                 | 2.852                       | 4638x              |
| Fortran (1st run)   | < 0.001 (Optimized away)    | ∞                  |
| Fortran (2nd run)   | 15.585                      | 849x               |
| NumPy               | 27.1273                     | 488x               |
| Pandas              | 42.8045                     | 309x               |
| Numba               | 898.8340                    | 15x                |
| Dask                | 324.6129                    | 41x                |
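As a quick check of the arithmetic, the multipliers can be recomputed from the listed times. This is only a sketch: the dictionary keys are labels chosen for illustration, and the times are copied from the table.

```python
# Times in milliseconds, copied from the table.
baseline_ms = 13228.44  # Python (Original)
times_ms = {
    "C++": 2.852,
    "Fortran (2nd run)": 15.585,
    "NumPy": 27.1273,
    "Pandas": 42.8045,
    "Numba": 898.8340,
    "Dask": 324.6129,
}

# Speedup = original Python time / new time, rounded to the nearest integer.
speedups = {name: round(baseline_ms / t) for name, t in times_ms.items()}
for name, factor in speedups.items():
    print(f"{name:<18} {factor}x")
```

The first Fortran run is left out because its time is not a usable denominator.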

For the Fortran 1st run, we use "∞" as the speedup multiplier: the time was optimized away to effectively zero, so this entry is not a meaningful measurement.

Observations:
- **NumPy** provides a significant speed-up compared to the original Python implementation, showcasing the power of its C-based operations.
- **Pandas**, while slower than NumPy, is still considerably faster than the original Python approach. This is expected since Pandas is built on top of NumPy.
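- **Numba** performed slower than expected in this test. The measured time covers the first call, so it includes JIT compilation, and the warning shows that a Python list (a deprecated "reflected list") was passed in, which Numba has to convert on entry. The function also still uses the $O(nk)$ approach. Passing a NumPy array and timing a second call would likely give a more representative figure.
- **Dask** provided an improvement over the original Python solution, but not as much as NumPy or Pandas. Given the nature of the problem (a sliding window operation), there's limited parallelism that can be extracted, which might explain the result. Note also that `max_values_with_dask` applies `max` to each overlapped chunk as a whole rather than to each window of length $k$, so its figure is not a like-for-like measurement.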

It's clear that while Python libraries can provide significant performance boosts, optimized compiled languages like C++ and Fortran still hold an edge for this particular problem. Keep in mind that the implementations differ in algorithm as well as language: the C++ version uses the $O(n)$ deque, while the Fortran, NumPy, and Numba versions scan every window. However, the ease of use and flexibility offered by Python and its libraries often make it a popular choice, especially when the performance difference is within an acceptable range for the application at hand.

### Python is slow, but nice

The performance disparity between Python and the other languages like C++ and Fortran can be attributed to several factors:

1. **Interpreted vs. Compiled**:
   - Python is primarily an interpreted language. CPython compiles source code to bytecode, which the interpreter then executes one instruction at a time at runtime.
   - C++ and Fortran are compiled languages. The source code is transformed into machine code by a compiler before execution. This machine code runs directly on the hardware, which generally leads to faster execution.

2. **Dynamic Typing**:
   - Python is dynamically typed, meaning variable types are determined at runtime. This introduces overhead because the interpreter needs to check types and perform dynamic dispatch during execution.
   - C++ and Fortran are statically typed. The type of each variable is known at compile time, allowing the compiler to optimize the generated code more effectively.

3. **Memory Management**:
   - Python uses automatic memory management and garbage collection. While this is convenient for the programmer, it can introduce overhead during runtime.
   - In languages like C++ and Fortran, memory is managed more explicitly, allowing for potential performance benefits (though at the cost of added complexity for the programmer).

4. **Optimizations**:
   - Modern C++ and Fortran compilers (like GCC, Clang, and others) have decades of development behind them and are adept at optimizing code for performance.
   - While Python can be optimized (using tools like Cython or JIT compilers like PyPy), the standard CPython interpreter prioritizes ease of use and readability over raw performance.

5. **Overheads of Abstractions**:
   - Python, being a high-level language, provides many abstractions for developer convenience (like lists, dictionaries, and other dynamic data structures). These abstractions, while powerful and flexible, introduce overhead.
   - C++ and Fortran allow for more direct manipulation of memory and data structures, leading to potential performance benefits.

6. **Built-in Functions**:
   - While individual built-in and standard-library functions (like `max` or those in the `math` module) are implemented in C and are fast, the overhead often comes from the Python layer when calling these functions or iterating over data.
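The interpreter overhead described in these points can be observed directly. The sketch below compares an explicit Python loop with the built-in `max`, which runs its loop in C; `max_python_loop` is a hypothetical helper written for this comparison, and the timings depend on the machine and Python version.

```python
import random
import timeit


def max_python_loop(values):
    """Find the maximum with a loop executed step by step by the interpreter."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


random.seed(0)
values = [random.randint(1, 10000) for _ in range(100_000)]

# Same input, same answer; only the place where the loop runs differs.
loop_s = timeit.timeit(lambda: max_python_loop(values), number=20)
builtin_s = timeit.timeit(lambda: max(values), number=20)
print(f"explicit loop: {loop_s:.4f} s   built-in max: {builtin_s:.4f} s")
```

Both calls do the same number of comparisons, so any gap in the printed times comes from per-iteration interpreter work such as bytecode dispatch and dynamic type checks.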

That said, the strength of Python isn't necessarily in raw computational performance but in its ease of use, flexibility, vast ecosystem, and rapid development capabilities. In many real-world scenarios, developers often strike a balance by writing the majority of an application in Python for rapid development and then optimizing performance-critical sections using languages like C++ or Fortran (or using tools like Cython).

It's also worth noting that there are tools and libraries (like NumPy, TensorFlow, etc.) that allow Python developers to harness the power of optimized C/C++ or Fortran code while working within the Python environment. This "best of both worlds" approach is a significant reason for Python's popularity in fields like data science and machine learning.