# Is Python fast or slow? Report

                                  Anton Wallentin, Reinaldo Martinez, Javier Garcia
                                                 ICOM5015-001D
                                                   28/02/2024

# Purpose

The objective of this experiment was to demonstrate the value of Python libraries, focusing on NumPy. The experiment computed products of 1-D arrays (vectors) and 2-D arrays (matrices) using two methods: explicit iteration in native Python and the functions provided by the NumPy library. The primary aim was to measure and compare the processing time of the two methods.


# Hypothesis

Our hypothesis is that using the NumPy library will considerably improve the efficiency of matrix operations in Python. Consequently, we anticipate a noticeable difference in the execution time of the multiplication tasks between the two approaches. Furthermore, irrespective of dimensionality, we expect NumPy to consistently outperform the iterative method for these operations. This expectation rests on the fact that NumPy's core routines are implemented in C, a compiled language, which generally runs faster than interpreted Python code [1].

# Considerations

Several considerations guided the design of this experiment. Firstly, the choice to compute vector-vector and matrix-vector dot products was made due to their simplicity in coding using the iterative method, and the ease of result verification compared to more complex operations like the cross product. Additionally, these products offer a more concise display compared to matrix-matrix dot products. Secondly, ensuring compatible dimensionalities of vectors and matrices was essential for proper execution. For instance, vector-vector multiplication necessitates identical numbers of elements in both vectors, while matrix-vector multiplication requires the vector to match the number of columns in the matrix. Finally, the use of square matrices was adopted to streamline code simplicity during matrix creation.
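The dimension requirements described above can be checked before any multiplication is attempted. The following is a minimal sketch, assuming matrices are stored as lists of rows as in the iterative cells of this report; the helper name `check_compatible` is hypothetical and is not part of the experiment's code.

```python
def check_compatible(matrix, vector):
    """Raise ValueError if any row length differs from the vector length."""
    for row_index, row in enumerate(matrix):
        if len(row) != len(vector):
            raise ValueError(
                f"row {row_index} has {len(row)} columns, "
                f"but the vector has {len(vector)} elements"
            )

square = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
check_compatible(square, [1, 0, 1])      # compatible: returns without error

try:
    check_compatible(square, [1, 0])     # vector has too few elements
except ValueError as error:
    print(f"Incompatible shapes: {error}")
```

Using square matrices, as the experiment does, means a single size parameter guarantees that this check passes.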


# Experimental Setup

To measure processing time, the `default_timer` function of the "timeit" library [2] was used for both the iterative and NumPy methods. For the iterative approach, functions calculating the vector-vector and matrix-vector products were defined first. Two vectors and one matrix were then created, and the start and end times were recorded around the two multiplications, so each reported time covers both products together. Finally, the products and the total processing time were displayed.

For the NumPy method, the NumPy library was imported, and a comprehensive function was defined to handle the entire process. This function generated matrices and vectors using NumPy functions, executed the multiplications, measured processing time, and presented the results. The sole input for the function was the dimensionality of the vectors and matrix.
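Each time in this report comes from a single start/end measurement, which is sensitive to background activity on the machine. As a hedged alternative, the sketch below uses `timeit.repeat` to run the same iterative dot product many times and keeps the best run. The values `repeat=5` and `number=1000` are illustrative assumptions, not settings used in the experiment.

```python
import timeit

def dot_product(v1, v2):
    # Same iterative dot product as in the report's cells
    return sum(v1[i] * v2[i] for i in range(len(v1)))

v1 = list(range(1, 100))
v2 = list(range(1, 100))

calls_per_run = 1000
# Five independent runs; each entry is the total time for 1000 calls
run_totals = timeit.repeat(
    lambda: dot_product(v1, v2), repeat=5, number=calls_per_run
)
best_per_call = min(run_totals) / calls_per_run
print(f"Best time per call: {best_per_call:.3e} seconds")
```

Taking the minimum over several runs is the approach suggested in the `timeit` documentation, since higher values usually reflect interference from other processes rather than the code itself.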


![table.png](attachment:table.png)

![table2.png](attachment:table2.png)

# Analysis

The experimental results, shown in Table 1 and Figure 1, reveal a clear pattern. For dimensionalities below 10, both methods run in times of the same order of magnitude, with a small advantage for NumPy. Once dimensionalities reach several tens, however, the processing times diverge substantially. The iterative method's time grows by orders of magnitude, whereas the NumPy time, while increasing, remains within the same order of magnitude.
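The divergence can be quantified as a speedup ratio. The sketch below uses the execution times copied from the cell outputs printed at the end of this report. These are single runs, so the ratios are indicative only. The dictionary keys are the upper bounds passed to `range()` and `np.arange()`, so each array holds one element fewer than its key.

```python
# Execution times in seconds, copied from the cell outputs in this report
iterative_times = {
    10: 4.740001168102026e-05,
    50: 0.00022569998691324145,
    100: 0.0007112000021152198,
    150: 0.0017070000030798838,
    200: 0.002569700009189546,
    250: 0.004010799995739944,
    300: 0.005882600002223626,
}
numpy_times = {
    10: 1.1199997970834374e-05,
    50: 1.6099991626106203e-05,
    100: 1.920000067912042e-05,
    150: 2.059999678749591e-05,
    200: 2.6599998818710446e-05,
    250: 3.5299992305226624e-05,
    300: 4.889999399892986e-05,
}

# Ratio of iterative time to NumPy time for each size
speedups = {size: iterative_times[size] / numpy_times[size] for size in iterative_times}
for size, ratio in speedups.items():
    print(f"size {size:>3}: NumPy is about {ratio:.0f}x faster")
```

With these figures the ratio rises from roughly 4 at the smallest size to roughly 120 at the largest.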

A review of the existing literature identified three key factors behind NumPy's superior performance over native Python iteration. First, NumPy arrays (vectors and matrices) hold elements of a single data type in one contiguous block of memory, whereas a Python list stores references to objects that may differ in type and be scattered in memory [3]. This layout makes data retrieval more efficient for NumPy arrays. Second, NumPy operations are vectorized: a single call processes the whole array in optimized native code, and some routines can additionally run work in parallel, depending on the underlying linear algebra library, which reduces processing time [4]. Third, NumPy functions are implemented in compiled languages such as C, C++, and Fortran, which execute faster than interpreted Python code [5]. Together, these factors account for NumPy's speed advantage over Python lists.
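The first factor, homogeneous and contiguous storage, can be inspected directly. This is a small illustrative sketch rather than part of the experiment; the variable names are assumptions chosen for the example.

```python
import numpy as np

# A Python list holds references to objects, which may be of different types
values_list = [1, 2.5, "three"]
print({type(value).__name__ for value in values_list})

# A NumPy array stores every element with one dtype in a single memory block
values_array = np.arange(1, 10)
print(values_array.dtype)                    # an integer dtype; width depends on the platform
print(values_array.flags["C_CONTIGUOUS"])    # whether the data is laid out contiguously
print(values_array.nbytes, values_array.size * values_array.itemsize)
```

Because every element has the same size, the total byte count equals the element count times the item size, which is what lets compiled code walk the array without following per-element references.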


# Concluding Remarks

Based on the foregoing analysis, we conclude that native Python is not optimized for large data operations, especially the multiplication of high-dimensional arrays. Fortunately, specialized libraries like NumPy are tailored for such tasks and deliver markedly shorter processing times. This experiment underscores the role libraries play in achieving computational efficiency and the importance of using them in data-intensive operations.

# Task distribution

Javier: Report

Reinaldo: Presentation and Report

Anton: Code

# References

[1] “Difference between C and Python,” InterviewBit, Jan. 04, 2024. https://www.interviewbit.com/blog/difference-between-c-and-python/

[2] “timeit — Measure execution time of small code snippets,” Python Documentation. https://docs.python.org/3/library/timeit.html

[3] S. Verma, “How Fast Numpy Really is and Why?,” Towards Data Science, Medium, Dec. 13, 2021. https://towardsdatascience.com/how-fast-numpy-really-is-e9111df44347

[4] GfG, “Why is Numpy faster in Python?,” GeeksforGeeks, Aug. 13, 2021. https://www.geeksforgeeks.org/why-numpy-is-faster-in-python/

[5] “What is the difference between a compiled and an interpreted program?,” The Trustees of Indiana University, 2024. https://kb.iu.edu/d/agsz


# Using Python's Native Iteration:

# Small Arrays (Dimensionalities < 10)

In [94]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Small arrays for demonstration
v1_small = [i for i in range(1, 10)]
v2_small = [i for i in range(1, 10)]
matrix_small = [[i for i in range(1, 10)] for _ in range(1, 10)]

# Timing the execution
start_time = timeit.default_timer()
product_v_small = dot_product(v1_small, v2_small)
product_mv_small = matrix_vector_product(matrix_small, v1_small)
end_time = timeit.default_timer()

# Printing the results
print(f"Dot product: {product_v_small}")
print(f"Matrix-Vector dot product: {product_mv_small}")
print(f"Execution time for small arrays: {end_time - start_time} seconds")

Dot product: 285
Matrix-Vector dot product: [285, 285, 285, 285, 285, 285, 285, 285, 285]
Execution time for small arrays: 4.740001168102026e-05 seconds


# Medium Arrays (Dimensionalities in the range of several tens)

In [95]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Medium arrays
v1_medium = [i for i in range(1, 50)]
v2_medium = [i for i in range(1, 50)]
matrix_medium = [[i for i in range(1, 50)] for _ in range(1, 50)]

# Timing the execution
start_time = timeit.default_timer()
product_v_medium = dot_product(v1_medium, v2_medium)
product_mv_medium = matrix_vector_product(matrix_medium, v1_medium)
end_time = timeit.default_timer()

print(f"Dot product: {product_v_medium}")
# For large arrays, printing the entire product may not be practical, so we'll limit the output
print_limit = min(len(product_mv_medium), 5)
print(f"Matrix-Vector dot product: {product_mv_medium[:print_limit]}")
print(f"Execution time for medium arrays: {end_time-start_time} seconds")

Dot product: 40425
Matrix-Vector dot product: [40425, 40425, 40425, 40425, 40425]
Execution time for medium arrays: 0.00022569998691324145 seconds


# Large Arrays (Dimensionalities in the range of 100)

In [96]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Large arrays
v1_large = [i for i in range(1, 100)]
v2_large = [i for i in range(1, 100)]
matrix_large = [[i for i in range(1, 100)] for _ in range(1, 100)]

# Timing the execution
start_time = timeit.default_timer()
product_v_large = dot_product(v1_large, v2_large)
product_mv_large = matrix_vector_product(matrix_large, v1_large)
end_time = timeit.default_timer()

print(f"Dot product: {product_v_large}")
# For large arrays, printing the entire product may not be practical, so we'll limit the output
print_limit = min(len(product_mv_large), 5)
print(f"Matrix-Vector dot product: {product_mv_large[:print_limit]}")
print(f"Execution time for large arrays: {end_time-start_time} seconds")

Dot product: 328350
Matrix-Vector dot product: [328350, 328350, 328350, 328350, 328350]
Execution time for large arrays: 0.0007112000021152198 seconds


# Large Arrays (Dimensionalities in the range of 150)

In [97]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Large arrays
v1_large = [i for i in range(1, 150)]
v2_large = [i for i in range(1, 150)]
matrix_large = [[i for i in range(1, 150)] for _ in range(1, 150)]

# Timing the execution
start_time = timeit.default_timer()
product_v_large = dot_product(v1_large, v2_large)
product_mv_large = matrix_vector_product(matrix_large, v1_large)
end_time = timeit.default_timer()

print(f"Dot product: {product_v_large}")
# For large arrays, printing the entire product may not be practical, so we'll limit the output
print_limit = min(len(product_mv_large), 5)
print(f"Matrix-Vector dot product: {product_mv_large[:print_limit]}")
print(f"Execution time for large arrays: {end_time-start_time} seconds")

Dot product: 1113775
Matrix-Vector dot product: [1113775, 1113775, 1113775, 1113775, 1113775]
Execution time for large arrays: 0.0017070000030798838 seconds


# Large Arrays (Dimensionalities in the range of 200)

In [98]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Large arrays
v1_large = [i for i in range(1, 200)]
v2_large = [i for i in range(1, 200)]
matrix_large = [[i for i in range(1, 200)] for _ in range(1, 200)]

# Timing the execution
start_time = timeit.default_timer()
product_v_large = dot_product(v1_large, v2_large)
product_mv_large = matrix_vector_product(matrix_large, v1_large)
end_time = timeit.default_timer()

print(f"Dot product: {product_v_large}")
# For large arrays, printing the entire product may not be practical, so we'll limit the output
print_limit = min(len(product_mv_large), 5)
print(f"Matrix-Vector dot product: {product_mv_large[:print_limit]}")
print(f"Execution time for large arrays: {end_time-start_time} seconds")

Dot product: 2646700
Matrix-Vector dot product: [2646700, 2646700, 2646700, 2646700, 2646700]
Execution time for large arrays: 0.002569700009189546 seconds


# Large Arrays (Dimensionalities in the range of 250)

In [99]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Large arrays
v1_large = [i for i in range(1, 250)]
v2_large = [i for i in range(1, 250)]
matrix_large = [[i for i in range(1, 250)] for _ in range(1, 250)]

# Timing the execution
start_time = timeit.default_timer()
product_v_large = dot_product(v1_large, v2_large)
product_mv_large = matrix_vector_product(matrix_large, v1_large)
end_time = timeit.default_timer()

print(f"Dot product: {product_v_large}")
# For large arrays, printing the entire product may not be practical, so we'll limit the output
print_limit = min(len(product_mv_large), 5)
print(f"Matrix-Vector dot product: {product_mv_large[:print_limit]}")
print(f"Execution time for large arrays: {end_time-start_time} seconds")

Dot product: 5177125
Matrix-Vector dot product: [5177125, 5177125, 5177125, 5177125, 5177125]
Execution time for large arrays: 0.004010799995739944 seconds


# Large Arrays (Dimensionalities in the range of 300)

In [100]:
import timeit

def dot_product(v1, v2):
    # Dot product
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    result = []
    for row in matrix:
        # Dot product
        result.append(dot_product(row, vector))
    return result

# Large arrays
v1_large = [i for i in range(1, 300)]
v2_large = [i for i in range(1, 300)]
matrix_large = [[i for i in range(1, 300)] for _ in range(1, 300)]

# Timing the execution
start_time = timeit.default_timer()
product_v_large = dot_product(v1_large, v2_large)
product_mv_large = matrix_vector_product(matrix_large, v1_large)
end_time = timeit.default_timer()

print(f"Dot product: {product_v_large}")
# For large arrays, printing the entire product may not be practical, so we'll limit the output
print_limit = min(len(product_mv_large), 5)
print(f"Matrix-Vector dot product: {product_mv_large[:print_limit]}")
print(f"Execution time for large arrays: {end_time-start_time} seconds")

Dot product: 8955050
Matrix-Vector dot product: [8955050, 8955050, 8955050, 8955050, 8955050]
Execution time for large arrays: 0.005882600002223626 seconds


# Using NumPy:

# Small Arrays (Dimensionalities < 10)

In [118]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])
    
    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for small arrays with NumPy: {end_time-start_time} seconds")

# Small arrays (Dimensionality < 10)
numpy_operations(10)

Dot product: 285
Matrix-Vector dot product: [285 285 285 285 285]
Execution time for small arrays with NumPy: 1.1199997970834374e-05 seconds


# Medium Arrays (Dimensionalities in the range of several tens)

In [117]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])
    
    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for medium arrays with NumPy: {end_time-start_time} seconds")

# Medium arrays (Dimensionality in several tens)
numpy_operations(50)

Dot product: 40425
Matrix-Vector dot product: [40425 40425 40425 40425 40425]
Execution time for medium arrays with NumPy: 1.6099991626106203e-05 seconds


# Large Arrays (Dimensionalities in the range of 100)

In [110]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])
    
    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for large arrays with NumPy: {end_time-start_time} seconds")

# Large arrays (Dimensionality around 100)
numpy_operations(100)

Dot product: 328350
Matrix-Vector dot product: [328350 328350 328350 328350 328350]
Execution time for large arrays with NumPy: 1.920000067912042e-05 seconds


# Large Arrays (Dimensionalities in the range of 150)

In [111]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])
    
    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for large arrays with NumPy: {end_time-start_time} seconds")

# Large arrays (Dimensionality around 150)
numpy_operations(150)

Dot product: 1113775
Matrix-Vector dot product: [1113775 1113775 1113775 1113775 1113775]
Execution time for large arrays with NumPy: 2.059999678749591e-05 seconds


# Large Arrays (Dimensionalities in the range of 200)

In [112]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])
    
    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for large arrays with NumPy: {end_time-start_time} seconds")

# Large arrays (Dimensionality around 200)
numpy_operations(200)

Dot product: 2646700
Matrix-Vector dot product: [2646700 2646700 2646700 2646700 2646700]
Execution time for large arrays with NumPy: 2.6599998818710446e-05 seconds


# Large Arrays (Dimensionalities in the range of 250)

In [113]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])
    
    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for large arrays with NumPy: {end_time-start_time} seconds")

# Large arrays (Dimensionality around 250)
numpy_operations(250)

Dot product: 5177125
Matrix-Vector dot product: [5177125 5177125 5177125 5177125 5177125]
Execution time for large arrays with NumPy: 3.5299992305226624e-05 seconds


# Large Arrays (Dimensionalities in the range of 300)

In [114]:
import numpy as np
import timeit

def numpy_operations(size):
    v1_np = np.arange(1, size)
    v2_np = np.arange(1, size)
    matrix_np = np.array([np.arange(1, size) for _ in range(size - 1)])

    start_time = timeit.default_timer()
    product_v_np = np.dot(v1_np, v2_np)
    product_mv_np = np.dot(matrix_np, v1_np)
    end_time = timeit.default_timer()

    print(f"Dot product: {product_v_np}")
    # For large arrays, printing the entire product may not be practical, so we'll limit the output
    print_limit = min(len(product_mv_np), 5)
    print(f"Matrix-Vector dot product: {product_mv_np[:print_limit]}")
    print(f"Execution time for large arrays with NumPy: {end_time-start_time} seconds")
    
# Large arrays (Dimensionality in several hundreds)
numpy_operations(300)

Dot product: 8955050
Matrix-Vector dot product: [8955050 8955050 8955050 8955050 8955050]
Execution time for large arrays with NumPy: 4.889999399892986e-05 seconds
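
The iterative and NumPy cells above print matching products for every size. A compact way to confirm this agreement programmatically is sketched here. The helper `results_match` is hypothetical, and `np.tile` is used as an assumed shortcut for building the matrix of identical rows that the report constructs with a list comprehension.

```python
import numpy as np

def dot_product(v1, v2):
    # Same iterative dot product as in the report's cells
    return sum(v1[i] * v2[i] for i in range(len(v1)))

def matrix_vector_product(matrix, vector):
    return [dot_product(row, vector) for row in matrix]

def results_match(size):
    """Build identical inputs as lists and as arrays, then compare the products."""
    vector = list(range(1, size))
    matrix = [list(range(1, size)) for _ in range(size - 1)]
    vector_np = np.arange(1, size)
    matrix_np = np.tile(vector_np, (size - 1, 1))   # (size - 1) identical rows

    same_dot = dot_product(vector, vector) == int(np.dot(vector_np, vector_np))
    same_mv = matrix_vector_product(matrix, vector) == np.dot(matrix_np, vector_np).tolist()
    return same_dot and same_mv

for size in (10, 50, 100, 150, 200, 250, 300):
    print(size, results_match(size))
```

Checking correctness separately from timing keeps the comparison code out of the measured region.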
