<a href="https://colab.research.google.com/github/YoussefAlameldeen/Looping-Vectorized-implementation/blob/main/What_is_the_difference_between_looping_and_vectorized_implementation%3F.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Looping and vectorized implementation

Looping and vectorized implementation are two different ways to perform the same computation on a dataset.

**Looping** involves iterating over the dataset element-by-element and performing the desired operation on each element. This is the most common way to implement algorithms in programming languages like Python and Java.

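**Vectorized implementation** uses libraries such as NumPy and Pandas to apply an operation to a whole array in a single call. The per-element loop still happens, but it runs inside precompiled, optimized code rather than in the Python interpreter. This is usually much faster than looping in Python, especially for large datasets.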

Here is a simple example of a looping implementation in Python:

In [None]:
import numpy as np

def sum_array_loop(array):
    """Computes the sum of all elements in an array using a loop."""
    total = 0  # named `total` to avoid shadowing Python's built-in sum()
    for element in array:
        total += element
    return total

array = np.array([1, 2, 3, 4, 5])

total = sum_array_loop(array)

print(total)

Output:

```
15
```

Here is a vectorized implementation of the same function:

In [None]:
import numpy as np

def sum_array_vectorized(array):
    """Computes the sum of all elements in an array using vectorized operations."""
    return np.sum(array)

array = np.array([1, 2, 3, 4, 5])

total = sum_array_vectorized(array)  # `total` avoids shadowing the built-in sum()

print(total)

Output:

```
15
```

As you can see, the two implementations produce the same output. However, the vectorized implementation is much faster, especially for large datasets.

Here is a benchmark of the two implementations on a dataset of 1 million elements:

In [None]:
import time

def benchmark(function, array):
    """Returns the time in seconds taken by a single call to function(array)."""
    start_time = time.perf_counter()  # high-resolution timer intended for benchmarking
    function(array)  # the result is discarded; only the elapsed time matters
    end_time = time.perf_counter()
    return end_time - start_time

array = np.random.randn(1000000)

looping_time = benchmark(sum_array_loop, array)
vectorized_time = benchmark(sum_array_vectorized, array)

print("Looping time:", looping_time)
print("Vectorized time:", vectorized_time)

Output:

```
Looping time: 0.078125
Vectorized time: 0.000869
```

In this run, the vectorized implementation is roughly 90 times faster than the looping implementation (0.078125 s versus 0.000869 s). Exact timings vary between machines and between runs.
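
A single timed call can be noisy, so the comparison can also be repeated a few times with the standard-library `timeit` module. The following is a minimal sketch rather than a definitive benchmark: the repeat count of 3 and the choice to report the best run are assumptions made for illustration, and the loop function is redefined here only so the block runs on its own.

```python
import timeit

import numpy as np


def sum_array_loop(array):
    """Same looping sum as above, repeated so this block is self-contained."""
    total = 0
    for element in array:
        total += element
    return total


array = np.random.randn(1_000_000)

# Time each implementation 3 times (one call per measurement) and keep the
# fastest run, which is less affected by background activity on the machine.
loop_time = min(timeit.repeat(lambda: sum_array_loop(array), number=1, repeat=3))
vectorized_time = min(timeit.repeat(lambda: np.sum(array), number=1, repeat=3))

print("Looping time:", loop_time)
print("Vectorized time:", vectorized_time)
print("Speedup:", loop_time / vectorized_time)
```

The printed numbers depend on the hardware and the NumPy build, so they are best read as an order-of-magnitude comparison.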

## When to use looping and vectorized implementation

Use a loop when the per-element operation cannot be expressed with the library's built-in array operations (for example, when each step depends on the result of the previous one), or when the dataset is small enough that the performance difference does not matter.
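
As an illustration of a custom per-element operation that fits a loop naturally, the sketch below computes a running smoothed value where each output depends on the previous output. The function name `exponential_smoothing` and the parameter `alpha` are hypothetical and chosen only for this example.

```python
import numpy as np


def exponential_smoothing(values, alpha):
    """Hypothetical example: each output depends on the previous output."""
    smoothed = np.empty(len(values), dtype=float)
    smoothed[0] = values[0]
    for i in range(1, len(values)):
        # The new value mixes the current input with the previous result,
        # so the elements cannot be computed independently of each other.
        smoothed[i] = alpha * values[i] + (1 - alpha) * smoothed[i - 1]
    return smoothed


values = np.array([1.0, 2.0, 3.0, 4.0])
print(exponential_smoothing(values, alpha=0.5))
```

Because of the dependency between consecutive elements, there is no single element-wise NumPy call that directly replaces this loop.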

Use a vectorized implementation when the operation is available as a built-in array operation (sums, arithmetic, comparisons, and so on) and the dataset is large enough for the performance difference to matter.
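
To make "common operation" more concrete, here is a small sketch of two element-wise operations written with NumPy array expressions instead of a loop. The variable names and the threshold of 3 are assumptions for illustration only.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5])

# Loop version: square each element one at a time in Python.
squared_loop = np.array([x * x for x in array])

# Vectorized version: one array expression squares every element.
squared_vectorized = array * array

# A conditional can also be vectorized: replace values above 3 with 3.
capped = np.where(array > 3, 3, array)

print(squared_vectorized)  # [ 1  4  9 16 25]
print(capped)              # [1 2 3 3 3]
```

Both versions of the squaring produce the same values; the vectorized one simply hands the per-element work to NumPy.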

## Conclusion

Vectorized implementation is a powerful tool that can significantly improve the performance of your code when working with large datasets. However, it is important to choose the right implementation for your specific needs.