<a href="https://vigneashpandiyan.github.io/publications/Codes/" target="_blank" rel="noopener noreferrer">
  <img src="https://vigneashpandiyan.github.io/images/Link.png"
       style="max-width: 800px; width: 100%; height: auto;">
</a>

# Vector Operations: Scalar Multiplication, Sum and Dot Product of Vectors

In this module you will use Python and `NumPy` functions to perform the main vector operations: scalar multiplication, sum of vectors, and the dot product. You will also compare the speed of these linear algebra operations in their loop and vectorized forms.

## Packages

Load the `NumPy` package to access its functions.

In [None]:
import numpy as np

<a name='1'></a>
## 1 - Scalar Multiplication and Sum of Vectors

<a name='1.1'></a>
### 1.1 - Visualization of a Vector $v\in\mathbb{R}^2$

You have already seen in the videos and labs that vectors can be visualized as arrows, and this is easy to do for a vector $v\in\mathbb{R}^2$, e.g.
$v=\begin{bmatrix}
          1 & 3
\end{bmatrix}^T$

The following code will show the visualization.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

def plot_vectors(list_v, list_label, list_color):
    _, ax = plt.subplots(figsize=(10, 10))
    ax.tick_params(axis='x', labelsize=14)
    ax.tick_params(axis='y', labelsize=14)
    ax.set_xticks(np.arange(-10, 10))
    ax.set_yticks(np.arange(-10, 10))


    plt.axis([-10, 10, -10, 10])
    for i, v in enumerate(list_v):
        sgn = 0.4 * np.array([[1] if i==0 else [i] for i in np.sign(v)])
        plt.quiver(v[0], v[1], color=list_color[i], angles='xy', scale_units='xy', scale=1)
        ax.text(v[0]-0.2+sgn[0], v[1]-0.2+sgn[1], list_label[i], fontsize=14, color=list_color[i])

    plt.grid()
    plt.gca().set_aspect("equal")
    plt.show()

v = np.array([[1],[3]])
# Arguments: list of vectors as NumPy arrays, labels, colors.
plot_vectors([v], [f"$v$"], ["black"])

A vector is defined by its **norm (length, magnitude)** and **direction**, not by its actual position. For clarity and convenience, however, vectors are often plotted starting at the origin (in $\mathbb{R}^2$ this is the point $(0,0)$).

<a name='1.2'></a>
### 1.2 - Scalar Multiplication

**Scalar multiplication** of a vector $v=\begin{bmatrix}
          v_1 & v_2 & \ldots & v_n
\end{bmatrix}^T\in\mathbb{R}^n$ by a scalar $k$ is a vector $kv=\begin{bmatrix}
          kv_1 & kv_2 & \ldots & kv_n
\end{bmatrix}^T$ (element-by-element multiplication). If $k>0$, then $kv$ points in the same direction as $v$ and is $k$ times as long as $v$. If $k=0$, then $kv$ is the zero vector. If $k<0$, then $kv$ points in the opposite direction and is $\lvert k\rvert$ times as long as $v$. In Python you can perform this operation with the `*` operator. Check out the example below:

In [None]:
plot_vectors([v, 2*v, -2*v], [f"$v$", f"$2v$", f"$-2v$"], ["black", "green", "blue"])
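
To complement the plot, here is a minimal numeric sketch of the same operation. It reuses the vector $v$ from above and prints, for a few illustrative values of $k$, the components of $kv$ together with its length, which should equal $\lvert k\rvert$ times the length of $v$.

```python
import numpy as np

v = np.array([[1], [3]])

for k in (2, 0, -2):
    kv = k * v  # element-by-element multiplication by the scalar k
    # Compare the length of kv with |k| times the length of v.
    print(k, kv.ravel(), np.linalg.norm(kv), abs(k) * np.linalg.norm(v))
```

The last two printed numbers agree in every row, including the case $k=0$, where both are zero.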

<a name='1.3'></a>
### 1.3 - Sum of Vectors

**Sum of vectors (vector addition)** can be performed by adding the corresponding components of the vectors: if $v=\begin{bmatrix}
          v_1 & v_2 & \ldots & v_n
\end{bmatrix}^T\in\mathbb{R}^n$ and  
$w=\begin{bmatrix}
          w_1 & w_2 & \ldots & w_n
\end{bmatrix}^T\in\mathbb{R}^n$, then $v + w=\begin{bmatrix}
          v_1 + w_1 & v_2 + w_2 & \ldots & v_n + w_n
\end{bmatrix}^T\in\mathbb{R}^n$. The so-called **parallelogram law** gives the rule for vector addition. For two vectors $u$ and $v$ represented by the adjacent sides (both in magnitude and direction) of a parallelogram drawn from a point, the vector sum $u+v$ is represented by the diagonal of the parallelogram drawn from the same point.

In Python you can use either the `+` operator or the `NumPy` function `np.add()`. In the following code you can uncomment the last line to check that the result is the same:

In [None]:
v = np.array([[1],[3]])
w = np.array([[4],[-1]])

plot_vectors([v, w, v + w], [f"$v$", f"$w$", f"$v + w$"], ["black", "black", "red"])
# plot_vectors([v, w, np.add(v, w)], [f"$v$", f"$w$", f"$v + w$"], ["black", "black", "red"])
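
The equivalence of the two forms can also be checked numerically, without plotting. The sketch below uses the same $v$ and $w$; the names `sum_operator` and `sum_function` are introduced only for this illustration.

```python
import numpy as np

v = np.array([[1], [3]])
w = np.array([[4], [-1]])

sum_operator = v + w         # component-wise addition with the + operator
sum_function = np.add(v, w)  # the same operation as a NumPy function call

print(sum_operator.ravel())                        # components of v + w
print(np.array_equal(sum_operator, sum_function))  # True if both forms agree
```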

<a name='1.4'></a>
### 1.4 - Norm of a Vector

The norm of a vector $v$ is denoted as $\lvert v\rvert$. It is a nonnegative number that describes the extent of the vector in space (its length); for $v\in\mathbb{R}^n$ the Euclidean norm is $\lvert v\rvert=\sqrt{v_1^2+v_2^2+\ldots+v_n^2}$. The norm of a vector can be found using the `NumPy` function `np.linalg.norm()`:

In [None]:
print("Norm of a vector v is", np.linalg.norm(v))
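
As a cross-check, the same value can be computed by hand as the square root of the sum of squared components. This is a small sketch for the vector $v$ used above; `norm_by_hand` and `norm_numpy` are illustrative names.

```python
import numpy as np

v = np.array([[1], [3]])

norm_by_hand = np.sqrt(np.sum(v ** 2))  # sqrt(1^2 + 3^2) = sqrt(10)
norm_numpy = np.linalg.norm(v)          # default norm: square root of the sum of squares

print(norm_by_hand, norm_numpy)
print(np.isclose(norm_by_hand, norm_numpy))
```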

<a name='2'></a>
## 2 - Dot Product

<a name='2.1'></a>
### 2.1 - Algebraic Definition of the Dot Product

The **dot product** (or **scalar product**) is an algebraic operation that takes two vectors $x=\begin{bmatrix}
          x_1 & x_2 & \ldots & x_n
\end{bmatrix}^T\in\mathbb{R}^n$ and  
$y=\begin{bmatrix}
          y_1 & y_2 & \ldots & y_n
\end{bmatrix}^T\in\mathbb{R}^n$ and returns a single scalar. The dot product can be represented with a dot operator $x\cdot y$ and defined as:

$$x\cdot y = \sum_{i=1}^{n} x_iy_i = x_1y_1+x_2y_2+\ldots+x_ny_n \tag{1}$$

<a name='2.2'></a>
### 2.2 - Dot Product using Python

The simplest way to calculate the dot product in Python is to take the sum of the element-by-element products. You can define the vectors $x$ and $y$ by listing their coordinates:

In [None]:
x = [1, -2, -5]
y = [4, 3, -1]

Next, let’s define a function `dot(x,y)` for the dot product calculation:

In [None]:
def dot(x, y):
    s=0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

For the sake of simplicity, let’s assume that the vectors passed to the above function are always of the same size, so that you don’t need to perform additional checks.
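
If that assumption does not hold, note that `zip()` silently stops at the end of the shorter input, so a length mismatch would go unnoticed. A possible defensive variant is sketched below; the name `dot_checked` is hypothetical and is not used elsewhere in this lab.

```python
def dot_checked(x, y):
    """Loop-based dot product that rejects vectors of different lengths."""
    if len(x) != len(y):
        raise ValueError(f"length mismatch: {len(x)} != {len(y)}")
    s = 0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

print(dot_checked([1, -2, -5], [4, 3, -1]))

try:
    dot_checked([1, 2, 3], [4, 5])  # deliberately mismatched lengths
except ValueError as err:
    print("Rejected:", err)
```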

Now everything is ready to perform the dot product calculation calling the function `dot(x,y)`:

In [None]:
print("The dot product of x and y is", dot(x, y))
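
For reference, applying equation $(1)$ by hand to these two vectors gives the same number:

$$x\cdot y = 1\cdot 4 + (-2)\cdot 3 + (-5)\cdot(-1) = 4 - 6 + 5 = 3$$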

The dot product is a very commonly used operation, so the `NumPy` package provides a quick way to calculate it with the function `np.dot()`:

In [None]:
print("np.dot(x,y) function returns dot product of x and y:", np.dot(x, y))

Note that you did not have to define vectors $x$ and $y$ as `NumPy` arrays; the function worked even with lists. There are alternatives, however, such as the explicit operator `@` for the dot product, which does not work on plain Python lists and requires `NumPy` arrays. You can run the following cell to check that.

In [None]:
print("This line output is a dot product of x and y: ", np.array(x) @ np.array(y))

print("\nThis line output is an error:")
try:
    print(x @ y)
except TypeError as err:
    print(err)

As both the `np.dot()` function and the `@` operator are commonly used, it is recommended to define vectors as `NumPy` arrays to avoid errors. Let's redefine vectors $x$ and $y$ as `NumPy` arrays to be safe:

In [None]:
x = np.array(x)
y = np.array(y)
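
One more detail is worth checking: the arrays above are one-dimensional, whereas the vectors in Section 1 were defined as columns of shape `(2, 1)`. The sketch below, using the illustrative names `x_col` and `y_col`, shows that two column arrays cannot be passed to `np.dot()` directly; one of them has to be transposed, or both flattened first.

```python
import numpy as np

x_col = np.array([[1], [-2], [-5]])  # shape (3, 1)
y_col = np.array([[4], [3], [-1]])   # shape (3, 1)

try:
    np.dot(x_col, y_col)             # (3, 1) times (3, 1): inner dimensions do not match
except ValueError as err:
    print("Shape error:", err)

as_matrix = x_col.T @ y_col                       # (1, 3) @ (3, 1) gives an array of shape (1, 1)
as_scalar = np.dot(x_col.ravel(), y_col.ravel())  # flatten to 1-D, result is a scalar

print(as_matrix.shape, as_matrix[0, 0], as_scalar)
```

Both routes give the same value as before; they differ only in whether the result is a scalar or a 1-by-1 array.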

<a name='2.3'></a>
### 2.3 - Speed of Calculations in Vectorized Form

Dot product operations in Machine Learning applications are applied to large vectors with hundreds or thousands of coordinates (called **high dimensional vectors**). Training models on large datasets often takes hours or even days, even on powerful machines. Speed of calculations is crucial for the training and deployment of your models.

It is important to understand the difference in calculation speed between the vectorized and loop forms of these operations. In the loop form operations are performed one by one, while in the vectorized form they can be performed in parallel. In the section above you defined the loop version of the dot product calculation (function `dot()`), while `np.dot()` and `@` represent the vectorized form.

Let's perform a simple experiment to compare their speed. Define new vectors $a$ and $b$, each of size $1,000,000$:

In [None]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)

Use the `time.time()` function, which returns the current time in seconds, to measure how long it takes to calculate the dot product using the function `dot(x,y)` which you defined above:

In [None]:
import time

tic = time.time()
c = dot(a,b)
toc = time.time()
print("Dot product: ", c)
print ("Time for the loop version: " + str(1000*(toc-tic)) + " ms")

Now compare it with the speed of the vectorized versions:

In [None]:
tic = time.time()
c = np.dot(a,b)
toc = time.time()
print("Dot product: ", c)
print ("Time for the vectorized version, np.dot() function: " + str(1000*(toc-tic)) + " ms")

In [None]:
tic = time.time()
c = a @ b
toc = time.time()
print("Dot product: ", c)
print ("Time for the vectorized version, @ operator: " + str(1000*(toc-tic)) + " ms")

You can see that vectorization is extremely beneficial in terms of the speed of calculations!
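
A single `time.time()` measurement can be noisy, especially for very fast operations. As an alternative sketch, the standard-library `timeit` module can repeat each measurement and keep the best run. The vector size of 100,000 and the repeat count of 3 are assumptions chosen only to keep the run short, and the absolute timings depend on your machine.

```python
import timeit
import numpy as np

def dot(x, y):
    # Loop version of the dot product, as defined earlier in this lab.
    s = 0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

rng = np.random.default_rng(0)  # seeded generator so the run is reproducible
a = rng.random(100_000)
b = rng.random(100_000)

# Taking the best of several repeats reduces the influence of background noise.
loop_time = min(timeit.repeat(lambda: dot(a, b), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: np.dot(a, b), number=1, repeat=3))

print(f"loop: {1000 * loop_time:.3f} ms, np.dot: {1000 * vec_time:.3f} ms")
print("Results agree:", np.isclose(dot(a, b), np.dot(a, b)))
```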

<a name='2.4'></a>
### 2.4 - Geometric Definition of the Dot Product

In [Euclidean space](https://en.wikipedia.org/wiki/Euclidean_space), a Euclidean vector has both magnitude and direction. The dot product of two vectors $x$ and $y$ is defined by:

$$x\cdot y = \lvert x\rvert \lvert y\rvert \cos(\theta),\tag{2}$$

where $\theta$ is the angle between the two vectors.

This provides an easy way to test the orthogonality between vectors. If $x$ and $y$ are orthogonal (the angle between vectors is $90^{\circ}$), then since $\cos(90^{\circ})=0$, it implies that **the dot product of any two orthogonal vectors must be $0$**. Let's test it, taking two vectors $i$ and $j$ we know are orthogonal:

In [None]:
i = np.array([1, 0, 0])
j = np.array([0, 1, 0])
print("The dot product of i and j is", dot(i, j))
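
Equation $(2)$ can also be solved for $\theta$ to recover the angle itself. The helper `angle_degrees` below is a hypothetical name for this sketch; it clips the cosine to $[-1, 1]$ before calling `np.arccos()` so that floating-point rounding cannot push the value outside the valid range.

```python
import numpy as np

def angle_degrees(x, y):
    """Angle between two nonzero vectors, in degrees."""
    cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    cos_theta = np.clip(cos_theta, -1.0, 1.0)  # guard against rounding error
    return np.degrees(np.arccos(cos_theta))

i = np.array([1, 0, 0])
j = np.array([0, 1, 0])

print(angle_degrees(i, j))      # orthogonal vectors: 90 degrees
print(angle_degrees(i, i + j))  # approximately 45 degrees
```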

<a name='2.5'></a>
### 2.5 - Application of the Dot Product: Vector Similarity

The geometric definition of the dot product is used in one of its applications: evaluating **vector similarity**. In Natural Language Processing (NLP), words or phrases from a vocabulary are mapped to corresponding vectors of real numbers. The similarity between two vectors can be defined as the cosine of the angle between them. When they point in the same direction, their similarity is 1, and it decreases as the angle increases.

Then equation $(2)$ can be rearranged to evaluate cosine of the angle between vectors:

$$\cos(\theta)=\frac{x \cdot y}{\lvert x\rvert \lvert y\rvert}.\tag{3}$$

A value of zero corresponds to zero similarity between the vectors (and the words corresponding to those vectors). The largest value, $1$, occurs when the vectors point in the same direction; the lowest value, $-1$, occurs when they point in opposite directions.
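
Because equation $(3)$ divides by both norms, the result depends only on direction, not on length. The sketch below illustrates this with a hypothetical helper `cosine` and made-up vectors: scaling a vector by a positive scalar leaves the similarity at its maximum, while a negative scalar flips it to the minimum.

```python
import numpy as np

def cosine(x, y):
    """Cosine of the angle between two nonzero vectors, as in equation (3)."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 2.0])

print(cosine(x, 10 * x))                 # same direction, different length: approximately 1
print(cosine(x, np.array([-2.0, 1.0])))  # orthogonal: 0
print(cosine(x, -3 * x))                 # opposite direction: approximately -1
```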

This example of vector similarity is given to link the material with Machine Learning applications. There will be no actual implementation of it in this Course. Some examples of implementation can be found in the Natural Language Processing Specialization.

Well done, you have finished this lab!

In [None]:
import numpy as np
import matplotlib.pyplot as plt

def plot_vectors(list_v, list_label, list_color):
    _, ax = plt.subplots(figsize=(8, 8))
    ax.tick_params(axis='x', labelsize=14)
    ax.tick_params(axis='y', labelsize=14)

    # Setting the grid limits
    limit = 10
    ax.set_xticks(np.arange(-limit, limit+1))
    ax.set_yticks(np.arange(-limit, limit+1))
    plt.axis([-limit, limit, -limit, limit])

    for i, v in enumerate(list_v):
        # Using quiver to plot vectors from the origin (0,0)
        plt.quiver(0, 0, v[0], v[1], color=list_color[i], angles='xy', scale_units='xy', scale=1)

        # Label placement logic
        ax.text(v[0], v[1], f" {list_label[i]}", fontsize=14, color=list_color[i], fontweight='bold')

    plt.grid(True, linestyle='--', alpha=0.6)
    plt.axhline(0, color='black', linewidth=1)
    plt.axvline(0, color='black', linewidth=1)
    plt.gca().set_aspect("equal")
    plt.title("Vector Similarity in 2D Space\n(Smaller Angle = Higher Dot Product)", fontsize=16)
    plt.show()

# --- Use Case: Image Recognition ---
# Component X: Organic Shapes / Texture
# Component Y: Metallic / Geometric Edges

v_target = np.array([8, 2])    # The Reference (e.g., a Golden Retriever)
v_similar = np.array([7, 3])   # Similar (e.g., a Labrador) - Small angle
v_unrelated = np.array([1, 8]) # Unrelated (e.g., a Building) - Wide angle

vectors = [v_target, v_similar, v_unrelated]
labels = ['Target (Dog)', 'Similar (Dog)', 'Unrelated (Building)']
colors = ['#1f77b4', '#2ca02c', '#d62728']

plot_vectors(vectors, labels, colors)



> Why this plot explains similarity:

1.   The angle ($\theta$): Notice how the blue and green arrows are "hugging" each other. Because they point in nearly the same direction, the cosine of the angle between them is close to 1 and their dot product is large. In image recognition, this means the visual patterns (pixels/edges) match closely.
2.   Near-orthogonality: The red arrow (Building) points almost vertically, while the blue arrow is mostly horizontal. The angle between them is wide, so their dot product is much smaller: the two vectors emphasize different features.

The math connection:

*   High similarity: $\cos(0^\circ) = 1$
*   No similarity: $\cos(90^\circ) = 0$
*   Opposite: $\cos(180^\circ) = -1$

---
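
To put numbers on the picture, the sketch below computes the dot product and the cosine for the three vectors from the plot. The helper `cosine` is an illustrative name and is not part of the plotting code.

```python
import numpy as np

v_target = np.array([8, 2])
v_similar = np.array([7, 3])
v_unrelated = np.array([1, 8])

def cosine(x, y):
    """Cosine of the angle between two nonzero vectors."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

for name, v in [("similar", v_similar), ("unrelated", v_unrelated)]:
    # Print the raw dot product and the length-normalized cosine.
    print(name, np.dot(v_target, v), round(cosine(v_target, v), 3))
```

The similar pair has a dot product of 62 and a cosine of about 0.987, while the unrelated pair has a dot product of 24 and a cosine of about 0.361.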



In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# 1. Define our Feature Matrix (Rows = Images, Columns = Features)
# Features: [Organic/Fur, Metallic/Geometry]
features = np.array([
    [8, 2],  # Image 0: Golden Retriever
    [7, 3],  # Image 1: Labrador
    [2, 9],  # Image 2: Skyscrapers
    [1, 8],  # Image 3: Sports Car
    [7, 2]   # Image 4: Poodle
])

labels = ['Retriever', 'Labrador', 'Skyscrapers', 'Sports Car', 'Poodle']

# 2. Normalize the rows to unit length (so dot product = cosine similarity)
norms = np.linalg.norm(features, axis=1, keepdims=True)
normalized_features = features / norms

# 3. Compute the Similarity Matrix using the Dot Product
# Matrix Multiplication: (5x2) @ (2x5) = (5x5) matrix of all pairs
similarity_matrix = np.dot(normalized_features, normalized_features.T)

# 4. Visualization
plt.figure(figsize=(10, 8))
sns.heatmap(similarity_matrix, annot=True, cmap='YlGnBu',
            xticklabels=labels, yticklabels=labels)

plt.title("2D Similarity Matrix (Dot Product)", fontsize=16)
plt.show()
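
Two properties of the matrix above can be verified without plotting: every image has similarity 1 with itself (the diagonal), and the similarity of A with B equals that of B with A (symmetry). The sketch below repeats the feature values from the cell above; the names `normalized` and `similarity` are illustrative.

```python
import numpy as np

features = np.array([[8, 2], [7, 3], [2, 9], [1, 8], [7, 2]], dtype=float)

# Normalize each row to unit length, then take all pairwise dot products.
normalized = features / np.linalg.norm(features, axis=1, keepdims=True)
similarity = normalized @ normalized.T

print(np.allclose(np.diag(similarity), 1.0))  # each image compared with itself
print(np.allclose(similarity, similarity.T))  # symmetric matrix
print(np.round(similarity[0], 2))             # Retriever compared with all five images
```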

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# --- 1. The Database Matrix (2D) ---
# Each row represents an image in our database.
# Dimensions: [Texture/Fur, Geometry/Edges]
database_matrix = np.array([
    [9, 2],  # Index 0: Husky
    [8, 1],  # Index 1: Beagle
    [2, 8],  # Index 2: Eiffel Tower
    [1, 9],  # Index 3: Bridge
    [5, 5]   # Index 4: Abstract Art
])
db_labels = ['Husky', 'Beagle', 'Eiffel Tower', 'Bridge', 'Abstract Art']

# --- 2. The Query Vector (1D) ---
# Someone uploads a photo of a 'Golden Retriever'
query_vector = np.array([8.5, 1.5])

# --- 3. Vector Similarity Calculation ---
# Step A: Normalize for Cosine Similarity
db_norm = database_matrix / np.linalg.norm(database_matrix, axis=1, keepdims=True)
query_norm = query_vector / np.linalg.norm(query_vector)

# Step B: The Dot Product (The Engine)
# This calculates similarity for ALL database items at once
similarities = np.dot(db_norm, query_norm)

# --- 4. Visualization ---
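plt.figure(figsize=(10, 5))
match_threshold = 0.9  # one threshold for both the bar colors and the dashed line
colors = ['green' if s > match_threshold else 'gray' for s in similarities]
plt.bar(db_labels, similarities, color=colors)
plt.axhline(y=match_threshold, color='r', linestyle='--', label='Match Threshold')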

plt.title('Vector Similarity: Query vs. Database Matrix', fontsize=14)
plt.ylabel('Similarity Score (Dot Product)')
plt.ylim(0, 1.1)
plt.legend()

for i, score in enumerate(similarities):
    plt.text(i, score + 0.02, f"{score:.2f}", ha='center', fontweight='bold')

plt.show()
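
In a retrieval setting, the scores are usually turned into a ranking. A minimal sketch of that step, assuming the same database and query values as above, sorts the indices by descending similarity with `np.argsort()`; the name `ranking` is introduced only for this example.

```python
import numpy as np

database_matrix = np.array([[9, 2], [8, 1], [2, 8], [1, 9], [5, 5]], dtype=float)
db_labels = ['Husky', 'Beagle', 'Eiffel Tower', 'Bridge', 'Abstract Art']
query_vector = np.array([8.5, 1.5])

# Normalize rows and query so the dot product equals the cosine similarity.
db_norm = database_matrix / np.linalg.norm(database_matrix, axis=1, keepdims=True)
query_norm = query_vector / np.linalg.norm(query_vector)
similarities = db_norm @ query_norm

# argsort returns ascending order, so reverse it for best match first.
ranking = np.argsort(similarities)[::-1]
for idx in ranking:
    print(f"{db_labels[idx]:<13} {similarities[idx]:.3f}")
```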

In [None]:
import numpy as np
import matplotlib.pyplot as plt

def cosine_similarity(v1, v2):
    """Calculates similarity using the dot product formula."""
    # The Dot Product measures the combined strength of matching dimensions
    dot_product = np.dot(v1, v2)
    norm_v1 = np.linalg.norm(v1)
    norm_v2 = np.linalg.norm(v2)
    return dot_product / (norm_v1 * norm_v2)

# --- Scenario 1: Natural Language Processing (NLP) ---
# Each vector has 3 dimensions representing specific topics:
# [Technology, Nature, Sports]
# A value of 0.9 means the text is 90% related to that specific topic.

# Query: "Latest smartphone technology"
# Logic: High in Tech (0.9), low in Nature (0.1), zero in Sports (0.0)
query_nlp = np.array([0.9, 0.1, 0.0])

docs_nlp = {
    # Aligns with Tech dimension
    'Tech Article': np.array([0.85, 0.2, 0.05]),
    # Aligns with Nature dimension (orthogonal to Tech)
    'Nature Blog':  np.array([0.1, 0.9, 0.0]),
    # Aligns with Sports dimension
    'Sports News':  np.array([0.05, 0.05, 0.9])
}

sim_nlp = [cosine_similarity(query_nlp, v) for v in docs_nlp.values()]
labels_nlp = list(docs_nlp.keys())

# --- Scenario 2: Recommendation Systems (RecSys) ---
# Each vector has 3 dimensions representing movie genres:
# [Action/Sci-Fi, Romance, Documentary]

# User Profile: What the user likes
# Logic: the first two components are high (0.8, 0.7) and the third is low (0.1)
user_pref = np.array([0.8, 0.7, 0.1])

items_rec = {
    # High in the first two components (matches the user profile)
    'Action Movie':  np.array([0.9, 0.8, 0.2]),
    # Dominated by the third component (mismatches the user profile)
    'Romance Film':  np.array([0.1, 0.2, 0.9]),
    # Also weighted toward the third component (low match)
    'Nature Docu':   np.array([0.3, 0.3, 0.8])
}

sim_rec = [cosine_similarity(user_pref, v) for v in items_rec.values()]
labels_rec = list(items_rec.keys())

# --- Visualization Logic ---
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6), sharey=True)

# NLP Plot
ax1.bar(labels_nlp, sim_nlp, color=['#1f77b4', '#aec7e8', '#ff7f0e'])
ax1.set_title('Use Case 1: NLP Similarity\n[Tech, Nature, Sports]', fontsize=14)
ax1.set_ylabel('Cosine Similarity Score', fontsize=12)
ax1.set_ylim(0, 1.1)

# RecSys Plot
ax2.bar(labels_rec, sim_rec, color=['#2ca02c', '#98df8a', '#d62728'])
ax2.set_title('Use Case 2: Recommendation Matching\n[Action, Romance, Documentary]', fontsize=14)

for ax, sims in zip([ax1, ax2], [sim_nlp, sim_rec]):
    for i, v in enumerate(sims):
        ax.text(i, v + 0.02, f"{v:.2f}", ha='center', fontweight='bold')

plt.tight_layout()
plt.show()

Breakdown of the Component Logic

In vector similarity, these components are often called features or dimensions. Here is how they function in the code:

1.   NLP: Semantic Dimensions

*   Dimension 1 (Tech): Measures the presence of keywords like "processor," "silicon," or "software."
*   Dimension 2 (Nature): Measures keywords like "forest," "climate," or "wildlife."
*   Dimension 3 (Sports): Measures keywords like "goal," "stadium," or "athlete."

Why it works: When you multiply the Query [0.9, 0.1, 0.0] by the Tech Article [0.85, 0.2, 0.05], the high values in the first index multiply together ($0.9 \times 0.85 = 0.765$), leading to a high dot product.

2.   RecSys: Preference Dimensions

*   Dimension 1 (Action/Sci-Fi): Represents the intensity or "weight" of explosions, futuristic technology, or fast pacing.
*   Dimension 2 (Romance): Represents emotional focus, relationship dynamics, or "meet-cutes."
*   Dimension 3 (Documentary): Represents educational value, real-world footage, and factual narration.


Why it works: The dot product acts as a weighted sum. It calculates how much of the "Action" the user wants times how much "Action" the movie actually has. If both are high, the recommendation score peaks.
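
The weighted-sum view can be made concrete by looking at the individual products before they are added. The sketch below does this for the NLP query and the Tech Article vector from the code above; `terms` is an illustrative name for the element-by-element products.

```python
import numpy as np

query = np.array([0.9, 0.1, 0.0])
tech_article = np.array([0.85, 0.2, 0.05])

terms = query * tech_article  # per-dimension contributions to the dot product
dot_product = terms.sum()     # same value as np.dot(query, tech_article)
cosine = dot_product / (np.linalg.norm(query) * np.linalg.norm(tech_article))

print(terms)
print(dot_product)
print(round(cosine, 3))
```

The first dimension contributes 0.765 of the total 0.785, which is why the Tech dimension dominates the score; after dividing by the two norms, the cosine similarity is about 0.991.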