# What is the dot product?

**Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).**

**In the early days of neural networks, [Frank Rosenblatt](https://en.wikipedia.org/wiki/Frank_Rosenblatt) showed us that, using vectors and the properties of the dot product, we could separate sets of feature vectors with a hyperplane.**

## But what is a vector?

**Let's imagine a vector in the context of a 2D or 3D Euclidean space:**

[![vector2D](https://i.stack.imgur.com/Q1rBUm.png)](https://i.stack.imgur.com/Q1rBUm.png)

[![vector3D](https://i.stack.imgur.com/t0plRm.png)](https://i.stack.imgur.com/t0plRm.png)

Source: _[3Blue1Brown](https://www.youtube.com/watch?v=fNk_zzaMoSs) video on Vectors_.

## What is the connection between Matrices and Vectors?

**Vectors can be represented as matrices with a single row or a single column. One example is a [Euclidean vector](https://en.wikipedia.org/wiki/Euclidean_vector) in three-dimensional Euclidean space ($\mathbb{R}^{3}$), usually written as a column vector but sometimes as a row vector: $v = [a_{1}, a_{2}, a_{3}]$.**
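As a small illustration, here is how the same vector can be stored as a plain 1-D array, a row matrix, or a column matrix. This is only a sketch that assumes NumPy; the entries `1, 2, 3` are arbitrary example values.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])  # plain 1-D vector with three entries (example values)
row = v.reshape(1, 3)          # row vector: a 1 x 3 matrix
col = v.reshape(3, 1)          # column vector: a 3 x 1 matrix

print(v.shape, row.shape, col.shape)

# Transposing the row vector yields the column vector.
print(np.array_equal(row.T, col))
```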

## What is a Dot product? What does it signify?

**Algebraically, the dot product is the sum of the products of the corresponding entries of two equal-length sequences of numbers:**

- $v_{1} = [a_{1}, a_{2}, a_{3}] = [2, 4, 6]$
- $v_{2} = [b_{1}, b_{2}, b_{3}] = [3, 5, 7]$
- $v_{1} \cdot  v_{2} = [2, 4, 6] \cdot  [3, 5, 7] = (2 \times 3) + (4 \times 5) + (6 \times 7) = 68$
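The arithmetic above can be checked in a few lines. This is a minimal sketch assuming NumPy; it computes the sum of products by hand and compares it with `np.dot`.

```python
import numpy as np

v1 = np.array([2, 4, 6])
v2 = np.array([3, 5, 7])

# Sum of the element-wise products, written out explicitly.
manual = sum(a * b for a, b in zip(v1, v2))

# The same computation with NumPy's built-in dot product.
builtin = np.dot(v1, v2)

print(manual, builtin)  # both are 68
```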

**_If two vectors point in roughly the same direction (the angle between them is less than 90°), the dot product is positive; if they point in roughly opposite directions (the angle is greater than 90°), it is negative; if they are perpendicular, it is zero._** Try it [here](https://sergedesmedt.github.io/MathOfNeuralNetworks/DotProduct.html#learn_dotproduct).

_So you could use the dot product as a way to find out if two vectors are aligned or not._
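The sign behaviour follows from the geometric form of the dot product, $v_{1} \cdot v_{2} = \lVert v_{1} \rVert \lVert v_{2} \rVert \cos\theta$, where $\theta$ is the angle between the two vectors. The sketch below assumes NumPy and uses made-up 2D vectors; the helper `cosine` is a hypothetical name introduced only for this example.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between u and v (both assumed to be non-zero)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

u = np.array([1.0, 2.0])
same = np.array([2.0, 4.0])        # same direction as u
opposite = np.array([-1.0, -2.0])  # opposite direction to u
perp = np.array([-2.0, 1.0])       # perpendicular to u

print(np.dot(u, same))      # positive
print(np.dot(u, opposite))  # negative
print(np.dot(u, perp))      # zero
print(cosine(u, same), cosine(u, opposite))
```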

**That is, for two linearly separable sets of input feature vectors in a vector space (*say we are classifying whether an email is spam*), we can find a weight vector whose dot product with every feature vector of one class (*not spam*) is positive and with every feature vector of the other class (*spam*) is negative. In essence, we are using the weight vector to define a hyperplane that splits the space into two distinct regions.**

**The first neural network, the [perceptron](https://en.wikipedia.org/wiki/Perceptron), did exactly this and nothing more: it is guaranteed to find a solution only if the input set is linearly separable.**
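To make the idea concrete, here is a minimal sketch of a perceptron-style learning rule, assuming NumPy, labels coded as `+1`/`-1`, and a tiny hand-made dataset. The function name `train_perceptron` and the data are hypothetical and not taken from Rosenblatt's original formulation.

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """Perceptron-style rule; labels in y are expected to be +1 or -1."""
    w = np.zeros(X.shape[1])
    b = 0.0  # threshold: predict +1 when dot(w, x) > b
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            pred = 1 if np.dot(w, xi) > b else -1
            if pred != yi:
                w += yi * xi  # move w toward (or away from) the misclassified point
                b -= yi       # shift the threshold in the matching direction
                mistakes += 1
        if mistakes == 0:     # every point is on the correct side, so stop
            break
    return w, b

# Toy data (hypothetical): two linearly separable groups in 2D.
X = np.array([[2.0, 2.0], [3.0, 1.0], [2.5, 3.0],
              [-2.0, -1.0], [-3.0, -2.0], [-1.0, -2.5]])
y = np.array([1, 1, 1, -1, -1, -1])

w, b = train_perceptron(X, y)
pred = np.where(X @ w > b, 1, -1)
print(w, b, pred)
```

If the two groups were not linearly separable, the loop would simply run out of epochs without reaching zero mistakes.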

![perceptron2](https://miro.medium.com/max/639/1*_Epn1FopggsgvwgyDA4o8w.png)

**So, the inequality `dot(w, x) > b` defines all the points on one side of the hyperplane, and `dot(w, x) <= b` defines all the points on the other side together with the hyperplane itself. Two classes are 'linearly separable' precisely when some `w` and `b` place each class entirely on its own side. Thus, the perceptron allows us to split our feature space into two convex half-spaces.**
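The half-space test and its convexity can be checked numerically. This is a sketch assuming NumPy; the weight vector `w`, the threshold `b`, and the points `p`, `q`, `r` are arbitrary values chosen for illustration.

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5])  # hypothetical weight vector (normal to the hyperplane)
b = 1.0                         # hypothetical threshold

def in_open_half_space(x):
    """True when x lies strictly on the dot(w, x) > b side of the hyperplane."""
    return bool(np.dot(w, x) > b)

p = np.array([4.0, 0.0, 0.0])   # dot(w, p) = 4.0
q = np.array([0.0, -1.0, 2.0])  # dot(w, q) = 3.0
r = np.array([0.0, 1.0, 0.0])   # dot(w, r) = -2.0

print(in_open_half_space(p), in_open_half_space(q), in_open_half_space(r))

# Convexity: every sampled point on the segment from p to q stays in the half-space.
ts = np.linspace(0.0, 1.0, 11)
segment_inside = all(in_open_half_space((1 - t) * p + t * q) for t in ts)
print(segment_inside)
```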

**In [geometry](https://en.wikipedia.org/wiki/Geometry "Geometry"), the hyperplane separation theorem is a theorem about [disjoint](https://en.wikipedia.org/wiki/Disjoint_sets "Disjoint sets") [convex sets](https://en.wikipedia.org/wiki/Convex_set "Convex set") in $n$-dimensional [Euclidean space](https://en.wikipedia.org/wiki/Euclidean_space "Euclidean space"): _if both these sets are [closed](https://en.wikipedia.org/wiki/Closed_set "Closed set") and at least one of them is [compact](https://en.wikipedia.org/wiki/Compact_set "Compact set"), then there is a [hyperplane](https://en.wikipedia.org/wiki/Hyperplane "Hyperplane") in between them and even two parallel hyperplanes in between them separated by a gap._**
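In symbols, the strict version of the theorem can be written as follows. The names $A$, $B$, $v$, $c_{1}$ and $c_{2}$ are introduced here only for this statement: for non-empty disjoint convex sets $A, B \subset \mathbb{R}^{n}$ that are both closed, with at least one of them compact, there exist a non-zero vector $v$ and real numbers $c_{1} > c_{2}$ such that

$$
v \cdot x > c_{1} \quad \text{for all } x \in A
\qquad \text{and} \qquad
v \cdot y < c_{2} \quad \text{for all } y \in B .
$$

The two parallel hyperplanes are $v \cdot x = c_{1}$ and $v \cdot x = c_{2}$, and the gap between them is the region where $c_{2} \le v \cdot x \le c_{1}$.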


```python
import numpy as np
import plotly.graph_objects as go
import plotly.io as pio
from sklearn.svm import SVC

# Rendering settings: JupyterLab renderer with a dark theme.
pio.renderers.default = 'jupyterlab'
pio.templates.default = 'plotly_dark'
rs = np.random.RandomState(666)

# Two Gaussian clouds in 3D: class 0 centered at (1, 1, 1), class 1 at (-1, -1, -1).
n_samples = 100
half = n_samples // 2
A = np.zeros((n_samples, 3))
A[:half] = rs.multivariate_normal(np.ones(3), np.eye(3), size=half)
A[half:] = rs.multivariate_normal(-np.ones(3), np.eye(3), size=n_samples - half)
B = np.zeros(n_samples)
B[half:] = 1

# A linear SVM fits a hyperplane of the form dot(w, x) + intercept = 0.
svc = SVC(kernel='linear')
svc.fit(A, B)


def z(x, y):
    """Solve dot(w, [x, y, z]) + intercept = 0 for z, to draw the plane."""
    w = svc.coef_[0]
    return (-svc.intercept_[0] - w[0] * x - w[1] * y) / w[2]


# Grid over the range of the first two features, used to draw the plane.
am, aM = A[:, 0].min(), A[:, 0].max()
bm, bM = A[:, 1].min(), A[:, 1].max()
a = np.linspace(am, aM, 10)
b = np.linspace(bm, bM, 10)
a, b = np.meshgrid(a, b)

fig = go.Figure()
fig.add_surface(x=a, y=b, z=z(a, b), showscale=False, opacity=0.9)
fig.add_scatter3d(x=A[B == 0, 0], y=A[B == 0, 1], z=A[B == 0, 2],
                  mode='markers', marker={'color': 'blue'}, name='Class 0')
fig.add_scatter3d(x=A[B == 1, 0], y=A[B == 1, 1], z=A[B == 1, 2],
                  mode='markers', marker={'color': 'red'}, name='Class 1')
fig.update_layout(template='plotly_dark',
                  title='Hyperplane Separation',
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()
```


