# Cauchy-Schwarz and Triangle Inequalities
---

Both the Cauchy-Schwarz and triangle inequalities are widely used in data science: in NLP through cosine similarity, in correlation and feature-relationship analysis, in PCA and projections, and more. Understanding the ideas behind the two inequalities is a crucial step in data extraction, manipulation, and visualization.

Objectives:

1. To verify that the Cauchy-Schwarz inequality, given by the formula below, holds:

 $$|\langle x, y \rangle| \le \|x\| \, \|y\|$$

2. To compute Euclidean distances and verify that the triangle inequality below holds:
$$d(A,C) \le d(A,B) + d(B,C)$$

We are going to use the iris dataset from `sklearn.datasets`.

In [79]:
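# Import modules (the datasets submodule is imported explicitly so that
# sklearn.datasets.load_iris is guaranteed to be available)
import sklearn.datasets
import numpy as np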

In [80]:
# Getting the dataset from sklearn
iris = sklearn.datasets.load_iris()
iris.data[:5]

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])

Question: 
1. Choose two features to work with and compute:
 - Dot product of the two features
 - ||x||
 - ||y||
 - ||x|| ||y||

In [96]:
# Getting the first feature: sepal length (cm), column index 0
x = iris.data[:, 0]
x[:5]

array([5.1, 4.9, 4.7, 4.6, 5. ])

In [97]:
# Getting the second feature: petal length (cm), column index 2
y = iris.data[:, 2]
y[:5]

array([1.4, 1.4, 1.3, 1.5, 1.4])

1. Computing the dot product of the two features

In [98]:
dotProduct = np.dot(x, y)
print(f"The dot product of x and y is: {dotProduct.round(4)}")

The dot product of x and y is: 3483.76


2. Computing ||x|| and ||y||

In [99]:
xLength = np.linalg.norm(x)
print(f"||x|| is: {xLength.round(4)}")

||x|| is: 72.2762


In [100]:
yLength = np.linalg.norm(y)
print(f"||y|| is: {yLength.round(4)}")

||y|| is: 50.8204


3. Computing ||x|| ||y||, i.e. the product of the lengths.

In [102]:
Length = xLength * yLength
print(f"||x|| ||y||: {Length.round(4)}")

||x|| ||y||: 3673.1035
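
The length returned by `np.linalg.norm` is the Euclidean norm, that is, the square root of the dot product of a vector with itself. A minimal sketch of that equivalence, using only the first five sepal-length values printed earlier (the names `x_sample`, `norm_builtin`, and `norm_manual` are illustrative and not part of the notebook):

```python
import numpy as np

# First five sepal-length values, copied from the output of x[:5] above.
x_sample = np.array([5.1, 4.9, 4.7, 4.6, 5.0])

# Euclidean norm via NumPy's built-in routine.
norm_builtin = np.linalg.norm(x_sample)

# The same quantity written out from the definition ||x|| = sqrt(<x, x>).
norm_manual = np.sqrt(np.dot(x_sample, x_sample))

print(f"np.linalg.norm: {norm_builtin:.6f}")
print(f"sqrt(<x, x>):   {norm_manual:.6f}")
```

The two values should agree up to floating-point rounding, which is why the product ||x|| ||y|| above can be read as a product of square roots of sums of squares.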


Finally, let us verify the Cauchy-Schwarz inequality.

In [103]:
print(f"Is |<x,y>|: {abs(dotProduct).round(4)} <= ||x|| ||y||: {Length.round(4)}?")

Is |<x,y>|: 3483.76 <= ||x|| ||y||: 3673.1035?


In [104]:
# The inequality bounds the absolute value of the dot product
if abs(dotProduct) <= Length:
    print("Yes, the Cauchy-Schwarz inequality holds")
else:
    print("The Cauchy-Schwarz inequality does not hold!")

Yes, the Cauchy-Schwarz inequality holds
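
The same check can be repeated for every pair of features rather than a single pair. The sketch below is one way to do that; it uses only the five rows of `iris.data` printed earlier so that it runs without scikit-learn, and the helper `cauchy_schwarz_holds` and its tolerance `tol` are hypothetical names introduced for illustration:

```python
import itertools
import numpy as np

# First five rows of iris.data, copied from the output shown earlier.
sample = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
])

def cauchy_schwarz_holds(u, v, tol=1e-9):
    """Return True if |<u, v>| <= ||u|| ||v||, allowing a small rounding tolerance."""
    lhs = abs(np.dot(u, v))
    rhs = np.linalg.norm(u) * np.linalg.norm(v)
    return bool(lhs <= rhs + tol)

# Check every unordered pair of feature columns.
results = {
    (i, j): cauchy_schwarz_holds(sample[:, i], sample[:, j])
    for i, j in itertools.combinations(range(sample.shape[1]), 2)
}

for (i, j), ok in results.items():
    print(f"columns {i} and {j}: inequality holds = {ok}")
```

The tolerance is there because, for nearly parallel vectors, floating-point rounding can push the left-hand side a hair above the right-hand side even though the inequality holds exactly.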


Compute the cosine similarity and comment on the alignment.

The cosine similarity is given by the following formula:
$$\cos\theta = \frac{\langle x, y \rangle}{\|x\| \, \|y\|}$$

In [105]:
cosineSimilarity = dotProduct / Length

print(f"Cosine similarity is: {cosineSimilarity.round(4)}")

Cosine similarity is: 0.9485


There is a high level of alignment between sepal length and petal length: the cosine similarity is approximately $0.95$, so the two feature vectors point in nearly the same direction.
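
One caveat when reading this number: all iris measurements are positive, so raw cosine similarity between any two features tends to be high. Subtracting each feature's mean first gives the centered cosine similarity, which coincides with the Pearson correlation. The sketch below illustrates that identity on the five values of each feature printed earlier; the function `cosine_similarity` and the `*_sample` names are illustrative, and five rows are not representative of the full dataset:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: <u, v> / (||u|| ||v||)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# First five values of each feature, copied from the outputs above.
x_sample = np.array([5.1, 4.9, 4.7, 4.6, 5.0])  # sepal length (cm)
y_sample = np.array([1.4, 1.4, 1.3, 1.5, 1.4])  # petal length (cm)

# Cosine similarity of the raw vectors.
raw = cosine_similarity(x_sample, y_sample)

# Cosine similarity after subtracting each vector's mean.
centered = cosine_similarity(x_sample - x_sample.mean(),
                             y_sample - y_sample.mean())

# Pearson correlation from NumPy, for comparison with the centered value.
pearson = float(np.corrcoef(x_sample, y_sample)[0, 1])

print(f"raw cosine similarity:      {raw:.4f}")
print(f"centered cosine similarity: {centered:.4f}")
print(f"Pearson correlation:        {pearson:.4f}")
```

Applying the same comparison to the full `x` and `y` arrays shows how much of the alignment comes from the features' shared positive offset and how much from the way they vary together.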

### Part 2

Create a new feature z such that z is a scalar multiple of x:
$$z = cx \text{ where } c \in \mathbb{R}$$

Here we take c = 3.

In [106]:
z = 3 * x
print(f"first three values of x: {x[:3]} and of z : {z[:3]}")

first three values of x: [5.1 4.9 4.7] and of z : [15.3 14.7 14.1]


Computing the values of <x,z> and ||x|| ||z||.

In [109]:
# Dot product of x and z
xzDotProduct = np.dot(x, z)

print(f"<x,z> : {xzDotProduct.round(4)}")

<x,z> : 15671.55


In [123]:
# Computing the lengths of z and x
zLength = np.linalg.norm(z)
print(f"||z|| is: {zLength.round(4)} ||x|| is : {xLength.round(4)}")

||z|| is: 216.8286 ||x|| is : 72.2762


In [125]:
# Computing the product of the two lengths
xzProduct = zLength * xLength
print(f"||x|| ||z|| is : {xzProduct.round(4)}")
print(f"<x,z> : {xzDotProduct}")

||x|| ||z|| is : 15671.55
<x,z> : 15671.549999999997


In [133]:
# Check for equality up to floating-point error
if np.isclose(xzProduct, xzDotProduct):
    print("<x,z> == ||x|| ||z||")
else:
    print("Equality is not evident")

<x,z> == ||x|| ||z||


Equality holds because z = 3x is a positive scalar multiple of x: the two vectors point in the same direction, so the angle between them is zero and their cosine similarity is 1.
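
To make the reasoning explicit, here is a short derivation for a general scalar $c$. It is a sketch that uses only the definitions of the dot product and the norm; the case $c < 0$ is an addition not covered by the computation above.

$$
\begin{aligned}
\langle x, z \rangle &= \langle x, cx \rangle = c \, \|x\|^2, \\
\|x\| \, \|z\| &= \|x\| \, \|cx\| = |c| \, \|x\|^2.
\end{aligned}
$$

So $|\langle x, z \rangle| = \|x\| \, \|z\|$ for every real $c$, and the two sides agree without the absolute value when $c \ge 0$. With $c = 3$ both sides equal $3\|x\|^2$. For $c < 0$ the dot product is negative and the cosine similarity would be $-1$ rather than $1$.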

### Part 3

Choose three features and compute the Euclidean distances between them. Verify the triangle inequality.
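
One possible way to approach this part is sketched below. It treats three feature columns as points A, B, and C and checks $d(A,C) \le d(A,B) + d(B,C)$. To stay self-contained it uses only the five rows of `iris.data` printed at the start, not the full dataset; the helper `euclidean_distance` and the choice of columns 0, 1, and 2 are assumptions made for illustration.

```python
import numpy as np

# First five rows of iris.data, copied from the output shown earlier.
sample = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
])

def euclidean_distance(u, v):
    """Euclidean distance d(u, v) = ||u - v||."""
    return float(np.linalg.norm(u - v))

# Treat three feature columns as the points A, B, and C.
A = sample[:, 0]  # sepal length (cm)
B = sample[:, 1]  # sepal width (cm)
C = sample[:, 2]  # petal length (cm)

d_AB = euclidean_distance(A, B)
d_BC = euclidean_distance(B, C)
d_AC = euclidean_distance(A, C)

# Triangle inequality: going directly from A to C is never longer
# than going through B.
triangle_holds = d_AC <= d_AB + d_BC

print(f"d(A,B) = {d_AB:.4f}, d(B,C) = {d_BC:.4f}, d(A,C) = {d_AC:.4f}")
print(f"d(A,C) <= d(A,B) + d(B,C): {triangle_holds}")
```

To run the same check on the full dataset, replace `sample` with `iris.data`; the rest of the sketch is unchanged.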