## Exercise

Write a function `pdsist(xs)` that returns the matrix of pairwise Euclidean distances between the vectors (rows) in `xs`.

\begin{align}
    d(x, y) = \sqrt{\sum_i (y_i - x_i)^2}
\end{align}


In [1]:
import numpy as np
xs = np.array([[0.20981496, 0.54777461, 0.9398527 ],
       [0.63149939, 0.935947  , 0.29834026],
       [0.46302941, 0.25515557, 0.0698739 ],
       [0.38192644, 0.42378508, 0.26055664],
       [0.46307302, 0.05943961, 0.60204931]])
xs

array([[0.20981496, 0.54777461, 0.9398527 ],
       [0.63149939, 0.935947  , 0.29834026],
       [0.46302941, 0.25515557, 0.0698739 ],
       [0.38192644, 0.42378508, 0.26055664],
       [0.46307302, 0.05943961, 0.60204931]])

In [2]:
np.sqrt(((xs - xs[:, np.newaxis])**2).sum(axis=2))

array([[0.        , 0.86025216, 0.9521589 , 0.71164521, 0.64553997],
       [0.86025216, 0.        , 0.73760151, 0.57098519, 0.9428    ],
       [0.9521589 , 0.73760151, 0.        , 0.26715821, 0.56702329],
       [0.71164521, 0.57098519, 0.26715821, 0.        , 0.50591465],
       [0.64553997, 0.9428    , 0.56702329, 0.50591465, 0.        ]])

In [3]:
a = xs[0,:]
b = xs[1,:]

np.sqrt(((a-b)**2).sum())

0.860252156950211
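
The broadcasting expression used above can be wrapped into a function, as the exercise asks. This is a minimal sketch, not the only way to do it; the name `pairwise_euclidean` and the small integer-friendly test array are chosen here for illustration.

```python
import numpy as np

def pairwise_euclidean(xs):
    """Return the (n, n) matrix of Euclidean distances between the rows of xs."""
    xs = np.asarray(xs, dtype=float)
    # Broadcasting (1, n, d) - (n, 1, d) -> (n, n, d), so diff[i, j] = xs[j] - xs[i]
    diff = xs[np.newaxis, :, :] - xs[:, np.newaxis, :]
    # Sum the squared differences over the coordinate axis, then take the root
    return np.sqrt((diff ** 2).sum(axis=2))

# Illustrative data: three points on a line, spaced 5 apart
pts = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = pairwise_euclidean(pts)
print(D)
```

The result is symmetric with a zero diagonal, which is a quick sanity check for any pairwise distance matrix.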

## Exercise 

Try to implement simple OLS using the linear algebra we went through.

$$
\beta = (X^TX)^{-1}X^Ty
$$
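
For reference, the formula follows from minimizing the sum of squared residuals. A short sketch of the derivation, assuming $X^TX$ is invertible:

$$
\begin{aligned}
L(\beta) &= \lVert y - X\beta \rVert^2 \\
\nabla_\beta L &= -2X^T(y - X\beta) = 0 \\
X^TX\beta &= X^Ty \\
\beta &= (X^TX)^{-1}X^Ty
\end{aligned}
$$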

Validate your results using

```python
np.linalg.lstsq
```

Create the data using

```python
n = 200
np.random.seed(seed=1)
# x co-ordinates
x = np.arange(0, n)/100
X = np.array([x, np.ones(n)]).T
# linearly generated sequence
y = x*10 + np.random.normal(0,3,n)
```

In [4]:
import plotly.express as px

In [10]:
n = 200
np.random.seed(seed=1)
# x co-ordinates
x = np.arange(0, n)/100
X = np.array([x, np.ones(n)])

print(X.shape)

# linearly generated sequence
y = x*10 + np.random.normal(0,3,n)
px.scatter(x=x, y=y, template="none")

(2, 200)


In [12]:
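import numpy.linalg as linalg

# X has shape (2, n), so Xt is the usual (n, 2) design matrix
Xt = X.T
inv = linalg.inv(Xt.T@Xt)  # (X^T X)^{-1}
xy = Xt.T@y                # X^T y
inv@xy                     # beta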

array([10.51394623, -0.19131006])

In [13]:
import numpy.linalg as linalg
# same computation with np.dot; X is (2, n), so np.dot(X, X.T) plays the role of X^T X
inv = linalg.inv(np.dot(X, X.T))
xy = np.dot(X, y)
beta = np.dot(inv, xy)

In [15]:
np.linalg.lstsq(X.T, y, rcond=None)[0]

array([10.51394623, -0.19131006])
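
As an alternative to forming the inverse explicitly, the normal equations $X^TX\beta = X^Ty$ can be solved directly with `np.linalg.solve`. The sketch below assumes the data setup from the exercise statement (with `X` of shape `(n, 2)`); the names `beta_solve` and `beta_lstsq` are introduced here for illustration.

```python
import numpy as np

n = 200
np.random.seed(seed=1)
x = np.arange(0, n) / 100
# (n, 2) design matrix: first column is x (slope), second is ones (intercept)
X = np.array([x, np.ones(n)]).T
y = x * 10 + np.random.normal(0, 3, n)

# Solve (X^T X) beta = X^T y without computing the inverse of X^T X
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from the least-squares routine
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_solve)
print(np.allclose(beta_solve, beta_lstsq))
```

Solving the linear system is generally preferred over an explicit inverse for numerical stability, although for a well-conditioned 2-by-2 system like this one the two approaches should agree closely.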

In [16]:
# obtaining the parameters of regression line
import plotly.graph_objects as go

gamma = np.linalg.lstsq(X.T, y, rcond=None)[0] 
 
# plotting the lines
line = gamma[0]*x + gamma[1] # regression line from np.linalg.lstsq
line1 = beta[0]*x + beta[1] # regression line from the closed-form formula

fig = go.Figure()
# Add traces
fig.add_trace(go.Scatter(x=x, y=y,mode='markers',name='observations'))
fig.add_trace(go.Scatter(x=x, y=line,mode='lines',name='OLS numpy'))
fig.add_trace(go.Scatter(x=x, y=line1,mode='lines',name='OLS formula'))