In [1]:
import numpy as np

# KNN:

## Distances, two loops:

Norm of a vector and distance between vectors:

$$u\cdot v = \lVert u \rVert \cdot \lVert v \rVert \cdot \cos(\theta)$$
$$\overset{u=v}{\Rightarrow} u\cdot u = \lVert u \rVert \cdot \lVert u \rVert \cdot \cos(0) = \lVert u \rVert ^ 2$$
$$u\cdot u = \lVert u \rVert ^ 2$$
Substituting:
$$u=x-y$$
gives:
$$(x-y)\cdot(x-y)=\lVert x-y \rVert ^ 2$$
Since we are working in $\mathbb R^n$, with $u$ and $v$ written as row vectors:
$$u\cdot v = \sum_{i=0}^{N-1}v_iu_i=uv^T$$
Therefore:
$$(x-y)\cdot(x-y)=\lVert x-y \rVert ^ 2=(x-y)(x-y)^T$$
In summary:
$$\lVert x-y \rVert ^ 2=(x-y)(x-y)^T\Rightarrow \lVert x-y \rVert=\sqrt{(x-y)(x-y)^T}$$

Test:

In [2]:
import numpy as np

A = np.array([1,2,3])
B = np.array([4,5,6])
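# (A-B).dot(A-B) is already a scalar, so no transpose is needed
dist = np.sqrt((A-B).dot(A-B))

# Compare floats with a tolerance instead of ==
print(f'Validation of equality with np.linalg.norm is {np.isclose(dist, np.linalg.norm(A-B))}.')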

Validation of equality with np.linalg.norm is True.
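
The same formula can be applied to every (test, train) pair with two nested loops. The following is a minimal sketch under stated assumptions: the function name `compute_distances_two_loops` and the small arrays are hypothetical and only illustrate the idea.

```python
import numpy as np

def compute_distances_two_loops(X_train, X_test):
    """Euclidean distance between every test row and every training row.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    num_test, num_train = X_test.shape[0], X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            diff = X_train[j] - X_test[i]
            # ||x - y|| = sqrt((x - y)(x - y)^T)
            dists[i, j] = np.sqrt(diff.dot(diff))
    return dists

# Illustrative data: two training rows, one test row
X_train = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
X_test = np.array([[1.0, 2.0, 3.0]])
dists_two = compute_distances_two_loops(X_train, X_test)
print(dists_two.shape)
```

Entry `dists_two[i, j]` holds the distance between test row `i` and training row `j`.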


## Predictions:

Implementation of the prediction step:

In [3]:
C = np.array([[1,2,3,6],
              [5,4,3,2],
              [7,8,9,8]])

In [4]:
C[1,:]

array([5, 4, 3, 2])

In [5]:
elementos_1 = C[1,:]

In [6]:
fila_1 = np.argsort(C[1,:])

In [7]:
fila_1

array([3, 2, 1, 0])

`np.argsort` returns the indices that sort the row in ascending order, so the closest elements come first.

In [8]:
k = 2

print(f'The {k} nearest neighbors are those with indices: {fila_1[:k]}.')

The 2 nearest neighbors are those with indices: [3 2].


In [9]:
vec = np.array([0,2,4,6,8,10])

In [10]:
vec[(0,2),]

array([0, 4])

And to choose the prediction:

In [11]:
classes = [1,2,2]

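# np.argmax of the bin counts is already the most common label;
# it must not be used as an index into `classes`
print(f'Most common class is: {np.argmax(np.bincount(classes))}.')

Most common class is: 2.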


In [12]:
len(np.bincount(classes))

3

`np.bincount` builds a vector with one entry for every integer from 0 up to the largest value in the array, and stores at each position the number of times that value appears.

In [13]:
np.bincount(classes)

array([0, 1, 2])

In [14]:
classes = [1,1,2,3,3,3,4,4,4,4]

print(f'Most common class is: {np.argmax(np.bincount(classes))}.')

Most common class is: 4.


In [15]:
np.bincount(classes)

array([0, 2, 1, 3, 4])

In [16]:
np.argmax(np.bincount(classes))

4

`np.argmax` returns the first maximum, so ties are already broken in favor of the smallest label, as the assignment statement requires.
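
Putting the pieces together (`np.argsort` for the nearest indices, `np.bincount` plus `np.argmax` for the vote), the prediction step could look roughly like this. It is a sketch, not a reference implementation: the name `predict_labels`, its signature, and the toy data are assumptions, and the labels are assumed to be non-negative integers because `np.bincount` requires that.

```python
import numpy as np

def predict_labels(dists, y_train, k=1):
    """Majority vote among the k nearest training points (hypothetical helper)."""
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test, dtype=int)
    for i in range(num_test):
        # Indices of the k smallest distances in row i
        nearest = np.argsort(dists[i])[:k]
        closest_y = y_train[nearest]
        # argmax returns the first maximum, so ties go to the smallest label
        y_pred[i] = np.argmax(np.bincount(closest_y))
    return y_pred

# Toy distance matrix: 2 test points, 4 training points
dists = np.array([[0.5, 0.1, 0.9, 0.2],
                  [0.3, 0.8, 0.1, 0.7]])
y_train = np.array([2, 1, 0, 1])
y_pred = predict_labels(dists, y_train, k=3)
print(y_pred)  # [1 0]
```

For the second test point the three neighbors have labels 0, 2 and 1 (a three-way tie), and the smallest label, 0, is returned.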

In [17]:
arr = []

arr = np.append(arr,[1,2])

In [18]:
arr

array([1., 2.])

In [19]:
arr = np.append(arr,[1,2])

In [20]:
arr

array([1., 2., 1., 2.])

## Distances, one loop:

The goal is to compute the same quantity:

$$\lVert x-y \rVert ^ 2=(x-y)(x-y)^T\Rightarrow \lVert x-y \rVert=\sqrt{(x-y)(x-y)^T},$$

where $x = \text{X}_{\text{train}}[j]$ and $y = \text{X}_{\text{test}}[i]$.

This time only a single loop is allowed, so NumPy's `broadcasting` has to replace the inner loop.

The loop runs over the test images, so the vector $y$ is available as before, but instead of a single vector $x$ we work with the whole matrix $\text{X}_{\text{train}}$. Broadcasting subtracts $y$ from every row of $\text{X}_{\text{train}}$, and the squared differences are summed over the feature index $k$:

$$\lVert \text{X}_{\text{train}}[j]-y \rVert ^ 2=\sum_{k}(\text{X}_{\text{train}}[j,k]-y_k)^2.$$

This sum replaces the inner product $(x-y)(x-y)^T$, which can no longer be written directly because $\text{X}_{\text{train}}-y$ is now a matrix rather than a row vector. Recall that $y = \text{X}_{\text{test}}[i]$.

Finally, entry $j$ of the $i$-th row of the distance matrix is:

$$d[i,j]=\sqrt{\sum_{k}(\text{X}_{\text{train}}[j,k]-y_k)^2}.$$
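
A possible one-loop version, relying on broadcasting to subtract one test row from every training row at once. The function name `compute_distances_one_loop` and the random data are assumptions made for this sketch.

```python
import numpy as np

def compute_distances_one_loop(X_train, X_test):
    """One loop over test rows; broadcasting handles all training rows."""
    num_test, num_train = X_test.shape[0], X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        # (num_train, D) - (D,) broadcasts the test row over every training row;
        # summing over axis=1 adds up the squared differences per training row
        dists[i, :] = np.sqrt(np.sum((X_train - X_test[i]) ** 2, axis=1))
    return dists

# Illustrative random data
rng = np.random.default_rng(0)
X_train = rng.standard_normal((5, 4))
X_test = rng.standard_normal((3, 4))
dists_one = compute_distances_one_loop(X_train, X_test)
print(dists_one.shape)  # (3, 5)
```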

## Distances, no loops:

In this case, simply note that:

$$\lVert x-y \rVert ^ 2 = \lVert x \rVert ^ 2 + \lVert y \rVert ^ 2 - 2\, x y^T$$

The squared norms of the test rows and of the training rows have different shapes, so they must be aligned for broadcasting with `[:, np.newaxis]` and `[np.newaxis, :]`.
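
As a rough sketch of this expansion, the three terms can be computed separately and combined by broadcasting. The function name `compute_distances_no_loops` is hypothetical, and clipping at zero before the square root is an added safeguard against small negative values from rounding, not something stated above.

```python
import numpy as np

def compute_distances_no_loops(X_train, X_test):
    """Fully vectorized distances via ||x||^2 + ||y||^2 - 2 x y^T."""
    test_sq = np.sum(X_test ** 2, axis=1)[:, np.newaxis]     # shape (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1)[np.newaxis, :]   # shape (1, num_train)
    cross = X_test.dot(X_train.T)                            # shape (num_test, num_train)
    sq_dists = test_sq + train_sq - 2 * cross
    # Assumption: clip tiny negatives caused by floating-point rounding
    return np.sqrt(np.maximum(sq_dists, 0.0))

# Illustrative random data
rng = np.random.default_rng(1)
X_train = rng.standard_normal((6, 4))
X_test = rng.standard_normal((2, 4))
dists_none = compute_distances_no_loops(X_train, X_test)
print(dists_none.shape)  # (2, 6)
```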

## Cross Validation:

In [21]:
# Original vector
v = [[1, 2], [3, 4], [5, 6],[7,8],[9,10]]

# Index i
i = 3

# Extract the vector at index i
c = v[i]

# Combine all other vectors into a single list using list comprehension
d = [item for sublist in v[:i] + v[i+1:] for item in sublist]

print(c) 
print(d) 


[7, 8]
[1, 2, 3, 4, 5, 6, 9, 10]
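
The same "hold out fold `i`, merge the rest" idea can be written with NumPy arrays. This is a sketch under assumptions: the data, the number of folds, and the variable names (`X_folds`, `X_val`, `X_tr`, and so on) are made up for illustration.

```python
import numpy as np

# Illustrative data: 10 samples with 2 features each, labels 0..9
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

num_folds = 5
# np.array_split returns a list of arrays, one per fold
X_folds = np.array_split(X, num_folds)
y_folds = np.array_split(y, num_folds)

i = 3  # index of the fold held out for validation
X_val, y_val = X_folds[i], y_folds[i]

# Concatenate every fold except fold i to form the training split
X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])

print(y_val)  # [6 7]
print(y_tr)   # [0 1 2 3 4 5 8 9]
```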


# Softmax:

## Gradients & Loss

In [22]:
X = np.random.randn(500, 3073)
W = np.random.randn(3073, 10) * 0.0001

In [23]:
s = X.dot(W)

In [24]:
s.shape

(500, 10)

In [25]:
loss = 0

In [26]:

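# Softmax loss for sample 1, assuming its correct class is 1:
# normalize over the classes of that sample only (row s[1]), not over all of s
loss = loss - np.log(np.exp(s[1,1]) / np.sum(np.exp(s[1])))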

In [27]:
s.shape

(500, 10)

In [28]:
2**np.array([1,2,3,4,5,6])

array([ 2,  4,  8, 16, 32, 64])

In [29]:
reg = 0
h = 0.0001
grad = ((W+h)-W)/h
dW = W - reg * grad

In [30]:
p = np.array([1,2,3])

y = np.array([0.1,0.2,0.3])

j = 1

i = 1

p-(j == y[i])


array([1, 2, 3])

In [31]:
j == y[i]

False

## Gradients & Loss Vect

In [32]:
A = np.array([[1,2],[3,4]])

In [33]:
print(A)

[[1 2]
 [3 4]]


In [34]:
np.sum(A)

10

In [35]:
y = [0,1]
# Row n, column y[n]: the score of the correct class for each sample
s_correct = A[range(2), y]

In [36]:
s_correct

array([1, 4])
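
The indexing trick above (one correct-class entry per row) is the core of a vectorized softmax loss. The sketch below is one possible way to write it, not a confirmed solution: the function name, the `reg * sum(W^2)` regularization convention, and the random labels `y` are all assumptions.

```python
import numpy as np

def softmax_loss_vectorized(W, X, y, reg):
    """Mean softmax cross-entropy loss and its gradient w.r.t. W (sketch)."""
    num_train = X.shape[0]
    scores = X.dot(W)                                # (N, C)
    # Subtract the row-wise max for numerical stability
    scores = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)        # softmax per row
    # Probability assigned to the correct class of each sample
    correct = probs[np.arange(num_train), y]
    loss = -np.log(correct).mean() + reg * np.sum(W * W)
    # Gradient of the loss w.r.t. the scores: probs minus one-hot labels
    dscores = probs.copy()
    dscores[np.arange(num_train), y] -= 1
    dW = X.T.dot(dscores) / num_train + 2 * reg * W
    return loss, dW

# Same shapes as the cells above; labels are random placeholders
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3073))
W = rng.standard_normal((3073, 10)) * 0.0001
y = rng.integers(0, 10, size=500)
loss, dW = softmax_loss_vectorized(W, X, y, reg=0.0)
print(dW.shape)  # (3073, 10)
```

With weights this small the predicted distribution is close to uniform, so the loss should be close to $\log(10)$.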

# ANN:

# Layers

## Affine forward:

    Inputs:
    - x: A numpy array containing input data, of shape (N, d_1, ..., d_k)
    - w: A numpy array of weights, of shape (D, M)
    - b: A numpy array of biases, of shape (M,)

    Returns a tuple of:
    - out: output, of shape (N, M)
    - cache: (x, w, b)

Since the input $x$ has shape $(N, d_1, ..., d_k)$, it must first be converted into a 2D matrix of shape $(\text{batch\_size},\text{num\_features})$:

$$\text{batch\_size} = N$$

In Python:

```python
batch_size = x.shape[0]
```

For $\text{num\_features}$, the sizes of all the remaining dimensions must be multiplied:

$$ \text{num\_features} = d_1 \cdot d_2 \cdot \cdots \cdot d_k $$

In Python:

```python
num_features = np.prod(x.shape[1:])
```

Then the input $x$ is reshaped to $(\text{batch\_size},\text{num\_features})$ with:

```python

x_reshaped = x.reshape(batch_size, num_features)

```
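
Combining the reshape with the affine map, the forward pass might be sketched as follows. The body is an assumption consistent with the docstring above (inputs `x`, `w`, `b`; outputs `out` and `cache`), and the example arrays are illustrative only.

```python
import numpy as np

def affine_forward(x, w, b):
    """Flatten each sample to a row, then compute out = x_reshaped . w + b."""
    batch_size = x.shape[0]
    num_features = np.prod(x.shape[1:])
    x_reshaped = x.reshape(batch_size, num_features)   # (N, D)
    out = x_reshaped.dot(w) + b                        # (N, M); b broadcasts over rows
    cache = (x, w, b)
    return out, cache

# Illustrative input: N=2 samples of shape (3, 4), so D=12, and M=5 outputs
x = np.arange(24, dtype=float).reshape(2, 3, 4)
w = np.ones((12, 5))
b = np.zeros(5)
out, cache = affine_forward(x, w, b)
print(out.shape)  # (2, 5)
```

The original, unflattened `x` is stored in the cache so that the backward pass can restore the input shape.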

In [38]:
np.prod([1,2,3])

6

In [40]:
print(f'Original = {np.arange(6)}.')
print(f'Reshaped = {np.arange(6).reshape((3, 2))}.')


Original = [0 1 2 3 4 5].
Reshaped = [[0 1]
 [2 3]
 [4 5]].


In [42]:
x = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]
])

print("Original array shape:", x.shape)
print("Original array:\n", x)

Original array shape: (2, 3, 4)
Original array:
 [[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]


In [45]:
x[0]

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [None]:
row_dim = x.shape[0]
col_dim = np.prod(x.shape[1:])

x_reshaped = x.reshape(row_dim, col_dim)
print("Reshaped array shape:", x_reshaped.shape)
print("Reshaped array:\n", x_reshaped)



## Affine backward:

Function code:

    Inputs:
    - dout: Upstream derivative, of shape (N, M)
    - cache: Tuple of:
      - x: Input data, of shape (N, d_1, ... d_k)
      - w: Weights, of shape (D, M)
      - b: Biases, of shape (M,)

    Returns a tuple of:
    - dx: Gradient with respect to x, of shape (N, d1, ..., d_k)
    - dw: Gradient with respect to w, of shape (D, M)
    - db: Gradient with respect to b, of shape (M,)
    
    x, w, b = cache
    dx, dw, db = None, None, None
  


Recall that the forward pass computes $\text{out}$ as:

$$ y = x_\text{reshaped}\cdot W + b $$

The backward pass receives the upstream gradient $d_\text{out} = \partial L / \partial y$ and uses the chain rule to compute $dx$, $dW$ and $db$:

$$ \frac{\partial L}{\partial x_\text{reshaped}} = \underbrace{\frac{\partial L}{\partial y}}_{d_\text{out}} \cdot W^{T} \in \mathcal M_{\mathbb R}{(N,D)}$$

$$ \frac{\partial L}{\partial W} = x_\text{reshaped}^T \cdot \underbrace{\frac{\partial L}{\partial y}}_{d_\text{out}} \in \mathcal M_{\mathbb R}{(D,M)}$$

$$ \frac{\partial L}{\partial b} = 1_{N}^T \cdot \underbrace{\frac{\partial L}{\partial y}}_{d_\text{out}} = \sum_{n=1}^{N} d_\text{out}[n,:] \in \mathcal M_{\mathbb R}(M)$$

The order of the products matters: $d_\text{out}$ has shape $(N,M)$ and $W^T$ has shape $(M,D)$, so $d_\text{out}\cdot W^T$ has shape $(N,D)$ and is then reshaped back to $(N,d_1,\cdots,d_k)$ to give $dx$.

It is important to note, from the previous exercise, that:

$$ D = \text{num\_features} = d_1 \cdot d_2 \cdot \cdots \cdot d_k $$
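
A minimal sketch of the backward pass, following the shapes in the docstring above. The function body and the random test data are assumptions for illustration.

```python
import numpy as np

def affine_backward(dout, cache):
    """Gradients of an affine layer given the upstream gradient dout (N, M)."""
    x, w, b = cache
    x_reshaped = x.reshape(x.shape[0], -1)   # (N, D)
    dx = dout.dot(w.T).reshape(x.shape)      # (N, D) restored to x's original shape
    dw = x_reshaped.T.dot(dout)              # (D, M)
    db = dout.sum(axis=0)                    # (M,), sum over the batch axis
    return dx, dw, db

# Illustrative shapes: N=2, (d_1, d_2)=(3, 4) so D=12, M=5
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4))
w = rng.standard_normal((12, 5))
b = rng.standard_normal(5)
dout = rng.standard_normal((2, 5))
dx, dw, db = affine_backward(dout, (x, w, b))
print(dx.shape, dw.shape, db.shape)  # (2, 3, 4) (12, 5) (5,)
```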

In [46]:
A = np.array(
    [[1,2,3],
     [1,2,3],
     [1,2,3]]
)

In [50]:
# For db = 1^T · dout: sum dout over the batch axis
np.sum(A,axis=0)

array([3, 6, 9])

## ReLU:

In [None]:
A = np.array([1,-2,3,-4,5])

print(np.maximum(0,A))


In [54]:
B = np.array([[1,2],[-1,-1]])

In [59]:
B.dot([1,2])

array([ 5, -3])

In [60]:
C = np.maximum(0,B.dot([1,2]))
print(C)

[5 0]


In [61]:
C > 0

array([ True, False])
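
The two operations tried above, `np.maximum(0, x)` and the boolean mask `x > 0`, are enough for a ReLU forward and backward pass. The sketch below assumes the same `(out, cache)` convention as the affine layer; the function names are hypothetical.

```python
import numpy as np

def relu_forward(x):
    """Elementwise max(0, x); the input is cached for the backward pass."""
    out = np.maximum(0, x)
    cache = x
    return out, cache

def relu_backward(dout, cache):
    """Pass the gradient through only where the input was positive."""
    x = cache
    return dout * (x > 0)

x = np.array([[1.0, -2.0], [3.0, -4.0]])
out, cache = relu_forward(x)
dx = relu_backward(np.ones_like(x), cache)
print(out)  # [[1. 0.]
            #  [3. 0.]]
print(dx)   # [[1. 0.]
            #  [1. 0.]]
```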

## Backward for Two Layer Net

In [62]:
a = np.array([1,2,3])
print(a**2)

[1 4 9]


In [63]:
np.exp(1) == np.e ** 1

True