# Symbols and Conventions
*This page lists the symbols, notation, and conventions used in this project.*
***
Suppose there are $n$ training examples. The predictors of each example are represented by a vector $x_i$ whose length equals the number of features, $d$. Each label is a scalar represented by $y_i$.

The predictor variables are collected in a matrix $X$ of shape $n \times d$, in which each row is one training example. The labels are collected in a matrix $Y$ of shape $n \times 1$, in which row $i$ holds the scalar label $y_i$ of the $i$-th training example.

The weight of each feature in the prediction of the label is represented by $w_j$, where $j \in \{1,2,3,...,d\}$. The weight matrix is represented by $W$, of shape $d \times 1$.

$$X = \left[ \begin{matrix}
x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1d} \\
x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2d} \\
x_{31} & x_{32} & \cdots & x_{3j} & \cdots & x_{3d} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{id} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{nd} \\
\end{matrix} \right]$$

$$Y = \left[ \begin{matrix}
y_1 \\
y_2 \\
y_3 \\
\vdots \\
y_i \\
\vdots \\
y_n \\
\end{matrix} \right]$$

$$W = \left[ \begin{matrix}
w_1 \\
w_2 \\
w_3 \\
\vdots \\
w_j \\
\vdots \\
w_d \\
\end{matrix} \right]$$

$$i \in \{1,2,3,...,n\} \text{ and } j \in \{1,2,3,...,d\}$$

The regularization parameter is represented by $\lambda$.

A generalized form of a linear model can be written as
$$g(Y) = XW$$
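
As a minimal sketch of the model above, assume the identity link $g(Y) = Y$ (plain linear regression), so the predictions are simply the matrix product $XW$. The array values and the names `X`, `W`, and `Y_hat` are made up for illustration and are not taken from the project's code.

```python
import numpy as np

n, d = 4, 3  # illustrative sizes: 4 examples, 3 features

# Design matrix of shape (n, d); each row is one training example.
X = np.arange(n * d, dtype=float).reshape(n, d)

# Weight matrix of shape (d, 1); values chosen arbitrarily.
W = np.array([[0.5], [-1.0], [2.0]])

# With the identity link (an assumption), predictions are X @ W.
Y_hat = X @ W

print(Y_hat.shape)  # (4, 1)
```

The result has one row per training example, matching the $n \times 1$ shape used for $Y$.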

A generalized loss function can be written as
$$\mathcal{L}(y_i, \hat{y}_i) = f(y_i, x_i W)$$
where $x_i$ denotes the $i$-th row of $X$, so that $x_i W$ is a scalar.

A generalized cost function with regularization penalty can be written as
$$F(W) = \sum_{i=1}^n\mathcal{L}(y_i, x_iW) + \lambda \Omega(W)$$
where $\Omega(W)$ is the regularization penalty on the weights.
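
To make the cost function concrete, the sketch below picks one possible instance: squared error for $\mathcal{L}$ and the squared L2 norm for $\Omega$. Both choices, and the function names `squared_loss`, `l2_penalty`, and `cost`, are assumptions for illustration. The general form allows other losses and penalties.

```python
import numpy as np

def squared_loss(y, y_hat):
    """Per-example loss L(y_i, y_hat_i); squared error is an assumed choice."""
    return (y - y_hat) ** 2

def l2_penalty(W):
    """Omega(W); the squared L2 norm is an assumed choice."""
    return float(np.sum(W ** 2))

def cost(X, Y, W, lam):
    """F(W) = sum_i L(y_i, x_i W) + lam * Omega(W)."""
    Y_hat = X @ W  # shape (n, 1)
    return float(np.sum(squared_loss(Y, Y_hat))) + lam * l2_penalty(W)

# Illustrative data: n = 3 examples, d = 2 features.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Y = np.array([[1.0], [2.0], [3.0]])
W = np.array([[0.5], [0.25]])

print(cost(X, Y, W, lam=0.1))
```

Setting `lam=0` recovers the unregularized sum of per-example losses.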


## Implementation Notes
In the Python implementation using [NumPy](https://www.numpy.org/), $X$ must have shape `(n, d)` and $Y$ must have shape `(n, 1)`. If you pass $Y$ with shape `(n,)`, it is automatically converted to `(n, 1)` with a warning. The predictions $\hat{Y}$ have shape `(n, 1)`, and the weight vector $W$ has shape `(d, 1)`.
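
The shape conversion described above could look roughly like the following. This is a sketch only: `as_column` is a hypothetical helper name, and the project's actual function and warning message may differ.

```python
import warnings

import numpy as np

def as_column(Y):
    """Return Y with shape (n, 1).

    Hypothetical helper, not the project's own code: a 1-D array of
    shape (n,) is reshaped to (n, 1) and a warning is issued.
    """
    Y = np.asarray(Y)
    if Y.ndim == 1:
        warnings.warn("Y has shape (n,); reshaping to (n, 1).", UserWarning)
        Y = Y.reshape(-1, 1)
    return Y

Y = as_column(np.array([1.0, 2.0, 3.0]))  # emits a UserWarning
print(Y.shape)  # (3, 1)
```

An array that already has shape `(n, 1)` passes through unchanged and without a warning.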