# Eigenvectors

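> Eigenvectors are usually reported as unit vectors, which means that their length or magnitude is equal to 1.0. Any non-zero multiple of an eigenvector is also an eigenvector, so normalising to unit length is a convention rather than a requirement.<br>
  They are often referred to as right eigenvectors, which simply means column vectors (as opposed to row vectors, or left eigenvectors).<br>
  A right eigenvector is a vector in the usual column form.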

> An eigenvector is a vector that stays on the same line when a linear transformation is applied to it: it may be stretched, shrunk or flipped, but it is not rotated off that line.<br> Consider the image below in which three vectors are shown.<br> The green square is only drawn to illustrate the linear transformation that is applied to each of these three vectors.

<img src="images_detail/eigenvectors.png" style="width:300px; background:white;"/>

> Eigenvectors (red) do not change direction when a linear transformation (e.g. scaling) is applied to them. Other vectors (yellow) do.
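
> As a small numerical check of this idea, the sketch below applies a non-uniform scaling to one vector that lies along an axis and to one that does not. The matrix `A` and the helper `same_line` are illustrative assumptions, not part of the original example.

```python
import numpy as np

# Hypothetical transformation: stretch x by 2, leave y unchanged
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

def same_line(u, v, tol=1e-12):
    """Return True if the 2-D vectors u and v are parallel (2-D cross product is ~0)."""
    return abs(u[0] * v[1] - u[1] * v[0]) < tol

along_axis = np.array([1.0, 0.0])  # an eigenvector of A
diagonal = np.array([1.0, 1.0])    # not an eigenvector of A

print(same_line(along_axis, A @ along_axis))  # True: only the length changes
print(same_line(diagonal, A @ diagonal))      # False: the vector leaves its line
```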

> Example of computing eigenvectors with NumPy
***

In [8]:
from numpy import array
from numpy.linalg import eig
# define matrix
Arr= array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(Arr)
# calculating the values
values, vectors = eig(Arr)
print("\nEigen vectors from numpy library\n")
print(vectors)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Eigen vectors from numpy library

[[-0.23197069 -0.78583024  0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735   0.61232756  0.40824829]]
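
> The columns returned by `eig` are the eigenvectors, and NumPy normalises each of them to unit length. A short sketch to check this (the variable names `A` and `norms` are illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
values, vectors = np.linalg.eig(A)

# Each column of `vectors` is one eigenvector; measure the length of every column
norms = np.linalg.norm(vectors, axis=0)
print(np.allclose(norms, 1.0))  # True
```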


***

# Eigenvalues

> An eigenvalue is the factor by which its eigenvector is scaled when the transformation is applied: if A is the matrix, v an eigenvector and λ its eigenvalue, then Av = λv.<br>
  For example, a negative eigenvalue reverses the direction of the eigenvector as part of scaling it.
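
> A minimal sketch of the scaling role of eigenvalues, assuming a hypothetical diagonal matrix so that the eigenvalues (2 and -1) can be read directly off the diagonal:

```python
import numpy as np

# Hypothetical diagonal matrix with eigenvalues 2 and -1
A = np.array([[2.0, 0.0],
              [0.0, -1.0]])

v_pos = np.array([1.0, 0.0])  # eigenvector for eigenvalue 2
v_neg = np.array([0.0, 1.0])  # eigenvector for eigenvalue -1

print(A @ v_pos)  # equals 2 * v_pos: twice as long, same direction
print(A @ v_neg)  # equals -1 * v_neg: same length, direction reversed
```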

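> Example from the code above. The third eigenvalue is zero up to floating-point rounding error, because this matrix is singular.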
***

In [10]:
print(values)

[ 1.61168440e+01 -1.11684397e+00 -9.75918483e-16]


***

> Confirming the eigenvectors

In [11]:
B = Arr.dot(vectors[:, 0])
print(B)

C = vectors[:, 0] * values[0]
print(C)

[ -3.73863537  -8.46653421 -13.19443305]
[ -3.73863537  -8.46653421 -13.19443305]


> The example multiplies the original matrix by the first eigenvector and compares the result to the first eigenvector multiplied by the first eigenvalue.<br>
Running the example prints the results of these two multiplications, which give the same vector, as we would expect.
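
> The same check can be run for every eigenpair at once. This is a sketch using matrix multiplication and broadcasting; the names `left` and `right` are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
values, vectors = np.linalg.eig(A)

# A @ vectors applies A to every eigenvector (column) at once;
# vectors * values scales column i by values[i] through broadcasting
left = A @ vectors
right = vectors * values
print(np.allclose(left, right))  # True
```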

<br><br><br><br><br>
# Regularisation

>Before explaining regularisation, let me explain underfitting and overfitting.
***
### Underfitting:
   
> If a model does not do well even on the training set, this condition is known as high bias or underfitting.

- Why underfitting happens:
    >too few features<br>
    >features are not scaled<br>
    >too few samples<br>
    >hyper-parameters are not tuned properly<br>
    >the algorithm is not sensitive enough to the dataset

- How to increase the number of features:
    >we may use polynomial features to add extra features<br>
    >scikit-learn's PolynomialFeatures class can be used to implement this
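
> A minimal sketch of adding polynomial features with scikit-learn. The toy array `X` and the choice of `degree=2` are assumptions made for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data: 3 samples with a single feature x
X = np.array([[1.0], [2.0], [3.0]])

# degree=2 expands each sample to the columns [1, x, x^2]
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

print(X_poly.shape)  # (3, 3): one feature has become three
print(X_poly)
```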
***
***
### Overfitting:

  >If a model does well on the training set but poorly on the testing dataset, this condition is known as high variance or overfitting.
  
  - To tackle this high variance problem we use two methods:
    >reducing the number of features<br>
    >regularisation

>Consider the case of fitting a polynomial of degree 10. The hypothesis would look like:<br><br>
<img src="images_detail/hypothesis.png" style="width:400px" /> <br>
If we penalise theta_10 and make it very small (almost equal to zero), the hypothesis is effectively reduced to a polynomial of degree 9, which may fit the data better.<br>
This is the general idea behind regularisation. Instead of shrinking just one parameter, we penalise all the parameters. This gives rise to a simpler hypothesis that is less prone to overfitting.
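
> The effect of penalising all the parameters can be sketched with NumPy alone. The data, the helper `fit` and the penalty strength `lam=1.0` are assumptions chosen for illustration; for simplicity this sketch uses a squared (L2) penalty and also penalises the constant term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy samples of a smooth curve
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(np.pi * x) + 0.2 * rng.standard_normal(x.size)

# Design matrix with columns 1, x, x^2, ..., x^10 (a degree-10 hypothesis)
X = np.vander(x, N=11, increasing=True)

def fit(X, y, lam):
    """Minimise ||X @ theta - y||^2 + lam * ||theta||^2 in closed form."""
    n_params = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_params), X.T @ y)

theta_plain = fit(X, y, lam=0.0)  # no penalty
theta_reg = fit(X, y, lam=1.0)    # every parameter is penalised

print(np.linalg.norm(theta_plain))
print(np.linalg.norm(theta_reg))  # smaller: the parameters have been shrunk
```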

> The cost function of a regression model is:<br>
<img src="images_detail/cost_fun.png" style="width:400px" />

>There are generally two types of regularisation:

### 1. L1 regularization or LASSO regression.
### 2. L2 regularization or Ridge regression.

***
# Lasso regression

<img src="images_detail/lasso.jpg" style="width:800px" />

***
<br>

***
# Ridge regression

<img src="images_detail/ridge.jpg" style="width:800px" />

***

- For a practical implementation of lasso and ridge regression models, use:<br>
<code>from sklearn.linear_model import Ridge, Lasso <br>
rd = Ridge(alpha=30)<br>
ls = Lasso(alpha=30)</code>
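
> A sketch of fitting both models on synthetic data, to show the typical difference between them: ridge shrinks coefficients, while lasso can set some of them exactly to zero. The data and the `alpha` values here are assumptions chosen for this toy problem, not recommended settings.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)

# Hypothetical data: 5 features, but only the first two influence y
X = rng.standard_normal((200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(200)

rd = Ridge(alpha=1.0).fit(X, y)
ls = Lasso(alpha=0.5).fit(X, y)

print(rd.coef_)  # the three irrelevant coefficients are small but non-zero
print(ls.coef_)  # the three irrelevant coefficients are exactly zero
```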

# Logistic Sigmoid and Tanh as Basic Functions
***

## Logistic Sigmoid:

>The input to the function is transformed into a value between 0.0 and 1.0.

>For a long time, through the early 1990s, it was the default activation function used in neural networks.

<CODE>import tensorflow as tf<br>
values = [5.0, 0.0, -2.3]<br>
print(tf.nn.sigmoid(values))</CODE>
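
> For reference, the same function can be written out with NumPy. This is a sketch of the formula 1 / (1 + exp(-x)); the helper name `sigmoid` is illustrative.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

values = np.array([5.0, 0.0, -2.3])
out = sigmoid(values)
print(out)  # roughly [0.993, 0.5, 0.091], all strictly between 0 and 1
```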

***

## Tanh (hyperbolic tangent):


>The input to the function is transformed into a value between -1.0 and 1.0.

>In the later 1990s and through the 2000s, the tanh function was preferred over the sigmoid activation function.

- Note: the hyperbolic tangent activation function typically performs better than the logistic sigmoid.
	
<CODE>import tensorflow as tf<br>
values = [5.0, 0.0, -2.3]<br>
print(tf.nn.tanh(values))</CODE>
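
> A NumPy sketch of the same computation. It also checks the identity tanh(x) = 2 * sigmoid(2x) - 1, which shows that tanh is a rescaled and shifted sigmoid; the variable names are illustrative.

```python
import numpy as np

values = np.array([5.0, 0.0, -2.3])
out = np.tanh(values)
print(out)  # roughly [0.9999, 0.0, -0.98], all between -1 and 1

# tanh(x) = 2 * sigmoid(2x) - 1
sigmoid_2x = 1.0 / (1.0 + np.exp(-2.0 * values))
print(np.allclose(out, 2.0 * sigmoid_2x - 1.0))  # True
```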

***

- Problems with both:

>A general problem with both the sigmoid and tanh functions is that they saturate. This means that large values snap to 1.0, and small values snap to -1.0 for tanh and to 0.0 for sigmoid. <br>
Saturate: the output is squeezed against the limits of its range, so further changes in the input barely change the output.
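
> Saturation can be seen in the derivatives, which shrink towards zero as the input grows. A sketch using the standard derivative formulas; the sample points in `x` are arbitrary.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.0, 2.0, 10.0])

# Derivatives: sigmoid'(x) = s(x) * (1 - s(x)), tanh'(x) = 1 - tanh(x)^2
sigmoid_grad = sigmoid(x) * (1.0 - sigmoid(x))
tanh_grad = 1.0 - np.tanh(x) ** 2

print(sigmoid_grad)  # largest at x = 0 (0.25), close to zero at x = 10
print(tanh_grad)     # largest at x = 0 (1.0), close to zero at x = 10
```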
