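Hello everyone, and welcome to the second module: Data Analysis and Machine Learning!

In this module we first survey the data analysis landscape, then study the mathematical principles behind classic machine learning algorithms in depth and look at their Python implementations. Compared with other big data courses, we place more emphasis on the mathematical foundations of machine learning algorithms.

We hope these two months of study will help you build a solid foundation in machine learning!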

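## Part 1: Matrix derivatives

$a$, $b$ and $c$ are column vectors, $X$ and $D$ are matrices ($D$ is square), and $x$ is a vector. All sizes are assumed compatible.

Compute the following derivatives:
- $\frac{\partial x^Tb}{\partial x}$



- $\frac{\partial a^T X b}{\partial X}$



- $\frac{\partial (Xb+c)^T D(Xb+c)}{\partial X}$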

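## Part 2: Derivatives of the trace
Definition: the trace of a square matrix $X$ is the sum of the elements on its diagonal, that is,
$tr(X) = \sum_i X_{ii} $.

Compute the following derivatives:
- $\frac{\partial tr(X)}{\partial X}$


- $\frac{\partial tr(XA)}{\partial X}$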


### <font color = red>**Solutions**</font>  
### Part 1: Matrix derivatives

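- Problem 1:    

Solution: $\frac{\partial x^Tb}{\partial x}$

Since $x^Tb = b^Tx = \sum_i b_i x_i$

and $\frac{\partial}{\partial x_j} \sum_i b_i x_i = b_j$

it follows that $\frac{\partial x^Tb}{\partial x} = \frac{\partial b^Tx}{\partial x} = b$

Here the derivative is written with the same shape as $x$, the same convention under which Problem 2 yields $ab^T$. Under the numerator-layout convention the same result is written as the row vector $b^T$.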

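<br>   
<br>  
- Problem 2:  
Solution: writing $a^T X b = \sum_{i,j} a_i X_{ij} b_j$ gives $\frac{\partial a^T X b}{\partial X_{ij}} = a_i b_j$, so $\frac{\partial a^T X b}{\partial X} = ab^T$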
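
The first two results can be sanity-checked numerically. The sketch below compares them against central finite differences; the helper name `numerical_gradient`, the random test values, and the chosen shapes are assumptions made for illustration and are not part of the original exercise.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference gradient of a scalar function f, with the same shape as X."""
    X = np.asarray(X, dtype=float)
    grad = np.zeros_like(X)
    for idx in np.ndindex(X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)   # arbitrary test data
a = rng.standard_normal(3)
b = rng.standard_normal(4)
x = rng.standard_normal(4)
X = rng.standard_normal((3, 4))

grad_x = numerical_gradient(lambda v: v @ b, x)      # d(x^T b)/dx
grad_X = numerical_gradient(lambda M: a @ M @ b, X)  # d(a^T X b)/dX

print(np.allclose(grad_x, b))                # compare with b
print(np.allclose(grad_X, np.outer(a, b)))   # compare with a b^T
```

Both functions are linear in their argument, so the finite-difference estimate should agree with the analytic result up to rounding error.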



<br>   
<br>   
- Problem 3:  
Solution: $\frac{\partial (Xb+c)^T D(Xb+c)}{\partial X}$

= $\frac{\partial}{\partial X} (b^T X^T DXb + c^T DXb + b^T X^T Dc + c^TDc)$

1. $\frac{\partial}{\partial X} (b^T X^T DXb) = (D + D^T) X b b^T$ (with $u = Xb$, $\frac{\partial u^T D u}{\partial u} = (D + D^T)u$ and $\frac{\partial u_i}{\partial X_{ij}} = b_j$)

2. $\frac{\partial}{\partial X} (c^TD Xb) = \frac{\partial}{\partial X} ((D^T c)^T X b) = D^T c b^T$

3. $\frac{\partial}{\partial X} (b^T X^T Dc) = \frac{\partial}{\partial X} ((Dc)^T X b) = D c b^T$

4. $\frac{\partial}{\partial X} (c^TDc) = 0 $


∴ $\frac{\partial (Xb+c)^T D(Xb+c)}{\partial X}$ = $(D + D^T) X b b^T + D^T c b^T + D c b^T$  
 = $(D + D^T)(Xb + c)b^T$
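
As a hedged check of the quadratic-form result, the sketch below compares $(D + D^T)(Xb + c)b^T$ against central finite differences. The shapes (`m = 3`, `n = 4`), the random values, and the deliberately non-symmetric `D` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary test data
m, n = 3, 4
X = rng.standard_normal((m, n))
b = rng.standard_normal(n)
c = rng.standard_normal(m)
D = rng.standard_normal((m, m))  # not symmetric, so D and D^T both matter

def f(M):
    """Scalar function (Mb + c)^T D (Mb + c)."""
    r = M @ b + c
    return r @ D @ r

# Central finite differences, one entry of X at a time.
eps = 1e-6
numeric = np.zeros_like(X)
for i in range(m):
    for j in range(n):
        step = np.zeros_like(X)
        step[i, j] = eps
        numeric[i, j] = (f(X + step) - f(X - step)) / (2 * eps)

analytic = np.outer((D + D.T) @ (X @ b + c), b)   # (D + D^T)(Xb + c) b^T
print(np.allclose(numeric, analytic, atol=1e-6))
```

Using a non-symmetric `D` matters here: with a symmetric `D` the check could not tell $(D + D^T)$ apart from $2D$.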



### Part 2: Derivatives of the trace  

- Problem 1:  
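$\frac{\partial tr(X)}{\partial X}$

= $\begin{bmatrix} \frac{\partial \sum_i x_{ii}}{\partial x_{11}} & \frac{\partial \sum_i x_{ii}}{\partial x_{12}} & \cdots & \frac{\partial \sum_i x_{ii}}{\partial x_{1n}} \\ \frac{\partial \sum_i x_{ii}}{\partial x_{21}} & \frac{\partial \sum_i x_{ii}}{\partial x_{22}} & \cdots & \frac{\partial \sum_i x_{ii}}{\partial x_{2n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial \sum_i x_{ii}}{\partial x_{n1}} & \frac{\partial \sum_i x_{ii}}{\partial x_{n2}} & \cdots & \frac{\partial \sum_i x_{ii}}{\partial x_{nn}} \end{bmatrix}$

= $I$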


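<br>  
- Problem 2:  

$\frac{\partial tr(XA)}{\partial X}$

Working elementwise, with $X_{k,.}$ denoting row $k$ of $X$ and $A_{.,k}$ denoting column $k$ of $A$:

$\frac {\partial tr(XA)}{\partial X_{i,j}}$ = $\frac {\partial}{\partial X_{i,j}} \Sigma_{k} X_{k,.} A_{.,k}$

= $\frac {\partial}{\partial X_{i,j}} X_{i,.} A_{.,i}$ (only row $i$ of $X$ contains $X_{i,j}$)

= $A_{j,i}$

Therefore: $\frac{\partial tr(XA)}{\partial X} = A^T$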
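
Both trace results can be checked without finite differences. Because $tr(X)$ and $tr(XA)$ are linear in $X$, the partial derivative with respect to $X_{ij}$ equals the function evaluated at the unit matrix $E_{ij}$ (a 1 at position $(i, j)$, zeros elsewhere). The sketch below uses that fact; the helper name `gradient_of_linear` and the random `A` are assumptions made for illustration.

```python
import numpy as np

def gradient_of_linear(f, shape):
    """Gradient of a LINEAR scalar function f: entry (i, j) is f(E_ij)."""
    grad = np.zeros(shape)
    for idx in np.ndindex(*shape):
        E = np.zeros(shape)
        E[idx] = 1.0
        grad[idx] = f(E)
    return grad

n = 3
rng = np.random.default_rng(2)   # arbitrary test data
A = rng.standard_normal((n, n))

grad_tr = gradient_of_linear(np.trace, (n, n))                      # d tr(X) / dX
grad_trXA = gradient_of_linear(lambda M: np.trace(M @ A), (n, n))   # d tr(XA) / dX

print(np.allclose(grad_tr, np.eye(n)))   # compare with I
print(np.allclose(grad_trXA, A.T))       # compare with A^T
```

This shortcut only applies to linear functions; the quadratic form in Part 1, Problem 3 needs a genuine derivative or a finite-difference check.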

# Simplified homework (if the material above feels unfamiliar, start here)
Here, let X be a 2x2 matrix
```
A = [1, 4,
     7, 9]
    
a = [2,
     3]

b = [1,
     1]

c = [2,
     5]
```
Rework the exercise above with these values.
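
As a rough illustration, the closed-form answers that do not depend on $X$ can be evaluated directly for these values. The sketch assumes the results $ab^T$, $I$ and $A^T$ from the solutions above. The quadratic-form problem is left out because no matrix $D$ is given in this simplified version.

```python
import numpy as np

A = np.array([[1, 4],
              [7, 9]])
a = np.array([2, 3])
b = np.array([1, 1])

grad_xTb = b                 # d(x^T b)/dx = b
grad_aXb = np.outer(a, b)    # d(a^T X b)/dX = a b^T
grad_tr = np.eye(2)          # d tr(X)/dX = I
grad_trXA = A.T              # d tr(XA)/dX = A^T

print(grad_aXb)    # [[2 2]
                   #  [3 3]]
print(grad_trXA)   # [[1 7]
                   #  [4 9]]
```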

# In-place modification with numpy functions

In [16]:
import numpy as np
np.random.seed(1)

In [22]:
# First create an array
A = np.random.randn(5)
print(A)

[-2.3015387   1.74481176 -0.7612069   0.3190391  -0.24937038]


In [23]:
# Then call np.sqrt directly
B = np.sqrt(A)
print("A", A)
print("B", B)
# A warning is raised here (square root of negative numbers), but execution is not halted

A [-2.3015387   1.74481176 -0.7612069   0.3190391  -0.24937038]
B [       nan 1.32091323        nan 0.56483546        nan]
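
The warning behaviour can be controlled with `np.errstate`. The sketch below reuses the values printed above and shows two settings, silencing the warning or turning it into an exception. Whether either is appropriate depends on the use case, so treat this as an illustration rather than a recommendation.

```python
import numpy as np

A = np.array([-2.3015387, 1.74481176, -0.7612069, 0.3190391, -0.24937038])

# Silence the "invalid value" warning; negative inputs still produce nan.
with np.errstate(invalid="ignore"):
    B_quiet = np.sqrt(A)
print(np.isnan(B_quiet))

# Turn the warning into an exception instead.
try:
    with np.errstate(invalid="raise"):
        np.sqrt(A)
    raised = False
except FloatingPointError:
    raised = True
print(raised)
```

In both cases `A` itself is left unchanged, because `np.sqrt(A)` without `out` returns a new array.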


  


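# To modify in place, the function has to be called like this
See the numpy documentation: https://docs.scipy.org/doc/numpy/reference/generated/numpy.sqrt.html
Note that to store the result back in the original variable, we pass that variable as the out argument. This is what makes the operation in-place.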
## numpy.sqrt
```
numpy.sqrt(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'sqrt'>
Return the non-negative square-root of an array, element-wise.

Parameters:	
    x : array_like
        The values whose square-roots are required.

   out : ndarray, None, or tuple of ndarray and None, optional
        A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

   where : array_like, optional
        This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs
For other keyword-only arguments, see the ufunc docs.

Returns:	
    y : ndarray
        An array of the same shape as x, containing the positive square-root of each element in x. If any element in x is complex, a complex array is returned (and the square-roots of negative reals are calculated). If all of the elements in x are real, so is y, with negative elements returning nan. If out was provided, y is a reference to it. This is a scalar if x is a scalar.
```
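
The `where` parameter described above can be combined with `out` to skip the negative entries altogether. In this sketch the output buffer is pre-filled with zeros, an assumption made so that the skipped positions hold a defined value instead of uninitialized memory.

```python
import numpy as np

A = np.array([-2.3015387, 1.74481176, -0.7612069, 0.3190391, -0.24937038])

out = np.zeros_like(A)                       # masked-out slots keep this 0.0
result = np.sqrt(A, out=out, where=A >= 0)   # sqrt is computed only where A >= 0

print(result)
print(result is out)   # the returned array is the buffer passed as out
```

Because the negative entries are never evaluated, no nan values are written to `out`.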


In [24]:
# First modify A in place via the out keyword
np.sqrt(A, out=A)
print("A", A)
# out can also be passed positionally as the second argument
np.sqrt(B, B)
print("B", B)

A [       nan 1.32091323        nan 0.56483546        nan]
B [       nan 1.14930989        nan 0.75155536        nan]
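
To make the difference between an in-place update and a plain reassignment concrete, here is a small sketch. The array values and the names `alias` and `returned` are made up for illustration. The last part shows one caveat: an integer array cannot receive the floating-point result in place under the default `casting='same_kind'` rule.

```python
import numpy as np

A = np.array([4.0, 9.0, 16.0])
alias = A                        # second name for the same underlying buffer

returned = np.sqrt(A, out=A)     # in place: the buffer itself is overwritten
print(returned is A)             # True
print(alias)                     # [2. 3. 4.]

A = np.sqrt(A)                   # not in place: A now names a new array
print(A is alias)                # False
print(alias)                     # [2. 3. 4.]

ints = np.array([4, 9, 16])
try:
    np.sqrt(ints, out=ints)      # float result cannot be cast into an int buffer
    cast_failed = False
except TypeError:
    cast_failed = True
print(cast_failed)               # True
```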


  
