# Derivative

The linear function

$$
    f(x) = ax + b
$$

is simple, and its graph is a straight line. **Differentiation** and **differentials** make it possible to approximate any **smooth** function locally by a linear one.


<!--
## Derivative

A calculus course studies arbitrary functions of a real argument;
in practice, however, and in machine learning in particular, one usually encounters **differentiable** (**smooth**) functions, which locally "resemble" a simple and familiar linear function. A "small increment" $h=\Delta x$ of the argument of a smooth function $f$ corresponds to a "small increment" $\Delta f = f(x+h) - f(x)$ of its value, approximately proportional to $h$: $f(x+h) - f(x) \approx L(x)h$. More formally, a function $f\colon \mathbb R \to\mathbb R$ is differentiable at a point $x$ if

$$
f(x + h) - f(x) = L(x)h + o(h), \quad \text{ where } L(x) = \lim\limits_{h\to 0} \frac{f(x+h)-f(x)}h =: f'(x)
$$

is the **derivative** of $f$ at the point $x$. Geometrically, this means that the graph of $y=f(x)$ has a tangent line at the point $(x, f(x))$.

```{image} https://sites.millersville.edu/bikenaga/calculus1/derivatives/derivatives4.png
:align: center
```

-->

```python
import numpy as np
import plotly.graph_objects as go

# The sine function and its tangent line at x0
def sine_function(x):
    return np.sin(x)

def tangent_line(x, x0):
    slope = np.cos(x0)  # derivative of the sine function at x0
    return slope * (x - x0) + np.sin(x0)

# Generate x values
x_values = np.linspace(-np.pi, np.pi, 1000)

# Point at which the tangent line is drawn
point_of_interest = np.pi / 3

# Generate y values for the sine function and the tangent line
y_sine = sine_function(x_values)
y_tangent = tangent_line(x_values, point_of_interest)

# Create the plot
fig = go.Figure()

# Plot the sine function
fig.add_trace(go.Scatter(x=x_values, y=y_sine, mode='lines',
                         name='Sine Function', line=dict(width=3)))

# Plot the tangent line
fig.add_trace(go.Scatter(x=x_values, y=y_tangent, mode='lines',
                         name='Tangent Line', line=dict(width=2)))

# Highlight the point of tangency (kept out of the legend)
fig.add_trace(go.Scatter(x=[point_of_interest], y=[sine_function(point_of_interest)],
                         mode='markers', marker=dict(size=8, color='red'),
                         name='point', showlegend=False))

# Draw the coordinate axes
fig.add_hline(y=0)
fig.add_vline(x=0)

# Add labels and title
fig.update_layout(title='y = sin(x)',
                  xaxis_title='x',
                  yaxis_title='y')

# Show the plot
fig.show()
```


Similarly, any smooth function is approximately linear in a small neighborhood of the point of tangency:

$$
    f(x + h) \approx f(x) + f'(x)h \text{ for small } h.
$$

More precisely, if $f$ is twice continuously differentiable near $x$, then

$$
    f(x + h)= f(x) + f'(x)h + O(h^2), \quad h\to 0.
$$

```{admonition} Strict definition
:class: dropdown
The **derivative** of $f \colon (x-\delta, x+\delta) \to \mathbb R$, $\delta > 0$, at the point $x$ is the limit (when it exists)

$$
f'(x) = \lim\limits_{h\to 0} \frac{f(x+h)-f(x)}h.
$$

Existence of the derivative $f'(x)$ is equivalent to **differentiability** of $f$ at the point $x$:

$$
    f(x + h) - f(x) =  f'(x)h + o(h), \quad h\to 0.
$$
```
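The limit in the definition can be probed numerically. The sketch below is only an illustration under assumptions: it takes $f = \sin$ and $x = \pi/3$, and the helper name `difference_quotient` is hypothetical, not something defined elsewhere in this text.

```python
import math

def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

x0 = math.pi / 3
exact = math.cos(x0)  # known derivative of sin at x0

# Shrink h and watch the quotient approach the exact derivative
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(math.sin, x0, h)
    print(f"h = {h:.0e}   quotient = {approx:.6f}   error = {abs(approx - exact):.2e}")
```

For this one-sided quotient the error shrinks roughly in proportion to $h$.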

If the function $f'(x)$ is also differentiable, then its derivative is called the **second derivative** of $f$: $f''(x) =\frac d{dx}(f'(x))$. By induction, the $n$-th derivative is defined as

$$
    f^{(n)}(x) = \frac d{dx}\big(f^{(n-1)}(x)\big).
$$
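A second derivative can be estimated in the same spirit as the first. The following is a minimal sketch, assuming $f = \sin$ (so that $f'' = -\sin$); the central-difference formula and the helper name `second_derivative` are illustrative choices introduced here, not part of the definition above.

```python
import math

def second_derivative(f, x, h=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

x0 = math.pi / 3
approx = second_derivative(math.sin, x0)
exact = -math.sin(x0)  # (sin x)'' = -sin x

print(f"approx = {approx:.6f}, exact = {exact:.6f}")
```

The step `h` should not be taken too small: the numerator is a difference of nearly equal numbers, so floating-point rounding errors get amplified by the division by `h * h`.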

## Differential

The function $df(x, h) = f'(x)h$ is called the **differential** of $f$ at the point $x$. Note that it is a function of two variables, $x$ and $h$, and that it depends linearly on $h$.

```{important}
For historical reasons, the **increment** $h$ is often denoted by $dx$; the formula for the differential then reads

$$
    df = f'(x)dx.
$$
```

The differential is the principal linear part of the increment $\Delta f = f(x + h) - f(x)$: the two differ by $o(h)$ as $h \to 0$.
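As a worked example (the function is chosen here purely for illustration), take $f(x) = x^2$. Then

$$
    \Delta f = (x+h)^2 - x^2 = 2xh + h^2, \qquad df(x, h) = 2xh,
$$

so the increment and the differential differ by $h^2$, which is negligible compared to $h$ when $h$ is small. For instance, for $x = 3$ and $h = 0.1$ we get $\Delta f = 0.61$ and $df = 0.6$.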

## Rules of differentiation

1. $f'(x) \equiv 0$ if $f(x)\equiv \mathrm{const}$

2. $(\alpha f(x) + \beta g(x))' = \alpha f'(x) + \beta g'(x)$

3. $(f(x)g(x))' = f'(x) g(x) + f(x) g'(x)$

4. $\big(\frac{f(x)}{g(x)}\big)' = \frac{f'(x) g(x) - f(x) g'(x)}{g^2(x)}$ if $g(x) \ne 0$

5. $(f(g(x)))' = f'(g(x)) g'(x)$ (**chain rule**)
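The rules above can be sanity-checked numerically. This is a rough sketch, assuming $f = \sin$, $g = \exp$ and the test point $x_0 = 0.7$; all three are arbitrary choices, and the helper `derivative` is a hypothetical central-difference approximation rather than an exact derivative.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

f, df = math.sin, math.cos  # f and its known derivative
g, dg = math.exp, math.exp  # g and its known derivative
x0 = 0.7

# Product rule: (f g)' = f' g + f g'
product_numeric = derivative(lambda x: f(x) * g(x), x0)
product_rule = df(x0) * g(x0) + f(x0) * dg(x0)

# Quotient rule: (f / g)' = (f' g - f g') / g^2, valid since exp(x) != 0
quotient_numeric = derivative(lambda x: f(x) / g(x), x0)
quotient_rule = (df(x0) * g(x0) - f(x0) * dg(x0)) / g(x0) ** 2

# Chain rule: (f(g(x)))' = f'(g(x)) g'(x)
chain_numeric = derivative(lambda x: f(g(x)), x0)
chain_rule = df(g(x0)) * dg(x0)

print(f"product:  {product_numeric:.6f} vs {product_rule:.6f}")
print(f"quotient: {quotient_numeric:.6f} vs {quotient_rule:.6f}")
print(f"chain:    {chain_numeric:.6f} vs {chain_rule:.6f}")
```

Each pair of numbers should agree to several decimal places; a mismatch usually indicates an error in the hand-derived formula.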

## Applications of derivatives

1. If $f'(x) > 0$ ($f'(x) < 0$) for all $x\in (a, b)$, then $f$ is strictly increasing (decreasing) on $(a, b)$.

2. If $f'(x) = 0$ and $f''(x) > 0$ ($f''(x) < 0$), then $x$ is a point of local minimum (maximum) of $f$.

3. If $f''(x) > 0$ ($f''(x) < 0$) for all $x\in (a, b)$, then $f$ is strictly convex (concave) on $(a, b)$.

```{image} https://i.stack.imgur.com/GNBZ4.png
:align: center
```
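The three criteria can be tried out on a concrete function. The sketch below assumes the example $f(x) = x^3 - 3x$, chosen here for illustration, with derivatives $f'(x) = 3x^2 - 3$ and $f''(x) = 6x$ computed by hand; the helper name `classify` is hypothetical.

```python
def f(x):
    return x**3 - 3 * x

def f_prime(x):
    return 3 * x**2 - 3

def f_second(x):
    return 6 * x

def classify(x):
    """Second-derivative test at the point x."""
    if f_prime(x) != 0:
        return "not a critical point"
    if f_second(x) > 0:
        return "local minimum"
    if f_second(x) < 0:
        return "local maximum"
    return "inconclusive"

# Critical points are the roots of f'(x) = 3x^2 - 3, i.e. x = -1 and x = 1
for x in (-1, 1):
    print(x, classify(x))
# -1 local maximum
# 1 local minimum

# f' < 0 at sample points of (-1, 1), consistent with f decreasing there
samples = [-0.9, -0.5, 0.0, 0.5, 0.9]
print(all(f_prime(x) < 0 for x in samples))  # True
```

Since $f''(x) = 6x$ is positive for $x > 0$ and negative for $x < 0$, this function is strictly convex on $(0, +\infty)$ and strictly concave on $(-\infty, 0)$.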

## Exercises

1. Show that $\sigma'(x) = \sigma(x) (1 - \sigma(x))$ where

    $$
        \sigma(x) = \frac 1{1 + e^{-x}}
    $$
    
    is the *sigmoid* function.
    
2. Find $\max\limits_{x\in\mathbb R}\sigma'(x)$.