# Jensen's inequality

Jensen's inequality can be used as a definition of convex and concave functions: every convex function satisfies Jensen's inequality, and conversely, any function that satisfies Jensen's inequality is convex.

Likewise, concave functions satisfy the reverse of Jensen's inequality.

Jensen's inequality has a classical form, stated for functions of numbers, and a more general probabilistic form, stated for functions of random variables.

## Classical form of Jensen's inequality

If $f$ is a convex function, $x_1, \dots, x_n$ are numbers in its domain, and $a_1, \dots, a_n$ are positive weights, then

$f\left(\frac{\sum_{i} a_i x_i}{\sum_{i} a_i}\right) \leq \frac{\sum_{i} a_i f(x_i)}{\sum_{i} a_i}$


If $f$ is concave, the reverse inequality holds:

$f\left(\frac{\sum_{i} a_i x_i}{\sum_{i} a_i}\right) \geq \frac{\sum_{i} a_i f(x_i)}{\sum_{i} a_i}$
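As a quick numerical sanity check of the classical form, the sketch below evaluates both sides of the inequality for one convex function (`math.exp`) and one concave function (`math.log`). The points `xs`, the weights, and the helper names `weighted_mean` and `jensen_sides` are illustrative assumptions, not part of the original notes.

```python
import math

def weighted_mean(values, weights):
    """Return sum(a_i * v_i) / sum(a_i) for positive weights a_i."""
    total = sum(weights)
    return sum(a * v for a, v in zip(weights, values)) / total

def jensen_sides(f, xs, weights):
    """Return (f(weighted mean of xs), weighted mean of f(xs))."""
    lhs = f(weighted_mean(xs, weights))
    rhs = weighted_mean([f(x) for x in xs], weights)
    return lhs, rhs

# Arbitrary example points and positive weights (assumed for illustration).
xs = [0.5, 1.0, 2.0, 4.0]
weights = [1.0, 2.0, 3.0, 4.0]

convex_lhs, convex_rhs = jensen_sides(math.exp, xs, weights)
concave_lhs, concave_rhs = jensen_sides(math.log, xs, weights)

print(convex_lhs <= convex_rhs)    # True: exp is convex
print(concave_lhs >= concave_rhs)  # True: log is concave
```

A check like this only tests one choice of points and weights; it illustrates the inequality rather than proving it.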


Specifically, if $n = 2$ (i.e. we have just two points $x_1$ and $x_2$), define $\lambda_1 = \frac{a_1}{a_1 + a_2}$ and $\lambda_2 = 1 - \lambda_1 = \frac{a_2}{a_1 + a_2}$.


Then, if $f$ is convex,

$f(\lambda_1 x_1 + \lambda_2 x_2) \leq \lambda_1 f(x_1) + \lambda_2 f(x_2)$


### Geometric interpretation of the two point case

For a convex function, the secant line (the line connecting two points on the graph) lies on or above the graph of the function between those two points.

![fig1](jensen_geometric.png "wiki jensen's inequality") 

Image Credit - From wikipedia page on Jensen's inequality
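The secant-line picture can also be checked numerically. The sketch below, assuming the convex function $x^{2}$ and the arbitrary endpoints $-1$ and $3$, measures the vertical gap between the secant and the curve at several mixing fractions; `secant_gap` and `square` are hypothetical helper names.

```python
def secant_gap(f, x1, x2, lam):
    """Secant height minus function height at lam*x1 + (1 - lam)*x2."""
    point = lam * x1 + (1.0 - lam) * x2
    secant = lam * f(x1) + (1.0 - lam) * f(x2)
    return secant - f(point)

def square(x):
    return x * x

# Mixing fractions 0.0, 0.1, ..., 1.0 between the two endpoints.
lambdas = [i / 10 for i in range(11)]
gaps = [secant_gap(square, -1.0, 3.0, lam) for lam in lambdas]

# The gap should never be negative (a small tolerance absorbs rounding).
print(min(gaps) >= -1e-12)
```

For $x^{2}$ the gap works out to $\lambda(1-\lambda)(x_1 - x_2)^{2}$, so it vanishes at the endpoints and is largest at the midpoint.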

## Probabilistic form of Jensen's inequality

If $X$ is a random variable and $f$ is a convex function, then

$f(E(X)) \leq E(f(X))$

Proof:

Let $\mu = E(X)$, a point in the domain of the convex function $f$.

We can always define a tangent line $a + bx$ to $f$ at the point $\mu$ (if $f$ is not differentiable at $\mu$, any supporting line through $(\mu, f(\mu))$ works).

Because the line is tangent to a convex function, the curve lies on or above the tangent everywhere,

i.e. $f(x) \geq a + bx$ for every $x$ in the domain.

This is true for every value that $X$ can take, which means it is true for $X$ itself.

Therefore, $f(X) \geq a + bX$

Take expectations on both sides.

Expectation is monotone, so it preserves the inequality, and it is linear, so $E(a + bX) = a + bE(X)$.

Therefore, $E(f(X)) \geq a + bE(X) = a + b\mu$

Since the tangent touches the curve at $\mu$, $f(\mu) = a + b\mu$

Therefore, $E(f(X)) \geq f(\mu) = f(E(X))$

Hence proved.
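Both the tangent-line step and the final inequality can be illustrated by simulation. The sketch below is only an illustration under assumed choices: $f = \exp$, a standard normal $X$, a fixed seed, and sample averages standing in for the expectations.

```python
import math
import random

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Sample averages used as estimates of E(X) and E(f(X)) with f = exp.
mu = sum(samples) / len(samples)
mean_f = sum(math.exp(x) for x in samples) / len(samples)

# Tangent line a + b*x to exp at x = mu: slope b = exp(mu),
# intercept chosen so that the line touches the curve at mu.
b = math.exp(mu)
a = math.exp(mu) - b * mu

# Step 1 of the proof: the curve lies on or above the tangent.
tangent_below = all(math.exp(x) >= a + b * x - 1e-9 for x in samples)

print(tangent_below)            # True
print(math.exp(mu) <= mean_f)   # True: f(E(X)) <= E(f(X))
```

Because the sample average is itself an equal-weight instance of the classical form, the second check holds for any sample, not just this seed.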

## Examples

1) $X^{2}$ is a convex function (a purely quadratic function).

Therefore, $E(X^{2}) \geq (E(X))^{2}$ by Jensen's inequality.

This agrees with the fact that $E(X^{2}) - (E(X))^{2}$ is exactly $Var(X)$, which we know is always $\geq 0$.
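A small worked instance of this example, assuming a made-up variable $X$ that takes the values 1, 2, 3, 4 with equal probability. The gap between the two sides of Jensen's inequality is compared with the population variance from the standard library.

```python
import statistics

# Equally likely values of X (an assumed toy distribution).
data = [1.0, 2.0, 3.0, 4.0]

mean = sum(data) / len(data)                             # E(X) = 2.5
mean_of_squares = sum(x * x for x in data) / len(data)   # E(X^2) = 7.5

# Jensen gap for f(x) = x^2, which should equal Var(X).
gap = mean_of_squares - mean ** 2

print(gap)                         # 1.25
print(statistics.pvariance(data))  # 1.25
```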

## Uses

1) In the proof of the EM algorithm
2) In many proofs involving entropy and conditional entropy, for example the proof that relative entropy is $\geq 0$, or that $H(Y|X) \leq H(Y)$
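The two entropy facts mentioned above can be checked on a toy case. The sketch below assumes a hypothetical joint distribution over two binary variables and two arbitrary distributions for the relative entropy; the function names `entropy` and `relative_entropy` are illustrative, and all quantities are in nats.

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a list of probabilities."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def relative_entropy(p, q):
    """D(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical joint distribution joint[x][y] over binary X and Y.
joint = [[0.4, 0.1],
         [0.2, 0.3]]

p_x = [sum(row) for row in joint]
p_y = [sum(joint[x][y] for x in range(2)) for y in range(2)]

# H(Y|X) = sum over x of p(x) * H(Y | X = x)
h_y_given_x = sum(
    p_x[x] * entropy([joint[x][y] / p_x[x] for y in range(2)])
    for x in range(2)
)

print(relative_entropy([0.5, 0.5], [0.9, 0.1]) >= 0)  # True
print(h_y_given_x <= entropy(p_y))                    # True
```

These checks only illustrate the statements on one example; the general proofs apply Jensen's inequality to the concave function $\log$.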

## References

1) https://en.wikipedia.org/wiki/Jensen%27s_inequality