In [1]:
import matplotlib.pyplot as plt
import numpy as np

# JNB Lab: Calculus of Entropy in Daily Life

The goal of this lab is to explore how calculus can be used to study the concept of entropy as might be useful within 

## 1. Derivative Optimization 

Given a function $y=f(x)$, the derivative $f'(x)$ evaluated at $x=a$ gives the slope of the tangent (the instantaneous rate of change) to the graph of $y=f(x)$ at the point $(a,f(a))$. Maximum and minimum points are special because the tangent is horizontal at those points, so the derivative satisfies the equation $f'(x)=0$. Maximum and minimum points are important in applications since, for example, minimizing the cost or maximizing the benefit of a project is often a primary goal.
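As a quick numerical illustration of the condition $f'(x)=0$, the sketch below uses a made-up benefit function $f(x)=4x-x^2$ (it is not part of this lab) and approximates the derivative with a central difference. The helper name `derivative`, the step size `h`, and the search grid are all assumptions chosen for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical benefit function, chosen only for illustration
    return 4 * x - x**2

def derivative(func, x, h=1e-5):
    # Central-difference approximation of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

# Scan a grid and pick the point where the slope is closest to zero
xs = np.linspace(0, 4, 401)
slopes = derivative(f, xs)
x_star = xs[np.argmin(np.abs(slopes))]

print("slope is closest to zero at x =", x_star)
print("f at that point =", f(x_star))
```

A grid scan like this only locates the critical point to within the grid spacing; solving $f'(x)=0$ by hand gives the exact location.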

Entropy is an important example discussed in the section on societal applications in the final chapter (complex systems) of this book. We illustrate here why minimization/maximization of entropy is important. Suppose you have a committee of $N$ people voting on a certain decision such that a proportion $p_1=p$ strongly prefers option A and a proportion $p_2=1-p$ strongly prefers option B. There is no disagreement (disorder) if $p_1=1$ and $p_2=0$ (everyone strongly prefers option A), and likewise none if $p_1=0$ and $p_2=1$ (everyone strongly prefers option B). If $p_1=.9$ and $p_2=.1$, there is a strong consensus for option A, so while there is a little disagreement (disorder), option A will likely carry. The case where the population is evenly split, $p_1=p_2=.5$, has the greatest disagreement (disorder) in trying to achieve a consensus.


Entropy is a function used to measure disorder within a system. In this case, we can define the entropy as

$$
H(p) = -\sum_{i=1}^2 p_i \ln p_i = -[p\ln p + (1-p)\ln(1-p)],
$$

where $0\le p \le 1$. Exercise 1.1 shows that the properties of this function correspond to the degree of disorder in our example. (Note: We could generalize the definition from 2 to $k\ge 1$ options, in which case $H= -\sum_{i=1}^k p_i \ln p_i$, where $p_i$ is the proportion of people who are strongly inclined to vote for option $i$, so that $0<p_i\le 1$ and $p_1+\dots+p_k=1$.)
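To make the committee example concrete, here is a minimal sketch that evaluates $H(p)$ at the three proportions mentioned above. The function name `binary_entropy` is hypothetical, and the code assumes the convention that a term with proportion zero contributes nothing to the sum.

```python
import numpy as np

def binary_entropy(p):
    # H(p) = -[p ln p + (1-p) ln(1-p)]
    # Assumption: a term whose proportion is 0 is skipped (treated as 0)
    total = 0.0
    for q in (p, 1 - p):
        if q > 0:
            total -= q * np.log(q)
    return total

# Proportions taken from the committee example in the text
for p in (1.0, 0.9, 0.5):
    print("p =", p, " H(p) =", binary_entropy(p))
```

The printed values should increase as the committee moves from unanimity toward an even split, matching the informal ranking of disagreement given earlier.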

### Exercises

```{admonition} Exercises

<b>1.1</b>

a) Make a plot of $H(p)= -[p\ln p + (1-p)\ln(1-p)]$ on the interval $0<p<1$.

b) Use L'Hopital's rule to evaluate the one-sided limits $H(0)=\lim_{p\rightarrow 0^+} H(p)$ and $H(1)=\lim_{p\rightarrow 1^-} H(p)$.

c) Use calculus to prove that a maximum value of $H(p)$ occurs at $p=.5$.

```

## 2. Series

Let us now switch contexts to a region that has $N$ refugee camps or camps of internally displaced people (IDP). In this case, we consider needs $1,2,\dots,k$, where 1=food, 2=water, 3=shelter, ..., and $p_i$ is the proportion of camps whose greatest need is $i$ (we assume each camp has a greatest need, so $p_1+\dots+p_k=1$). The entropy is again

$$
H= -\sum_{i=1}^k p_i \ln p_i.
$$

From the point of view of disaster response, $H$ as a measure of disorder could be interpreted as a measure of the complexity of the response. For example, if every camp has the same greatest need (labelled 1), then $p_1=1$ and $p_i=0$ otherwise, in which case $H=-1\ln 1 = 0$. Conceptually the response is simple: every camp's greatest need is food, so food is the first priority sent to every camp.
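The calculation above can be sketched for any list of proportions. In the snippet below, the function name `entropy` and the mixed split of needs are hypothetical, and zero proportions are dropped before taking logarithms (an assumed convention, consistent with $H=0$ in the single-need case).

```python
import numpy as np

def entropy(proportions):
    # H = -sum p_i ln p_i, written as sum p_i ln(1/p_i)
    # Assumption: zero proportions are dropped (they contribute nothing)
    p = np.asarray(proportions, dtype=float)
    p = p[p > 0]
    return float(np.sum(p * np.log(1.0 / p)))

same_need = [1.0, 0.0, 0.0]    # every camp needs food first
mixed_need = [0.5, 0.3, 0.2]   # hypothetical split among food, water, shelter

print("H (same need)  =", entropy(same_need))
print("H (mixed need) =", entropy(mixed_need))
```

The mixed case gives a strictly positive entropy, reflecting a response that must juggle several different first priorities.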


### Exercise


```{admonition} Exercise
<b>2.1</b>

a) Compute the entropy $H$ in the case where each of the $N$ camps has a different first priority need. Is $H$ a strictly increasing function of $N$?  Is $H$ bounded?

b) Compute the entropy $H$ in the case where each of $k$ needs is equally likely to be a first priority need.  Make a plot of $H$ as a function of $k$ for $k=1,...,10$.

```

## 3. Probability Distributions

Now let us consider a situation where you are holding a dinner and there is a certain probability $p_k$ that a guest will arrive $k$ hours late. If all guests arrive on time, then $p_0=1$ and all other $p_k=0$, so the entropy is $H=0$. In certain cultures this is an ideal situation: there is no disorder, and you can start the dinner right on time.

In other cases, it might be normative for guests to arrive later than the announced time.  We can use a probability distribution to model the arrival time.  For example, if the mean arrival time is 10 minutes after the designated time, then we consider the probability density function  $f(x)=\frac{1}{10}e^{-x/10}$ where $x\in[0,\infty)$.  The probability that a guest arrives in the interval $a\le x \le b$ is given by the definite integral

$$
\text{prob}(a\le x \le b) = \int_a^b f(x)\, dx = \frac{1}{10} \int_a^b e^{-x/10}\, dx = \frac{1}{10}(-10)\, e^{-x/10} \Big|_a^b = e^{-a/10}-e^{-b/10}. $$

For the case where we wish to know the probability that a guest will arrive up to 10 minutes after the designated time, we set $a=0$ and $b=10$ so the probability is $1-e^{-1}\approx .63$.
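As a sanity check on this integral, the sketch below compares a midpoint-rule approximation with the value obtained from the antiderivative $-e^{-x/10}$ evaluated between $a$ and $b$. The function names `prob_midpoint` and `prob_exact` and the number of subintervals `n` are assumptions made for illustration.

```python
import numpy as np

def f(x):
    # Density from the text: mean arrival 10 minutes after the designated time
    return np.exp(-x / 10) / 10

def prob_midpoint(a, b, n=100_000):
    # Midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    mids = a + (np.arange(n) + 0.5) * h
    return float(np.sum(f(mids)) * h)

def prob_exact(a, b):
    # Antiderivative -e^{-x/10} evaluated at the endpoints
    return float(np.exp(-a / 10) - np.exp(-b / 10))

print("midpoint rule:", prob_midpoint(0, 10))
print("closed form:  ", prob_exact(0, 10))
```

Both numbers should agree to many decimal places and round to the value of about .63 quoted above.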


### Exercise

```{admonition} Exercise
<b>3.1</b> An exponential distribution has the form $f(x)=ke^{-kx}$ where $k>0$ and $0\le x < \infty$.

a) Show that $prob(0\le x < \infty) = 1$.

b) Given a continuous probability distribution $f(x)$ for $x\in I$, the mean is computed as $\int_{x\in I} x f(x)\, dx$. Find the mean for the exponential distribution in terms of $k$.

```

## 4. Monte Carlo Simulation

Let us consider the context of teaching a class of math students, with a random variable $X$ denoting a standardized entrance test score. If $X<0$, the score is below average, and if $X>0$, the score is above average. Note that if all students are average, teaching the class is simpler than if there is a big spread in the students' range of ability. Entropy in this case is interpreted to mean complexity.

Here we would like to compare two distributions with domain $-\infty < x < \infty$:

$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ (standard unit normal distribution)

$g(x) = \frac{1}{\pi} \frac{1}{1+x^2}$ (Cauchy distribution)

In particular, we would like to compare the contribution to the complexity (entropy) from the tails $x<-3$ and $x>3$.
We will classify students as 'usual' if $-3\le X \le 3$ and 'exceptional' if $X>3$ or $X<-3$.

In practical terms, we are asking how the complexity is impacted by very strong or very weak students for these two distributions.  

We know that for a normal distribution there is only about a .3\% chance of being more than 3 standard deviations above or below the mean. So for a class of, say, 10 students, we would expect .03 students to be exceptional. So there would be essentially no contribution to the complexity from a 

On the other hand, for a Cauchy distribution the probability of getting a very strong or a very weak student is about .2, so for a class of 10 students we would expect to see 2 exceptional students. The contribution to complexity is $H_{tail} = -.2 \ln (.2)\approx .32$.
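The two tail probabilities quoted here can be checked from the cumulative distribution functions. The sketch below uses only Python's standard `math` module; the function names `normal_tail` and `cauchy_tail` are hypothetical.

```python
import math

def normal_tail(c):
    # P(|X| > c) for a standard normal X, via the complementary error function
    return math.erfc(c / math.sqrt(2))

def cauchy_tail(c):
    # P(|X| > c) for a standard Cauchy X, from its CDF 1/2 + arctan(x)/pi
    return 1 - (2 / math.pi) * math.atan(c)

class_size = 10
for name, tail in (("normal", normal_tail(3)), ("Cauchy", cauchy_tail(3))):
    print(name, "tail probability:", tail,
          " expected exceptional students:", class_size * tail)
```

The normal tail comes out near .003 and the Cauchy tail near .2, which is why a class of 10 is expected to contain about .03 exceptional students in the first case and about 2 in the second.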


To estimate the complexity of a normally distributed vs. Cauchy distributed class, we can use a Monte Carlo simulation.
That is, we draw a sample of 10 random values, determine the proportion of usual and exceptional students, and compute the entropy. We illustrate this for a routine class (normally distributed) and ask you to do the analysis of a Cauchy distributed class.


In [2]:
trials=10000
class_size=10
s = np.random.standard_normal(trials*class_size)

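# Per-trial entropy terms -p*ln(p) for the routine and exceptional groups
Hroutine=[]
Hexceptional=[]
Hsum=[]
k=0  # index into the flat array of samples s

for j in range(trials):
    routine=0
    exceptional=0
    # Classify each student in this class as exceptional (|score|>3) or routine
    for i in range(class_size):
        if s[k]>3 or s[k]<-3:
            exceptional=exceptional+1
        else:
            routine=routine+1
        k=k+1
    # A group with no students contributes 0 to the entropy
    if routine>0:
        p=routine/class_size
        Hroutine.append(-p*np.log(p))
    else:
        Hroutine.append(0)
    if exceptional>0:
        p=exceptional/class_size
        Hexceptional.append(-p*np.log(p))
    else:
        Hexceptional.append(0)
    # Total entropy of this class is the sum of the two terms
    Hsum.append(Hroutine[j]+Hexceptional[j])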
        
H=np.mean(Hsum)
Hex=np.mean(Hexceptional)

print("Entropy of Normal Class=", H)  
print("Contribution to Entropy by Exceptional Students=", Hex)

Entropy of Normal Class= 0.00879732781927398
Contribution to Entropy by Exceptional Students= 0.006221442622110208


### Exercise

```{admonition} Exercise

a) Sketch the unit normal and Cauchy distributions on the interval $-5\le x \le 5$. What do you note about the tails of these distributions?

b) Do a Monte Carlo simulation with 10,000 trials to estimate the entropy of a Cauchy distributed class with 10 students as well as the contribution to the entropy by exceptional students.

```