# Zengxiang Zhao

#### Joint probability (of co-occurrence) is of central importance in science
#### and engineering in general, and in machine learning in particular.
#### Refer to what was said about the GLCM (grey level co-occurrence matrix)
#### of an image for example.
#### The purpose of this assignment is to implement the computation
#### of many information-theoretic measures associated with such joint probability,
#### using TensorFlow functions.
####   ...
#### Refer to modules 4 and 5 again, if necessary. 

# ....

##### Recall that  log base 2 of x  =  (log base 2 of e) * (log base e of x)

##### So,    log base 2 of x =  1.44269 * log base e of x
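As a quick check of the change-of-base constant above, here is a minimal sketch in plain Python (no TensorFlow needed). The names `LOG2_E` and `log2_via_ln` are illustrative and are not part of the assignment.

```python
import math

# log2(e) is the factor that converts a natural log into a base-2 log.
LOG2_E = 1.0 / math.log(2.0)  # about 1.442695; the notebook truncates it to 1.44269

def log2_via_ln(x):
    """Base-2 logarithm computed from the natural logarithm."""
    return LOG2_E * math.log(x)

print(LOG2_E)
print(log2_via_ln(8.0), math.log2(8.0))
```

Because 1.44269 is a truncated value of this constant, results computed with it come out slightly below the exact base-2 values (by a relative factor of roughly 3.5e-6).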

###  .....................

# ............

#### 1. The usual import(s)

In [1]:
import tensorflow as tf

#### 2. The usual Session object

In [2]:
sess = tf.Session()

#### 3. Define a TensorFlow placeholder to hold the joint probability J.
#####    Sizes are not fixed in advance, so that the below functions
#####    may be used on any such joint probability.

In [3]:
# A joint probability table is 2-D, so both dimensions are left unspecified.
J = tf.placeholder(tf.float32, shape=[None, None])

####  The actual 3 joint probability tables must be the following:

In [4]:
p1q1 = [[0.125, 0.125], [0.5, 0.25]]
p2q2 = [[0.08,0.01,0.10], [0.10,0.20,0.01], [0.06,0.08,0.20],[0.03,0.08,0.05]]
p3q3 = [[0.,1.], [0.,0.]]

#### 4. Define the joint entropy function for any joint probability table J,
####     using appropriate TensorFlow functions.
####     This entropy must be base 2.
####     It should never produce NaN or Inf.

In [5]:
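def joint_entropy(J):
    # A small offset keeps log() finite where J has zero entries (no NaN or Inf).
    sh = tf.shape(J)
    Q = tf.fill(sh, 1e-10)
    # 1.44269 = log2(e) converts the natural log to base 2.
    return -tf.reduce_sum(tf.multiply(J, tf.log(J + Q) * 1.44269))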

#### 5. Show the joint entropy for p1q1, p2q2, p3q3 above.

In [6]:
print('Joint entropy of p1q1: ', sess.run(joint_entropy(p1q1)))
print('Joint entropy of p2q2: ',  sess.run(joint_entropy(p2q2)))
print('Joint entropy of p3q3: ',  sess.run(joint_entropy(p3q3)))

Joint entropy of p1q1:  1.7499939
Joint entropy of p2q2:  3.2119453
Joint entropy of p3q3:  -0.0
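The joint entropy values can be cross-checked without a TensorFlow session. The following is a minimal sketch in plain Python; the function name `joint_entropy_bits` and its `eps` parameter are assumptions made for illustration, mirroring the 1e-10 offset used in the notebook.

```python
import math

def joint_entropy_bits(table, eps=1e-10):
    """Base-2 joint entropy of a 2-D probability table given as nested lists."""
    total = 0.0
    for row in table:
        for p in row:
            # eps keeps log2() finite when p == 0; that term is then 0 * log2(eps) = 0.
            total -= p * math.log2(p + eps)
    return total

p1q1 = [[0.125, 0.125], [0.5, 0.25]]
p3q3 = [[0.0, 1.0], [0.0, 0.0]]
print(joint_entropy_bits(p1q1))  # close to 1.75
print(joint_entropy_bits(p3q3))  # close to 0
```

For p1q1 the exact value is 1.75 bits. The TensorFlow result 1.7499939 is slightly lower because the constant 1.44269 is a truncated value of log2(e).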


#### 6. Define the function that extracts the probability vector
####     of the first component in the probability table (marginalization).

In [7]:
def first(J):
    return tf.reduce_sum(J,1)
    

#### 7. Show the probability vector for the first component of p1q1, p2q2, p3q3.

In [8]:
print('First component of joint p1q1: ', sess.run(first(p1q1)))
print('First component of joint p2q2: ', sess.run(first(p2q2)))
print('First component of joint p3q3: ', sess.run(first(p3q3)))

First component of joint p1q1:  [0.25 0.75]
First component of joint p2q2:  [0.19 0.31 0.34 0.16]
First component of joint p3q3:  [1. 0.]


#### 8. Define the function that extracts the probability vector
####     of the second component in the probability table (marginalization).

In [9]:
def second(J):
    return tf.reduce_sum(J,0)

#### 9. Show the probability vector for the second component of p1q1, p2q2, p3q3.

In [10]:
print('Second component of joint p1q1: ', sess.run(second(p1q1)))
print('Second component of joint p2q2: ', sess.run(second(p2q2)))
print('Second component of joint p3q3: ', sess.run(second(p3q3)))

Second component of joint p1q1:  [0.625 0.375]
Second component of joint p2q2:  [0.27 0.37 0.36]
Second component of joint p3q3:  [0. 1.]
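Marginalization is just a row sum (first component) or a column sum (second component). A minimal plain-Python sketch of the same two operations follows; the helper names `first_marginal` and `second_marginal` are illustrative only.

```python
def first_marginal(table):
    """Sum each row: probability vector of the first component."""
    return [sum(row) for row in table]

def second_marginal(table):
    """Sum each column: probability vector of the second component."""
    return [sum(col) for col in zip(*table)]

p2q2 = [[0.08, 0.01, 0.10], [0.10, 0.20, 0.01],
        [0.06, 0.08, 0.20], [0.03, 0.08, 0.05]]
print(first_marginal(p2q2))   # four entries, one per row
print(second_marginal(p2q2))  # three entries, one per column
```

Both vectors sum to 1, and for a non-square table such as p2q2 they have different lengths.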


#### 10. Define the entropy function of the first component 
####     for any joint probability table J, using appropriate TensorFlow functions.
####     This entropy must be base 2.
####     It should never produce NaN or Inf.

In [11]:
def entropy_first(J):
    first_component = first(J)
    sh= tf.shape(first_component)
    Q = tf.fill(sh, 1e-10)
    return -tf.reduce_sum(tf.multiply(first_component,tf.log(first_component+Q)*1.44269))


#### 11. Show the entropy for the first component of p1q1, p2q2, p3q3.

In [47]:
print('The entropy for the first component of p1q1: ', sess.run(entropy_first(p1q1)))
print('The entropy for the first component of p2q2: ', sess.run(entropy_first(p2q2)))
print('The entropy for the first component of p3q3: ', sess.run(entropy_first(p3q3)))




The entropy for the first component of p1q1:  0.81127536
The entropy for the first component of p2q2:  1.9312049
The entropy for the first component of p3q3:  -0.0


#### 12. Define the entropy function of the second component 
####     for any joint probability table J, using appropriate TensorFlow functions.
####     This entropy must be base 2.
####     It should never produce NaN or Inf.

In [48]:
def entropy_second(J):
    second_component = second(J)
    sh= tf.shape(second_component)
    Q = tf.fill(sh, 1e-10)
    return -tf.reduce_sum(tf.multiply(second_component,tf.log(second_component+Q)*1.44269))


#### 13. Show the entropy for the second component of p1q1, p2q2, p3q3.

In [57]:
print('The entropy for the second component of p1q1: ', sess.run(entropy_second(p1q1)))
print('The entropy for the second component of p2q2: ', sess.run(entropy_second(p2q2)))
print('The entropy for the second component of p3q3: ', sess.run(entropy_second(p3q3)))



The entropy for the second component of p1q1:  0.9544307
The entropy for the second component of p2q2:  1.5713603
The entropy for the second component of p3q3:  -0.0


#### 14. Define the conditional entropy function of the first component 
####       given (i.e., conditioned on) the second component, for any joint probability table J, 
####       using appropriate TensorFlow functions.
####     Should be easy, if you recall your entropy formulas.

In [60]:
def conditional_entropy_first_if_second(J):
    return joint_entropy(J)-entropy_second(J)
    


#### 15. Show the conditional entropy H(first | second) for the joint p1q1, p2q2, p3q3.

In [61]:
print(' H(first | second) for the joint p1q1: ', sess.run(conditional_entropy_first_if_second(p1q1)))
print(' H(first | second) for the joint p2q2: ', sess.run(conditional_entropy_first_if_second(p2q2)))
print(' H(first | second) for the joint p3q3: ', sess.run(conditional_entropy_first_if_second(p3q3)))

 H(first | second) for the joint p1q1:  0.7955632
 H(first | second) for the joint p2q2:  1.640585
 H(first | second) for the joint p3q3:  0.0
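The chain rule H(first | second) = H(first, second) - H(second) can be checked against the definition of conditional entropy, -sum p(x,y) log2 p(x|y). The sketch below does this in plain Python; the function name is hypothetical, and zero cells are skipped (the convention 0 log 0 = 0) instead of using the 1e-10 offset.

```python
import math

def cond_entropy_first_given_second(table):
    """H(first | second) in bits, from the definition -sum p(x,y) * log2 p(x|y)."""
    col_sums = [sum(col) for col in zip(*table)]  # marginal of the second component
    h = 0.0
    for row in table:
        for p_xy, p_y in zip(row, col_sums):
            if p_xy > 0.0:  # convention: 0 * log(0) = 0
                h -= p_xy * math.log2(p_xy / p_y)
    return h

p1q1 = [[0.125, 0.125], [0.5, 0.25]]
print(cond_entropy_first_given_second(p1q1))  # about 0.7956
```

For p1q1 this agrees with 1.75 - H(0.625, 0.375), which is the value the chain rule gives.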


#### 16. Define the conditional entropy function of the second component 
####       given (i.e., conditioned on) the first component, for any joint probability table J, 
####       using appropriate TensorFlow functions.

In [62]:
def conditional_entropy_second_if_first(J):
    return joint_entropy(J)-entropy_first(J)


#### 17. Show the conditional entropy H(second | first) for the joint p1q1, p2q2, p3q3.

In [65]:
print(' H(second | first) for the joint p1q1: ', sess.run(conditional_entropy_second_if_first(p1q1)))
print(' H(second | first) for the joint p2q2: ', sess.run(conditional_entropy_second_if_first(p2q2)))
print(' H(second | first) for the joint p3q3: ', sess.run(conditional_entropy_second_if_first(p3q3)))

 H(second | first) for the joint p1q1:  0.93871856
 H(second | first) for the joint p2q2:  1.2807404
 H(second | first) for the joint p3q3:  0.0


### ....

#### 18. Define the mutual information function between the first component 
####       and the second component, for any joint probability table J, 
####       using appropriate TensorFlow functions.
####     Should be easy, if you recall your entropy formulas.

In [66]:
def mutual_information_first_and_second(J):
    return entropy_first(J) + entropy_second(J) - joint_entropy(J)


#### 19. Show the mutual information I(first;second) for the joint p1q1, p2q2, p3q3.

In [68]:
print('mutual information I(first;second) for the joint p1q1: ', sess.run(mutual_information_first_and_second(p1q1)))
print('mutual information I(first;second) for the joint p2q2: ', sess.run(mutual_information_first_and_second(p2q2)))
print('mutual information I(first;second) for the joint p3q3: ', sess.run(mutual_information_first_and_second(p3q3)))





mutual information I(first;second) for the joint p1q1:  0.015712142
mutual information I(first;second) for the joint p2q2:  0.2906201
mutual information I(first;second) for the joint p3q3:  0.0
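Mutual information can also be computed directly from its definition, sum p(x,y) log2( p(x,y) / (p(x) p(y)) ), which gives an independent check on the entropy-based formula. A minimal plain-Python sketch, with an illustrative function name and zero cells skipped:

```python
import math

def mutual_information_bits(table):
    """I(first; second) in bits, from sum p(x,y) * log2(p(x,y) / (p(x) * p(y)))."""
    rows = [sum(r) for r in table]        # marginal of the first component
    cols = [sum(c) for c in zip(*table)]  # marginal of the second component
    mi = 0.0
    for i, row in enumerate(table):
        for j, p in enumerate(row):
            if p > 0.0:  # convention: 0 * log(0) = 0
                mi += p * math.log2(p / (rows[i] * cols[j]))
    return mi

p1q1 = [[0.125, 0.125], [0.5, 0.25]]
independent = [[0.25, 0.25], [0.25, 0.25]]
print(mutual_information_bits(p1q1))         # about 0.0157
print(mutual_information_bits(independent))  # 0.0: independent components share no information
```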


#### 20. Define the mutual information function between the second component 
####       and the first component, for any joint probability table J, 
####       using appropriate TensorFlow functions.

In [70]:
# mutual information is symmetric
def mutual_information_second_and_first(J):
    return entropy_first(J)-conditional_entropy_first_if_second(J)


#### 21. Show the mutual information I(second; first) for the joint p1q1, p2q2, p3q3.
####    .......
###  Since mutual information I(X;Y) is symmetric, the results must be identical to the above.

In [71]:
print('mutual information I(second;first) for the joint p1q1: ', sess.run(mutual_information_second_and_first(p1q1)))
print('mutual information I(second;first) for the joint p2q2: ', sess.run(mutual_information_second_and_first(p2q2)))
print('mutual information I(second;first) for the joint p3q3: ', sess.run(mutual_information_second_and_first(p3q3)))





mutual information I(second;first) for the joint p1q1:  0.015712142
mutual information I(second;first) for the joint p2q2:  0.29061997
mutual information I(second;first) for the joint p3q3:  -0.0
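The two mutual information functions use different expressions, and the chain rule shows why they must agree. Writing X for the first component and Y for the second:

```latex
\begin{aligned}
I(X;Y) &= H(X) - H(X \mid Y) \\
       &= H(X) - \bigl(H(X,Y) - H(Y)\bigr) \\
       &= H(X) + H(Y) - H(X,Y) \\
       &= H(Y) - H(Y \mid X) = I(Y;X)
\end{aligned}
```

The small difference in the last digits for p2q2 (0.2906201 versus 0.29061997) is consistent with float32 rounding, since the two expressions add and subtract the same terms in a different order.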


## .....

### KL (and other divergences, distances) apply only to distributions
### of the same length (on the same space of outcomes)

#### 22. Define the Kullback-Leibler (relative entropy) function  between the first component 
####     and the second component for any joint probability table J, using appropriate TensorFlow functions.
####     This entropy must be base 2.
####     It should never produce NaN or Inf.

In [74]:
def kullback_leibler_first_and_second(J):
    first_component = first(J)
    second_component = second(J)
    sh = tf.shape(first_component)
    Q = tf.fill(sh, 1e-10)
    # 1.44269 = log2(e) converts the natural log to base 2, as required above.
    return tf.reduce_sum(tf.multiply(first_component, tf.log((first_component+Q)/(second_component+Q))*1.44269))


#### 23. Show the Kullback-Leibler divergence KL(first;second) for the joint p1q1 and p3q3.
####       p2q2 cannot be used.

In [78]:
print('KL(first;second) for the joint p1q1: ', sess.run(kullback_leibler_first_and_second(p1q1)))
print('KL(first;second) for the joint p3q3: ', sess.run(kullback_leibler_first_and_second(p3q3)))

KL(first;second) for the joint p1q1:  0.4195
KL(first;second) for the joint p3q3:  33.2192


#### 24. Define the Kullback-Leibler (relative entropy) function  between the second component 
####     and the first component for any joint probability table J, using appropriate TensorFlow functions.
####     This entropy must be base 2.
####     It should never produce NaN or Inf.

In [81]:
def kullback_leibler_second_and_first(J):
    first_component = first(J)
    second_component = second(J)
    sh = tf.shape(first_component)
    Q = tf.fill(sh, 1e-10)
    # 1.44269 = log2(e) converts the natural log to base 2, as required above.
    return tf.reduce_sum(tf.multiply(second_component, tf.log((second_component+Q)/(first_component+Q))*1.44269))


#### 25. Show the Kullback-Leibler divergence KL(second;first) for the joint p1q1 and p3q3.
####       p2q2 cannot be used.
####   ....
###    KL(X;Y) is not symmetric, so you would not always get the same results as above.

In [82]:
print('KL(second;first) for the joint p1q1: ', sess.run(kullback_leibler_second_and_first(p1q1)))
print('KL(second;first) for the joint p3q3: ', sess.run(kullback_leibler_second_and_first(p3q3)))

KL(second;first) for the joint p1q1:  0.4512
KL(second;first) for the joint p3q3:  33.2192
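The unit of a KL divergence depends on the logarithm used: a natural log gives nats, and multiplying by log2(e) (equivalently, dividing by ln 2) gives bits. The sketch below computes both in plain Python for the marginals of p1q1; the function `kl_divergence` and its `eps` and `base2` parameters are hypothetical names chosen for this illustration.

```python
import math

def kl_divergence(p, q, eps=1e-10, base2=True):
    """KL(p || q) for two equal-length probability vectors.

    eps avoids log(0) and division by zero, as the 1e-10 offset does in the notebook.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    nats = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    return nats / math.log(2.0) if base2 else nats

first = [0.25, 0.75]     # first-component marginal of p1q1
second = [0.625, 0.375]  # second-component marginal of p1q1
print(kl_divergence(first, second, base2=False))  # about 0.2908 nats
print(kl_divergence(first, second))               # about 0.4195 bits
print(kl_divergence(second, first))               # about 0.4512 bits: KL is not symmetric
```

For p3q3 the marginals are [1, 0] and [0, 1], so the true divergence is infinite. The offset caps it at about log(1e10), which is roughly 23.03 nats or 33.22 bits.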


# ................

###   There is a true metric (true distance) defined for probability distributions of the same
###    length, shown in my write-ups.
### ...
####   26. Define this true metric function between the components of any joint probability table J, 
####   using the conditional entropy functions you defined above.

In [83]:
def distance_first_second(J):
    return conditional_entropy_first_if_second(J) + conditional_entropy_second_if_first(J)


#### 27. Show the distance distance(first,second) for the joint p1q1 and p3q3.
####       p2q2 cannot be used.
####   ....

In [85]:
print('Distance(first,second) for the joint p1q1: ', sess.run(distance_first_second(p1q1)))
print('Distance(first,second) for the joint p3q3: ', sess.run(distance_first_second(p3q3)))




Distance(first,second) for the joint p1q1:  1.7342818
Distance(first,second) for the joint p3q3:  0.0
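Substituting the chain rule into both conditional entropies gives distance = 2 H(first, second) - H(first) - H(second). The following plain-Python sketch evaluates that form as a cross-check; the helper names are illustrative, and zero probabilities are skipped instead of being offset by 1e-10.

```python
import math

def entropy_bits(probs):
    """Base-2 entropy of a probability vector, skipping zero entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def distance_bits(table):
    """H(first|second) + H(second|first), computed as 2*H(joint) - H(first) - H(second)."""
    joint = entropy_bits([p for row in table for p in row])
    h_first = entropy_bits([sum(row) for row in table])
    h_second = entropy_bits([sum(col) for col in zip(*table)])
    return 2.0 * joint - h_first - h_second

p1q1 = [[0.125, 0.125], [0.5, 0.25]]
p3q3 = [[0.0, 1.0], [0.0, 0.0]]
print(distance_bits(p1q1))  # about 1.7343
print(distance_bits(p3q3))  # zero: each component determines the other
```

Swapping the two components (transposing the table) leaves the value unchanged, which is the symmetry a distance requires.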


# ...