# Compositional Nutrient Diagnosis

Compositional Nutrient Diagnosis (CND) is the multivariate expansion of the Critical Value Approach (CVA) and the
Diagnosis and Recommendation Integrated System (DRIS), and it is fully compatible with principal component analysis (PCA).
Each CND nutrient index is composed of two separate functions: one considers differences in nutrient levels between the
individual and target specimens, the other considers differences in their nutrient balances (as defined by nutrient geometric means).
Together these functions indicate that a nutrient insufficiency can be corrected either by adding a single nutrient or by taking
advantage of multiple nutrient interactions to improve the nutrient balance as a whole.

In [83]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

## DRIS

In [131]:
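# x holds the nutrient concentrations as a 2-D vector (assumed to be defined in an earlier cell),
# so x / np.transpose(x) broadcasts to the matrix of all pairwise ratios
Z_dris = np.log(x / np.transpose(x))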
Z_dris

array([[ 0.        ,  1.16567561, -0.108207  ,  0.68469028,  1.56787999,
         0.87518704,  0.51983389, -0.2322291 , -0.16603979, -0.53872644],
       [-1.16567561,  0.        , -1.27388262, -0.48098533,  0.40220437,
        -0.29048858, -0.64584173, -1.39790472, -1.3317154 , -1.70440205],
       [ 0.108207  ,  1.27388262,  0.        ,  0.79289728,  1.67608699,
         0.98339404,  0.62804089, -0.1240221 , -0.05783278, -0.43051943],
       [-0.68469028,  0.48098533, -0.79289728,  0.        ,  0.88318971,
         0.19049676, -0.16485639, -0.91691938, -0.85073007, -1.22341672],
       [-1.56787999, -0.40220437, -1.67608699, -0.88318971,  0.        ,
        -0.69269295, -1.0480461 , -1.80010909, -1.73391977, -2.10660642],
       [-0.87518704,  0.29048858, -0.98339404, -0.19049676,  0.69269295,
         0.        , -0.35535315, -1.10741614, -1.04122682, -1.41391347],
       [-0.51983389,  0.64584173, -0.62804089,  0.16485639,  1.0480461 ,
         0.35535315,  0.        , -0.75206299
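
The pairwise log-ratio matrix can be reproduced on a small example. This is a minimal sketch: the concentrations in `x_demo` are made-up values, not data from this notebook, and `x_demo` is assumed to be a column vector so that broadcasting yields every ratio $x_i/x_j$.

```python
import numpy as np

# Hypothetical concentrations of three nutrients (illustrative values only),
# stored as a column vector of shape (3, 1).
x_demo = np.array([[2.0], [1.0], [4.0]])

# Broadcasting (3, 1) / (1, 3) gives the 3x3 matrix of ratios x_i / x_j.
Z_demo = np.log(x_demo / x_demo.T)

# log(x_i / x_j) = -log(x_j / x_i), so the matrix is antisymmetric
# and its diagonal is zero.
print(np.allclose(Z_demo, -Z_demo.T))
print(Z_demo)
```

The antisymmetry is the same pattern visible in the `Z_dris` output: every entry below the diagonal is the negative of its mirror entry above it.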

## Utilities for CND

In [6]:
c = np.random.sample((10,5))
c

array([[0.29667549, 0.0564011 , 0.63544837, 0.92230021, 0.98857023],
       [0.33936724, 0.7274561 , 0.97540163, 0.18606172, 0.88492621],
       [0.64999822, 0.2542382 , 0.91042318, 0.13933518, 0.76351368],
       [0.72396227, 0.38190234, 0.84655469, 0.66223434, 0.14185733],
       [0.49844862, 0.51691692, 0.40981146, 0.5474084 , 0.50739282],
       [0.42925977, 0.91847479, 0.43964169, 0.059856  , 0.75690859],
       [0.4269626 , 0.36918995, 0.24692452, 0.0756029 , 0.5147596 ],
       [0.11296652, 0.06716016, 0.41018572, 0.39149682, 0.02629192],
       [0.82178123, 0.53137347, 0.20651435, 0.76992326, 0.64020397],
       [0.02620175, 0.33113379, 0.21937433, 0.77492094, 0.71557128]])

### Calculation of $z_i = \log(x_i/\bar{x}_\mathrm{geo})$

In [13]:
def calculate_z(c):
    ''' Calculates z for CND analysis 
      Args:
        c (array) concentrations, one row per plant and one column per nutrient
      Returns:
        z (array) z values of each nutrient for every plant
    '''
    # normalize the nutrients of each plant to sum to 1
    row_sums = c.sum(axis=1)
    x = c / row_sums[:, np.newaxis]
    # geometric average over the nutrients of each plant (one value per row)
    g = np.exp(np.log(x).mean(axis=1, keepdims=True))
    # calculate the z value for each nutrient for every plant
    z = np.log(x / g)
    return z

# Here you could add some statistics and show the distribution of the concentration and the z values
calculate_z(c)
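
A quick way to check a $z$ transformation: when $\bar{x}_\mathrm{geo}$ is the geometric mean over the nutrients of one plant, the $z$ values of each plant sum to zero, and rescaling all concentrations leaves $z$ unchanged. The sketch below verifies both properties on random data. The helper name `clr_rows` and the random input are assumptions made for illustration only.

```python
import numpy as np

def clr_rows(c):
    """Row-wise log of each part over the geometric mean of its row."""
    c = np.asarray(c, dtype=float)
    x = c / c.sum(axis=1, keepdims=True)               # close each row to sum 1
    log_x = np.log(x)
    # subtracting the mean log equals dividing by the geometric mean
    return log_x - log_x.mean(axis=1, keepdims=True)

rng = np.random.default_rng(0)
c_demo = rng.random((10, 5)) + 0.01                    # offset avoids log(0)
z_demo = clr_rows(c_demo)

print(np.allclose(z_demo.sum(axis=1), 0.0))            # every row sums to zero
print(np.allclose(z_demo, clr_rows(3.7 * c_demo)))     # invariant to rescaling
```

If the row sums of a computed $z$ array are not zero, the geometric mean was most likely taken over the wrong axis or with the wrong exponent.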

### Calculation of the CND index $I_{z_i}$

The CND index is given by $I_{z_i}=(Z_i - z_i) / \sigma_{z_i}$
- $Z_i$ is the z-value of the **test** population for nutrient $i$
- $z_i$ is the z-value of the **target** population for nutrient $i$
- $\sigma_{z_i}$ is the standard deviation of the z-value of the **target** population for nutrient $i$


This index $I_{z_i}$ is the difference of the z-values normalized by the standard deviation of the target population.
Therefore, for each nutrient, $I_{z_i}$ measures the distance between the test and target population.

The normalization with $\sigma_{z_i}$ puts the index on a sensible scale. If a nutrient of the target population has a large standard deviation, the range of 'acceptable' nutrient amounts is large. As a result, the $I_{z_i}$ for this nutrient is smaller for the same difference in z-values.
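
The effect of the normalization can be seen with two made-up numbers. In this sketch the difference `diff` and the two standard deviations in `sigma` are hypothetical values chosen only to illustrate the scaling.

```python
import numpy as np

diff = 0.2                       # hypothetical difference Z_i - z_i
sigma = np.array([0.1, 0.4])     # hypothetical narrow and wide target spreads

index = diff / sigma             # same difference, two different scales
print(index)                     # the wider spread gives the smaller index
```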

The interpretation of the index $I_{z_i}$:
- $I_{z_i} < 0$: relative nutrient insufficiency
- $I_{z_i} = 0$: relative nutrient balance
- $I_{z_i} > 0$: relative nutrient excess
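
The three cases can be turned into labels in a few lines. This is only a sketch: the index values are invented, and the tolerance `tol` is an assumption that is not part of the definition above. It is used here because a computed index is almost never exactly zero.

```python
import numpy as np

I_demo = np.array([-1.8, -0.2, 0.0, 0.4, 2.1])   # hypothetical index values
tol = 0.5                                        # hypothetical tolerance around zero

# Negative beyond the tolerance: insufficiency; positive beyond it: excess.
labels = np.where(
    I_demo < -tol,
    "insufficiency",
    np.where(I_demo > tol, "excess", "balance"),
)
print(labels.tolist())
```

Setting `tol = 0` reduces this to the strict sign rule from the list.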



We can deepen the discussion of $I_{z_i}$ by splitting it into two terms and analysing each one:
$I_{z_i} = \frac{1}{\sigma_{z_i}} \biggl[\underbrace{\log\left( \frac{X_i}{x_i} \right)}_{f(X_i)} - \underbrace{\log\left( \frac{g(X)}{g(x)} \right)}_{f(g(X))} \biggr]$

Here $g(X)$ and $g(x)$ are the geometric means over all nutrients of the test and target specimens.

- The first term $f(X_i) = \log\left( \frac{X_i}{x_i} \right)$ depends only on the **individual** nutrient $i$
- The second term $f(g(X)) = \log\left( \frac{g(X)}{g(x)} \right)$ depends only on the geometric means $g$ and therefore takes **every** nutrient into account. It enters with a negative sign because $z_i = \log(x_i / g(x))$
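
Because $z_i = \log(x_i / g)$, the difference $Z_i - z_i$ equals the individual log ratio minus the log ratio of the geometric means. The sketch below checks this identity numerically. The two compositions and the helper `geometric_mean` are assumptions for illustration.

```python
import numpy as np

def geometric_mean(v):
    """Geometric mean of a 1-D array of positive values."""
    return np.exp(np.mean(np.log(v)))

X = np.array([0.30, 0.10, 0.60])   # hypothetical test composition
x = np.array([0.25, 0.25, 0.50])   # hypothetical target composition

Z = np.log(X / geometric_mean(X))  # z-values of the test specimen
z = np.log(x / geometric_mean(x))  # z-values of the target specimen

term_individual = np.log(X / x)                                # one value per nutrient
term_balance = np.log(geometric_mean(X) / geometric_mean(x))   # one scalar for all nutrients

print(np.allclose(Z - z, term_individual - term_balance))
```

The balance term is a single scalar shared by all nutrients, which is why a change in one nutrient shifts the index of every other nutrient.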

In [None]:
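def calculate_I(c_population, c_target):
    ''' Calculates I for CND analysis 
      Args:
        c_population (array) concentrations of the test population, one row per plant
        c_target (array) concentrations of the target population, one row per plant
      Returns:
        I (array) I values of each nutrient for every plant of the test population
    '''
    z_population = calculate_z(c_population)
    z_target = calculate_z(c_target)
    # per-nutrient mean and standard deviation of the target population
    z_mean = z_target.mean(axis=0)
    z_std = z_target.std(axis=0)
    I = (z_population - z_mean) / z_std
    return I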
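
An end-to-end run on synthetic data shows how the index reacts when one nutrient is depleted. This is a self-contained sketch, not the notebook's own pipeline: the helper names `clr` and `cnd_index` are hypothetical, the populations are random numbers, and the target population is assumed to be summarized by its per-nutrient mean and standard deviation.

```python
import numpy as np

def clr(c):
    """Row-wise log ratio to the geometric mean (unaffected by row scaling)."""
    log_c = np.log(np.asarray(c, dtype=float))
    return log_c - log_c.mean(axis=-1, keepdims=True)

def cnd_index(c_test, c_target):
    """Index per plant and nutrient, relative to the target mean and spread."""
    z_target = clr(c_target)
    return (clr(c_test) - z_target.mean(axis=0)) / z_target.std(axis=0)

rng = np.random.default_rng(1)
c_target = rng.uniform(0.5, 1.5, size=(50, 4))   # synthetic target population

c_before = c_target[:3]                          # three plants taken from the target
c_after = c_before.copy()
c_after[:, 0] *= 0.3                             # deplete the first nutrient

I_before = cnd_index(c_before, c_target)
I_after = cnd_index(c_after, c_target)
print(I_before.round(2))
print(I_after.round(2))
```

After the depletion the index of the first nutrient drops for every plant, while the indices of the other nutrients rise, because the geometric mean of each plant has decreased.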