# Information-Theoretic Approach

Information theory, developed by Shannon and others as a branch of applied mathematics, electrical engineering, and computer science, concerns the quantification of information, which is usually carried by a probability distribution function. Because the electron density is a continuous probability distribution function, information theory has been applied to DFT to study atoms and molecules since as early as 1980. We call this category of work the information-theoretic approach (ITA).

ITA uses three different representations:

* Electron density representation

* Shape function representation

* Atoms-in-molecules representation

## Electron Density Representation

Using the electron density $\rho(\mathbf{r})$ as the probability function in information theory yields the first ITA representation. A key measure of information is entropy, which quantifies the uncertainty involved in predicting the value of the distribution function. The Shannon entropy was the first such measure to be widely used in the literature. It reads

$$
S_{\text{S}} = -\int \rho(\mathbf{r}) \ln \rho(\mathbf{r}) d\mathbf{r} = \int s_{\text{S}}(\mathbf{r}) d\mathbf{r} \tag{1}
$$ 

where $s_{\text{S}}(\mathbf{r})$ is the Shannon entropy density and $\rho(\mathbf{r})$ is the total electron density, which satisfies the following normalization condition for the total number of electrons, $N$, of the system:

$$
\int \rho(\mathbf{r}) d\mathbf{r} = N \tag{2}
$$
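As a minimal numerical sketch of Eqs. (1) and (2), the block below evaluates both integrals for the hydrogen 1s density $\rho(r) = e^{-2r}/\pi$ (atomic units) on a radial grid. The test density, the grid, and the helper `radial_integral` are assumptions made for illustration only; they are not part of any particular package. For this density the Shannon entropy has the closed form $3 + \ln\pi$, which gives a convenient check.

```python
import numpy as np


def radial_integral(f, r):
    """Trapezoidal estimate of the 3D integral of a spherically symmetric f(r)."""
    y = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))


# Radial grid in bohr; it starts slightly above zero to avoid dividing by r = 0 later on.
r = np.linspace(1e-6, 30.0, 200_001)

# Hydrogen 1s density in atomic units (illustrative test density, N = 1).
rho = np.exp(-2.0 * r) / np.pi

n_electrons = radial_integral(rho, r)             # Eq. (2)
shannon = radial_integral(-rho * np.log(rho), r)  # Eq. (1)

print(f"N   = {n_electrons:.6f}")
print(f"S_S = {shannon:.6f} (analytic value: {3.0 + np.log(np.pi):.6f})")
```

The same helper works for any spherically symmetric density sampled on a radial grid; molecular densities would need a three-dimensional quadrature instead.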

The Shannon entropy measures the spatial delocalization of the electron density. The second important measure in information theory is the Fisher information $I_{\text{F}}$, defined as follows

$$
I_{\text{F}} 
\equiv \int i_{\text{F}}(\mathbf{r}) d\mathbf{r}  
= \int \frac{|\nabla\rho(\mathbf{r})|^2}{\rho(\mathbf{r})} d\mathbf{r} \tag{3}
$$

which is a gauge of the sharpness or concentration of the electron density distribution. In Eq. (3), $i_{\text{F}}(\mathbf{r})$ is the Fisher information density and $\nabla \rho(\mathbf{r})$ is the density gradient. Earlier, we have proved that there is an equivalent expression for the Fisher information in terms of the Laplacian of the electron density, $\nabla^2 \rho(\mathbf{r})$,

$$
I^{\prime}_{\text{F}} 
\equiv \int i^{\prime}_{\text{F}}(\mathbf{r}) d\mathbf{r} 
= -\int \nabla^2\rho(\mathbf{r}) \ln \rho(\mathbf{r})  d\mathbf{r} 
\tag{4}
$$

Equations (3) and (4) are equivalent in the sense that one can be derived from the other through integration by parts, so the two integrals have the same value. As has been shown, however, the local behaviors of the two integrands, $i_{\text{F}}(\mathbf{r})$ and $i^{\prime}_{\text{F}}(\mathbf{r})$, are markedly different. More importantly, we have proved the existence of the following rigorous relationship among the three quantities $s_{\text{S}}(\mathbf{r})$, $i_{\text{F}}(\mathbf{r})$, and $i^{\prime}_{\text{F}}(\mathbf{r})$,

$$
S_S = \int s_S(\mathbf{r}) d\mathbf{r} 
= -N + \frac{1}{4\pi} 
\int\int \frac{i_F({\mathbf{r}}) - i^{\prime}_F(\mathbf{r}^{\prime})}{|\mathbf{r} - \mathbf{r}^{\prime}|}
d \mathbf{r} d\mathbf{r}^{\prime}
\tag{5}
$$

whose validity has subsequently been verified by numerical results.
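The equivalence of Eqs. (3) and (4) can be illustrated with a small numerical check. The sketch below again assumes the hydrogen 1s density as a test case and uses its analytic radial derivatives; the grid and the helper `radial_integral` are illustrative choices. Both integrals should come out close to 4, while the two integrands differ pointwise (for instance, the Laplacian of this density vanishes at $r = 1$, so $i^{\prime}_{\text{F}}$ is zero there while $i_{\text{F}}$ is not).

```python
import numpy as np


def radial_integral(f, r):
    """Trapezoidal estimate of the 3D integral of a spherically symmetric f(r)."""
    y = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))


r = np.linspace(1e-6, 30.0, 200_001)
rho = np.exp(-2.0 * r) / np.pi            # hydrogen 1s density (assumed test case)

# Analytic derivatives of the 1s density.
grad_rho = -2.0 * rho                     # d(rho)/dr
lap_rho = 4.0 * rho - 4.0 * rho / r       # rho'' + (2/r) rho'

i_f = grad_rho**2 / rho                   # integrand of Eq. (3)
i_f_prime = -lap_rho * np.log(rho)        # integrand of Eq. (4)

fisher = radial_integral(i_f, r)
fisher_prime = radial_integral(i_f_prime, r)

print(f"I_F  = {fisher:.6f}")
print(f"I'_F = {fisher_prime:.6f}")
print(f"largest pointwise difference of the integrands: {np.max(np.abs(i_f - i_f_prime)):.3f}")
```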

The third quantity in the same spirit is the Ghosh-Berkowitz-Parr (GBP) entropy

$$
S_{GBP} = \int \frac{3}{2}k\rho(\mathbf{r})\left[c+ \ln\frac{t(\mathbf{r};\rho)}{t_{TF}(\mathbf{r};\rho)} \right]  d\mathbf{r}
\tag{6} 
$$

where $t(\mathbf{r}; \rho)$ is the kinetic energy density, which is related to the total kinetic energy $T_S$ via

$$
\int t(\mathbf{r}, \rho) d\mathbf{r} = T_S
\tag{7}
$$

and $t_{\text{TF}}(\mathbf{r}, \rho)$ is the Thomas-Fermi kinetic energy density,

$$
t_{\text{TF}}(\mathbf{r}, \rho) = c_K \rho^{5/3}(\mathbf{r}) 
\tag{8}
$$

with k as the Boltzmann constant, $c = (5/3) + \ln(4\pi c_K/3)$, and $c_K = (3/10)(3\pi^2)^{2/3}$. The GBP entropy originates from the effort to transcribe the ground-state density functional theory into a local thermodynamics through the phase-space distribution function $f(\mathbf{r}, \mathbf{p})$ which is a function of both the electron position $\mathbf{r}$ and momentum $\mathbf{p}$ as its two basic variables. The conditions of such a recast of DFT into thermodynamics are that the phase-space distribution function is associated with the ground state electron density $\rho(\mathbf{r})$ and kinetic energy density $t(\mathbf{r}, \rho)$ through the following relationships

$$
\rho(\mathbf{r}) = \int f(\mathbf{r}, \mathbf{p}) d\mathbf{p} 
\tag{9}
$$
$$
t(\mathbf{r}, \rho) = \frac{1}{2} \int f(\mathbf{r}, \mathbf{p}) p^2 d\mathbf{p} 
\tag{10}
$$

The specific form of the local kinetic energy $t(\mathbf{r}; \rho)$ used is the following, where $\rho_i$ is the density of the $i$-th occupied orbital:

$$
t(\mathbf{r};\rho) = \sum_i \frac{1}{8} \frac{\nabla \rho_i \cdot \nabla \rho_i}{\rho_i}-\frac{1}{8} \nabla^2 \rho
\tag{11}
$$
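To make the GBP ingredients concrete, the following sketch evaluates the kinetic energy density, the Thomas-Fermi kinetic energy density of Eq. (8), and the GBP entropy for a one-orbital test system. It assumes the hydrogen 1s density (so the single orbital density equals $\rho$), takes the kinetic energy density as $\frac{1}{8}|\nabla\rho|^2/\rho - \frac{1}{8}\nabla^2\rho$, takes the GBP integrand as $\frac{3}{2}k\rho\,[c + \ln(t/t_{\text{TF}})]$, and sets $k = 1$. The helper and variable names are illustrative assumptions. The integral of $t$ should reproduce the exact kinetic energy of 0.5 hartree, as required by Eq. (7).

```python
import numpy as np


def radial_integral(f, r):
    """Trapezoidal estimate of the 3D integral of a spherically symmetric f(r)."""
    y = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))


K_BOLTZMANN = 1.0                                   # assumed default, k = 1
C_K = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0)         # Thomas-Fermi constant c_K
C = 5.0 / 3.0 + np.log(4.0 * np.pi * C_K / 3.0)     # constant c in the GBP entropy

r = np.linspace(1e-6, 30.0, 200_001)
rho = np.exp(-2.0 * r) / np.pi                      # hydrogen 1s density (assumed test case)
grad_rho = -2.0 * rho                               # d(rho)/dr
lap_rho = 4.0 * rho - 4.0 * rho / r                 # Laplacian of rho

# Kinetic energy density for a single occupied orbital (rho_i = rho).
t = grad_rho**2 / (8.0 * rho) - lap_rho / 8.0
t_tf = C_K * rho ** (5.0 / 3.0)                     # Eq. (8)

kinetic_energy = radial_integral(t, r)              # Eq. (7)
s_gbp = radial_integral(1.5 * K_BOLTZMANN * rho * (C + np.log(t / t_tf)), r)

print(f"T_S   = {kinetic_energy:.6f} (exact: 0.5)")
print(f"S_GBP = {s_gbp:.6f}")
```

For this test density $t(\mathbf{r})$ simplifies to $\rho/(2r)$, which is positive everywhere, so the logarithm is well defined on the whole grid. That need not hold for a general density.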
 
More recently, three information-theoretic quantities, the Rényi entropy, the Tsallis entropy, and the Onicescu information energy, were introduced as new reactivity descriptors in density functional reactivity theory (DFRT). The Rényi entropy of order $n$, where $n > 0$ and $n \neq 1$, is defined as

$$
R_n = \frac{1}{1-n} \ln \left[ \int \rho^n(\mathbf{r}) d\mathbf{r} \right]
\tag{12}
$$

As $n$ approaches 1, the Rényi entropy, Eq. (12), reduces to the Shannon entropy, Eq. (1). The Tsallis entropy of order $n$ is defined as follows

$$
T_n = \frac{1}{n-1} \left[1 - \int \rho^n(\mathbf{r}) d\mathbf{r} \right]
\tag{13}
$$

It is a generalization of the standard Boltzmann-Gibbs entropy. The common term in Eqs. (12) and (13) is the integral of the $n$-th power of the electron density, from which the Onicescu information energy of order $n$ is defined:

$$
E_n = \frac{1}{n-1}\int \rho^n(\mathbf{r})  d\mathbf{r} 
\tag{14}
$$

Onicescu introduced this quantity in an attempt to define a finer measure of the dispersion of a distribution than the Shannon entropy provides.
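Since all three quantities are built from the same integral $\int \rho^n(\mathbf{r}) d\mathbf{r}$, they can be computed together. The sketch below is an illustration under stated assumptions: it takes $R_n = \frac{1}{1-n}\ln\int\rho^n d\mathbf{r}$, $T_n = \frac{1}{n-1}\left[1 - \int\rho^n d\mathbf{r}\right]$, and $E_n = \frac{1}{n-1}\int\rho^n d\mathbf{r}$, and uses the hydrogen 1s density, for which $\int\rho^n d\mathbf{r} = \pi^{1-n}/n^3$. The function name `generalized_entropies` is hypothetical.

```python
import numpy as np


def radial_integral(f, r):
    """Trapezoidal estimate of the 3D integral of a spherically symmetric f(r)."""
    y = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))


def generalized_entropies(rho, r, order):
    """Return the Renyi entropy, Tsallis entropy, and Onicescu energy of a given order."""
    if order <= 0.0 or order == 1.0:
        raise ValueError("order must be positive and different from 1")
    power_integral = radial_integral(rho**order, r)   # shared term: integral of rho^n
    return {
        "renyi": np.log(power_integral) / (1.0 - order),
        "tsallis": (1.0 - power_integral) / (order - 1.0),
        "onicescu": power_integral / (order - 1.0),
    }


r = np.linspace(1e-6, 30.0, 200_001)
rho = np.exp(-2.0 * r) / np.pi        # hydrogen 1s density (assumed test case)

values = generalized_entropies(rho, r, order=2.0)
for name, value in values.items():
    print(f"{name:9s} = {value:.6f}")
```

For order 2 the analytic values are $R_2 = \ln(8\pi)$, $T_2 = 1 - 1/(8\pi)$, and $E_2 = 1/(8\pi)$.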

Closely related to the concept of entropy in information theory is the relative entropy, which is a non-symmetric measure of the entropy difference between two probability distribution functions. Well-known examples in the literature are the relative Shannon entropy, also called information gain, Kullback-Leibler divergence, or information divergence, defined by

$$
S_r = \int \rho(\mathbf{r}) \ln \frac{\rho(\mathbf{r})}{\rho_0(\mathbf{r})}  d\mathbf{r}
\tag{15}
$$

and the relative Renyi entropy of order n

$$
R^r_n = \frac{1}{n-1} \ln  \left[ \int \frac{\rho^n(\mathbf{r})}{\rho^{n-1}_0(\mathbf{r})}  d\mathbf{r} \right]
\tag{16}
$$

where $\rho_0(\mathbf{r})$ is the reference-state density, which satisfies the same normalization condition as $\rho(\mathbf{r})$. The reference density can come from the same molecule in a different conformation, or from the reactant of a chemical reaction when the transition state is investigated.
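The two relative measures can be sketched numerically as follows. The example assumes two hydrogen-like 1s densities with nuclear charges $Z = 1$ and $Z_0 = 1.2$, both normalized to one electron, as the density and its reference; this pair is chosen only because closed-form results exist for comparison and is not taken from the text above. The helper names are illustrative.

```python
import numpy as np


def radial_integral(f, r):
    """Trapezoidal estimate of the 3D integral of a spherically symmetric f(r)."""
    y = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))


def hydrogenic_density(r, z):
    """1s density of a one-electron ion with nuclear charge z (atomic units)."""
    return z**3 / np.pi * np.exp(-2.0 * z * r)


Z, Z0, ORDER = 1.0, 1.2, 2.0          # assumed charges and Renyi order
r = np.linspace(1e-6, 30.0, 200_001)
rho = hydrogenic_density(r, Z)        # density of interest
rho0 = hydrogenic_density(r, Z0)      # reference density, same normalization

# Relative Shannon entropy (information gain), Eq. (15).
info_gain = radial_integral(rho * np.log(rho / rho0), r)

# Relative Renyi entropy of order n, Eq. (16).
rel_renyi = np.log(radial_integral(rho**ORDER / rho0 ** (ORDER - 1.0), r)) / (ORDER - 1.0)

print(f"S_r   = {info_gain:.6f}")
print(f"R^r_2 = {rel_renyi:.6f}")
```

Swapping `rho` and `rho0` gives different numbers, which reflects the non-symmetric character of both measures.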

<div class="alert alert-info">

**Notes**

* k is the Boltzmann constant, by default 1.0 for convenience.


</div>

## Shape Function Representation

The information-theoretic quantities defined in Eqs. (1) to (16) employ the electron density as the probability distribution function. There is another distribution function in DFRT, the shape function $\sigma(\mathbf{r})$, which is related to the electron density $\rho(\mathbf{r})$ and the total number of electrons $N$ through the following relationship,

$$
\rho(\mathbf{r}) = N\sigma(\mathbf{r})
\tag{17}
$$

with the following normalization condition

$$
\int \sigma(\mathbf{r}) d\mathbf{r} = 1
\tag{18}
$$

The information-theoretic quantities defined in Eqs. (1) to (16) can similarly be redefined with the shape function, yielding

$$
S_{\sigma} = -\int \sigma(\mathbf{r}) \ln \sigma(\mathbf{r}) d\mathbf{r} 
\tag{19}
$$ 
$$
I_{\sigma} = \int \frac{|\nabla\sigma(\mathbf{r})|^2}{\sigma(\mathbf{r})} d\mathbf{r} 
\tag{20}
$$
$$
I^{\prime}_{\sigma} = -\int \nabla^2\sigma(\mathbf{r}) \ln \sigma(\mathbf{r})  d\mathbf{r} 
\tag{21}
$$
$$
R_n^{\sigma} = \frac{1}{1-n} \ln \left[ \int \sigma^n(\mathbf{r}) d\mathbf{r} \right]
\tag{22}
$$
$$
T_n^{\sigma} = \frac{1}{n-1} \left[1 - \int \sigma^n(\mathbf{r}) d\mathbf{r} \right]
\tag{23}
$$
$$
E_n^{\sigma} = \frac{1}{n-1}\int \sigma^n(\mathbf{r})  d\mathbf{r} 
\tag{24}
$$
$$
S_r^{\sigma} = \int \sigma(\mathbf{r}) \ln \frac{\sigma(\mathbf{r})}{ \sigma_0 (\mathbf{r})}  d\mathbf{r}
\tag{25}
$$
$$
R^r_{\sigma,n} = \frac{1}{n-1} \ln  \left[ \int \frac{\sigma^n(\mathbf{r})}{\sigma^{n-1}_0(\mathbf{r})}  d\mathbf{r} \right]
\tag{26}
$$

Because of Eq. (17), quantities in these two representations are related, except for the GBP entropy, for which no analytical relationship between the two representations exists. As can be readily proved, we have

$$
S_{\sigma} = \frac{S_S}{N} + \ln N
\tag{27}
$$
$$
I_{F} = NI_{\sigma}
\tag{28}
$$
$$
I_{F}^{\prime} = NI_{\sigma}^{\prime}
\tag{29}
$$
$$
R_n - R^{\sigma}_n = \frac{n}{1-n} \ln N
\tag{30}
$$
$$
T_n - N^n T^{\sigma}_n = \frac{1-N^n}{n-1}
\tag{31}
$$
$$
E_n =N^n E^{\sigma}_n
\tag{32}
$$
$$
S_r = NS^{\sigma}_r
\tag{33}
$$

and

$$
R_n^r - R_{\sigma,n}^r = \frac{1}{n-1} \ln N
\tag{34}
$$

These rigorous relationships between the two representations allow information-theoretic quantities obtained in one representation to be converted to the other.
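Several of these relationships can be verified numerically. The sketch below assumes a helium-like toy system with $N = 2$ and a hydrogenic shape function with effective charge $27/16$, and it takes the Rényi, Tsallis, and Onicescu quantities as $\frac{1}{1-n}\ln\int p^n$, $\frac{1}{n-1}[1-\int p^n]$, and $\frac{1}{n-1}\int p^n$, with $p$ standing for either $\rho$ or $\sigma$. All names and parameter values are illustrative assumptions. Each printed residual should be close to zero.

```python
import numpy as np


def radial_integral(f, r):
    """Trapezoidal estimate of the 3D integral of a spherically symmetric f(r)."""
    y = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))


N = 2.0               # number of electrons (assumed helium-like toy system)
Z = 27.0 / 16.0       # assumed effective nuclear charge
ORDER = 2.0           # order n of the generalized entropies

r = np.linspace(1e-6, 30.0, 200_001)
sigma = Z**3 / np.pi * np.exp(-2.0 * Z * r)   # shape function, integrates to 1
rho = N * sigma                               # Eq. (17)


def shannon(p):
    return radial_integral(-p * np.log(p), r)


def fisher(p):
    # For this exponential form the radial gradient is -2 Z p.
    return radial_integral((-2.0 * Z * p) ** 2 / p, r)


def power_integral(p):
    return radial_integral(p**ORDER, r)


def renyi(p):
    return np.log(power_integral(p)) / (1.0 - ORDER)


def tsallis(p):
    return (1.0 - power_integral(p)) / (ORDER - 1.0)


def onicescu(p):
    return power_integral(p) / (ORDER - 1.0)


# Left-hand side minus right-hand side of each relationship.
residuals = {
    "Eq. (27)": shannon(sigma) - (shannon(rho) / N + np.log(N)),
    "Eq. (28)": fisher(rho) - N * fisher(sigma),
    "Eq. (30)": renyi(rho) - renyi(sigma) - ORDER / (1.0 - ORDER) * np.log(N),
    "Eq. (31)": tsallis(rho) - N**ORDER * tsallis(sigma) - (1.0 - N**ORDER) / (ORDER - 1.0),
    "Eq. (32)": onicescu(rho) - N**ORDER * onicescu(sigma),
}

for label, value in residuals.items():
    print(f"{label}: residual = {value:.2e}")
```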

## Atoms-in-Molecules Representation

Another important aspect of the information-theoretic approach is to re-evaluate the above quantities from the perspective of atoms in molecules. To obtain the atomic contributions to an information-theoretic quantity in a molecular system, three approaches are available for partitioning a molecule into atoms: Becke's fuzzy atom approach, Bader's zero-flux AIM approach, and Hirshfeld's stockholder approach.


   
<div class="alert alert-warning">

**TODO**

* Implement the Becke and Bader AIM approaches.

</div>

The total electron population $N$ of the system is the sum of the atomic electron populations $N_A$, each obtained by integrating the corresponding atomic density $\rho_A(\mathbf{r})$,

$$
N = \sum_A N_A = \sum_A \int \rho_A(\mathbf{r}) d\mathbf{r}
\tag{35}
$$

and

$$
S_{S} = -\sum_{A} \int_{\Omega_A} \rho_{A}(\mathbf{r}) \ln \rho_{A}(\mathbf{r}) d\mathbf{r} 
\tag{36}
$$ 
$$
I_{F} = \sum_{A} \int_{\Omega_A} \frac{|\nabla\rho_A(\mathbf{r})|^2}{\rho_A(\mathbf{r})} d\mathbf{r} 
\tag{37}
$$
$$
I^{\prime}_{F} = -\sum_{A} \int_{\Omega_A} \nabla^2\rho_A(\mathbf{r}) \ln \rho_A(\mathbf{r})  d\mathbf{r} 
\tag{38}
$$
$$
S_{GBP} = \sum_{A} \int_{\Omega_A} \frac{3}{2}k\rho_A(\mathbf{r})\left[c+ \ln\frac{t(\mathbf{r};\rho_A)}{t_{TF}(\mathbf{r};\rho_A)} \right]  d\mathbf{r}
\tag{39}
$$
$$
E_n = \frac{1}{n-1} \sum_{A} \int_{\Omega_A} \rho_A^n(\mathbf{r})  d\mathbf{r} 
\tag{40}
$$

and 

$$
S_r = \sum_{A} \int_{\Omega_A} \rho_A(\mathbf{r}) \ln \frac{\rho_A(\mathbf{r})}{ \rho_A^0 (\mathbf{r})}  d\mathbf{r}
\tag{41}
$$

where $\rho_A(\mathbf{r})$ is the electron density of atom (or group) A in a molecule whose total molecular electron density is $\rho(\mathbf{r})$, $\rho_A^0(\mathbf{r})$ is the counterpart of atom (or group) A in the reference state, which can be a neutral atom, an ion, a group, etc., and $\Omega_A$ is the atomic basin of atom A in the molecule. The counterparts in terms of the shape function can be derived similarly.
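As a toy illustration of the stockholder idea, the sketch below partitions a one-dimensional model density into two atomic contributions using Hirshfeld weights $w_A(x) = \rho_A^0(x) / \sum_B \rho_B^0(x)$, so that $\rho_A(x) = w_A(x)\rho(x)$, and then evaluates the atomic populations of Eq. (35) and the atomic terms of Eq. (41). Everything here is an assumption made for illustration: the Gaussian reference densities, the model molecular density, the one-dimensional grid, and all names. In the stockholder scheme the atomic weights overlap, so each atomic integral runs over the whole grid rather than over a disjoint basin.

```python
import numpy as np


def integrate(y, x):
    """Trapezoidal estimate of a one-dimensional integral."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def gaussian(x, center, width, population):
    """1D Gaussian density that integrates to `population`."""
    return population * np.exp(-0.5 * ((x - center) / width) ** 2) / (width * np.sqrt(2.0 * np.pi))


x = np.linspace(-12.0, 12.0, 48_001)

# Reference (free-atom) densities rho_A^0, one electron each (toy model).
reference = {
    "A": gaussian(x, -0.7, 1.0, 1.0),
    "B": gaussian(x, 0.7, 0.8, 1.0),
}
promolecule = sum(reference.values())

# Toy "molecular" density with charge shifted from A to B; N = 2 in total.
rho_mol = gaussian(x, -0.7, 1.0, 0.8) + gaussian(x, 0.7, 0.8, 1.2)

populations = {}
info_gain = {}
for atom, rho_ref in reference.items():
    weight = rho_ref / promolecule          # stockholder weight w_A(x)
    rho_atom = weight * rho_mol             # atomic density rho_A(x)
    populations[atom] = integrate(rho_atom, x)                               # N_A, Eq. (35)
    info_gain[atom] = integrate(rho_atom * np.log(rho_atom / rho_ref), x)    # term of Eq. (41)

for atom in reference:
    print(f"atom {atom}: N_A = {populations[atom]:.4f}, S_r contribution = {info_gain[atom]:+.5f}")
print(f"sum of N_A = {sum(populations.values()):.6f}")
```

Because the weights sum to one at every grid point, the atomic populations add up to the total number of electrons, and the atomic information-gain terms add up to the information gain of the molecular density relative to the promolecule.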


