**Introduction to Deep Tensor Neural Networks (DTNNs): A Beginner’s Tutorial**

This tutorial will guide you through the key ideas and practical steps behind the _Quantum-chemical insights from deep tensor neural networks_ (DTNN) framework, as presented by Schütt _et al._ (Nature Communications, 2017). By the end, you will understand how DTNNs model molecules, predict energies, and reveal local chemical potentials.

---

## 1. Motivation and Overview

- **Why machine learning in quantum chemistry?** Conventional electronic‑structure methods (e.g., Density Functional Theory) are accurate but computationally costly for large chemical spaces. Machine learning offers a data‑driven route to predict molecular properties quickly.

- **What are DTNNs?** DTNNs combine quantum‑mechanical many‑body principles with deep learning. They learn an _atom‑centered embedding_ that respects physical invariances (translation, rotation, permutation) and yields energy predictions with chemical accuracy (≈1 kcal/mol).

**Core strengths:**
- Size‑extensive: the energy is a sum of per‑atom contributions, so it grows consistently with the number of atoms
- Uniform accuracy across composition and configuration space
- Interpretable: yields per‑atom energy contributions and spatially resolved chemical potentials

---

## 2. Representing a Molecule for DTNN

1. **Nuclear Charges Vector**: A list \(Z = [Z_1, Z_2, \dots, Z_N]\) for a molecule of \(N\) atoms, encoding element types.
2. **Inter‑atomic Distance Matrix**: \(D\), an \(N\times N\) matrix where \(D_{ij} = \|\mathbf{r}_i - \mathbf{r}_j\|\).

> These inputs guarantee invariance under rotations/translations. DTNN further enforces permutation invariance by summing atomic energies.
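As a minimal sketch of how these two inputs could be built from Cartesian coordinates, the snippet below uses NumPy. The function name `molecule_inputs` and the water-like coordinates are illustrative assumptions, not part of the original DTNN code.

```python
import numpy as np

def molecule_inputs(charges, positions):
    """Return the nuclear-charge vector Z and the distance matrix D."""
    Z = np.asarray(charges, dtype=np.int64)         # shape (N,)
    R = np.asarray(positions, dtype=np.float64)     # shape (N, 3), in Å
    diff = R[:, None, :] - R[None, :, :]            # shape (N, N, 3)
    D = np.linalg.norm(diff, axis=-1)               # shape (N, N)
    return Z, D

# Water-like toy geometry (illustrative coordinates, in Å)
R = np.array([[0.000, 0.000, 0.000],
              [0.957, 0.000, 0.000],
              [-0.240, 0.927, 0.000]])
Z, D = molecule_inputs([8, 1, 1], R)
```

Because `D` depends only on differences between positions, shifting or rotating all atoms together leaves it unchanged.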

---

## 3. Preprocessing Distances: Gaussian Expansion

To capture different interaction regimes, distances are expanded into a fixed-size feature vector:

\[
\hat d_{ij,k} = \exp\Bigl(-\frac{(D_{ij} - \mu_k)^2}{2\sigma^2}\Bigr), \quad k=1,2,\dots,G
\]

- **\(\mu_k\)**: centers on a uniform grid (e.g., from 0 to 20 Å)
- **\(\sigma\)**: width (e.g., 0.2 Å)
- **Result**: each pair \((i,j)\) has a vector \(\hat d_{ij} \in \mathbb{R}^G\)
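The expansion can be sketched in a few lines of NumPy. The function name `gaussian_expand` and the default grid (101 centers from 0 to 20 Å, i.e. a 0.2 Å spacing) are assumptions chosen to match the example values above.

```python
import numpy as np

def gaussian_expand(D, mu_min=0.0, mu_max=20.0, n_centers=101, sigma=0.2):
    """Expand each distance D_ij into G Gaussian features."""
    mu = np.linspace(mu_min, mu_max, n_centers)     # centers mu_k, shape (G,)
    # Broadcasting: (N, N, 1) - (G,) -> (N, N, G)
    return np.exp(-((D[..., None] - mu) ** 2) / (2.0 * sigma ** 2))

D = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # two atoms 1 Å apart
d_hat = gaussian_expand(D)          # shape (2, 2, 101)
```

Each feature lies in \((0, 1]\) and peaks when the distance coincides with a grid center, so nearby centers respond smoothly to small geometry changes.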

---

## 4. Initial Atomic Embeddings

Each atom \(i\) is assigned an initial descriptor vector based on its nuclear charge:

\[c_i^{(0)} = \mathbf{c}_{Z_i} \in \mathbb{R}^B\]

- \(B\) is the embedding dimension (e.g., 30).
- \(\mathbf{c}_Z\) are _trainable_ vectors for each element type.
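In code, this lookup amounts to indexing a table of trainable vectors by nuclear charge. The sketch below assumes a hypothetical table size (`max_Z = 20`) and a random normal initialization scaled by \(1/\sqrt{B}\); both are illustrative choices rather than prescribed settings.

```python
import numpy as np

rng = np.random.default_rng(0)
B = 30        # embedding dimension
max_Z = 20    # hypothetical: largest nuclear charge the table supports

# One row per element type; in a real model these rows are trained.
embedding_table = rng.normal(scale=1.0 / np.sqrt(B), size=(max_Z + 1, B))

Z = np.array([8, 1, 1])        # O, H, H
C0 = embedding_table[Z]        # initial embeddings c_i^(0), shape (N, B)
```

Atoms of the same element start from identical vectors; they only become distinguishable once the interaction passes mix in information about their surroundings.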

---

## 5. Interaction Passes: Refining Embeddings

DTNN performs \(T\) iterative “interaction” layers. At pass \(t\), each atom \(i\) updates:

\[
c_i^{(t+1)} = c_i^{(t)} + \sum_{j \neq i} v_{ij}^{(t)}
\]

where the pairwise correction uses a _factored tensor layer_:

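\[
v_{ij}^{(t)} = \tanh\Bigl( W^{fc} \bigl[ (W_f^c c_j^{(t)} + b_f^c) \odot (W_f^d \hat d_{ij} + b_f^d) \bigr] \Bigr)
\]

- \(W_f^c: \mathbb{R}^B\to\mathbb{R}^F\) and \(W_f^d: \mathbb{R}^G\to\mathbb{R}^F\) map the neighbor embedding and the expanded distance into a shared factor space, with biases \(b_f^c, b_f^d \in \mathbb{R}^F\)
- \(W^{fc}: \mathbb{R}^F\to\mathbb{R}^B\) projects the product back to the embedding dimension, so that \(v_{ij}^{(t)}\) can be added to \(c_i^{(t)}\)
- \(F\) is the number of factors (e.g., 60)
- \(\odot\) denotes the element‑wise product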

**Interpretation:** each atom’s embedding is refined by learned, distance‑dependent interactions with its neighbors.
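The following is a minimal NumPy sketch of one interaction pass. It assumes an output projection (called `W_fc` here, a hypothetical name) that maps the \(F\)-dimensional factor product back to \(\mathbb{R}^B\) so the correction can be added to \(c_i\). The toy sizes and random weights are placeholders, not trained values.

```python
import numpy as np

def interaction_pass(C, D_hat, weights):
    """One pass: c_i <- c_i + sum over j != i of v_ij."""
    W_cf, b_c, W_df, b_d, W_fc = weights
    atom_f = C @ W_cf + b_c               # (N, F): neighbor embeddings in factor space
    dist_f = D_hat @ W_df + b_d           # (N, N, F): expanded distances in factor space
    prod = atom_f[None, :, :] * dist_f    # (N, N, F): entry [i, j] pairs atom j with d_ij
    V = np.tanh(prod @ W_fc)              # (N, N, B): pairwise corrections v_ij
    off_diag = 1.0 - np.eye(C.shape[0])   # zero out the j == i terms
    return C + (off_diag[:, :, None] * V).sum(axis=1)

# Toy sizes and random, untrained weights (illustrative only)
rng = np.random.default_rng(0)
N, B, G, F = 3, 4, 5, 6
weights = (
    rng.normal(size=(B, F)), np.zeros(F),
    rng.normal(size=(G, F)), np.zeros(F),
    rng.normal(size=(F, B)),
)
C = rng.normal(size=(N, B))
D_hat = rng.random(size=(N, N, G))
D_hat = 0.5 * (D_hat + D_hat.transpose(1, 0, 2))   # symmetric, like real distances
C_next = interaction_pass(C, D_hat, weights)
```

Calling `interaction_pass` repeatedly, \(T\) times in total, yields the refined embeddings \(c_i^{(T)}\). Because the update sums over neighbors, reordering the atoms simply reorders the rows of the result.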

---

## 6. Predicting Atomic and Molecular Energies

After \(T\) passes, transform each \(c_i^{(T)}\) via two feed‑forward layers to yield per‑atom energy contributions \(\hat E_i\). The total energy is:

\[E_{\rm mol} = \sum_{i=1}^N \hat E_i\]

Training minimizes the mean‑squared error against reference DFT energies.
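A hedged sketch of the readout and loss is shown below. The hidden width `H`, the `tanh` activation, and the function names are assumptions made for illustration, and the random weights stand in for trained parameters.

```python
import numpy as np

def atomic_energies(C, W1, b1, W2, b2):
    """Two feed-forward layers mapping each embedding to a scalar energy."""
    hidden = np.tanh(C @ W1 + b1)          # (N, H)
    return (hidden @ W2 + b2).ravel()      # (N,)

def molecular_energy(C, W1, b1, W2, b2):
    """Total energy as the sum of per-atom contributions."""
    return atomic_energies(C, W1, b1, W2, b2).sum()

def mse(predicted, reference):
    """Mean-squared error over a batch of molecular energies."""
    predicted = np.asarray(predicted, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    return np.mean((predicted - reference) ** 2)

# Toy example with random, untrained weights
rng = np.random.default_rng(0)
N, B, H = 3, 4, 5
readout = (rng.normal(size=(B, H)), np.zeros(H),
           rng.normal(size=(H, 1)), np.zeros(1))
C_final = rng.normal(size=(N, B))          # stands in for c_i^(T)
E_mol = molecular_energy(C_final, *readout)
```

Since the total is a plain sum, the prediction does not depend on the order in which atoms are listed.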

---

## 7. Visualizing Local Chemical Potentials

Once trained, DTNN can probe “what if” scenarios by placing a _test atom_ A (e.g., H, C) at various positions around a molecule:

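1. Initialize the probe’s embedding \(c_{\rm probe}^{(0)}\) from the element type of A, then run the \(T\) passes with information flowing **from** the molecule **to** the probe only, so the molecule’s own embeddings are unaffected by the probe.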
2. The resulting energy of the probe at position \(r\) defines a local chemical potential \(\Omega_M^A(r)\).

**Use cases:** mapping reactive sites, evaluating ring aromaticity, comparing stability of functional groups.
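The two probing steps can be sketched as follows. The sketch assumes the same interaction and readout weights are reused in every pass and that the molecule’s embeddings before each pass (`C_per_pass`) have been stored beforehand. All names, sizes, and the random weights are hypothetical placeholders, not a trained DTNN.

```python
import numpy as np

def probe_energy(r, R_mol, C_per_pass, c_probe0, weights, readout, mu, sigma=0.2):
    """Energy contribution of a probe atom placed at r near a fixed molecule."""
    W_cf, b_c, W_df, b_d, W_fc = weights
    W1, b1, W2, b2 = readout
    dist = np.linalg.norm(R_mol - r, axis=1)                            # (N,)
    d_hat = np.exp(-((dist[:, None] - mu) ** 2) / (2.0 * sigma ** 2))   # (N, G)
    dist_f = d_hat @ W_df + b_d                                         # (N, F)
    c = c_probe0.copy()
    for C_t in C_per_pass:               # molecule embeddings before each pass
        atom_f = C_t @ W_cf + b_c        # (N, F)
        # Only the probe is updated; the molecule never "sees" the probe.
        c = c + np.tanh((atom_f * dist_f) @ W_fc).sum(axis=0)           # (B,)
    hidden = np.tanh(c @ W1 + b1)        # (H,)
    return float((hidden @ W2 + b2)[0])

# Random stand-ins for a trained model (illustrative only)
rng = np.random.default_rng(0)
N, B, G, F, H, T = 3, 4, 50, 6, 5, 2
mu = np.linspace(0.0, 20.0, G)
weights = (rng.normal(size=(B, F)), np.zeros(F),
           rng.normal(size=(G, F)), np.zeros(F),
           rng.normal(size=(F, B)))
readout = (rng.normal(size=(B, H)), np.zeros(H),
           rng.normal(size=(H, 1)), np.zeros(1))
R_mol = np.array([[0.0, 0.0, 0.0], [0.957, 0.0, 0.0], [-0.240, 0.927, 0.0]])
C_per_pass = [rng.normal(size=(N, B)) for _ in range(T)]
c_probe0 = rng.normal(size=B)

# Scan the probe along a line 1.5 Å above the molecular plane
line = [np.array([x, 0.0, 1.5]) for x in np.linspace(-2.0, 2.0, 5)]
omega = [probe_energy(r, R_mol, C_per_pass, c_probe0, weights, readout, mu)
         for r in line]
```

Evaluating `probe_energy` on a 3D grid or on an isosurface around the molecule would give the kind of spatial map described above.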

---

## 8. Practical Tips and Extensions

- **Distance cutoff:** for large molecules, ignore pairs beyond a threshold (e.g., 3 Å) to reduce cost.
- **Hyperparameters:** common choices are \(B=30, F=60, G=50, T=2\); tune them per dataset.
- **Data requirements:** ~10⁴–10⁵ DFT calculations for good coverage of chemical space.
- **Beyond energies:** the model can be extended to forces (via gradients), electronic spectra, and alchemical interpolations.

---

## 9. Further Reading

- Original DTNN paper: Schütt _et al._, _Nat. Commun._ 8, 13890 (2017).
- Related architectures: SchNet (Schütt _et al._, _J. Chem. Phys._ 2018), PhysNet (Unke & Meuwly, 2019).
- Tutorials on graph neural networks for molecules: [DeepChem GNN tutorial](https://deepchem.io).

*Happy exploring quantum machine learning!*

