# 🧪 Lesson 02: Positional Encoding

**Series**: Chemical Graph Machine Learning  
**Prerequisites**: Lesson 01 (molecular graphs, node/edge features)  
**Next Lesson**: [03 - Graph Attention Networks](./03_GAT_Model.ipynb)  
**Estimated Time**: 60-75 minutes

## 📚 Learning Objectives

By the end of this lesson, you will:
1. ✅ Understand why standard GNNs struggle with positional information
2. ✅ Compute Laplacian matrices from molecular graphs
3. ✅ Extract spectral features using eigendecomposition
4. ✅ Implement Random Walk Positional Encoding (RWPE)
5. ✅ Compare different positional encoding strategies
6. ✅ Visualise how positional encodings capture molecular structure

**Why this matters**: Without positional encoding, a standard message-passing GNN can fail to distinguish molecules whose atoms have the same local neighbourhoods but a different global structure (e.g., some ring isomers such as decalin and bicyclopentyl).


## 🔄 Quick Recap: What We Know

From Lesson 01, you learned to:
- Convert SMILES → RDKit molecules → NetworkX graphs
- Extract node features: atomic number, charge, aromaticity, etc.
- Build edge features: bond types and connectivity

**Today's challenge**: Standard node features are *local* (they describe individual atoms). We need *global* positional information so the model knows where each atom sits in the overall molecular structure.


## 📖 Main Content Structure

### Part 1: The Positional Encoding Problem
- Why identical local neighbourhoods ≠ same position
- Graph isomorphism and the WL test
- Chemical example: ortho vs meta vs para substitution
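
To make the WL limitation concrete, here is a minimal sketch of 1-WL colour refinement in plain Python. The helper names (`neighbour_lists`, `wl_colour_histogram`) and the atom numbering are illustrative assumptions, not code from Lesson 01, and the graphs are typed in by hand rather than built from SMILES. Decalin and bicyclopentyl are used because they are a well-known pair of C10H18 isomers that colour refinement does not separate.

```python
from collections import Counter


def neighbour_lists(n_atoms, bonds):
    """Build neighbour lists for an undirected heavy-atom graph."""
    neighbours = {atom: [] for atom in range(n_atoms)}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return neighbours


def wl_colour_histogram(neighbours, rounds=4):
    """Run 1-WL colour refinement and count the atoms in each final colour class."""
    colours = {atom: 0 for atom in neighbours}  # all carbons: one starting colour
    for _ in range(rounds):
        # New colour = own colour plus the sorted multiset of neighbour colours.
        colours = {
            atom: (colours[atom], tuple(sorted(colours[n] for n in neighbours[atom])))
            for atom in neighbours
        }
    return Counter(colours.values())


# Decalin: two fused six-membered rings sharing the bond between atoms 0 and 5.
decalin = neighbour_lists(10, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
                               (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)])
# Bicyclopentyl: two five-membered rings joined by the single bond 0-5.
bicyclopentyl = neighbour_lists(10, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
                                     (5, 6), (6, 7), (7, 8), (8, 9), (9, 5),
                                     (0, 5)])

same_histogram = wl_colour_histogram(decalin) == wl_colour_histogram(bicyclopentyl)
print("1-WL colour histograms identical:", same_histogram)
```

Because a message-passing layer aggregates neighbour information in the same way as one refinement round, a GNN that sees only these atom-level colours has no basis for telling the two ring systems apart.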

### Part 2: Laplacian Eigenvectors
- Building the graph Laplacian matrix
- Eigendecomposition and spectral graph theory
- Connection to molecular vibrations (bonus chemistry insight!)

**Code**: Compute Laplacian eigenvectors for our molecule from Lesson 01
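
As a rough sketch of this step, the snippet below builds the combinatorial Laplacian L = D - A with NumPy and takes its eigendecomposition. The molecule is a stand-in (toluene heavy atoms with a hand-written bond list), since the Lesson 01 molecule is not restated here, and the choice of `k = 3` is arbitrary. Some pipelines use the symmetric normalised Laplacian instead; that variant is not shown.

```python
import numpy as np

# Stand-in molecule (assumption): toluene heavy atoms.
# Ring carbons are 0-5, the methyl carbon is 6.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
n_atoms = 7

# Adjacency matrix A of the undirected molecular graph.
A = np.zeros((n_atoms, n_atoms))
for a, b in bonds:
    A[a, b] = A[b, a] = 1.0

# Degree matrix D and combinatorial Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A

# eigh is meant for symmetric matrices: eigenvalues come back in ascending
# order and the eigenvectors are the orthonormal columns of `eigenvectors`.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# The first eigenvector of a connected graph is constant (eigenvalue 0),
# so the positional encoding skips it and keeps the next k columns.
k = 3
laplacian_pe = eigenvectors[:, 1:k + 1]

print(np.round(eigenvalues, 3))
print(laplacian_pe.shape)
```

Each row of `laplacian_pe` is the positional vector for one atom. The sign of every column is arbitrary, which matters when encodings from different molecules are compared.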

### Part 3: Random Walk Positional Encoding (RWPE)
- Simulating random walks on molecular graphs
- Building the landing probability matrix
- Why RWPE captures both local and global structure

**Code**: Implement RWPE and compare to Laplacian PE
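
One way to sketch RWPE, assuming the common formulation in which atom i is encoded by the probabilities that a random walk started at i is back at i after 1, 2, ..., K steps (the diagonal of successive powers of the transition matrix). The function names and the hand-written bond lists are illustrative assumptions. The two test molecules are decalin and bicyclopentyl, a pair that 1-WL colour refinement is known not to separate.

```python
import numpy as np


def adjacency_matrix(n_atoms, bonds):
    """Dense adjacency matrix for an undirected heavy-atom graph."""
    A = np.zeros((n_atoms, n_atoms))
    for a, b in bonds:
        A[a, b] = A[b, a] = 1.0
    return A


def rwpe(A, walk_length):
    """Return an (n_atoms, walk_length) array of return probabilities.

    Column k-1 holds, for each atom i, the probability that a k-step
    random walk starting at i ends at i. Assumes every atom has at
    least one bond, so no degree is zero.
    """
    degrees = A.sum(axis=1)
    P = A / degrees[:, None]          # row-stochastic transition matrix D^-1 A
    P_k = np.eye(len(A))
    columns = []
    for _ in range(walk_length):
        P_k = P_k @ P                 # P^k for k = 1 .. walk_length
        columns.append(np.diag(P_k))
    return np.stack(columns, axis=1)


decalin = adjacency_matrix(10, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
                                (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)])
bicyclopentyl = adjacency_matrix(10, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
                                      (5, 6), (6, 7), (7, 8), (8, 9), (9, 5),
                                      (0, 5)])

pe_decalin = rwpe(decalin, walk_length=5)
pe_bicyclopentyl = rwpe(bicyclopentyl, walk_length=5)

# Five-step return probabilities: only possible when an odd ring is present.
print(np.round(pe_decalin[:, 4], 4))
print(np.round(pe_bicyclopentyl[:, 4], 4))
```

Short walks mostly reflect the degrees of nearby atoms, while longer walks pick up ring sizes: a five-step return needs an odd cycle, which the five-membered rings provide and the fused six-membered rings do not. Unlike Laplacian eigenvectors, these probabilities have no sign ambiguity.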

### Part 4: Visualising Positional Encodings
- Plotting eigenvector components
- Understanding what different eigenvectors "see"
- Sanity checks: symmetric molecules should have symmetric encodings
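
A small sketch of such a sanity check, using the n-butane heavy-atom chain as an assumed example. For Laplacian eigenvectors, "symmetric" has to be read as "symmetric up to sign": each eigenvector with a non-repeated eigenvalue is either unchanged or negated by a graph symmetry, and the solver may return either overall sign. The check below therefore compares absolute values and reports the parity of each eigenvector separately.

```python
import numpy as np

# n-butane heavy-atom graph: C0-C1-C2-C3 (a path, so all eigenvalues are distinct).
A = np.zeros((4, 4))
for a, b in [(0, 1), (1, 2), (2, 3)]:
    A[a, b] = A[b, a] = 1.0
L = np.diag(A.sum(axis=1)) - A

eigenvalues, V = np.linalg.eigh(L)

# The mirror symmetry of the chain swaps atoms 0<->3 and 1<->2.
mirror = [3, 2, 1, 0]
V_mirrored = V[mirror, :]

# Symmetric up to sign: equivalent atoms have equal magnitudes in every column.
symmetric_up_to_sign = np.allclose(np.abs(V_mirrored), np.abs(V))

# Parity per eigenvector: +1 if the mirror leaves it unchanged, -1 if it
# negates it. This dot product does not depend on the solver's sign choice.
parity = np.sign((V_mirrored * V).sum(axis=0))

print("symmetric up to sign:", symmetric_up_to_sign)
print("parity per eigenvector:", parity)
```

When a molecule has repeated eigenvalues (benzene is the usual example), individual eigenvectors are no longer pinned down and this simple magnitude check can fail even though nothing is wrong, so it is best applied to graphs with distinct eigenvalues.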

**Code**: Create visualisations overlaying PE on molecular structure

### Part 5: Integration with Feature Matrices
- Concatenating positional encodings to node features
- Choosing the number of eigenvectors (k parameter)
- Normalisation and preprocessing considerations

**Code**: Build final feature matrix: [atomic features | positional encoding]
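
The concatenation step might look roughly like the sketch below. The function names, the three-column atomic feature layout, and the zero-padding rule for molecules with fewer than `k` non-trivial eigenvectors are all assumptions made for illustration; the graph is assumed to be connected, so only the first (constant) eigenvector is dropped.

```python
import numpy as np


def laplacian_pe(A, k):
    """First k non-trivial Laplacian eigenvectors, zero-padded to k columns.

    Assumes a connected graph, so exactly one eigenvector is trivial.
    """
    L = np.diag(A.sum(axis=1)) - A
    _, vectors = np.linalg.eigh(L)
    pe = vectors[:, 1:k + 1]
    missing = k - pe.shape[1]
    if missing > 0:
        # Small molecules have fewer than k usable eigenvectors: pad with zeros
        # so every molecule yields the same feature width.
        pe = np.pad(pe, ((0, 0), (0, missing)))
    return pe


def build_feature_matrix(atom_features, A, k):
    """Concatenate [atomic features | positional encoding] column-wise."""
    return np.hstack([atom_features, laplacian_pe(A, k)])


# Propane heavy atoms: C0-C1-C2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

# Hypothetical atomic features: [atomic number, formal charge, is_aromatic].
atom_features = np.array([[6.0, 0.0, 0.0],
                          [6.0, 0.0, 0.0],
                          [6.0, 0.0, 0.0]])

X = build_feature_matrix(atom_features, A, k=4)
print(X.shape)
```

Eigenvectors returned by `eigh` already have unit length, whereas raw atomic features such as atomic number live on a very different scale, which is one reason to think about normalisation before training.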


## 💡 Key Chemical Insights

### Why chemists should care about positional encoding:
- **Regioisomers**: ortho-xylene and meta-xylene contain the same atoms with the same degrees, yet have different properties because the methyl groups sit at different ring positions
- **Ring systems**: PE distinguishes bridgehead atoms from others
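- **Symmetry**: PE respects molecular symmetry (automorphisms): RWPE assigns identical encodings to equivalent atoms, while Laplacian eigenvectors do so only up to sign (and up to a choice of basis when eigenvalues repeat)
- **Pharmacophores**: The arrangement of functional groups matters for binding → PE captures their through-bond (topological) arrangement, though not 3D geometry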


## ✅ Knowledge Checkpoint

Before moving to Lesson 03, ensure you can:

- [ ] Explain why knowing only the atom types at each end of a bond (e.g. `[C, N]`) tells you nothing about where that bond sits in the molecule
- [ ] Compute the Laplacian matrix for a simple graph
- [ ] Interpret what the first few eigenvectors represent
- [ ] Implement RWPE with different walk lengths
- [ ] Decide how many positional dimensions to use

**Self-test**: Take two isomers (e.g., n-butane vs isobutane: `CCCC` vs `CC(C)C`):
1. Extract positional encodings for both
2. Verify that the central carbons receive different PE even though their element-level features (atomic number, charge, aromaticity) are identical
3. Visualise the difference

If the PE distinguishes them, you've succeeded!

## 🔮 Coming Up in Lesson 03: Graph Attention Networks

Now that we have rich node features (atomic properties + positional encoding), we can build our first proper GNN.

**What you'll learn**:
- Message passing: how information flows between neighbouring atoms
- Attention mechanisms: learning which bonds are most important
- Multi-head attention: capturing different types of chemical relationships simultaneously

**What you'll need from today**:
- The positional encoding functions we just built
- Understanding that nodes need both local and global features
- The feature matrix format (will feed directly into PyTorch Geometric)

**The payoff**: By Lesson 07, these attention weights will show you *which atoms and bonds the model focuses on* when predicting solubility: interpretable chemistry!


## 📖 Further Reading

**Spectral Graph Theory**:
- Chung, F. R. K. (1997). *Spectral Graph Theory*. AMS. [Classic textbook]
- Von Luxburg, U. (2007). "A tutorial on spectral clustering." *Statistics and Computing*.

**Positional Encoding in GNNs**:
- Dwivedi et al. (2020). "Benchmarking Graph Neural Networks." *arXiv:2003.00982*
- Satorras et al. (2021). "E(n) Equivariant Graph Neural Networks." *ICML 2021*

**Chemistry Connection**:
- Normal modes of vibration use the same eigenvector math!
- See any computational chemistry textbook on molecular vibrations


**Navigation**: [← Lesson 01](./01_Building_Graphs.ipynb) | [Lesson 03 →](./03_GAT_Model.ipynb) | [Series Home](../README.md)
