# 🔬 Contrastive Scaling Laws in Diffusivity Using Neural Networks



## ✨ Abstract

**Contrastive Deep Learning for Generative Simulation (Contrastive GS)** is developed to support fast reasoning when simulation datasets are incomplete or when kernels cannot directly answer complex engineering or scientific questions. This lecture demonstrates how a neural network can recover **physically meaningful scaling laws** for molecular **diffusivity in polymers** from contrastive ratios alone. Diffusivity, typically modeled through hole free-volume theory, is governed by temperature, molecular weight, and polymer state (e.g., rubbery vs. glassy).

We show that contrastive deep learning can bridge symbolic physical kernels and black-box inference methods by reconstructing embedded scaling laws from relative data. A neural network is compared with symbolic regression baselines and evaluated through visual diagnostics and dimensionality reduction.




## 1. 🧠 Theoretical Background

  - Diffusivity $D$ of a molecule in a polymer can be modeled as:
    $$
    \frac{D}{D_0} = \left(\frac{M}{M_0}\right)^{-\alpha(T,T_g)}
    $$

    $$
    \text{with} \quad \alpha(T,T_g) = 1 + \frac{K_a}{K_b + r(T - T_g)}
    $$

      - $M$: molecular weight of the substance

      - $M_0$, $D_0$: reference values

      - $T$: temperature in K, $T_g$: glass transition temperature

      - $r$: state parameter ($r=1$ in rubbery state)

      - $K_a \approx 140$ K, $K_b \approx 40$ K (rough estimates here)
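
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `alpha` and `relative_diffusivity` are hypothetical, and the values $M_0 = 100$, $T_g = 273$ K and $T_g = 173$ K are illustrative assumptions chosen only to show how the exponent changes with the distance to $T_g$.

```python
import numpy as np

# Rough estimates quoted above (both in K); treated as assumptions here.
KA, KB = 140.0, 40.0

def alpha(T, Tg, r=1.0, Ka=KA, Kb=KB):
    """Scaling exponent alpha(T, Tg) = 1 + Ka / (Kb + r * (T - Tg))."""
    return 1.0 + Ka / (Kb + r * (np.asarray(T, dtype=float) - Tg))

def relative_diffusivity(M, T, Tg, M0=100.0, r=1.0):
    """Relative diffusivity D / D0 = (M / M0) ** (-alpha(T, Tg))."""
    return (np.asarray(M, dtype=float) / M0) ** (-alpha(T, Tg, r=r))

T = 296.15  # 23 degC expressed in K
for Tg in (273.0, 173.0):  # illustrative glass transition temperatures
    a = alpha(T, Tg)
    ratio = relative_diffusivity(200.0, T, Tg)
    print(f"Tg = {Tg:.0f} K: alpha = {a:.3f}, D/D0 at M = 200: {ratio:.3g}")
```

The closer $T$ is to $T_g$, the larger $\alpha$ becomes, so the same increase in molecular weight costs more diffusivity in the polymer with the higher $T_g$.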

        

> 📚 **References:**
>
> These models are now foundational in a unified theory of mass transport in glassy and rubbery polymers based on free volume, cohesive energy, and thermodynamic fluctuations. They are described in detail in:
>
> 1.  Xiaoyi Fang, Sandra Domenek, Violette Ducruet, Matthieu Réfrégiers, and Olivier Vitrac.  
>     *Diffusion of Aromatic Solutes in Aliphatic Polymers above Glass Transition Temperature.*  
>     **Macromolecules**, 2013, **46** (3), 874–888.  
>     https://doi.org/10.1021/ma3022103
> 2.  Yan Zhu, Frank Welle, and Olivier Vitrac.  
>     *A Blob Model to Parameterize Polymer Hole Free Volumes and Solute Diffusion.*  
>     **Soft Matter**, 2019, **15**, 8912–8932.  
>     https://doi.org/10.1039/C9SM01556F




## 2. 🧪 Synthetic Data Generation

We simulate diffusivity for two polymers (PP and HDPE) with prescribed $T_g$ by computing $D$ across a grid of molecular weights $M$ and temperatures $T$.

For each combination of $M$ and $T$, we compute:

$D = D_0 \left(\frac{M}{M_0}\right)^{-\alpha(T,T_g)}$


In [18]:
# Ka, Kb, M0, D0, and the polymer Tg values are defined further down in this cell
# Produces a dataset df with columns: Polymer, M, T, Tg, alpha, D
import numpy as np
import pandas as pd

from IPython.display import display, HTML

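def headtail(self, n=10):
    """Return an HTML table with the first and last n rows of a DataFrame."""
    # to_html() is called directly instead of _repr_html_(), so the helper
    # cannot call itself if _repr_html_ is ever overridden to use it
    if self.shape[0] <= 2 * n:
        return self.to_html()
    top = self.head(n)
    bottom = self.tail(n)
    # Placeholder row marking the omitted middle of the table
    ellipsis = pd.DataFrame([["..."] * self.shape[1]],
                            columns=self.columns, index=["..."])
    return pd.concat([top, ellipsis, bottom]).to_html()

# Attach the helper as a DataFrame method
pd.DataFrame.headtail = headtail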

pd.set_option("display.max_rows", 100)        # Show more rows
pd.set_option("display.max_columns", 20)      # Show more columns
pd.set_option("display.width", 1000)          # Set display width
pd.set_option("display.float_format", "{:.3g}".format)  # Format floats


Ka, Kb = 140, 40  # rough estimates of K_a and K_b, both in K
M0, D0 = 100, 1e-9  # reference values

polymers = {
    'PP': {'Tg': 273, 'r': 1.0},
    'HDPE': {'Tg': 173, 'r': 1.0}
}

M_values = np.linspace(40, 500, 7)            # 7 molecular weights from 40 to 500
T_values = np.linspace(23, 100, 7) + 273.15   # 7 temperatures from 23 to 100 degC, converted to K

data = []
for polymer, props in polymers.items():
    Tg, r = props['Tg'], props['r']
    for M in M_values:
        for T in T_values:
            alpha = 1 + Ka / (Kb + r * (T - Tg))
            D = D0 * (M / M0) ** (-alpha)
            data.append({
                'Polymer': polymer,
                'M': M,
                'T': T,
                'Tg': Tg,
                'alpha': alpha,
                'D': D
            })

df = pd.DataFrame(data)

display(HTML(df.headtail(10)))
#print(df.to_html())


## 3. 🔁 Contrastive Dataset Construction

Instead of absolute diffusivities, we focus on **pairwise contrastive ratios**:

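$y = \log\left(\frac{D_i}{D_j}\right) = -\alpha_i \log\left(\frac{M_i}{M_0}\right) + \alpha_j \log\left(\frac{M_j}{M_0}\right)$

where $\alpha_i = \alpha(T_i, T_{g,i})$. When both states share the same exponent (same $T$, $T_g$, and $r$, so that $\alpha_i = \alpha_j = \alpha$), this reduces to $y = -\alpha \cdot \log\left(\frac{M_i}{M_j}\right)$.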

**Contrastive features:**

- $x_1 = \log(M_i/M_j)$
- $x_2 = 1/T_i - 1/T_j$
- $x_3 = T_{g,i} - T_{g,j}$
- $y = \log(D_i/D_j)$
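
A short numerical check can make the role of these features concrete. The sketch below assumes the model and constants of Section 1 ($K_a = 140$, $K_b = 40$, $r = 1$) together with illustrative values $M_0 = 100$, $D_0 = 10^{-9}$ and $T_g = 273$ K; the helper names `log_D` and `contrast` are hypothetical. It verifies that $y = -\alpha \, x_1$ holds for a pair at the same temperature, and that a pair at two different temperatures needs both exponents.

```python
import numpy as np

# Constants assumed for illustration only.
KA, KB, M0, D0 = 140.0, 40.0, 100.0, 1e-9

def alpha(T, Tg, r=1.0):
    """Scaling exponent alpha(T, Tg)."""
    return 1.0 + KA / (KB + r * (T - Tg))

def log_D(M, T, Tg):
    """Natural log of D = D0 * (M / M0) ** (-alpha)."""
    return np.log(D0) - alpha(T, Tg) * np.log(M / M0)

def contrast(Mi, Ti, Mj, Tj, Tg):
    """Contrastive features (x1, x2) and target y for one pair (i, j)."""
    x1 = np.log(Mi / Mj)
    x2 = 1.0 / Ti - 1.0 / Tj
    y = log_D(Mi, Ti, Tg) - log_D(Mj, Tj, Tg)
    return x1, x2, y

Tg = 273.0

# Pair at the same temperature: y + alpha * x1 should vanish.
x1, x2, y = contrast(40.0, 296.15, 500.0, 296.15, Tg)
same_T_gap = y + alpha(296.15, Tg) * x1

# Pair at different temperatures: the single-exponent shortcut no longer holds.
x1d, x2d, yd = contrast(40.0, 296.15, 500.0, 373.15, Tg)
general = (-alpha(296.15, Tg) * np.log(40.0 / M0)
           + alpha(373.15, Tg) * np.log(500.0 / M0))
shortcut_gap = yd + alpha(296.15, Tg) * x1d

print(f"same-T gap:      {same_T_gap:.2e}")
print(f"different-T gap: {shortcut_gap:.3f}")
print(f"general formula matches y: {np.isclose(yd, general)}")
```

Because $\alpha$ depends on $T$, the temperature feature $x_2$ is what lets a model account for pairs whose two states have different exponents.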

In [12]:
# Generates contrastive_df with inputs and log-ratio targets
# Pairs are formed within each polymer, so Tg_diff is always 0 in this dataset
from itertools import combinations

df['logD'] = np.log(df['D'])
contrastive_data = []

for polymer in df['Polymer'].unique():
    subset = df[df['Polymer'] == polymer].reset_index(drop=True)
    for i, j in combinations(range(len(subset)), 2):
        row_i, row_j = subset.loc[i], subset.loc[j]
        contrastive_data.append({
            'Polymer': polymer,
            'logM_ratio': np.log(row_i['M'] / row_j['M']),
            'invT_diff': (1 / row_i['T']) - (1 / row_j['T']),
            'Tg_diff': row_i['Tg'] - row_j['Tg'],
            'logD_ratio': row_i['logD'] - row_j['logD']
        })

contrastive_df = pd.DataFrame(contrastive_data)
print(f"Generated {len(contrastive_df)} contrastive pairs")
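pd.set_option("display.max_rows", 20)
# DataFrame.to_html() renders every row regardless of display.max_rows,
# so preview only the first 10 pairs
HTML(contrastive_df.head(10).to_html())
#print(contrastive_df.to_html())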

Generated 2352 contrastive pairs


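|   | Polymer | logM_ratio | invT_diff | Tg_diff | logD_ratio |
|---|---------|------------|-----------|---------|------------|
| 0 | PP | 0.0 | 0.00014 | 0 | 0.343 |
| 1 | PP | 0.0 | 0.000269 | 0 | 0.587 |
| 2 | PP | 0.0 | 0.000388 | 0 | 0.769 |
| 3 | PP | 0.0 | 0.000499 | 0 | 0.911 |
| 4 | PP | 0.0 | 0.000601 | 0 | 1.02 |
| 5 | PP | 0.0 | 0.000697 | 0 | 1.12 |
| 6 | PP | -1.07 | 0.0 | 0 | 3.44 |
| 7 | PP | -1.07 | 0.00014 | 0 | 3.39 |
| 8 | PP | -1.07 | 0.000269 | 0 | 3.34 |
| 9 | PP | -1.07 | 0.000388 | 0 | 3.31 |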
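
As a sanity check on this kind of contrastive data, the exponent $\alpha$ can be read back from pairs taken at a single temperature, where $y = -\alpha \, x_1$ holds exactly. The sketch below is not the neural network approach of this lecture: it swaps in an ordinary least-squares fit through the origin, rebuilds a small PP-like dataset on its own (same grid and constants as in Section 2, which are assumptions of this example), and uses the hypothetical names `alpha_true` and `recovered`.

```python
from itertools import combinations
import numpy as np

# Constants and grid assumed to match the synthetic data of Section 2.
KA, KB, M0, D0 = 140.0, 40.0, 100.0, 1e-9
Tg = 273.0  # PP-like glass transition temperature (K)
M_values = np.linspace(40, 500, 7)
T_values = np.linspace(23, 100, 7) + 273.15

def alpha_true(T):
    """Exponent used to generate the data (r = 1)."""
    return 1.0 + KA / (KB + (T - Tg))

recovered = []  # list of (T, estimated alpha)
pairs = list(combinations(range(len(M_values)), 2))
for T in T_values:
    logD = np.log(D0) - alpha_true(T) * np.log(M_values / M0)
    # Contrastive feature and target for all pairs at this temperature
    x = np.array([np.log(M_values[i] / M_values[j]) for i, j in pairs])
    y = np.array([logD[i] - logD[j] for i, j in pairs])
    # Least-squares slope of y = slope * x (no intercept); alpha = -slope
    slope = np.sum(x * y) / np.sum(x * x)
    recovered.append((T, -slope))

for T, a in recovered:
    print(f"T = {T:.2f} K: alpha recovered = {a:.4f}, true = {alpha_true(T):.4f}")
```

With noise-free synthetic data the fit is exact up to rounding, which mainly confirms that the ratios carry the scaling exponent even though $D_0$ and $M_0$ never enter the contrastive features.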


In [None]:
HTML(df.to_html())
#print(df.to_html())