# Quantum Sieving Potential of MOFs: A Simple Machine Learning Approach

In this notebook, we'll use a simple machine learning model to assess the potential of Metal-Organic Frameworks (MOFs) for quantum sieving. We'll focus on their ability to let hydrogen isotopes diffuse at different rates through quantum effects tied to each isotope's de Broglie wavelength.

## 1. Import Libraries

First, let's import the necessary libraries:

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split




## 2. Load and Prepare Data

We'll use a MOF dataset from MOFX-DB (Bobbitt et al., 2023), hosted by Northwestern University:

In [2]:
# Read the MOF dataset
data = pd.read_csv('mofdb.csv')

# Heuristic target motivated by the literature: favor high volumetric H2 uptake and small pore volume
data['Target'] = 0.5 * data['uptake_vol [g H2/L]'] + 0.5 * (1 / data['pore_volume [cm³/g]'])
data['Target'] *= 2  # Rescale so that Target = uptake_vol + 1 / pore_volume
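
As a quick check of the arithmetic, the two-step target definition reduces to the volumetric uptake plus the inverse pore volume. The sketch below uses two made-up rows (only the column names are taken from the notebook) to confirm that both forms agree.

```python
import pandas as pd

# Toy rows with made-up values; only the column names come from the notebook.
toy = pd.DataFrame({
    "uptake_vol [g H2/L]": [40.0, 25.0],
    "pore_volume [cm³/g]": [0.5, 2.0],
})

# Two-step definition, as written in the notebook.
two_step = 0.5 * toy["uptake_vol [g H2/L]"] + 0.5 * (1 / toy["pore_volume [cm³/g]"])
two_step = two_step * 2

# Equivalent one-step form.
one_step = toy["uptake_vol [g H2/L]"] + 1 / toy["pore_volume [cm³/g]"]

print(two_step.tolist())
print(one_step.tolist())
```

With these toy values the uptake term is much larger than the inverse pore volume term, so it dominates the score.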


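Note: When the pore size becomes comparable to the de Broglie wavelength of the particles, smaller pores lead to faster and more selective diffusion of the heavier hydrogen isotopes. This counterintuitive behavior is a quantum confinement effect: the heavier isotope has a shorter de Broglie wavelength and a lower zero-point energy, so it faces a lower effective barrier in a narrow pore (Oh and Hirscher, 2016).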
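
To give a sense of scale, the thermal de Broglie wavelength $\lambda = h / \sqrt{2 \pi m k_B T}$ can be evaluated for H2 and D2. This is a minimal sketch, not part of the original notebook: the temperature of 77 K is an assumed illustrative value, and the function name `thermal_de_broglie_angstrom` is hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant, J*s
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def thermal_de_broglie_angstrom(mass_amu, temperature_k):
    """Thermal de Broglie wavelength h / sqrt(2*pi*m*k_B*T), in angstroms."""
    mass_kg = mass_amu * AMU
    wavelength_m = H / math.sqrt(2 * math.pi * mass_kg * K_B * temperature_k)
    return wavelength_m * 1e10


TEMPERATURE_K = 77.0  # assumed illustrative temperature, not from the notebook

lambda_h2 = thermal_de_broglie_angstrom(2.016, TEMPERATURE_K)  # H2 molecular mass
lambda_d2 = thermal_de_broglie_angstrom(4.028, TEMPERATURE_K)  # D2 molecular mass

print(f"H2: {lambda_h2:.2f} Å, D2: {lambda_d2:.2f} Å, ratio: {lambda_h2 / lambda_d2:.3f}")
```

Because the wavelength scales as $1/\sqrt{m}$, the H2 value is larger than the D2 value by roughly $\sqrt{2}$ at any temperature. Both are on the order of an angstrom at cryogenic temperatures, which is the same scale as the `pld [Å]` and `lcd [Å]` columns.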

## 3. Feature Engineering

Select relevant features and normalize them:

In [3]:
column_names = [
    "asa_grav [m²/g]", "asa_vol [m²/cm³]", "av_vf", "pore_volume [cm³/g]",
    "density [g/cm³]", "nexc [wt. %]", "uptake_grav [wt. %]", "uptake_vol [g H2/L]",
    "lcd [Å]", "pld [Å]", "LFPD [Å]", "cell_length_a", "cell_length_b", "cell_length_c",
    "cell_angle_alpha", "cell_angle_beta", "cell_angle_gamma", "cell_volume [Å³]", "Target"
]

# Work on an explicit copy so the assignments below do not trigger SettingWithCopyWarning
svm_data = data[column_names].copy()

# Min-max normalize every feature to [0, 1]; the target is left unscaled
for column in column_names:
    if column != "Target":
        col_min, col_max = svm_data[column].min(), svm_data[column].max()
        svm_data[column] = (svm_data[column] - col_min) / (col_max - col_min)
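
The same min-max scaling can also be written without a loop. The sketch below uses a small made-up DataFrame (the values are invented; only the column names mirror the notebook) and includes a constant column to show one caveat: a column whose minimum equals its maximum has zero range and becomes NaN after scaling.

```python
import pandas as pd

# Made-up values for illustration; "cell_angle_alpha" is deliberately constant.
toy = pd.DataFrame({
    "pld [Å]": [3.0, 5.0, 9.0],
    "density [g/cm³]": [0.4, 0.9, 1.4],
    "cell_angle_alpha": [90.0, 90.0, 90.0],
    "Target": [10.0, 20.0, 30.0],
})

feature_cols = [c for c in toy.columns if c != "Target"]

scaled = toy.copy()  # explicit copy, so the original frame is untouched
col_min = scaled[feature_cols].min()
col_range = scaled[feature_cols].max() - col_min

# Column-wise (x - min) / (max - min); pandas aligns the Series on column names.
scaled[feature_cols] = (scaled[feature_cols] - col_min) / col_range

print(scaled)
```

If such NaN columns appear in a real dataset, a later `dropna()` on rows would discard every row, so constant columns are usually dropped before scaling.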


## 4. Prepare Features and Target

Split the data into features and target:

In [4]:
features = [col for col in column_names if col != "Target"]

svm_data = svm_data.dropna()
X = svm_data[features]
y = svm_data['Target']

## 5. Train-Test Split

Split the data into training and testing sets:

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
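
A common variation, which is not what this notebook does, is to split first and then fit the scaler on the training rows only, so that the test set does not influence the normalization constants. The sketch below illustrates this with synthetic data and scikit-learn's `MinMaxScaler`; the `*_demo` names and the data are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the feature matrix and target (not the MOF data).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 10.0, size=(50, 3))
y_demo = X_demo.sum(axis=1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

scaler = MinMaxScaler()
X_tr_scaled = scaler.fit_transform(X_tr)  # min and max learned from training rows only
X_te_scaled = scaler.transform(X_te)      # same constants reused on the test rows

print(X_tr.shape, X_te.shape)
```

With this ordering, scaled test values can fall slightly outside [0, 1] when a test row lies beyond the training range, which is expected.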

## 6. Linear Regression Model

We'll use a simple linear regression model to predict the potential of MOFs for separating hydrogen isotopes: 

In [6]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")

Mean Squared Error: 0.0020


## 7. Results and Interpretation

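The low Mean Squared Error of 0.0020 indicates that the model closely reproduces the target score on the held-out test set. Note that the target is computed from two of the input columns (`uptake_vol [g H2/L]` and `pore_volume [cm³/g]`), so the error measures how well a linear fit recovers this heuristic score, not how well it predicts measured isotope separation.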
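
An MSE value is easier to judge next to a reference point. One possible way to provide that, sketched here on synthetic data and not on the MOF dataset, is to compare the linear model against a baseline that always predicts the training mean and to report R² alongside the MSE. All variable names and the data-generating formula below are assumptions for illustration.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem: a linear signal plus a little noise.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 4))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(0.0, 0.05, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

baseline = DummyRegressor(strategy="mean").fit(X_tr, y_tr)  # always predicts mean of y_tr
linear = LinearRegression().fit(X_tr, y_tr)

baseline_mse = mean_squared_error(y_te, baseline.predict(X_te))
linear_mse = mean_squared_error(y_te, linear.predict(X_te))
linear_r2 = r2_score(y_te, linear.predict(X_te))

print(f"Baseline MSE: {baseline_mse:.4f}")
print(f"Linear MSE:   {linear_mse:.4f}")
print(f"Linear R^2:   {linear_r2:.4f}")
```

A model is only informative if its error is clearly below the baseline's. R² expresses the same comparison on a scale that does not depend on the units of the target.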

## Conclusion
This simple linear regression model provides a quick and efficient way to screen MOFs for their potential in separating hydrogen isotopes. While not a substitute for more advanced methods like Monte Carlo simulations or the Variational Quantum Eigensolver (VQE, for smaller molecules), it serves as an initial filter to identify promising candidates for further investigation.
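
Using the model as an initial filter could look like the sketch below, which ranks structures by predicted score and keeps the top few for follow-up. The identifiers, scores, and the `predicted_target` column are hypothetical placeholders, not results from this notebook.

```python
import pandas as pd

# Hypothetical MOF identifiers and predicted scores, for illustration only.
predictions = pd.DataFrame({
    "mof_id": ["mof_a", "mof_b", "mof_c", "mof_d"],
    "predicted_target": [12.5, 48.1, 30.2, 7.9],
})

TOP_N = 2  # how many candidates to pass on to more expensive methods

# Keep the TOP_N rows with the highest predicted score, best first.
shortlist = predictions.nlargest(TOP_N, "predicted_target").reset_index(drop=True)

print(shortlist)
```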

It's important to note that this approach serves as a proof-of-concept and should not be considered a rigorous scientific method.

## References

N. S. Bobbitt et al., "MOFX-DB: An online database of computational adsorption data for nanoporous materials," Journal of Chemical & Engineering Data, vol. 68, no. 2, pp. 483–498, Jan. 2023, doi: 10.1021/acs.jced.2c00583.

H. Oh and M. Hirscher, "Quantum sieving for separation of hydrogen isotopes using MOFs," European Journal of Inorganic Chemistry, vol. 2016, no. 27, pp. 4278–4289, Jun. 2016, doi: 10.1002/ejic.201600253.