# First step: install glasspy

GlassPy is not a dependency of gpvisc, so you need to install it separately to run the code in this notebook.

Run the cell below to do so.

Beware that this will probably change your scikit-learn version, because glasspy requires a specific one. The best option is either to let pip change it and restore your version afterwards, or to create a dedicated Python environment.
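If you choose to restore scikit-learn afterwards, it helps to note the installed version before running the install cell. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of glasspy or gpvisc.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Record the current scikit-learn version before installing glasspy.
sklearn_before = installed_version("scikit-learn")
print("scikit-learn before installing glasspy:", sklearn_before)
```

Running the same call again after the installation shows whether the version changed. If it did, `pip install scikit-learn==<previous version>` restores it.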

In [1]:
pip install glasspy

Collecting glasspy
  Downloading glasspy-0.4.6-py3-none-any.whl.metadata (3.1 kB)
Collecting lmfit>=1.0.0 (from glasspy)
  Downloading lmfit-1.3.1-py3-none-any.whl.metadata (13 kB)
Collecting chemparse>=0.1.0 (from glasspy)
  Downloading chemparse-0.3.1-py3-none-any.whl.metadata (1.7 kB)
Collecting scikit-learn==1.2.0 (from glasspy)
  Downloading scikit_learn-1.2.0-cp311-cp311-macosx_10_9_x86_64.whl.metadata (11 kB)
Collecting compress-pickle>=2.1.0 (from glasspy)
  Downloading compress_pickle-2.1.0-py3-none-any.whl.metadata (3.1 kB)
Collecting lightning>=2.0.0 (from glasspy)
  Downloading lightning-2.2.5-py3-none-any.whl.metadata (53 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m53.4/53.4 kB[0m [31m1.7 MB/s[0m eta [36m0:00:00[0m
Collecting lightning-utilities<2.0,>=0.8.0 (from lightning>=2.0.0->glasspy)
  Downloading lightning_utilities-0.11.2-py3-none-any.whl.metadata (4.7 kB)
Collecting torchmetrics<3.0,>=0.7.0 (from lightning>=2.0.0->glasspy)
  Downloadin

# Imports

We now import the necessary libraries, including the GlassNet model from glasspy and gpvisc for the GP model.

In [9]:
from glasspy.predict import GlassNet
import pandas as pd
import gpvisc
import numpy as np
import torch
import gpytorch

## Load models

In [10]:
gp_model, likelihood = gpvisc.load_gp_model()
glassnet_model = GlassNet()

## Load LP handheld database

In [11]:
ds = gpvisc.data_loader()

GlassNet uses upper-case letters in its column names (e.g. `SiO2`), while we prefer lower-case oxide names in our library to avoid typos.

For convenience, we therefore map the lower-case names to their GlassNet equivalents below.

We then get the data from the low pressure database using the ds object returned by `gpvisc.data_loader()`.

In [16]:
columns_name = {"sio2": "SiO2",
                "tio2": "TiO2",
                "al2o3": "Al2O3",
                "fe2o3": "Fe2O3",
                "feo": "FeO",
                "mno": "MnO",
                "na2o": "Na2O",
                "k2o": "K2O",
                "mgo": "MgO",
                "cao": "CaO",
                "p2o5": "P2O5",
                "h2o": "H2O"}

# we get compositions from the low-pressure dataset, with GlassNet column names
compo = ds.dataset_lp.loc[:,gpvisc.list_oxides()].rename(columns=columns_name).copy()
T = ds.dataset_lp.loc[:,"T"].copy()
# set P at 0 for GP model
P = np.zeros(len(T))
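Because the mapping is typed by hand, a quick consistency check can catch a mistyped entry before it silently produces a wrong column name. The sketch below copies the dictionary from the cell above and checks that each key is the lower-cased form of its value. The variable `mismatches` is introduced here for illustration only.

```python
# Mapping copied from the notebook cell: gpvisc (lower case) -> GlassNet names.
columns_name = {"sio2": "SiO2", "tio2": "TiO2", "al2o3": "Al2O3",
                "fe2o3": "Fe2O3", "feo": "FeO", "mno": "MnO",
                "na2o": "Na2O", "k2o": "K2O", "mgo": "MgO",
                "cao": "CaO", "p2o5": "P2O5", "h2o": "H2O"}

# Each key should be exactly the lower-cased version of its value.
mismatches = {k: v for k, v in columns_name.items() if k != v.lower()}

# Each GlassNet name should appear only once.
has_duplicates = len(set(columns_name.values())) != len(columns_name)

print("mismatched entries:", mismatches)
print("duplicate target names:", has_duplicates)
```

An empty `mismatches` dictionary and `has_duplicates == False` indicate that the mapping is internally consistent. The check says nothing about whether GlassNet actually accepts every name.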

We now make predictions using GlassNet and the GP models:

In [17]:
# make predictions
y_glasspy = glassnet_model.predict_log10_viscosity(T=T, composition=compo)

In [30]:
X_for_GP = gpvisc.scale_for_gaussianprocess(T.values, P, compo.values/100)
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    y_gp = likelihood(gp_model(torch.FloatTensor(X_for_GP)))
    

You will first notice that predictions with the GP model are much faster than with GlassNet: about 15 seconds compared to 3 minutes 45 seconds on my MacBook Pro laptop equipped with an Intel i7 processor.
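To reproduce such timings on your own machine, a small wall-clock helper is enough. This is a minimal sketch, not the method used to obtain the numbers above; `timed` is a hypothetical helper and the `sum` call is only a toy stand-in for a model prediction.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy workload standing in for a model prediction call.
result, elapsed = timed(sum, range(1_000_000))
print(f"toy workload finished in {elapsed:.4f} s")
```

In this notebook, the same helper could wrap the GlassNet call, for example `timed(glassnet_model.predict_log10_viscosity, T=T, composition=compo)`.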

We can now calculate the RMSE of each model. GlassNet may return NaN values, so we use an RMSE function that is robust to them.

In [34]:
# report RMSE
def rmse_robust_to_nan(y, y2):
    """RMSE evaluation robust to NaN."""
    # NaN squared errors are set to 0, so they count as zero-error samples:
    # the result stays finite, at the cost of a slight downward bias on the RMSE.
    se = np.nan_to_num((y - y2) ** 2)
    return np.sqrt(np.mean(se))

rmse_glasspy = rmse_robust_to_nan(y_glasspy.ravel(),ds.dataset_lp.viscosity.values.ravel())
rmse_gpvisc = rmse_robust_to_nan(y_gp.mean.detach().numpy().ravel()*gpvisc.Y_scale(),ds.dataset_lp.viscosity.values.ravel())
print("GlassPy: {:.2f}".format(rmse_glasspy))
print("GP model: {:.2f}".format(rmse_gpvisc))


GlassPy: 0.95
GP model: 0.40
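The helper above replaces NaN squared errors with 0 before averaging, which treats those samples as perfect predictions. An alternative convention is to exclude them from the mean. The toy example below, written with the standard library only and using made-up numbers, illustrates how the two conventions differ; the function names are hypothetical.

```python
import math


def rmse_zero_filled(y_pred, y_true):
    """RMSE where NaN squared errors are replaced by 0 (as in the helper above)."""
    se = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
    se = [0.0 if math.isnan(e) else e for e in se]
    return math.sqrt(sum(se) / len(se))


def rmse_nan_excluded(y_pred, y_true):
    """RMSE over the pairs whose squared error is not NaN.

    Raises ZeroDivisionError if every pair is NaN.
    """
    se = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
    se = [e for e in se if not math.isnan(e)]
    return math.sqrt(sum(se) / len(se))


# Made-up values: the third prediction is missing.
y_pred = [1.0, 2.0, float("nan"), 4.0]
y_true = [1.5, 2.5, 3.0, 3.0]

zero_filled = rmse_zero_filled(y_pred, y_true)
nan_excluded = rmse_nan_excluded(y_pred, y_true)
print(f"zero-filled: {zero_filled:.3f}, NaN excluded: {nan_excluded:.3f}")
```

With NumPy arrays, the second convention corresponds to `np.sqrt(np.nanmean((y - y2) ** 2))`. When only a few predictions are NaN, the two values stay close; the gap grows with the fraction of missing predictions.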




# Running the GP model on the full SciGlass library for the range of compositions it covers

For convenience, we exported the phospho-alumino-silicate compositions of the SciGlass library to a CSV file. We first load it.

In [38]:
data_sciglass = pd.read_csv("./additional_data/FULL_SCIGLASS.csv")

In [39]:
# Build the scaled inputs for the GP (pressure is set to 0):
X_sciglass = gpvisc.scale_for_gaussianprocess(
    data_sciglass["T"].values.copy(),
    np.zeros((len(data_sciglass), 1)),
    data_sciglass.loc[:, gpvisc.list_oxides()].values / 100,
)

In [40]:
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    y_gp_2 = likelihood(gp_model(torch.FloatTensor(X_sciglass)))


In [44]:
# RMSE of the GP model on the SciGlass data
rmse_gpvisc_sciglass = rmse_robust_to_nan(y_gp_2.mean.detach().numpy().ravel()*gpvisc.Y_scale(), data_sciglass.viscosity.values.ravel())
print("GP model error on the full SciGlass dataset: {:.2f}".format(rmse_gpvisc_sciglass))


GP model error on the full SciGlass dataset: 0.49
