Welcome to this Google Colab notebook, where you can predict wetting angles for metal-ionocovalent ceramic pairs of interest. Reading [the Workflow of the WettingAnglePredictor](https://github.com/sokim1/Wettability_Metal-IonocovalentCeramic.git) before going through this notebook will help you understand how it works and how to use it.





If you are new to Google Colab, please read the following:
- Colab is a Python development environment that runs in the browser using Google Cloud.
- Each box is called a cell. To execute a cell, hover the mouse over the square brackets ([ ]) in the upper-left corner of the cell and press the play button that appears, or click the cell and press Shift + Enter.
- Files can be uploaded and downloaded through the session storage, which is accessed by clicking the folder icon on the left sidebar.
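Before pointing the notebook at an uploaded file, it can help to confirm that the file is visible to the session and to look at its header row. The sketch below uses only the Python standard library. It assumes that files uploaded through the sidebar land in the notebook's working directory (normally `/content` on Colab), and `csv_headers` is a hypothetical helper name, not part of the WettingAnglePredictor.

```python
import csv
from pathlib import Path

def csv_headers(directory="."):
    """Map each CSV filename in `directory` to its header row (first line)."""
    headers = {}
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="") as handle:
            # An empty file has no header row; report it as an empty list.
            headers[path.name] = next(csv.reader(handle), [])
    return headers
```

Calling `csv_headers()` in a cell returns a dictionary such as `{"filename.csv": ["ceramic"]}`, which makes it easy to copy the exact filename and column header when they are needed later.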

Now, let's run the WettingAnglePredictor.

First, save a copy of the Google Colab Notebook by using "File > Save a copy in Drive".

Second, execute the first cell. This cell installs all the required packages and dependencies and prepares the machine learning model that will be used to predict wetting angles. It may take a few minutes to complete.

In [1]:
!pip install matminer
!pip install pymatgen==2020.12.31

#!/usr/bin/env python
import json, sys, os
from urllib.request import urlopen
import pandas as pd
import numpy as np
import matminer
from matminer.utils.conversions import str_to_composition
from matminer.featurizers.composition import ElementProperty
magpie = ElementProperty.from_preset(preset_name="magpie")
import sklearn
from sklearn.ensemble import RandomForestRegressor
from IPython.display import clear_output

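# Load the training data; input-matrix.csv must be available in the session storage.
data = pd.read_csv("input-matrix.csv")
X = data[data.columns[3:]]  # feature columns: everything after the first three columns
y = data['theta']           # target column 'theta' (wetting angle)
preds = X.columns[0:]

# Train the random forest that is used for all predictions in the cells below.
rf_otm = RandomForestRegressor(n_estimators=400, random_state=51, oob_score=True, max_depth=15, min_samples_split=2)
rf_otm.fit(X[preds], y)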

clear_output()
print('Execution complete!')

Execution complete!


Third, choose which mode to use. The current version provides two ways of specifying the systems of interest: the interactive mode and the type-in mode. Both are described below; use whichever you prefer.

#Interactive mode
This mode asks for the information necessary to specify the systems of interest during operation.


Execute the cell and enter the information that the code asks for.

In [2]:
print('The first step is to specify the type of estimation you would like to perform.')
option = int(input('Enter type (1: one metal-ceramic pair; 2: metal-list of ceramics; 3: ICSD screening): '))

print('=====================================================================')
print('The second step is to specify the systems of interest.')
print('Note that this code is case sensitive. You need to enter "Fe" for iron; "fe" does not work.') 
print('Also, you need to enter "Si1O2" for silicon dioxide; "SiO2" does not work.')
metal = str(input('Enter the metal of interest (e.g., Fe): '))

if option == 1:
  ceramic = str(input('Enter the ceramic of interest (e.g., Si1O2): '))
  substrate = pd.DataFrame([ceramic])
elif option == 2:
  filename = str(input('Upload the csv file with the list of ceramics to the session storage and enter the filename (e.g., filename.csv): '))
  header = str(input('Enter the header of the list of ceramics (e.g., ceramic): '))
elif option == 3:
  MATCHBOOK = str(input('Enter the MATCHBOOK referring to https://doi.org/10.1016/j.commatsci.2017.04.036 (e.g., species((O:F),!S),nspecies(2*,*8),catalog(icsd)): '))
else:
  print('Error: check if necessary inputs are all entered.')

print('=====================================================================')
print('The third step is to specify the temperature range of interest.')
Tinterest = str(input('Are you interested in wetting angles at a range of temperatures? (yes or no) '))

if Tinterest == 'no':
  T0 = int(input('Enter temperature of interest in Kelvin (e.g., 1800): '))
  Trange = 0
else:
    T0 = int(input('Enter the lower limit of the temperature of interest in Kelvin (e.g., 1800): '))
    Trange = int(input('Enter the temperature range of interest (e.g., 500 if interested in from 1800 K to 2300 K): '))
    Tinterval = int(input('Enter the interval between temperatures (e.g., 100 if interested in 1800 K, 1900 K, ..., 2300 K): '))

print('=====================================================================')
print('The final step is to specify the wetting angle range of interest.')
LimitWettingAngle = str(input('Do you want to output only results with the wetting angles in a certain range? (yes or no) '))

if LimitWettingAngle == 'yes':
  WettingAngleRange = str(input('Specify the wetting angle range of interest (e.g., "50-", "-100", or "50-100"): '))

outputfilename = str(input('Enter output filename: '))

#================================================================================================
if option == 2:
  substrate = pd.DataFrame(pd.read_csv(filename)[header].values.tolist())

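if option == 3:
  # Query the AFLOW search API and keep the 'compound' column of the response.
  aflow_url = "http://aflowlib.duke.edu/search/API/?" + MATCHBOOK + ",$paging(0)"
  substrate = pd.DataFrame(json.loads(urlopen(aflow_url).read().decode("utf-8")))['compound']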

matrix = pd.DataFrame([metal] * len(substrate))

sys_cond_0 = pd.concat([matrix, substrate], axis=1)
sys_cond_0['Temp'] = pd.DataFrame([T0] * len(substrate))
sys_cond_0.columns = ['Metal', 'Substrate', 'Temp']

metal_matminer = pd.DataFrame([metal], columns=['Metal'])
metal_matminer['Me_comp'] = metal_matminer['Metal'].transform(str_to_composition)
data_Me = magpie.featurize_dataframe(metal_matminer, col_id="Me_comp", ignore_errors=True)
metal_features = pd.DataFrame(data_Me.values.tolist()*len(substrate), columns = data_Me.columns)
feature_Me = metal_features.filter(like = 'mean')
feature_Me.columns = ['Me_'+ j for j in feature_Me.columns]

sys_cond_0['Sub_comp'] = sys_cond_0['Substrate'].transform(str_to_composition)
data_Sub = magpie.featurize_dataframe(sys_cond_0, col_id="Sub_comp", ignore_errors=True)
feature_Sub = data_Sub.filter(like = 'mean')
feature_Sub.columns = ['Ce_'+ j for j in feature_Sub.columns]

feature_all = pd.concat([feature_Me, feature_Sub], axis = 1)
sys_feature_0 = pd.concat([sys_cond_0[sys_cond_0.columns[0:3]], feature_all], axis = 1)

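sys_feature = sys_feature_0

if Trange != 0:
  # Build one independent copy of the feature table per temperature step (T0, T0 + Tinterval, ...).
  frames = []
  for i in range(Trange // Tinterval + 1):
    sys_feature_temporary = sys_feature_0.copy()
    sys_feature_temporary['Temp'] = T0 + Tinterval * i
    frames.append(sys_feature_temporary)
  # ignore_index gives every row a unique label, so each temperature appears exactly once.
  sys_feature = pd.concat(frames, ignore_index=True)

y_sys_pred = rf_otm.predict(sys_feature[preds])
output = sys_feature[sys_feature.columns[0:3]].copy()
# Assign the raw array so that predictions are matched to rows by position, not by index label.
output['theta_pred'] = y_sys_pred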

if LimitWettingAngle == 'yes':
  WAsplit = WettingAngleRange.split('-')
  if WAsplit[0].isnumeric():
    if WAsplit[1].isnumeric():
      output_lim = output.loc[(output['theta_pred'] > float(WAsplit[0])) & (output['theta_pred'] < float(WAsplit[1]))]
    else: 
      output_lim = output.loc[(output['theta_pred'] > float(WAsplit[0]))]
  else:
    output_lim = output.loc[(output['theta_pred'] < float(WAsplit[1]))]
  output_lim.to_csv(outputfilename+'.csv')
else:
  output.to_csv(outputfilename+'.csv')

print('=====================================================================')
print('Prediction complete!')
print('The CSV file with the results is stored in the session storage.')

The first step is to specify the type of estimation you would like to perform.
Enter type (1: one metal-ceramic pair; 2: metal-list of ceramics; 3: ICSD screening): 2
The second step is to specify the systems of interest.
Note that this code is case sensitive. You need to enter "Fe" for iron; "fe" does not work.
Also, you need to enter "Si1O2" for silicon dioxide; "SiO2" does not work.
Enter the metal of interest (e.g., Fe): Li
Upload the csv file with the list of ceramics to the session storage and enter the filename (e.g., filename.csv): lists_Li-LEIcandidates.csv
Enter the header of the list of ceramics (e.g., ceramic): comp
The third step is to specify the temperature range of interest.
Are you interested in wetting angles at a range of temperatures? (yes or no) yes
Enter the lower limit of the temperature of interest in Kelvin (e.g., 1800): 500
Enter the temperature range of interest (e.g., 500 if interested in from 1800 K to 2300 K): 200
Enter the interval between temperatures (e

matminer.utils.conversions.str_to_composition is deprecated and will be removed in December 2018. Please use the matminer.featurizers.conversions.StrToComposition Featurizer instead
  mapped = lib.map_infer(values, f, convert=convert_dtype)
matminer.utils.conversions.str_to_composition is deprecated and will be removed in December 2018. Please use the matminer.featurizers.conversions.StrToComposition Featurizer instead
  result = func(self, *args, **kwargs)



Prediction complete!
The CSV file with the results is stored in the session storage.
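The prompts above require explicit element counts ("Si1O2" rather than "SiO2"). A minimal sketch of a helper that rewrites a flat formula into that form is shown below. The name `with_explicit_counts` is hypothetical, and the sketch assumes formulas without parentheses, charges, or hydrate notation. It does not check that the symbols are real elements, and it is not part of the WettingAnglePredictor itself.

```python
import re

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer or decimal count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")

def with_explicit_counts(formula):
    """Return `formula` with a count after every element symbol."""
    tokens = _TOKEN.findall(formula)
    # Reassemble what the pattern matched; if it differs from the input,
    # the formula contains something this simple pattern cannot handle.
    matched = "".join(symbol + count for symbol, count in tokens)
    if not tokens or matched != formula:
        raise ValueError(f"Unsupported formula: {formula!r}")
    return "".join(symbol + (count or "1") for symbol, count in tokens)
```

With this helper, `with_explicit_counts("SiO2")` gives `"Si1O2"`, while a lowercase entry such as `"fe"` raises a `ValueError` instead of silently passing through.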


#Type-in mode

In this mode, you enter all the information needed to specify the systems of interest before executing the cells.

There are two cells in this mode. Specify the systems of interest following the instructions written in the first cell. After entering all the necessary information, run the first cell and the second cell sequentially.

In [48]:
# The first step is to specify the type of estimation you would like to perform. Please choose one of the options.
option = 1 # 1: one metal-ceramic pair; 2: metal-list of ceramics; or 3: ICSD screening

#===================================================================================
# The second step is to specify the systems of interest.
# Note that this code is case sensitive. You need to enter "Fe" for iron; "fe" does not work.
# Also, you need to enter "Si1O2" for silicon dioxide; "SiO2" does not work.
# Enter the metal of interest (e.g., "Fe"): 
metal = "Fe"

if option == 1: # If you chose "1" for the option above, enter the ceramic of interest below (e.g., "Si1O2").
  substrate = pd.DataFrame(["Si1O2"])
elif option == 2: # If you chose "2" for the option above, upload the csv file with the list of ceramics to the session storage and enter the filename and the header.
  substrate = pd.DataFrame(pd.read_csv("filename.csv")["header for the ceramic list"].values.tolist())
elif option == 3: # If you chose "3" for the option above, enter the material keywords and arguments used to retrieve ceramics that satisfy certain criteria.
                  # Please refer to Figure 1 and Appendix C of the following article: https://doi.org/10.1016/j.commatsci.2017.04.036.
                  # For example, to retrieve compounds that contain O or F but no S and have between 2 and 8 species, use "species((O:F),!S),nspecies(2*,*8),catalog(icsd)".
  MATCHBOOK = "species((O:F),!S),nspecies(2*,*8),catalog(icsd)"
else:
  print('Error: check if necessary inputs are all entered.')

#===================================================================================
# The third step is to specify the temperature range of interest.
# For example, if you are interested in wetting angles at 1800 K, 1900 K, 2000 K, 2100 K, 2200 K, and 2300 K, enter 1800 for T0, 500 for Trange, and 100 for Tinterval.
T0 = 1800 # in Kelvin
Trange = 500 # Enter 0 if only the wetting angle at T0 is of interest.
Tinterval = 100

#===================================================================================
# The fourth step is to specify the wetting angle range of interest.
# For example, if you would like to see only the results for pairs whose wetting angles are between 50 and 100 degrees,
# enter "yes" for LimitWettingAngle and "50-100" for WettingAngleRange.
LimitWettingAngle = "yes" # "yes" or "no"
WettingAngleRange = "0-180" # e.g., "50-", "-100", or "50-100"

#===================================================================================
# Finally, enter the desired output filename (e.g., "result")
outputfilename = "result"
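To see which temperatures a given combination of `T0`, `Trange`, and `Tinterval` covers, the sketch below reproduces the stepping rule used by the prediction cell (`Trange // Tinterval + 1` steps starting at `T0`). `temperature_grid` is a hypothetical helper written for illustration only.

```python
def temperature_grid(t0, trange, tinterval):
    """List the temperatures (in K) evaluated for the given settings."""
    if trange == 0:
        return [t0]  # single-temperature case
    if tinterval <= 0:
        raise ValueError("Tinterval must be a positive number of kelvin")
    # Integer division: a range that is not a multiple of the interval
    # stops at the last full step below T0 + Trange.
    return [t0 + tinterval * i for i in range(trange // tinterval + 1)]

print(temperature_grid(1800, 500, 100))
# [1800, 1900, 2000, 2100, 2200, 2300]
```

Because of the integer division, `temperature_grid(500, 250, 100)` yields `[500, 600, 700]`; choosing a `Trange` that is a multiple of `Tinterval` avoids missing the intended upper limit.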

In [3]:
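if option == 3:
  # Query the AFLOW search API and keep the 'compound' column of the response.
  aflow_url = "http://aflowlib.duke.edu/search/API/?" + MATCHBOOK + ",$paging(0)"
  substrate = pd.DataFrame(json.loads(urlopen(aflow_url).read().decode("utf-8")))['compound']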

matrix = pd.DataFrame([metal] * len(substrate))

sys_cond_0 = pd.concat([matrix, substrate], axis=1)
sys_cond_0['Temp'] = pd.DataFrame([T0] * len(substrate))
sys_cond_0.columns = ['Metal', 'Substrate', 'Temp']

metal_matminer = pd.DataFrame([metal], columns=['Metal'])
metal_matminer['Me_comp'] = metal_matminer['Metal'].transform(str_to_composition)
data_Me = magpie.featurize_dataframe(metal_matminer, col_id="Me_comp", ignore_errors=True)
metal_features = pd.DataFrame(data_Me.values.tolist()*len(substrate), columns = data_Me.columns)
feature_Me = metal_features.filter(like = 'mean')
feature_Me.columns = ['Me_'+ j for j in feature_Me.columns]

sys_cond_0['Sub_comp'] = sys_cond_0['Substrate'].transform(str_to_composition)
data_Sub = magpie.featurize_dataframe(sys_cond_0, col_id="Sub_comp", ignore_errors=True)
feature_Sub = data_Sub.filter(like = 'mean')
feature_Sub.columns = ['Ce_'+ j for j in feature_Sub.columns]

feature_all = pd.concat([feature_Me, feature_Sub], axis = 1)
sys_feature_0 = pd.concat([sys_cond_0[sys_cond_0.columns[0:3]], feature_all], axis = 1)

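sys_feature = sys_feature_0

if Trange != 0:
  # Build one independent copy of the feature table per temperature step (T0, T0 + Tinterval, ...).
  frames = []
  for i in range(Trange // Tinterval + 1):
    sys_feature_temporary = sys_feature_0.copy()
    sys_feature_temporary['Temp'] = T0 + Tinterval * i
    frames.append(sys_feature_temporary)
  # ignore_index gives every row a unique label, so each temperature appears exactly once.
  sys_feature = pd.concat(frames, ignore_index=True)

y_sys_pred = rf_otm.predict(sys_feature[preds])
output = sys_feature[sys_feature.columns[0:3]].copy()
# Assign the raw array so that predictions are matched to rows by position, not by index label.
output['theta_pred'] = y_sys_pred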

if LimitWettingAngle == 'yes':
  WAsplit = WettingAngleRange.split('-')
  if WAsplit[0].isnumeric():
    if WAsplit[1].isnumeric():
      output_lim = output.loc[(output['theta_pred'] > float(WAsplit[0])) & (output['theta_pred'] < float(WAsplit[1]))]
    else: 
      output_lim = output.loc[(output['theta_pred'] > float(WAsplit[0]))]
  else:
    output_lim = output.loc[(output['theta_pred'] < float(WAsplit[1]))]
  output_lim.to_csv(outputfilename+'.csv')
else:
  output.to_csv(outputfilename+'.csv')

print('=====================================================================')
print('Prediction complete!')
print('The CSV file with the results is stored in the session storage.')

matminer.utils.conversions.str_to_composition is deprecated and will be removed in December 2018. Please use the matminer.featurizers.conversions.StrToComposition Featurizer instead
  mapped = lib.map_infer(values, f, convert=convert_dtype)
matminer.utils.conversions.str_to_composition is deprecated and will be removed in December 2018. Please use the matminer.featurizers.conversions.StrToComposition Featurizer instead
  result = func(self, *args, **kwargs)



Prediction complete!
The CSV file with the results is stored in the session storage.
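The `WettingAngleRange` string is read as an interval that may be open on either side. The sketch below shows one way to parse the three accepted forms ("50-", "-100", "50-100") into numeric bounds and to test a predicted angle against them. `parse_angle_range` and `in_range` are hypothetical names. Unlike the cell above, this version also accepts decimal bounds such as "50.5-100", which is an assumption about what may be convenient rather than documented behavior of the notebook.

```python
def parse_angle_range(text):
    """Parse "50-", "-100", or "50-100" into (lower, upper); a missing bound is None."""
    lower_text, sep, upper_text = text.strip().partition("-")
    if not sep:
        raise ValueError(f"Expected a '-' in the range, got {text!r}")
    lower = float(lower_text) if lower_text.strip() else None
    upper = float(upper_text) if upper_text.strip() else None
    if lower is None and upper is None:
        raise ValueError("At least one bound is required")
    return lower, upper

def in_range(theta, bounds):
    """Strict comparison on both sides, matching the '>' and '<' filters in the notebook."""
    lower, upper = bounds
    return (lower is None or theta > lower) and (upper is None or theta < upper)
```

For example, `parse_angle_range("50-")` returns `(50.0, None)`, and `in_range(50.0, (50.0, None))` is `False` because the comparison is strict.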
