# MODNet sklearn interface

The scikit-learn API for MODNet is implemented in *modnet.sklearn*. It lets MODNet be combined with scikit-learn tools such as pipelines and model selection utilities (e.g. grid search), as well as with other scikit-learn models.

The main classes are:

- **MODNetFeaturizer**: A transformer that converts a list of compositions or structures to a dataframe of shape (n_samples, n_features).

- **RR**: A transformer based on Relevance-Redundancy (RR) feature selection. Given an input array or dataframe, it keeps the *n_feat* features with the highest RR score.

- **MODNetRegressor**: A regressor that uses MODNetModel for fitting and supports multiple target properties.

- **MODNetClassifier**: A classifier based on MODNetModel.


The examples below show basic usage of this sklearn interface. They are deliberately simplified (validation is omitted, for instance) to keep the focus on the API.

## Featurization
A quick example of how to use *MODNetFeaturizer*:

In [1]:
from pymatgen.core import Composition
from modnet.sklearn import MODNetFeaturizer
X = [Composition("Si"), Composition("Cu"), Composition("Al"), Composition("Ti")]
featurizer = MODNetFeaturizer()
X = featurizer.fit_transform(X)
X

2022-05-17 17:42:02,696 - modnet - INFO - Loaded CompositionOnlyFeaturizer featurizer.
2022-05-17 17:42:02,698 - modnet - INFO - Computing features, this can take time...
2022-05-17 17:42:02,699 - modnet - INFO - Applying composition featurizers...
2022-05-17 17:42:02,702 - modnet - INFO - Applying featurizers (AtomicOrbitals(), AtomicPackingEfficiency(), BandCenter(), ElementFraction(), ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x7f79d6f3a9d0>,
                features=['Number', 'MendeleevNumber', 'AtomicWeight',
                          'MeltingT', 'Column', 'Row', 'CovalentRadius',
                          'Electronegativity', 'NsValence', 'NpValence',
                          'NdValence', 'NfValence', 'NValence', 'NsUnfilled',
                          'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled',
                          'GSvolume_pa', 'GSbandgap', 'GSmagmom',
                          'SpaceGroupNumber'],
                stats=['minimum',

MultipleFeaturizer: 100%|██████████| 4/4 [00:00<00:00, 22.72it/s]


2022-05-17 17:42:05,538 - modnet - INFO - Data has successfully been featurized!


id,AtomicOrbitals|HOMO_character,AtomicOrbitals|HOMO_element,AtomicOrbitals|HOMO_energy,AtomicOrbitals|LUMO_character,AtomicOrbitals|LUMO_element,AtomicOrbitals|LUMO_energy,AtomicOrbitals|gap_AO,AtomicPackingEfficiency|mean simul. packing efficiency,AtomicPackingEfficiency|mean abs simul. packing efficiency,AtomicPackingEfficiency|dist from 1 clusters |APE| < 0.010,...,ValenceOrbital|avg s valence electrons,ValenceOrbital|avg p valence electrons,ValenceOrbital|avg d valence electrons,ValenceOrbital|avg f valence electrons,ValenceOrbital|frac s valence electrons,ValenceOrbital|frac p valence electrons,ValenceOrbital|frac d valence electrons,ValenceOrbital|frac f valence electrons,YangSolidSolution|Yang omega,YangSolidSolution|Yang delta
id0,2,14,-0.153293,2,14,-0.153293,0.0,0.023994,0.023994,1.0,...,2.0,2.0,0.0,0.0,0.5,0.5,0.0,0.0,0,0.0
id1,1,29,-0.172056,1,29,-0.172056,0.0,0.023994,0.023994,1.0,...,1.0,0.0,10.0,0.0,0.090909,0.0,0.909091,0.0,0,0.0
id2,2,13,-0.102545,2,13,-0.102545,0.0,0.023994,0.023994,1.0,...,2.0,1.0,0.0,0.0,0.666667,0.333333,0.0,0.0,0,0.0
id3,3,22,-0.17001,3,22,-0.17001,0.0,0.023994,0.023994,1.0,...,2.0,0.0,2.0,0.0,0.5,0.0,0.5,0.0,0,0.0
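
As a reading aid for the output above, the sketch below recomputes the `ValenceOrbital|frac ...` columns from the `ValenceOrbital|avg ...` columns. It assumes that each fraction is simply the orbital's average electron count divided by the total over s, p, d and f, which is consistent with the printed values but is an inference from the output and not taken from the featurizer's source. The mapping of rows `id0` to `id3` onto Si, Cu, Al and Ti follows the input order.

```python
# Average valence electron counts per orbital, copied from the output above
# (rows id0..id3 correspond to Si, Cu, Al, Ti in input order).
avg_valence = {
    "Si": {"s": 2.0, "p": 2.0, "d": 0.0, "f": 0.0},
    "Cu": {"s": 1.0, "p": 0.0, "d": 10.0, "f": 0.0},
    "Al": {"s": 2.0, "p": 1.0, "d": 0.0, "f": 0.0},
    "Ti": {"s": 2.0, "p": 0.0, "d": 2.0, "f": 0.0},
}


def valence_fractions(counts):
    """Divide each orbital count by the total number of valence electrons.

    Assumption: this is how the 'frac' columns relate to the 'avg' columns.
    """
    total = sum(counts.values())
    return {orbital: n / total for orbital, n in counts.items()}


fractions = {element: valence_fractions(c) for element, c in avg_valence.items()}
for element, frac in fractions.items():
    print(element, {orbital: round(value, 6) for orbital, value in frac.items()})
```

For Cu, for example, one s electron out of eleven valence electrons gives the 0.090909 shown in the `frac s` column.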


## Feature selection

In [2]:
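from sklearn.datasets import fetch_california_housing
from modnet.sklearn import RR

# California housing: 8 numeric features, continuous target
housing = fetch_california_housing()
X = housing["data"]
y = housing["target"]

# Keep the 4 features with the highest RR score
rr = RR(n_feat=4)
X = rr.fit_transform(X, y)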

2022-05-17 17:29:01,936 - modnet - INFO - Multiprocessing on 1 workers.
2022-05-17 17:29:01,942 - modnet - INFO - Computing "self" MI (i.e. information entropy) of features


100%|██████████| 8/8 [00:03<00:00,  2.02it/s]

2022-05-17 17:29:05,916 - modnet - INFO - Computing cross NMI between all features...



100%|██████████| 28/28 [00:08<00:00,  3.15it/s]


['f7', 'f0', 'f2', 'f5']
4
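
To illustrate the idea behind relevance-redundancy selection, here is a minimal, dependency-free sketch. It is not MODNet's implementation: the log output above shows that MODNet works with (normalized) mutual information, whereas this sketch substitutes absolute Pearson correlation and a simple "relevance minus redundancy" score. The function names `pearson` and `select_features` and the toy data are hypothetical and exist only for this example.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_features(columns, target, n_feat):
    """Greedily pick n_feat column names, trading relevance against redundancy."""
    # Relevance: dependence between each feature and the target.
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = []
    remaining = list(columns)
    while remaining and len(selected) < n_feat:

        def score(name):
            # Redundancy: strongest dependence on any already-selected feature.
            redundancy = max(
                (abs(pearson(columns[name], columns[s])) for s in selected),
                default=0.0,
            )
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: f1 is a near-copy of f0, f2 carries complementary signal,
# f3 is unrelated to the target.
columns = {
    "f0": [1, 2, 3, 4, 5, 6, 7, 8],
    "f1": [1, 3, 2, 4, 5, 6, 7, 8],
    "f2": [1, -1, 1, -1, 1, -1, 1, -1],
    "f3": [1, 1, -1, -1, -1, -1, 1, 1],
}
target = [a + 2 * b for a, b in zip(columns["f0"], columns["f2"])]

by_relevance = sorted(
    columns, key=lambda name: abs(pearson(columns[name], target)), reverse=True
)[:2]
by_rr = select_features(columns, target, n_feat=2)
print("top-2 by relevance only:", by_relevance)
print("top-2 by relevance-redundancy:", by_rr)
```

On this toy data, ranking by relevance alone keeps both `f0` and its near-duplicate `f1`, while the greedy score skips `f1` in favour of `f2`, which adds information that `f0` does not already provide.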


## Regression

In [5]:
from modnet.sklearn import MODNetRegressor

# Simple training example; X and y are as defined in the feature selection example
model = MODNetRegressor()
model.fit(X, y)
model.predict(X)

array([[4.6175666 ],
       [4.6177154 ],
       [4.098857  ],
       ...,
       [0.83606505],
       [0.7955098 ],
       [0.9387408 ]], dtype=float32)

## Classification

In [8]:
from modnet.sklearn import MODNetClassifier
from sklearn.datasets import load_digits

# Handwritten digits: 64 pixel features, 10 classes
data = load_digits()
X = data["data"]
y = data["target"]

model = MODNetClassifier()
model.fit(X, y)
model.predict(X)

array([0, 1, 2, ..., 8, 9, 8])

## Pipeline

In [2]:
from sklearn.pipeline import make_pipeline
from pymatgen.core import Composition
from modnet.sklearn import MODNetFeaturizer, RR, MODNetRegressor
pipe = make_pipeline(
    MODNetFeaturizer(), RR(drop_thr=0, n_feat=10), MODNetRegressor()
)
X = [Composition("Si"), Composition("Cu"), Composition("Al"), Composition("Ti")]
y = [1.1, 1.6, 2.6, 0.5]
pipe.fit(X, y)
pipe.predict(X)

2022-05-17 17:48:37,023 - modnet - INFO - Loaded CompositionOnlyFeaturizer featurizer.
2022-05-17 17:48:37,024 - modnet - INFO - Computing features, this can take time...
2022-05-17 17:48:37,025 - modnet - INFO - Applying composition featurizers...
2022-05-17 17:48:37,030 - modnet - INFO - Applying featurizers (AtomicOrbitals(), AtomicPackingEfficiency(), BandCenter(), ElementFraction(), ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x7f893df15400>,
                features=['Number', 'MendeleevNumber', 'AtomicWeight',
                          'MeltingT', 'Column', 'Row', 'CovalentRadius',
                          'Electronegativity', 'NsValence', 'NpValence',
                          'NdValence', 'NfValence', 'NValence', 'NsUnfilled',
                          'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled',
                          'GSvolume_pa', 'GSbandgap', 'GSmagmom',
                          'SpaceGroupNumber'],
                stats=['minimum',

MultipleFeaturizer: 100%|██████████| 4/4 [00:00<00:00, 22.02it/s]


2022-05-17 17:48:40,714 - modnet - INFO - Data has successfully been featurized!
2022-05-17 17:48:40,735 - modnet - INFO - Multiprocessing on 1 workers.
2022-05-17 17:48:40,739 - modnet - INFO - Computing "self" MI (i.e. information entropy) of features


100%|██████████| 270/270 [00:03<00:00, 77.22it/s]

2022-05-17 17:48:44,258 - modnet - INFO - Computing cross NMI between all features...



  mutual_info.loc[res[1], res[2]] = mutual_info.loc[res[2], res[1]] = res[0] / (
  mutual_info.loc[res[1], res[2]] = mutual_info.loc[res[2], res[1]] = res[0] / (
100%|██████████| 4851/4851 [00:08<00:00, 562.55it/s]
  mutual_info.loc[x, target_name] = mutual_info.loc[x, target_name] / (
  mutual_info.loc[x, target_name] = mutual_info.loc[x, target_name] / (


['AtomicOrbitals|HOMO_character', 'AtomicOrbitals|HOMO_element', 'AtomicOrbitals|HOMO_energy', 'AtomicOrbitals|LUMO_character', 'AtomicOrbitals|LUMO_element', 'AtomicOrbitals|LUMO_energy', 'BandCenter|band center', 'ElementFraction|Al', 'ElementFraction|Si', 'ElementFraction|Ti']
10


2022-05-17 17:48:53.963323: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


2022-05-17 17:48:54,972 - modnet - INFO - Loaded CompositionOnlyFeaturizer featurizer.
2022-05-17 17:48:54,974 - modnet - INFO - Computing features, this can take time...
2022-05-17 17:48:54,975 - modnet - INFO - Applying composition featurizers...
2022-05-17 17:48:54,979 - modnet - INFO - Applying featurizers (AtomicOrbitals(), AtomicPackingEfficiency(), BandCenter(), ElementFraction(), ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x7f893e154a00>,
                features=['Number', 'MendeleevNumber', 'AtomicWeight',
                          'MeltingT', 'Column', 'Row', 'CovalentRadius',
                          'Electronegativity', 'NsValence', 'NpValence',
                          'NdValence', 'NfValence', 'NValence', 'NsUnfilled',
                          'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled',
                          'GSvolume_pa', 'GSbandgap', 'GSmagmom',
                          'SpaceGroupNumber'],
                stats=['minimum',

MultipleFeaturizer: 100%|██████████| 4/4 [00:00<00:00, 17.89it/s]


2022-05-17 17:48:57,768 - modnet - INFO - Data has successfully been featurized!
['AtomicOrbitals|HOMO_character', 'AtomicOrbitals|HOMO_element', 'AtomicOrbitals|HOMO_energy', 'AtomicOrbitals|LUMO_character', 'AtomicOrbitals|LUMO_element', 'AtomicOrbitals|LUMO_energy', 'BandCenter|band center', 'ElementFraction|Al', 'ElementFraction|Si', 'ElementFraction|Ti']
10


array([[1.10017  ],
       [1.5998619],
       [2.6000707],
       [0.5000025]], dtype=float32)
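
The predictions come back as a 2-D array with one column per target, so they usually need flattening before being compared with a 1-D `y`. The sketch below does this in plain Python for the four values printed above and computes the mean absolute error. Because validation is omitted in these examples, this is the error on the same four samples that were used for fitting and says nothing about how the model generalizes.

```python
# Targets and predictions copied from the pipeline example above.
y_true = [1.1, 1.6, 2.6, 0.5]
y_pred = [[1.10017], [1.5998619], [2.6000707], [0.5000025]]  # shape (n_samples, 1)

# Flatten the single target column to a 1-D list.
flat_pred = [row[0] for row in y_pred]

# Mean absolute error on the training samples.
mae = sum(abs(t - p) for t, p in zip(y_true, flat_pred)) / len(y_true)
print(f"training MAE: {mae:.2e}")
```

With a NumPy array, `pipe.predict(X).ravel()` would perform the same flattening.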

## GridSearch

In [10]:
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from pymatgen.core import Composition
from modnet.sklearn import MODNetFeaturizer, RR, MODNetRegressor
pipe = make_pipeline(
    MODNetFeaturizer(), RR(drop_thr=0, n_feat=10), MODNetRegressor()
)
X = [Composition("Si"), Composition("Cu"), Composition("Al"), Composition("Ti")]
y = [1.1, 1.6, 2.6, 0.5]


# Keys follow the <step name>__<parameter> convention of scikit-learn pipelines
parameters = {"modnetregressor__lr": [0.1, 0.01], "rr__n_feat": [20, 50]}
grid = GridSearchCV(pipe, cv=2, param_grid=parameters)
grid.fit(X, y)

2022-05-17 17:58:58,492 - modnet - INFO - Loaded CompositionOnlyFeaturizer featurizer.
2022-05-17 17:58:58,494 - modnet - INFO - Computing features, this can take time...
2022-05-17 17:58:58,494 - modnet - INFO - Applying composition featurizers...
2022-05-17 17:58:58,497 - modnet - INFO - Applying featurizers (AtomicOrbitals(), AtomicPackingEfficiency(), BandCenter(), ElementFraction(), ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x7f893edc9040>,
                features=['Number', 'MendeleevNumber', 'AtomicWeight',
                          'MeltingT', 'Column', 'Row', 'CovalentRadius',
                          'Electronegativity', 'NsValence', 'NpValence',
                          'NdValence', 'NfValence', 'NValence', 'NsUnfilled',
                          'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled',
                          'GSvolume_pa', 'GSbandgap', 'GSmagmom',
                          'SpaceGroupNumber'],
                stats=['minimum',

MultipleFeaturizer: 100%|██████████| 2/2 [00:00<00:00,  9.11it/s]


2022-05-17 17:59:00,706 - modnet - INFO - Data has successfully been featurized!
2022-05-17 17:59:00,727 - modnet - INFO - Multiprocessing on 1 workers.
2022-05-17 17:59:00,730 - modnet - INFO - Computing "self" MI (i.e. information entropy) of features


  0%|          | 0/270 [00:02<?, ?it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/ppdebreuck/anaconda3/envs/modnet-develop/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users/ppdebreuck/anaconda3/envs/modnet-develop/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/Users/ppdebreuck/Research/Software/modnet/modnet/preprocessing.py", line 64, in map_mi
    return compute_mi(**kwargs)
  File "/Users/ppdebreuck/Research/Software/modnet/modnet/preprocessing.py", line 53, in compute_mi
    mi = mutual_info_regression(
  File "/Users/ppdebreuck/anaconda3/envs/modnet-develop/lib/python3.8/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/Users/ppdebreuck/anaconda3/envs/modnet-develop/lib/python3.8/site-packages/sklearn/feature_selection/_mutual_info.py", line 368, in mutual_info_regression

2022-05-17 17:59:41,299 - modnet - INFO - Loaded CompositionOnlyFeaturizer featurizer.
2022-05-17 17:59:41,300 - modnet - INFO - Computing features, this can take time...
2022-05-17 17:59:41,300 - modnet - INFO - Applying composition featurizers...
2022-05-17 17:59:41,312 - modnet - INFO - Applying featurizers (AtomicOrbitals(), AtomicPackingEfficiency(), BandCenter(), ElementFraction(), ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x7f893efa3df0>,
                features=['Number', 'MendeleevNumber', 'AtomicWeight',
                          'MeltingT', 'Column', 'Row', 'CovalentRadius',
                          'Electronegativity', 'NsValence', 'NpValence',
                          'NdValence', 'NfValence', 'NValence', 'NsUnfilled',
                          'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled',
                          'GSvolume_pa', 'GSbandgap', 'GSmagmom',
                          'SpaceGroupNumber'],
                stats=['minimum',

MultipleFeaturizer: 100%|██████████| 4/4 [00:00<00:00, 16.14it/s]


2022-05-17 17:59:44,114 - modnet - INFO - Data has successfully been featurized!
2022-05-17 17:59:44,134 - modnet - INFO - Multiprocessing on 1 workers.
2022-05-17 17:59:44,138 - modnet - INFO - Computing "self" MI (i.e. information entropy) of features


100%|██████████| 270/270 [00:03<00:00, 84.00it/s] 

2022-05-17 17:59:47,376 - modnet - INFO - Computing cross NMI between all features...



  mutual_info.loc[res[1], res[2]] = mutual_info.loc[res[2], res[1]] = res[0] / (
  mutual_info.loc[res[1], res[2]] = mutual_info.loc[res[2], res[1]] = res[0] / (
100%|██████████| 4851/4851 [00:08<00:00, 562.40it/s]
  mutual_info.loc[x, target_name] = mutual_info.loc[x, target_name] / (
  mutual_info.loc[x, target_name] = mutual_info.loc[x, target_name] / (


['AtomicOrbitals|HOMO_character', 'AtomicOrbitals|HOMO_element', 'AtomicOrbitals|HOMO_energy', 'AtomicOrbitals|LUMO_character', 'AtomicOrbitals|LUMO_element', 'AtomicOrbitals|LUMO_energy', 'BandCenter|band center', 'ElementFraction|Al', 'ElementFraction|Si', 'ElementFraction|Ti', 'ElementFraction|Cu', 'ElementProperty|MagpieData minimum Number', 'ElementProperty|MagpieData maximum Number', 'ElementProperty|MagpieData mean Number', 'ElementProperty|MagpieData mode Number', 'ElementProperty|MagpieData minimum MendeleevNumber', 'ElementProperty|MagpieData maximum MendeleevNumber', 'ElementProperty|MagpieData mean MendeleevNumber', 'ElementProperty|MagpieData mode MendeleevNumber', 'ElementProperty|MagpieData minimum AtomicWeight']
20


GridSearchCV(cv=2,
             estimator=Pipeline(steps=[('modnetfeaturizer', MODNetFeaturizer()),
                                       ('rr', RR(drop_thr=0, n_feat=10)),
                                       ('modnetregressor', MODNetRegressor())]),
             param_grid={'modnetregressor__lr': [0.1, 0.01],
                         'rr__n_feat': [20, 50]})
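
To make the size of this search explicit, the following sketch enumerates the parameter combinations with the standard library only. It mirrors what a grid search does conceptually and does not call scikit-learn. The step names before the double underscore are the lowercased class names that `make_pipeline` assigned, as the `GridSearchCV` representation above shows. The count of one extra fit assumes the default `refit=True`.

```python
from itertools import product

# Same grid and fold count as in the example above.
param_grid = {"modnetregressor__lr": [0.1, 0.01], "rr__n_feat": [20, 50]}
cv = 2

# Every combination of the listed values is one candidate.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

for candidate in candidates:
    # Each key splits into the pipeline step and the parameter set on that step.
    readable = {tuple(key.split("__", 1)): value for key, value in candidate.items()}
    print(readable)

# One fit per candidate and fold, plus one final refit on all data
# (assuming refit=True, the GridSearchCV default).
n_fits = len(candidates) * cv + 1
print("total fits:", n_fits)
```

Each of these fits runs the whole pipeline, including featurization and feature selection, which is why the featurizer log messages repeat for every fold.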