# Matbench Example Notebook
###### Created April 1, 2021

![logo](matbench_logo.png)


# Description
###### Give a brief overview of this notebook and your algorithm.

This directory is an example of a matbench submission, which should be made via pull request (PR). It is also a minimal working example of how to use the matbench Python package to run, record, and submit benchmarks with nested cross-validation.

The benchmark used here is the original Matbench v0.1, as described in [Dunn et al.](https://doi.org/10.1038/s41524-020-00406-3).

All submissions should include the following in a PR:
- Description
- Benchmark name
- Package versions
- Algorithm description
- Relevant citations
- Any other relevant info

# Benchmark name
###### Name the benchmark you are reporting results for.
Matbench v0.1

# Package versions
###### List all versions of packages required to run your notebook, including the matbench version used.
- matbench==0.1.0
- scikit-learn==0.24.1
- numpy==1.20.1

# Algorithm description
###### An in-depth explanation of your algorithm. 
###### Submissions are limited to one algorithm per notebook.
The model here is a dummy (random) model as described in [Dunn et al.](https://doi.org/10.1038/s41524-020-00406-3).
- Dummy classification model: randomly selects a label in proportion to the label frequencies in the training+validation set.
- Dummy regression model: predicts the mean of the training+validation set.
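
As a rough illustration of what these two strategies compute, here is a minimal standard-library sketch. It is not how scikit-learn implements `DummyClassifier` or `DummyRegressor`; the helper names and the toy targets are hypothetical and exist only for this example.

```python
import random
from collections import Counter
from statistics import mean


def fit_mean_regressor(train_targets):
    """Return the constant a mean-strategy regressor predicts: the training mean."""
    return mean(train_targets)


def fit_stratified_classifier(train_labels):
    """Return the distinct labels and their empirical frequencies in the training data."""
    counts = Counter(train_labels)
    labels = sorted(counts)
    priors = [counts[label] / len(train_labels) for label in labels]
    return labels, priors


def predict_stratified(labels, priors, n_samples, rng):
    """Draw each prediction at random, weighted by the training label frequencies."""
    return rng.choices(labels, weights=priors, k=n_samples)


# Hypothetical toy targets, not taken from any Matbench dataset
constant = fit_mean_regressor([1.0, 2.0, 3.0, 6.0])
regression_predictions = [constant] * 3  # the same value for every test sample

labels, priors = fit_stratified_classifier([True, False, False, False])
rng = random.Random(0)  # seeded so the random draws are repeatable
classification_predictions = predict_stratified(labels, priors, 5, rng)
```

Neither model looks at the inputs, which is why they serve as a baseline: any useful algorithm should beat them.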


# Relevant citations
###### List all relevant citations for your algorithm
- [Dunn et al.](https://doi.org/10.1038/s41524-020-00406-3)
- Your model's other citations go here.


# Any other relevant info
###### Freeform field to include any other relevant info about this notebook, your benchmark, or your PR submission.

---


General notes on notebooks:
- Please provide a short description for each code block, either
    - in markdown, as a separate cell
    - as inline comments
- Keep the output of each cell in the final notebook
- **The notebook must be named `notebook.ipynb`**!

In [1]:
# Import our required classes
from sklearn.dummy import DummyRegressor, DummyClassifier
from matbench.bench import MatbenchBenchmark

# Define our dummy models
The cell below defines the dummy classifier and the dummy regressor used in this notebook.

In [2]:
# Define our dummy models (one for classification tasks, one for regression tasks)
dummy_clf = DummyClassifier(strategy="stratified")
dummy_reg = DummyRegressor(strategy="mean")

# Running the actual benchmark

Create a benchmark of the 13 original matbench v0.1 tasks, train a model on each fold for each task, and record the results with any salient metadata.
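
The nested cross-validation idea can be sketched without matbench at all: an outer loop holds out a test fold, and any model selection happens on an inner split of the remaining training+validation data. The following is a minimal standard-library sketch under stated assumptions. The toy targets, the fold-splitting helper, the 75/25 inner split, and the two candidate "models" (predict the mean or predict the median) are all hypothetical stand-ins; matbench supplies its own fixed folds.

```python
import random
from statistics import mean, median


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def k_fold_indices(n_samples, n_folds, seed):
    """Shuffle indices once and deal them into n_folds disjoint test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


# Hypothetical toy targets; a real task would supply inputs as well
targets = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 9.0, 1.2, 1.8, 2.2]
candidates = {"mean": mean, "median": median}  # stand-ins for hyperparameter settings

fold_scores = []
for test_idx in k_fold_indices(len(targets), n_folds=5, seed=0):
    test_set = set(test_idx)
    train_val = [targets[i] for i in range(len(targets)) if i not in test_set]
    test = [targets[i] for i in test_idx]

    # Inner split: hold out part of train+val for model selection only
    split = int(0.75 * len(train_val))
    inner_train, inner_val = train_val[:split], train_val[split:]

    def validation_error(name):
        constant = candidates[name](inner_train)
        return mae(inner_val, [constant] * len(inner_val))

    best = min(candidates, key=validation_error)

    # Refit the selected candidate on all of train+val, then score once on test
    constant = candidates[best](train_val)
    fold_scores.append(mae(test, [constant] * len(test)))
```

The test fold is touched only once per outer iteration, after the candidate has been chosen, so the reported score is not influenced by model selection.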



In [3]:
# Create a benchmark
mb = MatbenchBenchmark(autoload=False)

# Run our benchmark on the dummy models
for task in mb.tasks:
    task.load()
    
    for fold in task.folds:
        
        # Get the training inputs (an array of pymatgen.Structure or string Compositions, e.g. "Fe2O3")
        # Get the training outputs (an array of either bools or floats, depending on problem)
        train_inputs, train_outputs = task.get_train_and_val_data(fold)
    
        # Do all model tuning and selection with the training data only
        # The split of training/validation is up to you and your algorithm
        if task.metadata.task_type == "classification":
            model = dummy_clf
            param = "class_prior_"
        else:
            model = dummy_reg
            param = "constant_"
        model.fit(train_inputs, train_outputs)
        
        # Get test data (an array of pymatgen.Structure or string compositions, e.g., "Fe2O3")
        test_inputs = task.get_test_data(fold, include_target=False)
        
        # Make predictions on the test data, returning an array of either bool or float, depending on problem
        predictions = model.predict(test_inputs)
        
        # Record our predictions into the benchmark object
        # You can optionally add parameters corresponding to the particular model in this fold
        # if particular hyperparameters or configurations are chosen based on training/validation
        task.record(fold, predictions, params={param: getattr(model, param)})
        

2021-04-08 15:49:57 INFO     Initialized benchmark 'matbench_v0.1' with 13 tasks: 
['matbench_dielectric',
 'matbench_expt_gap',
 'matbench_expt_is_metal',
 'matbench_glass',
 'matbench_mp_e_form',
 'matbench_jdft2d',
 'matbench_log_gvrh',
 'matbench_log_kvrh',
 'matbench_mp_gap',
 'matbench_mp_is_metal',
 'matbench_perovskites',
 'matbench_phonons',
 'matbench_steels']
2021-04-08 15:49:57 INFO     Loading dataset 'matbench_dielectric'...
2021-04-08 15:50:01 INFO     Dataset 'matbench_dielectric loaded.
2021-04-08 15:50:01 INFO     Recorded fold 0 successfully.
2021-04-08 15:50:01 INFO     Recorded fold 1 successfully.
2021-04-08 15:50:01 INFO     Recorded fold 2 successfully.
2021-04-08 15:50:01 INFO     Recorded fold 3 successfully.
2021-04-08 15:50:01 INFO     Recorded fold 4 successfully.
2021-04-08 15:50:01 INFO     Loading dataset 'matbench_expt_gap'...
2021-04-08 15:50:01 INFO     Dataset 'matbench_expt_gap loaded.
2021-04-08 15:50:01 INFO     Recorded fold 0 successfully.
2021-

# Check out the results of the benchmark

First, validate the benchmark to make sure everything is OK. If you did not get any error messages during the recording process, your benchmark results will almost certainly be valid.

Next, get a feel for how the algorithm performs on the benchmark in terms of MAE (regression tasks) or ROCAUC (classification tasks), along with various other scores.

Finally, add some metadata related to this benchmark, if applicable.
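
The scores are reported per task as `max`, `mean`, `min`, and `std` of each metric. One way to read these is as summaries over the per-fold values; the sketch below shows that aggregation for MAE using only the standard library. The per-fold data are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption of this sketch, not a statement about how matbench computes `std`.

```python
from statistics import mean, pstdev


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def summarize(fold_values):
    """Collapse one metric's per-fold values into the four reported statistics."""
    return {
        "max": max(fold_values),
        "mean": mean(fold_values),
        "min": min(fold_values),
        "std": pstdev(fold_values),  # assumption: population standard deviation
    }


# Hypothetical (true, predicted) pairs for three folds
folds = [
    ([1.0, 2.0], [1.5, 1.5]),
    ([0.0, 4.0], [1.0, 2.0]),
    ([3.0, 3.0], [3.0, 5.0]),
]

fold_maes = [mean_absolute_error(y_true, y_pred) for y_true, y_pred in folds]
summary = summarize(fold_maes)
```

A large gap between `min` and `max` suggests the score depends strongly on which fold is held out.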

In [4]:
# Make sure our benchmark is valid
valid = mb.is_valid
print(f"is valid: {valid}")


# Check out how our algorithm is doing using scores
import pprint
pprint.pprint(mb.scores)

# Get some more info about the benchmark
mb.get_info()

# Add some additional metadata about our algorithm
# This metadata is very freeform; include any and all data you think is relevant to your benchmark
mb.add_metadata({"classification_strategy": "stratified", "regression_strategy": "mean", "algorithm": "dummy"})

is valid: True
{'matbench_dielectric': {'mae': {'max': 0.9218171348811479,
                                 'mean': 0.8088199760403804,
                                 'min': 0.7025642979783651,
                                 'std': 0.07178351256516467},
                         'mape': {'max': 0.32661610968668264,
                                  'mean': 0.3197147851956918,
                                  'min': 0.31415205215394426,
                                  'std': 0.004533890338774917},
                         'max_error': {'max': 59.66526111541415,
                                       'mean': 35.399479986904495,
                                       'min': 14.950062733735118,
                                       'std': 17.922149690511926},
                         'rmse': {'max': 3.1054896516592585,
                                  'mean': 1.972764399592657,
                                  'min': 1.0677348073603554,
                                  'std': 0.7

2021-04-08 16:04:09 INFO     
Matbench package 0.1.0 running benchmark 'matbench_v0.1'
	is complete: True
	is recorded: True
	is valid: True

Results:
	- 'matbench_dielectric' MAE mean: 0.8088199760403804
	- 'matbench_expt_gap' MAE mean: 1.1435269609161665
	- 'matbench_expt_is_metal' ROCAUC mean: 0.49239183431150113
	- 'matbench_glass' ROCAUC mean: 0.5005340805941929
	- 'matbench_mp_e_form' MAE mean: 1.005926896959356
	- 'matbench_jdft2d' MAE mean: 67.28508739581468
	- 'matbench_log_gvrh' MAE mean: 0.2931477285283942
	- 'matbench_log_kvrh' MAE mean: 0.2896939420244248
	- 'matbench_mp_gap' MAE mean: 1.3271561213175904
	- 'matbench_mp_is_metal' ROCAUC mean: 0.5012165980957544
	- 'matbench_perovskites' MAE mean: 0.5659986361918642
	- 'matbench_phonons' MAE mean: 323.9822262402079
	- 'matbench_steels' MAE mean: 229.74452965311315
2021-04-08 16:04:09 INFO     User metadata added successfully!


# Save our benchmark to file

Make sure you use the filename `results.json.gz` - this is important for our automated leaderboard to work properly!

In [5]:
# Save the valid benchmark to file to include with your submission
mb.to_file("results.json.gz")