Update matbench dataset links to work with MPContribs #446

Merged (2 commits) on May 1, 2020

ardunn (Contributor) commented May 1, 2020

Big shout-out to @tschaume

ardunn (Contributor, Author) commented May 1, 2020

Dataset downloads are not explicitly tested, so I tested them locally with this script:

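from matminer.datasets.dataset_retrieval import load_dataset, get_available_datasets

# print_format=None returns the dataset names without printing the summary table.
datasets = get_available_datasets(print_format=None)

for dataset in datasets:
    # Only the matbench datasets had their links updated in this PR.
    if "matbench_" in dataset:
        print(f"LOADING {dataset}...")
        # Downloads the file on first use, then loads it into a DataFrame.
        df = load_dataset(dataset)
        print(df)
        print("...\n\n\n")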

All 13 matbench datasets load without errors:

LOADING matbench_dielectric...
Fetching matbench_dielectric.json.gz from https://ml.materialsproject.org/matbench_dielectric.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_dielectric.json.gz
                                              structure         n
0     [[4.29304147 2.4785886  1.07248561] S, [4.2930...  1.752064
1     [[3.95051434 4.51121437 0.28035002] K, [4.3099...  1.652859
2     [[-1.78688104  4.79604117  1.53044621] Rb, [-1...  1.867858
3     [[4.51438064 4.51438064 0.        ] Mn, [0.133...  2.676887
4     [[-4.36731958  6.8886097   0.50929706] Li, [-2...  1.793232
...                                                 ...       ...
4759  [[ 2.79280881  0.12499663 -1.84045389] Ca, [-2...  2.136837
4760  [[0.         5.50363806 3.84192106] O, [4.7662...  2.690619
4761  [[0. 0. 0.] Ba, [ 0.23821924  4.32393487 -0.35...  2.811494
4762  [[0.         0.18884638 0.        ] K, [0.    ...  1.832887
4763  [[0. 0. 0.] Cs, [2.80639641 2.80639641 2.80639...  2.559279

[4764 rows x 2 columns]
...



LOADING matbench_expt_gap...
Fetching matbench_expt_gap.json.gz from https://ml.materialsproject.org/matbench_expt_gap.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_expt_gap.json.gz
            composition  gap expt
0              Ag(AuS)2      0.00
1            Ag(W3Br7)2      0.00
2      Ag0.5Ge1Pb1.75S4      1.83
3     Ag0.5Ge1Pb1.75Se4      1.51
4                Ag2BBr      0.00
...                 ...       ...
4599             ZrTaN3      1.72
4600               ZrTe      0.00
4601             ZrTi2O      0.00
4602             ZrTiF6      0.00
4603               ZrW2      0.00

[4604 rows x 2 columns]
...



LOADING matbench_expt_is_metal...
Fetching matbench_expt_is_metal.json.gz from https://ml.materialsproject.org/matbench_expt_is_metal.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_expt_is_metal.json.gz
            composition  is_metal
0              Ag(AuS)2      True
1            Ag(W3Br7)2      True
2      Ag0.5Ge1Pb1.75S4     False
3     Ag0.5Ge1Pb1.75Se4     False
4                Ag2BBr      True
...                 ...       ...
4916             ZrTaN3     False
4917               ZrTe      True
4918             ZrTi2O      True
4919             ZrTiF6      True
4920               ZrW2      True

[4921 rows x 2 columns]
...



LOADING matbench_glass...
Fetching matbench_glass.json.gz from https://ml.materialsproject.org/matbench_glass.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_glass.json.gz
      composition    gfa
0              Al  False
1        Al(NiB)2   True
2     Al10Co21B19   True
3     Al10Co23B17   True
4     Al10Co27B13   True
...           ...    ...
5675        ZrTi9  False
5676      ZrTiSi2   True
5677      ZrTiSi3   True
5678       ZrVCo8   True
5679       ZrVNi2   True

[5680 rows x 2 columns]
...



LOADING matbench_jdft2d...
Fetching matbench_jdft2d.json.gz from https://ml.materialsproject.org/matbench_jdft2d.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_jdft2d.json.gz
                                             structure  exfoliation_en
0    [[1.49323139 3.32688406 7.26257785] Hf, [3.326...       63.593833
1    [[1.85068084 4.37698238 6.9301577 ] As, [0.   ...      134.863750
2    [[ 0.          2.0213325  11.97279555] Ti, [ 1...       43.114667
3    [[2.39882726 2.39882726 2.53701553] In, [0.054...      240.715488
4    [[-1.83484554e-06  1.73300105e+00  2.61675943e...       67.442833
..                                                 ...             ...
631  [[ 2.38592362  1.37751086 13.178104  ] Co, [-2...       26.426545
632  [[0.         0.         6.02219863] Br, [0.   ...       43.574286
633  [[2.74646086 0.06822876 1.46596737] Se, [6.324...       88.808659
634  [[6.79056646 2.04327631 3.37729406] I, [2.0440...      132.265250
635  [[ 0.69409027  1.22690182 -0.85636865] Co, [-0...       63.564333

[636 rows x 2 columns]
...



LOADING matbench_log_gvrh...
Fetching matbench_log_gvrh.json.gz from https://ml.materialsproject.org/matbench_log_gvrh.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_log_gvrh.json.gz
                                               structure  log10(G_VRH)
0      [[0. 0. 0.] Ca, [1.37728887 1.57871271 3.73949...      1.447158
1      [[3.14048493 1.09300401 1.64101398] Mg, [0.625...      1.518514
2      [[ 2.06884519  2.40627241 -0.45891585] Si, [1....      1.740363
3      [[2.06428082 0.         2.06428082] Pd, [0.   ...      1.707570
4      [[3.09635262 1.0689416  1.53602403] Mg, [0.593...      1.602060
...                                                  ...           ...
10982  [[0. 0. 0.] Rh, [3.2029368  3.2029368  2.09459...      1.414973
10983  [[-1.51157821  4.4173925   1.21553922] Mg, [3....      1.431364
10984  [[4.37546772 4.51128393 6.81784473] H, [0.4573...      1.000000
10985  [[0. 0. 0.] Si, [ 4.55195829  4.55195829 -4.55...      1.579784
10986  [[1.44565668 0.         2.05259079] Al, [1.445...      1.698970

[10987 rows x 2 columns]
...



LOADING matbench_log_kvrh...
Fetching matbench_log_kvrh.json.gz from https://ml.materialsproject.org/matbench_log_kvrh.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_log_kvrh.json.gz
                                               structure  log10(K_VRH)
0      [[0. 0. 0.] Ca, [1.37728887 1.57871271 3.73949...      1.707570
1      [[3.14048493 1.09300401 1.64101398] Mg, [0.625...      1.633468
2      [[ 2.06884519  2.40627241 -0.45891585] Si, [1....      1.908485
3      [[2.06428082 0.         2.06428082] Pd, [0.   ...      2.117271
4      [[3.09635262 1.0689416  1.53602403] Mg, [0.593...      1.690196
...                                                  ...           ...
10982  [[0. 0. 0.] Rh, [3.2029368  3.2029368  2.09459...      1.778151
10983  [[-1.51157821  4.4173925   1.21553922] Mg, [3....      1.724276
10984  [[4.37546772 4.51128393 6.81784473] H, [0.4573...      1.342423
10985  [[0. 0. 0.] Si, [ 4.55195829  4.55195829 -4.55...      1.770852
10986  [[1.44565668 0.         2.05259079] Al, [1.445...      1.954243

[10987 rows x 2 columns]
...



LOADING matbench_mp_e_form...
Fetching matbench_mp_e_form.json.gz from https://ml.materialsproject.org/matbench_mp_e_form.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_mp_e_form.json.gz
                                                structure    e_form
0       [[0. 0. 0.] Pt, [1.40802527 1.40802527 1.40802...  1.995384
1       [[0. 0. 0.] Cd, [-3.88999977 -3.88999977 -3.88...  2.103048
2       [[0. 0. 0.] Cd, [ 2.4281169  2.4281169 -2.4281...  1.983566
3       [[0. 0. 0.] Au, [ 3.73685716  3.73685716 -3.73...  1.528850
4       [[0. 0. 0.] Ag, [-2.33387948 -2.33387948 -2.33...  2.235059
...                                                   ...       ...
132747  [[0.90688012 0.90688012 3.65081586] Sc, [2.720... -4.060381
132748  [[ 0.07547367  2.75237591 -8.16999325] Ce, [ 0... -3.918910
132749  [[3.04379721 2.81477641 7.80253026] Ca, [6.363... -3.851440
132750  [[3.29589261 2.29147874 5.98314495] Ca, [0. 0.... -3.983134
132751  [[0.671257   5.45652618 0.19875358] Ca, [0.970... -3.898357

[132752 rows x 2 columns]
...



LOADING matbench_mp_gap...
Fetching matbench_mp_gap.json.gz from https://ml.materialsproject.org/matbench_mp_gap.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_mp_gap.json.gz
                                                structure  gap pbe
0       [[-0.00812638  0.02476014 -0.01698117] K, [-0....   1.3322
1       [[0.         1.78463544 1.78463544] Cr, [1.784...   0.0000
2       [[-2.13764909 -2.12540569 -2.14704542] Cs, [-6...   0.0000
3       [[0. 0. 0.] Si, [ 4.55195829  4.55195829 -4.55...   0.4113
4       [[0.    2.655 2.655] Ca, [2.655 0.    2.655] C...   0.3514
...                                                   ...      ...
106108  [[ 2.91058377  3.61215869 -0.19100541] Ca, [-0...   1.1354
106109  [[0.07215014 3.75835129 1.91249744] Ta, [2.014...   2.7274
106110  [[0.99954964 0.70129827 4.70919163] Mg, [ 0.87...   2.8860
106111  [[0.99298226 0.71146045 4.70710628] Zn, [ 0.86...   2.2330
106112  [[ 7.28898036  5.15386774 12.6253607 ] Mg, [1....   1.0583

[106113 rows x 2 columns]
...



LOADING matbench_mp_is_metal...
Fetching matbench_mp_is_metal.json.gz from https://ml.materialsproject.org/matbench_mp_is_metal.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_mp_is_metal.json.gz
                                                structure  is_metal
0       [[-0.00812638  0.02476014 -0.01698117] K, [-0....     False
1       [[0.         1.78463544 1.78463544] Cr, [1.784...      True
2       [[-2.13764909 -2.12540569 -2.14704542] Cs, [-6...      True
3       [[0. 0. 0.] Si, [ 4.55195829  4.55195829 -4.55...     False
4       [[0.    2.655 2.655] Ca, [2.655 0.    2.655] C...     False
...                                                   ...       ...
106108  [[ 2.91058377  3.61215869 -0.19100541] Ca, [-0...     False
106109  [[0.07215014 3.75835129 1.91249744] Ta, [2.014...     False
106110  [[0.99954964 0.70129827 4.70919163] Mg, [ 0.87...     False
106111  [[0.99298226 0.71146045 4.70710628] Zn, [ 0.86...     False
106112  [[ 7.28898036  5.15386774 12.6253607 ] Mg, [1....     False

[106113 rows x 2 columns]
...



LOADING matbench_perovskites...
Fetching matbench_perovskites.json.gz from https://ml.materialsproject.org/matbench_perovskites.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_perovskites.json.gz
                                               structure  e_form
0      [[0. 0. 0.] Rh, [1.97726555 1.97726555 1.97726...    2.16
1      [[2.54041798 0.         0.        ] Hf, [1.020...    1.52
2      [[0.60790913 0.         0.        ] Re, [2.186...    1.48
3      [[2.83091357 0.         0.        ] W, [2.6573...    1.24
4      [[0.00518937 0.         0.        ] Bi, [2.172...    0.62
...                                                  ...     ...
18923  [[4.44077598 0.         0.        ] Rb, [2.652...    1.66
18924  [[4.56913824e-03 7.21569024e-19 0.00000000e+00...    2.12
18925  [[0.0040044 0.        0.       ] Zn, [1.821570...    1.50
18926  [[0. 0. 0.] Ca, [2.16744896 2.16744896 2.16744...    2.48
18927  [[1.23999712 4.09195837 4.09195837] Al, [2.500...    1.06

[18928 rows x 2 columns]
...



LOADING matbench_phonons...
Fetching matbench_phonons.json.gz from https://ml.materialsproject.org/matbench_phonons.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_phonons.json.gz
                                              structure  last phdos peak
0     [[2.8943817  2.04663693 5.01321616] Te, [0. 0....        98.585771
1     [[0.98372595 0.69559929 1.70386332] B, [0. 0. ...       701.585723
2     [[0. 0. 0.] Ba, [2.15053493 1.24161183 2.85808...      1138.585689
3     [[-2.23741407  0.         -2.23366548] Al, [0....       718.585722
4     [[1.60015264 0.92384464 2.65049608] B, [0.0000...       795.585716
...                                                 ...              ...
1260  [[1.69099645e-03 3.81913207e+00 1.07685858e-01...       142.585768
1261  [[1.64439731e-03 3.64832409e+00 1.03287739e-01...       223.585761
1262  [[-3.66731982 -1.91142875  2.96640499] K, [ 3....       219.718383
1263  [[ 1.57631457 -0.32583322 -1.57631457] N, [ 1....      1090.585692
1264  [[1.46542555 0.84617794 1.95957081] W, [0.0000...      1038.585697

[1265 rows x 2 columns]
...



LOADING matbench_steels...
Fetching matbench_steels.json.gz from https://ml.materialsproject.org/matbench_steels.json.gz to /Users/ardunn/alex/lbl/projects/common_env/dev_codes/matminer/matminer/datasets/matbench_steels.json.gz
                                           composition  yield strength
0    Fe0.620C0.000953Mn0.000521Si0.00102Cr0.000110N...          2411.5
1    Fe0.623C0.00854Mn0.000104Si0.000203Cr0.147Ni0....          1123.1
2    Fe0.625Mn0.000102Si0.000200Cr0.0936Ni0.129Mo0....          1736.3
3    Fe0.634C0.000478Mn0.000523Si0.00102Cr0.000111N...          2487.3
4    Fe0.636C0.000474Mn0.000518Si0.00101Cr0.000109N...          2249.6
..                                                 ...             ...
307  Fe0.823C0.0176Mn0.00183Si0.000198Cr0.0779Ni0.0...          1722.5
308  Fe0.823Mn0.000618Si0.00101Cr0.0561Ni0.0984Mo0....          1019.0
309  Fe0.825C0.0174Mn0.00175Si0.000201Cr0.0565Ni0.0...          1860.3
310  Fe0.858C0.0191Mn0.00194Si0.000199Cr0.0753Ni0.0...          1812.1
311  Fe0.860C0.0125Mn0.00274Si0.000198Cr0.00439Ni0....          1139.7

[312 rows x 2 columns]
...
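
Since these downloads are only checked by eye, the row counts printed above could be turned into a simple shape check. The following is a minimal sketch, not part of this PR: `EXPECTED_ROWS` and `find_shape_mismatches` are hypothetical names, the counts are copied from the `[N rows x 2 columns]` lines in the output above, and the loader is passed in as an argument so the check can be exercised without any network access.

```python
# Hypothetical helper, not part of matminer.
# Row counts copied from the "[N rows x 2 columns]" lines in the output above.
EXPECTED_ROWS = {
    "matbench_dielectric": 4764,
    "matbench_expt_gap": 4604,
    "matbench_expt_is_metal": 4921,
    "matbench_glass": 5680,
    "matbench_jdft2d": 636,
    "matbench_log_gvrh": 10987,
    "matbench_log_kvrh": 10987,
    "matbench_mp_e_form": 132752,
    "matbench_mp_gap": 106113,
    "matbench_mp_is_metal": 106113,
    "matbench_perovskites": 18928,
    "matbench_phonons": 1265,
    "matbench_steels": 312,
}


def find_shape_mismatches(loader, expected_rows=EXPECTED_ROWS, n_columns=2):
    """Return {dataset: (expected_shape, actual_shape)} for every mismatch.

    `loader` is any callable that maps a dataset name to an object with a
    pandas-style `.shape` attribute, for example matminer's `load_dataset`.
    """
    mismatches = {}
    for name, n_rows in expected_rows.items():
        expected = (n_rows, n_columns)
        actual = tuple(loader(name).shape)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches
```

Calling `find_shape_mismatches(load_dataset)`, with `load_dataset` imported as in the script above, would download every matbench dataset and should return an empty dict if the shapes still match this run. The counts are a snapshot of this output and would need updating if the datasets are revised.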
