Open Catalyst 2020 Nudged Elastic Band (OC20NEB)
======================================================

## Overview
This validation dataset was used to assess model performance in [CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks](https://arxiv.org/abs/2405.02078). It comprises 932 NEB relaxation trajectories covering three types of reactions: desorptions, dissociations, and transfers. NEB calculations are used to find transition states. Because the rate of reaction is determined by the transition state energy, access to transition states is very important for catalysis research. For more information, see the paper.

## File Structure and Contents
The tar file contains 3 subdirectories: dissociations, desorptions, and transfers. As the names imply, these directories contain the converged DFT trajectories for each of the reaction classes. Within these directories, the trajectories are named to identify the contents of the file. Here is an example and the anatomy of the name:

```desorption_id_83_2409_9_111-4_neb1.0.traj```

1. `desorption` indicates the reaction type (`dissociation` and `transfer` are the other possibilities).
2. `id` indicates that the material belongs to the in-domain validation split (`ood`, out of domain, is the other possibility).
3. `83` is the task id. This does not provide relevant information.
4. `2409` is the bulk index of the bulk used in the ocdata bulk pickle file.
5. `9` is the reaction index. For each reaction type there is a reaction pickle file in the repository; in this case it is the 9th entry in that pickle file.
6. `111-4`: the first 3 numbers are the Miller indices (i.e. the (1,1,1) surface), and the last number corresponds to the shift value. In this case the 4th shift enumerated was the one used.
7. `neb1.0`: the number indicates the k value (the NEB spring constant) used. For the full dataset, 1.0 was used, so this does not distinguish any of the trajectories from one another.
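
The naming scheme above can be unpacked programmatically. The following is a minimal sketch, not part of the dataset tooling: `NebTrajName` and `parse_neb_traj_name` are hypothetical names introduced here, and the parser assumes every file follows the seven-field pattern of the example, with single-digit, non-negative Miller indices.

```python
from dataclasses import dataclass


@dataclass
class NebTrajName:
    """Fields encoded in an OC20NEB trajectory filename (hypothetical container)."""

    reaction_type: str  # "desorption", "dissociation", or "transfer"
    split: str  # "id" (in domain) or "ood" (out of domain)
    task_id: int
    bulk_index: int
    reaction_index: int
    miller: tuple  # e.g. (1, 1, 1)
    shift_index: int
    k: float  # NEB spring constant


def parse_neb_traj_name(filename: str) -> NebTrajName:
    """Split a name like 'desorption_id_83_2409_9_111-4_neb1.0.traj' into fields."""
    stem = filename.rsplit("/", 1)[-1]
    if stem.endswith(".traj"):
        stem = stem[: -len(".traj")]
    # Assumption: exactly seven underscore-separated fields, as in the example.
    rxn, split, task, bulk, rxn_idx, surface, neb = stem.split("_")
    miller_str, shift = surface.split("-")
    # Assumption: each Miller index is a single non-negative digit.
    miller = tuple(int(c) for c in miller_str)
    return NebTrajName(
        reaction_type=rxn,
        split=split,
        task_id=int(task),
        bulk_index=int(bulk),
        reaction_index=int(rxn_idx),
        miller=miller,
        shift_index=int(shift),
        k=float(neb[len("neb"):]),
    )
```

For the example file, `parse_neb_traj_name("desorption_id_83_2409_9_111-4_neb1.0.traj")` gives a record whose `bulk_index` and `reaction_index` can be used to look up the matching entries in the pickle files mentioned above.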


Each trajectory file contains repeating frame sets, one per NEB iteration. Although the initial and final frames are not optimized during the NEB, they are saved for every iteration in the trajectory. For this dataset, 10 frames were used, 8 of which were optimized over the NEB, so the length of the trajectory is the number of iterations (N) * 10. If you wanted to look at the frame set prior to optimization and the optimized frame set, you could get them like this:

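```python
from __future__ import annotations

# Notebook shell escape; in a terminal, run the same wget command without the leading "!".
!wget https://dl.fbaipublicfiles.com/opencatalystproject/data/large_files/desorption_id_83_2409_9_111-4_neb1.0.traj

from ase.io import read

traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
unrelaxed_frames = traj[0:10]  # frame set before optimization
relaxed_frames = traj[-10:]  # frame set from the final iteration
```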
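
Because each iteration contributes exactly 10 frames, the full trajectory can also be cut into one frame set per NEB iteration. This is a small sketch under that assumption; `split_into_frame_sets` is a hypothetical helper, and it works on any sequence, including the list returned by `ase.io.read(..., ":")`.

```python
def split_into_frame_sets(frames, images_per_band=10):
    """Group a flat list of frames into consecutive bands of `images_per_band` frames.

    Assumes the trajectory stores one complete band (including the fixed
    initial and final frames) for every NEB iteration.
    """
    if len(frames) % images_per_band != 0:
        raise ValueError(
            f"{len(frames)} frames is not a multiple of {images_per_band}"
        )
    return [
        frames[start : start + images_per_band]
        for start in range(0, len(frames), images_per_band)
    ]


# Usage with the trajectory loaded above (not executed here):
#   bands = split_into_frame_sets(traj)
#   n_iterations = len(bands)   # N
#   unrelaxed_frames = bands[0]
#   relaxed_frames = bands[-1]
```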


## Download
|Splits |Size of compressed version  |Size of uncompressed version    | MD5 checksum (download link)   |
|---    |---    |---    |---    |
|ASE Trajectories   |1.5G  |6.3G   | [52af34a93758c82fae951e52af445089](https://dl.fbaipublicfiles.com/opencatalystproject/data/oc20neb/oc20neb_dft_trajectories_04_23_24.tar.gz)   |
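
After downloading, it is worth checking the archive against the MD5 checksum in the table before extracting it. The following is a minimal sketch using only the Python standard library; `md5_of_file` and `verify_md5` are hypothetical helper names, and the local archive filename in the usage comment is assumed to match the last component of the download link.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 digest with the expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.strip().lower()


# Usage (not executed here; assumes the archive is in the working directory):
#   import tarfile
#   archive = "oc20neb_dft_trajectories_04_23_24.tar.gz"
#   if verify_md5(archive, "52af34a93758c82fae951e52af445089"):
#       with tarfile.open(archive, "r:gz") as tar:
#           tar.extractall("oc20neb")
```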



## Use
One more note: we have not prepared an LMDB for this dataset, because NEB calculations are not supported directly in ocp. You must use the ASE-native calculator class along with ASE infrastructure to run NEB calculations. Here is an example:

```python
import os

from ase.io import read
from ase.mep import DyNEB
from ase.optimize import BFGS
from fairchem.core import FAIRChemCalculator, pretrained_mlip

# Start from the first (unoptimized) frame set of the downloaded trajectory.
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
images = traj[0:10]
predictor = pretrained_mlip.get_predict_unit("uma-s-1p1")

neb = DyNEB(images, k=1)
for image in images:
    image.calc = FAIRChemCalculator(predictor, task_name="oc20")

optimizer = BFGS(
    neb,
    trajectory="neb.traj",
)

# Use a small number of steps here to keep the docs fast during CI, but otherwise do quite reasonable settings.
fast_docs = os.environ.get("FAST_DOCS", "false").lower() == "true"
if fast_docs:
    optimization_steps = 20
else:
    optimization_steps = 300

# First relax the band loosely, then turn on the climbing image and tighten fmax.
conv = optimizer.run(fmax=0.45, steps=optimization_steps)
if conv:
    neb.climb = True
    conv = optimizer.run(fmax=0.05, steps=optimization_steps)
```

The optimizer output looks like this (intermediate steps omitted). The first stage reaches fmax < 0.45 at step 22, climbing is then enabled, and the run reaches fmax < 0.05 at step 183:

```text
      Step     Time          Energy          fmax
BFGS:    0 21:42:50     -305.763014        5.169705
BFGS:    1 21:42:51     -305.691690       11.366597
BFGS:    2 21:42:52     -305.916311        1.889963
...
BFGS:   21 21:43:12     -306.508839        0.552523
BFGS:   22 21:43:13     -306.509933        0.379559
BFGS:   23 21:43:14     -306.396639        3.011676
...
BFGS:  181 21:45:28     -306.362183        0.078798
BFGS:  182 21:45:28     -306.362183        0.056927
BFGS:  183 21:45:29     -306.362183        0.048082
```