# reaction-network (Demo Notebook): Networks

### Author: Matthew McDermott
Last Updated: 08/23/22

The code provided in this notebook is an updated walkthrough of the first example (YMnO3) in the accompanying manuscript (see citation below). The refactored `reaction-network` package contains similar code to what was released with the manuscript; however, many processes/functions are now separated into their own defined classes/methods. For a look at the previous demo notebook (which also contained some of the raw results that went into the manuscript), please check out the _archived_ folder.

**If you use this code or Python package in your work, please consider citing the following paper:**

McDermott, M. J., Dwaraknath, S. S., and Persson, K. A. (2021). A graph-based network for predicting chemical reaction pathways in solid-state materials synthesis. 
Nature Communications, 12(1). https://doi.org/10.1038/s41467-021-23339-x

### Imports

In [1]:
import logging 

from mp_api import MPRester
from pymatgen.core.composition import Element
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter
from monty.serialization import loadfn

from rxn_network.costs.softplus import Softplus
from rxn_network.thermo.chempot_diagram import ChemicalPotentialDiagram
from rxn_network.core.composition import Composition
from pymatgen.core.composition import Composition, Element
from rxn_network.entries.entry_set import GibbsEntrySet
from rxn_network.network.network import ReactionNetwork
from rxn_network.entries.nist import NISTReferenceEntry
from rxn_network.reactions.computed import ComputedReaction
from rxn_network.reactions.reaction_set import ReactionSet
from rxn_network.reactions.open import OpenComputedReaction
from rxn_network.network.entry import NetworkEntry, NetworkEntryType
from rxn_network.network.visualize import plot_network_on_graphistry, plot_network
from rxn_network.pathways.solver import PathwaySolver
from rxn_network.enumerators.basic import BasicEnumerator, BasicOpenEnumerator
from rxn_network.enumerators.minimize import MinimizeGibbsEnumerator, MinimizeGrandPotentialEnumerator

#import graphistry

%load_ext autoreload
%autoreload 2

logging.info("Logging initialized")

  from tqdm.autonotebook import tqdm


### Case Study: YMnO3 assisted metathesis

We will be using the assisted metathesis synthesis of YMnO3 as a case study for the reaction network code. This is the first example discussed in the original manuscript. The assisted metathesis reaction reported by Todd & Neilson (JACS, 2019) corresponds to a net reaction equation:

$$ Mn_2O_3 + 2 YCl_3 + 3Li_2CO_3 \to 2YMnO_3 + 6LiCl + 3CO_2 $$

In the paper, they report a reaction pathway involving the formation of intermediates LiMnO2 and YOCl. These react to form YMnO3 product and LiCl byproduct. (The CO2 is released when Li2CO3 reacts initially to form LiMnO2).

### Downloading and modifying entries

First, we acquire entries for phases in the Y-Mn-O-Li-Cl-C chemical system from the Materials Project (MP), a computed materials database containing calculations for over 130,000 materials.

In [2]:
with MPRester("H6vDZld7WdHyE3kgw8tmLLPKLHXShZQX") as mpr:  # insert your Materials Project API key here
    entries = mpr.get_entries_in_chemsys("Ba-O")



Retrieving ThermoDoc documents:   0%|          | 0/61 [00:00<?, ?it/s]

In [3]:
#entries = loadfn("entries.json.gz")

The `GibbsEntrySet` class allows us to automatically converet `ComputedStructureEntry` objects downloaded from the MP database into `GibbsComputedEntry` objects, where DFT-calculated energies have been converted to machine learning (ML)-estimated equivalent values of the Gibbs free energies of formation, $\Delta G_f$ for all entries at the specified temperature. 

For more information, check out the citation in the documentation for `GibbsComputedEntry`.

In [4]:
temp = 900  # units: Kelvin
entry_set = GibbsEntrySet.from_entries(entries, temp)

The `GibbsEntrySet` class has many helpful functions, such as the following `filter_by_stability()` function, which automatically removes entries which are a specified energy per atom above the convex hull of stability:

In [5]:
entry_set = entry_set.filter_by_stability(0.0)
entry_set.entries_list

[GibbsComputedEntry | mp-122 | Ba1 (Ba)
 Gibbs Energy (900 K) = 0.0000,
 NISTReferenceEntry | BaO
 Gibbs Energy (900 K) = -4.8097,
 GibbsComputedEntry | mp-12957 | O8 (O2)
 Gibbs Energy (900 K) = 0.0000]

In this case, we remove all entries which are unstable (above an energy cutoff of 10 meV/atom), which greatly reduces the combinatorial complexity of the system.

## Building the reaction network

The reaction network can be initialized by providing 3 arguments to the `ReactionNetwork` class:

1. **entries:** iterable of entry-like objects (e.g., `GibbsComputedEntry`)
2. **enumerators:** iterable of enumerators which will be called during the build of the network
3. **cost_function:** the function used to calculate the cost of each reaction edge 

We will use a BasicEnumerator (see the **Enumerators Demo Notebook** for more information on the type of enumerators available):

In [6]:
be = BasicEnumerator()

In [7]:
rxns = be.enumerate(entry_set)

INFO:enumerator:Ray is not initialized. Checking for existing cluster...
INFO:enumerator:Could not identify existing Ray instance. Creating a new one...
2022-09-06 23:37:20,901	INFO worker.py:1509 -- Started a local Ray instance. View the dashboard at [1m[32m127.0.0.1:8265 [39m[22m
INFO:enumerator:HOST: macbook-pro-4.lan, {'node:127.0.0.1': 1.0, 'CPU': 12.0, 'memory': 15041344308.0, 'object_store_memory': 2147483648.0}
BasicEnumerator: 1it [00:04,  4.86s/it]


In [8]:
list(rxns.get_rxns())

[2 BaO -> O2 + 2 Ba, O2 + 2 Ba -> 2 BaO]

The cost function is a monotonic function used to assign weights to edges in the network. In this case, we will use the softplus function, assigned a temperature scaling of $T=900$ K, and use the default arguments which automatically determine the softplus weighting based on the energy per atom of the reaction:

In [8]:
cf = Softplus(900)

Finally, we provide these as arugments to the `ReactionNetwork` initialization:

In [9]:
rn = ReactionNetwork(rxns, cf)

This simply initializes a `ReactionNetwork` object but does not build the network graph. To do so, we call the `.build()` function:

In [9]:
rn.build()

INFO:ReactionNetwork:Building graph from reactions...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2182/2182 [00:02<00:00, 739.77it/s]


This should have completed within a few seconds. You'll notice that two things happened:

1. The enumerator(s) were run and a list of reactions was generated
2. The weighted graph object was built with these reactions and stored under the `graph` attribute of the reaction network object

We can access this graph object, which is a graph-tool object, by using the `graph` attribute:

In [10]:
rn.graph

<Graph object, directed, with 984 vertices and 2674 edges, 2 internal vertex properties, 3 internal edge properties, at 0x14c6f0790>

There are a couple provided ways to plot reaction networks. The first is to use the built in drawing features in graph-tool, which have been provided in a wrapper function.

In [11]:
#plot_network(rn.graph);

You'll notice that at this stage, the reaction network graph is a collection of "sub"-networks, i.e. a collection of smaller reaction networks for smaller chemical subsystems. This configuration will change once we set up for pathfinding in the next section.

The second way to plot graphs is to use graphistry, which requires setting up an account on Graphistry Hub: https://hub.graphistry.com/

In [12]:
#plot_network_on_graphistry(rn.graph)

### Solving for reaction pathways

To solve for reaction pathways, we must set precursor phases, as well as a target phase. This will automatically build all the required "zero-cost" edges which connect the different chemical subsystems. Please see the original manuscript for more detail with regards to how this works. In short, zero-cost edges are drawn between and product node to any reactant node that contains a subset of the set consisting of the {precursors + products} phases. 

In [13]:
rn.set_precursors(["Li2CO3", "Mn2O3", "YCl3"])

In [14]:
rn.set_target("YMnO3")

We can see how this changes the network by re-drawing it:

In [23]:
rn.graph

<Graph object, directed, with 986 vertices and 5268 edges, 2 internal vertex properties, 4 internal edge properties, at 0x14c6f0790>

In [16]:
# plot_network(rn.graph);

You should now see that the chemical subsystems have come together -- this is due to the zero-cost edges that were just described. We can now perform pathfinding to extract reaction pathways.

To get reaction pathways, we simply call the `find_pathways()` method. This automatically handles finding pathways to multiple targets, by calling the internal shortest paths method. The _k_ parameter specifies the number of shortest paths to find to each target.

In [17]:
paths = rn.find_pathways(["YMnO3", "LiCl", "CO2"], k=5)

PATHS to YMnO3 

--------------------------------------- 

YCl3 + 4.5 Mn2O3 -> 0.5 Y2Mn2O7 + Mn8Cl3O10 (dG = -0.015 eV/atom) 
Mn2O3 + Y2Mn2O7 -> 2 YMnO3 + 2 MnO2 (dG = 0.005 eV/atom) 
Total Cost: 0.528 

YCl3 + 4.5 Mn2O3 -> 0.5 Y2Mn2O7 + Mn8Cl3O10 (dG = -0.015 eV/atom) 
0.5 Y2Mn2O7 -> YMnO3 + 0.25 O2 (dG = 0.038 eV/atom) 
Total Cost: 0.535 

YCl3 + 5 Mn2O3 -> Mn8Cl3O10 + YMn2O5 (dG = -0.025 eV/atom) 
YMn2O5 -> YMnO3 + MnO2 (dG = 0.048 eV/atom) 
Total Cost: 0.535 

2 Li2CO3 + Mn2O3 -> Li2O + 2 LiMnCO4 (dG = 0.029 eV/atom) 
Li2O + 0.6667 YCl3 -> 0.3333 Y2O3 + 2 LiCl (dG = -0.297 eV/atom) 
Mn2O3 + Y2O3 -> 2 YMnO3 (dG = -0.069 eV/atom) 
Total Cost: 0.724 

2 Li2CO3 + Mn2O3 -> Li2O + 2 LiMnCO4 (dG = 0.029 eV/atom) 
Li2O + YCl3 -> YClO + 2 LiCl (dG = -0.273 eV/atom) 
3 YClO + Mn2O3 -> YCl3 + 2 YMnO3 (dG = 0.0 eV/atom) 
Total Cost: 0.744 

PATHS to LiCl 

--------------------------------------- 

2 Li2CO3 + Mn2O3 -> Li2O + 2 LiMnCO4 (dG = 0.029 eV/atom) 
Li2O + 0.6667 YCl3 -> 0.3333 Y2O3 + 2 

The output of this method is a list of `BasicPathway` objects. Note that these objects contain a list of reactions and associated costs, but the actual pathway is typically not balanced:

In [18]:
example_path = paths[0]
print(example_path)

YCl3 + 4.5 Mn2O3 -> 0.5 Y2Mn2O7 + Mn8Cl3O10 (dG = -0.015 eV/atom) 
Mn2O3 + Y2Mn2O7 -> 2 YMnO3 + 2 MnO2 (dG = 0.005 eV/atom) 
Total Cost: 0.528


This means that the reactions you see above do not necessarily include all reactants, nor do they include form all desired products. They are simply a series of reactions extracted from the reaction network that maybe encountered as the system attempts to get from starter phases to target phases.

To actually get balanced reactions, we can use the `PathwaySolver` class. This class takes a set of entries, a list of `BasicPathway` objects, as well as a cost function, and can be used to solve for balanced pathways given a net reaction. First we initialize the class:

In [25]:
ps = PathwaySolver(rn.entries, paths, cf) # open_elem="O", chempot=0

To balance the pathways, we must provide a net reaction representing the total conversion of precursors to final products. This corresponds to the assisted metathesis reaction we defined in the beginning. We can automatically make this reaction by initializing a `ComputedReaction` object from the corresponding entries:

In [26]:
product_entries = []
for i in ["YMnO3","LiCl","CO2"]:
    product_entries.append(rn.entries.get_min_entry_by_formula(i))
    
net_rxn = ComputedReaction.balance(rn.precursors,product_entries)
net_rxn

1.5 Li2CO3 + YCl3 + 0.5 Mn2O3 -> YMnO3 + 3 LiCl + 1.5 CO2

Finally, we provide the net reaction to the `PathwaySolver` object. Note that the _intermediate_rxn_energy_cutoff_ helps to limit which intermediate reactions are considered (this can substantially decrease the combinatorial complexity), and the _filter_interdependent_ flag verifies that suggested pathways do not contain interdependent reactions (i.e. where both the reactants of reaction A depend on the products of the reaction B, and the reactants of reaction B depend on the products of reaction A).

**Note: Even though this step is compiled/parallelized using Numba, this is often the most time-intensive step in the reaction network analysis. Consider limiting the value of the maximum number of combos, as well as the value of the intermediate reaction energy cutoff.**

In [27]:
balanced_paths = ps.solve(net_rxn, max_num_combos=4, 
                          intermediate_rxn_energy_cutoff=0.0, 
                          use_minimize_enumerator=True,
                          filter_interdependent=True)

INFO:PathwaySolver:Net reaction: 1.5 Li2CO3 + YCl3 + 0.5 Mn2O3 -> YMnO3 + 3 LiCl + 1.5 CO2 

INFO:PathwaySolver:Identifying reactions between intermediates...
BasicEnumerator: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 128.08it/s]
MinimizeGibbsEnumerator: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 45/45 [00:00<00:00, 165.71it/s]
INFO:PathwaySolver:Found 165 intermediate reactions!
[2m[36m(_get_balanced_paths_ray pid=55065)[0m OMP: Info #273: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
[2m[36m(_get_balanced_paths_ray pid=55061)[0m OMP: Info #273: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Solving pathways by batch...:   0%|                                                                                                       | 0/170 [00:00<?, ?it/s][2m[36

We can now print the suggested, balanced reaction pathways:

In [28]:
for idx, path in enumerate(balanced_paths):
    print(f"Path {idx+1}", "\n")
    print(path)
    print("\n")

Path 1 

Li2O + 0.6667 YCl3 -> 2 LiCl + 0.3333 Y2O3 (dG = -0.297 eV/atom) 
Mn2O3 + Y2O3 -> 2 YMnO3 (dG = -0.069 eV/atom) 
Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
Average Cost: 0.252


Path 2 

Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
Li2CO3 + Mn2O3 -> CO2 + 2 LiMnO2 (dG = 0.009 eV/atom) 
LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
Average Cost: 0.254


Path 3 

Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
2 Li2CO3 -> CO2 + Li4CO4 (dG = 0.055 eV/atom) 
LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
Li4CO4 + 2 Mn2O3 -> CO2 + 4 LiMnO2 (dG = -0.025 eV/atom) 
Average Cost: 0.254


Path 4 

Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
Li2CO3 + Mn2O3 -> LiCO2 + LiMn2O4 (dG = 0.046 eV/atom) 
LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
LiCO2 + LiMn2O4 -> CO2 + 2 LiMnO2 (dG = -0.038 eV/atom) 
Average Cost: 0.255


Pat

In [23]:
balanced_paths

[Li2O + 0.6667 YCl3 -> 2 LiCl + 0.3333 Y2O3 (dG = -0.297 eV/atom) 
 Mn2O3 + Y2O3 -> 2 YMnO3 (dG = -0.069 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 Average Cost: 0.252,
 Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 Li2CO3 + Mn2O3 -> CO2 + 2 LiMnO2 (dG = 0.009 eV/atom) 
 LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
 Average Cost: 0.254,
 Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 2 Li2CO3 -> CO2 + Li4CO4 (dG = 0.055 eV/atom) 
 LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
 Li4CO4 + 2 Mn2O3 -> CO2 + 4 LiMnO2 (dG = -0.025 eV/atom) 
 Average Cost: 0.254,
 Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 Li2CO3 + Mn2O3 -> LiCO2 + LiMn2O4 (dG = 0.046 eV/atom) 
 LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
 LiCO2 + LiMn2O4 -> CO2 + 2 LiMnO2 (dG = -0.038 eV/atom) 
 Average Cost: 0.255,
 Li2O + 0.6667 YCl3 ->

In [24]:
from rxn_network.pathways.pathway_set import PathwaySet

In [25]:
path_set = PathwaySet.from_paths(balanced_paths)

In [26]:
path_set.get_paths()

[Li2O + 0.6667 YCl3 -> 2 LiCl + 0.3333 Y2O3 (dG = -0.297 eV/atom) 
 Mn2O3 + Y2O3 -> 2 YMnO3 (dG = -0.069 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 Average Cost: 0.252,
 Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 Li2CO3 + Mn2O3 -> CO2 + 2 LiMnO2 (dG = 0.009 eV/atom) 
 LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
 Average Cost: 0.254,
 Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 2 Li2CO3 -> CO2 + Li4CO4 (dG = 0.055 eV/atom) 
 LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
 Li4CO4 + 2 Mn2O3 -> CO2 + 4 LiMnO2 (dG = -0.025 eV/atom) 
 Average Cost: 0.254,
 Li2O + YCl3 -> 2 LiCl + YClO (dG = -0.273 eV/atom) 
 Li2CO3 -> CO2 + Li2O (dG = 0.146 eV/atom) 
 Li2CO3 + Mn2O3 -> LiCO2 + LiMn2O4 (dG = 0.046 eV/atom) 
 LiMnO2 + YClO -> LiCl + YMnO3 (dG = -0.081 eV/atom) 
 LiCO2 + LiMn2O4 -> CO2 + 2 LiMnO2 (dG = -0.038 eV/atom) 
 Average Cost: 0.255,
 Li2O + 0.6667 YCl3 ->

We note that **Pathway 12 most closely matches the experimentally observed reaction pathway** (ordering subject to change in the future).

However, many of the pathways include hypothetical (never-before-synthesized) materials (e.g., Li3MnO3), so the top-ranked pathway does not necessarily match what is experimentally observed.

### Running networks with Fireworks

The `NetworkFW` class allows you to easily run the reaction network construction and pathfinding analysis via fireworks. Simply create a network firework, add it to the LaunchPad, and launch it on your computing resource. See the documentation for more information about each of the parameters.

In [27]:
from fireworks import LaunchPad, Workflow
from rxn_network.fireworks.core import NetworkFW

ImportError: cannot import name 'initialize_entry' from 'rxn_network.enumerators.utils' (/Users/mcdermott/PycharmProjects/reaction-network/src/rxn_network/enumerators/utils.py)

In [None]:
lpad = LaunchPad.auto_load()

In [None]:
fw = NetworkFW([BasicEnumerator()], Softplus(900), chemsys="Y-Mn-O-Li-Cl-C", entry_set_params={"e_above_hull":0.000}, 
               pathway_params={"precursors":["YCl3","Mn2O3","Li2CO3"], "targets":["YMnO3","LiCl","CO2"], "k":5},
              solver_params={"max_num_combos":4, "intermediate_rxn_energy_cutoff":0.0, "use_minimize_enumerator": True})

In [None]:
lpad.add_wf(Workflow([fw]))

In [None]:
!rlaunch singleshot

### Thank you!

If any errors with the reaction-network code are encountered, please raise an Issue here: https://github.com/GENESIS-EFRC/reaction-network/issues

In [None]:
from maggma.stores import MongoStore
from rxn_network.pathways.pathway_set import PathwaySet

In [None]:
ms = MongoStore.from_db_file("/Users/mcdermott/db_rn.json")
ms.connect()

In [None]:
d = ms.query_one({"task_id":82}, ["balanced_pathways"])

In [None]:
pset = PathwaySet.from_dict(d["balanced_pathways"])