<h2>A Data-Driven Approach to Predicting Chemical Reaction Kinetics</h2>
<br>
Maneet Goyal<sup>1</sup>, Keren Zhang<sup>2</sup>
<br>
<i><sup>1</sup>School of Civil and Environmental Engineering, <sup>2</sup>School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta, GA</i>
<hr>

In [2]:
# Displaying Old Proposal PDF
from IPython.display import IFrame
IFrame("Proposals/OldProposal.pdf", width=1000, height=500)

<h3>Response to Proposal Review</h3>

<u>Reviewer Comments: </u>The proposed project is certainly related to chemical engineering, and the dataset is far too large and complex for traditional tools, so this is an appropriate project idea. The fact that the team has already scraped the data is commendable, as is the idea of ultimately delivering a Python package. However, the goals are very vague, and should be significantly clarified. I suggest making major revisions to the proposal before moving forward with the project. Specifically, it is not clear to me <u>what the inputs/outputs of the proposed algorithm will be</u>, and <u>how this relates to existing work</u>. 

The proposal indicates that half of “goal 1” is already achieved, so I would focus on the “feature vector” part of this goal. The process of identifying feature vectors for chemical reactions is far from trivial. The proposal glosses over this complexity, but determining <u>an appropriate representation for the molecular inputs</u> may be the most challenging part of the project. <u>Will the reactants be cross-referenced with PubChem to extract information about the bonds that are formed/broken?</u> <u>How will radical species be represented?</u> <u>How will reactions with 2 reactants/products be compared to reactions with 1 or 3 reactants/products?</u> These questions are critical and should be explored further in the proposal goals.

The scope of the outputs should be narrowed, given the challenges with defining the inputs. I recommend focusing on activation energies, since this implicitly defines “feasibility”. Predicting activation energies is also extremely useful. The authors should review the work of Bill Greene at MIT, and the RMG code, which uses some physically-inspired models to predict reaction barriers. This provides a good benchmark for what constitutes “good” performance from a new approach, and a high level of success could be defined as a model that out-performs the existing approach.

<u>Response</u>

<ol>
<li>At prediction time, the input to the model will be a feature vector describing the reaction under consideration. The outputs will be the reaction order and the activation energy. Our primary focus is on reaction order because we have more data for it: more than 55% of our reaction records have no reported activation energy.</li>

Here's a more complete picture: we have around 28000 reactions, with about 65000 records in total across them. Each record stores the reaction order, activation energy, Arrhenius rate law constants, reaction temperature, etc., as reported in a particular work/paper. Around 37000 of these records have no reported activation energy. In contrast, only around 1000 records are missing the reaction order.

<li>We are currently looking into leveraging past work to aid our analysis. We have taken note of Bill Greene's work and will refer to it as necessary.</li>

<li>Each molecule will be represented by a feature vector whose elements are counts of entities (bonds and functional groups) such as C-H, C=H, C#H, C=C, -OH, -CHO, -COOH, -Cl, -Br, -F, etc. We plan to use around 20-30 such elements, including the reaction temperature.</li>

<li>We have used the ChemSpider Web API to query individual reactants. Around 35% of the reactants weren't found in its database, and some of those that were found had multiple structures reported. The names of the reactants that weren't found will be cleaned using OpenRefine and queried again using the same API. We will also look into the PubChem API to query these reactants and extract further relevant elements to complete our features. For reactants with multiple reported structures, we will scan the candidates and select the most relevant structure. Here, structure means the SMILES representation. In the end, only reactants with fully developed feature vectors will be mapped to their reactions, and only those reactions will be used for training.</li>

<li>Currently, we plan to use a binary element in the feature vector to represent whether the reactant is a radical or not.</li>

<li>The feature vector for a reaction will be formed by concatenating the feature vectors of its reactants and appending the normalized reaction temperature. To keep the ordering uniform, the first half of the reaction's feature vector will correspond to the reactant of higher molecular weight. For 3-reactant reactions, the feature vectors of the 2 lowest molecular weight compounds will be summed and then appended to the feature vector of the heaviest compound. If the sum of the molecular weights of these 2 reactants is more than that of the third compound, the summed feature vector will instead occupy the first half of the reaction feature vector.</li>

</ol>
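
<p>The ordering rule in the last item can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the <code>Reactant</code> container and the function name are hypothetical, the temperature is assumed to be normalized already, and zero-padding the second half for 1-reactant reactions is our assumption since the response does not cover that case.</p>

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Reactant:
    """Hypothetical container for one species."""
    features: List[float]  # e.g. counts of C-H, C=C, -OH, ... plus a radical flag
    mol_weight: float


def reaction_feature_vector(reactants: Sequence[Reactant],
                            norm_temperature: float) -> List[float]:
    """Concatenate reactant features (heavier half first) and append temperature."""
    if not 1 <= len(reactants) <= 3:
        raise ValueError("expected 1 to 3 reactants")
    n = len(reactants[0].features)
    # Heaviest reactant first.
    ordered = sorted(reactants, key=lambda r: r.mol_weight, reverse=True)

    if len(ordered) == 1:
        # Assumption: pad the missing second half with zeros.
        halves = [(ordered[0].mol_weight, ordered[0].features), (0.0, [0.0] * n)]
    elif len(ordered) == 2:
        halves = [(r.mol_weight, r.features) for r in ordered]
    else:
        heavy, mid, light = ordered
        # Sum the feature vectors of the two lightest reactants.
        merged = [a + b for a, b in zip(mid.features, light.features)]
        halves = [(heavy.mol_weight, heavy.features),
                  (mid.mol_weight + light.mol_weight, merged)]
        # The summed pair moves to the first half only if its combined weight is larger.
        halves.sort(key=lambda h: h[0], reverse=True)

    return list(halves[0][1]) + list(halves[1][1]) + [norm_temperature]


# Made-up species with two features each, purely for illustration.
a = Reactant([1, 0], 30.0)
b = Reactant([0, 2], 16.0)
c = Reactant([3, 1], 20.0)
two_reactant_vec = reaction_feature_vector([b, a], 0.5)
three_reactant_vec = reaction_feature_vector([a, b, c], 0.5)
```

<p>With this layout every reaction yields a vector of the same length (twice the per-species length plus one), which is what lets 1-, 2- and 3-reactant reactions share one model input.</p>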

<h3>Revised Proposal</h3>

In [3]:
# Displaying Revised Proposal PDF
from IPython.display import IFrame
IFrame("Proposals/RevisedProposal.pdf", width=1000, height=500)

<h3>Importing our Source File for HTML Parsing</h3>

In [5]:
import htmlparser as hp  # Our source file. Open it in a code editor to view the complete implementation.

<h3>Creating an in-memory table from the input HTML file</h3>

In [6]:
myTableCreator = hp.TableCreator("ReactionHTMLFile/NIST Chemical Kinetics Database.html")

----Status----
HTML File read in memory.
----Status----
Soup created out of the HTML file
----Status----
Done


<p style="color:magenta;">*Soup, a term specific to BeautifulSoup, is an alternate representation of the HTML data that makes it easy for us to filter out the required data.</p>
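
<p>To give a feel for what "filtering out the required data" from parsed HTML means, here is a minimal sketch that collects every link and its text from an HTML snippet. It deliberately uses the standard library's <code>html.parser</code> as a stand-in for BeautifulSoup so that it runs without extra installs. The class name and the sample HTML are made up, and this is not how <code>hp.TableCreator</code> is implemented.</p>

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Made-up snippet shaped loosely like a results table with one link per row.
sample_html = '<table><tr><td><a href="ReactionSearch?r0=1">1 record</a></td></tr></table>'
collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)
```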

<h3>Creating an output ".tsv" file after reading all the reactions</h3>

In [7]:
myTableCreator.extrct_rxn_to_txt("PreliminaryOutput/DemoGenerated/reactions.tsv")

----Status----
All reactions written into the output .tsv file in TSV format.


<h3>Read HREFs from the Input ".tsv" file path into a Pandas DataFrame</h3>

In [8]:
myRxnExtrator = hp.RxnDetailsExtractor("PreliminaryOutput/DemoGenerated/reactions.tsv")
print("\n----Here's how our dataframe looks----")  # double quotes keep the apostrophe
print(myRxnExtrator.reactions_df.head(10))
print('\n----No. of Rows---')
print(len(myRxnExtrator.reactions_df.index))

----Status----
Reactions '.tsv' file read into a Pandas DataFrame.

----Here's how our dataframe looks----
                                         Reaction Link  Records
RID                                                            
1    http://kinetics.nist.gov/kinetics/ReactionSear...        1
2    http://kinetics.nist.gov/kinetics/ReactionSear...        1
3    http://kinetics.nist.gov/kinetics/ReactionSear...        1
4    http://kinetics.nist.gov/kinetics/ReactionSear...        1
5    http://kinetics.nist.gov/kinetics/ReactionSear...        1
6    http://kinetics.nist.gov/kinetics/ReactionSear...        1
7    http://kinetics.nist.gov/kinetics/ReactionSear...        1
8    http://kinetics.nist.gov/kinetics/ReactionSear...        1
9    http://kinetics.nist.gov/kinetics/ReactionSear...        1
10   http://kinetics.nist.gov/kinetics/ReactionSear...        1

----No. of Rows---
28983


<h3>Scraping data from the URLs supplied by the DataFrame above into TSV files.</h3>
<h4 style="color: red;">We suggest interrupting the code below as soon as you are convinced that it works. On our system, the entire scrape took around 2 hours.</h4>
<h4 style="color: green;">The records.tsv and ref_reaction.tsv files were generated using BeautifulSoup and are provided with the code. You don't need to generate them.</h4>
<h4>Here, records.tsv will contain all the data pertaining to a reaction record, e.g., reaction order, activation energy, temperature, etc. "ref_reaction.tsv", on the other hand, will contain information on reactions whose kinetics were studied with respect to some other reaction. These reactions will not be included in our analysis.</h4>
<p>The code should start running and printing log messages immediately. If it runs for a long time without printing anything, check whether the NIST server is responding by visiting http://kinetics.nist.gov/.</p>
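
<p>The scraping loop is expected to log each parsed reaction and to skip pages it cannot parse. A minimal sketch of that pattern is given below. The function name, its parameters, and the injected <code>fetch</code> and <code>parse</code> callables are all hypothetical and are not the API of <code>extrct_rec_to_tsv</code>, whose implementation lives in our source file.</p>

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def scrape_records(
    links: Iterable[Tuple[int, str]],
    fetch: Callable[[str], str],
    parse: Callable[[str], dict],
    delay_s: float = 0.0,
) -> Tuple[Dict[int, dict], List[int]]:
    """Fetch and parse each (RID, URL) pair, skipping pages that fail."""
    parsed: Dict[int, dict] = {}
    failed: List[int] = []
    for rid, url in links:
        try:
            parsed[rid] = parse(fetch(url))
            print(f"RID: {rid} parsed")
        except Exception as exc:
            # KeyboardInterrupt is not an Exception subclass, so interrupting
            # the cell still stops the loop as intended.
            failed.append(rid)
            print(f"Bad HTML | {url} ({exc}). Proceeding with the next one.")
        if delay_s:
            time.sleep(delay_s)  # optional pause between requests
    return parsed, failed


# Stand-ins so the sketch runs offline; a real fetch would download the page.
def fake_fetch(url: str) -> str:
    return url


def fake_parse(html: str) -> dict:
    if "bad" in html:
        raise ValueError("no results table found")
    return {"html": html}


parsed, failed = scrape_records(
    [(1, "page-1"), (2, "bad-page"), (3, "page-3")], fake_fetch, fake_parse
)
```

<p>Returning the failed RIDs separately makes it easy to retry or inspect only the pages that were skipped.</p>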

In [23]:
myRxnExtrator.extrct_rec_to_tsv("PreliminaryOutput/DemoGenerated/records.tsv", "PreliminaryOutput/DemoGenerated/ref_reaction.tsv")

RID: 1 parsed
RID: 2 parsed
RID: 3 parsed
RID: 4 parsed
RID: 5 parsed
RID: 6 parsed
RID: 7 parsed
RID: 8 parsed
RID: 9 parsed
RID: 10 parsed
RID: 11 parsed
RID: 12 parsed
RID: 13 parsed
RID: 14 parsed
RID: 15 parsed
RID: 16 parsed
RID: 17 parsed
RID: 18 parsed
RID: 19 parsed
RID: 20 parsed
RID: 21 parsed
RID: 22 parsed
RID: 23 parsed
RID: 24 parsed
RID: 25 parsed
RID: 26 parsed
RID: 27 parsed
RID: 28 parsed
RID: 29 parsed
RID: 30 parsed
RID: 31 parsed
RID: 32 parsed
RID: 33 parsed
RID: 34 parsed
RID: 35 parsed
RID: 36 parsed
RID: 37 parsed
RID: 38 parsed
RID: 39 parsed
RID: 40 parsed
RID: 41 parsed
RID: 42 parsed
RID: 43 parsed
RID: 44 parsed
RID: 45 parsed
Bad HTML | Something is wrong with this HTML http://kinetics.nist.gov/kinetics/ReactionSearch?r0=325135340&r1=0&r2=0&r3=0&r4=0&p0=-2003113971&p1=0&p2=0&p3=0&p4=0&expandResults=true&. Proceeding with the next one.
RID: 47 parsed
RID: 48 parsed


KeyboardInterrupt: 

<hr>
<h4 style="color: blue;">Beyond this point, we manually edited records.tsv to fix some formatting issues introduced by the scraping process.</h4> 
<hr>

<h3>Initializing Populator</h3>
<p><em>Here, we extract all the unique reactants and products from all our reactions.</em></p>

In [26]:
import species as sp
my_populator = sp.Populator()

DEBUG:chemspipy.api:Initializing ChemSpider


--Populator Initialized--


<h3>Moving Reactions and Individual Chemical Species to an HDF5 File as Pandas DataFrames.</h3>

In [27]:
my_populator.reactions_and_species('PreliminaryOutput/DemoGenerated/reactions.tsv', 'PreliminaryOutput/DemoGenerated/DataDF.h5', 'Reactions', 'Species')

--DataFrames Created and Stored in PreliminaryOutput/DemoGenerated/DataDF.h5


your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block1_values] [items->['Reaction Link', 'Reactants', 'Products', 'Reactants_List', 'Products_List', 'Reactants_SIDs_List', 'Products_SIDs_List']]

  return pytables.to_hdf(path_or_buf, key, self, **kwargs)


<p>This is how our reactions dataframe looks:</p>

In [28]:
my_populator.print_from_hdf5(hdf5_store="PreliminaryOutput/DemoGenerated/DataDF.h5", dataframe_key="Reactions", lines=5)

                                         Reaction Link  Records  \
RID                                                               
1    http://kinetics.nist.gov/kinetics/ReactionSear...        1   
2    http://kinetics.nist.gov/kinetics/ReactionSear...        1   
3    http://kinetics.nist.gov/kinetics/ReactionSear...        1   
4    http://kinetics.nist.gov/kinetics/ReactionSear...        1   
5    http://kinetics.nist.gov/kinetics/ReactionSear...        1   

         Reactants            Products   Reactants_List       Products_List  \
RID                                                                           
1    C2H5OCH=CHNH2  C2H4 + CH3NH2 + CO  [C2H5OCH=CHNH2]  [C2H4, CH3NH2, CO]   
2           CBr3OF           CBr3 + OF         [CBr3OF]          [CBr3, OF]   
3           CBr3OF     CBr3O(Â·) + Â·F         [CBr3OF]      [CBr3O(·), ·F]   
4           CCl3OF         Â·F + CCl3O         [CCl3OF]         [·F, CCl3O]   
5           CCl3OF         Â·CCl3 + OF         [CCl3OF] 

<h3>Assigning Scores to Individual Chemical Species.</h3>
<p style="color: green;">A <u>Score</u> is defined as <em>the number of times a species occurs as a product and as a reactant</em>.</p>
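
<p>The score definition can be sketched with a plain counter over reactant and product lists. This is a hypothetical illustration rather than the code behind <code>status_check</code>. The three reactions are taken from the first rows of the reactions dataframe shown above, and counting one point per appearance in a reaction is our assumption.</p>

```python
from collections import Counter
from typing import Iterable, List, Tuple


def species_scores(reactions: Iterable[Tuple[List[str], List[str]]]) -> Counter:
    """Count how often each species appears as a reactant or as a product."""
    scores = Counter()
    for reactants, products in reactions:
        scores.update(reactants)
        scores.update(products)
    return scores


# (Reactants_List, Products_List) pairs for RIDs 1 to 3 of the table above.
reactions = [
    (["C2H5OCH=CHNH2"], ["C2H4", "CH3NH2", "CO"]),
    (["CBr3OF"], ["CBr3", "OF"]),
    (["CBr3OF"], ["CBr3O(·)", "·F"]),
]
scores = species_scores(reactions)
print(scores.most_common(3))
```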

In [29]:
sp.Populator.status_check('PreliminaryOutput/DemoGenerated/DataDF.h5', 'Reactions', 'Species')

<p>This is how our updated species dataframe looks:</p>

In [30]:
my_populator.print_from_hdf5(hdf5_store="PreliminaryOutput/DemoGenerated/DataDF.h5", dataframe_key="Species", lines=8)

                            Species  Scores
SID                                        
0      Ethyl-2,2-dimethylpropionate       1
1                        CH3SCH2CH2       2
2    Cyclobutane-1,2-d2,(1S-trans)-       1
3                                Mn       6
4                           (CH3N)2       7
5                               CDO       5
6     (CH3)2CHCH2CH(CH2)CH2CH(CH3)2       1
7         3-methyl-1,2-benzoquinone       1


<p>An HDF5 file doesn't delete the old content when we update a DataFrame that was stored in it. The new content is appended while the old content remains on disk. We have to get rid of the old content manually to keep the file size under control. For more info: http://pandas.pydata.org/pandas-docs/stable/io.html#delete-from-a-table</p>

In [31]:
# Repack the HDF5 file with compression, then swap the smaller copy in for the original
!ptrepack --chunkshape=auto --propindexes --complevel=9 --complib=blosc PreliminaryOutput/DemoGenerated/DataDF.h5 PreliminaryOutput/DemoGenerated/Comp_DataDF.h5
import os
# os.replace overwrites the destination in one step, so no separate os.remove is needed
os.replace("PreliminaryOutput/DemoGenerated/Comp_DataDF.h5", "PreliminaryOutput/DemoGenerated/DataDF.h5")

<hr>
<h4 style="color: blue;">Beyond this point, we manually dealt with the unique_reactants.tsv to fix some formatting issues. The data was again stored to kinetics.db. 
</h4>
<hr>

<h3>Importing our Source File for Reactant Dataset Augmentation</h3>

In [27]:
import chemicalparser as cp

<h3>Loading Reactants into a Pandas DataFrame and augmenting it with ChemSpider API CSIDs</h3>

In [28]:
myChemParser = cp.ChemicalParser("CleanedOutput/kineticsDB_ParentFiles/unique_reactants.csv")

DEBUG:chemspipy.api:Initializing ChemSpider


----Status----
Unique Reactants CSV Read
----Status----
ChemSpider Authetication Details Loaded.


<h4 style="color: red;">We suggest interrupting the code below as soon as you are convinced that it works. On our system, all the calls were completed in around 2 hours.</h4>

In [29]:
myChemParser.fetch_chemspider_result()  # Making the API calls to ChemSpider

DEBUG:chemspipy.search:Results init
DEBUG:chemspipy.search:Searching in background thread
DEBUG:chemspipy.search:Waiting for search to finish
DEBUG:chemspipy.api:Request: https://www.chemspider.com/Search.asmx/AsyncSimpleSearch {'query': "5,5'-Bicyclopentadienyl"}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.chemspider.com
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/AsyncSimpleSearch HTTP/1.1" 200 128
DEBUG:chemspipy.search:Setting rid: 0fc0ea3f-8197-4d61-8616-08d215aae3d7
DEBUG:chemspipy.search:Checking status: 0fc0ea3f-8197-4d61-8616-08d215aae3d7
DEBUG:chemspipy.api:Request: https://www.chemspider.com/Search.asmx/GetAsyncSearchStatusAndCount {'rid': '0fc0ea3f-8197-4d61-8616-08d215aae3d7'}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/GetAsyncSearchStatusAndCount HTTP/1.1" 200 292
DEBUG:chemspipy.search:{'status': 'Processing', 'count': 0, 'elapsed': '0:00:00.443'}
DEBUG:chemspipy.search:Checkin

0


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/AsyncSimpleSearch HTTP/1.1" 200 128
DEBUG:chemspipy.search:Setting rid: 10a9e744-8539-4041-af5b-df0e0c23379f
DEBUG:chemspipy.search:Checking status: 10a9e744-8539-4041-af5b-df0e0c23379f
DEBUG:chemspipy.api:Request: https://www.chemspider.com/Search.asmx/GetAsyncSearchStatusAndCount {'rid': '10a9e744-8539-4041-af5b-df0e0c23379f'}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/GetAsyncSearchStatusAndCount HTTP/1.1" 200 341
DEBUG:chemspipy.search:{'status': 'ResultReady', 'count': 1, 'message': 'Found by approved synonym', 'elapsed': '0:00:00.043'}
DEBUG:chemspipy.search:Search success!
DEBUG:chemspipy.api:Request: https://www.chemspider.com/Search.asmx/GetAsyncSearchResult {'rid': '10a9e744-8539-4041-af5b-df0e0c23379f'}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/GetAsyncSearchResult HTTP/1.1" 200 221
DEBUG:chemspipy.search:Results: [Compound(2

1


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/AsyncSimpleSearch HTTP/1.1" 200 128
DEBUG:chemspipy.search:Setting rid: a2e73773-31d9-4d69-b56f-fb1296e7dcfc
DEBUG:chemspipy.search:Checking status: a2e73773-31d9-4d69-b56f-fb1296e7dcfc
DEBUG:chemspipy.api:Request: https://www.chemspider.com/Search.asmx/GetAsyncSearchStatusAndCount {'rid': 'a2e73773-31d9-4d69-b56f-fb1296e7dcfc'}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/GetAsyncSearchStatusAndCount HTTP/1.1" 200 332
DEBUG:chemspipy.search:{'status': 'ResultReady', 'count': 1, 'message': 'Found by synonym', 'elapsed': '0:00:00.013'}
DEBUG:chemspipy.search:Search success!
DEBUG:chemspipy.api:Request: https://www.chemspider.com/Search.asmx/GetAsyncSearchResult {'rid': 'a2e73773-31d9-4d69-b56f-fb1296e7dcfc'}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /Search.asmx/GetAsyncSearchResult HTTP/1.1" 200 221
DEBUG:chemspipy.search:Results: [Compound(14947)]


KeyboardInterrupt: 

<h3>Storing the DataFrame into an HDF5 File for later use</h3>

In [None]:
# myChemParser.store_to_hdf5("reactant_df.h5")  # Don't run this unless the cell above ran to completion.

<h3> Exporting the DataFrame to CSV for Use in OpenRefine </h3>

In [4]:
myChemParser.hdf5_df_to_csv("PreliminaryOutput/reactant_df.h5", "reac_df", "PreliminaryOutput/augemented_reac_df.csv")

<hr>
<h4 style="color: blue;">Now, we manually process <strong>augemented_reac_df.csv</strong> to merge alternate representations of the same data and to make the classification more uniform. The cleaned data was again exported to another CSV. 
</h4>
<hr>
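
<p>We did this step by hand in OpenRefine, so there is no project code to show for it. Purely as an illustration of how alternate spellings of the same value can be grouped, here is a simplified key-collision sketch in the spirit of OpenRefine's fingerprint clustering. The function names are hypothetical, the variant spellings in the sample are made up, and this is not a record of the exact operations we applied.</p>

```python
import re
from collections import defaultdict
from typing import Iterable, List


def fingerprint(value: str) -> str:
    """Normalize a string: lowercase, drop punctuation, sort unique tokens."""
    tokens = re.sub(r"[^\w\s]", "", value.strip().lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster(values: Iterable[str]) -> List[List[str]]:
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


# Two real message values from the ChemSpider results plus made-up variants.
messages = [
    "Found by synonym",
    "found by synonym ",
    "Found by approved synonym",
    "Found by Synonym.",
]
clusters = cluster(messages)
print(clusters)
```

<p>Stripping punctuation is reasonable for free-text messages but would be too aggressive for chemical names, where commas, hyphens, and primes carry meaning.</p>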

<h3>Reading the cleaned CSV into a Pandas DataFrame and augmenting it with SMILES strings and MOL2D data</h3>

<h4 style="color: red;">We suggest interrupting the code below as soon as you are convinced that it works. On our system, all the calls were completed in around 2 hours.</h4>

In [19]:
myChemParser.smile_it("CleanedOutput/augmented_reac_df_cleaned.csv", "mol2d_df", "reactant_df.h5")  # Stored in a Dataframe

DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 499905}
DEBUG:urllib3.connectionpool:Resetting dropped connection: www.chemspider.com
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 None
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 21472}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 None
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 129377}


0
1


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 None
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 120497}


4


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 None
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 9317711}


12


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 None
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 217}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 240
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 137749}


14
17


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 723
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 9645}
DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 522
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 7605}


23
25


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 710
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 9911}


26


DEBUG:urllib3.connectionpool:https://www.chemspider.com:443 "POST /InChI.asmx/CSIDToMol HTTP/1.1" 200 None
DEBUG:chemspipy.api:Request: https://www.chemspider.com/InChI.asmx/CSIDToMol {'csid': 441}


28


KeyboardInterrupt: 

<p>This is how the augmented dataframes look:</p>

In [1]:
import pandas as pd
data_store = pd.HDFStore('CleanedOutput/reactant_df_with_extended_MOL.h5')  # Opening HDF5 File
print('These dataframes are present in the data_store HDF5 file:\n')
print(data_store.keys())
print('\n')
print('DF with MOL2D data')
first_df = data_store['mol2d_df']  # Reading the desired DF
print(first_df.head(3))
print('\n')
print('DF with SMILE, Mol. Wt, etc.')
second_df = data_store['smiling_df']  # Reading the desired DF
print(second_df.head(3))
data_store.close()

These dataframes are present in the data_store HDF5 file:

['/mol2d_df', '/reac_df', '/smiling_df']


DF with MOL2D data
                                  Reactant  Occurences     CSIDs  \
ReactantID                                                         
0                  5,5'-Bicyclopentadienyl           1  [499905]   
1               1,2-Propanediol, dinitrate           1   [21472]   
2           4-tert-butylphenoxyacetic acid           1   [14947]   

                                                      Message  \
ReactantID                                                      
0           Found by conversion query string to chemical s...   
1                                   Found by approved synonym   
2                                            Found by synonym   

                                                        Mol2d  
ReactantID                                                     
0           574980\n  -OEChem-02070708282D\n\n 20 21  0   ...  
1           22933\n 

In [5]:
import pandas as pd
data_store = pd.HDFStore('CleanedOutput/reactant_df_with_extended_MOL.h5')  # Opening HDF5 File
first_df = data_store['mol2d_df']  # Reading the desired DF
data_store.close()
print(first_df['Reactant'])

ReactantID
0                                 5,5'-Bicyclopentadienyl
1                              1,2-Propanediol, dinitrate
2                          4-tert-butylphenoxyacetic acid
3                                            (ClCH2CH2)2O
4                Methanesulfonic acid 2-phenylethyl ester
5                                            CH2C(Â·)CCCH
6                                                t-C4H9As
7                                    deuteriocyclopropane
8           Rhodium carbonyl (5-2,4-cyclopentadien-1-yl)-
9                                           cyc-CHCC(CCH)
10                                             4H-Pyranyl
11                                            (CH3)2CHNCO
12                                    1-Methylcyclooctene
13                                            CF2Br-CF2Br
14       Benzoic acid, 4-chloro-, 1,1-dimethylethyl ester
15                                                 CH2SOH
16                                    SiH3Si(:)SiH(SiH3)2
17 