In [1]:
import sys
sys.path.append('..')  # add the parent directory to the import path so chemex can be found
import chemex as cx
import chemex.web
from itertools import islice
import pandas as pd
from pandas import DataFrame

# Experimental and calculated properties
Many properties displayed on ChemSpider pages aren't accessible through the ChemSpider web API. (As far as I can tell!)

`cx.web.cs_scrape_properties(csid, [props])` retrieves properties from the main "Properties" tab of a given ChemSpider page, plus the contents of the "EPI Suite" tab (see below). If you know exactly which properties you want, pass them as a list in the optional second argument. See `cx.web.cs_default_props` for an example.

In [2]:
example_csid_1 = 4471
example_data_1 = cx.web.cs_scrape_properties(example_csid_1)

In [3]:
DataFrame({example_data_1['CSID']: example_data_1})

Property,4471
Bio Activity,[Oxybenzone(Eusolex 4360; Escalol 567) is an o...
CSID,4471
Compound Source,[synthetic Microsource \r\n [01500...
EPI Suite,\n\nPredicted data is generated using the US E...
Experimental Boiling Point,[150 deg C / 5 mm (346.3046 °C / 760 mmHg)\r\n...
Experimental Flash Point,"[216 °C Alfa Aesar, 216 °C Alfa Aesar, 100 °C ..."
Experimental Gravity,[1.3 g/mL Alfa Aesar A17662]
Experimental LogP,"[3.641 Vitas-M STK057962, 2.9758 Synthon-Lab ..."
Experimental Melting Point,"[63 °C TCI H0266, 62-65 °C Alfa Aesar, 64 °C J..."
More details,[]
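
Each property comes back as a list of free-text strings that mix the measurement with its source. As a minimal sketch of turning such strings into numbers, the snippet below pulls Celsius values (single values or ranges) out with a regular expression. The helper name `celsius_values` is hypothetical and not part of `chemex`, the sample strings are copied from the truncated outputs in this notebook, and real ChemSpider entries may use formats this pattern does not cover.

```python
import re

# Matches a single value such as "63 °C" or a range such as "62-65 °C".
_CELSIUS = re.compile(r'(-?\d+(?:\.\d+)?)(?:\s*-\s*(-?\d+(?:\.\d+)?))?\s*°C')

def celsius_values(entries):
    """Return a (low, high) tuple in °C for each entry with a recognizable value."""
    results = []
    for entry in entries:
        match = _CELSIUS.search(entry)
        if match is None:
            continue  # skip entries without a Celsius value
        low = float(match.group(1))
        # A single value is reported as a zero-width range.
        high = float(match.group(2)) if match.group(2) else low
        results.append((low, high))
    return results

sample = ['63 °C TCI H0266', '62-65 °C Alfa Aesar', '-6 °C Alfa Aesar']
print(celsius_values(sample))
```

Only the first match in each string is used, so an entry that lists several values would need a different approach, for example `finditer`.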


## Combining multiple results

In [4]:
example_data_2 = cx.web.cs_scrape_properties(5889)
DataFrame([example_data_1, example_data_2]).set_index('CSID')

CSID,Appearance,Bio Activity,Compound Source,EPI Suite,Experimental Boiling Point,Experimental Flash Point,Experimental Freezing Point,Experimental Gravity,Experimental Ionization Potent,Experimental LogP,...,Predicted Melting Point,Retention Index (Kovats),Retention Index (Lee),Retention Index (Linear),Retention Index (Normal Alkane),Safety,Stability,Symptoms,Target Organs,Toxicity
4471,,[Oxybenzone(Eusolex 4360; Escalol 567) is an o...,[synthetic Microsource \r\n [01500...,\n\nPredicted data is generated using the US E...,[150 deg C / 5 mm (346.3046 °C / 760 mmHg)\r\n...,"[216 °C Alfa Aesar, 216 °C Alfa Aesar, 100 °C ...",,[1.3 g/mL Alfa Aesar A17662],,"[3.641 Vitas-M STK057962, 2.9758 Synthon-Lab ...",...,"[63 °C TCI, 63 °C TCI H0266]",[2012 (estimated with error: 89) NIST Spectra ...,,[1938 (Program type: Ramp; Column cl... (show ...,,"[26-37 Alfa Aesar A17662, 36/37/38 Alfa Aesar ...",,,,
5889,"[Colorless to brown, oily liquid with an aroma...",,,\n\nPredicted data is generated using the US E...,"[183-184 °C Alfa Aesar, 363 F (183.8889 °C)\r\...","[70 °C Alfa Aesar, 158 F (70 °C)\r\n NIOSH B...",[21 F (-6.1111 °C)\r\n NIOSH BW6650000],"[20 g/mL Merck Millipore 3818, 20 g/l Merck Mi...",[7.7 Ev NIOSH BW6650000],[0.9 Egon Willighagen http://dx.doi.org/10.102...,...,,[992 (estimated with error: 83) NIST Spectra m...,[154.3 (Program type: Ramp; Column cl... (show...,[939.2 (Program type: Ramp; Column cl... (show...,[947 (Program type: Isothermal; Col... (show m...,[23/24/25-40-41-43-48/23/24/25-68-50 Alfa Aesa...,"[Stable. Incompatible with oxidizing agents, b...","[Headache, lassitude (weakness, exhaustion), d...","[Blood, cardiovascular system, eyes, liver, ki...","[ORL-RAT LD50 250 mg kg-1 , ORL-MUS LD50 464..."


## Getting information for multiple chemicals at once using a generator
The generator `cx.web.cs_properties_gen` takes a list of CSIDs and, as above, an optional list of properties of interest. If you omit that list, it returns everything it retrieves from each page.

In [5]:
example_csid_list = [4471, 5889, 8677, 20939]
multi_data = cx.web.cs_properties_gen(example_csid_list, cx.web.cs_default_props)
example_multi_df = DataFrame(multi_data)

In [6]:
example_multi_df.set_index('CSID')

CSID,EPI Suite,Experimental Boiling Point,Experimental LogP,Experimental Melting Point,Experimental Solubility,Experimental Vapor Pressure
4471,\n\nPredicted data is generated using the US E...,[150 deg C / 5 mm (346.3046 °C / 760 mmHg)\r\n...,"[3.641 Vitas-M STK057962, 2.9758 Synthon-Lab ...","[63 °C TCI H0266, 62-65 °C Alfa Aesar, 64 °C J...",,
5889,\n\nPredicted data is generated using the US E...,"[183-184 °C Alfa Aesar, 363 F (183.8889 °C)\r\...",[0.9 Egon Willighagen http://dx.doi.org/10.102...,"[-6 °C Alfa Aesar, -6 °C Oxford University Che...","[4% NIOSH BW6650000, Soluble in water Alfa Aes...",[0.6 mmHg NIOSH BW6650000]
8677,,,,"[206 °C Alfa Aesar, 204-207 °C Oxford Universi...",,
20939,\n\nPredicted data is generated using the US E...,"[170-180 °C Alfa Aesar, 175-177 °C Food and Ag...",,"[-40 °C LKT Labs \r\n [L3250], -74...",[Insoluble in water. LKT Labs \r\n ...,
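
A Python generator does no work until something iterates over it, so `itertools.islice` can cap how many items are pulled from one. The sketch below illustrates that pattern with stand-ins: `fake_scrape` and `properties_gen` are hypothetical names that return canned placeholder data rather than contacting ChemSpider. This is not how `cx.web.cs_properties_gen` is implemented, only an illustration of the general idea.

```python
from itertools import islice

calls = []  # records which IDs were "fetched", to make the laziness visible

def fake_scrape(csid, props=None):
    """Stand-in for a page scraper: returns canned data, no network access."""
    calls.append(csid)
    data = {
        'CSID': csid,
        'Experimental LogP': ['placeholder'],
        'Safety': ['placeholder'],
    }
    if props is None:
        return data  # no filter given: return everything
    return {k: v for k, v in data.items() if k == 'CSID' or k in props}

def properties_gen(csids, props=None):
    """Yield one result dict per ID, fetching each only when it is requested."""
    for csid in csids:
        yield fake_scrape(csid, props)

gen = properties_gen([4471, 5889, 8677, 20939], props=['Experimental LogP'])
first_two = list(islice(gen, 2))  # only the first two IDs are fetched
print([d['CSID'] for d in first_two], calls)
```

The remaining IDs are fetched only if the generator is iterated further, which is useful when each item costs a web request.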


# EPI Suite results
[EPI Suite](http://www2.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface) is a software package for estimating environmental fate and properties of chemicals (it also looks up experimentally measured properties from a database). It only runs on Windows, but ChemSpider conveniently stores EPI Suite results for many chemicals. They aren't exposed through the web API, but they appear in a tab on the compound page as a text blob. 

If you use `cx.web.cs_scrape_properties` and the information is available, you can get this EPI Suite blob as the property `'EPI Suite'`.

In [8]:
# Request only the EPI Suite text for CSID 592
d = cx.web.cs_scrape_properties(592, props=['EPI Suite'])
# Print the blob line by line
for line in d['EPI Suite'].split('\n'):
    print(line)



Predicted data is generated using the US Environmental Protection Agency’s EPISuite

                        
 Log Octanol-Water Partition Coef (SRC):
    Log Kow (KOWWIN v1.67 estimate) =  -0.65
    Log Kow (Exper. database match) =  -0.72
       Exper. Ref:  Hansch,C et al. (1995)
    Log Kow (Exper. database match) =  -0.72
       Exper. Ref:  Hansch,C et al. (1995)

 Boiling Pt, Melting Pt, Vapor Pressure Estimations (MPBPWIN v1.42):
    Boiling Pt (deg C):  204.20  (Adapted Stein & Brown method)
    Melting Pt (deg C):  22.66  (Mean or Weighted MP)
    VP(mm Hg,25 deg C):  0.0286  (Modified Grain method)
    MP  (exp database):  52.8 deg C
    BP  (exp database):  122 @ 14.5 mm Hg deg C
    VP  (exp database):  8.14E-02 mm Hg at 25 deg C
    Subcooled liquid VP: 0.153 mm Hg (25 deg C, exp database VP )

 Water Solubility Estimate from Log Kow (WSKOW v1.41):
    Water Solubility at 25 deg C (mg/L):  1e+006
       log Kow used: -0.72 (expkow database)
       no-melting pt equatio

Use `cx.web.epi_suite_values` to extract a few of the specific values as a dict. (This is a rough attempt at text processing; help would be appreciated.)

In [9]:
cx.web.epi_suite_values(d['EPI Suite'])

{'Henrys LC [VP/WSol estimate using EPI values]': '3.390E-009 atm-m3/mole',
 'Log BCF from regression-based method': '0.500 (BCF = 3.162)',
 'Log Koa (KOAWIN v1.10 estimate)': '4.615',
 'Log Koa (experimental database)': 'None',
 'Log Kow (Exper. database match)': '-0.72',
 'Log Kow (KOWWIN v1.67 estimate)': '-0.65',
 'Persistence Time': '309 hr',
 'Ready Biodegradability Prediction': 'YES'}
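
For anyone who wants to experiment with their own parsing, here is a minimal sketch of one way to pull `name = value` lines out of the blob with a regular expression. It is not the implementation of `cx.web.epi_suite_values`. The function name `parse_equals_lines` is hypothetical, the sample text is a shortened copy of the EPI Suite output printed earlier, and only the first occurrence of a repeated name is kept.

```python
import re

# Matches a line such as "    Log Kow (KOWWIN v1.67 estimate) =  -0.65".
# Lines that use ":" instead of "=" (section titles, references) are ignored.
_PAIR = re.compile(r'^\s*(?P<name>[^=:]+?)\s*=\s*(?P<value>\S.*?)\s*$')

def parse_equals_lines(blob):
    """Map each 'name = value' line to its value, keeping the first match per name."""
    values = {}
    for line in blob.splitlines():
        match = _PAIR.match(line)
        if match:
            values.setdefault(match.group('name'), match.group('value'))
    return values

sample = """
 Log Octanol-Water Partition Coef (SRC):
    Log Kow (KOWWIN v1.67 estimate) =  -0.65
    Log Kow (Exper. database match) =  -0.72
       Exper. Ref:  Hansch,C et al. (1995)
    Log Kow (Exper. database match) =  -0.72
"""
print(parse_equals_lines(sample))
```

Values stay as strings here, matching the dict shown above; converting them to numbers would need per-field handling because some values carry units or extra text.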