<a href="https://colab.research.google.com/github/2kristint/scifinder/blob/main/scifinderAPI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install chemicals pubchempy



In [None]:
import base64
import json
from pprint import pprint

import chemicals
import pandas as pd
import pubchempy as pcp
import requests
from bs4 import BeautifulSoup
from IPython.display import HTML, SVG, display
from PIL import Image
# import cirpy
# import io

# Base URL of the CAS Common Chemistry detail endpoint (not used in the cells below)
detail_base_url = "https://commonchemistry.cas.org/api/detail?"

In [None]:
chemical_list = 'ammonium hydroxide, Camphor-10-sulfonyl chloride, methylene chloride, magnesium sulfate, toluene, peracetic acid, sodium sulfate, hexane, 153221-24-0'.split(', ')
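
The list above mixes common names with one CAS Registry Number (`153221-24-0`). As a minimal sketch, assuming it is useful to tell the two apart before querying a service, the check below uses the standard CAS check-digit rule. The helper names `is_valid_cas` and `split_identifiers` are hypothetical and are not part of this notebook or of any library.

```python
import re

# CAS Registry Numbers have the form NNNNNNN-NN-N (2 to 7 digits, 2 digits, 1 check digit)
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

def is_valid_cas(identifier):
    """Return True if identifier is shaped like a CAS number and its check digit matches."""
    match = CAS_PATTERN.match(identifier.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit before the check digit
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit

def split_identifiers(identifiers):
    """Split a list of strings into (names, cas_numbers)."""
    names, cas_numbers = [], []
    for item in identifiers:
        (cas_numbers if is_valid_cas(item) else names).append(item)
    return names, cas_numbers

names, cas_numbers = split_identifiers(['toluene', 'Camphor-10-sulfonyl chloride', '153221-24-0'])
print(names)
print(cas_numbers)
```

A name such as `Camphor-10-sulfonyl chloride` contains hyphens and digits but does not match the pattern, so it stays in the list of names.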

In [None]:
# Create a list of PubChem CIDs from a list of compound names
def create_cidList(nameList):
  cids = []
  for element in nameList:
    # get_compounds returns a list of matches; keep the CID of the first one
    matches = pcp.get_compounds(element, 'name')
    cids.append(matches[0].cid)
  return cids

# Get the 2D structure image
# See https://pubchem.ncbi.nlm.nih.gov/docs/imaging-services
def display_image(cidNum):
  # Build an <img> tag that points at the PubChem image service for this CID
  url = "https://pubchem.ncbi.nlm.nih.gov/image/imgsrv.fcgi?cid=" + str(cidNum)
  return '<img src="' + url + '" style="max-height:124px;"/>'

# Get the melting point
def get_pug_melting_point(cidNum):
  data = requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/" + str(cidNum) + "/xml?heading=Melting+Point")
  html = BeautifulSoup(data.content, "xml")
  melting_points = html.find_all('String')
  for melting_point in melting_points:
    # Strip the enclosing <String> tags to leave only the text
    mp = str(melting_point)
    mp = mp.replace('<String>','').replace('</String>','')
    # Keep the first value that contains 'C' (intended to pick a Celsius value)
    if 'C' in mp:
      return mp
  # No matching string found
  return None

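# Get the density
def get_pug_density(cidNum):
  data = requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/" + str(cidNum) + "/xml?heading=Density")
  html = BeautifulSoup(data.content, "xml")
  densities = html.find_all('String')
  for density in densities:
    # Strip the enclosing <String> tags to leave only the text
    d = str(density)
    d = d.replace('<String>','').replace('</String>','')
    # Keep the first value that contains 'C'; densities reported without
    # a temperature (or any capital C) are skipped by this filter
    if 'C' in d:
      return d
  # No matching string found
  return None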

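# Get the hazards summary
def get_pug_hazards(cidNum):
  data = requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/" + str(cidNum) + "/xml?heading=Hazards+Summary")
  html = BeautifulSoup(data.content, "xml")
  hazards = html.find_all('String')
  for hazard in hazards:
    # Strip the enclosing <String> tags to leave only the text
    h = str(hazard)
    h = h.replace('<String>','').replace('</String>','')
    # Same filter as the melting-point lookup: keep the first string containing 'C'
    if 'C' in h:
      return h
  # No matching string found
  return None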

#creating table
def create_table(nameList):
  images = []
  mp = []
  name = []
  mol_wt = []
  density = []
  sds_list = []
  dataset = pd.DataFrame()

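  # Look up each name once and keep the first PubChem match
  compounds = []
  for element in nameList:
    compound = pcp.get_compounds(element, 'name')
    compounds.append(compound[0])

  # CIDs of the matched compounds (avoids a second round of name lookups)
  cids = [compound.cid for compound in compounds]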

  #images
  for cidNum in cids:
    images.append(display_image(cidNum))

  # melting point
  for cidNum in cids:
    mp.append(get_pug_melting_point(cidNum))

  # density
  for cidNum in cids:
    density.append(get_pug_density(cidNum))

  #name
  for compoundName in nameList:
    name.append(compoundName)

  #molecular weight
  for compound in compounds:
    mol_wt.append(compound.molecular_weight)

  # hazards summary
  for cidNum in cids:
    sds_list.append(get_pug_hazards(cidNum))

  # Assemble the columns into one table and render it as HTML;
  # escape=False keeps the <img> tags in the '2D Models' column intact
  dataset = pd.DataFrame({
    '2D Models': images,
    'Name': name,
    'Molecular Weight': mol_wt,
    'Density': density,
    'Melting Point': mp,
    'Hazards': sds_list,
  })
  display(HTML(dataset.to_html(escape=False)))

create_table(chemical_list)



Unnamed: 0,2D Models,Name,Molecular Weight,Density,Melting Point,Hazards
0,,ammonium hydroxide,35.046,About 0.90 @ 25 °C/25 °C,-58 °C (25%),"Corrosive to skin; [Quick CPC] High inhalation exposure can cause pulmonary edema. [HSDB] A corrosive substance that can cause injury to the skin, eyes and respiratory tract; Inhalation of high concentrations may cause laryngeal edema, respiratory tract inflammation, and pneumonia; Prolonged or repeated exposure to vapor or aerosol may cause injury to lungs; [ICSC] Solution of <28% aqueous ammonia: Causes burns; Short-term exposure causes smarting of the skin and first-degree burns; Second-degree burns can result from extended exposure; [CHRIS] Human inhalation of 408 ppm causes focal fibrosis (pneumoconiosis) and acute pulmonary edema; [RTECS] Causes burns; A lachrymator; Toxic by ingestion; Inhalation may cause corrosive injuries to upper respiratory tract and lungs; [Aldrich MSDS] See Ammonia."
1,,Camphor-10-sulfonyl chloride,250.74,,,
2,,methylene chloride,84.93,"1.322 at 68 °F (USCG, 1999) - Denser than water; will sink",-95 °C,"Methylene chloride is predominantly used as a solvent. The acute (short-term) effects of methylene chloride inhalation in humans consist mainly of nervous system effects including decreased visual, auditory, and motor functions, but these effects are reversible once exposure ceases. The effects of chronic (long-term) exposure to methylene chloride suggest that the central nervous system (CNS) is a potential target in humans and animals. Human data are inconclusive regarding methylene chloride and cancer. Animal studies have shown increases in liver and lung cancer and benign mammary gland tumors following the inhalation of methylene chloride."
3,,magnesium sulfate,120.37,"Efflorescent crystals or powder; bitter, saline, cooling taste; density: 1.67; pH 6-7; soluble in water (g/100 ml): 71 @ 20 °C, 91 @ 40 °C; slightly soluble in alcohol; its aqueous soln is neutral; it loses 4 H2O @ 70-80 °C, 5 H2O @ 100 °C, 6 H2O @ 120 °C; loses last molecule of H2O @ about 250 °C; rapidly reabsorbing water when exposed to moist air; on exposure to dry air at ordinary temperatures it losses approx one H2O /Heptahydrate/",Decomposes @ 1124 °C,No listed effects of short-term or long-term exposure; [ICSC]
4,,toluene,92.14,"0.867 at 68 °F (USCG, 1999) - Less dense than water; will float",-94.9 °C,"Toluene is added to gasoline, used to produce benzene, and used as a solvent. Exposure to toluene may occur from breathing ambient or indoor air affected by such sources. The central nervous system (CNS) is the primary target organ for toluene toxicity in both humans and animals for acute (short-term) and chronic (long-term) exposures. CNS dysfunction and narcosis have been frequently observed in humans acutely exposed to elevated airborne levels of toluene; symptoms include fatigue, sleepiness, headaches, and nausea. CNS depression has been reported to occur in chronic abusers exposed to high levels of toluene. Chronic inhalation exposure of humans to toluene also causes irritation of the upper respiratory tract and eyes, sore throat, dizziness, and headache. Human studies have reported developmental effects, such as CNS dysfunction, attention deficits, and minor craniofacial and limb anomalies, in the children of pregnant women exposed to high levels of toluene or mixed solvents by inhalation. EPA has concluded that that there is inadequate information to assess the carcinogenic potential of toluene."
5,,peracetic acid,76.05,1.226 g/cu cm at 15 °C,-0.2 °C,"Highly corrosive to skin; [Quick CPC] Commercial solution is peracetic acid, hydrogen peroxide, acetic acid, and water (at equilibrium); [Merck Index] May decompose violently from shock, friction, or concussion; A strong oxidizing agent that reacts violently with combustible substances; Corrosive to skin, eyes, and respiratory tract; Inhalation may cause pulmonary edema; [ICSC] Does not exist in pure (100%) form; Commercially available as an equilibrium solution, distilled product (mainly peracetic acid and water), and can be generated in situ with an activator and a persalt dissolved in water; [OECD SIDS] Occupational asthma confirmed by broncho-provocation testing in two endoscopy unit workers exposed to peracetic acid/hydrogen peroxide mixture; [Malo] TLV Basis: Irritation (upper respiratory, eye, and skin); [ACGIH] Available in 40% solution; Causes severe burns; Inhalation of high concentration can cause injury to the upper respiratory tract; [MSDSonline]"
6,,sodium sulfate,142.04,,884 °C,May cause gastrointestinal disturbance if ingested; [ICSC] Nonirritating to skin and mucous membranes; [HSDB]
7,,hexane,86.18,"0.659 at 68 °F (USCG, 1999) - Less dense than water; will float",-95.35 °C,"n-Hexane is a chemical made from crude oil. Pure n-Hexane is a colorless liquid with a slightly disagreeable odor. It is highly flammable, and its vapors can be explosive. Puren-Hexane is used in laboratories. Most of then-Hexane used in industry is mixed with similar chemicals called solvents. The major use for solvents containing n-Hexane is to extract vegetable oils from crops such as soybeans. These solvents are also used as cleaning agents in the printing, textile, furniture, and shoemaking industries. Certain kinds of special glues used in the roofing and shoe and leather industries also contain n-Hexane. Several consumer products containn-Hexane, such as gasoline, quick-drying glues used in various hobbies, and rubber cement. ."
8,,153221-24-0,229.3,,,
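
The three `get_pug_*` functions above differ only in the heading they request, and the empty cells in the table come from lookups where no `<String>` passed the `'C'` filter. As a rough sketch, the shared parsing step could be factored into a single helper. The version below uses the standard library XML parser instead of BeautifulSoup so that it runs without a network connection. The helper name `first_string_containing`, the sample XML, and its namespace are hypothetical; the real PUG-View response is assumed to nest its values in `<String>` elements as the code above expects.

```python
import xml.etree.ElementTree as ET

def first_string_containing(xml_text, marker):
    """Return the text of the first <String> element that contains marker, else None."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Drop any '{namespace}' prefix so the tag name can be compared directly
        local_name = element.tag.rsplit('}', 1)[-1]
        if local_name == 'String':
            text = ''.join(element.itertext()).strip()
            if marker in text:
                return text
    return None

# Hypothetical, simplified stand-in for a PUG-View response
SAMPLE_XML = """<Record xmlns="http://example.org/hypothetical">
  <Information><Value><StringWithMarkup><String>-139 °F</String></StringWithMarkup></Value></Information>
  <Information><Value><StringWithMarkup><String>-95 °C</String></StringWithMarkup></Value></Information>
</Record>"""

print(first_string_containing(SAMPLE_XML, '°C'))
print(first_string_containing(SAMPLE_XML, 'K'))
```

Matching on `'°C'` instead of a bare `'C'` is a design choice in this sketch: a bare `'C'` also matches any string that merely contains a capital C, such as "Corrosive".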
