# PubChemPy for the Bioinformatics Club
This notebook is designed to introduce you to PubChemPy, a Python library for working with the [PubChem](https://www.example.com) database. To use pubchempy, you'll first need to install it, either with the command

```pip install pubchempy```

on your command line or use the command

```!pip install pubchempy```

in the first code cell of this notebook.

In [None]:
!pip install pubchempy

Once you have installed pubchempy on your computer, you'll need to import it to use it. The standard abbreviation for pubchempy is pcp.

In [None]:
import pubchempy as pcp

Now let's play with it a bit. We're going to learn a bit about the compound object that pubchempy creates, starting with NAD+, a compound I worked with every day in graduate school. In the next cell, use the 

```Compound.from_cid(compound#)```

command to pull NAD+ from PubChem.

In [None]:
molecule = pcp.Compound.from_cid(5287958)

Now we will explore the contents of the compound object. Each piece of information is an attribute that can be read using

```molecule.trait```

where trait can be molecular_weight, molecular_formula, isomeric_smiles, xlogp, iupac_name, or synonyms. You can also select any trait from a menu if you type

```print(molecule.<tab>)```

where <tab> means to press the Tab key so you can see all the options. Try a few.

In [None]:
print(molecule.molecular_weight)

In [None]:
print(molecule.molecular_formula)

In [None]:
print(molecule.isomeric_smiles)

In [None]:
print(molecule.xlogp)

In [None]:
print(molecule.iupac_name)

In [None]:
print(molecule.synonyms)
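
Instead of one cell per trait, the attributes can also be read in a loop with Python's built-in `getattr`. The sketch below is only an illustration: `summarize_traits` is a hypothetical helper, not part of pubchempy, and the stand-in object is there so the cell runs without a network connection. Passing `molecule` from above should work the same way.

```python
from types import SimpleNamespace

# Attribute names taken from the list of traits above.
TRAITS = ["molecular_weight", "molecular_formula", "xlogp", "iupac_name"]

def summarize_traits(compound, traits=TRAITS):
    """Return {trait: value}, using None for any attribute the object lacks."""
    return {trait: getattr(compound, trait, None) for trait in traits}

# Stand-in object for illustration only; swap in `molecule` in the notebook.
demo = SimpleNamespace(molecular_formula="C6H6", iupac_name="benzene")

summary = summarize_traits(demo)
for trait, value in summary.items():
    print(f"{trait}: {value}")
```

Traits missing from the object come back as `None` rather than raising an error, which keeps the loop going when a field is absent.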

What if you don't know the PubChem CID for your compound of interest? pubchempy has a get_compounds function that addresses this: it searches by name or formula and returns a list of matching compound objects.

In [None]:
results = pcp.get_compounds('C21H27N7O14P2', 'formula')
print(results)

In [None]:
pcp.get_compounds('tylenol', 'name', record_type='3d')

In [None]:
tylenol = pcp.Compound.from_cid(1983)
print(tylenol.iupac_name)
print(tylenol.molecular_weight)
print(tylenol.molecular_formula)
print(tylenol.synonyms)

In [None]:
pcp.get_compounds('benzene', 'name')

In [None]:
benzene = pcp.Compound.from_cid(241)
print(benzene.isomeric_smiles)

### DataFrames from PubChemPy

You can import information from PubChem in the form of a pandas DataFrame by passing as_dataframe=True to the search functions.

In [None]:
df1 = pcp.get_compounds('C20H41Br', 'formula', as_dataframe=True)
df2 = pcp.get_substances([9,99,999,9999], as_dataframe=True)
df3 = pcp.get_properties(['isomeric_smiles', 'xlogp', 'rotatable_bond_count'], 'C20H41Br', 'formula', as_dataframe=True)

In [None]:
df3.head()

In [None]:
df2.head()

In [None]:
df1.head()

In [None]:
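# Download image and property files from PubChem
import os

# The target folders must exist before pcp.download writes into them
os.makedirs('images', exist_ok=True)
os.makedirs('data', exist_ok=True)

pcp.download('PNG', 'images/asp.png', 'Aspirin', 'name', overwrite=True)
pcp.download('PNG', 'images/acet.png', 'Acetaminophen', 'name', overwrite=True)
pcp.download('CSV', 'data/s.csv', [1,2,3], operation='property/CanonicalSMILES,IsomericSMILES', overwrite=True)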


In [None]:
# Display the aspirin and acetaminophen images

from IPython.display import Image, display

image_paths = ['images/asp.png', 'images/acet.png']

for image_path in image_paths:
    display(Image(filename=image_path))

In [None]:
cid = pcp.get_cids('acetaminophen', 'name')
cid
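
`get_cids` returns a list, which can be empty when nothing matches or hold several CIDs. A minimal sketch of a defensive lookup follows. `first_cid` and its `search` parameter are hypothetical names introduced here, and `fake_search` exists only so the example runs offline; in the notebook you would pass `pcp.get_cids` as `search` instead.

```python
def first_cid(name, search):
    """Return the first CID found for `name`, or None if there are no matches.

    `search` is any callable shaped like pcp.get_cids(identifier, namespace).
    """
    cids = search(name, "name")
    return cids[0] if cids else None

# Offline stand-in for pcp.get_cids; 1983 is the Tylenol CID used earlier.
def fake_search(identifier, namespace):
    known = {"acetaminophen": [1983]}
    return known.get(identifier.lower(), [])

print(first_cid("Acetaminophen", fake_search))
print(first_cid("not-a-compound", fake_search))
```

Returning `None` for an empty result lets the caller check for a miss before handing the value to `Compound.from_cid`.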

In [None]:
# Visualize aspirin in 3D (aspirin is PubChem CID 2244)

import py3Dmol
view = py3Dmol.view(width = 680, height = 250, query ='cid:2244', viewergrid = (1,3), linked = True)

view.setStyle({'line': {'linewidth': 8}}, viewer = (0,0))
view.setStyle({'stick': {'colorscheme':'cyanCarbon'}}, viewer = (0,1))
view.setStyle({'sphere': {}}, viewer = (0,2))

view.setBackgroundColor('#ebf4fb', viewer = (0,0))
view.setBackgroundColor('#cda9fc', viewer = (0,1))
view.setBackgroundColor('#e6e6e6', viewer = (0,2))