<a href="https://colab.research.google.com/github/Nithish-Chandra-Devarashetty/deepchem/blob/master/snap1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Using SNAPFeaturizer for Molecular Graph Representation
Graph Neural Networks (GNNs) are well suited to molecular property prediction because molecules map naturally onto graphs: each atom is a node, and each bond is an edge between two nodes. Before a GNN can process a molecule, a featurizer must convert its structure into numerical arrays. SNAPFeaturizer is a compact option that encodes two features per atom (atomic number and chirality) and two per bond (bond type and bond direction).
#Colab
This tutorial and the rest in the sequence are designed to be run in Google Colab. To open this notebook in Colab, use the following link.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wSrOFG-_tkxW4hptDIIXFwxoeJwk9yUQ?usp=sharing)

##Installing Dependencies
Ensure you have DeepChem and RDKit installed. If not, install them using:

In [None]:
!pip install deepchem rdkit

##Importing Required Libraries

In [None]:
import deepchem as dc
from deepchem.feat import SNAPFeaturizer
from rdkit import Chem

##Understanding SNAPFeaturizer
SNAPFeaturizer encodes each molecule as a graph with three components:

*   Atom features (atomic number, chirality).
*   Bond features (bond type, bond direction).
*   Graph connectivity (adjacency of atoms).
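
To make these three components concrete, here is a minimal hand-built sketch for ethanol (SMILES "CCO") in plain Python. The integer codes and the variable names are illustrative assumptions and are not DeepChem's actual encoding; the point is only the layout of the three arrays.

```python
# Toy graph for ethanol ("CCO"): atoms 0 and 1 are carbon, atom 2 is oxygen.
# The integer codes below are placeholders, not DeepChem's real encoding.

# One row per atom: [atomic_number, chirality_code]
node_features = [
    [6, 0],  # C, no chirality
    [6, 0],  # C, no chirality
    [8, 0],  # O, no chirality
]

# Undirected bonds as (atom_i, atom_j, bond_type_code, bond_direction_code)
bonds = [
    (0, 1, 0, 0),  # C-C single bond
    (1, 2, 0, 0),  # C-O single bond
]

# Store every bond in both directions, copying its features to each copy
sources, targets, edge_features = [], [], []
for i, j, bond_type, bond_dir in bonds:
    for src, dst in ((i, j), (j, i)):
        sources.append(src)
        targets.append(dst)
        edge_features.append([bond_type, bond_dir])

# Row 0 holds source atoms, row 1 holds target atoms
edge_index = [sources, targets]

print(edge_index)  # [[0, 1, 1, 2], [1, 0, 2, 1]]
print(len(node_features), len(edge_index[0]), len(edge_features))  # 3 4 4
```

Three atoms and two bonds give three node rows and four directed edges, which is the same pattern of shapes that appears in a featurized molecule.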

###Example: Featurizing Aspirin
Let's apply SNAPFeaturizer to aspirin, a common pharmaceutical molecule, written here as the SMILES string CC(=O)OC1=CC=CC=C1C(=O)O.


In [None]:
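# SMILES string for aspirin
aspirin_smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"

# Parse the SMILES string into an RDKit molecule
aspirin_mol = Chem.MolFromSmiles(aspirin_smiles)

# Featurize a list of molecules; the result holds one GraphData object per molecule
featurizer = SNAPFeaturizer()
graph_data = featurizer.featurize([aspirin_mol])

print(graph_data)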

[GraphData(node_features=[13, 2], edge_index=[2, 26], edge_features=[26, 2])]


##Understanding the Output
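* node_features: Array of atomic features, one row per atom (shape: [num_atoms, 2]).
* edge_index: Connectivity array in which each column is a (source, target) pair of atom indices (shape: [2, num_edges]). Each bond appears twice, once per direction, so num_edges is twice the number of bonds.
* edge_features: Bond attributes, one row per column of edge_index (shape: [num_edges, 2]).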


In [None]:
node_features = graph_data[0].node_features
edge_index = graph_data[0].edge_index
edge_features = graph_data[0].edge_features

print(f"Node Features Shape: {node_features.shape}")
print(f"Edge Index Shape: {edge_index.shape}")
print(f"Edge Features Shape: {edge_features.shape}")

Node Features Shape: (13, 2)
Edge Index Shape: (2, 26)
Edge Features Shape: (26, 2)
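
These shapes can be sanity-checked with a little arithmetic on the SMILES string. The helper below is a hypothetical heuristic written for this tutorial, not part of DeepChem or RDKit, and it assumes a simple SMILES: one connected component, single-letter uppercase element symbols, and single-digit ring closures. For anything more complex, a real parser such as RDKit should be used instead.

```python
def count_heavy_atoms_and_bonds(smiles):
    """Heuristic counts for simple SMILES strings only (see assumptions above)."""
    # Every letter is one heavy atom when all element symbols are single letters
    num_atoms = sum(1 for ch in smiles if ch.isalpha())
    # Each ring closure uses its digit twice (once to open, once to close)
    num_ring_closures = sum(1 for ch in smiles if ch.isdigit()) // 2
    # A connected molecule has (atoms - 1) bonds plus one extra per ring closure
    num_bonds = num_atoms - 1 + num_ring_closures
    return num_atoms, num_bonds

aspirin_smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
num_atoms, num_bonds = count_heavy_atoms_and_bonds(aspirin_smiles)

# Directed edges = 2 * bonds, because each bond is stored in both directions
print(num_atoms, num_bonds, 2 * num_bonds)  # 13 13 26
```

The 13 heavy atoms match the 13 rows of node_features, and the 13 bonds, counted once per direction, match the 26 columns of edge_index.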


##Interpreting Graph Representation
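* Node Features (num_atoms, 2): Each atom is represented by two integer codes, one for its atomic number and one for its chirality. Aspirin has 13 rows because it has 13 heavy atoms; hydrogens are implicit in the parsed molecule and are not added as nodes.
* Edge Index (2, num_edges): Each column lists the two atoms joined by a bond. Aspirin has 13 bonds, and each is stored in both directions, giving 26 columns.
* Edge Features (num_edges, 2): Each edge is described by bond type and bond direction. Row k of the edge features describes column k of the edge index.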
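
One way to read an edge index is to turn it into a neighbor list. The sketch below uses a hypothetical helper, neighbors_from_edge_index, and a hand-written three-atom chain rather than real featurizer output. It should also accept a two-row NumPy array such as graph_data[0].edge_index, though that is an assumption not exercised here.

```python
from collections import defaultdict

def neighbors_from_edge_index(edge_index):
    """Map each source atom index to the list of atoms it is bonded to."""
    sources, targets = edge_index  # row 0: source atoms, row 1: target atoms
    neighbors = defaultdict(list)
    for src, dst in zip(sources, targets):
        neighbors[src].append(dst)
    return dict(neighbors)

# Hand-written edge index for a 3-atom chain 0-1-2, each bond in both directions
toy_edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]
toy_neighbors = neighbors_from_edge_index(toy_edge_index)

print(toy_neighbors)  # {0: [1], 1: [0, 2], 2: [1]}
```

Because every bond is stored in both directions, each atom appears as a source for all of its neighbors, so no separate reverse lookup is needed.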



##Conclusion
SNAPFeaturizer converts molecules into graph representations that can be used as input for Graph Neural Networks (GNNs). This featurization is useful in applications like drug discovery and molecular property prediction.
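
As a rough illustration of how a GNN consumes this representation, the sketch below performs one round of sum aggregation over neighbors in plain Python. The function name and the toy inputs are assumptions made for this example, and this is not how any particular DeepChem model is implemented. In practice the integer codes would normally be mapped to learned embeddings before aggregation; raw codes are summed here only to keep the example short.

```python
def sum_neighbor_features(node_features, edge_index):
    """For each atom, sum the feature rows of the atoms that point to it."""
    sources, targets = edge_index
    width = len(node_features[0])
    aggregated = [[0] * width for _ in node_features]
    for src, dst in zip(sources, targets):
        for k in range(width):
            aggregated[dst][k] += node_features[src][k]
    return aggregated

# Toy 3-atom chain with placeholder codes [atomic_number, chirality_code]
toy_nodes = [[6, 0], [6, 0], [8, 0]]
toy_edges = [[0, 1, 1, 2], [1, 0, 2, 1]]

print(sum_neighbor_features(toy_nodes, toy_edges))  # [[6, 0], [14, 0], [6, 0]]
```

The middle atom receives contributions from both of its neighbors, which is the basic step that message-passing layers repeat with learned transformations.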