# Synutils repo walk-through 7th Sep 2023

This walk-through series explores Synple's Python utility module, `synutils`, and the technical details behind its computational services, with a primary focus on cheminformatics functions.

Since we will also cover fundamental programming practices for cheminformatics functions, I would like to invite non-Synple members to join the series as well.

## Git usage
There are tons of great tutorials on the internet; here is one (thanks, Julian!): https://codingforchemists.com/vcs-basics/

For a more practical understanding of commits:
![git_snapshot](../resources/figs/git_snapshot.png)

## Installation
```
% git clone https://github.com/Synple-Chem/synple-utils.git
% cd synple-utils
% make env
% conda activate ./env
```




In [1]:
# import packages
import pandas as pd
import numpy as np
from rdkit.Chem import MolFromSmiles
from synutils.featurizers import AVAILABLE_FEATURIZERS, get_featurizer
from synutils.path import ROOT_PATH

In [None]:
# load sample data
data_path = ROOT_PATH / 'data' / 'data_230907.csv'
df = pd.read_csv(data_path)
df.head()

In [None]:
# featurizers
available_feature_names = list(AVAILABLE_FEATURIZERS.keys())
print(f"There are {len(available_feature_names)} available featurizers: {available_feature_names}")

In [None]:
rdkit_2d_featurizer = get_featurizer('rdkit_2d')
print(f"The rdkit_2d featurizer returns RDKit 2D descriptors, including: {rdkit_2d_featurizer.desc_list}")
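
The lookup-by-name pattern used above can be illustrated with a small registry. This is only a sketch of the general idea, not the actual `synutils` implementation: `TOY_FEATURIZERS`, `get_toy_featurizer`, and both toy classes are hypothetical names made up for this example, and they work on raw SMILES strings rather than RDKit molecules to keep the example free of dependencies.

```python
from typing import Callable, Dict, List


class ToyLengthFeaturizer:
    """Hypothetical featurizer: describes a SMILES string by its length."""

    desc_list = ["n_chars"]

    def get_feat(self, smiles: str) -> List[float]:
        return [float(len(smiles))]


class ToyCarbonFeaturizer:
    """Hypothetical featurizer: counts 'C' and 'c' characters (crude; 'Cl' also matches)."""

    desc_list = ["n_carbon_chars"]

    def get_feat(self, smiles: str) -> List[float]:
        return [float(smiles.count("C") + smiles.count("c"))]


# Hypothetical registry: maps a short name to a featurizer class.
TOY_FEATURIZERS: Dict[str, Callable[[], object]] = {
    "toy_length": ToyLengthFeaturizer,
    "toy_carbon": ToyCarbonFeaturizer,
}


def get_toy_featurizer(name: str):
    """Look up a featurizer by name and return a fresh instance."""
    try:
        return TOY_FEATURIZERS[name]()
    except KeyError:
        raise ValueError(
            f"Unknown featurizer '{name}'. Available: {sorted(TOY_FEATURIZERS)}"
        ) from None


featurizer = get_toy_featurizer("toy_carbon")
carbon_count = featurizer.get_feat("CCO")  # ethanol
```

One advantage of a registry like this is that adding a new featurizer only requires a new dictionary entry, while calling code keeps using the same name-based lookup.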

In [None]:
# add features to dataframe
df["mol"] = df["smiles"].apply(MolFromSmiles)
df["rdkit_2d"] = df["mol"].apply(rdkit_2d_featurizer.get_feat)
df.head()
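
One thing to keep in mind: RDKit's `MolFromSmiles` returns `None` for a SMILES string it cannot parse, and a featurizer that receives `None` will most likely fail. Below is a minimal sketch of separating parsable from unparsable entries before featurizing. `split_parsed` and `toy_parse` are hypothetical helpers written for this example; `toy_parse` is a deliberately crude stand-in parser so the example runs without RDKit.

```python
from typing import Callable, List, Optional, Tuple


def split_parsed(
    smiles_list: List[str],
    parse: Callable[[str], Optional[object]],
) -> Tuple[List[Tuple[str, object]], List[str]]:
    """Parse each SMILES; return (parsed (smiles, mol) pairs, SMILES that failed)."""
    parsed, failed = [], []
    for smi in smiles_list:
        mol = parse(smi)
        if mol is None:
            failed.append(smi)  # keep track of what was dropped
        else:
            parsed.append((smi, mol))
    return parsed, failed


def toy_parse(smi: str) -> Optional[str]:
    """Hypothetical stand-in parser: rejects strings with unbalanced parentheses."""
    return smi if smi.count("(") == smi.count(")") else None


parsed, failed = split_parsed(["CCO", "CC(=O", "c1ccccc1"], toy_parse)
print(f"{len(parsed)} parsed, {len(failed)} failed: {failed}")
```

With RDKit installed, `MolFromSmiles` can be passed as `parse` in place of `toy_parse`. On a dataframe, the equivalent step would be dropping rows where the `mol` column is `None` before calling `get_feat`.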

In [None]:
morgan_featurizer = get_featurizer('morgan')
df["morgan"] = df["mol"].apply(morgan_featurizer.get_feat)
df.head()
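
A column built this way holds one feature vector per row, whereas most modelling code expects a single 2D matrix. The sketch below shows one way to stack such a column, assuming each entry is a fixed-length 1D NumPy array (this is an assumption about what `get_feat` returns, not something confirmed here). `toy_feat` and `toy_df` are hypothetical stand-ins so the example runs without `synutils`.

```python
import numpy as np
import pandas as pd


def toy_feat(smiles: str) -> np.ndarray:
    """Hypothetical featurizer: counts of the characters C, N, O and c."""
    return np.array([smiles.count(ch) for ch in "CNOc"])


# Toy stand-in for the dataframe above: one feature vector per row.
toy_df = pd.DataFrame({"smiles": ["CCO", "CCN", "c1ccccc1"]})
toy_df["feat"] = toy_df["smiles"].apply(toy_feat)

# Stack the per-row vectors into one (n_molecules, n_features) matrix.
X = np.stack(toy_df["feat"].to_list())
print(X.shape)
```

`np.stack` raises an error if the vectors differ in length, which doubles as a quick sanity check on the featurizer output.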