paccmann is a package for drug sensitivity prediction and is the core component of the repo.
The package provides a toolbox of learning models for IC50 prediction using a drug's chemical properties and the gene expression of tissue-specific cell lines.
Setup of the virtual environment
We strongly recommend working inside a virtual environment.
Create the environment:
python3 -m venv venv
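Before installing anything, it can help to confirm that the interpreter is really running inside the environment. On POSIX shells the environment is typically activated with `source venv/bin/activate`. The following is a minimal sketch using only the standard library. The helper name `in_virtualenv` is hypothetical and not part of paccmann.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside an environment created by `python3 -m venv`, sys.prefix points at
    # the environment, while sys.base_prefix points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this reports `False` after activation, the `pip3` on the path may belong to the system interpreter rather than to `venv`.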
The module can be installed either in editable mode:
pip3 install -e .
Or as a normal package:
pip3 install .
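After either install command, a quick sanity check is to see whether the module can be located and whether the training script ended up on the `PATH`. The sketch below assumes the import name is `paccmann` and the script name is `training_paccmann`. Both are inferred from this README rather than verified, and the helper `check_installation` is hypothetical.

```python
import importlib.util
import shutil


def check_installation(module_name: str = "paccmann",
                       script_name: str = "training_paccmann") -> dict:
    """Report whether the module is importable and the script is on PATH."""
    return {
        # find_spec returns None when a top-level module cannot be located.
        "module_importable": importlib.util.find_spec(module_name) is not None,
        # shutil.which returns None when no executable with that name is found.
        "script_on_path": shutil.which(script_name) is not None,
    }


if __name__ == "__main__":
    for check, ok in check_installation().items():
        print(f"{check}: {ok}")
```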
Models can be trained using the script bin/training_paccmann, which is installed together with the module. Check the examples for a quick start.
For more details, see the help of the training command by typing:
training_paccmann -h
usage: training_paccmann [-h] [-save_checkpoints_steps 300]
                         [-eval_throttle_secs 60] [-model_suffix]
                         [-train_steps 10000] [-batch_size 64]
                         [-learning_rate 0.001] [-dropout 0.5]
                         [-buffer_size 20000] [-number_of_threads 1]
                         [-prefetch_buffer_size 6]
                         train_filepath eval_filepath model_path
                         model_specification_fn_name params_filepath
                         feature_names

Run training of a `paccmann` model.

positional arguments:
  train_filepath        Path to train data.
  eval_filepath         Path to eval data.
  model_path            Path where the model is stored.
  model_specification_fn_name
                        Model specification function. Pick one of the
                        following: ['dnn', 'rnn', 'scnn', 'sa', 'ca', 'mca'].
  params_filepath       Path to model params. Dictionary with parameters
                        defining the model.
  feature_names         Comma separated feature names. Select from the
                        following: ['smiles_character_tokens',
                        'smiles_atom_tokens', 'fingerprints_256',
                        'fingerprints_512', 'targets_10', 'targets_20',
                        'targets_50', 'selected_genes_10',
                        'selected_genes_20', 'cnv_min', 'cnv_max', 'disrupt',
                        'zigosity', 'ic50', 'ic50_labels'].

optional arguments:
  -h, --help            show this help message and exit
  -save_checkpoints_steps 300, --save-checkpoints-steps 300
                        Steps before saving a checkpoint.
  -eval_throttle_secs 60, --eval-throttle-secs 60
                        Throttle seconds between evaluations.
  -model_suffix , --model-suffix 
                        Suffix for the trained moedel.
  -train_steps 10000, --train-steps 10000
                        Number of training steps.
  -batch_size 64, --batch-size 64
                        Batch size.
  -learning_rate 0.001, --learning-rate 0.001
                        Learning rate.
  -dropout 0.5, --dropout 0.5
                        Dropout to be applied to set and dense layers.
  -buffer_size 20000, --buffer-size 20000
                        Buffer size for data shuffling.
  -number_of_threads 1, --number-of-threads 1
                        Number of threads to be used in data processing.
  -prefetch_buffer_size 6, --prefetch-buffer-size 6
                        Prefetch buffer size to allow pipelining.
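To make the argument order concrete, the sketch below assembles a `training_paccmann` command line from Python and checks the model name and feature names against the choices listed in the help text above. The helper `build_training_command`, every path, and the particular feature selection are hypothetical placeholders. Only the flags and choices come from the help output, and the sketch prints the command rather than running it.

```python
import shlex

# Choices copied from the help text above.
MODEL_FNS = ["dnn", "rnn", "scnn", "sa", "ca", "mca"]
FEATURE_NAMES = [
    "smiles_character_tokens", "smiles_atom_tokens", "fingerprints_256",
    "fingerprints_512", "targets_10", "targets_20", "targets_50",
    "selected_genes_10", "selected_genes_20", "cnv_min", "cnv_max",
    "disrupt", "zigosity", "ic50", "ic50_labels",
]


def build_training_command(train_filepath, eval_filepath, model_path,
                           model_fn, params_filepath, features,
                           train_steps=10000, batch_size=64,
                           learning_rate=0.001, dropout=0.5):
    """Assemble the argument list for training_paccmann, checking choices first."""
    if model_fn not in MODEL_FNS:
        raise ValueError(f"unknown model specification function: {model_fn!r}")
    unknown = [name for name in features if name not in FEATURE_NAMES]
    if unknown:
        raise ValueError(f"unknown feature names: {unknown}")
    return [
        "training_paccmann",
        "--train-steps", str(train_steps),
        "--batch-size", str(batch_size),
        "--learning-rate", str(learning_rate),
        "--dropout", str(dropout),
        # Positional arguments follow in the order shown in the usage line.
        train_filepath, eval_filepath, model_path, model_fn, params_filepath,
        ",".join(features),  # feature names are passed comma separated
    ]


if __name__ == "__main__":
    # Hypothetical placeholder paths and an illustrative feature selection.
    command = build_training_command(
        "path/to/train_data", "path/to/eval_data", "path/to/model_dir",
        "mca", "path/to/params",
        ["smiles_character_tokens", "selected_genes_20", "ic50"],
    )
    print(shlex.join(command))
```

Validating the choices up front gives an immediate error for a misspelled model or feature name, without waiting for the training script to start.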