Skip to content

libokj/CDPN-VS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CDPN-VS

DeepSEQreen is a PyTorch-based CPI prediction toolkit with a unified data pipeline and a modular framework. CDPN-VS is a barebone version of DeepSEQreen for demonstrative purposes.

Installation

Install PyTorch according to instructions: https://pytorch.org/get-started/locally/.

Install other requirements:

pip install -r requirements.txt

Train model with a preset model configuration from configs/preset/ and a sweep (experiment) configuration from configs/sweep/:

# Using preset model `graph_dta_gin` for GraphDTA-GIN and experiment `demo` for pre-split demo dataset.
# Train on GPU
python -m deepseqreen.train preset=graph_dta_gin sweep=demo trainer=gpu tags=[demo,graph_dta_gin]

Basic usages

Resume training from checkpoint

python -m deepseqreen.train preset=graph_dta_gin sweep=demo ckpt_path=/path/to/ckpt/name.ckpt tags=[demo,graph_dta_gin]

Note: Checkpoint can be either path or URL.

Test checkpoint on an external test dataset

python -m deepseqreen.test preset=graph_dta_gin data.data_file=/path/to/test.csv ckpt_path=/path/to/ckpt_name.ckpt tags=[demo,graph_dta_gin]

Data files

DeepSEQreen requires input data files in CSV format similar to those from TDC. The CSV file should contain the following columns:

  • ID1: The identifier for the compounds (optional).
  • X1: The SMILES for the compounds.
  • ID2: The identifier for the targets (optional).
  • X2: The sequence for the targets.
  • Y: The label for the interaction (e.g., binary interaction, binding affinity, etc.).
ID1,X1,ID2,X2,Y
DB06075,C[C@]1(O)C[C@@H](c2nc(-c3ccc4ccc(-c5ccccc5)nc4c3)c3c(N)nccn32)C1,Q9UPZ9,MNRYTTIRQLGDGTYGSVLLGRSIESGELIAIKKMKRKFYSWEECMNLREVKSLKKLNHANVVKLKEVIRENDHLYFIFEYMKENLYQLIKERNKLFPESAIRNIMYQILQGLAFIHKHGFFHRDLKPENLLCMGPELVKIADFGLAREIRSKPPYTDYVSTRWYRAPEVLLRSTNYSSPIDVWAVGCIMAEVYTLRPLFPGASEIDTIFKICQVLGTPKKTDWPEGYQLSSAMNFRWPQCVPNNLKTLIPNASSEAVQLLRDMLQWDPKKRPTASQALRYPYFQVGHPLGSTTQNLQDSEKPQKGILEKAGPPPYIKPVPPAQPPAKPHTRISSRQHQASQPPLHLTYPYKAEVSRTDHPSHLQEDKPSPLLFPSLHNKHPQSKITAGLEHKNGEIKPKSRRRWGLISRSTKDSDDWADLDDLDFSPSLSRIDLKNKKRQSDDTLCRFESVLDLKPSEPVGTGNSAPTQTSYQRRDTPTLRSAAKQHYLKHSRYLPGISIRNGILSNPGKEFIPPNPWSSSGLSGKSSGTMSVISKVNSVGSSSTSSSGLTGNYVPSFLKKEIGSAMQRVHLAPIPDPSPGYSSLKAMRPHPGRPFFHTQPRSTPGLIPRPPAAQPVHGRTDWASKYASRR,0
DB06075,C[C@]1(O)C[C@@H](c2nc(-c3ccc4ccc(-c5ccccc5)nc4c3)c3c(N)nccn32)C1,Q9Y2K2,MAAAAASGAGGAAGAGTGGAGPAGRLLPPPAPGSPAAPAAVSPAAGQPRPPAPASRGPMPARIGYYEIDRTIGKGNFAVVKRATHLVTKAKVAIKIIDKTQLDEENLKKIFREVQIMKMLCHPHIIRLYQVMETERMIYLVTEYASGGEIFDHLVAHGRMAEKEARRKFKQIVTAVYFCHCRNIVHRDLKAENLLLDANLNIKIADFGFSNLFTPGQLLKTWCGSPPYAAPELFEGKEYDGPKVDIWSLGVVLYVLVCGALPFDGSTLQNLRARVLSGKFRIPFFMSTECEHLIRHMLVLDPNKRLSMEQICKHKWMKLGDADPNFDRLIAECQQLKEERQVDPLNEDVLLAMEDMGLDKEQTLQSLRSDAYDHYSAIYSLLCDRHKRHKTLRLGALPSMPRALAFQAPVNIQAEQAGTAMNISVPQVQLINPENQIVEPDGTLNLDSDEGEEPSPEALVRYLSMRRHTVGVADPRTEVMEDLQKLLPGFPGVNPQAPFLQVAPNVNFMHNLLPMQNLQPTGQLEYKEQSLLQPPTLQLLNGMGPLGRRASDGGANIQLHAQQLLKRPRGPSPLVTMTPAVPAVTPVDEESSDGEPDQEAVQSSTYKDSNTLHLPTERFSPVRRFSDGAASIQAFKAHLEKMGNNSSIKQLQQECEQLQKMYGGQIDERTLEKTQQQHMLYQQEQHHQILQQQIQDSICPPQPSPPLQAACENQPALLTHQLQRLRIQPSSPPPNHPNNHLFRQPSNSPPPMSSAMIQPHGAASSSQFQGLPSRSAIFQQQPENCSSPPNVALTCLGMQQPAQSQQVTIQVQEPVDMLSNMPGTAAGSSGRGISISPSAGQMQMQHRTNLMATLSYGHRPLSKQLSADSAEAHSLNVNRFSPANYDQAHLHPHLFSDQSRGSPSSYSPSTGVGFSPTQALKVPPLDQFPTFPPSAHQQPPHYTTSALQQALLSPTPPDYTRHQQVPHILQGLLSPRHSLTGHSDIRLPPTEFAQLIKRQQQQRQQQQQQQQQQEYQELFRHMNQGDAGSLAPSLGGQSMTERQALSYQNADSYHHHTSPQHLLQIRAQECVSQASSPTPPHGYAHQPALMHSESMEEDCSCEGAKDGFQDSKSSSTLTKGCHDSPLLLSTGGPGDPESLLGTVSHAQELGIHPYGHQPTAAFSKNKVPSREPVIGNCMDRSSPGQAVELPDHNGLGYPARPSVHEHHRPRALQRHHTIQNSDDAYVQLDNLPGMSLVAGKALSSARMSDAVLSQSSLMGSQQFQDGENEECGASLGGHEHPDLSDGSQHLNSSCYPSTCITDILLSYKHPEVSFSMEQAGV,0
DB06075,C[C@]1(O)C[C@@H](c2nc(-c3ccc4ccc(-c5ccccc5)nc4c3)c3c(N)nccn32)C1,Q76MJ5,MASAVRGSRPWPRLGLQLQFAALLLGTLSPQVHTLRPENLLLVSTLDGSLHALSKQTGDLKWTLRDDPVIEGPMYVTEMAFLSDPADGSLYILGTQKQQGLMKLPFTIPELVHASPCRSSDGVFYTGRKQDAWFVVDPESGETQMTLTTEGPSTPRLYIGRTQYTVTMHDPRAPALRWNTTYRRYSAPPMDGSPGKYMSHLASCGMGLLLTVDPGSGTVLWTQDLGVPVMGVYTWHQDGLRQLPHLTLARDTLHFLALRWGHIRLPASGPRDTATLFSTLDTQLLMTLYVGKDETGFYVSKALVHTGVALVPRGLTLAPADGPTTDEVTLQVSGEREGSPSTAVRYPSGSVALPSQWLLIGHHELPPVLHTTMLRVHPTLGSGTAETRPPENTQAPAFFLELLSLSREKLWDSELHPEEKTPDSYLGLGPQDLLAASLTAVLLGGWILFVMRQQQPQVVEKQQETPLAPADFAHISQDAQSLHSGASRRSQKRLQSPSKQAQPLDDPEAEQLTVVGKISFNPKDVLGRGAGGTFVFRGQFEGRAVAVKRLLRECFGLVRREVQLLQESDRHPNVLRYFCTERGPQFHYIALELCRASLQEYVENPDLDRGGLEPEVVLQQLMSGLAHLHSLHIVHRDLKPGNILITGPDSQGLGRVVLSDFGLCKKLPAGRCSFSLHSGIPGTEGWMAPELLQLLPPDSPTSAVDIFSAGCVFYYVLSGGSHPFGDSLYRQANILTGAPCLAHLEEEVHDKVVARDLVGAMLSPLPQPRPSAPQVLAHPFFWSRAKQLQFFQDVSDWLEKESEQEPLVRALEAGGCAVVRDNWHEHISMPLQTDLRKFRSYKGTSVRDLLRAVRNKKHHYRELPVEVRQALGQVPDGFVQYFTNRFPRLLLHTHRAMRSCASESLFLPYYPPDSEARRPCPGATGR,0

About

DeepSEQreen is a PyTorch-based CPI prediction toolkit with a unified data pipeline and a modular framework. CDPN-VS is a barebone version of DeepSEQreen for demonstrative purposes.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages