Data split confusion #1

Hello, great work. I have some confusion about the train/test data split. do_kfold.py and do_stratified_kfold.py use sklearn's KFold to split the data randomly, and they produce only a train set and a test set. For the S645 dataset, the paper says: "It is crucial to consider these non-binders as outliers during model development and evaluation to ensure model accuracy and robustness." Does handling these outliers mean directly deleting the entries with ddG == 8 from S645? Finally, could you please provide more detailed documentation on training and inference? Thanks.
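To make the question concrete, here is a minimal sketch of the interpretation I am asking about: drop every entry whose ddG equals 8, then do a shuffled k-fold split on what remains. This is not the repository's code. The record layout, the `ddG` key, the helper names `drop_non_binders` and `random_kfold`, and the toy values are all assumptions for illustration, and the split is a plain standard-library stand-in for sklearn's KFold with shuffling, not KFold itself.

```python
import random

# Assumption: non-binders are marked with the sentinel value ddG == 8.
NON_BINDER_DDG = 8.0


def drop_non_binders(records, sentinel=NON_BINDER_DDG):
    """Keep only the records whose ddG differs from the sentinel value."""
    return [r for r in records if r["ddG"] != sentinel]


def random_kfold(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs from a shuffled k-fold split.

    Stand-in for a shuffled k-fold: every index lands in exactly one
    test fold, and fold sizes differ by at most one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Toy records only; these are NOT values from S645.
records = [
    {"id": "m1", "ddG": 0.5},
    {"id": "m2", "ddG": 8.0},   # treated as a non-binder
    {"id": "m3", "ddG": -1.2},
    {"id": "m4", "ddG": 2.3},
    {"id": "m5", "ddG": 8.0},   # treated as a non-binder
    {"id": "m6", "ddG": 0.0},
    {"id": "m7", "ddG": 1.7},
    {"id": "m8", "ddG": -0.4},
]

kept = drop_non_binders(records)
for train_idx, test_idx in random_kfold(len(kept), n_splits=3, seed=0):
    print(len(train_idx), "train /", len(test_idx), "test")
```

If the intended handling is different (for example, keeping the non-binders in training but excluding them from evaluation), the filter step above would be the part that changes.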






