As far as I can tell, the splitters will only do train/test splits. It would be really useful to allow for a third validation split.
Hi @kkovary,
Thank you for your question!
As with scikit-learn, you can get a validation set by splitting the train set a second time. For example:
```python
import datamol as dm
from splito import ScaffoldSplit

# Load some data
data = dm.data.chembl_drugs()
all_smiles = data["smiles"].tolist()

# Generate the trainval-test split
splitter = ScaffoldSplit(smiles=all_smiles, test_size=0.2)
trainval_idx, test_idx = next(splitter.split(X=all_smiles))

# Generate the train-val split
# (all_smiles is a plain list, so select the trainval items one by one)
trainval_smiles = [all_smiles[i] for i in trainval_idx]
splitter = ScaffoldSplit(smiles=trainval_smiles, test_size=0.2)
# Note: train_idx and val_idx index into trainval_smiles, not all_smiles
train_idx, val_idx = next(splitter.split(X=trainval_smiles))
```
I do agree that with our current setup, this is a bit verbose. Having something like #8 would probably help to make this easier!
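In the meantime, the two-step pattern can be wrapped in a small helper. The sketch below is not part of splito: `train_val_test_split`, `SplitFn`, and `random_split` are hypothetical names, and `random_split` is only a stand-in splitter so the example runs without extra dependencies. It handles two details that are easy to miss when splitting twice: the inner indices are mapped back to positions in the original list, and the validation fraction is rescaled so that `val_size` refers to the full dataset rather than to the trainval subset.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical splitter signature: (items, test_fraction) -> (train_idx, test_idx)
SplitFn = Callable[[Sequence[str], float], Tuple[List[int], List[int]]]


def random_split(items: Sequence[str], test_size: float, seed: int = 0):
    """Stand-in splitter for illustration only; swap in a real one."""
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(items) * test_size)
    return idx[n_test:], idx[:n_test]


def train_val_test_split(
    items: Sequence[str],
    split_fn: SplitFn,
    val_size: float = 0.1,
    test_size: float = 0.2,
):
    """Split twice and return indices that all refer to `items`."""
    # First split: hold out the test set
    trainval_idx, test_idx = split_fn(items, test_size)
    trainval_items = [items[i] for i in trainval_idx]

    # Rescale so val_size is a fraction of the full dataset
    inner_val_size = val_size / (1.0 - test_size)
    inner_train, inner_val = split_fn(trainval_items, inner_val_size)

    # Map the inner indices back to positions in the original list
    train_idx = [trainval_idx[i] for i in inner_train]
    val_idx = [trainval_idx[i] for i in inner_val]
    return train_idx, val_idx, list(test_idx)


items = [f"mol_{i}" for i in range(100)]
train_idx, val_idx, test_idx = train_val_test_split(items, random_split)
```

To use this with a real splitter, `split_fn` could be a small wrapper that builds the splitter on the items it receives and returns the first pair of index arrays it yields.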
Let me know if that helps!
Thanks so much!