Shuffling+clustering round (best params) using classification #134
Comments
Best configuration experiments
Comparisons
More in detail - GNNs
Legend for experiment names
Shuffled data
Clustering on peptides
Clustering on (stratified) alleles
Clustering on alleles
Why is it performing so badly?
Testing set: 1218 samples (1%), containing cluster C only (clusters A, B and E are in the training+validation set only).
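The hold-out described above (one cluster reserved for testing, the others used only for training+validation) can be sketched as follows. This is a minimal illustration and not the project's actual split code: the `cluster` and `label` field names, both helper functions, and the toy samples are hypothetical. It also reports the fraction of 1s in each partition, which is a quick way to spot a class-balance mismatch between training and test data.

```python
def leave_cluster_out(samples, test_clusters):
    """Split samples so that the test clusters never appear in training."""
    train = [s for s in samples if s["cluster"] not in test_clusters]
    test = [s for s in samples if s["cluster"] in test_clusters]
    return train, test


def positive_fraction(samples):
    """Fraction of samples whose label is 1."""
    if not samples:
        return float("nan")
    return sum(s["label"] for s in samples) / len(samples)


# Hypothetical toy data: 8 samples in clusters A, B, E and 2 in cluster C.
samples = [
    {"id": "s1", "cluster": "A", "label": 0},
    {"id": "s2", "cluster": "A", "label": 0},
    {"id": "s3", "cluster": "A", "label": 1},
    {"id": "s4", "cluster": "B", "label": 0},
    {"id": "s5", "cluster": "B", "label": 0},
    {"id": "s6", "cluster": "B", "label": 1},
    {"id": "s7", "cluster": "E", "label": 0},
    {"id": "s8", "cluster": "E", "label": 0},
    {"id": "s9", "cluster": "C", "label": 1},
    {"id": "s10", "cluster": "C", "label": 0},
]

# Hold out cluster C for testing; everything else goes to training+validation.
train, test = leave_cluster_out(samples, test_clusters={"C"})

print(f"train: {len(train)} samples, fraction of 1s = {positive_fraction(train):.2f}")
print(f"test:  {len(test)} samples, fraction of 1s = {positive_fraction(test):.2f}")
```

Running the same check on the real split would show directly how far the test-set label distribution sits from the training one.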
It's evident that the network is not learning the physics behind the binding of pMHC complexes. In addition, the test set is too small to be considered a fair evaluation of the network, and its proportion of 1s and 0s is very different from that of the training set (56/44 vs 29/71).
Conclusions
After determining the influence of standardization (#124), weighted loss function (#126), batch size (#127), and batch normalization (#131), experiment with clusters, using the settings that gave the best results on shuffled data.
Common features among the following experiments:
- PMHCI_Network01 (in src/4_train_models/GNN/I/classification/struct/pmhc_gnn.py)
- pssm feature removed
Experiments: