Hi! Would it be of interest to have replicability (i.e., are the discovered components replicable?) as another metric for evaluating CP models?
The replicability check would boil down to the following steps:
- Split the data along a (user-chosen) mode into $N$ folds (also user-chosen)
- Create $N$ train (sub)sets by subtracting each fold from the complete dataset
- Fit multiple initializations to each train (sub)set and choose the best run according to the lowest loss ($N$ total best runs)
- Repeat the above process $M$ times (user-chosen), yielding a total of $M \times N$ best runs
- Compare the factorizations in terms of FMS (skipping the mode the splitting was applied to) to evaluate the replicability of the uncovered patterns
If a chosen percentile of the resulting set of FMS scores exceeds a given threshold, the model passes the check, since it consistently finds the same patterns. What do you think?
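The comparison step above could be sketched roughly as follows. This is only a minimal illustration, not a proposed implementation: the function names (`factor_match_score`, `passes_replicability`) are hypothetical, component matching is done by brute-force permutation (feasible only for small ranks), and the FMS formula follows the usual definition as the mean over matched components of the product of absolute cosine similarities across modes.

```python
# Hypothetical sketch of the FMS comparison and the percentile/threshold
# pass criterion described above. Brute-force component matching; assumes
# a small number of components.
import itertools
import numpy as np

def factor_match_score(factors_a, factors_b, skip_mode=None):
    """FMS between two CP factorizations, each given as a list of
    factor matrices (one I_k x R matrix per mode). The mode used for
    splitting can be skipped via skip_mode."""
    modes = [k for k in range(len(factors_a)) if k != skip_mode]
    rank = factors_a[0].shape[1]
    # Normalize columns so only the pattern (direction) matters.
    norm = lambda m: m / np.linalg.norm(m, axis=0, keepdims=True)
    a = [norm(factors_a[k]) for k in modes]
    b = [norm(factors_b[k]) for k in modes]
    best = 0.0
    # Match components by trying all permutations and keeping the best.
    for perm in itertools.permutations(range(rank)):
        score = np.mean([
            np.prod([abs(a[k][:, r] @ b[k][:, p]) for k in range(len(modes))])
            for r, p in enumerate(perm)
        ])
        best = max(best, score)
    return best

def passes_replicability(all_factors, skip_mode, percentile=10, threshold=0.9):
    """Pairwise FMS over the M x N best runs; the model passes if the
    chosen lower percentile of the scores exceeds the threshold."""
    scores = [factor_match_score(fa, fb, skip_mode=skip_mode)
              for fa, fb in itertools.combinations(all_factors, 2)]
    return np.percentile(scores, percentile) >= threshold
```

For example, two factorizations that recover the same components up to a permutation would score an FMS of 1 regardless of component order, which is why the matching step is needed before comparing runs.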
References:
[1] Adali T, Kantar F, Akhonda MABS, Strother S, Calhoun VD, Acar E. Reproducibility in Matrix and Tensor Decompositions: Focus on Model Match, Interpretability, and Uniqueness. IEEE Signal Process Mag. 2022 Jul;39(4):8-24. doi: 10.1109/msp.2022.3163870. Epub 2022 Jun 28. PMID: 36337436; PMCID: PMC9635492.
[2] Yan S, Li L, Horner D, Ebrahimi P, Chawes B, Dragsted LO, Rasmussen MA, Smilde AK, Acar E. Characterizing human postprandial metabolic response using multiway data analysis. Metabolomics. 2024 May 9;20(3):50. doi: 10.1007/s11306-024-02109-y. PMID: 38722393; PMCID: PMC11082008.