Hi SuStaIn team!
I am trying to use SuStaIn with a train/test approach, in which I have two datasets:

- the training one, which I want to use to infer the summary subtypes (that is to say: subtype 1 is this sequence of abnormalities, subtype 2 is this other sequence, etc.); I do not really worry about the actual subtyping of the individual subjects in this dataset. This is where I would run the `run_sustain_algorithm` method, if I'm correct.
- the test one, which I would like to use as follows: given the subtypes discovered on the training set, I want to subtype (and maybe stage, even though I'm less interested in this) these new subjects with respect to those subtypes. Intuitively, going back to the notation of the Young et al. (2018) paper, if (f_c, S_c)_c are the subtypes (and their prevalences) discovered at the training step, then given some new data X (the test data), I would want to evaluate P(X | S_c) for each c and infer the best subtype for each of my new subjects from this mixture of subtypes.
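Concretely, here is a tiny NumPy sketch of the assignment rule I have in mind (all names are my own placeholders, not pySuStaIn API; the likelihoods are random stand-ins for the actual P(X_i | S_c) evaluations):

```python
import numpy as np

# Placeholder setup: suppose we have already evaluated, for each test
# subject i and each training-derived subtype c, the data likelihood
# P(X_i | S_c). Here these are just random numbers for illustration.
rng = np.random.default_rng(0)
n_subjects, n_subtypes = 5, 3
lik = rng.random((n_subjects, n_subtypes))   # stands in for P(X_i | S_c)
f = np.array([0.5, 0.3, 0.2])                # subtype fractions f_c from training

# Posterior over subtypes: P(c | X_i) proportional to f_c * P(X_i | S_c)
post = f * lik
post /= post.sum(axis=1, keepdims=True)      # normalise each subject's row

best_subtype = post.argmax(axis=1)           # hard assignment per test subject
print(best_subtype)
```

So essentially I am asking whether pySuStaIn already exposes something equivalent to this, taking the fitted model output and new data.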
So it seems to me that this makes sense from a methodological point of view (but I could be mistaken 😅).
Now I can't quite find how I would perform this last step, given the output from the first step. I went back to the notebook from the workshop (which I had followed some time ago), and it looks to me that the presented `cross_validate_sustain_model` mainly focuses on cross-validation metrics, rather than outputting the subtypes assigned to the "test" subjects.
I am sorry if this is covered somewhere that I have missed, and don't hesitate to say if the question is somewhat unclear; I'm happy to rephrase or go into more detail 🙂
Cheers,
Nemo