I realized that some of the filtering we are doing (`emptyDrops` and `miQC`, notably) relies on statistical models that are, or may be, fit with some random component. Should we mitigate any run-to-run variation by adding fixed random seeds to the relevant scripts?
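For concreteness, a minimal sketch of what that might look like, assuming a `SingleCellExperiment` named `sce` holding raw droplet counts and a vector `mito_genes` of mitochondrial gene IDs (both hypothetical names, not from an actual script here). `emptyDrops` computes Monte Carlo p-values and `miQC` fits a flexmix mixture model, so a seed is fixed immediately before each stochastic step:

```r
library(DropletUtils)
library(scater)
library(miQC)

# emptyDrops() computes Monte Carlo p-values, so its output varies by seed.
set.seed(1234)
drop_stats <- emptyDrops(counts(sce))
sce <- sce[, which(drop_stats$FDR <= 0.01)]

# miQC fits a flexmix mixture model, whose initialization is also random.
sce <- addPerCellQC(sce, subsets = list(mito = mito_genes))
set.seed(1234)
model <- mixtureModel(sce)
sce <- filterCells(sce, model)
```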
If this were outside the context of a project like this (e.g., if I were working with a single dataset), I might run things multiple times with different seeds to make sure there was some semblance of stability in my results, and then set a seed for reproducibility (see the sketch after these questions). Taking that approach doesn't really make sense here. I don't have an answer so much as a series of questions/prompts:
- Does setting a seed improve our ability to debug things? I'll extend this to include updating software versions without wondering, "Hm, maybe it was just the seed?"
- If we set a seed, do we need to communicate to our users that we haven't done what I describe above (i.e., that what we give them may not be representative of multiple runs)?
- And perhaps most importantly: what degree of variability are we expecting? Depending on the answer, would it be better to write tests instead?
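The "run it multiple times" idea could look something like this hypothetical stability check, assuming a raw counts matrix `m`: rerun `emptyDrops` under several seeds and compare how many droplets pass the FDR threshold, to gauge how much the Monte Carlo p-values actually move between runs.

```r
library(DropletUtils)

# Count droplets passing the FDR threshold under each of several seeds.
n_kept <- vapply(c(42, 123, 2023, 31415), function(seed) {
  set.seed(seed)
  res <- emptyDrops(m)
  sum(res$FDR <= 0.01, na.rm = TRUE)
}, integer(1))

n_kept  # near-identical counts would suggest the seed has little practical effect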