Discussion: add random seeds to scripts #45

Closed
jashapiro opened this issue Oct 1, 2021 · 2 comments

Comments

@jashapiro
Member

I realized that some of the filtering steps we use (notably emptyDrops and miQC) fit, or may fit, statistical models with random components. Should we eliminate the resulting run-to-run variation by setting fixed random seeds in the relevant scripts?
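To make the concern concrete, here is a minimal sketch in Python. The scripts under discussion are R, where `set.seed()` plays the same role. The function `monte_carlo_p_value` is a hypothetical toy stand-in for a step with a random component; it is not how emptyDrops or miQC are implemented.

```python
import random


def monte_carlo_p_value(observed, n_iter=1000, seed=None):
    """Toy stochastic step (hypothetical; not emptyDrops or miQC).

    Estimates an upper-tail p-value for `observed` against a standard
    normal null by simulation.
    """
    # A local generator keeps the seed from leaking into other code.
    # With seed=None, Python seeds the generator from system entropy.
    rng = random.Random(seed)
    # Count simulated null draws at least as extreme as the observed value.
    hits = sum(rng.gauss(0.0, 1.0) >= observed for _ in range(n_iter))
    return (hits + 1) / (n_iter + 1)


# Same seed: the two estimates are identical.
fixed_a = monte_carlo_p_value(1.5, seed=2021)
fixed_b = monte_carlo_p_value(1.5, seed=2021)

# No seed: the two estimates will usually differ slightly between calls.
unseeded_a = monte_carlo_p_value(1.5)
unseeded_b = monte_carlo_p_value(1.5)
```

With a fixed seed, repeated runs give the same estimate. Without one, any downstream threshold on the estimate can flip for borderline cases between runs.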

@jaclyn-taroni
Member

jaclyn-taroni commented Oct 1, 2021

If this were outside the context of a project like this (e.g., I was working with a single dataset), I might run things multiple times with different seeds to make sure there was some semblance of stability in my results and then set a seed for reproducibility. Taking that approach doesn't really make sense here. I don't have an answer so much as a series of questions/prompts:

  • Does setting a seed improve our ability to debug things? I'd extend that to include updating software versions without wondering, "Hm, maybe it was just the seed?"
  • If we set a seed, do we need to communicate to our users that we haven't done what I describe above (i.e., what we give to them may not be representative of multiple runs)?
  • And perhaps most importantly: what degree of variability are we expecting? Depending on the answer, is it better to write tests instead?
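The multi-seed stability check and the "write tests instead" idea can both be sketched in a few lines. This is a hedged illustration in Python, not the project's R code: `toy_filter_fraction`, the 0.8 keep probability, and the `TOLERANCE` value are all assumptions made up for the example.

```python
import random
import statistics


def toy_filter_fraction(seed, n_cells=2000):
    """Hypothetical filtering step with a random component.

    Returns the fraction of simulated cells that pass the filter.
    """
    rng = random.Random(seed)
    # Each simulated cell is kept with probability 0.8 (assumed value).
    kept = sum(rng.random() < 0.8 for _ in range(n_cells))
    return kept / n_cells


def seed_spread(seeds):
    """Run the step once per seed; return the mean and the spread."""
    results = [toy_filter_fraction(s) for s in seeds]
    return statistics.mean(results), statistics.pstdev(results)


# Stability check: how much does the result move across 20 seeds?
mean_kept, sd_kept = seed_spread(range(20))

# Tolerance-style test: accept the step if the spread is small enough.
TOLERANCE = 0.05  # assumed acceptable run-to-run spread
stable = sd_kept < TOLERANCE
```

A test of this shape answers the variability question directly. If the spread is well inside the tolerance, a fixed seed is mostly a debugging convenience; if it is not, a single seeded run may not be representative.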

@jashapiro
Member Author

closed by #50


2 participants