Add blinding option #40

Open
wtraylor opened this issue May 15, 2021 · 0 comments
Labels
enhancement (New feature or request), science (Biological model assumptions are questionable)

Comments

@wtraylor
Owner

wtraylor commented May 15, 2021

Blinding simulations is helpful for registration-driven modeling: you can prepare all your analysis scripts with the blinded simulation output, then preregister, and switch to the real output only once your methods are fixed. That guards against hindsight bias and encourages you to actually document and archive every simulation and analysis.

The best way would be to let MMM generate blinded output alongside real output; both are structurally similar and only differ in their semantics.
For the final analysis, one just needs to switch from one to the other.

Consider obfuscating both the average and the temporal pattern of each aggregation unit:

  • Multiply all data by one random constant to obfuscate the average.
  • Multiply each datum by its own random factor to obfuscate the temporal pattern. The factors could form a continuous curve R = [r₁, r₂, r₃, …] in which each factor is the previous one times a random number r, i.e. rᵢ₊₁ = rᵢ * r.
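
The two obfuscation steps above could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the helper name `blind_series`, its arguments, the ranges of the random numbers, and the choice to draw a fresh `r` at every time step are all hypothetical and not part of MMM.

```r
# Hypothetical helper: blind one time series of an aggregation unit.
# Name, arguments, and random ranges are assumptions for illustration.
blind_series <- function(x, scale_range = c(0.5, 2), step_sd = 0.1) {
  n <- length(x)
  # One random constant obfuscates the average.
  k <- runif(1, min = scale_range[1], max = scale_range[2])
  # Multiplicative random walk: R[1] = r1, R[i + 1] = R[i] * r, with a
  # fresh positive r at each step, so neighbouring factors stay close.
  r1 <- exp(rnorm(1, mean = 0, sd = step_sd))
  steps <- exp(rnorm(max(n - 1, 0), mean = 0, sd = step_sd))
  R <- cumprod(c(r1, steps))[seq_len(n)]
  x * k * R
}

set.seed(42)  # only to make the example reproducible
real <- c(10, 12, 11, 13, 12)
blinded <- blind_series(real)
```

Because every factor is positive, the blinded series has the same length and signs as the real one and zeros stay zero, so analysis scripts written against it should run unchanged on the real output.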

The new TOML option would be a boolean: output.text_tables.blinding = true|false

I think it would make sense to create separate blinded files alongside the real files, e.g. mass_density_per_hft.blinded.tsv.
demo_results.Rmd should then show an example of how to read it, something like:

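library(tools)  # for file_path_sans_ext()

# `filename` is the path to the real output table, e.g. "mass_density_per_hft.tsv".
# Prefer the blinded file if it exists next to the real one.
blinded_filename <- paste(file_path_sans_ext(filename), "blinded", "tsv", sep = ".")
if (file.exists(blinded_filename))
  filename <- blinded_filename
if (!file.exists(filename))
  stop("File does not exist: ", filename)
tbl <- read.delim(filename)
# Remember whether the table holds blinded data.
attr(tbl, "blinded") <- (filename == blinded_filename)

# ...and in the plot:
ggplot2::labs(caption = if (attr(tbl, "blinded")) "These data are blinded!" else "")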

When the user is ready, they can just delete all blinded files, so that the analysis falls back to the real output:

find . -name '*.blinded.tsv' -delete

wtraylor added the enhancement and science labels May 15, 2021
wtraylor added this to the 1.1.0 milestone May 15, 2021
wtraylor removed this from the 1.1.0 milestone Jun 15, 2021