Name change request: mice #64

Closed
stefvanbuuren opened this issue Jul 9, 2019 · 3 comments · Fixed by #70


stefvanbuuren commented Jul 9, 2019

Dear Elton,

Thanks for your effort to implement an algorithm for imputing multivariate data.

I’d like to request a name change of your impyute.imputation.cs.mice procedure. The documentation of this procedure says that it implements Multivariate Imputation by Chained Equations (MICE) from my JSS 2011 paper. However, this documentation is not accurate since your procedure does not implement the MICE algorithm. It differs in important respects from my method:

  • Your procedure provides a single imputation, whereas MICE is a procedure for generating multiple imputations;
  • Your procedure imputes the “best” (predicted) value, while the MICE algorithm always adds noise;
  • Your procedure uses linear regression, whereas the MICE algorithm is open to any type of imputation model;
  • Your procedure uses different convergence criteria.

These differences have profound methodological implications. Advertising your procedure as “MICE” will create confusion among analysts, who might be led to believe that they are doing MICE when in fact they are not.
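For readers of this thread, the first two differences can be illustrated with a minimal sketch. This is not code from impyute or from the mice package: `regression_impute` is a hypothetical helper, and it assumes a single, fully observed predictor `x`. It only shows the effect of adding residual noise and of producing several imputed datasets; it is not itself an implementation of MICE.

```python
import numpy as np


def regression_impute(x, y, rng=None):
    """Fill missing y values from a simple linear regression of y on x.

    rng=None  -> deterministic: each gap gets the predicted ("best") value.
    rng given -> stochastic: residual noise is added to each prediction.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)

    # Fit y = intercept + slope * x on the observed pairs only.
    slope, intercept = np.polyfit(x[observed], y[observed], 1)
    predicted = intercept + slope * x

    filled = y.copy()
    filled[~observed] = predicted[~observed]

    if rng is not None:
        # Residual standard deviation (two fitted parameters, hence ddof=2).
        residuals = y[observed] - predicted[observed]
        sigma = residuals.std(ddof=2)
        filled[~observed] += rng.normal(0.0, sigma, size=(~observed).sum())
    return filled


# Illustrative data (made up for this sketch).
rng = np.random.default_rng(0)
x = np.arange(20.0)
y = 2.0 * x + 1.0 + rng.normal(0.0, 3.0, size=20)
y[[3, 8, 15]] = np.nan

single = regression_impute(x, y)                                 # one dataset, no noise
multiple = [regression_impute(x, y, rng=rng) for _ in range(5)]  # five noisy datasets
```

In the deterministic version the imputed points sit exactly on the fitted line, so the completed data look less variable than the observed data. In the stochastic version the five datasets differ in their imputed cells, and that spread is what carries the uncertainty about the missing values into later analyses.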

Your procedure is an implementation of Buck’s method published in 1960 (described in more detail in Little & Rubin 2002), so I would suggest that you could perhaps rename it to “buck”?

With regards,
Stef van Buuren


eltonlaw commented Jul 10, 2019

Well... this is really, really terrible. Apologies for any trouble this has brought you (and anyone else this has affected); I understand that this mistake could potentially have a huge impact in certain cases. It's been more than a year, but I thought I had understood the paper; that was my mistake. I haven't put enough effort into validating the results of the imputations, and that's something I should have prioritized.

This will be remedied as soon as possible and I will make extra effort to ensure this won't happen ever again. Sorry.

And thank you for raising the issue; the explanation is enlightening (and sobering).


eltonlaw commented Jul 11, 2019

Hmm, Buck's method seems to differ from the current implementation in two ways:

  1. For each column, regression coefficients are calculated on the complete cases. In the current implementation, all null values in the other columns are mean-imputed first.
  2. Buck's is just one pass, whereas what's currently implemented is iterative.
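For reference, a minimal sketch of the single-pass, complete-case approach, following the summary of Buck's method in Little & Rubin: the mean and covariance are estimated from the complete cases, and each incomplete row then gets the conditional mean of its missing variables given its observed ones. `buck_impute` is a hypothetical name, not an impyute function, and the sketch assumes there are enough complete cases for a non-singular covariance estimate.

```python
import numpy as np


def buck_impute(data):
    """Single-pass conditional-mean imputation from complete-case estimates."""
    X = np.asarray(data, dtype=float)
    missing = np.isnan(X)
    complete = ~missing.any(axis=1)
    if complete.sum() <= X.shape[1]:
        raise ValueError("too few complete cases to estimate the covariance")

    # Mean vector and covariance matrix from the complete cases only.
    mu = X[complete].mean(axis=0)
    sigma = np.cov(X[complete], rowvar=False)

    out = X.copy()
    for i in np.flatnonzero(missing.any(axis=1)):
        m = missing[i]   # variables missing in this row
        o = ~m           # variables observed in this row
        if not o.any():
            out[i] = mu  # nothing observed: fall back to the complete-case mean
            continue
        # Coefficients of the regression of the missing on the observed variables.
        beta = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)])
        out[i, m] = mu[m] + (X[i, o] - mu[o]) @ beta
    return out
```

Each row is handled once and nothing is re-estimated afterwards, which is the "one pass" in point 2.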

I could just rewrite it to completely follow the method in the paper... but I don't know if that would be most effective. A concern I have: in 1), a faked complete case is generated over the entire dataset because, anecdotally, I've sometimes worked with sparse datasets, and I feel that requiring complete cases would pare down the usable input too much. Perhaps there is a way to optimize the order in which columns get computed so that the total number (or some other measure) of dropped rows is minimized. Not too sure on this; I need to do more research.
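The "faked complete case" approach described in this comment can be sketched roughly as follows. This is an illustration of the idea as stated in the thread (mean-fill every gap, then repeatedly re-predict each column's gaps by linear regression on the other columns), not the actual impyute source. The function name, `max_iter`, and `tol` are assumptions made for the sketch, and the stopping rule in particular is a guess.

```python
import numpy as np


def buck_iterative_sketch(data, max_iter=50, tol=1e-6):
    """Mean-initialised, iterated regression imputation (deterministic)."""
    X = np.asarray(data, dtype=float)
    missing = np.isnan(X)

    # Step 1: fake a complete dataset by filling every gap with its column mean.
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])

    for _ in range(max_iter):
        previous = filled.copy()
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any() or rows.all():
                continue  # nothing to impute, or nothing to fit on
            # Step 2: regress column j on all other (currently filled) columns,
            # fitting only on rows where column j was actually observed.
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(others)), others])
            coef, *_ = np.linalg.lstsq(design[~rows], filled[~rows, j], rcond=None)
            # Step 3: overwrite the gaps in column j with the predicted values.
            filled[rows, j] = design[rows] @ coef
        # Assumed stopping rule: imputed values have stopped changing.
        if np.max(np.abs(filled - previous)) < tol:
            break
    return filled
```

No rows are dropped here, which addresses the sparsity concern, but every imputed cell is still a noise-free prediction and only one completed dataset is returned.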

Anyway, for the short term, I'm going to rename it buck_iterative while a longer-term fix (and a better name) is prepared.

eltonlaw added a commit that referenced this issue Jul 11, 2019
[#64] Rename `mice` -> `buck_iterative`

stefvanbuuren commented

Elton, wonderful.

I agree that buck_iterative is indeed the proper name for the procedure.

Stef.
