I've noticed that the quality of imputed data is worse than that of generated data. Below is a minimal reproducible example using Two Moons data. I generated an N=200 dataset and then created a ForestDiffusionModel on N=400 samples: the original N=200 dataset plus a copy of it with the second dimension set to NaN.
Sampling from the model produced good-looking results, but the imputations for the rows with NaNs were much noisier.
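For readers who want to reproduce the setup, here is a minimal sketch of how the 400-row training array described above could be built. It is not the original poster's code. The `make_two_moons` helper, the noise level, and the seed are assumptions made for illustration (the helper is a NumPy stand-in shaped like scikit-learn's `make_moons`), and the ForestDiffusionModel constructor arguments are left out because the issue does not state them.

```python
import numpy as np

def make_two_moons(n, noise=0.05, seed=0):
    """Hypothetical NumPy stand-in for a Two Moons sampler."""
    rng = np.random.default_rng(seed)
    n_upper = n // 2
    n_lower = n - n_upper
    t_upper = rng.uniform(0.0, np.pi, n_upper)
    t_lower = rng.uniform(0.0, np.pi, n_lower)
    # Upper moon: half circle centred at the origin.
    upper = np.column_stack([np.cos(t_upper), np.sin(t_upper)])
    # Lower moon: flipped half circle, shifted right and down.
    lower = np.column_stack([1.0 - np.cos(t_lower), 0.5 - np.sin(t_lower)])
    X = np.vstack([upper, lower])
    return X + rng.normal(scale=noise, size=X.shape)

# Fully observed N=200 dataset.
X = make_two_moons(200)

# Copy of the same rows with the second dimension hidden.
X_masked = X.copy()
X_masked[:, 1] = np.nan

# N=400 training array: observed rows first, then the masked copies.
X_train = np.vstack([X, X_masked])

# X_train is what would be handed to ForestDiffusionModel; rows 200..399
# are the ones whose second dimension the model has to impute.
```

Because the masked rows are exact copies, `X[:, 1]` serves as ground truth for the imputed values.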
We also observed worse results for imputation. In our paper, you can see that MissForest is the best imputation method, while ours ends up far from first place. We are not quite sure why imputation is so much worse than generation, considering that it works fine for images. I have tried a lot of things, but nothing has improved imputation performance. Our method is best used for generation.
![image](https://private-user-images.githubusercontent.com/360485/292677171-119c772c-7f49-4058-8fec-b666acae6a53.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE2MDE4MjksIm5iZiI6MTcyMTYwMTUyOSwicGF0aCI6Ii8zNjA0ODUvMjkyNjc3MTcxLTExOWM3NzJjLTdmNDktNDA1OC04ZmVjLWI2NjZhY2FlNmE1My5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzIxJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcyMVQyMjM4NDlaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iNTkxYWIxNTQyMTVkZWU0MTAyMmNmZDg5NmRkOWVjOGY2NzUwMGViZDJkNWNmYTkyYzgyNzA2MmQ4ZDViMTY3JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.V9_zTj-jivccet9-sa_nSPYp8FlJKO1IzRRkSR5__I0)
Code below:
I get similar results if I instead do the following, so that the rows to be imputed don't exactly match the original data on the first dimension:
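A minimal sketch of what such a variant could look like, not the original poster's code: the rows to be imputed come from an independent Two Moons draw rather than from a copy of the observed rows. The `sample_moons` helper, the noise level, and the seed are assumptions made for illustration.

```python
import numpy as np

def sample_moons(n, rng, noise=0.05):
    """Hypothetical Two Moons sampler; each point picks a moon at random."""
    t = rng.uniform(0.0, np.pi, n)
    upper = rng.random(n) < 0.5
    x = np.where(upper, np.cos(t), 1.0 - np.cos(t))
    y = np.where(upper, np.sin(t), 0.5 - np.sin(t))
    return np.column_stack([x, y]) + rng.normal(scale=noise, size=(n, 2))

rng = np.random.default_rng(0)

# Fully observed rows.
X_obs = sample_moons(200, rng)

# Independent draw for the rows to impute, so first-dimension values
# do not coincide with any observed row.
X_new = sample_moons(200, rng)
X_new_masked = X_new.copy()
X_new_masked[:, 1] = np.nan

# N=400 training array for the variant experiment.
X_train_variant = np.vstack([X_obs, X_new_masked])
```

Here `X_new[:, 1]` holds the true values of the hidden dimension, which can be compared against whatever the model imputes.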