I'm M. Liu, a second-year master's student in Statistics at the University of Chicago, where I'm fortunate to work with Prof. Barber and Prof. Aragam. Prior to UChicago, I obtained my Bachelor of Science degree in Statistics at Beijing Normal University.
My current research interests lie at the intersection of distribution-free inference, graphical models, causal inference, and generalized machine learning.
Current project: exploring the validity and power of the local permutation test
Supervisor: Rina Barber
Experimental results later...
Initial Code
Second Version
Interesting Presentation
Current project: checking for the global minimum in NOTEARS and exploring possible solutions
Supervisor: Bryon Aragam
Experimental results later...
Initial Code
PyTorch Version
Mar 2023 - Apr 2023
Reproduced the randomization-t and randomization-c algorithms from Paper. The original paper pools a large set of econometric studies and re-tests their significance results with randomization tests, reporting that randomization tests yield 33% to 49% fewer statistically significant results than conventional tests.
Reproduction Code
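To make the idea concrete, here is a minimal sketch of a randomization test for a binary treatment. It is a simplified stand-in and not the reproduction code: it assumes that "randomization-c" compares the raw coefficient (here a difference in means) and that "randomization-t" compares its studentized version, whereas the paper's regression-based procedures are more general. The function names `randomization_pvalues` and `_coef_and_t` are hypothetical.

```python
import random
import statistics


def _coef_and_t(y, assign):
    """Difference in means and its Welch-style t-statistic for one assignment."""
    treated = [v for v, a in zip(y, assign) if a]
    control = [v for v, a in zip(y, assign) if not a]
    coef = statistics.fmean(treated) - statistics.fmean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return coef, coef / se


def randomization_pvalues(y, treat, n_draws=2000, seed=0):
    """Return (p_c, p_t): two-sided randomization p-values.

    p_c ranks the observed coefficient among re-randomized coefficients;
    p_t does the same for the studentized coefficient.
    """
    rng = random.Random(seed)
    coef_obs, t_obs = _coef_and_t(y, treat)
    assign = list(treat)
    exceed_c = exceed_t = 0
    for _ in range(n_draws):
        rng.shuffle(assign)  # re-draw the treatment assignment
        coef_b, t_b = _coef_and_t(y, assign)
        exceed_c += abs(coef_b) >= abs(coef_obs)
        exceed_t += abs(t_b) >= abs(t_obs)
    # The add-one correction counts the observed assignment as one draw.
    return (1 + exceed_c) / (1 + n_draws), (1 + exceed_t) / (1 + n_draws)


# Usage on simulated data (assumed for illustration only).
demo_rng = random.Random(42)
treat = [1] * 20 + [0] * 20
y = [0.5 * t + demo_rng.gauss(0, 1) for t in treat]
p_c, p_t = randomization_pvalues(y, treat)
```

Shuffling the assignment vector keeps the number of treated units fixed, which matches a completely randomized design; a stratified or clustered experiment would need to re-draw assignments within its own design.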
Sep 2021 - May 2022
Under the supervision of Dr. Renjun Xu, we replicated a biologically informed network in which the first layer is randomly connected and the following layers are connected according to gene-pathway relationships. We then collected public data for stomach, colorectal, and liver cancer and applied the model to these three datasets, hoping to identify both domain-specific and common genetic variation sites.
Reference: Elmarakeby, H.A., Hwang, J., Arafeh, R. et al. Biologically informed deep neural network for prostate cancer discovery. Nature 598, 348–352 (2021).
- Stomach cancer
- Colorectal cancer
- Liver cancer
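The gene-pathway connectivity described above can be sketched as a linear layer whose weights are multiplied by a binary membership mask. This is an illustrative sketch rather than our replication code: the gene and pathway names are placeholders, the tanh activation is chosen only for the example, and `membership_mask` and `masked_layer` are hypothetical helper names.

```python
import numpy as np

# Placeholder annotation (not real biology): which genes feed which pathway.
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
pathways = {"pathway_1": ["gene_a", "gene_b"], "pathway_2": ["gene_c", "gene_d"]}


def membership_mask(genes, pathways):
    """mask[i, j] is 1.0 if gene i is annotated to pathway j, else 0.0."""
    names = list(pathways)
    mask = np.zeros((len(genes), len(names)))
    for j, name in enumerate(names):
        for gene in pathways[name]:
            mask[genes.index(gene), j] = 1.0
    return mask


def masked_layer(x, weights, mask):
    """Linear layer plus tanh in which only annotated edges carry weight."""
    return np.tanh(x @ (weights * mask))


rng = np.random.default_rng(0)
mask = membership_mask(genes, pathways)
weights = rng.normal(size=mask.shape)      # dense parameters, sparsified by the mask
x = rng.normal(size=(3, len(genes)))       # 3 samples, one feature per gene
pathway_activations = masked_layer(x, weights, mask)
```

Because the mask is applied inside the forward pass, a gene can only influence the pathways it is annotated to, which is what makes each hidden unit interpretable as a pathway.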
Aug 2021 - Jan 2022
ExtraMAE 🔗paper
Collaborators: Mengyue Zha, SiuTim Wong, Tong Zhang, Kani Chen
We worked on generating synthetic financial time series for downstream tasks, to compensate for the cost and scarcity of financial data. We used a self-supervised learning method, a masked autoencoder model, to generate the time series and found that it performs strongly on downstream tasks such as time series classification, prediction, and imputation. The paper details the model and its performance compared with other benchmark models.
Sources: ExtraMAE code; One benchmark C-RNN-GAN code
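As a rough illustration of the masked-autoencoder idea, the sketch below hides a random subset of time steps and scores a reconstruction only on the hidden steps. It is not the ExtraMAE architecture: the linear interpolation stands in for a trained encoder-decoder, scoring only hidden steps is one common choice rather than a claim about the paper's loss, and `random_time_mask`, `masked_mse`, and the 25% mask ratio are assumptions made for the example.

```python
import numpy as np


def random_time_mask(n_steps, mask_ratio, rng):
    """Boolean vector of length n_steps; True marks a hidden time step."""
    n_masked = int(round(mask_ratio * n_steps))
    mask = np.zeros(n_steps, dtype=bool)
    mask[rng.choice(n_steps, size=n_masked, replace=False)] = True
    return mask


def masked_mse(series, reconstruction, mask):
    """Mean squared error restricted to the hidden time steps."""
    diff = (series - reconstruction)[mask]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 4.0 * np.pi, 48))[:, None]  # shape (time, features)
mask = random_time_mask(len(series), mask_ratio=0.25, rng=rng)

# Model input: hidden steps are zeroed out so the model cannot see them.
visible = np.where(mask[:, None], 0.0, series)

# Stand-in "model": fill hidden steps by interpolating between visible ones.
t = np.arange(len(series))
reconstruction = series.copy()
reconstruction[mask, 0] = np.interp(t[mask], t[~mask], series[~mask, 0])

loss = masked_mse(series, reconstruction, mask)
```

In an actual training loop the interpolation step would be replaced by the network's forward pass on `visible`, and `loss` would be minimized over many randomly masked windows.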
- I love exploration and challenges. ♒️
- Recent activities: ⛷🧩🏸🎹⛰🚴🏻‍♀️
- If you would like to walk along Lake Michigan in Chicago, feel free to reach out✉️.