I build systems in computer vision (CV), natural language processing (NLP), and mobile development. Selected projects:
- VideoMultiAgent: A multi-agent VQA system that achieves SOTA results on Intent-QA (79.0%, +6.2% over the previous SOTA), the EgoSchema subset (75.4%, +3.4%), and NExT-QA (79.6%, +0.4%) by combining specialized agents for visual, textual, and graph-based reasoning.
- On-Device NLP library: Developed DIPS, a lightweight Cantonese word segmentation model that achieves near-SOTA performance while being 17x smaller and 3x faster than baselines. Applied knowledge distillation, structured pruning, and quantization to shrink a BERT model from 51 MB to 3 MB for deployment on edge devices.
- Cantonese Dictionary Mobile App: Developed a cross-platform mobile app using Flutter and Rust, supporting efficient search across 15K+ dictionary entries. Optimized the user experience with a local SQLite database and a cloud DynamoDB database. Reached over 5K monthly active users.
My research at the Michigan Intelligent Programming Lab led to a novel program synthesis algorithm that outperformed the previous state-of-the-art method for web automation by 6x and solved 2.5x more benchmarks. The algorithm lifts program interpretation from evaluating one program at a time to evaluating all programs simultaneously. I presented this work as first author at POPL 2024, a top conference in the field, and was invited to give a talk at MIT PLR 2024 as one of 9 distinguished papers.
I have a strong foundation in computer systems, with experience in operating systems, compilers, and programming languages. My coursework included Programming Languages, Compiler Construction, Intro to Operating Systems, and Computer Security. I led a team of three to implement a memory manager, a concurrent file server, and a thread library in C++, and developed comprehensive testing infrastructure to ensure system stability.