M.S. in Computer Science @ University of Rochester
Working at the intersection of AI, single-cell genomics, and systems
Based in Rochester, NY
yhu116@u.rochester.edu | yhu116@ur.rochester.edu
(603) 866-4355
GitHub
I'm a computer scientist and researcher interested in how modern ML and foundation models can be used to understand complex systems, from disease-associated brain cell states to end-to-end AI music generation. I enjoy building things that actually run: reproducible pipelines, research code that other people can use, and small systems that behave like "real" tools rather than class demos.
Right now, my work spans single-cell foundation models (scGPT, Geneformer), distributed systems, and multimodal generative AI.
Research Assistant, University of Rochester Medical Center
Training Generative AI to Identify Disease-Associated Cell States (May 2025 – Present)
- Build end-to-end pipelines for brain snRNA-seq datasets from psychiatric and neurodegenerative cohorts (QC, normalization, HVG selection, batch correction, metadata harmonization).
- Apply scGPT-based reference mapping and cell-type annotation; generate latent embeddings with Scanpy / PyTorch on NAIRR GPU clusters (Slurm).
- Fine-tune single-cell foundation models (scGPT, Geneformer) for cell-type/state classification and disease trajectories; run ablations and baselines, visualize results with UMAPs, and document models with model cards.
- Languages: Python, Java, C, C++, OCaml, Rust, Swift, HTML5, CSS, JavaScript
- Data / Stats / Scripting: R, Stan, MATLAB, MySQL, PHP, Ruby, C#
- ML / AI: scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers, GitHub Copilot
- Cloud & Tools: Git, SSH, Docker, Google Cloud Platform, Google Colab, AWS, Xcode, Visual Studio, VS Code, JetBrains IDEs, RStudio, JMeter, AMP
- Focus Areas: Machine learning, deep learning, NLP, single-cell modeling, data mining, distributed systems
An end-to-end multimodal music-generation system.
- Uses Whisper for speech-to-text, Llama-4-Scout for lyric creation, Gemini Flash 2.0 for lyric structuring, and YuE for full music synthesis (vocals + instrumentation).
- Provides a Gradio-based UI and a multi-model pipeline supporting quantized deployment across different GPUs.
- Implements API orchestration and custom model adapters, benchmarking generation latency (~13–9 minutes depending on GPU).
A teaching-oriented but fully functional DVCS.
- Implements Git-like operations such as commits, branching, merging, and remote syncing.
- Designed around information hiding and minimal interfaces for the core FileLog module.
- Written in Rust, with a focus on correctness and clear separation of responsibilities.
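The commit and branch model described above can be sketched in a few lines. This is an illustrative Python sketch, not the project's Rust code: the `Repo` class, its method names, and the SHA-1 content-hash scheme are all assumptions made for the example.

```python
import hashlib
import json


class Repo:
    """Toy in-memory repository (hypothetical; not the project's API)."""

    def __init__(self):
        self.commits = {}                # commit id -> commit record
        self.branches = {"main": None}   # branch name -> tip commit id
        self.head = "main"               # name of the checked-out branch

    def commit(self, files, message):
        """Record a snapshot of `files` on the current branch and return its id."""
        record = {
            "parent": self.branches[self.head],
            "message": message,
            "files": dict(files),
        }
        # The id is a hash of the record, so identical content and history
        # always produce the same id.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        commit_id = hashlib.sha1(payload).hexdigest()
        self.commits[commit_id] = record
        self.branches[self.head] = commit_id
        return commit_id

    def branch(self, name):
        """Create a new branch pointing at the current tip."""
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        """Switch the current branch."""
        if name not in self.branches:
            raise KeyError(f"unknown branch: {name}")
        self.head = name

    def log(self):
        """Return commit messages from the current tip back to the root."""
        messages = []
        commit_id = self.branches[self.head]
        while commit_id is not None:
            record = self.commits[commit_id]
            messages.append(record["message"])
            commit_id = record["parent"]
        return messages


repo = Repo()
first = repo.commit({"a.txt": "1"}, "first")
repo.branch("dev")
repo.checkout("dev")
second = repo.commit({"a.txt": "2"}, "second")
```

In this sketch a branch is only a name that points at a commit id, so creating one is cheap and `log()` simply follows parent links. Merging and remote syncing are left out to keep the example short.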
Using ML to study lifestyle-related physiological patterns.
- Preprocessed and analyzed 70,000+ physiological samples to study associations between body signals and smoking/drinking habits.
- Applied PCA for dimensionality reduction and trained SVM and XGBoost models to predict smoking/drinking status, achieving ~81% accuracy with supporting visualizations.
NLP for political tweet classification.
- Classified political tweets from seven Northern European countries.
- Built an NLP pipeline with feature extraction and LinearSVC, achieving 77% accuracy.
University of Rochester – M.S. in Computer Science
Rochester, NY • Aug 2024 – Present
- GPA: 3.95/4.00
- Graduate Tuition Scholarship (50% of full-time tuition cost)
- A's in all core CS courses (NLP, Machine Vision, Parallel & Distributed Systems, Databases, Networks, etc.)
University of Rochester – B.S. in Computer Science
Rochester, NY • Aug 2021 – Dec 2023
- GPA: 3.96/4.00
- Merit-based Tuition Scholarship ($5,000)
- Dean's List every semester; A's in all CS and Math major courses
Research Participant, Department of Mathematics
University of Rochester (Jul 2023 – Aug 2023)
- Studied fractal-based metrics for forecasting retail time-series data.
- Computed a Discrete-S Energy metric to approximate Hausdorff dimension and compared it with forecastability scores and compression-based complexity measures.
- Evaluated Holt-Winters, Prophet, and LSTM models and analyzed correlations between discrete energy, complexity metrics, and RMSE/SD.
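As a rough illustration of the discrete energy idea above, here is a minimal Python sketch. It assumes the standard Riesz s-energy definition (the sum of inverse pairwise distances raised to the power s); the function name `riesz_s_energy` and the toy point set are hypothetical and not taken from the project code, and how the retail series were turned into point sets is not shown here.

```python
import math
from itertools import combinations


def riesz_s_energy(points, s):
    """Riesz s-energy of a finite point set.

    Sums 1 / |x_i - x_j|**s over unordered pairs of distinct points.
    Some references sum over ordered pairs instead, which doubles the value.
    """
    total = 0.0
    for p, q in combinations(points, 2):
        distance = math.dist(p, q)
        if distance == 0.0:
            raise ValueError("duplicate points make the energy infinite")
        total += distance ** (-s)
    return total


# Toy example: three corners of a unit right triangle.
toy_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
toy_energy = riesz_s_energy(toy_points, s=2)
```

The energy grows quickly when points cluster tightly. Tracking how it scales with the number of points for different values of s is one way such a metric can be related to a dimension estimate.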
Teaching Assistant
University of Rochester (Sep 2022 – Dec 2023)
- Supported courses including Computer Organization and Mobile App Development.
- Held office hours, answered questions on course material, and graded assignments and exams.
Computer Science Intern & Web Group Leader
VISION X LLC, San Jose, CA (Aug 2020 – Oct 2020)
- Led a small team to integrate Cohere LLMs into client websites using Django and AWS, creating embedded AI chatbots and premium subscription features.
- Optimized backend infrastructure for scalability and responsiveness, improving customer interaction quality by ~30%.
- Received a recommendation from the CEO for leadership and problem-solving.
- Fine-tuning single-cell foundation models (scGPT, Geneformer) on brain snRNA-seq datasets.
- Experimenting with training techniques such as Smart-Freeze and mixture-of-experts, and migrating to CUDA 12.x on NAIRR/ACES GPUs.
- Exploring AI + creative tools, especially systems that combine language, music, and generative models.
If you'd like to collaborate, feel free to open an issue, start a discussion, or reach out by email.

