AnakinHuang/README.md

👋 Hi, I'm Yuesong (Anakin) Huang

🎓 M.S. in Computer Science @ University of Rochester
🧬 Working at the intersection of AI, single-cell genomics, and systems
🌎 Based in Rochester, NY
📧 yhu116@u.rochester.edu | yhu116@ur.rochester.edu
📞 (603) 866-4355
🔗 GitHub


🚀 About Me

I’m a computer scientist and researcher interested in how modern ML and foundation models can be used to understand complex systems, from disease-associated brain cell states to end-to-end AI music generation. I enjoy building things that actually run: reproducible pipelines, research code that other people can use, and small systems that behave like “real” tools rather than class demos.

Right now, my work spans single-cell foundation models (scGPT, Geneformer), distributed systems, and multimodal generative AI.


🧪 Current Research

Research Assistant, University of Rochester Medical Center
Training Generative AI to Identify Disease-Associated Cell States (May 2025 – Present)

  • Build end-to-end pipelines for brain snRNA-seq datasets from psychiatric and neurodegenerative cohorts (QC, normalization, HVG selection, batch correction, metadata harmonization).
  • Apply scGPT-based reference mapping and cell-type annotation; generate latent embeddings with Scanpy / PyTorch on NAIRR GPU clusters (Slurm).
  • Fine-tune single-cell foundation models (scGPT, Geneformer) for cell-type/state classification and disease trajectories; run ablations and baselines and visualize results with UMAPs and model cards.

🛠️ Technical Skills

  • Languages: Python, Java, C, C++, OCaml, Rust, Swift, HTML5, CSS, JavaScript
  • Data / Stats / Scripting: R, Stan, MATLAB, MySQL, PHP, Ruby, C#
  • ML / AI: scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers, GitHub Copilot
  • Cloud & Tools: Git, SSH, Docker, Google Cloud Platform, Google Colab, AWS, Xcode, Visual Studio, VS Code, JetBrains IDEs, RStudio, JMeter, AMP
  • Focus Areas: Machine learning, deep learning, NLP, single-cell modeling, data mining, distributed systems

💻 Highlighted Projects

An end-to-end multimodal music-generation system.

  • Uses Whisper for speech-to-text, Llama-4-Scout for lyric creation, Gemini 2.0 Flash for lyric structuring, and YuE for full music synthesis (vocals + instrumentation).
  • Provides a Gradio-based UI and a multi-model pipeline supporting quantized deployment across different GPUs.
  • Implements API orchestration and custom model adapters, and benchmarks generation latency (~13–9 minutes depending on GPU).

A teaching-oriented but fully functional DVCS.

  • Implements Git-like operations such as commits, branching, merging, and remote syncing.
  • Designed around information hiding and minimal interfaces for the core FileLog module.
  • Written in Rust, with a focus on correctness and clear separation of responsibilities.

Using ML to study lifestyle-related physiological patterns.

  • Preprocessed and analyzed 70,000+ physiological samples to study associations between body signals and smoking/drinking habits.
  • Applied PCA for dimensionality reduction and trained SVM and XGBoost models to predict smoking/drinking status, achieving ~81% accuracy with supporting visualizations.

NLP for political tweet classification.

  • Classified political tweets from seven Northern European countries.
  • Built an NLP pipeline with feature extraction and LinearSVC, achieving 77% accuracy.

📚 Education

University of Rochester – M.S. in Computer Science
Rochester, NY • Aug 2024 – Present

  • GPA: 3.95/4.00
  • Graduate Tuition Scholarship (50% of full-time tuition cost)
  • A’s in all core CS courses (NLP, Machine Vision, Parallel & Distributed Systems, Databases, Networks, etc.)

University of Rochester – B.S. in Computer Science
Rochester, NY • Aug 2021 – Dec 2023

  • GPA: 3.96/4.00
  • Merit-based Tuition Scholarship ($5,000)
  • Dean’s List every semester; A’s in all CS and Math major courses

💼 Experience

Research Participant, Department of Mathematics
University of Rochester (Jul 2023 – Aug 2023)

  • Studied fractal-based metrics for forecasting retail time-series data.
  • Computed a Discrete-S Energy metric to approximate Hausdorff dimension and compared it with forecastability scores and compression-based complexity measures.
  • Evaluated Holt-Winters, Prophet, and LSTM models and analyzed correlations between discrete energy, complexity metrics, and RMSE/SD.

Teaching Assistant
University of Rochester (Sep 2022 – Dec 2023)

  • Supported courses including Computer Organization and Mobile App Development.
  • Held office hours, answered questions on course material, and graded assignments and exams.

Computer Science Intern & Web Group Leader
VISION X LLC, San Jose, CA (Aug 2020 – Oct 2020)

  • Led a small team to integrate Cohere LLMs into client websites using Django and AWS, creating embedded AI chatbots and premium subscription features.
  • Optimized backend infrastructure for scalability and responsiveness, improving customer interaction quality by ~30%.
  • Received a recommendation from the CEO for leadership and problem-solving.

📈 What I’m Up To

  • Fine-tuning single-cell foundation models (scGPT, Geneformer) on brain snRNA-seq datasets.
  • Experimenting with training techniques such as Smart-Freeze and mixture-of-experts, and migrating to CUDA 12.x on NAIRR/ACES GPUs.
  • Exploring AI + creative tools, especially systems that combine language, music, and generative models.

If you’d like to collaborate, feel free to open an issue, start a discussion, or reach out by email.

Pinned

  1. crypto_market

    iOS app displaying real-time cryptocurrency data using CoinGecko API, emulating Apple's Stocks app functionality.

    Swift 3

  2. enhancing_image_classification_in_low_light_conditions_using_modified_cnn_algorithms

    The modifications to CNN architectures and specialized preprocessing techniques to enhance their performance in such conditions. Using the “Exclusively Dark” image dataset (as in “See Loh & Chan (2…

    Jupyter Notebook 2

  3. kaggle-project-classification-of-tweets-from-northern-europe

    Classifies 500K+ political tweets from Northern Europe using NLP and machine learning to analyze political discourse.

    Jupyter Notebook 1

  4. the_impact_on_body-signals-analysis-of-smoking-and-drinking

    Analyzed 70,000+ physiological records to classify individuals' smoking and drinking status using SVM and XGBoost, achieving ~81% accuracy with comprehensive statistical analysis and visualizations.

    Jupyter Notebook 1

  5. futurespyhi/MiloMusic

    A music generator built on Gradio, generating music based on users' prompts and their choice of genres, themes and moods

    Python 5