

MLE-Leetcode

Master Machine Learning Engineering, LLMs & Multimodal AI

Chinese Version | Contributing | Issues



🚀 Overview

MLE-Leetcode is a comprehensive collection of 103 production-grade coding questions designed to prepare you for top-tier Machine Learning Engineer (MLE) interviews.

Unlike traditional LeetCode, this repository focuses on Large Language Models (LLMs), Transformers, and Multimodal AI systems. These are not toy problems—they are simplified versions of real-world engineering challenges faced at companies like OpenAI, Google, and Meta.

Workflow

graph LR
    A[Start] --> B{Choose Module};
    B --> C[Read question.md];
    C --> D[Implement solution.py];
    D --> E[Run Public Tests];
    E -- Pass --> F[Run Private Evals];
    E -- Fail --> D;
    F -- Pass --> G[Mastered!];
    F -- Fail --> D;
    
    style A fill:#4CAF50,stroke:#333,stroke-width:2px;
    style G fill:#4CAF50,stroke:#333,stroke-width:2px;
    style E fill:#2196F3,stroke:#333,stroke-width:2px;
    style F fill:#9C27B0,stroke:#333,stroke-width:2px;

📚 Curriculum

The curriculum is organized into 9 Modules covering the full spectrum of modern AI engineering.

Module 1: HuggingFace Transformers Engineering (Q02-Q14)

Focus: Practical engineering with the HuggingFace ecosystem: model loading, tokenizer customization, and training optimization.

| ID | Topic |
| --- | --- |
| Q02 | Special Token Addition |
| Q03 | Checkpoint Auto Loading |
| Q04 | Robust Model Download |
| Q05 | Config Modification & Safe Loading |
| Q06 | SFT Training Trainer |
| Q07 | Accelerate Training Loop |
| Q08 | Monitoring Callback |
| Q09 | Dynamic Padding Collator |
| Q10 | Online Sequence Packing |
| Q11 | Advanced Generate Wrapper |
| Q12 | Smart Dynamic Batching |
| Q13 | Multi Turn KV Cache Dialogue |
| Q14 | Reproducible Generation |
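
To give a feel for the Q02 theme, here is a minimal sketch of registering special tokens and resizing the embedding table with the HuggingFace API; the `gpt2` checkpoint and the token strings are placeholders, not the graded setup.

```python
# Minimal sketch for the Q02 theme: register new special tokens and resize the
# embedding matrix so the model can use them. "gpt2" and the token strings are
# placeholder choices, not the setup used in the question.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|user|>", "<|assistant|>"]}
)
if num_added > 0:
    # The embedding table must grow to cover the newly added token ids.
    model.resize_token_embeddings(len(tokenizer))

print(tokenizer.convert_tokens_to_ids("<|user|>"))
```
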
Module 2: Attention & Transformer Core (Q15-Q26)

Focus: Implementing core Transformer components from scratch to understand the underlying mathematics and logic.

| ID | Topic |
| --- | --- |
| Q15 | Scaled Dot Product Attention |
| Q16 | Multi Head Attention |
| Q17 | Causal Mask Construction |
| Q18 | Numerically Stable Softmax |
| Q19 | Multi Query Attention (MQA) |
| Q20 | Grouped Query Attention (GQA) |
| Q21 | Attention Residual Dropout |
| Q22 | FlashAttention Compatibility Layer |
| Q23 | Decoder Only Transformer Block |
| Q24 | Encoder Decoder Transformer Block |
| Q25 | RMSNorm Implementation |
| Q26 | Activation Checkpointing |
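
As a taste of the Q15/Q17 material, below is a minimal from-scratch sketch of scaled dot-product attention with an optional causal mask; the tensor shapes are illustrative and the reference solutions may differ.

```python
# Minimal sketch of the Q15/Q17 themes: scaled dot-product attention with an
# optional causal mask, written from scratch rather than via library helpers.
import math
import torch

def scaled_dot_product_attention(q, k, v, causal=False):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if causal:
        seq_len = q.size(-2)
        # Mask out positions above the diagonal so tokens cannot attend ahead.
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=q.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

out = scaled_dot_product_attention(torch.randn(2, 4, 8, 16),
                                   torch.randn(2, 4, 8, 16),
                                   torch.randn(2, 4, 8, 16), causal=True)
print(out.shape)  # torch.Size([2, 4, 8, 16])
```
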
Module 3: RoPE & Positional Encoding (Q27-Q31, Q75-Q83)

Focus: Mastering Rotary Positional Embeddings (RoPE), ALiBi, and long-context strategies.

| ID | Topic |
| --- | --- |
| Q27 | Advanced Multi Head Attention |
| Q28 | Transformer Normalization Strategies |
| Q29 | RoPE Rotary Position Embedding |
| Q30 | RoPE With Position Offset |
| Q31 | RoPE Context Extension Scaling |
| Q75-Q83 | Advanced RoPE Variants & Numerical Stability |
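
For orientation on the Q29/Q30 theme, here is a minimal RoPE sketch using the half-split rotation convention; the `base` value and shapes are assumptions, and the actual questions add position offsets, context-extension scaling, and numerical-stability concerns on top of this.

```python
# Minimal sketch of the RoPE idea behind Q29/Q30: rotate each pair of channels
# by a position-dependent angle. Uses the half-split convention; interleaved
# variants and offset handling are left to the actual questions.
import torch

def apply_rope(x, base=10000.0):
    # x: (batch, seq_len, dim), dim must be even
    batch, seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()          # each (seq_len, half)
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin,
                      x1 * sin + x2 * cos], dim=-1)

print(apply_rope(torch.randn(1, 6, 8)).shape)  # torch.Size([1, 6, 8])
```
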
Module 4: KV Cache & Inference Optimization (Q32-Q37, Q84-Q89)

Focus: Efficient inference, memory management, PagedAttention, and KV cache optimizations.

| ID | Topic |
| --- | --- |
| Q32 | ALiBi Attention Bias |
| Q33 | RoPE Shape Bug Debugging |
| Q34 | Advanced RoPE Implementation |
| Q35 | Positional Encoding Comparison |
| Q36 | Paged Attention Simplified |
| Q37 | KV Cache Memory Estimator |
| Q84-Q89 | KV Cache Quantization, Streaming, Compression |
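
As a back-of-the-envelope companion to Q37, the helper below estimates KV cache size from the model shape; the function name and parameters are illustrative, not the graded interface.

```python
# Rough estimator in the spirit of Q37: keys and values are each
# (layers, batch, kv_heads, seq_len, head_dim) tensors, hence the factor of 2.
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):  # 2 bytes for fp16/bf16
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem

# Example: a 7B-class model (32 layers, 32 KV heads, head_dim 128) at 4k context
gib = kv_cache_bytes(32, 32, 128, 4096) / 2**30
print(f"{gib:.2f} GiB")  # 2.00 GiB per sequence in fp16
```
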
Module 5: Sampling, Decoding & Evaluation (Q38-Q42, Q90-Q94)

Focus: Decoding strategies such as beam search, top-k/top-p sampling, and speculative sampling, plus evaluation metrics such as perplexity.

| ID | Topic |
| --- | --- |
| Q38 | Top K Top P Sampling |
| Q39 | Repetition Penalty |
| Q40 | Beam Search Length Penalty |
| Q41 | Constrained Decoding |
| Q42 | Perplexity Packed Sequences |
| Q90-Q94 | Diverse Beam Search, Speculative Sampling, Token Healing |
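
To illustrate the Q38 theme, here is a minimal top-k then top-p (nucleus) filter over a single logit vector; real solutions usually add temperature, batching, and edge-case handling.

```python
# Minimal sketch of the Q38 theme: filter logits with top-k, then keep the
# smallest nucleus of tokens whose cumulative probability exceeds top_p, then sample.
import torch

def sample_top_k_top_p(logits, top_k=50, top_p=0.9):
    # logits: (vocab_size,)
    if top_k > 0:
        kth = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Drop tokens whose preceding cumulative mass already exceeds top_p.
    cutoff = cumulative - sorted_probs > top_p
    sorted_probs = sorted_probs.masked_fill(cutoff, 0.0)
    sorted_probs = sorted_probs / sorted_probs.sum()
    choice = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_idx[choice].item()

print(sample_top_k_top_p(torch.randn(32000)))
```
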
Module 6: Training Engineering (Q43-Q50)

Focus: Distributed training, Mixed Precision (AMP), Checkpointing, FSDP/ZeRO concepts.

| ID | Topic |
| --- | --- |
| Q43 | AMP Training Loop |
| Q44 | Gradient Accumulation Resume |
| Q45 | ZeRO2 Optimizer Sharding |
| Q46 | FSDP DP Wrapper |
| Q47 | Atomic Checkpoint Writing |
| Q48 | Training Consistency Checkpoint |
| Q49 | Automatic Checkpoint Recovery |
| Q50 | Multi Rank Metrics Aggregation |
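
As a flavor of Q43, below is a minimal fp16 AMP loop with loss scaling; the model, data, and hyperparameters are stand-ins, and a CUDA device is assumed.

```python
# Minimal sketch of the Q43 theme: fp16 autocast plus loss scaling.
# The tiny Linear model and random data are stand-ins; requires a CUDA device.
import torch

model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    x = torch.randn(32, 128, device="cuda")
    y = torch.randint(0, 10, (32,), device="cuda")
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(dtype=torch.float16):
        loss = torch.nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()   # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)          # unscales grads; skips the step on inf/nan
    scaler.update()
```
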
Module 7: Multimodal Modeling (Q51-Q64, Q95-Q104)

Focus: Vision-Language Models (VLM), Audio, Video, CLIP, LLaVA, Adapters.

| ID | Topic |
| --- | --- |
| Q51 | ViT Patch Embedding |
| Q52 | CLIP Contrastive Loss |
| Q53 | Vision Projector |
| Q54 | Image Token Inserter |
| Q55 | SigLIP Similarity Loss |
| Q56 | LLaVA Fusion |
| Q57-Q64 | Cross Attention Adapters, Q-Former, Video/Audio Processing |
| Q95-Q104 | Advanced Multimodal: 3D, Video Temporal Modeling, Cross-Modal Retrieval |
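
To sketch the Q52 theme, here is a minimal symmetric contrastive (InfoNCE-style) loss over paired image/text embeddings; the embedding dimension and temperature are illustrative.

```python
# Minimal sketch of the Q52 theme: symmetric cross-entropy over cosine
# similarities, where row i of each batch is a matched image/text pair.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # image_emb, text_emb: (batch, dim)
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature      # (batch, batch)
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)          # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)      # text -> image
    return (loss_i2t + loss_t2i) / 2

print(clip_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512)))
```
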
Module 8: Data Engineering (Q65-Q69)

Focus: Data mixing, deduplication, quality filtering, and packing strategies.

| ID | Topic |
| --- | --- |
| Q65 | Multi Source Data Mixer |
| Q66 | Online Deduplication |
| Q67 | Quality Filtering Scoring |
| Q68 | Sample Packing Collator |
| Q69 | Safety Compliance Filtering |
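
As a simple take on the Q66 theme, the sketch below deduplicates a stream by hashing a whitespace-normalized form of each document; it only catches exact matches, whereas production pipelines typically add MinHash/LSH for near-duplicates.

```python
# Minimal sketch of the Q66 theme: exact online deduplication by hashing a
# normalized form of each document as it streams past.
import hashlib

def dedup_stream(docs):
    seen = set()
    for doc in docs:
        # Lowercase and collapse whitespace before hashing.
        key = hashlib.sha256(" ".join(doc.lower().split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            yield doc

print(list(dedup_stream(["Hello  world", "hello world", "new doc"])))
# ['Hello  world', 'new doc']
```
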
Module 9: Debugging & Consistency (Q70-Q74)

Focus: Real-world bug hunting—fixing deadlocks, silent failures, and numerical instability.

| ID | Topic |
| --- | --- |
| Q70 | Attention Mask Bug |
| Q71 | RoPE Position Offset Bug |
| Q72 | Dataloader Resume Bug |
| Q73 | Multi GPU Deadlock |
| Q74 | Mixed Precision NaN |
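
In the spirit of Q74, here is one common debugging aid: forward hooks that report which modules emit non-finite activations, which helps localize fp16 overflow before it shows up as a NaN loss. The hook setup and toy model are illustrative, not part of the graded tasks.

```python
# Illustrative debugging aid for the Q74 theme: flag modules whose outputs
# contain inf/NaN values during the forward pass.
import torch

def install_nan_hooks(model):
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                print(f"non-finite activation in {name}")
        return hook
    for name, module in model.named_modules():
        if name:  # skip the root module itself
            module.register_forward_hook(make_hook(name))

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU())
install_nan_hooks(model)
model(torch.full((1, 4), float("nan")))  # triggers the hooks
```
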

🛠️ Getting Started

Prerequisites

pip install torch transformers accelerate pytest datasets

How to Use

  1. Select a Module: Pick a topic you want to master.
  2. Read: Go to the folder (e.g., Q15_Scaled_Dot_Product_Attention) and read question.md.
  3. Code: Write your implementation in a new file, or modify solution.py if you are practicing blind.
  4. Test: Run the public test suite.

       # Example: testing your attention implementation
       cd questions/Module_2_Attention_Transformer_Core/Q15_Scaled_Dot_Product_Attention
       pytest test_case_public.py -v

  5. Evaluate: For a deeper check, run the private evaluation script.

       python eval_script_private.py

🤝 Contributing

Contributions are welcome! If you have a new interview question idea or want to improve an existing solution:

  1. Fork the repo.
  2. Create a branch for your feature.
  3. Submit a Pull Request.

📄 License

This project is licensed under the CC BY-SA 4.0 license.

🙏 Acknowledgments

Designed to bridge the gap between theory and production engineering. Special thanks to the open-source community for the simplified model components used as references.
