Large Language Models (LLMs)
- Llama-2: Developed from scratch, including training and testing.
- Mistral (Base + Mixture of Experts): Developed from scratch, including training and testing.

Optimization Techniques
- Model Compilation
- Mixed Precision Training
- Flash Attention

Distributed Training
- Distributed Data-Parallel (DDP)
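
As a small illustration of the Mixed Precision Training item listed above, the sketch below shows why half-precision training is commonly paired with loss scaling: very small gradient values underflow to zero when stored in fp16, while scaling them up first keeps them representable. This is only a standard-library demonstration and not the training code of this project. The source does not state which framework or scaling scheme is used, so the helper names (`to_fp16`, `scaled_fp16_grad`), the gradient value, and the scale factor are all hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    # The "e" format code packs a 16-bit half-precision float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def scaled_fp16_grad(grad: float, loss_scale: float) -> float:
    """Store grad * loss_scale in fp16, then unscale in full precision."""
    return to_fp16(grad * loss_scale) / loss_scale


# Hypothetical values chosen for illustration only.
tiny_grad = 1e-8        # smaller than anything fp16 can represent
loss_scale = 65536.0    # 2**16, an assumed scale factor

naive = to_fp16(tiny_grad)                          # underflows to 0.0
recovered = scaled_fp16_grad(tiny_grad, loss_scale)  # close to 1e-8

print("fp16 without scaling:", naive)
print("fp16 with loss scaling:", recovered)
```

In a real training loop the scale would multiply the loss before the backward pass, and gradients would be divided by the same factor before the optimizer step; frameworks usually also adjust the scale dynamically when overflows occur.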