Inference Benchmark

Benchmark for machine learning model online serving.

Collected information

- throughput
- latency (P99, P90, P50)
- CPU usage
- memory usage
- GPU utilization
- GPU memory usage

Tasks

- NLP: LLaMA
- CV: Stable Diffusion
- Speech Recognition: Whisper
- Embedding: ImageBind
- Coding Assistant

Frameworks

- MOSEC
- MLServer
- Triton Inference Server
- Text Generation Inference
- BentoML
- Truss
- Potassium
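
The throughput and latency percentiles listed under the collected information can be derived from per-request timings. The following is a minimal sketch of one way to do that, not this project's actual harness: `run_benchmark`, `summarize`, `percentile`, and `fake_request` are hypothetical names, the nearest-rank percentile method is an assumption, and `fake_request` stands in for a real call to a serving endpoint. CPU, memory, and GPU metrics are not covered here.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list; q is in (0, 100]."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies, wall_seconds):
    """Turn per-request latencies (seconds) into throughput and P50/P90/P99."""
    ordered = sorted(latencies)
    return {
        "throughput_rps": len(ordered) / wall_seconds,
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p99": percentile(ordered, 99),
    }


def run_benchmark(send_request, total_requests=100, concurrency=8):
    """Call send_request() from a thread pool and time every call."""

    def timed_call(_):
        start = time.perf_counter()
        send_request()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(total_requests)))
    wall_seconds = time.perf_counter() - wall_start
    return summarize(latencies, wall_seconds)


def fake_request():
    # Hypothetical stand-in for an HTTP or gRPC call to a model server.
    time.sleep(0.001)


if __name__ == "__main__":
    print(run_benchmark(fake_request, total_requests=50, concurrency=4))
```

To compare frameworks, `fake_request` would be replaced by a function that sends one inference request to the server under test, with the same payload and concurrency used for each framework.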