Zhaobo edited this page May 27, 2022 · 8 revisions

Welcome to the Alnair wiki!

Project Scope

  • Resource utilization improvement
    • GPU sharing
  • Intelligent scheduling (workload placement)
    • co-scheduling, predictive placement, complementary placement
  • Cross-stack multi-functional profiler
    • built-in intelligence, open metrics standard, application-level profiling with user transparency, and cross-level correlation
  • In-memory distributed caching
    • content-based hashing, intelligent shared caching layer
  • Elastic training framework
    • dynamic resource allocation without training interruption
  • Secure container runtime with RDMA support

Build and Setup

Documents

Below are overviews, talks, and design documents that will help you understand the key features of Alnair.
