@DeepLM

Deeplm.ai

On a mission to make deep learning more available and accessible to everyone

Popular repositories

  1. Insights Public

    Real-time Grafana dashboards and Prometheus metrics for HPC/SLURM GPU clusters. Track job performance, GPU utilization, power consumption, and checkpoint efficiency. Docker Compose deploy, optional…

    Python 1 1

  2. Baseline Public

    Baseline your GPU cluster's real performance in one run. Tests compute throughput (TFLOPS, HBM bandwidth, thermal throttling), interconnect health (NVLink, NVSwitch, PCIe, NUMA), and network scalin…

Repositories

Showing 2 of 2 repositories
  • Insights Public

    Real-time Grafana dashboards and Prometheus metrics for HPC/SLURM GPU clusters. Track job performance, GPU utilization, power consumption, and checkpoint efficiency. Docker Compose deploy, optional NVIDIA BCM integration, Cassandra-backed historical analysis.

    Python 1 Apache-2.0 1 0 0 Updated Apr 22, 2026
  • Baseline Public

    Baseline your GPU cluster's real performance in one run. Tests compute throughput (TFLOPS, HBM bandwidth, thermal throttling), interconnect health (NVLink, NVSwitch, PCIe, NUMA), and network scaling (IB/RDMA, NCCL, AllReduce). Pass/fail thresholds against vendor specs. Works on SLURM and Kubernetes with NVIDIA, AMD, or Intel GPUs.

    0 0 0 0 Updated Apr 15, 2026

People

This organization has no public members. You must be a member to see who’s a part of this organization.
