Distributed ML Training and Fine-Tuning on Kubernetes
Updated Jun 21, 2024 - Go
PyTorch is an open source machine learning library based on the Torch library and used for applications such as computer vision and natural language processing. It was primarily developed by Facebook's AI Research lab.
Automated Machine Learning on Kubernetes
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
Fabric for Deep Learning (FfDL, pronounced "fiddle") is a deep learning platform offering TensorFlow, Caffe, PyTorch, etc. as a service on Kubernetes
The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.
Go binding for the PyTorch C++ API (LibTorch)
LibTorch (PyTorch) bindings for Golang
Tools for easing the handoff between AI/ML and App/SRE teams.
Experiment tracking server focused on speed and scalability
Machine Learning Operator & Controller for Kubernetes
PyTorch in Go, using LibTorch.
Go binding for PyTorch
emergent GUI and other support for PyTorch networks: provides a NetView for Torch networks
GPU scheduler for elastic/distributed deep learning workloads in Kubernetes cluster (IC2E'23)
Example applications using the onnxruntime_go library.
Go client for the Hugging Face Hub, aiming for a minimal subset of the features of the `huggingface-hub` Python package
A simple Go client for the clip-as-service server
Created by Facebook's AI Research lab (FAIR)
Released September 2016
Latest release 24 days ago