Stars
Machine Learning Toolkit for Kubernetes
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Dragonfly is an open source P2P-based file distribution and image acceleration system. It is hosted by the Cloud Native Computing Foundation (CNCF) as an Incubating Level Project.
Harness Open Source is an end-to-end developer platform with Source Control Management, CI/CD Pipelines, Hosted Developer Environments, and Artifact Registries.
A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.
A library for efficient similarity search and clustering of dense vectors.
Bulk port forwarding Kubernetes services for local development.
Deploy a Production Ready Kubernetes Cluster
The machine learning toolkit for time series analysis in Python
A transformative log viewer for Kubernetes
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
Prowler is an Open Cloud Security tool for AWS, Azure, GCP and Kubernetes. It helps with continuous monitoring, security assessments and audits, incident response, compliance, hardening and forensics…
Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
Prometheus exporter for AWS CloudWatch - Discovers services through AWS tags, gets CloudWatch metrics data and provides them as Prometheus metrics with AWS tags as labels
Metrics exporter for Amazon AWS CloudWatch
☁️ 40+ Grafana dashboards for AWS CloudWatch metrics: EC2, Lambda, S3, ELB, EMR, EBS, SNS, SES, SQS, RDS, EFS, ElastiCache, Billing, API Gateway, VPN, Step Functions, Route 53, CodeBuild, ...
Packer configuration for building a custom EKS AMI
Terraform and OpenTofu provider for bootstrapping Flux
The AWS Provider enables Terraform to manage AWS resources.
Machine Learning Engineering Open Book
Prometheus exporter for performance metrics from Slurm.
This repo includes everything you need to know about deploying GPU nodes on OCI
Prometheus-based Kubernetes Resource Recommendations
Guide for getting started with common network testing tools