🧬 Kubefold

Kubefold is a Kubernetes operator for managing protein structure prediction workflows. It provides a declarative way to handle protein databases and run conformation predictions in a Kubernetes cluster.

🚀 Features

Protein Database Management - Download and manage various protein databases (UniProt, PDB, BFD, etc.)
Conformation Prediction - Run protein structure predictions with configurable parameters
Cloud Integration - Store results in S3 and receive notifications via SMS
Scalable Architecture - Built on Kubernetes for horizontal scaling and resource management

📋 Prerequisites

Kubernetes cluster (v1.11.3+)
kubectl (v1.11.3+)
Access to an S3 bucket for storing results
AWS credentials for S3 and SMS notifications (if using these features)

🛠️ Installation

kubectl apply -f https://raw.githubusercontent.com/kubefold/operator/main/dist/install.yaml

📝 Usage

Protein Database

Create a ProteinDatabase resource to download and manage protein databases:

apiVersion: data.kubefold.io/v1
kind: ProteinDatabase
metadata:
  name: my-database
spec:
  datasets:
    uniprot: true
    pdb: true
  volume:
    storageClassName: fsx-sc

Protein Conformation Prediction

Run a protein structure prediction:

apiVersion: data.kubefold.io/v1
kind: ProteinConformationPrediction
metadata:
  name: my-prediction
spec:
  database: my-database
  protein:
    id: ['A']
    sequence: "YOUR_PROTEIN_SEQUENCE"
  model:
    volume:
      storageClassName: fsx-sc
    weights:
      http: "https://your-model-weights.bin.zst"
  destination:
    s3:
      bucket: your-bucket
      region: your-region
  notify:
    region: your-region
    sms:
      - "+1234567890"

🧹 Cleanup

To uninstall the operator:

kubectl delete -f https://raw.githubusercontent.com/kubefold/operator/main/dist/install.yaml

☁️ Running on AWS EKS

This project provides resources for easy deployment on Amazon EKS, including GPU and FSx Lustre support for high-performance workloads.

eks/cluster.yaml: Example EKS cluster configuration (with CPU and GPU node groups, IAM policies, and VPC setup) for use with eksctl.
eks/fsx.storageclass.k8s.yaml: StorageClass for FSx for Lustre, enabling fast, shared storage for protein data and models.
eks/sample-lustre-volume.k8s.yaml: Sample PersistentVolumeClaim using the FSx StorageClass.
up.sh: Automated script to provision the EKS cluster, install the FSx CSI driver, set up storage, deploy Kubefold, and apply a sample resource. Run this script for a one-command setup (requires AWS CLI, eksctl, and kubectl configured).

Note:

The provided EKS configuration includes a GPU node group using g5.xlarge instances. You must request a quota increase for g5 instances in your AWS region before running the setup.

Quick start: (Recommended) Run the automated setup script:

# Automatically:
# - creates EKS cluster with GPU nodegroup
# - installs AWS FSx CSI Driver for automated FSx for Lustre volume provisioning
# - create fsx-sc storage class
# - installs kubefold controller
# - deploys sample ProteinDatabase
./up.sh

These resources help you get started with scalable, cloud-native protein prediction workflows on AWS.

Name		Name	Last commit message	Last commit date
Latest commit History 85 Commits
.devcontainer		.devcontainer
.github/workflows		.github/workflows
api/v1		api/v1
assets		assets
cmd		cmd
config		config
dist		dist
eks		eks
hack		hack
internal		internal
test		test
.cursorrules		.cursorrules
.dockerignore		.dockerignore
.gitignore		.gitignore
.golangci.yml		.golangci.yml
Dockerfile		Dockerfile
Makefile		Makefile
PROJECT		PROJECT
README.md		README.md
down.sh		down.sh
go.mod		go.mod
go.sum		go.sum
up.sh		up.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🧬 Kubefold

🚀 Features

📋 Prerequisites

🛠️ Installation

📝 Usage

Protein Database

Protein Conformation Prediction

🧹 Cleanup

☁️ Running on AWS EKS

About

Uh oh!

Packages

Uh oh!

Uh oh!

Languages

kubefold/operator

Folders and files

Latest commit

History

Repository files navigation

🧬 Kubefold

🚀 Features

📋 Prerequisites

🛠️ Installation

📝 Usage

Protein Database

Protein Conformation Prediction

🧹 Cleanup

☁️ Running on AWS EKS

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Packages 0

Uh oh!

Uh oh!

Languages

Packages