A Kubernetes admission webhook that prevents accidental data loss from PersistentVolume deletions
pv-safe acts as a safety gate for Kubernetes storage operations, automatically blocking risky deletion attempts and providing clear guidance for safe data management.
- Automatic Risk Assessment - Analyzes PV reclaim policies and VolumeSnapshot availability before allowing deletions
- Smart Blocking - Prevents data loss while allowing safe operations to proceed
- VolumeSnapshot Aware - Recognizes when backups exist and permits deletion accordingly
- Clear Error Messages - Provides actionable guidance with specific commands to resolve issues
- Minimal Overhead - Read-only permissions, no data modification
- Graceful Degradation - Works without VolumeSnapshot CRDs installed
- Kubernetes 1.19+
- Helm 3.0+
- cert-manager (optional but recommended)
# Install cert-manager if not already installed
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
# Wait for cert-manager to be ready
kubectl wait --for=condition=available --timeout=300s \
deployment/cert-manager -n cert-manager
# Install pv-safe from GitHub Release
helm install pv-safe https://github.com/automationpi/pv-safe/releases/download/v0.1.0/pv-safe-0.1.0.tgz
# Verify installation
kubectl get pods -n pv-safe-system
For development or customization:
# Clone and build
git clone https://github.com/automationpi/pv-safe.git
cd pv-safe
make webhook-build
make webhook-deploy
pv-safe automatically intercepts DELETE operations on Namespaces, PVCs, and PVs:
# This will be blocked if the PVC has a Delete reclaim policy and no snapshot
$ kubectl delete pvc my-data -n production
Error from server (Forbidden): admission webhook "validate.pv-safe.io" denied the request:
DELETION BLOCKED: PVC 'production/my-data' would lose data permanently
Reason: PV has Delete reclaim policy, no snapshot found
To safely delete this PVC:
1. Create a VolumeSnapshot of the data
2. OR change PV reclaim policy to Retain:
kubectl patch pv pvc-xxx -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
3. OR force delete (will lose data):
kubectl label pvc my-data -n production pv-safe.io/force-delete=true
kubectl delete pvc my-data -n production
pv-safe uses a ValidatingWebhookConfiguration to intercept DELETE operations and applies the following logic:
┌─────────────────────────────────────────────────────┐
│ DELETE Request (Namespace/PVC/PV)                   │
└───────────────────────┬─────────────────────────────┘
                        │
                        v
┌─────────────────────────────────────────────────────┐
│ Check bypass label (pv-safe.io/force-delete=true)   │
├─────────────────────────────────────────────────────┤
│ YES → ALLOW (with audit log)                        │
│ NO  → Continue to risk assessment                   │
└───────────────────────┬─────────────────────────────┘
                        │
                        v
┌─────────────────────────────────────────────────────┐
│ Risk Assessment                                     │
├─────────────────────────────────────────────────────┤
│ 1. PV reclaim policy = Retain?   → ALLOW            │
│ 2. Ready VolumeSnapshot exists?  → ALLOW            │
│ 3. Otherwise                     → BLOCK            │
└─────────────────────────────────────────────────────┘
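For reference, the interception is declared through a standard ValidatingWebhookConfiguration. The sketch below is illustrative only: the Service name, path, and port are assumptions, and the real manifest lives in deploy/05-webhook-config.yaml.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pv-safe-validating-webhook
webhooks:
  - name: validate.pv-safe.io
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["DELETE"]
        resources: ["persistentvolumeclaims", "persistentvolumes", "namespaces"]
    clientConfig:
      service:
        name: pv-safe-webhook        # assumed Service name
        namespace: pv-safe-system
        path: /validate              # assumed handler path
    failurePolicy: Fail              # fail-closed default (see the performance notes below)
    sideEffects: None
    admissionReviewVersions: ["v1"]
    timeoutSeconds: 10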
A deletion is considered risky when:
- PersistentVolume has reclaimPolicy: Delete, AND
- No ready VolumeSnapshot with deletionPolicy: Retain exists
A deletion is considered safe when:
- PersistentVolume has reclaimPolicy: Retain, OR
- A ready VolumeSnapshot with deletionPolicy: Retain exists, OR
- Bypass label pv-safe.io/force-delete=true is present
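To see how these conditions play out for a specific PVC, you can check them by hand. A minimal sketch, reusing the my-data PVC in the production namespace from the examples above (the snapshot listing requires the VolumeSnapshot CRDs):
# Find the bound PV and its reclaim policy
PV_NAME=$(kubectl get pvc my-data -n production -o jsonpath='{.spec.volumeName}')
kubectl get pv "$PV_NAME" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}{"\n"}'
# List snapshots in the namespace with their source PVC and readiness
kubectl get volumesnapshot -n production \
  -o custom-columns='NAME:.metadata.name,SOURCE:.spec.source.persistentVolumeClaimName,READY:.status.readyToUse'
# Delete reclaim policy plus no ready snapshot for the PVC means pv-safe blocks the deletion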
Create a VolumeSnapshot before deleting:
# Create a snapshot
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
name: my-data-backup
namespace: production
spec:
volumeSnapshotClassName: csi-snapclass
source:
persistentVolumeClaimName: my-data
EOF
# Wait for snapshot to be ready
kubectl wait --for=jsonpath='{.status.readyToUse}'=true \
volumesnapshot/my-data-backup -n production --timeout=300s
# Now deletion is allowed
kubectl delete pvc my-data -n production
# persistentvolumeclaim "my-data" deleted
Make deletion safe by changing the reclaim policy:
# Get the PV name
PV_NAME=$(kubectl get pvc my-data -n production -o jsonpath='{.spec.volumeName}')
# Change reclaim policy to Retain
kubectl patch pv $PV_NAME -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
# Now deletion is allowed
kubectl delete pvc my-data -n production
# persistentvolumeclaim "my-data" deleted
When you're certain you want to delete without backup:
# Add bypass label (explicit acknowledgment of data loss)
kubectl label pvc my-data -n production pv-safe.io/force-delete=true
# Delete the PVC
kubectl delete pvc my-data -n production
# persistentvolumeclaim "my-data" deleted
When a namespace is deleted, pv-safe checks every PVC it contains:
$ kubectl delete namespace staging
Error from server (Forbidden): admission webhook "validate.pv-safe.io" denied the request:
DELETION BLOCKED: Namespace 'staging' contains 3 PVC(s) that would lose data permanently
Risky PVCs:
- postgres-data: PV has Delete reclaim policy, no snapshot found
- redis-data: PV has Delete reclaim policy, no snapshot found
- app-cache: PV has Delete reclaim policy, no snapshot found
To safely delete this resource:
1. Create VolumeSnapshots for the PVCs
2. OR change PV reclaim policy to Retain for each PVC
3. OR force delete (will lose data):
kubectl label namespace staging pv-safe.io/force-delete=true
kubectl delete namespace staging
By default, these namespaces are excluded from validation:
- kube-system
- kube-public
- kube-node-lease
- pv-safe-system
To modify exclusions, edit deploy/05-webhook-config.yaml.
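For illustration, this kind of exclusion is typically expressed as a namespaceSelector on the webhook entry in the ValidatingWebhookConfiguration; whether pv-safe uses exactly this mechanism is defined by deploy/05-webhook-config.yaml. A sketch of such a selector (it relies on the kubernetes.io/metadata.name label that Kubernetes 1.21+ sets on every namespace):
namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
        - kube-system
        - kube-public
        - kube-node-lease
        - pv-safe-system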
For VolumeSnapshot support, you need:
- Install VolumeSnapshot CRDs:
VERSION=v6.3.0
BASE_URL=https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/${VERSION}
kubectl apply -f ${BASE_URL}/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f ${BASE_URL}/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f ${BASE_URL}/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f ${BASE_URL}/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f ${BASE_URL}/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
- Create a VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
name: csi-snapclass
driver: <your-csi-driver> # e.g., ebs.csi.aws.com, pd.csi.storage.gke.io
deletionPolicy: Retain
Note: pv-safe works without the VolumeSnapshot CRDs installed, but then falls back to reclaim-policy checks only.
- Architecture - Internal design and how pv-safe works
- Development - Local setup, testing, and contributing
- Troubleshooting - Common issues and solutions
# Follow webhook logs
kubectl logs -n pv-safe-system -l app=pv-safe-webhook -f
# View blocked deletions
kubectl logs -n pv-safe-system -l app=pv-safe-webhook --since=24h | grep BLOCKING
# View bypass usage
kubectl logs -n pv-safe-system -l app=pv-safe-webhook --since=24h | grep BYPASS
# Check webhook status
kubectl get pods -n pv-safe-system
# Check webhook configuration
kubectl get validatingwebhookconfiguration pv-safe-validating-webhook
# Create a kind cluster with test fixtures
make setup
# Build and deploy webhook
make webhook-build
make webhook-deploy
# View logs
make webhook-logs
# Run tests
make test
# Cleanup
make teardown
pv-safe/
├── cmd/webhook/ # Webhook server entry point
├── internal/webhook/ # Core webhook logic
│ ├── handler.go # Admission request handler
│ ├── risk.go # Risk assessment engine
│ ├── snapshot.go # VolumeSnapshot detection
│ └── client.go # Kubernetes client
├── deploy/ # Kubernetes manifests
├── docs/ # Documentation
├── scripts/ # Build and deployment scripts
└── test/fixtures/ # Test scenarios
Check if the webhook is running and configured:
kubectl get pods -n pv-safe-system
kubectl get validatingwebhookconfiguration pv-safe-validating-webhook
kubectl logs -n pv-safe-system -l app=pv-safe-webhook
Verify the snapshot is ready and has the correct deletion policy:
# Check snapshot status
kubectl get volumesnapshot <name> -n <namespace> -o yaml
# Verify it's ready
kubectl get volumesnapshot <name> -n <namespace> \
-o jsonpath='{.status.readyToUse}'
# Check snapshot class deletion policy
kubectl get volumesnapshotclass <class-name> -o yaml | grep deletionPolicy
Check webhook RBAC permissions:
kubectl auth can-i get pv \
--as=system:serviceaccount:pv-safe-system:pv-safe-webhook
kubectl auth can-i list pvc \
--as=system:serviceaccount:pv-safe-system:pv-safe-webhook
For more troubleshooting, see the Operator Guide.
helm uninstall pv-safe
# Using make
make webhook-delete
# Or manually
kubectl delete namespace pv-safe-system
kubectl delete validatingwebhookconfiguration pv-safe-validating-webhook
- Admission Webhook: Intercepts DELETE operations via Kubernetes ValidatingWebhookConfiguration
- Risk Calculator: Analyzes PV reclaim policies and VolumeSnapshot availability
- Snapshot Checker: Queries VolumeSnapshot API (if available) to verify backups exist
- Handler: Processes admission requests and generates allow/deny responses
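For context, the Handler's allow/deny decision is returned to the API server as a standard AdmissionReview response; a denial looks roughly like this (illustrative shape, not a literal capture of pv-safe output):
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<request-uid>",
    "allowed": false,
    "status": {
      "code": 403,
      "message": "DELETION BLOCKED: PVC 'production/my-data' would lose data permanently"
    }
  }
}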
pv-safe operates with minimal permissions:
- Read-only access to PVs, PVCs, Namespaces, and VolumeSnapshots
- No data modification capabilities
- TLS-secured webhook endpoint (managed by cert-manager)
- Audit trail for all bypass operations
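The read-only footprint corresponds to an RBAC role shaped roughly like the sketch below; the names here are assumptions, and the actual ClusterRole ships with the Helm chart and the deploy/ manifests.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pv-safe-webhook              # assumed name
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes", "persistentvolumeclaims", "namespaces"]
    verbs: ["get", "list"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots", "volumesnapshotclasses"]
    verbs: ["get", "list"]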
- Latency: Typically <100ms per request
- Timeout: 5-second timeout for risk assessment (10-second webhook timeout)
- Failure Mode: Configurable (default: fail-closed blocks deletions if webhook is down)
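If fail-closed is too strict for your environment, the behavior maps to the standard webhook failurePolicy field. Assuming pv-safe uses the usual ValidatingWebhookConfiguration (as sketched earlier), switching to fail-open could look like this; if the object is managed by the Helm chart, prefer making the change through the chart so it persists across upgrades:
# Ignore = fail-open: deletions proceed unchecked while the webhook is unavailable
kubectl patch validatingwebhookconfiguration pv-safe-validating-webhook \
  --type=json \
  -p='[{"op": "replace", "path": "/webhooks/0/failurePolicy", "value": "Ignore"}]'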
Contributions are welcome! Please read our contributing guidelines before submitting PRs.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (make test)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Built with controller-runtime
- Inspired by Kubernetes admission webhook best practices
- Certificate management by cert-manager
Status: Active Development | Version: 0.1.0 | Kubernetes: 1.19+