Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backup is backing up all namespaces when includedNamespaces are omitted and using only labelSelector #7105

Closed
sbahar619 opened this issue Nov 15, 2023 · 11 comments · Fixed by #7697

Comments

@sbahar619
Copy link
Contributor

I created a backup:

oc get backup b2 -n openshift-adp -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    helm.sh/resource-policy: keep
    meta.helm.sh/release-name: aws
    meta.helm.sh/release-namespace: openshift-adp
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.27.6+f67aeb3
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "27"
  creationTimestamp: "2023-11-15T12:58:54Z"
  generation: 8
  labels:
    app.kubernetes.io/managed-by: Helm
    velero.io/storage-location: main-bsl
  name: b2
  namespace: openshift-adp
  resourceVersion: "139158"
  uid: e39ff300-d58b-4969-aafc-22ba048925ce
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  itemOperationTimeout: 4h0m0s
  labelSelector:
    matchLabels:
      app: mysql
  snapshotMoveData: true
  storageLocation: main-bsl
  ttl: 720h0m0s
status:
  backupItemOperationsAttempted: 2
  backupItemOperationsCompleted: 2
  completionTimestamp: "2023-11-15T12:59:49Z"
  expiration: "2023-12-15T12:58:54Z"
  formatVersion: 1.1.0
  phase: Completed
  progress:
    itemsBackedUp: 83
    totalItems: 83
  startTimestamp: "2023-11-15T12:58:54Z"
  version: 1

Without includedNamespaces, but only with labelSelector.
The backup included all namespaces in the cluster, unrelated to the label selector.

I expected all resources being backed up to be related only to the label selector.

velero backup describe b2 --details 
Name:         b2
Namespace:    openshift-adp
Labels:       app.kubernetes.io/managed-by=Helm
              velero.io/storage-location=main-bsl
Annotations:  helm.sh/resource-policy=keep
              meta.helm.sh/release-name=aws
              meta.helm.sh/release-namespace=openshift-adp
              velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.27.6+f67aeb3
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=27

Phase:  Completed


Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  app=mysql

Or label selector:  <none>

Storage Location:  main-bsl

Velero-Native Snapshot PVs:  auto
Snapshot Move Data:          true
Data Mover:                  velero

TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2023-11-15 12:58:54 +0000 UTC
Completed:  2023-11-15 12:59:49 +0000 UTC

Expiration:  2023-12-15 12:58:54 +0000 UTC

Total items to be backed up:  83
Items backed up:              83

Backup Item Operations:
  Operation for persistentvolumeclaims a1/mysql:
    Backup Item Action Plugin:  velero.io/csi-pvc-backupper
    Operation ID:               du-e39ff300-d58b-4969-aafc-22ba048925ce.4085af8a-65dc-45a7d59ac
    Items to Update:
                           datauploads.velero.io openshift-adp/b2-mddcq
    Phase:                 Completed
    Progress:              107854713 of 107854713 complete (Bytes)
    Progress description:  Completed
    Created:               2023-11-15 12:59:09 +0000 UTC
    Started:               2023-11-15 12:59:09 +0000 UTC
    Updated:               2023-11-15 12:59:25 +0000 UTC
  Operation for persistentvolumeclaims a1/mysql-1:
    Backup Item Action Plugin:  velero.io/csi-pvc-backupper
    Operation ID:               du-e39ff300-d58b-4969-aafc-22ba048925ce.d3d8fbc5-11db-47627ecb0
    Items to Update:
                           datauploads.velero.io openshift-adp/b2-6dxhh
    Phase:                 Completed
    Progress description:  Completed
    Created:               2023-11-15 12:59:19 +0000 UTC
    Started:               2023-11-15 12:59:19 +0000 UTC
    Updated:               2023-11-15 12:59:39 +0000 UTC
Resource List:
  apps/v1/Deployment:
    - a1/mysql
    - a1/nginx
  apps/v1/ReplicaSet:
    - a1/mysql-68d84d7c89
  discovery.k8s.io/v1/EndpointSlice:
    - a1/mysql-9wwf7
  v1/Endpoints:
    - a1/mysql
  v1/Namespace:
    - a1
    - default
    - grafana
    - kube-node-lease
    - kube-public
    - kube-system
    - openshift
    - openshift-adp
    - openshift-apiserver
    - openshift-apiserver-operator
    - openshift-authentication
    - openshift-authentication-operator
    - openshift-cloud-controller-manager
    - openshift-cloud-controller-manager-operator
    - openshift-cloud-credential-operator
    - openshift-cloud-network-config-controller
    - openshift-cluster-csi-drivers
    - openshift-cluster-machine-approver
    - openshift-cluster-node-tuning-operator
    - openshift-cluster-samples-operator
    - openshift-cluster-storage-operator
    - openshift-cluster-version
    - openshift-config
    - openshift-config-managed
    - openshift-config-operator
    - openshift-console
    - openshift-console-operator
    - openshift-console-user-settings
    - openshift-controller-manager
    - openshift-controller-manager-operator
    - openshift-dns
    - openshift-dns-operator
    - openshift-etcd
    - openshift-etcd-operator
    - openshift-host-network
    - openshift-image-registry
    - openshift-infra
    - openshift-ingress
    - openshift-ingress-canary
    - openshift-ingress-operator
    - openshift-insights
    - openshift-kni-infra
    - openshift-kube-apiserver
    - openshift-kube-apiserver-operator
    - openshift-kube-controller-manager
    - openshift-kube-controller-manager-operator
    - openshift-kube-scheduler
    - openshift-kube-scheduler-operator
    - openshift-kube-storage-version-migrator
    - openshift-kube-storage-version-migrator-operator
    - openshift-machine-api
    - openshift-machine-config-operator
    - openshift-marketplace
    - openshift-monitoring
    - openshift-multus
    - openshift-network-diagnostics
    - openshift-network-node-identity
    - openshift-network-operator
    - openshift-node
    - openshift-nutanix-infra
    - openshift-oauth-apiserver
    - openshift-openstack-infra
    - openshift-operator-lifecycle-manager
    - openshift-operators
    - openshift-ovirt-infra
    - openshift-route-controller-manager
    - openshift-sdn
    - openshift-service-ca
    - openshift-service-ca-operator
    - openshift-user-workload-monitoring
    - openshift-vsphere-infra
  v1/PersistentVolume:
    - pvc-4085af8a-65dc-45a8-abcf-aaca231aa523
    - pvc-d3d8fbc5-11db-4761-9627-e5d11f98b112
  v1/PersistentVolumeClaim:
    - a1/mysql
    - a1/mysql-1
  v1/Pod:
    - a1/mysql-68d84d7c89-l992v
  v1/Secret:
    - a1/mysql
  v1/Service:
    - a1/mysql

Velero-Native Snapshots: <none included>

Environment:

  • Velero version (use velero version): v1.12.1
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version: v1.27.6+f67aeb3
  • Cloud provider or hardware configuration: Openshift on AWS
  • OS (e.g. from /etc/os-release): Red Hat Enterprise Linux CoreOS
@sseago
Copy link
Collaborator

sseago commented Nov 15, 2023

I think this is a bit of an edge case, but we may want a fix for it.
Essentially you have a backup that includes all namespaces but with a label selector. It seems reasonable to expect to only include namespaces in the backup that have matching resources.

In the non-label-selector use case, if a namespace is included in the backup, then we include it even if empty. Since label selectors are applied to the resources within namespaces, and we need the namespace any time we're backing something up in the namespace, we need to include the namespaces with resources.

So one possible fix here is to exclude the namespace resource for the backup if all of the following are true:

  1. A label selector is specified for the backup.
  2. The namespace does not have the label itself.
  3. Nothing in the namespace is included in the backup.

@kaovilai
Copy link
Contributor

In this case

backup that includes all namespaces

was not the intention nor was it explicitly specified as "all namespaces backup", as they expected velero to only include namespace the label selector resources are in.

The namespace does not have the label itself.

So is the current behavior such that if at least one namespace have a label, other namespaces are excluded?

@sseago
Copy link
Collaborator

sseago commented Nov 15, 2023

@kaovilai but it is an "all namespaces" backup -- i.e. selector and namespaces are both applied, result is the intersection of the two. It's "all resources with this label in all namespaces". You can also do a "some namespaces" backup with selectors -- i.e. only include namespaces "foo,bar" but also within those only with a certain label.

"So is the current behavior such that if at least one namespace have a label, other namespaces are excluded?"
I'd have to look, but I don't think so. I think we currently include all namespaces included in the backup because the expectation is that if I label my deployments for inclusion, I shouldn't have to label the namespace as well. I added that to the list of conditions we'd use to exclude an otherwise-included namespace to give people a way to back up empty namespaces if they wanted to -- i.e. to preserve existing behavior. But this was just one suggestion for what to change.

@kaovilai
Copy link
Contributor

--exclude-empty-ns flag :D

@sseago
Copy link
Collaborator

sseago commented Nov 15, 2023

@kaovilai that could work -- and it would be applicable to both label and non-label backups -- no logic change around obscure edge cases, just a new flag on the backup.

@blackpiglet
Copy link
Contributor

It's reasonable to enhance, but I think there is no need to introduce a new parameter for this.
We can fix the issue as the default behavior in future.

@kaovilai
Copy link
Contributor

We can fix the issue as the default behavior in future.

That would be "breaking change" then. But I agree that most people would not want an empty NS backup.

@sseago
Copy link
Collaborator

sseago commented Jan 17, 2024

It would appear that this is a change in behavior from prior versions. From what another OADP user is seeing, this didn't happen in Velero 1.11, so this is possibly a regression.

@sseago
Copy link
Collaborator

sseago commented Jan 17, 2024

@kaovilai if we're reverting to prior behavior, though, it's less a breaking change than it is a fix for a regression.

@blackpiglet
Copy link
Contributor

blackpiglet commented Feb 2, 2024

The namespaces should be gathered after the item collection to resolve this problem.

The method getResourceItems should collect and return namespaces; the function getItems should then gather the namespace resources that are returned and attach them to its return value []*kubernetesResource.

Concurrently, item collectors can benefit from a few potential enhancements:

  • Velero can use FieldSelector to query resources in many namespaces with a single query to the kube-apiserver, rather than making calls per namespace as is currently the case when multiple namespaces are handed in as the resource filter.
  • The item collector should apply the resource type filter earlier. Function getResourceItems applies the filter now, we can bring it forward to the function getGroupItems.

@reasonerjt
Copy link
Contributor

This is a valid issue.
Let's double check if we can fix it without having to change the item collection procedure, i.e. we can remove the empty ns from memory or can easily remove the tmp file before it's put into the backup tarball.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment