Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OADP-459] Caches getters, move MigrationRegistry annotation from common to imagestream plugin #133

Merged
merged 12 commits into from
May 9, 2022

Conversation

kaovilai
Copy link
Member

@kaovilai kaovilai commented Apr 26, 2022

See https://github.com/openshift/oadp-operator/blob/master/docs/config/custom_plugin_images.md on how to use custom images for testing

snippet of DPA with image built from this PR

apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
spec:
  unsupportedOverrides:
    openshiftPluginImageFqin: ghcr.io/kaovilai/openshift-velero-plugin:registrytmp

OADP-459

  • The following shared functions now remembers the value after making first api request.
    getOADPRegistryRoute, GetRegistryInfo, GetServerVersion, GetBackup, getBackupStorageLocationForBackup, internalRegistrySystemContext
  • moved registry bits to imagestream

@kaovilai
Copy link
Member Author

Initial testing indicates this PR works

$ cat /tmp/openshift.io/velero-plugin/<uid>/openshift-adp/velero-sample-1/oadp-registry-config/registry.txt 
<registryhostname>sh-4.4$ 

@sseago
Copy link
Contributor

sseago commented Apr 26, 2022

Looking at the different things being done in the common backup/restore plugins, the main thing that all pods, deployments, etc. need that we could cache is probably the result of the GetServerVersion and GetRegistryInfo calls. It looks like the OADP registry route (anything done with the MigrationRegistry annotation) is only used by the ImageStream plugin, so I'm thinking that we should probably move the part of common backup/restore plugins which set this to the imagestream backup/restore plugins, since there's no need to do that part at all for 10K deployments, etc.

kaovilai and others added 4 commits April 27, 2022 23:09
Cache GetRegistryInfo, GetServerVersion, GetBackup, getBackupStorageLocationForBackup, internalRegistrySystemContext
@kaovilai kaovilai changed the title Saves found route to registry in tmp Getters caches per backup static values in /tmp Apr 28, 2022
@kaovilai kaovilai changed the title Getters caches per backup static values in /tmp Caches getters, move MigrationRegistry annotation from common to imagestream plugin Apr 28, 2022
if err != nil {
return nil, err
}
annotations[RestoreRegistryHostname] = registryHostname

if input.Restore.Labels[MigrationApplicationLabelKey] != MigrationApplicationLabelValue {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to imagestream, left migLabel alone

@@ -36,3 +70,96 @@ func migrationRegistrySystemContext() (*types.SystemContext, error) {
return ctx, nil
}

// Takes Namesapce where the operator resides, name of the BackupStorageLocation and name of configMap as input and returns the Route of backup registry.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved here so we know only imagestream depends on this.

Copy link
Member Author

@kaovilai kaovilai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes

@kaovilai
Copy link
Member Author

using plugin built from this PR latest commit backup of parks-app with restify DC scaled to 700 pods
❯ velero backup create test-parksapp-pods --include-namespaces=parks-app --include-resources=pod -n openshift-adp;

Started:    2022-04-28 02:09:29 -0400 EDT
Completed:  2022-04-28 02:10:00 -0400 EDT

Expiration:  2022-05-28 02:09:29 -0400 EDT

Total items to be backed up:  704
Items backed up:              704

about 30 seconds or so.

using instead quay.io/konveyor/openshift-velero-plugin:latest yields

Started:    2022-04-28 02:13:45 -0400 EDT
Completed:  2022-04-28 02:16:30 -0400 EDT

Expiration:  2022-05-28 02:13:45 -0400 EDT

Total items to be backed up:  704
Items backed up:              704

almost 3 minutes.

@kaovilai
Copy link
Member Author

Copy link
Contributor

@sseago sseago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. Might be worth making that one trivial change I suggested, removing the switch statement. Won't change much in terms of performance, but it will make the code more readable.

velero-plugins/common/restore.go Outdated Show resolved Hide resolved
@kaovilai
Copy link
Member Author

This PR is being updated to use structs in memory instead of writing to disk.

@kaovilai kaovilai marked this pull request as draft April 28, 2022 14:30
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 28, 2022
@kaovilai kaovilai marked this pull request as ready for review April 28, 2022 16:01
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 28, 2022
@kaovilai kaovilai requested a review from sseago April 28, 2022 16:01
@kaovilai
Copy link
Member Author

As of 2b131ef
31 seconds. exactly the same as when writing to disk

Started:    2022-04-28 12:07:44 -0400 EDT
Completed:  2022-04-28 12:08:15 -0400 EDT

Expiration:  2022-05-28 12:07:37 -0400 EDT

Total items to be backed up:  704
Items backed up:              704

sseago
sseago previously requested changes Apr 28, 2022
Copy link
Contributor

@sseago sseago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach makes sense overall, but lets cache these values in the already-existing plugin structs rather than creating global vars.

"k8s.io/client-go/rest"
)
var (
registryInfo *string
serverVersion *serverVersionStruct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of saving these in global vars, why not add them to the BackupPlugin and RestorePlugin structs? Seems like that would be a much cleaner design.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is, these can be reused for the next backup if my understanding is right. Having them in BackupPlugin and RestorePlugin may result in higher memory usage.

"k8s.io/client-go/rest"
)

var (
internalRegistrySystemContextVar *types.SystemContext
oadpRegistryRoute *string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with common, I'd rather see this added to the BackupPlugin and RestorePlugin structs than use a global var.

@kaovilai kaovilai requested a review from sseago May 2, 2022 21:04
@kaovilai
Copy link
Member Author

kaovilai commented May 2, 2022

Results from Travis in slack

pre-fix: ~35 minutes was for 10k deployments. ~4 minutes for 1k deployments
post-fix: haven't tested 10k yet, but 1k took 30 seconds, which is about the same as velero on non-image resources (e.g. configmaps, secrets, etc.) that I've tested

@kaovilai kaovilai changed the title Caches getters, move MigrationRegistry annotation from common to imagestream plugin [OADP-459] Caches getters, move MigrationRegistry annotation from common to imagestream plugin May 9, 2022
@dymurray dymurray merged commit c231cbe into openshift:master May 9, 2022
dymurray pushed a commit that referenced this pull request May 9, 2022
…mon to imagestream plugin (#133)

* registry tmp proto

* use get for restore

* add error text to log

* reset err when reading route

* Make dir before writing file

* Cache GetRegistryInfo, GetServerVersion, GetBackup, getBackupStorageLocationForBackup

* export WriteByteToDirPath

* Cache internalRegistrySystemContext

* move MigrationRegistry, SkipImageCopy to imagestream plugin, refactor condition for restore migLabel

* remove redundant svcAccount switch/case

* reduce params for GetRegistryInfo, use var instead of writing to disk

(cherry picked from commit c231cbe)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants