Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat/support-v3-envoy-resources #56

Merged
merged 44 commits into from
Nov 18, 2020
Merged

Conversation

roivaz
Copy link
Member

@roivaz roivaz commented Nov 13, 2020

This PR adds support for using envoy v3 resources in the envoy.marin3r.3scale.net api:

  • Added all the relevant v3 packages in order to generate resources and serialize/deserialize resources.
  • The EnvoyConfigRevision controller, which is the one in charge of reconciling the xDS server cache with the resources loaded from k8s custom resources, now understands working with both v2 and v3 resources. The design is that an instance of the controller needs to be started for each envoy API version. Each instance will care only about the EnvoyConfigRevisions that match the same version.
  • The EnvoyConfigRevision resource has been given a new field spec.envoyAPI to specify the envoy API version the resources belong to. Field is optional, and when unset, v2 is chosen by default.
  • The EnvoyConfig resource has been given a new field spec.envoyAPI to specify the envoy API version the resources belong to. Field is optional, and when unset, v2 is chosen by default.
  • The finalizer that existed in the EnvoyConfig resource has been moved to the EnvoyConfigRevision object instead. This finalizer was in charge of clearing the xDS server cache whenever a nodeID was removed. By doing so, the EnvoyConfigRevision controller is now the only one that talks directly to the xDS cache and thus the only one strictly required to run in the discovery service pod.
  • Due to the fact that tests for package controllers/envoy needed a lot of changes to adapt them to the code changes, they have been reworked using envtest

Design for upgrading/downgrading EnvoyConfig resources to/from v3/v2:

  • Whenever an existent EnvoyConfig is changed from v2 to v3, the v2 EnvoyConfigRevisions are removed form the status.revisions list of the EnvoyConfig, and substituted for a v3 only list. The EnvoyConfigRevision resources are not removed themselves though, in order for the v2 envoy resources to be still published in the xDS server cache (if we removed the ECR resources, the v2 xDS cache would be cleared). This allows the envoy clients to point from the v2 to the v3 xDS server whenever they are ready to change because the v3 resources are already available while the v2 resources are still being served by the v2 xDS server, because they have not been removed.
  • Downgrading from v3 to v2 behaves exactly the same way.
  • Removal of old v2 EnvoyConfigResources is never automatically done. There will be at most 10 per nodeID.

An example of how resources would look when upgrading from v2 to v3:

▶ k get ec                                       
NAME     NODEID   ENVOY API   DESIRED VERSION   PUBLISHED VERSION   CACHE STATE
envoy1   envoy1   v3          596455d9cf        596455d9cf          InSync
▶ k get ecr
NAME                NODEID   ENVOY API   VERSION      PUBLISHED   CREATED AT             LAST PUBLISHED AT      TAINTED
envoy1-596455d9cf   envoy1   v3          596455d9cf   true        2020-11-10T13:40:49Z   2020-11-10T13:40:49Z   
envoy1-68b66cff7b   envoy1   v2          68b66cff7b               2020-11-10T13:16:48Z   2020-11-10T13:16:48Z   
envoy1-758c7d9f4c   envoy1   v2          758c7d9f4c   true        2020-11-10T13:19:21Z   2020-11-10T13:19:21Z   
envoy1-7664567ff4   envoy1   v2          7664567ff4               2020-11-10T13:17:24Z   2020-11-10T13:17:24Z  

Related to #38

This code was used initially when envoy configuration were fed from ConfigMaps.
Changes to support both versions:

* Adds a field spec.envoyAPI to both EnvoyCnfig and EnvoyConfigRevision resources.
* The EnvoyConfigRevision needs to be instantiated either for v2 or v3, it can work for both at the same time.
* A Generator interface has been created with methods to create the required envoy resource structs, with implementations for v2 and v3
* The code to reconcile the xDS cache has been moved to pkg/reconcilers/envoy/cache.go
The EnvoyConfig controller needs to create EnvoyConfigRevisions either in v2 or in v3, depending on what is configured in spec.envoyAPI. In order to support a non-disruptive transition of EnvoyConfigs from v2 to v3 (and vicesversa) some changes have been do
ne:

* The list of EnvoyConfigRevision references in status.configRevisions is now completely rebuilt in each reconcile by using the publication and creation dates of EnvoyConfigRevision objects to determine the order. The list was previously maintained by appending the new EnvoyConfigRevisions as they were created. This is required because at a given time, the EnvoyConfig will hold only references to the EnvoyConfigRevisions that match the spec.EnvoyAPI and the list requires a full recreation when changing from v2 to v3 and vicesversa. EnvoyConfigRevisions v2 are not deleted when changin to v3 (and the other way around) and this is what allows a user to transition its envoy clients from v2 to v3 in a non-disruptive way.
* A label with the EnvoyAPI at use has been added to all EnvoyConfigRevisions to allow easy label selection from the EnvoyConfig controller.
* The OnError callback functions of the xdss servers (v2 and v3) have been given a new parameter to determine if the config to be tainted is v2 or v3 and use that in the label selector.
@@ -67,7 +63,13 @@ type EnvoyConfigSpec struct {
// are supported. "json" is used if unset.
// +kubebuilder:validation:Enum=json;b64json;yaml
// +operator-sdk:gen-csv:customresourcedefinitions.specDescriptors=true
Serialization string `json:"serialization,omitempty"`
// +optional
Serialization *string `json:"serialization,omitempty"`
Copy link
Member Author

@roivaz roivaz Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been corrected because optional fields should always be pointers.

// +kubebuilder:validation:Enum=v2;v3
// +operator-sdk:gen-csv:customresourcedefinitions.statusDescriptors=true
// +optional
EnvoyAPI *string `json:"envoyAPI,omitempty"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New field to specify the API version

Comment on lines +188 to +195
// GetEnvoyAPIVersion returns envoy's API version for the EnvoyConfigRevision
func (ecr *EnvoyConfig) GetEnvoyAPIVersion() envoy.APIVersion {
if ecr.Spec.EnvoyAPI == nil {
return envoy.APIv2
}
return envoy.APIVersion(*ecr.Spec.EnvoyAPI)
}

Copy link
Member Author

@roivaz roivaz Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep existent behavior, if the field is left unset it means v2 is used. Going forward, when v2 is retired from the envoy codebase and v4 available, the default will be v3. We will keep 2 versions in the codebase at a given time, same as envoy does.

// +kubebuilder:validation:Enum=v2;v3
// +operator-sdk:gen-csv:customresourcedefinitions.statusDescriptors=true
// +optional
EnvoyAPI *string `json:"envoyAPI,omitempty"`
Copy link
Member Author

@roivaz roivaz Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Serialization specicifies the serialization format used to describe the resources. "json" and "yaml"
// are supported. "json" is used if unset.
// +kubebuilder:validation:Enum=json;b64json;yaml
// +operator-sdk:gen-csv:customresourcedefinitions.specDescriptors=true
Serialization string `json:"serialization,omitempty"`
// +optional
Serialization *string `json:"serialization,omitempty"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -81,7 +97,8 @@ type EnvoyConfigRevisionStatus struct {
// controller and are not intended to be directly used. Use EnvoyConfig objects instead.
// +kubebuilder:subresource:status
// +kubebuilder:resource:path=envoyconfigrevisions,scope=Namespaced,shortName=ecr
// +kubebuilder:printcolumn:JSONPath=".spec.nodeID",name=NodeID,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.nodeID",name=Node ID,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.envoyAPI",name=Envoy API,type=string
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Show the API version info in kubectl list operations

Client client.Client
Log logr.Logger
Scheme *runtime.Scheme
XdsCache xdss.Cache
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finalizer has been moved to the EnvoyConfigRevision controler, so access to the xDS server in-memory cache is no longer required for the EnvoyConfig controller.

Comment on lines -63 to -69
// Add finalizer for this CR
if !contains(ec.GetFinalizers(), envoyv1alpha1.EnvoyConfigFinalizer) {
r.Log.Info("Adding Finalizer for the EnvoyConfig")
if err := r.addFinalizer(ctx, ec); err != nil {
r.Log.Error(err, "Failed adding finalizer for envoyconfig")
return ctrl.Result{}, err
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code moved to EnvoyConfigRevision controller

Comment on lines -72 to -86
// Check if the EnvoyConfig instance is marked to be deleted, which is
// indicated by the deletion timestamp being set.
if ec.GetDeletionTimestamp() != nil {
if contains(ec.GetFinalizers(), envoyv1alpha1.EnvoyConfigFinalizer) {
r.finalizeEnvoyConfig(ec.Spec.NodeID)
r.Log.V(1).Info("Successfully cleared ads server cache")
// Remove memcachedFinalizer. Once all finalizers have been
// removed, the object will be deleted.
controllerutil.RemoveFinalizer(ec, envoyv1alpha1.EnvoyConfigFinalizer)
err := r.Client.Update(ctx, ec)
if err != nil {
return ctrl.Result{}, err
}
}
return ctrl.Result{}, nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code moved to EnvoyConfigRevision controller

Comment on lines +64 to 66
if err := r.reconcileSpecDefaults(ctx, ec); err != nil {
return ctrl.Result{}, err
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mainly done to fill the spec.envoyAPI field if it is left unset by the user so list operations with kubectl show the correct information. Won't be necessary when upgrading the CRD api as it will directly accept specifying default values.

Comment on lines +68 to 74
// This must be done before any other revision related code due to the addition of support for
// envoy API v3 which comes with the addition of a new label to identify revisions belonging
// to each api version. The first time this version of the controller runs, it will set the
// required labels. This will also help with future labels that get added.
if err := r.reconcileRevisionLabels(ctx, ec); err != nil {
return ctrl.Result{}, err
}
Copy link
Member Author

@roivaz roivaz Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EnvoyConfig controller relies on labels to get the EnvoyConfigRevision resources relevant to it. As new labels have been added, this needs to be reconciled so when upgrading from an older marin3r version, the labels get added to all existent EnvoyConfigRevision objects.

@@ -97,14 +82,14 @@ func (r *EnvoyConfigReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error)
}

// Update the ConfigRevisions list in the status
if err := r.consolidateRevisionList(ctx, ec, desiredVersion); err != nil {
if err := r.reconcileRevisionList(ctx, ec, desiredVersion); err != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change of name for this function is because so far envoyconfigrevisions were simply appended to the end of the list. Now the list is fully regenerated on each reconcile based on a specific criteria. This was required in order to recreate the full list when moving from v2 to v3 and viceversa.

Comment on lines -107 to +92
if err.(cacheError).ErrorType == AllRevisionsTaintedError {
if err.(ControllerError).ErrorType == AllRevisionsTaintedError {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cacheError is a name left from a very early version of marin3r. Has been renamed to the more generic ControllerError and made public to allow for type assertions from other packages, in case it is required in the future.

Comment on lines +103 to +109
cacheReconciler := reconcilers_envoy.NewCacheReconciler(
ctx, r.Log, r.Client, r.XdsCache,
decoder,
envoy_resources.NewGenerator(r.APIVersion),
)

result, err := cacheReconciler.Reconcile(req.NamespacedName, ecr.Spec.EnvoyResources, ecr.Spec.NodeID, ecr.Spec.Version)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the code related to the process of writing envoy resources to the xDS server cache has been moved to pkg/reconcilers/envoy/cache.go

Comment on lines +111 to +122
// If a type errors.StatusError is returned it means that the config in spec.envoyResources is wrong
// and cannot be written into the xDS cache. This is true for any error loading all types of resources
// except for Secrets. Secrets are dynamically loaded from the API and transient failures are possible, so
// setting a permanent taint could occur for a transient failure, which is not desirable.
if result.Requeue || err != nil {
switch err.(type) {
case *errors.StatusError:
if err := r.taintSelf(ctx, ecr, "FailedLoadingResources", err.Error()); err != nil {
return ctrl.Result{}, err
}
default:
return result, err
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realised that marking an EnvoyConfigRevision as tainted when errors occur while reading certificates from secrets is a problem, as transitory problems connecting the API server could end up tainting a perfectly valid EnvoyConfigRevision.

Comment on lines +214 to +241
func filterByAPIVersion(obj runtime.Object, version envoy.APIVersion) bool {
switch o := obj.(type) {
case *envoyv1alpha1.EnvoyConfigRevision:
if o.GetEnvoyAPIVersion() == version {
return true
}
return false

default:
return false
}
}

func filterByAPIVersionPredicate(version envoy.APIVersion,
filter func(runtime.Object, envoy.APIVersion) bool) predicate.Predicate {

return predicate.Funcs{
CreateFunc: func(e event.CreateEvent) bool {
return filter(e.Object, version)
},
UpdateFunc: func(e event.UpdateEvent) bool {
return filter(e.ObjectNew, version)

},
DeleteFunc: func(e event.DeleteEvent) bool {
return filter(e.Object, version)
},
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a filter to avoid receiving events from EnvoyConfigRevisions of the wrong envoy API version.

}

// Revision does not yet exists, create one
if len(ecrList.Items) == 0 {
// Create the revision for this config version
ecr := &envoyv1alpha1.EnvoyConfigRevision{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-%s", ec.Spec.NodeID, version),
Name: fmt.Sprintf("%s-%s-%s", ec.Spec.NodeID, envoyAPI, version),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was required to avoid name collisions between v2 and v3 EnvoyConfigRevisions.

Comment on lines +252 to 267
func getRevisionListOptions(namespace string, nodeID, version, envoyAPI *string) []client.ListOption {
labelSelector := client.MatchingLabels{}
if nodeID != nil {
labelSelector[nodeIDTag] = *nodeID
}
if version != nil {
labelSelector[versionTag] = *version
}
if envoyAPI != nil {
labelSelector[envoyAPITag] = *envoyAPI
}
return []client.ListOption{
labelSelector,
client.InNamespace(namespace),
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small helper to generate the list filters for the EnvoyConfigRevisions.

Comment on lines +119 to +134
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName(fmt.Sprintf("envoyconfigrevision_%s", string(envoy.APIv2))),
Scheme: mgr.GetScheme(),
XdsCache: xdss.GetCache(envoy.APIv2),
APIVersion: envoy.APIv2,
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", fmt.Sprintf("envoyconfigrevision_%s", string(envoy.APIv2)))
os.Exit(1)
}

if err := (&envoycontroller.EnvoyConfigRevisionReconciler{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName(fmt.Sprintf("envoyconfigrevision_%s", string(envoy.APIv3))),
Scheme: mgr.GetScheme(),
XdsCache: xdss.GetCache(envoy.APIv3),
APIVersion: envoy.APIv3,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2 instances of the EnvoyConfigRevision controller, one for v2 and one for v3.

Comment on lines +45 to +46
case *envoy_api_v2.RouteConfiguration:
s.v2.Resources[v2CacheResources(envoy.Route)].Items[name] = o
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was actually a bug. We never encountered it because we had never used the route object in an EnvoyConfig.

ClusterName: "ads_cluster",
ClusterName: "xds_cluster",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for consistency because we are calling the discovery service xDS everywhere.

Comment on lines +11 to +14
type Generator interface {
New(rType envoy.Type) envoy.Resource
NewSecret(string, string, string) envoy.Resource
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This abstracts the generation of envoy resources from the version. Any envoy API version should implement this interface. Adding new API versions should be easier in the future.

@roivaz
Copy link
Member Author

roivaz commented Nov 13, 2020

@raelga @slopezz @miguelsorianod this can now be reviewed.

Copy link
Contributor

@raelga raelga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outstanding @roivaz 🙇

Copy link
Member

@slopezz slopezz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work to have compatibility with 2 envoy versions!

@roivaz roivaz merged commit 0e30d25 into master Nov 18, 2020
@roivaz roivaz deleted the feat/support-v3-envoy-resources branch December 1, 2020 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants