Meta

Load-balancing improvements with low containerConcurrency

At low containerConcurrency values we now perform significantly better, thanks to improvements in the application-specific load balancing performed by the Activator component.

Kourier networking support

We have a new option for handling the ingress capabilities used by knative/serving. Kourier is the first Knative-native ingress implementation, which reconciles the Knative networking CRDs to program Envoy directly.

Autoscaling

Locally perfect loadbalancing and endpoint subsetting improvements (thanks @vagababov)

These are further improvements to the load-balancing enhancements of the last few releases. Given a stable activator count, load balancing of a revision with the activator on the path is now locally ideal. See the graph.

Reduced the needed Kubernetes Services per Revision from 3 to 2 #5900 (thanks @markusthoemmes)

The third service was used exclusively for metric scraping, which is now done via the private service as well. Metric services are no longer created, and are actively removed from existing deployments.

Allow applications with a livenessProbe to properly scale down #5986 (thanks @nak3)

The queue-proxy wrongly counted requests sent via livenessProbes as actual requests, causing the revision to never shut down. These requests are now properly ignored.
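
As a rough illustration of the idea, the sketch below skips kubelet probe traffic when counting requests. The kubelet identifies its HTTP probes with a User-Agent that starts with kube-probe/; the requestCounter type and its observe method are hypothetical names for this example and are not taken from the queue-proxy code.

```go
package main

import (
	"fmt"
	"strings"
)

// kubeProbeUAPrefix is the User-Agent prefix the kubelet sends on HTTP probes.
const kubeProbeUAPrefix = "kube-probe/"

// isKubeletProbe reports whether a request with this User-Agent is a kubelet probe.
func isKubeletProbe(userAgent string) bool {
	return strings.HasPrefix(userAgent, kubeProbeUAPrefix)
}

// requestCounter is a hypothetical stand-in for the statistics that feed the autoscaler.
type requestCounter struct{ total int }

// observe counts a request unless it is a kubelet probe.
func (c *requestCounter) observe(userAgent string) {
	if isKubeletProbe(userAgent) {
		return // probes must not keep the revision alive
	}
	c.total++
}

func main() {
	c := &requestCounter{}
	c.observe("kube-probe/1.16") // ignored
	c.observe("curl/7.64.1")     // counted
	fmt.Println(c.total)         // 1
}
```

With probes filtered out this way, a revision that only receives liveness checks reports zero traffic and can scale down.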

Target annotation values can now exceed configured defaults #5975 (thanks @markusthoemmes)

This fixes a bug in the logic to determine the actual target of the autoscaler which capped the user-defined target value to the configured default value.

Report desired/actual scale in PodAutoscalers for the HPA as well (thanks @vagababov)

The values for desired and actual scale are now plumbed through from the HPA into the PodAutoscaler’s status.

Assorted code readability, optimizations and clean ups (thanks @vagababov, @markusthoemmes, @mgencur)

Core API

Improved error messages for image tag resolving #5920 (thanks @markusthoemmes)

Previous error messages did not indicate that the image pull failure occurred during digest resolution, and did not provide further details as to why the digest resolution failed. This change aids users in debugging problems with container registry permissions.

Enabled imagePullSecrets in PodSpec #5917 (thanks @taragu)

Users may now specify imagePullSecrets directly without attaching them to their Kubernetes ServiceAccount.

Add permissions for caching.internal.knative.dev to edit and view cluster roles #5992 (thanks @nak3)

Knative provides aggregated ClusterRoles that users can use to manage their Knative resources. These roles previously did not include the caching resource. This change adds the caching resource to both the edit and view roles.

Split apart defaulting and validation webhooks #5947 (thanks @mattmoor)

This fixes a problem where our validation wasn’t necessarily applied to the final object because it runs at the same time as defaulting, which might be before additional mutating webhooks. By separating things out we ensure that the validation occurs on the final object to be committed to etcd.

Configuration and Service now labeled with duck.knative.dev/podspecable #6121 (thanks @mattmoor)

This enables tools that reflect over the Kubernetes type system to reason about the podspec portion of these Knative resources.

Bug Fixes:

  • Fix bug where latestRevision routes can point to wrong revision #5319 (thanks @taragu)
  • Fix issue where config-defaults were not getting applied #5892 (thanks @taragu)
  • Fix validation issue for lastModifier when using multiple service accounts #6072 (thanks @savitaashture)
  • Fix problem with Configuration reporting Ready early #6096 (thanks @taragu)
  • Validation added for name and generateName fields in RevisionTemplate #5110 (thanks @savitaashture)

Test Improvements:

Networking

Compatibility with Istio 1.4 #6058 (thanks @nak3)

Istio 1.4 introduced a breaking restriction to the length of regular expressions allowed in VirtualServices. We switched to using prefixes to be compatible with Istio 1.4.

Integration with istio/client-go #5969 pkg#208 pkg#831 (thanks @skaslev)

Knative now uses istio/client-go instead of its own version of the Istio API client. This addresses a long-standing pain point of maintaining a manually translated API client for a changing API.

3scale/kourier integration #5983 (thanks @bbrowning @davidor @jmprusi)

Kourier is a light-weight Ingress for Knative (a deployment of Kourier consists of an Envoy proxy and a control plane for it). In v0.11 we add Kourier as an option to run Knative e2e integration tests.

Better LoadBalancerReady condition when VirtualService failed to be reconciled #6048 (thanks @nak3)

Previously, when a VirtualService failed to be reconciled, the LoadBalancerReady condition was not updated. We now surface the reason and message from the failing VirtualService.

Post-ClusterIngress migration cleanups (thanks @markusthoemmes)

Clean up port names of Knative components to follow Istio convention #5070 (thanks @iamejboy)

Bug fix #5734: Do not permit cluster local kservices on the cluster ingress #6174 (thanks @vagababov)

Fix a bug where cluster-local ksvcs were erroneously exposed to the public ingress.

Monitoring

Add default request metrics backend in observability config #6022 (thanks @drpmma)

This change makes Prometheus the default backend, consistent with the default example value in config-observability.yaml.

Fix missing required selector for node-exporter #5934 (thanks @lionelvillard)

@knative-prow-releaser-robot knative-prow-releaser-robot released this Oct 29, 2019 · 2 commits to release-0.10 since this release

Meta

Eliminated errors across dataplane-probe scenarios!

With some of the work to improve the activator’s load balancing, the errors we would see with containerConcurrency: 1 due to queuing have been eliminated. With the CC aware load balancing in the activator we actually see better latency going through the activator than talking directly to the user container (everywhere else the extra hop adds latency).

Moving minimum Kubernetes version to 1.14

As part of moving to a single install path we are moving the minimum supported Kubernetes version to 1.14. This allows us to take advantage of multiple CRD endpoints and begin to experiment with future Kubernetes CRD features. Installation will fail if a lower version is detected.

We’re using Go1.13 for Serving

Serving has switched to Go 1.13, which performs much better around synchronization and adds error wrapping and other features, such as Duration.Milliseconds.
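
For readers who have not used these Go 1.13 additions yet, here is a small sketch of error wrapping with %w and of Duration.Milliseconds. The errNotReady sentinel and the probe function are made up for this example and do not come from the Serving code base.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical sentinel error used only for this illustration.
var errNotReady = errors.New("revision not ready")

// probe wraps the sentinel with %w (new in Go 1.13) so callers can still match it.
func probe(name string) error {
	return fmt.Errorf("probing %q: %w", name, errNotReady)
}

// isNotReady unwraps the error chain with errors.Is, also new in Go 1.13.
func isNotReady(err error) bool {
	return errors.Is(err, errNotReady)
}

// elapsedMillis uses Duration.Milliseconds, which was added in Go 1.13.
func elapsedMillis(d time.Duration) int64 {
	return d.Milliseconds()
}

func main() {
	err := probe("hello")
	fmt.Println(err)                                    // probing "hello": revision not ready
	fmt.Println(isNotReady(err))                        // true
	fmt.Println(elapsedMillis(1500 * time.Millisecond)) // 1500
}
```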

Autoscaling

We no longer use GenerateName for service names (@vagababov)

We were using GenerateName to create the names of the metrics and private K8s Services, which caused us various problems in testing and reliability. We no longer have this problem.

Activator refactoring and load balancing (@vagababov, @markusthoemmes):

After great changes by @greghaynes that permit probing and routing requests to individual pods, we were able to significantly simplify the code and remove redundant and ineffective pieces.

Semi-ideal load balancing (@vagababov):

Based on previous work we are now able to always route requests to individual pods (when pods are individually addressable), permitting us to achieve significant improvements in the CC=1 case. See the graph. Work in this direction is not complete though (the CC=10 case is still in progress).
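
To make the idea of concurrency-aware routing more concrete, here is a deliberately simplified sketch: a request goes to the first pod that still has spare capacity, and otherwise has to wait. This is not the activator's actual algorithm; the pod type and the pick function are hypothetical names for illustration only.

```go
package main

import "fmt"

// pod is a hypothetical record of one individually addressable pod.
type pod struct {
	addr     string
	inFlight int // requests currently being served by this pod
}

// pick reserves a slot on the first pod with spare capacity and returns its
// index. It returns -1 when every pod is at containerConcurrency, in which
// case the caller would have to queue the request.
func pick(pods []pod, containerConcurrency int) int {
	for i := range pods {
		if pods[i].inFlight < containerConcurrency {
			pods[i].inFlight++
			return i
		}
	}
	return -1
}

func main() {
	pods := []pod{{addr: "10.0.0.1:8012"}, {addr: "10.0.0.2:8012"}}
	fmt.Println(pick(pods, 1)) // 0
	fmt.Println(pick(pods, 1)) // 1
	fmt.Println(pick(pods, 1)) // -1
}
```

With containerConcurrency: 1, routing per pod like this avoids sending a second request to a pod that is already busy, which is where the queuing errors came from.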

Significant improvements to the integration tests that led to a big reduction in flakes (@nak3 @JRBANCEL):

Assorted improvements to the code and the tests have brought the number of flakes to practically white noise. See graph (check test/e2e.TestAutoscale.* tests)

Assorted code readability, optimizations and clean ups (@markusthoemmes, @nak3, @JRBANCEL, @skaslev)

Core API

Move to K8s 1.15.3 client libs #5570 (thanks @mattmoor)

The client version allows Knative to take advantage of newer features within Kubernetes going forward. This client version was chosen as it is compatible with 1.14 (our minimum version), but can also be used for longer as it is compatible with 1.16 Kubernetes releases as well.

Move back to single install path with all API versions (v1alpha1, v1beta1, v1) #5594 (thanks @mattmoor)

As we move to a 1.14 minimum Kubernetes version we are migrating back to a single install path that contains all 3 API endpoints. See Release Principles Draft for how this will work for future releases.

Skip copying last-applied-configuration annotation to Route #5468 (thanks @nak3)

The last-applied-configuration annotation is applied when using kubectl to manipulate the Service object. Previously this annotation would be copied down to Route which would cause confusion when describing the Route.

ConfigMaps are now validated synchronously #5404 (thanks @nimakaviani)

Malformed configmaps were a potential source of latent error and were often difficult to debug. This release adds synchronous validation of configmaps which ensures that bad values aren't able to sneak through.

New ClusterRoles for editing/viewing Knative resources through User-facing roles #5683 (thanks @nak3)

The ClusterRoles knative-serving-namespaced-edit and knative-serving-namespaced-view have been added as User-facing Roles. Read more about User-facing Roles in the Kubernetes documentation.

Add annotation /creator and /lastModifier to Configuration and Route #5240 (thanks @savitaashture)

These annotations were added to Services in previous releases, where the information was propagated down. This release adds the annotations for Routes and Configurations created directly by a user.

Service and Route CRDs are now labeled with duck.knative.dev/addressable=true #5874 (thanks @n3wscott)

Bugs:

  • Fix v1 Route validation for multiple traffic targets with empty tags #5583 (thanks @skaslev)
  • Fix error message for missing probe field. #5713 (thanks @nak3)
  • Fail Service creation on invalid delaySeconds value #5733 (thanks @savitaashture)
  • Better error message for invalid combinations #5566 (thanks @nak3)
  • Delete finalizer on Route on deletion #5715 (thanks @nak3)
  • Validate name and generateName for RevisionTemplate #5110 (thanks @savitaashture)

Tests:

  • Use Status.URL instead of Status.URL.Host for conformance tests #5503 (thanks @bancel)
  • Enable downgrade testing #5596 (thanks @mattmoor)
  • New e2e test for rollback scenario #5702 (thanks @taragu)
  • Remove logURL assertion from conformance tests #5448 (thanks @markusthoemmes)

Networking

Activator graceful shutdown #5542 (thanks @nak3)

Fix a bug where readiness probes kept succeeding while an activator was shutting down, causing requests to still be routed to terminating activators. SIGTERM now triggers a readinessProbe failure, which allows terminating activators to drain properly.
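
The pattern can be sketched as follows: once SIGTERM arrives, the readiness endpoint starts answering 503 so that the pod is taken out of rotation while in-flight requests finish. This is a minimal sketch under our own assumptions; the drainState type and its methods are hypothetical names and not the activator's code.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// drainState records whether shutdown has begun.
type drainState struct{ draining int32 }

// beginDrain marks the process as shutting down.
func (d *drainState) beginDrain() { atomic.StoreInt32(&d.draining, 1) }

// probeStatus is the HTTP status the readiness probe should receive.
func (d *drainState) probeStatus() int {
	if atomic.LoadInt32(&d.draining) == 1 {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// ServeHTTP lets drainState be registered as the readiness probe handler.
func (d *drainState) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(d.probeStatus())
}

// watchSignals flips the state when SIGTERM arrives.
func (d *drainState) watchSignals() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)
	go func() {
		<-sigs
		d.beginDrain()
	}()
}

func main() {
	d := &drainState{}
	d.watchSignals()
	fmt.Println(d.probeStatus()) // 200
	d.beginDrain()               // simulate what the SIGTERM handler does
	fmt.Println(d.probeStatus()) // 503
}
```

A real server would also keep serving in-flight requests for a grace period after the probe starts failing; that part is left out here.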

Avoid creating wildcard certs when AutoTLS is disabled #5636 (thanks @nak3)

In v0.9.0, we attempted to create wildcard certs even when cert-manager was not set up, causing certificate creation errors when creating a new namespace. We fixed this to avoid requiring wildcard certs when AutoTLS is disabled.

Finalize ClusterIngress migration #5689 (thanks @wtam2018)

We removed the remaining references to ClusterIngress, and that completes the ClusterIngress migration feature track. Going forward, knative/serving no longer has cluster-scoped CRDs.

Improved readiness prober to support HTTPS redirect, HTTPS, HTTP2 & custom ports #5223 (thanks @JRBANCEL)

Previously, Route readiness probers relied on a special domain coded into Istio VirtualServices. The probes now exercise the actual data path, which allows them to work with the domains that the Route is expected to serve.

Fix cluster-local visibility when using tags #5734 (thanks @andrew-su)

Fix a visibility resolution bug, introduced in v0.9.0, causing cluster-local tags to not have cluster-local visibility even if explicitly set.

Avoid creating invalid certificate for cluster-local service #5611 (thanks @nak3)

Monitoring

Add container and pod labels to revision and activator metrics #5824 (thanks @yanweiguo)

Meta

This is “Serving v1” RC2

There is discussion ongoing within the community about how we will message and document that Serving (within constraints) is ready for production workloads, and how we coordinate this with the rest of Knative, which is not yet there.

v1 API

The v1 API shape and endpoint are available starting in this release. Due to potential minimum version constraints, this release can be deployed with either just the v1alpha1 endpoint or with all endpoints (v1alpha1, v1beta1, and v1) enabled. The v1 API shape is usable through all endpoints.

To use the v1beta1 or v1 endpoints, a minimum Kubernetes version of 1.14 is required (1.13.10 also had the fix backported). The minimum required Kubernetes version will become 1.14 in the next release of Knative.

autoscaling.knative.dev/minScale now only applies to routable revisions

We have changed the behavior of minScale to only apply to Revisions that are referenced by a Route. This addresses a long-standing pain point where users used minScale, but Revisions would stick around until garbage collected, which takes at least 10 hours.

Cold Start improvements

We have made some improvements to our cold-start latency, which should result in a small net improvement across the board, but also notably improve:

  • Cold-starts that are sequenced (e.g. front-end calls back-end and both cold-start)
  • Events with responses (e.g. passing events back to the broker with each hop cold starting)
  • The long tail of cold-start latency (this should now be reliably under 10s for small container images)

Autoscaling

Cold Start Improvements #4902 and #3885 (thanks @greghaynes)

The Activator will now send requests directly to the pods when the ClusterIP is not yet ready, providing us with ~200ms latency from the time the pod is ready to the time we send the first request, compared to up to 10s before.
This also fixes a problem where cold start was subject to the iptables-min-sync-period of kube-proxy (10s on GKE), which created a relatively high floor for cold start times under certain circumstances.

RPS autoscaling #3416 (thanks @yanweiguo and @taragu)

It is now possible to drive autoscaling not only by concurrency but also by an RPS/QPS/OPS metric, which is a better fit for short, lightweight requests (@yanweiguo)
Report RPS metrics (@taragu)
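
As a back-of-the-envelope illustration of request-rate-driven scaling, the desired pod count can be thought of as the observed request rate divided by the per-pod target, rounded up. The real autoscaler also accounts for windows, panic mode and other factors; the desiredPods function below is a hypothetical simplification and not the autoscaler's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// desiredPods returns how many pods are needed so that each one serves at
// most targetRPSPerPod requests per second. Simplified for illustration.
func desiredPods(observedRPS, targetRPSPerPod float64) int {
	if targetRPSPerPod <= 0 || observedRPS <= 0 {
		return 0
	}
	return int(math.Ceil(observedRPS / targetRPSPerPod))
}

func main() {
	fmt.Println(desiredPods(250, 100)) // 3
	fmt.Println(desiredPods(0, 100))   // 0
}
```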

minScale only applies to routable revisions #4183 (thanks @tanzeeb)

Previously Revisions would keep around the minScale instance even when they were no longer routable.
Added Reachability concept to the PodAutoscaler.

Continuous benchmarks are live at https://mako.dev (thanks @mattmoor, @srinivashegde86, @Fredy-Z, @vagababov)

Autoscaler scaledown rate #4993 (thanks @vagababov)

The rate at which the autoscaler scales down revisions can now be limited to a rate configured in config-autoscaler.
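
One plausible way to read such a limit is as a factor: in a single evaluation the pod count may shrink to no less than the current count divided by the rate. The sketch below works under that assumption; limitScaleDown and its parameters are hypothetical names, and treating a rate of 1 or less as "no limit" is a choice made only for this example.

```go
package main

import (
	"fmt"
	"math"
)

// limitScaleDown caps how far the pod count may drop in one step.
// Assumption: maxScaleDownRate is a factor greater than 1; values of 1 or
// less are treated as "no limit" in this sketch.
func limitScaleDown(current, desired int, maxScaleDownRate float64) int {
	if maxScaleDownRate <= 1 || desired >= current {
		return desired
	}
	floor := int(math.Floor(float64(current) / maxScaleDownRate))
	if desired < floor {
		return floor
	}
	return desired
}

func main() {
	fmt.Println(limitScaleDown(10, 1, 2)) // 5
	fmt.Println(limitScaleDown(10, 7, 2)) // 7
}
```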

Various bug fixes/improvements:

Core API

v1 API #5483, #5259, #5337, #5439, #5559 (thanks @dgerd, @mattmoor)

The v1 API shape and endpoint is available starting in this release. See the "Meta" section for more details.

Validate system annotations #4995 (thanks @shashwathi)

Webhook validation now ensures that serving.knative.dev annotations have appropriate values.

Revisions now have the serving.knative.dev/route label #5048 (thanks @mattmoor)

Revisions are now labeled by the referencing Route to enable querying.

Revision GC refactored into its own reconciler #4876 (thanks @taragu)

Revision reconciliation now occurs separately from Configuration reconciliation.

Surface Deployment failures to Revision status #5077 (thanks @jonjohnsonjr)

DeploymentProgressing and DeploymentReplicaFailure information is propagated up to Revision status. An event is no longer emitted when the deployment times out.

Validate VolumeSources and VolumeProjections #5128 (thanks @markusthoemmes)

We now validate the KeyToPath items in the webhook to ensure that both Key and Path are specified. This prevents potential pod deployment problems.
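
The check itself is simple; a sketch of it might look like the following. The keyToPath struct is a simplified stand-in for the Kubernetes type, and missingFields is a hypothetical helper, not the webhook's actual validation function.

```go
package main

import "fmt"

// keyToPath is a simplified stand-in for the Kubernetes KeyToPath type.
type keyToPath struct {
	Key  string
	Path string
}

// missingFields returns the field paths that are empty, e.g. "items[1].path".
func missingFields(items []keyToPath) []string {
	var missing []string
	for i, item := range items {
		if item.Key == "" {
			missing = append(missing, fmt.Sprintf("items[%d].key", i))
		}
		if item.Path == "" {
			missing = append(missing, fmt.Sprintf("items[%d].path", i))
		}
	}
	return missing
}

func main() {
	items := []keyToPath{
		{Key: "config.yaml", Path: "etc/config.yaml"},
		{Key: "token"}, // Path omitted on purpose
	}
	fmt.Println(missingFields(items)) // [items[1].path]
}
```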

ContainerConcurrency default is now configurable #5099 (thanks @taragu, @Zyqsempai)

ContainerConcurrency is now configured through the config-defaults ConfigMap. Unspecified values will receive the default value, and explicit zero values will receive 'unlimited' concurrency.
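
The difference between "unspecified" and "explicitly zero" is easiest to see with a pointer-typed field. The following sketch illustrates the rule described above; resolveConcurrency is a hypothetical helper and does not mirror the actual defaulting code.

```go
package main

import "fmt"

// resolveConcurrency applies the defaulting rule: a nil (unspecified) value
// receives the configured default, and a resulting limit of 0 means
// unlimited concurrency.
func resolveConcurrency(specified *int64, configuredDefault int64) (limit int64, unlimited bool) {
	limit = configuredDefault
	if specified != nil {
		limit = *specified
	}
	return limit, limit == 0
}

func main() {
	zero := int64(0)
	five := int64(5)

	fmt.Println(resolveConcurrency(nil, 10))   // 10 false
	fmt.Println(resolveConcurrency(&zero, 10)) // 0 true
	fmt.Println(resolveConcurrency(&five, 10)) // 5 false
}
```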

Apply Route's labels to the child Ingress #5467 (thanks @nak3)

Labels on the Route will be propagated to the Ingress owned by the Route.

Jitter global resyncs to improve performance at scale #5275 (thanks @mattmoor)

Global resyncs no longer enqueue all objects at once. This prevents latency spikes in reconciliation time and improves the performance of larger clusters.
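
The general technique is to give every object its own random delay within a window instead of enqueueing everything immediately. The sketch below shows that idea in isolation; the schedule function, its seed parameter and the 10s window are assumptions for illustration and do not reflect the controller code.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// schedule assigns each key a uniformly random delay in [0, window), so that
// a global resync is spread over the window instead of hitting the work
// queue all at once. The seed is a parameter only to keep the example
// reproducible.
func schedule(seed int64, keys []string, window time.Duration) map[string]time.Duration {
	rng := rand.New(rand.NewSource(seed))
	delays := make(map[string]time.Duration, len(keys))
	for _, k := range keys {
		var d time.Duration
		if window > 0 {
			d = time.Duration(rng.Int63n(int64(window)))
		}
		delays[k] = d
	}
	return delays
}

func main() {
	keys := []string{"default/a", "default/b", "default/c"}
	delays := schedule(1, keys, 10*time.Second)
	for _, k := range keys {
		fmt.Printf("%s is enqueued after %v\n", k, delays[k])
	}
}
```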

Improved error messages for readiness probes #5385 (thanks @nak3)

Bug Fixes:

  • Fix Revisions stuck in updating when scaled-to-zero #5106 (thanks @tanzeeb)
  • Fix Service reconcile when using named Revisions #5547 (thanks @dgerd)
  • Skip copying kubectl.kubernetes.io/last-applied-configuration annotation #5202 (thanks @skaslev)
  • Image repository credentials now work for image pulling #5477 (thanks @jonjohnsonjr)
  • Error earlier if using invalid autoscaling annotations #5412 (thanks @savitaashture)
  • Fix potential NPE in Route reconciler #5333 (thanks @mjaow)
  • Fix timeoutSeconds=0 to set default timeout #5224 (thanks @nak3)
  • Consistent update for Ingress ObservedGeneration #5250 (thanks @taragu)

Test Improvements:

Networking

Cold start improvement

The activator sends requests directly to Pods #3885 #4902 (thanks @greghaynes)

Disable and remove ClusterIngress resources #5024 (thanks @wtam)

Various bug fixes

  • Prober ignores Gateways that can’t be probed #5129 (thanks @JRBANCEL)
  • Make port name in Gateway unique by adding namespace prefix #5324 (thanks @nak3)
  • Activator to handle graceful shutdown correctly #5364 (thanks @mattmoor)
  • Route cluster-local visibility should take precedence over placeholder Services #5411 (thanks @tcnghia)

Monitoring

Meta

This release is our first “release candidate” for Serving v1

We are burning down remaining issues here, but barring major issues we will declare 0.9 the “v1” release of knative/serving.

Istio minimum version is now 1.1.x

In order to support #4755 we have to officially remove support for Istio 1.0.x (which is end-of-life).

Route/Service Ready actually means Ready!

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service/Route the moment it reports Ready.

Target Burst Capacity (TBC) support

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable on cluster and revision level; it is currently off by default.

Migrate to knative.dev/serving import path

We have migrated github.com/knative/serving import paths to use knative.dev/serving.

Autoscaling

Target Burst Capacity (TBC) support #4443, #4516, #4580, #4758 (thanks @vagababov)

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable on cluster and revision level; it is currently off by default.

Activator HPA and performance improvements #4886, #4772 (thanks @yanweiguo)

With the activator on the dataplane more often (for TBC), several performance and scale problems popped up. We now horizontally scale the activator on CPU, and have made several latency improvements to its request handling.

Faster Scale Down to 0 #4883, #4949, #4938, etc (thanks @vagababov)

We will now elide the scale-to-zero “grace period” when the activator was already in the request path (this is now possible through the use of “target burst capacity”).
The scale-to-zero “grace period” is now computed from the time the activator was confirmed on the data path vs. a fixed duration.

Metrics Resource #4753, #4894, #4895, #4913, #4924 (thanks @markusthoemmes)

Autoscaling metrics are now full-fledged resources in Knative; this enables new autoscalers to plug in from out-of-process.

HPA is a separate controller now #4990 (thanks @markusthoemmes)

This proves that the metrics resource model enables a fully capable autoscaler outside of the main autoscaling controller.

Stability and performance (thanks to many):

  • Improvements to test flakiness
  • Better validation of annotations and ConfigMaps
  • Autoscaler will wait for a reasonable population of metrics to be collected before scaling user pods down after it has been restarted.

Core API

Readiness probe cold-start improvements #4148, #4649, #4667, #4668, #4731 (thanks @joshrider, @shashwathi)

The queue-proxy sidecar will now evaluate both user specified readiness probes and the (default) TCP probe. This enables us to much more aggressively probe the user-provided container for readiness (vs. K8s default second granularity).
The default periodSeconds for the readinessProbe is now 0 which enables a system defined sub-second readiness check.
This contains a breaking change for users relying on the default periodSeconds while specifying either timeoutSeconds or failureThreshold. Services using these values should remove them to enable the benefits of faster probing, or they should specify a periodSeconds greater than 0 to restore previous behavior.

Enable specifying protocol without port number #4515 (thanks @tanzeeb)

Container ports can now be specified without a port number. This allows for specifying just a name (i.e. "http1", "h2c") to select the protocol.

Tag-to-digest resolution now works with AWS ECR #4084 (thanks @jonjohnsonjr)

Knative has been updated to use the new AWS credential provider to enable pulling images from AWS ECR.

Revisions annotated with serving.knative.dev/creator #4526 (thanks @nak3)

Annotation Validations #4560, #4656, #4669, #4888, #4879, #4763 (thanks @vagababov, @markusthoemmes, @savitaashture , @shashwathi)

System annotations (autoscaling.knative.dev/* and serving.knative.dev/*) are now validated by the webhook for correctness and immutability (where applicable). This improves visibility to errors in annotations, and ensures annotations on Knative objects are accurate and valid.

ServiceAccountName Validation #4733, #4919 (thanks @shashwathi)

Service account names are now validated to be valid Kubernetes identifiers, which shortens the time to error and reduces the potential impact of an incorrect identifier.

Fixes

  • Tag resolution for schema 1 images #4432 (thanks @jonjohnsonjr )
  • Don't display user-defined template for cluster-local #4615 (thanks @duglin)
  • Fix error message when multiple containers are specified #4709 (thanks @nak3)
  • Update observedGeneration even when Route fails #4594 (thanks @taragu)

Tests:

Docs:

  • Remove misuse of RFC2119 keywords #4550 (thanks @duglin)
  • Add links to conformance tests from Runtime Contract #4428 (thanks @dgerd)
  • New API Specification document docs#1642 (thanks @dgerd)

Networking

Honest Route/Service Readiness (#1582, #3312) (thanks @JRBANCEL)

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service or Route the moment it reports Ready.

Remove cluster scoping of ClusterIngress (#4028) (thanks @wtam)

networking.internal.knative.dev/ClusterIngress is now replaced by networking.internal.knative.dev/Ingress, which is a namespace-scoped resource. The ClusterIngress resource will be removed in 0.9.

Enable visibility settings for sub-Route (#3419) (thanks @andrew-su)

Each sub-Route (tag) can have its own visibility setting, applied by labelling the corresponding placeholder K8s Service.

Correct split percentage for inactive Revisions (#882, #4755) (thanks @tcnghia)

We no longer route only to the biggest inactive split when there is more than one inactive traffic split. To support this fix we now officially remove support for Istio 1.0 (which was announced to be EOL).

Integration with Gloo Ingress (thanks @scottweiss and Solo.io team)

Knative-on-Gloo now has its own continuous build to ensure good integration.
Gloo now officially supports networking.internal.knative.dev/Ingress (see #4028).

Ambassador officially announces Knative support (thanks @richarddli and Ambassador team)

blog post

Fixes

  • Fix activator crash due to trailing dot in resolv.conf (#4407) (thanks @tcnghia)
  • Activator to wait for active requests to drain before terminating (#4654) (thanks @vagababov)
  • Fix cluster-local Service URL (#4204) (thanks @duglin)
  • Remove cert-manager controller from default serving.yaml install (#4120) (thanks @ZhiminXiang)

Monitoring

Automate cold-start timing collection #2495 (thanks @greghaynes)

Record the time spent broken down into components during cold-start including “how much time is spent before we ask our deployment to scale up” and “how much time is spent before our user application begins executing”.

Dash in controller name causes metrics to be dropped #4716 (thanks @JRBANCEL)

Fixed an issue where some controller metrics were not getting into Prometheus due to invalid characters in their component names.

@knative-prow-releaser-robot knative-prow-releaser-robot released this Aug 6, 2019 · 3 commits to release-0.8 since this release

Meta

This release is our first “release candidate” for Serving v1

We are burning down remaining issues here, but barring major issues we will declare 0.9 the “v1” release of knative/serving.

Istio minimum version is now 1.1.x

In order to support #4755 we have to officially remove support for Istio 1.0.x (which is end-of-life).

Route/Service Ready actually means Ready!

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service/Route the moment it reports Ready.

Target Burst Capacity (TBC) support

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable at the cluster and revision level; it is currently off by default.

Migrate to knative.dev/serving import path

We have migrated github.com/knative/serving import paths to use knative.dev/serving.

Autoscaling

Target Burst Capacity (TBC) support #4443, #4516, #4580, #4758 (thanks @vagababov)

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable at the cluster and revision level; it is currently off by default.
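
As a rough illustration of the idea (not the actual implementation), the decision to keep the activator in the request path can be modeled as a comparison between spare capacity and the configured burst capacity. The function name, its parameters, and the exact comparison are assumptions made for this sketch; a burst capacity of 0 stands for the "off" default mentioned above.

```python
def activator_in_path(ready_pods: int, per_pod_target: float,
                      observed_concurrency: float,
                      target_burst_capacity: float) -> bool:
    """Hypothetical model: should the activator buffer requests for a revision?"""
    if ready_pods == 0:
        # Scaled to zero: the activator has to receive the request.
        return True
    if target_burst_capacity == 0:
        # Feature off: the activator is only used at zero.
        return False
    # Spare capacity is what the ready pods can absorb beyond current load.
    spare = ready_pods * per_pod_target - observed_concurrency
    return spare < target_burst_capacity
```

With two pods targeting 10 concurrent requests each and 15 requests in flight, only 5 units of capacity are spare, so a burst capacity of 20 keeps the activator in the path until more pods are ready.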

Activator HPA and performance improvements #4886, #4772 (thanks @yanweiguo)

With the activator on the dataplane more often (for TBC), several performance and scale problems popped up. We now horizontally scale the activator on CPU, and have made several latency improvements to its request handling.

Faster Scale Down to 0 #4883, #4949, #4938, etc (thanks @vagababov)

We now elide the scale-to-zero “grace period” when the activator was already in the request path (this is now possible through the use of “target burst capacity”).
The scale-to-zero “grace period” is now computed from the time the activator was confirmed on the data path, rather than being a fixed duration.

Metrics Resource #4753, #4894, #4895, #4913, #4924 (thanks @markusthoemmes)

Autoscaling metrics are now full-fledged resources in Knative; this enables new autoscalers to plug in from out-of-process.

HPA is a separate controller now #4990 (thanks @markusthoemmes)

This proves that the metrics resource model enables a fully capable autoscaler outside of the main autoscaling controller.

Stability and performance (thanks to many):

  • Improvements to test flakiness
  • Better validation of annotations and config maps
  • Autoscaler will wait for a reasonable population of metrics to be collected before scaling user pods down after it has been restarted.

Core API

Readiness probe cold-start improvements #4148, #4649, #4667, #4668, #4731 (thanks @joshrider, @shashwathi)

The queue-proxy sidecar will now evaluate both user-specified readiness probes and the (default) TCP probe. This enables us to probe the user-provided container for readiness much more aggressively (vs. the one-second granularity of K8s probes).
The default periodSeconds for the readinessProbe is now 0 which enables a system defined sub-second readiness check.
This contains a breaking change for users relying on the default periodSeconds while specifying either timeoutSeconds or failureThreshold. Services using these values should remove them to enable the benefits of faster probing, or they should specify a periodSeconds greater than 0 to restore previous behavior.
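
To make the three cases above concrete, here is a minimal sketch that classifies a readinessProbe by the fields the user set. The function and its return labels are hypothetical and only mirror the wording of this note; they are not the webhook's actual validation.

```python
from typing import Optional


def probe_mode(period_seconds: Optional[int] = None,
               timeout_seconds: Optional[int] = None,
               failure_threshold: Optional[int] = None) -> str:
    """Classify a readinessProbe according to the rules described above."""
    period = period_seconds or 0  # unset is treated like the new default of 0
    if period > 0:
        # Explicit period: previous behavior, probed every `period` seconds.
        return "kubernetes"
    if timeout_seconds is not None or failure_threshold is not None:
        # Default period combined with other settings: hit by the breaking change.
        return "needs-attention"
    # Nothing set: system-defined sub-second readiness check.
    return "aggressive"
```

A Service that only sets timeoutSeconds falls into the "needs-attention" case: either drop the field or add a periodSeconds greater than 0.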

Enable specifying protocol without port number #4515 (thanks @tanzeeb)

Container ports can now be specified without a port number. This allows for specifying just a name (i.e. "http1", "h2c") to select the protocol.

Tag-to-digest resolution now works with AWS ECR #4084 (thanks @jonjohnsonjr)

Knative has been updated to use the new AWS credential provider to enable pulling images from AWS ECR.

Revisions annotated with serving.knative.dev/creator #4526 (thanks @nak3)

Annotation Validations #4560, #4656, #4669, #4888, #4879, #4763 (thanks @vagababov, @markusthoemmes, @savitaashture, @shashwathi)

System annotations (autoscaling.knative.dev/* and serving.knative.dev/*) are now validated by the webhook for correctness and immutability (where applicable). This improves the visibility of errors in annotations, and ensures annotations on Knative objects are accurate and valid.

ServiceAccountName Validation #4733, #4919 (thanks @shashwathi)

Service account names are now validated as valid Kubernetes identifiers, which reduces the time to error and the potential impact of an incorrect identifier.
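
The note does not say which rule "valid Kubernetes identifier" refers to. The sketch below assumes it is the DNS-1123 subdomain rule that Kubernetes applies to ServiceAccount names (lowercase alphanumerics, "-" and ".", at most 253 characters); the function name is hypothetical.

```python
import re

# One DNS-1123 label: starts and ends with a lowercase alphanumeric.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
# A subdomain is one or more labels separated by dots.
_SUBDOMAIN = re.compile(rf"{_LABEL}(\.{_LABEL})*")


def valid_service_account_name(name: str) -> bool:
    """Assumed check: non-empty, at most 253 chars, DNS-1123 subdomain."""
    return 0 < len(name) <= 253 and _SUBDOMAIN.fullmatch(name) is not None
```

Rejecting a name such as Build_Bot at admission time surfaces the mistake immediately, instead of when the pod fails to be created.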

Fixes

  • Tag resolution for schema 1 images #4432 (thanks @jonjohnsonjr)
  • Don't display user-defined template for cluster-local #4615 (thanks @duglin)
  • Fix error message when multiple containers are specified #4709 (thanks @nak3)
  • Update observedGeneration even when Route fails #4594 (thanks @taragu)

Tests:

Docs:

  • Remove misuse of RFC2119 keywords #4550 (thanks @duglin)
  • Add links to conformance tests from Runtime Contract #4428 (thanks @dgerd)
  • New API Specification document docs#1642 (thanks @dgerd)

Networking

Honest Route/Service Readiness (#1582, #3312) (thanks @JRBANCEL)

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service or Route the moment it reports Ready.

Remove cluster scoping of ClusterIngress (#4028) (thanks @wtam)

networking.internal.knative.dev/ClusterIngress is now replaced by networking.internal.knative.dev/Ingress, which is a namespace-scoped resource. The ClusterIngress resource will be removed in 0.9.

Enable visibility settings for sub-Route (#3419) (thanks @andrew-su)

Each sub-Route (tag) can have its own visibility setting by labelling the corresponding placeholder K8s Service.

Correct split percentage for inactive Revisions (#882, #4755) (thanks @tcnghia)

We no longer route only to the biggest inactive split when there is more than one inactive traffic split. To support this fix, we are officially removing support for Istio 1.0 (which has been announced as EOL).

Integration with Gloo Ingress (thanks @scottweiss and Solo.io team)

Knative-on-Gloo now has its own continuous build to ensure good integration.
Gloo now officially supports networking.internal.knative.dev/Ingress (see #4028).

Ambassador officially announces Knative support (thanks @richarddli and Ambassador team)

blog post

Fixes

  • Fix activator crash due to trailing dot in resolv.conf (#4407) (thanks @tcnghia)
  • Activator to wait for active requests to drain before terminating (#4654) (thanks @vagababov)
  • Fix cluster-local Service URL (#4204) (thanks @duglin)
  • Remove cert-manager controller from default serving.yaml install (#4120) (thanks @ZhiminXiang)

Monitoring

Automate cold-start timing collection #2495 (thanks @greghaynes)

Records the time spent during cold-start, broken down into components, including “how much time is spent before we ask our deployment to scale up” and “how much time is spent before our user application begins executing”.

Dash in controller name causes metrics to be dropped #4716 (thanks @JRBANCEL)

Fixed an issue where some controller metrics were not getting into Prometheus due to invalid characters in their component names.



@knative-prow-releaser-robot knative-prow-releaser-robot released this Jun 25, 2019 · 1 commit to release-0.7 since this release

Meta

serving.knative.dev/v1beta1 (requires K8s 1.14+ due to #4533)

  • In 0.6 we expanded our v1alpha1 API to include our v1beta1 fields. In this release, we are contracting the set of fields we store for v1alpha1 to that subset (and disallowing those that don’t fit). With this, we can leverage the “same schema” CRD-conversion supported by Kubernetes 1.11+ to ship v1beta1.

HPA-based scaling on concurrent requests

  • We previously supported using the HPA “class” autoscaler to enable Knative services to be scaled on CPU and Memory. In this release, we are adding support for using the HPA to scale them on the same “concurrent requests” metrics used by our default autoscaler.
  • HPA does not yet support scaling to zero, and more work is needed to expose these metrics to arbitrary autoscaler plugins, but this is exciting progress!

Non-root containers

  • In this release, all of the containers we ship run as a “nonroot” user. This includes the queue-proxy sidecar injected into the user pod. This enables the use of stricter “Pod Security Policies” with knative/serving.

Breaking Changes

  • Previously deprecated status fields are no longer populated.
  • Build and Manual (deprecated in 0.6) are now unsupported.
  • The URLs generated for Route tags by default have changed; see the tagTemplate section below for how to avoid this break.

Autoscaling

Support concurrency-based scaling on the HPA (thanks @markusthoemmes).

Metric-scraping and decision-making have been separated out of the Knative internal autoscaler (KPA). The metrics are now also available to the HPA.

Dynamically change autoscaling metrics sample size based on pod population (thanks @yanweiguo).

Depending on how many pods the specific revision has, the autoscaler now scrapes a computed number of pods to gain more confidence in the reported metrics while maintaining scalability.

Fixes:

  • Added readiness probes to the autoscaler #4456 (thanks @vagababov)
  • Adjust activator’s throttling behavior based on activator scale (thanks @shashwathi and @andrew-su).
  • Revisions wait until they have reached “minScale” before they are reported “Ready” (thanks @joshrider).

Core API

Expose v1beta1 API #4199 (thanks @mattmoor)

This release exposes resources under serving.knative.dev/v1beta1.

Non-root containers #3237 (thanks @bradhoekstra and @dprotaso)

In this release, all of the containers we ship run as a “nonroot” user. This includes the queue-proxy sidecar injected into the user pod. This enables the use of stricter “Pod Security Policies” with knative/serving.

Allow users to specify their container name #4289 (thanks @mattmoor)

This will default to user-container, which is what we use today, and that default may be changed in config-defaults to a Go template with access to the parent resource’s (e.g. Service, Configuration) ObjectMeta fields.

Projected volume support #4079 (thanks @mattmoor)

Based on community feedback, we have added support for mounting ConfigMaps and Secrets via the projected volume type.

Drop legacy status fields #4197 (thanks @mattmoor)

A variety of legacy fields from our v1alpha1 have been dropped in preparation to serve these same objects over v1beta1.

Build is unsupported #4099 (thanks @mattmoor)

As mentioned in the 0.6 release notes, support for just-in-time builds has been removed, and requests containing a build will now be rejected.

Manual is unsupported #4188 (thanks @mattmoor)

As mentioned in the 0.6 release notes, support for manual mode has been removed, and requests containing it will now be rejected.

V1beta1 clients and conformance testing #4369 (thanks @mattmoor)

We have generated client libraries for v1beta1 and have a v1beta1 version of the API conformance test suite under ./test/conformance/api/v1beta1.

Defaulting based conversion #4080 (thanks @mattmoor)

Objects submitted with the old v1alpha1 schema will be upgraded via our “defaulting” logic in a mutating admission webhook.

New annotations for queue-proxy resource limits #4151 (thanks @raushan2016)

The queue.sidecar.serving.knative.dev/resourcePercentage annotation now allows setting the percentage of user container resources to be used for the queue-proxy.
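
The arithmetic behind the annotation can be sketched as follows. This is only an illustration: the function name and the accepted range are assumptions, and any bounds the real queue-proxy applies on top of the percentage are not modeled.

```python
def queue_proxy_millicpu(user_millicpu: int, resource_percentage: float) -> int:
    """Share of the user container's CPU (in millicores) given to queue-proxy."""
    if not 0 < resource_percentage <= 100:
        # Assumed valid range for this sketch.
        raise ValueError("resourcePercentage must be in (0, 100]")
    return int(user_millicpu * resource_percentage / 100)
```

For example, a user container requesting 1000m CPU with the annotation set to "20" would yield 200m for the queue-proxy in this model.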

Annotation propagation #4363, #4367 (thanks @vagababov)

Annotations now propagate from the Knative Service object to Route and Configuration.

Fixes:

Test:

Networking

Reconcile annotations from Route to ClusterIngress #4087 (thanks @vagababov)

This allows the ClusterIngress class annotation to be specified per-Route instead of cluster-wide through a config-network setting.

Introduce tagTemplate configuration #4292 (thanks @mattmoor)

This allows operators to configure the names that are given to the services created for tags in Route.
This also changes the default to transpose the tag and route name, which is a breaking change to the URLs these received in 0.6. To avoid this break, you can set tagTemplate: {{.Name}}-{{.Tag}} in config-network.
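
The effect of the transposed default can be shown with a tiny stand-in for the template rendering. The real setting is a Go template; the string substitution below only handles the two placeholders named in this note, and the new default value is inferred from the word "transpose" rather than quoted from the configuration.

```python
def render_tag_template(template: str, name: str, tag: str) -> str:
    """Minimal stand-in for rendering {{.Name}} and {{.Tag}} placeholders."""
    return template.replace("{{.Name}}", name).replace("{{.Tag}}", tag)


LEGACY_TEMPLATE = "{{.Name}}-{{.Tag}}"   # value given above to keep 0.6 URLs
NEW_DEFAULT = "{{.Tag}}-{{.Name}}"       # assumption: the transposed order
```

For a Route named hello with a tag canary, the legacy template yields hello-canary, while the transposed default yields canary-hello.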

Enable use of annotations in domainTemplate #4210 (thanks @raushan2016)

Users can now provide a custom subdomain via the label serving.knative.dev/subDomain.

Allow customizing max allowed request timeout #4172 (thanks @mdemirhan)

This introduces a new config entry max-revision-timeout-seconds in config-defaults to set the max allowed request timeout.
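
A minimal sketch of how such an upper bound might be enforced, assuming a Revision's timeoutSeconds is compared against the configured maximum. The function name and error text are hypothetical.

```python
from typing import Optional


def timeout_error(timeout_seconds: int,
                  max_revision_timeout_seconds: int) -> Optional[str]:
    """Return an error message if the timeout is out of range, else None."""
    if 0 <= timeout_seconds <= max_revision_timeout_seconds:
        return None
    return (f"timeoutSeconds={timeout_seconds} is outside "
            f"[0, {max_revision_timeout_seconds}]")
```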

Set Forwarded header on request #4376 (thanks @tanzeeb)

The Forwarded header is constructed and appended to the headers by the queue-proxy if only legacy x-forwarded-* headers are set.
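
The following sketch shows one way to derive a Forwarded value (RFC 7239 syntax) from the legacy headers. It is an assumption-laden illustration, not the queue-proxy's code: the function name is made up, and hosts that would need quoting (for example, ones with a port) are not handled.

```python
from typing import Dict, Optional


def forwarded_from_legacy(headers: Dict[str, str]) -> Optional[str]:
    """Build a Forwarded value when only x-forwarded-* headers are set."""
    h = {k.lower(): v for k, v in headers.items()}
    if "forwarded" in h:
        return None  # already present, nothing to construct
    xff = h.get("x-forwarded-for", "")
    proto = h.get("x-forwarded-proto", "")
    host = h.get("x-forwarded-host", "")

    elements = []
    for addr in (a.strip() for a in xff.split(",") if a.strip()):
        # RFC 7239: IPv6 addresses are bracketed and quoted.
        elements.append(f'for="[{addr}]"' if ":" in addr else f"for={addr}")

    # Host and proto are attached to the first (client-most) element.
    first = elements[:1]
    if host:
        first.append(f"host={host}")
    if proto:
        first.append(f"proto={proto}")
    if not first:
        return None  # no legacy headers to translate
    return ", ".join([";".join(first)] + elements[1:])
```

A request carrying X-Forwarded-For: 192.0.2.1 and X-Forwarded-Proto: https would get Forwarded: for=192.0.2.1;proto=https under this sketch.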

Fixes:

  • Enable short names for cluster-local Service without relying on sidecars #3824 (thanks @tcnghia)
  • Better surfacing of ClusterIngress Status #4288 #4144 (thanks @tanzeeb, @nak3)
  • SKS private service uses random names to avoid length limitation #4250 (thanks @vagababov)

Monitoring

Set memory request for zipkin pods #4353 (thanks @sebgoa)

This lowers the memory necessary to schedule the zipkin pod.

Collect /var/log without fluentd sidecar #4156 (thanks @JRBANCEL)

This allows /var/log collection without the need to load the fluentd sidecar, which is large and significantly increases pod startup time.

Enable queue-proxy metrics scraping by Prometheus. #4111 (thanks @mdemirhan)

The new metrics from the queue-proxy are now exposed as part of the pod spec, so Prometheus can scrape them.

Fixes:

  • Fix 'Revision CPU and Memory Usage' Grafana dashboard #4106 (thanks @JRBANCEL)
  • Fix 'Scaling Debugging' Grafana dashboard. #4096 (thanks @JRBANCEL)
  • Remove embedded jaeger-operator and include as dependency instead #3938 (thanks @objectiser)
  • Fix HTTP request dashboards #4418 (thanks @mdemirhan)

Meta

New API Shape

We have approved a proposal for the “v1beta1” API shape for knative/serving. These changes will make the Serving resources much more familiar for experienced Kubernetes users, unlock the power of Route to users of Service, and enable GitOps scenarios with features like “bring-your-own-Revision-name”. We will be working towards this over the next few releases.

In this release we have backported the new API surface to the v1alpha1 API as the first part of the transition to v1beta1 (aka “lemonade”). The changes that will become breaking in 0.7+ are:

  • Service and Configuration will no longer support “just-in-time” Builds.
  • Service will no longer support “manual” mode.

You can see the new API surface in use throughout our samples in knative/docs, but we will continue to support the majority of the legacy surface via v1alpha1 until we turn it down.

Overhauled Scale-to-Zero

We have radically changed the mechanism by which we scale to zero. The new architecture creates a better separation of concerns throughout the Serving resource model with fewer moving parts, and enables us to address a number of long-standing issues (some in this release, some to come). See below for more details.

Auto-TLS (alpha, opt-in)

We have added support for auto-TLS integration! The default implementation builds on cert-manager to provision certificates (e.g. via Let’s Encrypt), but similar to how we have made Istio pluggable, you can swap out cert-manager for other certificate provisioning systems. Currently certificates are provisioned per-Route, but stay tuned for wildcard support in a future release. This feature requires Istio 1.1, and must be explicitly enabled.

Moar Controller Decoupling

We have started to split the “pluggable” controllers in Knative into their own controller processes so that folks looking to replace Knative sub-systems can more readily remove the bundled default implementation. For example, to install Knative Serving without the Istio layer run:

kubectl apply -f serving.yaml \
  -l networking.knative.dev/ingress-provider!=istio

Note that you may see some errors because kubectl does not understand the YAML for Istio objects (even if they are filtered out by the label selector). It is safe to ignore errors such as no matches for kind "Gateway" in version "networking.istio.io/v1alpha3".

You can also use this to omit the optional Auto-TLS controller based on cert-manager with:

kubectl apply -f serving.yaml \
  -l networking.knative.dev/certificate-provider!=cert-manager

Autoscaling

Moved the Knative PodAutoscaler (aka “KPA”) from the /scale sub-resource for scaling to a PodScalable “duck type”. This enables us to leverage informer caching, and the expanded contract will enable the ServerlessService (aka “SKS”) to leverage the PodSpec to do neat optimizations in future releases. (Thanks @mattmoor)

We now ensure that our “activator” component has been successfully wired in before scaling a Revision down to zero (aka “positive hand-off”, #2949). This work was enabled by the Revision-managed activation work below. (Thanks @vagababov)

New annotations autoscaling.knative.dev/window, autoscaling.knative.dev/panicWindowPercentage, and autoscaling.knative.dev/panicThresholdPercentage allow customizing the sensitivity of KPA-class PodAutoscalers (#3103). (Thanks @josephburnett)
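
To give a feel for how these three knobs relate, here is a simplified model: the panic window is a percentage of the stable window, and panic mode is entered once the desired pod count exceeds the ready pod count by the threshold percentage. The function names and the exact comparison are assumptions for illustration, not the autoscaler's code.

```python
def panic_window_seconds(window_seconds: float,
                         panic_window_percentage: float) -> float:
    """Panic window expressed as a percentage of the stable window."""
    return window_seconds * panic_window_percentage / 100.0


def should_panic(desired_pods: float, ready_pods: int,
                 panic_threshold_percentage: float) -> bool:
    """Enter panic mode when demand exceeds capacity by the threshold."""
    # Treat zero ready pods as one to avoid dividing by zero.
    ratio = desired_pods / max(ready_pods, 1)
    return ratio >= panic_threshold_percentage / 100.0
```

With a 60 second window and a panic window percentage of 10, the short window is 6 seconds; with a threshold of 200, needing 5 pods while 2 are ready triggers panic mode in this model.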

Added tracing to activator to get more detailed and persistently measured performance data (#2726). This fixes #1276 and will enable us to troubleshoot performance issues, such as cold start. (Thanks @greghaynes).

Fixed a Scale to Zero issue with Istio 1.1 lean installation (#3987) by reducing the idle timeouts in default transports (#3996), which resolves the K8s Service not being terminated when the endpoint changes. (Thanks @vagababov)

Resolved an issue that prevented disabling Scale to Zero (#3629). The fix (#3688) makes the KPA reconciler take enable-scale-to-zero from the configmap into account when scaling: if the minScale annotation is not set or is set to 0, and enable-scale-to-zero is set to false, 1 pod is kept as the minimum. (Thanks @yanweiguo)
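
The rule in the last sentence can be written down directly. The function name and parameter names below are hypothetical; only the logic is taken from the note.

```python
from typing import Optional


def effective_min_scale(min_scale_annotation: Optional[int],
                        enable_scale_to_zero: bool) -> int:
    """Lower bound on pods for a revision under the rule described above."""
    min_scale = min_scale_annotation or 0  # unset behaves like 0
    if min_scale == 0 and not enable_scale_to_zero:
        # Scale to zero is disabled cluster-wide: keep one pod around.
        return 1
    return min_scale
```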

Fixed the autoscaler bug that made rash decisions when the autoscaler restarted (#3771). This fixes issues #2705 and #2859. (Thanks @hohaichi)

Core API

We have an approved v1beta1 API shape! As above, we have started down the path to v1beta1 over the next several milestones. This milestone landed the v1beta1 API surface as a supported subset of v1alpha1. See above for more details. (Thanks to the v1beta1 task force for many hours of hard work on this).

We changed the way we perform validation to be based on a “fieldmask” of supported fields. We will now create a copy of each Kubernetes object limited to the fields we support, and then compare it against the original object; this ensures we are deliberate with which resource fields we want to leverage as the Kubernetes API evolves. (#3424, #3779) (Thanks @dgerd). This was extended to clean up our internal API validations (#3789, #3911) (Thanks @mattmoor).
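
The copy-then-compare idea can be sketched with plain dictionaries. The mask contents, field names, and helper names below are hypothetical and chosen only for illustration; the real implementation works on typed Kubernetes objects.

```python
import copy
from typing import Any, Dict, List

# Hypothetical mask: True means "supported as-is", a dict means "recurse".
SUPPORTED = {"image": True, "env": True,
             "resources": {"limits": True, "requests": True}}


def mask(obj: Dict[str, Any], allowed: Dict[str, Any]) -> Dict[str, Any]:
    """Copy obj, keeping only the fields named in the mask."""
    out: Dict[str, Any] = {}
    for key, rule in allowed.items():
        if key not in obj:
            continue
        if isinstance(rule, dict) and isinstance(obj[key], dict):
            out[key] = mask(obj[key], rule)
        else:
            out[key] = copy.deepcopy(obj[key])
    return out


def disallowed_fields(obj: Dict[str, Any], allowed: Dict[str, Any],
                      prefix: str = "") -> List[str]:
    """Compare obj with its masked copy and list fields that were dropped."""
    masked = mask(obj, allowed)
    errs: List[str] = []
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if key not in masked:
            errs.append(path)
        elif isinstance(value, dict) and isinstance(allowed[key], dict):
            errs.extend(disallowed_fields(value, allowed[key], path + "."))
    return errs
```

Because only listed fields survive the copy, a field added to the Kubernetes API later is rejected by default until it is deliberately added to the mask.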

status.domain has been deprecated in favor of status.url (#3970), which uses apis.URL for our URL status fields. This resolves the issue "Unable to get the service URL" (#1590). (Thanks @mattmoor)

Added the ability to specify default values for the matrix of {cpu, mem} x {request, limit} via our configmap for defaults. This also removes the previous CPU limit default, so that we fall back on the configured Kubernetes defaults unless the operator explicitly specifies one. (#3550, #3912) (Thanks @mattmoor)

Dropped the use of the configurationMetadataGeneration label (#4012) (thanks @dprotaso), and wrapped up the last of the changes transitioning us to CRD sub-resources (#643).

Networking

Overhauled the way we scale-to-zero! (Thanks @vagababov) This enables us to have Revisions managing their own activation semantics, implement positive hand-off when scaling to zero, and increase the autoscaling controller’s resync period to be consistent with our other controllers.

Added support for automatically configuring TLS certificates! (Thanks @ZhiminXiang) See above for more details.

We have stopped releasing Istio yamls. It was never our intention for knative/serving to redistribute Istio, and prior releases exposed our “dev”-optimized Istio yamls. Users should consult either the Istio or vendor-specific documentation for how to get a “supported” Istio distribution. (Thanks @mattmoor)

We have started to adopt a flat naming scheme for the named sub-routes within a Service or Route. The old URLs will still work for now, but the new URLs will appear in the status.traffic[*].url fields. (Thanks @andrew-su)

Support the installation of Istio 1.1 (#3515, #3353) (Thanks @tcnghia)

Fixed readiness probes with Istio mTLS enabled (#4017) (Thanks @mattmoor)

Monitoring

Activator now reports request logs (#3781) with check-in (#3927) (Thanks @mdemirhan)

Test and Release

Assorted Fixes

  • The serving.knative.dev/release label now carries the release name/number instead of devel (#3626), fixed by exporting TAG for our annotation manipulation (#3995). (Thanks @mattmoor)

  • Always install Istio from HEAD for upgrade tests (#3522), fixing errors with upgrade/downgrade testing of Knative (#3506). (Thanks @jonjohnsonjr)

  • Additional runtime conformance test coverage (9 new tests), improvements to existing conformance tests, and v1beta1 coverage. (Thanks @andrew-su, @dgerd, @yt3liu, @mattmoor, @tzununbekov)


@knative-prow-releaser-robot knative-prow-releaser-robot released this May 14, 2019 · 1 commit to release-0.6 since this release

Meta

New API Shape

We have approved a proposal for the “v1beta1” API shape for knative/serving. These changes will make the Serving resources much more familiar for experienced Kubernetes users, unlock the power of Route to users of Service, and enable GitOps scenarios with features like “bring-your-own-Revision-name”. We will be working towards this over the next few releases.

In this release we have backported the new API surface to the v1alpha1 API as the first part of the transition to v1beta1 (aka “lemonade”). The changes that will become breaking in 0.7+ are:

  • Service and Configuration will no longer support “just-in-time” Builds.
  • Service will no longer support “manual” mode.

You can see the new API surface in use throughout our samples in knative/docs, but we will continue to support the majority of the legacy surface via v1alpha1 until we turn it down.

Overhauled Scale-to-Zero

We have radically changed the mechanism by which we scale to zero. The new architecture creates a better separation of concerns throughout the Serving resource model with fewer moving parts, and enables us to address a number of long-standing issues (some in this release, some to come). See below for more details.

Auto-TLS (alpha, opt-in)

We have added support for auto-TLS integration! The default implementation builds on cert-manager to provision certificates (e.g. via Let’s Encrypt), but similar to how we have made Istio pluggable, you can swap out cert-manager for other certificate provisioning systems. Currently certificates are provisioned per-Route, but stay tuned for wildcard support in a future release. This feature requires Istio 1.1, and must be explicitly enabled.

Moar Controller Decoupling

We have started to split the “pluggable” controllers in Knative into their own controller processes so that folks looking to replace Knative sub-systems can more readily remove the bundled default implementation. For example, to install Knative Serving without the Istio layer run:

kubectl apply -f serving.yaml \
  -l networking.knative.dev/ingress-provider!=istio

Note that you may see some errors because kubectl does not understand the YAML for Istio objects (even if they are filtered out by the label selector). It is safe to ignore errors such as no matches for kind "Gateway" in version "networking.istio.io/v1alpha3".

You can also use this to omit the optional Auto-TLS controller based on cert-manager with:

kubectl apply -f serving.yaml \
  -l networking.knative.dev/certificate-provider!=cert-manager

Autoscaling

Moved the Knative PodAutoscaler (aka “KPA”) from the /scale sub-resource for scaling to a PodScalable “duck type”. This enables us to leverage informer caching, and the expanded contract will enable the ServerlessService (aka “SKS”) to leverage the PodSpec to do neat optimizations in future releases. (Thanks @mattmoor)

We now ensure that our “activator” component has been successfully wired in before scaling a Revision down to zero (aka “positive hand-off”, #2949). This work was enabled by the Revision-managed activation work below. (Thanks @vagababov)

New annotations autoscaling.knative.dev/window, autoscaling.knative.dev/panicWindowPercentage, and autoscaling.knative.dev/panicThresholdPercentage allow customizing the sensitivity of KPA-class PodAutoscalers (#3103). (Thanks @josephburnett)

Added tracing to activator to get more detailed and persistently measured performance data (#2726). This fixes #1276 and will enable us to troubleshoot performance issues, such as cold start. (Thanks @greghaynes).

Fixed a Scale to Zero issue with Istio 1.1 lean installation (#3987) by reducing the idle timeouts in default transports (#3996), which resolves the K8s Service not being terminated when the endpoint changes. (Thanks @vagababov)

Resolved an issue that prevented disabling scale-to-zero (#3629). The fix (#3688) makes the KPA reconciler take enable-scale-to-zero from the configmap into account when scaling: if the minScale annotation is unset or set to 0 and enable-scale-to-zero is set to false, one pod is kept as the minimum. (Thanks @yanweiguo)
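The minimum-scale rule described above can be sketched as a small function. This is only an illustration of the stated behavior, not the reconciler's actual code; the function name and parameters are hypothetical.

```python
from typing import Optional


def effective_min_scale(min_scale: Optional[int], enable_scale_to_zero: bool) -> int:
    """Return the lowest replica count the autoscaler may scale down to.

    min_scale is the parsed minScale annotation, or None when it is unset.
    enable_scale_to_zero mirrors the enable-scale-to-zero configmap setting.
    Both names are illustrative assumptions, not Knative identifiers.
    """
    floor = min_scale or 0  # an unset annotation is treated the same as 0
    if floor == 0 and not enable_scale_to_zero:
        # Scale-to-zero is disabled, so keep one pod as the minimum.
        return 1
    return floor
```

With this sketch, effective_min_scale(None, False) yields 1, while effective_min_scale(None, True) yields 0, and an explicit positive minScale is always returned unchanged.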

Fixed an autoscaler bug that caused rash scaling decisions when the autoscaler restarts (#3771). This fixes issues #2705 and #2859. (Thanks @hohaichi)

Core API

We have an approved v1beta1 API shape! As above, we have started down the path to v1beta1 over the next several milestones. This milestone landed the v1beta1 API surface as a supported subset of v1alpha1. See above for more details. (Thanks to the v1beta1 task force for many hours of hard work on this).

We changed the way we perform validation to be based on a “fieldmask” of supported fields. We now create a copy of each Kubernetes object limited to the fields we support, and then compare it against the original object; this ensures we are deliberate about which resource fields we want to leverage as the Kubernetes API evolves (#3424, #3779). (Thanks @dgerd) This was extended to clean up our internal API validations (#3789, #3911). (Thanks @mattmoor)
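A minimal sketch of the fieldmask idea, using plain dictionaries in place of Kubernetes objects. The mask contents and the helper names (SUPPORTED_CONTAINER_FIELDS, apply_mask, unsupported_fields) are hypothetical and chosen only for illustration; the real implementation works on typed API objects.

```python
import copy
from typing import Any, Dict, List

# Hypothetical mask: True keeps the whole field, a nested dict masks sub-fields.
SUPPORTED_CONTAINER_FIELDS: Dict[str, Any] = {
    "image": True,
    "env": True,
    "resources": {"limits": True, "requests": True},
}


def apply_mask(obj: Dict[str, Any], mask: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of obj limited to the fields named in mask."""
    out: Dict[str, Any] = {}
    for key, sub in mask.items():
        if key not in obj:
            continue
        if sub is True or not isinstance(obj[key], dict):
            out[key] = copy.deepcopy(obj[key])
        else:
            out[key] = apply_mask(obj[key], sub)
    return out


def unsupported_fields(obj: Dict[str, Any], mask: Dict[str, Any], prefix: str = "") -> List[str]:
    """Compare obj against its masked copy and list the paths the mask dropped."""
    masked = apply_mask(obj, mask)
    paths: List[str] = []
    for key, value in obj.items():
        path = prefix + key
        if key not in masked:
            paths.append(path)
        elif isinstance(value, dict) and isinstance(mask[key], dict):
            paths.extend(unsupported_fields(value, mask[key], path + "."))
    return paths
```

An object is valid under this scheme when the masked copy equals the original, that is, when unsupported_fields returns an empty list. A new upstream field is rejected until it is deliberately added to the mask.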

status.domain has been deprecated in favor of status.url (#3970), which uses apis.URL for our URL status fields, resolving the issue "Unable to get the service URL" (#1590). (Thanks @mattmoor)

Added the ability to specify default values for the matrix of {cpu, mem} x {request, limit} via our configmap for defaults. This also removes the previous CPU limit default, so that we fall back on the configured Kubernetes defaults unless the operator explicitly sets one. (#3550, #3912) (Thanks @mattmoor)
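The defaulting behavior can be pictured as filling only the unset cells of the {cpu, mem} x {request, limit} matrix. The sketch below assumes a simplified dictionary representation; the function name and the example values are hypothetical and do not reflect the actual configmap keys or shipped defaults.

```python
from typing import Dict

Resources = Dict[str, Dict[str, str]]


def apply_resource_defaults(resources: Resources, defaults: Resources) -> Resources:
    """Fill in unset requests/limits from operator defaults.

    Values set by the user always win. Cells with no operator default are
    left unset, so the cluster's own Kubernetes defaults apply to them.
    """
    result: Resources = {
        kind: dict(resources.get(kind, {})) for kind in ("requests", "limits")
    }
    for kind in ("requests", "limits"):
        for name, value in defaults.get(kind, {}).items():
            result[kind].setdefault(name, value)
    return result


# Illustrative operator defaults: a CPU request only, and no CPU limit.
operator_defaults: Resources = {"requests": {"cpu": "400m"}}
defaulted = apply_resource_defaults({"limits": {"memory": "256Mi"}}, operator_defaults)
```

In this example the CPU request is filled from the operator default, the user's memory limit is kept, and no CPU limit is injected.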

Dropped the use of the configurationMetadataGeneration label (#4012) (thanks @dprotaso), and wrapped up the last of the changes transitioning us to CRD sub-resources (#643).

Networking

Overhauled the way we scale-to-zero! (Thanks @vagababov) This enables us to have Revisions managing their own activation semantics, implement positive hand-off when scaling to zero, and increase the autoscaling controller’s resync period to be consistent with our other controllers.

Added support for automatically configuring TLS certificates! (Thanks @ZhiminXiang) See above for more details.

We have stopped releasing Istio yamls. It was never our intention for knative/serving to redistribute Istio, and prior releases exposed our “dev”-optimized Istio yamls. Users should consult either the Istio or vendor-specific documentation for how to get a “supported” Istio distribution. (Thanks @mattmoor)

We have started to adopt a flat naming scheme for the named sub-routes within a Service or Route. The old URLs will still work for now, but the new URLs will appear in the status.traffic[*].url fields. (Thanks @andrew-su)

Added support for installing Istio 1.1 (#3515, #3353) (Thanks @tcnghia)

Fixed readiness probes with Istio mTLS enabled (#4017) (Thanks @mattmoor)

Monitoring

Activator now reports request logs (#3781) with check-in (#3927) (Thanks @mdemirhan)

Test and Release

Assorted Fixes

  • The serving.knative.dev/release label now carries the release name/number instead of devel (#3626), fixed by "Export TAG to fix our annotation manipulation" (#3995). (Thanks @mattmoor)

  • Always install Istio from HEAD for upgrade tests (#3522), fixing errors with upgrade/downgrade testing of Knative (#3506). (Thanks @jonjohnsonjr)

  • Additional runtime conformance test coverage (9 new tests), improvements to existing conformance tests, and v1beta1 coverage. (Thanks @andrew-su, @dgerd, @yt3liu, @mattmoor, @tzununbekov)
