Scalability [Validations] - Caching validations #4514

xeviknal · 2021-11-19T16:08:11Z

Kiali needs a robust model to support the validation feature on very large clusters. As of today, the validation endpoints perform poorly:

On a cluster with 1600 services and ~2000 pods, the tls and validations API endpoints take a long time to respond. For example, tls takes 1.4 minutes to return.

Validation endpoint result: {"errors":1,"objectCount":536,"warnings":73}. Load time is 30 seconds.

from #4224

Validations consume many different kinds of resources: services, pods, and workloads, plus most of the Istio-native ones.
Some Istio resources have mesh-wide visibility, meaning that they influence namespaces beyond the one where they are defined. When validations were first introduced, we decided to avoid cross-namespace scenarios. However, this was quite a big limitation, especially after the addition of the exportTo field to some Istio resources.

The first mechanism that comes to mind is pre-processing or caching the validation status of each object in the mesh. The cached status of an object should expire whenever a new Istio object related to the validations affecting that object appears. Let's see an example:

Gateway validations only need the other gateways and workloads:

type GatewayChecker struct {
	GatewaysPerNamespace  [][]kubernetes.IstioObject
	Namespace             string
	WorkloadsPerNamespace map[string]models.WorkloadList
}

Therefore, the cache for all gateway objects should expire when a new workload or gateway appears. Otherwise, the cached results are still valid.
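
As a rough sketch of that expiry rule (not Kiali's actual code), one option is to keep a change counter per resource kind and store, next to each cached result, the counters seen when it was computed. All names here (Kind, Generations, GatewayCache, Notify, Put, Get) are hypothetical and only illustrate the idea:

```go
package main

import "fmt"

// Kind names a resource type that a checker reads (hypothetical type).
type Kind string

const (
	KindGateway         Kind = "Gateway"
	KindWorkload        Kind = "Workload"
	KindDestinationRule Kind = "DestinationRule"
)

// Generations counts how many changes have been observed per kind.
type Generations map[Kind]uint64

// cachedResult stores a validation result together with the generations
// of the kinds it depended on at the time it was computed.
type cachedResult struct {
	valid bool
	seen  Generations
}

// GatewayCache caches gateway validation results keyed by "namespace/name".
type GatewayCache struct {
	deps    []Kind // kinds the gateway checker reads
	current Generations
	results map[string]cachedResult
}

func NewGatewayCache() *GatewayCache {
	return &GatewayCache{
		deps:    []Kind{KindGateway, KindWorkload},
		current: Generations{},
		results: map[string]cachedResult{},
	}
}

// Notify records that an object of the given kind was added or changed.
func (c *GatewayCache) Notify(k Kind) { c.current[k]++ }

// Put stores a result with a snapshot of the dependency generations.
func (c *GatewayCache) Put(key string, valid bool) {
	seen := Generations{}
	for _, k := range c.deps {
		seen[k] = c.current[k]
	}
	c.results[key] = cachedResult{valid: valid, seen: seen}
}

// Get returns the cached result and whether it is still fresh.
// An entry is stale as soon as any dependency kind has changed.
func (c *GatewayCache) Get(key string) (valid, fresh bool) {
	r, ok := c.results[key]
	if !ok {
		return false, false
	}
	for _, k := range c.deps {
		if r.seen[k] != c.current[k] {
			return false, false
		}
	}
	return r.valid, true
}

func main() {
	c := NewGatewayCache()
	c.Put("bookinfo/gw", true)

	c.Notify(KindDestinationRule) // not a dependency of gateway validations
	_, fresh := c.Get("bookinfo/gw")
	fmt.Println(fresh) // true

	c.Notify(KindWorkload) // a dependency: the entry expires
	_, fresh = c.Get("bookinfo/gw")
	fmt.Println(fresh) // false
}
```

With this shape nothing is recomputed eagerly: a stale entry is only detected, and recalculated by the caller, the next time it is requested.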

Another example, sidecar validations:

type SidecarChecker struct {
	Sidecars       []kubernetes.IstioObject
	ServiceEntries []kubernetes.IstioObject
	Services       []core_v1.Service
	Namespaces     models.Namespaces
	WorkloadList   models.WorkloadList
}

The cached or pre-processed result can be expired or recalculated when there is a new sidecar, service entry, service, namespace, or workload. If you add a DestinationRule, for instance, the cached result remains valid.
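
The two examples above suggest a dependency table from checker to the resource kinds it reads. The following is only a sketch under that assumption: the kind strings are derived from the struct fields shown above, and checkerDeps and affectedCheckers are hypothetical names, not part of Kiali.

```go
package main

import (
	"fmt"
	"sort"
)

// checkerDeps lists, per checker, the resource kinds it reads.
// The entries mirror the fields of the two structs quoted above.
var checkerDeps = map[string][]string{
	"GatewayChecker": {"Gateway", "Workload"},
	"SidecarChecker": {"Sidecar", "ServiceEntry", "Service", "Namespace", "Workload"},
}

// affectedCheckers returns the sorted names of the checkers whose cached
// results must be expired when an object of changedKind is added.
func affectedCheckers(changedKind string) []string {
	var out []string
	for checker, kinds := range checkerDeps {
		for _, k := range kinds {
			if k == changedKind {
				out = append(out, checker)
				break
			}
		}
	}
	sort.Strings(out) // map iteration order is random, so sort for stable output
	return out
}

func main() {
	fmt.Println(affectedCheckers("Workload"))        // [GatewayChecker SidecarChecker]
	fmt.Println(affectedCheckers("ServiceEntry"))    // [SidecarChecker]
	fmt.Println(affectedCheckers("DestinationRule")) // []
}
```

A new DestinationRule matches no entry in this table, so neither the gateway nor the sidecar cache would be touched.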

Original comment here: #4080 (comment)
