This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

test: remove windows-specific tests, use windows node pools #2349

Merged
merged 2 commits into from Nov 22, 2019

Conversation

jackfrancis
Member

Reason for Change:

Rather than test Windows by creating entirely new clusters, let's just add a Windows node pool to every cluster configuration we test across Kubernetes versions.

This PR adds a Windows node pool to the "standard" cluster configuration we use for our current "Linux" Kubernetes version tests, effectively making them OS-agnostic (or more appropriately: OS-generic, as they will cover both OS flavors).
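In apimodel terms, an OS-generic test configuration carries both pool flavors side by side. A minimal sketch of what that looks like in an aks-engine cluster definition (pool names, counts, and VM sizes here are illustrative, not the exact values used in this PR):

```json
{
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes"
    },
    "agentPoolProfiles": [
      { "name": "agentpool1", "count": 2, "vmSize": "Standard_D2_v3" },
      { "name": "windowspool", "count": 2, "vmSize": "Standard_D2_v3", "osType": "Windows" }
    ]
  }
}
```

With `osType` defaulting to Linux, adding the single Windows pool entry is all it takes to make an existing test configuration cover both OS flavors.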

This PR also removes 1.11 and 1.12 from our PR test matrix to reflect that 1.13-->1.17 are the versions of Kubernetes that are actively under development upstream.

Fewer tests for the same amount of coverage!

Issue Fixed:

Requirements:

Notes:

@codecov

codecov bot commented Nov 22, 2019

Codecov Report

Merging #2349 into master will not change coverage.
The diff coverage is n/a.

@@          Coverage Diff           @@
##           master   #2349   +/-   ##
======================================
  Coverage    71.9%   71.9%           
======================================
  Files         130     130           
  Lines       23395   23395           
======================================
  Hits        16823   16823           
  Misses       5546    5546           
  Partials     1026    1026

@mboersma
Member

> This PR also removes 1.11 and 1.12 from our PR test matrix

Could we merge #2303 instead? It's just better hygiene to have 1 PR (commit) == 1 concern, plus it was here first.

- template: e2e-job-template.yaml
  parameters:
    name: 'k8s_1_13_release_e2e'
    k8sRelease: '1.13'
    apimodel: 'examples/e2e-tests/kubernetes/release/default/definition.json'
    createVNET: true
    stabilityIterations: '3'
Member

3 is the default, I assume?

Member Author

So we've been using 3 for the "linux" tests, which are occasionally flaky :(

We were skipping stability iterations entirely for the "windows" tests.

The most cautious way to converge the two was to stop running these "stability" tests on PR jobs for now.

(That's my thinking, at least.)

@@ -1891,11 +1891,11 @@ var _ = Describe("Azure Container Cluster using the Kubernetes Orchestrator", fu
		if clusterAutoscalerEngaged {
			cpuTarget = 50
			for _, profile := range eng.ExpandedDefinition.Properties.AgentPoolProfiles {
				maxPods, _ := strconv.Atoi(profile.KubernetesConfig.KubeletConfig["--max-pods"])
				var n []node.Node
				n, err = node.GetByRegexWithRetry(fmt.Sprintf("^k8s-%s", profile.Name), 3*time.Minute, cfg.Timeout)
Member

So now we're just using profile.Count instead of actually fetching nodes to find the maxTotalPods. Is this because of Windows, and was there value in making the API call to get the nodes?

Member Author

Right, it's because the prior regex would fail (Windows nodes follow a different naming pattern).

The purpose of this enumeration is to set a pod replica count that exceeds the pods-per-node capacity across all nodes, so that we induce quick scale-up by creating pods that cannot be scheduled. Counting nodes by calling the k8s API at runtime is more strictly accurate, but with Windows nodes in the mix it got more complicated: correlating the nodes in the cluster with a particular pool is trickier. All that said, the theoretical benefit of doing that over simply taking the Count value from the api model didn't seem worth wrangling the complexity.
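The simplified approach described above can be sketched as follows. This is not the PR's actual test code; the `AgentPoolProfile` struct and helper below are illustrative stand-ins that mirror the idea of summing `Count * maxPods` from the api model instead of querying live nodes:

```go
package main

import "fmt"

// AgentPoolProfile is a minimal stand-in for the aks-engine api model type;
// field names here are illustrative, not the real struct definition.
type AgentPoolProfile struct {
	Name    string
	Count   int
	MaxPods int // parsed from the kubelet "--max-pods" config
}

// totalPodCapacity sums Count * MaxPods across all pools. Scheduling more
// replicas than this total guarantees unschedulable pods, which is what
// triggers a quick cluster-autoscaler scale-up in the test.
func totalPodCapacity(profiles []AgentPoolProfile) int {
	total := 0
	for _, p := range profiles {
		total += p.Count * p.MaxPods
	}
	return total
}

func main() {
	pools := []AgentPoolProfile{
		{Name: "agentpool1", Count: 2, MaxPods: 30},
		{Name: "windowspool", Count: 2, MaxPods: 30},
	}
	// prints 120
	fmt.Println(totalPodCapacity(pools))
}
```

The trade-off is that `Count` reflects the declared pool size rather than the nodes actually present, but for inducing scale-up that static figure is close enough and avoids the Linux/Windows node-name correlation problem entirely.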

@jackfrancis
Member Author

> This PR also removes 1.11 and 1.12 from our PR test matrix

> Could we merge #2303 instead? It's just better hygiene to have 1 PR (commit) == 1 concern, plus it was here first.

Yep

@marosset
Contributor

/lgtm

@acs-bot

acs-bot commented Nov 22, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, marosset

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jackfrancis,marosset]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jackfrancis jackfrancis merged commit aab72ee into Azure:master Nov 22, 2019
@jackfrancis jackfrancis deleted the e2e-merge-tests branch November 22, 2019 20:34
@jackfrancis jackfrancis added this to the v0.45.0 milestone Dec 16, 2019