18 changes: 11 additions & 7 deletions .pipelines/npm/npm-cni-integration-test.yaml
@@ -16,6 +16,7 @@ jobs:
displayName: "NPM k8s E2E"
dependsOn: ${{ parameters.dependsOn }}
condition: and( and( not(canceled()), not(failed()) ), ${{ or(contains(parameters.os_version, '2022'), eq(parameters.os, 'linux') ) }} , or( contains(variables.CONTROL_SCENARIO, 'npm') , contains(variables.CONTROL_SCENARIO, 'all') ) )
timeoutInMinutes: 180 # This is for testing windows, due to the 3m between the 14 tests -> results in 42m of wasted time
Contributor
Noticing that the change to the k8s fork was in July, FYI.

If desired, we could also create another branch with a smaller timeout. I don't know why we changed to two minutes, but I feel like it's probably unnecessary at this point (perhaps there was a latency issue in HNS that has since been fixed).

Contributor Author
I tested with a shorter timeout from your fork and got similar results (roughly 20 minutes faster). I would rather keep this in, have consistent results, and avoid flakes due to HNS variability.

Contributor
Does it actually take 180 minutes to complete? Can we pull npm test in parallel to all other tests (if not already)?

Contributor Author
@jpayne3506 Jan 11, 2024

It takes roughly 75-105 minutes to complete. The main problem is that it takes 140-160 minutes to fail. If a job timeout happens before the failure, then we lose a considerable amount of results and debugging logs.

pull npm test in parallel

Do you mean whether we create the e2e binary so we can pull the same binary in parallel? If so, yes. See artifact: npm*
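For anyone following the artifact discussion, here is a minimal sketch of the publish-once / download-in-parallel pattern being described. It assumes the e2e binary is built in one job and consumed by test jobs via a pipeline artifact; the job names, build command, and artifact name below are illustrative, not the pipeline's actual definitions.

jobs:
  - job: build_npm_e2e
    steps:
      - bash: |
          # Build the Kubernetes e2e.test binary once (illustrative build command).
          make e2e-binary OUTPUT=$(Build.ArtifactStagingDirectory)/e2e.test
        displayName: Build NPM e2e binary
      # Publish the binary so downstream jobs can pull the same artifact.
      - publish: $(Build.ArtifactStagingDirectory)/e2e.test
        artifact: npm_e2e

  - job: run_npm_e2e
    dependsOn: build_npm_e2e
    steps:
      # Download the previously published binary instead of rebuilding it.
      - download: current
        artifact: npm_e2e
      - bash: |
          chmod +x $(Pipeline.Workspace)/npm_e2e/e2e.test
          $(Pipeline.Workspace)/npm_e2e/e2e.test \
            --ginkgo.focus="NetworkPolicy" \
            --ginkgo.skip="NetworkPolicyLegacy|SCTP" \
            --kubeconfig=$HOME/.kube/config
        displayName: Run Kubernetes e2e.test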

pool:
name: $(BUILD_POOL_NAME_DEFAULT)
demands:
@@ -120,15 +121,17 @@ jobs:
--ginkgo.focus="$focus" \
--ginkgo.skip="NetworkPolicyLegacy|SCTP" \
--kubeconfig=$HOME/.kube/config

# Untaint Linux (system) nodes once testing is complete
if ${{ lower(eq(parameters.os, 'windows')) }}
then
kubectl taint nodes -l kubernetes.azure.com/mode=system node-role.kubernetes.io/control-plane:NoSchedule-
fi
displayName: "Run Kubernetes e2e.test"
continueOnError: ${{ parameters.continueOnError }}

- ${{ if eq(parameters.os, 'windows') }}:
- bash: |
# Untaint Linux (system) nodes once testing is complete
kubectl taint nodes -l kubernetes.azure.com/mode=system node-role.kubernetes.io/control-plane:NoSchedule-

displayName: Untaint Linux Nodes
condition: always()

- bash: |
npmLogs=$(System.DefaultWorkingDirectory)/${{ parameters.clusterName }}_npmLogs_Attempt_#$(System.StageAttempt)
mkdir -p $npmLogs
@@ -137,9 +140,10 @@ jobs:
npmPodList=`kubectl get pods -n kube-system | grep npm | awk '{print $1}'`
# capture all logs
for npmPod in $npmPodList; do
kubectl logs -n kube-system $npmPod > $npmLogs/$npmPod-logs.txt
kubectl logs -n kube-system $npmPod > $npmLogs/$npmPod-logs.txt
done
displayName: Generate NPM pod logs
retryCountOnTaskFailure: 3
condition: always()

- publish: $(System.DefaultWorkingDirectory)/${{ parameters.clusterName }}_npmLogs_Attempt_#$(System.StageAttempt)
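As context for the untaint step added above, a rough sketch of the complementary taint step that the pipeline presumably applies earlier, before Windows testing (not shown in this diff); the step placement and display name are illustrative.

- ${{ if eq(parameters.os, 'windows') }}:
  - bash: |
      # Hypothetical: taint the Linux (system) nodes before Windows testing so
      # test pods schedule onto the Windows nodes; the new step above removes
      # this taint once testing ends (the trailing '-' on that command).
      kubectl taint nodes -l kubernetes.azure.com/mode=system node-role.kubernetes.io/control-plane:NoSchedule
    displayName: Taint Linux Nodes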