Move to using Istio 1.1.1. #3353

Merged
merged 2 commits into from Apr 2, 2019

Conversation

tcnghia
Contributor

@tcnghia tcnghia commented Mar 4, 2019

Proposed Changes

  • Move to using Istio 1.1.1.

Release Note

Move to using Istio 1.1.1.

@knative-prow-robot knative-prow-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 4, 2019
@tcnghia
Contributor Author

tcnghia commented Mar 7, 2019

/retest

@tcnghia
Contributor Author

tcnghia commented Mar 7, 2019

@adrcunha that is correct.

Istio 1.1 splits the installation into two steps: a first installer runs to install all the CRDs, and the rest of Istio is installed afterward. This reduces the existing race condition, because we can wait for the CRD installer to complete before continuing.
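
A minimal sketch of that wait step, not the script used in this PR. The `wait_until` helper is hypothetical, and the commented usage assumes the CRD installer runs as Kubernetes Jobs in the `istio-system` namespace, which this thread does not confirm.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Run the command; stop retrying as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # The command never succeeded within the allowed attempts.
  return 1
}

# Assumed usage (requires a cluster; the Job-based check is an assumption):
#   kubectl apply -f istio-crds.yaml
#   wait_until 30 2 kubectl wait --for=condition=complete job --all \
#     -n istio-system --timeout=10s
#   kubectl apply -f istio.yaml
```

Retrying around the check, rather than sleeping for a fixed time, keeps the second install step from starting before the CRDs exist.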

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 7, 2019
@knative-prow-robot knative-prow-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 7, 2019
@tanzeeb
Contributor

tanzeeb commented Mar 12, 2019

I think the 503s in these conformance tests are because of segfaults in the istio-ingressgateway. Logs from my cluster:

[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:81] Caught Segmentation fault, suspect faulting address 0x0          
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:67] Backtrace (use tools/stack_decode.py to get line numbers):       
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #0: __restore_rt [0x7f71d9903390]                                
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #1: std::_Function_handler<>::_M_invoke() [0x8ae399]             
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #2: Envoy::Event::DispatcherImpl::runPostCallbacks() [0x896cfd]  
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #3: event_process_active_single_queue [0xc2fa44]                 
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #4: event_base_loop [0xc2e62c]                                   
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #5: Envoy::Event::DispatcherImpl::run() [0x896bc3]               
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #6: Envoy::Server::WorkerImpl::threadRoutine() [0x8922c2]        
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #7: Envoy::Thread::ThreadImplPosix::ThreadImplPosix()::$_0::__inv
[bazel-out/k8-opt/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:73] #8: start_thread [0x7f71d98f96ba]                                
 warn    Epoch 1 terminated with an error: signal: segmentation fault (core dumped)                                                                                        
 warn    Aborted all epochs                                                                                                                                                
 info    Epoch 1: set retry delay to 200ms, budget to 9                                                                                                                    
 info    Reconciling retry (budget 9)                                                                                                                                      
 info    Epoch 0 starting                                                                                                                                                  
 info    Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-ingressgateway --servi

The 503s are otherwise non-deterministic: they happen at different times, on different tests, and never happen when you run a specific test on its own.

Update:

I haven't seen a segfault all afternoon, but I still see the 503s.

In addition, I had an EnvoyFilter running the whole time. I still see 503s after removing it.

@tcnghia
Contributor Author

tcnghia commented Mar 13, 2019

/retest

@tcnghia tcnghia changed the title from "[WIP] Move to using Istio 1.1.0-rc.2 to test our compatibility." to "[WIP] Move to using Istio 1.1.0-rc.4 to test our compatibility." Mar 13, 2019
@tcnghia
Contributor Author

tcnghia commented Mar 14, 2019

/retest

1 similar comment
@chaodaiG
Contributor

/retest

@chaodaiG
Contributor

/test pull-knative-serving-upgrade-tests

> ../istio-crds.yaml

# Create a custom cluster local gateway, based on the Istio custom-gateway template.
helm template --namespace=istio-system \

We also need key gateways.istio-ingressgateway.sds.enabled=true in order to support #1767 and #3052.
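
A sketch of how that key could be passed to the `helm template` call under review, assuming the `--set` form is used. The `render_istio` wrapper and its chart-directory argument are hypothetical; the PR's actual flag list is not visible in this excerpt.

```shell
# Hypothetical wrapper around the helm call; only --namespace and the sds key
# come from this review thread, everything else is an assumption.
render_istio() {
  chart_dir="$1"
  # Render the chart with SDS enabled on the ingress gateway.
  helm template --namespace=istio-system \
    --set gateways.istio-ingressgateway.sds.enabled=true \
    "$chart_dir"
}

# Assumed usage; the chart path and output file are placeholders:
#   render_istio path/to/istio-chart > istio.yaml
```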

@knative-prow-robot knative-prow-robot added area/API API objects and controllers area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features labels Mar 25, 2019
@tcnghia tcnghia changed the title from "[WIP] Move to using Istio 1.1.0-rc.4 to test our compatibility." to "[WIP] Move to using Istio 1.1.0 to test our compatibility." Mar 25, 2019
@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

Conformance tests and scale tests are passing consistently now.

Tests that fail consistently:

  • TestDestroyPodInFlight: we may have to add back our graceful shutdown patch.
  • TestServiceToServiceCall/*
  • TestAutoscaleUpDownUp
  • TestActivatorOverload
  • TestGrpc*FromZero

@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

/test pull-knative-serving-integration-tests

@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

Only TestDestroyPodInFlight and TestServiceToServiceCall/* are failing now.

/test pull-knative-serving-integration-tests

@tcnghia tcnghia changed the title from "[WIP] Move to using Istio 1.1.0 to test our compatibility." to "[WIP] Move to using Istio 1.1.1 to test our compatibility." Mar 26, 2019
@tcnghia tcnghia force-pushed the istio-1.1 branch 2 times, most recently from 7b32c38 to c8d68fc Compare March 26, 2019 17:46
@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

/test pull-knative-serving-integration-tests

@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

/test pull-knative-serving-unit-tests

@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

Looks like all tests are passing now.

@tcnghia
Contributor Author

tcnghia commented Mar 26, 2019

Summary:

Other than that, our tests seem to be happy with Istio 1.1.1.

@tcnghia
Contributor Author

tcnghia commented Mar 29, 2019

/retest

@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Mar 29, 2019
@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 30, 2019
@tanzeeb
Contributor

tanzeeb commented Apr 1, 2019

/test pull-knative-serving-upgrade-tests
/test pull-knative-serving-integration-tests

@knative-prow-robot knative-prow-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 2, 2019
@tcnghia tcnghia changed the title from "[WIP] Move to using Istio 1.1.1 to test our compatibility." to "Move to using Istio 1.1.1." Apr 2, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 2, 2019
Member

@mattmoor mattmoor left a comment

/lgtm
/approve

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 2, 2019
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattmoor, tcnghia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 2, 2019
@knative-prow-robot knative-prow-robot merged commit 8a6c8f9 into knative:master Apr 2, 2019
Labels

  • approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
  • area/API (API objects and controllers)
  • area/networking
  • area/test-and-release (It flags unit/e2e/conformance/perf test issues for product features)
  • cla: yes (Indicates the PR's author has signed the CLA.)
  • lgtm (Indicates that a PR is ready to be merged.)
  • size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)

9 participants