[BUG] The GlobalTrafficPolicy doesn't failover when weights declared #134

wenxian opened this issue Aug 8, 2020 · 4 comments
@wenxian

wenxian commented Aug 8, 2020

Describe the bug
When weights are declared in a GlobalTrafficPolicy, 10 consecutive 5xx errors (consecutive5xxErrors: 10) do not trigger failover to the other region.

Steps To Reproduce

apiVersion: admiral.io/v1alpha1
kind: GlobalTrafficPolicy
metadata:
  name: gtp-admiral-sample
  namespace: sample-admiral
  labels:
    env: default
    identity: webapp-sample-admiral
spec:
  policy:
  - dns: default.webapp-sample-admiral.global
    lbType: 1 #0 represents TOPOLOGY, 1 represents FAILOVER
    target:
    - region: us-west-2
      weight: 10
    - region: us-east-1
      weight: 90

Expected behavior
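With the GTP weights (90/10) applied, a service that returns ten consecutive 500 responses is not ejected, so traffic does not fail over to the other region. It should fail over, as it does when no GTP is applied.

Without a GTP, failover works after 10 consecutive 500 errors.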

@wenxian wenxian added the bug Something isn't working label Aug 8, 2020
@aattuluri
Contributor

@wenxian This could be an Istio issue, but I remember this being tested on at least Istio 1.5.x.
i) What Istio version are you using?
ii) Can you paste the destination rule generated after applying this GTP?

@wenxian
Author

wenxian commented Aug 11, 2020

We are using Istio 1.6.

Namespace:    admiral-sync
Labels:       <none>
Annotations:  <none>
API Version:  networking.istio.io/v1beta1
Kind:         DestinationRule
Metadata:
  Creation Timestamp:  2020-08-10T21:22:01Z
  Generation:          10
  Resource Version:    170254904
  Self Link:           /apis/networking.istio.io/v1beta1/namespaces/admiral-sync/destinationrules/default.greeting-sample-showgtp.global-default-dr
  UID:                 83651bea-145a-4fdc-8efb-92601c695c76
Spec:
  Host:  default.greeting-sample-showgtp.global
  Traffic Policy:
    Load Balancer:
      Locality Lb Setting:
        Distribute:
          From:  us-east-1/*
          To:
            us-east-1:  99 #50
            us-west-2:  1 #50
      Simple:           ROUND_ROBIN
    Outlier Detection:
      Base Ejection Time:    120s
      consecutive5xxErrors:  10
      Interval:              5s
    Tls:
      Mode:  ISTIO_MUTUAL
Events:      <none>
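
As a reading aid for the Outlier Detection block above, here is a minimal sketch of how a consecutive-5xx detector is commonly modeled. It is not Envoy's or Istio's code: the `HostState` class and `record_response` function are hypothetical names, and the assumption that the ejection period scales with the number of prior ejections is a simplification labeled in the comments.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Per-host counters for a simplified consecutive-5xx outlier detector."""
    consecutive_5xx: int = 0
    times_ejected: int = 0
    ejected_until: float = 0.0  # timestamp (seconds) until which the host is ejected


def record_response(host: HostState, status: int, now: float,
                    threshold: int = 10, base_ejection_s: float = 120.0) -> bool:
    """Record one response and return True if it causes the host to be ejected.

    threshold and base_ejection_s mirror consecutive5xxErrors: 10 and
    Base Ejection Time: 120s from the destination rule in this thread.
    """
    if 500 <= status <= 599:
        host.consecutive_5xx += 1
    else:
        host.consecutive_5xx = 0  # any non-5xx response resets the streak

    if host.consecutive_5xx >= threshold:
        host.times_ejected += 1
        # Assumption: the ejection period grows with the number of ejections.
        host.ejected_until = now + base_ejection_s * host.times_ejected
        host.consecutive_5xx = 0
        return True
    return False


host = HostState()
results = [record_response(host, 500, now=float(i)) for i in range(10)]
print(results[-1], host.ejected_until)
```

In this model the tenth consecutive 500 ejects the host, which is the behavior the issue expects to see before traffic moves to the other region.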

I am calling from us-east-1. I found that as long as the local (us-east-1) weight is >= 50, calls always stay local (us-east-1), which means:
1. The weights are not applied (10 of 10 requests land in east).
2. Traffic does not fail over to the remote region (west).
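
To judge whether "10 of 10 in east" is evidence that the weights are ignored, it helps to compute how likely that outcome is under the configured split. The sketch below assumes each request independently picks the local region with probability equal to its weight; `prob_all_local` is a hypothetical helper, not part of Admiral or Istio.

```python
def prob_all_local(local_weight_pct: float, n_requests: int) -> float:
    """Probability that all n_requests land in the local region,
    assuming independent picks with P(local) = local_weight_pct / 100."""
    return (local_weight_pct / 100.0) ** n_requests


# 50/50 split: ten local hits in a row would be very unlikely.
print(prob_all_local(50, 10))
# 99/1 split (the values shown in the rule above): ten local hits are expected most of the time.
print(prob_all_local(99, 10))
```

Under these assumptions, ten local responses in a row point to a problem at 50/50 (roughly a 0.1% chance) but say little at 99/1 (roughly a 90% chance), so a larger sample is needed when the local weight is high.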

@aattuluri
Contributor

@wenxian I see the destination rule has been generated with the correct weights as per the spec; apparently the distribute setting is what sets the weights. Outlier detection might not be used here.

Looking at the Envoy clusters might help. Can you share the output of the following command:
istioctl proxy-config clusters <pod_name_of_source_workload> -o json
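
The JSON from this command is long, so a small script can pull out just the fields relevant here. This is a sketch that assumes the output is a JSON array of cluster objects with the field names seen in this thread (`loadAssignment`, `locality`, `loadBalancingWeight`, `outlierDetection`); `summarize_clusters` is a hypothetical helper, and the sample data is trimmed from the us-east-1 output posted in this issue.

```python
import json


def summarize_clusters(clusters, name_contains=".global"):
    """Return per-locality weights and outlier detection for matching clusters."""
    summary = []
    for cluster in clusters:
        name = cluster.get("name", "")
        if name_contains not in name:
            continue
        weights = {}
        for group in cluster.get("loadAssignment", {}).get("endpoints", []):
            region = group.get("locality", {}).get("region", "<none>")
            weights[region] = group.get("loadBalancingWeight")
        summary.append({
            "name": name,
            "locality_weights": weights,
            "outlier_detection": cluster.get("outlierDetection"),
        })
    return summary


# Trimmed sample shaped like the cluster dump in this issue.
sample = json.loads("""
[
  {
    "name": "outbound|80||default.greeting-sample-showgtp.global",
    "type": "STRICT_DNS",
    "loadAssignment": {
      "endpoints": [
        {"locality": {"region": "us-east-1"}, "loadBalancingWeight": 50},
        {"locality": {"region": "us-west-2"}, "loadBalancingWeight": 50}
      ]
    },
    "outlierDetection": {
      "consecutive5xx": 10,
      "interval": "5s",
      "baseEjectionTime": "120s",
      "enforcingConsecutive5xx": 100
    }
  }
]
""")

for row in summarize_clusters(sample):
    print(row)
```

To run it against a real proxy, redirect the command output to a file and load it with `json.load` in place of the inline sample.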

@wenxian
Author

wenxian commented Aug 12, 2020

us-east-1

  "name": "outbound|80||default.greeting-sample-showgtp.global",
        "type": "STRICT_DNS",
        "connectTimeout": "10s",
        "loadAssignment": {
            "clusterName": "outbound|80||default.greeting-sample-showgtp.global",
            "endpoints": [
                {
                    "locality": {
                        "region": "us-east-1"
                    },
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "greeting.sample-showgtp.svc.cluster.local",
                                        "portValue": 80
                                    }
                                }
                            },
                            "loadBalancingWeight": 1
                        }
                    ],
                    "loadBalancingWeight": 50
                },
                {
                    "locality": {
                        "region": "us-west-2"
                    },
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "a5020c7e4380642f09c42334f5d06314-b30f0b24ce995299.elb.us-west-2.amazonaws.com",
                                        "portValue": 15443
                                    }
                                }
                            },
                            "loadBalancingWeight": 1
                        }
                    ],
                    "loadBalancingWeight": 50
                }
            ]
        },
        "circuitBreakers": {
            "thresholds": [
                {
                    "maxConnections": 4294967295,
                    "maxPendingRequests": 4294967295,
                    "maxRequests": 4294967295,
                    "maxRetries": 4294967295
                }
            ]
        },
        "dnsRefreshRate": "5s",
        "respectDnsTtl": true,
        "dnsLookupFamily": "V4_ONLY",
        "outlierDetection": {
            "consecutive5xx": 10,
            "interval": "5s",
            "baseEjectionTime": "120s",
            "enforcingConsecutive5xx": 100
        },
        "commonLbConfig": {
            "healthyPanicThreshold": {},
            "localityWeightedLbConfig": {}
        },
        "transportSocket": {
            "name": "envoy.transport_sockets.tls",
            "typedConfig": {
                "@type": "type.googleapis.com/envoy.api.v2.auth.UpstreamTlsContext",
                "commonTlsContext": {
                    "tlsCertificateSdsSecretConfigs": [
                        {
                            "name": "default",
                            "sdsConfig": {
                                "apiConfigSource": {
                                    "apiType": "GRPC",
                                    "grpcServices": [
                                        {
                                            "envoyGrpc": {
                                                "clusterName": "sds-grpc"
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    ],
                    "combinedValidationContext": {
                        "defaultValidationContext": {},
                        "validationContextSdsSecretConfig": {
                            "name": "ROOTCA",
                            "sdsConfig": {
                                "apiConfigSource": {
                                    "apiType": "GRPC",
                                    "grpcServices": [
                                        {
                                            "envoyGrpc": {
                                                "clusterName": "sds-grpc"
                                            }
                                        }
                                    ]
--
                "sni": "outbound_.80_._.default.greeting-sample-showgtp.global"
            }
        },
        "metadata": {
            "filterMetadata": {
                "istio": {
                    "config": "/apis/networking.istio.io/v1alpha3/namespaces/admiral-sync/destination-rule/default.greeting-sample-showgtp.global-default-dr"
                }
            }
        },
        "filters": [
            {
                "name": "istio.metadata_exchange",
                "typedConfig": {
                    "@type": "type.googleapis.com/udpa.type.v1.TypedStruct",
                    "typeUrl": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
                    "value": {
                        "protocol": "istio-peer-exchange"
                    }
                }
            }
        ]
    },

My setup is us-east-1 (Admiral server and Admiral remote) and us-west-2 (Admiral remote). The 50/50 distribution works in west but not in east.

The east cluster does send requests to west (the LB), but the west LB still appears to return the east response. As a result, requests always appear to be served from east.

us-west-2

 "name": "outbound|80||default.greeting-sample-showgtp.global",
        "type": "STRICT_DNS",
        "connectTimeout": "10s",
        "loadAssignment": {
            "clusterName": "outbound|80||default.greeting-sample-showgtp.global",
            "endpoints": [
                {
                    "locality": {
                        "region": "us-east-1"
                    },
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "a4e692a23991b478ca62ea84881d79da-53c356a7441bc499.elb.us-east-1.amazonaws.com",
                                        "portValue": 15443
                                    }
                                }
                            },
                            "loadBalancingWeight": 1
                        }
                    ],
                    "loadBalancingWeight": 50
                },
                {
                    "locality": {
                        "region": "us-west-2"
                    },
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "greeting.sample-showgtp.svc.cluster.local",
                                        "portValue": 80
                                    }
                                }
                            },
                            "loadBalancingWeight": 1
                        }
                    ],
                    "loadBalancingWeight": 50
                }
            ]
        },
        "circuitBreakers": {
            "thresholds": [
                {
                    "maxConnections": 4294967295,
                    "maxPendingRequests": 4294967295,
                    "maxRequests": 4294967295,
                    "maxRetries": 4294967295
                }
            ]
        },
        "dnsRefreshRate": "5s",
        "respectDnsTtl": true,
        "dnsLookupFamily": "V4_ONLY",
        "outlierDetection": {
            "consecutive5xx": 10,
            "interval": "5s",
            "baseEjectionTime": "120s",
            "enforcingConsecutive5xx": 100
        },
        "commonLbConfig": {
            "healthyPanicThreshold": {},
            "localityWeightedLbConfig": {}
        },
        "transportSocket": {
            "name": "envoy.transport_sockets.tls",
            "typedConfig": {
                "@type": "type.googleapis.com/envoy.api.v2.auth.UpstreamTlsContext",
                "commonTlsContext": {
                    "tlsCertificateSdsSecretConfigs": [
                        {
                            "name": "default",
                            "sdsConfig": {
                                "apiConfigSource": {
                                    "apiType": "GRPC",
                                    "grpcServices": [
                                        {
                                            "envoyGrpc": {
                                                "clusterName": "sds-grpc"
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    ],
                    "combinedValidationContext": {
                        "defaultValidationContext": {},
                        "validationContextSdsSecretConfig": {
                            "name": "ROOTCA",
                            "sdsConfig": {
                                "apiConfigSource": {
                                    "apiType": "GRPC",
                                    "grpcServices": [
                                        {
                                            "envoyGrpc": {
                                                "clusterName": "sds-grpc"
                                            }
                                        }
                                    ]
--
                "sni": "outbound_.80_._.default.greeting-sample-showgtp.global"
            }
        },
        "metadata": {
            "filterMetadata": {
                "istio": {
                    "config": "/apis/networking.istio.io/v1alpha3/namespaces/admiral-sync/destination-rule/default.greeting-sample-showgtp.global-default-dr"
                }
            }
        },
        "filters": [
            {
                "name": "istio.metadata_exchange",
                "typedConfig": {
                    "@type": "type.googleapis.com/udpa.type.v1.TypedStruct",
                    "typeUrl": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
                    "value": {
                        "protocol": "istio-peer-exchange"
                    }
                }
            }
        ]
    },

The following command always returns the east answer:
curl -HHost:default.greeting-sample-showgtp.global a5020c7e4380642f09c42334f5d06314-b30f0b24ce995299.elb.us-west-2.amazonaws.com

--- UPDATE ---
I found that if the west cluster has more weight, requests from west always stay in the west cluster.

(US-EAST-1 >= 50, US-WEST-2) -> Requests from East always return East; West is good
(US-WEST-2 > 50, US-EAST-1) -> Requests from West always return West; East is good

This means that when a cluster (locality) has the larger weight, requests originating in that cluster can always end up in that same cluster (because the LB always resolves to its own cluster).

@aattuluri aattuluri added not a bug and removed bug Something isn't working labels Sep 2, 2020
itsLucario pushed a commit to itsLucario/admiral that referenced this issue Aug 9, 2022
…ESH-1743-DR-bk

MESH-1743 Fix missed vs delete