Dynamically create cluster while the router decoding Headers #1998

Closed
linchunquan opened this issue Nov 3, 2017 · 5 comments
Labels
question Questions that are neither investigations, bugs, nor enhancements

Comments

@linchunquan

Issue Template

Title: Dynamically create cluster while the router decoding http headers

Description:

Hi, I want to build a dynamic service-discovery mechanism based on the sidecar pattern for gRPC services on k8s, with Envoy acting as the sidecar that proxies all gRPC request/response traffic. In my case, the clusters are unknown to the service consumer until a gRPC request arrives. For this reason I would leave the cluster name empty in the route config section, as in the following:

  "routes": [{
      "timeout_ms": 0,
      "prefix": "/",
      "headers": [
         {"name": "content-type", "value": "application/grpc"}
      ],
      /**no cluster name defined here*/
  }]

Each gRPC request sent by the consumer would carry a target service ID in a header, and that ID corresponds to a k8s service name (e.g. X-target-service: sample-service, where 'sample-service' can be looked up through the k8s API).

When the Envoy router decodes the HTTP headers and encounters a header named ‘X-target-service’, I suppose it could first look up the cluster object in a hash map keyed by the ‘X-target-service’ header value. If the cluster object is null, it would invoke the k8s API to fetch the endpoints of the target service, and finally create a new cluster with the target service name and endpoints.

The whole process can be briefly summarized as follows:

  1. The consumer sends a gRPC request to Envoy with an additional header ‘X-target-service’ that contains the target service ID.
  2. Envoy checks whether a cluster with the same name as the target service ID already exists.
  3. If no such cluster exists, Envoy fetches the service endpoints through external APIs and creates a new cluster.
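
The steps above amount to a lookup-or-create on a map keyed by service ID. A minimal, standalone sketch of that idea follows. It is not Envoy code: `Cluster`, `ClusterRegistry`, and `EndpointResolver` are hypothetical names invented for illustration, and the resolver stands in for whatever call would query the k8s API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cluster: a name plus its endpoints.
struct Cluster {
  std::string name;
  std::vector<std::string> endpoints;
};

// Hypothetical resolver type; a real one would query the k8s API.
using EndpointResolver =
    std::function<std::vector<std::string>(const std::string&)>;

class ClusterRegistry {
public:
  explicit ClusterRegistry(EndpointResolver resolver)
      : resolver_(std::move(resolver)) {}

  // Step 2: look the cluster up by service ID.
  // Step 3: on a miss, resolve the endpoints and create the cluster.
  const Cluster& getOrCreate(const std::string& service_id) {
    auto it = clusters_.find(service_id);
    if (it == clusters_.end()) {
      Cluster created{service_id, resolver_(service_id)};
      it = clusters_.emplace(service_id, std::move(created)).first;
    }
    return it->second;
  }

  std::size_t size() const { return clusters_.size(); }

private:
  EndpointResolver resolver_;
  std::unordered_map<std::string, Cluster> clusters_;
};
```

The sketch is single-threaded and resolves endpoints synchronously. Inside a proxy, the resolver call would block request handling while it runs, which is the cost the replies below are concerned with.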

Would it be possible to implement this mechanism?

Thanks very much

@mattklein123 mattklein123 added the question Questions that are neither investigations, bugs, nor enhancements label Nov 3, 2017
@mattklein123
Member

I don't think this is something we are likely to support in Envoy any time soon. It's not that it's impossible, it's more that it's a big departure from the separation of concerns between data plane and control plane that we have in Envoy today. We already support header routing, but assume that an external CDS server is providing the clusters that the Envoy needs.

I would recommend taking a look at Istio for this case.

cc @rshriram

@rshriram
Member

rshriram commented Nov 5, 2017

This is basically what we do in Istio. A core tenet of Envoy’s design is to eliminate expensive operations from the data plane and shove all the complexity into the control plane.

In Istio, we watch the k8s API server, push new clusters per service, and update routes accordingly. You then write route rules that match on an HTTP header and pick the target cluster. All this gets materialized in Envoy via CDS and RDS.
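
To make the division of labor concrete, here is a rough standalone sketch of the data-plane side of that arrangement. It is neither Envoy nor Istio code: `RouteRule`, `ProxyConfig`, and `pickCluster` are hypothetical names, and the assumption is that a control plane has already pushed both the cluster set and the route rules, so the request path only does in-memory matching.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical route rule: match one header value, pick one cluster.
struct RouteRule {
  std::string header_name;
  std::string header_value;
  std::string cluster;
};

// State held by the proxy; in this sketch only the control plane writes it.
struct ProxyConfig {
  std::unordered_set<std::string> clusters; // pushed over a CDS-like channel
  std::vector<RouteRule> routes;            // pushed over an RDS-like channel
};

using Headers = std::unordered_map<std::string, std::string>;

// Data-plane lookup: returns the chosen cluster name, or an empty string
// when no rule matches or the rule names a cluster that was never pushed.
// No external call is made on the request path.
std::string pickCluster(const ProxyConfig& config, const Headers& headers) {
  for (const RouteRule& rule : config.routes) {
    auto it = headers.find(rule.header_name);
    if (it != headers.end() && it->second == rule.header_value &&
        config.clusters.count(rule.cluster) > 0) {
      return rule.cluster;
    }
  }
  return std::string();
}
```

The design choice this illustrates is that a miss is answered immediately from local state, and it is the control plane's job to have pushed the cluster beforehand.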

@linchunquan
Author

linchunquan commented Nov 6, 2017

Hi Rshriram, Istio is really cool. However, I don't want to watch changes to all services in k8s, because we already have a k8s cluster that contains more than 2000 services. In my case, I just need to check the HTTP header that points to the target service ID and, only when the cluster does not exist in Envoy’s cluster pool, query the k8s APIs and create a new one.

@linchunquan
Author

linchunquan commented Nov 6, 2017

I have tried adding code that dynamically creates a cluster to the function ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers, bool end_stream) in
/source/common/http/conn_manager_impl.cc:

void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers, bool end_stream) {
... ...
  const std::string service_id = (*request_headers_).get(Http::LowerCaseString("target-service"))->value().c_str();
  ENVOY_STREAM_LOG(debug, "header of target service '{}'", *this, service_id);
  //mock a response of fetching cluster config dynamically
  const std::string response_body = "{\"clusters\": [{\"name\": \"target-service\",\"connect_timeout_ms\": 250,\"type\": \"static\",\"lb_type\": \"round_robin\",\"features\": \"http2\",\"hosts\":[{\"url\":\"tcp://127.0.0.1:50051\"}] }]}";
  Json::ObjectSharedPtr response_json = Json::Factory::loadFromString(response_body);
  response_json->validateSchema(Json::Schema::CDS_SCHEMA);
  ENVOY_STREAM_LOG(debug, "success validate json '{}'", *this, response_body);
  std::vector<Json::ObjectSharedPtr> clusters = response_json->getObjectArray("clusters");
  Protobuf::RepeatedPtrField<envoy::api::v2::Cluster> resources;
  for (const Json::ObjectSharedPtr& cluster : clusters) {
    ENVOY_STREAM_LOG(debug, "looping clusters '{}'", *this, 1);
    envoy::api::v2::ConfigSource eds_config;
    Envoy::Config::CdsJson::translateCluster(*cluster, eds_config, *resources.Add());
  }
  Upstream::ThreadLocalCluster* cluster = connection_manager_.cluster_manager_.get(service_id);
  if (!cluster) {
    ENVOY_STREAM_LOG(debug, "create new cluster '{}'", *this, service_id);
    for (auto& c : resources) {
      const std::string cluster_name = c.name();
      if (connection_manager_.cluster_manager_.addOrUpdatePrimaryCluster(c)) {
        ENVOY_LOG(info, "cds: add/update cluster '{}'", cluster_name);
      }
    }
  }
... ...
}

Then, I fetch the freshly created cluster in the function Http::FilterHeadersStatus Filter::decodeHeaders(Http::HeaderMap& headers, bool end_stream) in /source/common/router/router.cc:

Http::FilterHeadersStatus Filter::decodeHeaders(Http::HeaderMap& headers, bool end_stream) {
... ...
  route_entry_ = route_->routeEntry();
  Upstream::ThreadLocalCluster* cluster = config_.cm_.get(route_entry_->clusterName());
  if (!cluster) {
    const std::string service_id = headers.get(Http::LowerCaseString("target-service"))->value().c_str();
    ENVOY_STREAM_LOG(debug, "try to find cluster of header-target: '{}'", *callbacks_, service_id);
    Upstream::ThreadLocalCluster* cluster = config_.cm_.get(service_id);
    if (!cluster) {
      config_.stats_.no_cluster_.inc();
      ENVOY_STREAM_LOG(debug, "unknown cluster '{}'", *callbacks_, route_entry_->clusterName());
      callbacks_->requestInfo().setResponseFlag(Http::AccessLog::ResponseFlag::NoRouteFound);
      Http::HeaderMapPtr response_headers{new Http::HeaderMapImpl{
          {Http::Headers::get().Status, std::to_string(enumToInt(Http::Code::NotFound))}}};
      callbacks_->encodeHeaders(std::move(response_headers), true);
      return Http::FilterHeadersStatus::StopIteration;
    }
  } else {
    ENVOY_STREAM_LOG(debug, "found cluster '{}'", *callbacks_, route_entry_->clusterName());
  }
... ...
}

I found that a severe error occurs at the following line:

Upstream::ThreadLocalCluster* cluster = config_.cm_.get(service_id);

and the error logs were:

[2017-11-06 17:17:03.308][6302][info][upstream] source/common/upstream/cluster_manager_impl.cc:286] add/update cluster target-service
[2017-11-06 17:17:03.308][6303][debug][upstream] source/common/upstream/cluster_manager_impl.cc:294] adding TLS cluster target-service
[2017-11-06 17:17:03.308][6304][debug][upstream] source/common/upstream/cluster_manager_impl.cc:294] adding TLS cluster target-service
[2017-11-06 17:17:03.308][6305][debug][upstream] source/common/upstream/cluster_manager_impl.cc:294] adding TLS cluster target-service
[2017-11-06 17:17:03.308][6302][debug][upstream] source/common/upstream/cluster_manager_impl.cc:294] adding TLS cluster target-service
[2017-11-06 17:17:03.308][6302][info][http] source/common/http/conn_manager_impl.cc:515] cds: add/update cluster 'target-service'
[2017-11-06 17:17:03.308][6302][debug][router] source/common/router/router.cc:286] [C1][S4096203911101847691] try to find cluster of header-target: 'target-service'
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:101] Caught Segmentation fault, suspect faulting address 0x0
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:85] Backtrace obj<./envoy> thr<6302> (use tools/stack_decode.py):
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #0 0x60d07e
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #1 0x563bd3
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #2 0x565f98
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #3 0x57cbe7
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #4 0x584cd0
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #5 0x57b9e6
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #6 0x561aa7
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #7 0x53ccf8
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #8 0x53bcf8
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #9 0x53c354
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #10 0x4c8517
[2017-11-06 17:17:03.308][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #11 0x7cdc21
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #12 0x7ce37e
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #13 0x4c4708
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #14 0x7d7acd
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<6302> obj</lib/x86_64-linux-gnu/libpthread.so.0>
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #15 0x7f28f39ac181
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<6302> obj</lib/x86_64-linux-gnu/libc.so.6>
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<6302> #16 0x7f28f33d347c
[2017-11-06 17:17:03.309][6302][critical][backtrace] bazel-out/local-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:97] end backtrace thread 6302
Segmentation fault

It seems that consuming a newly created cluster within the same request is not allowed, because it goes against Envoy's thread model. Is that correct?
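
Separately from the threading question, one detail in the router.cc excerpt above may be worth checking: the line `Upstream::ThreadLocalCluster* cluster = config_.cm_.get(service_id);` sits inside the `if (!cluster)` block and declares a new variable that shadows the outer `cluster`. If the elided code after the block dereferences the outer pointer, that pointer is still null even when the second lookup succeeds. Whether this is what caused the crash cannot be determined from the excerpt. The standalone sketch below only reproduces the shadowing pattern, using hypothetical types (`Cluster`, `ClusterMap`) in place of Envoy's.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the cluster manager and its entries.
struct Cluster {
  std::string name;
};
using ClusterMap = std::unordered_map<std::string, Cluster>;

const Cluster* find(const ClusterMap& m, const std::string& key) {
  auto it = m.find(key);
  return it == m.end() ? nullptr : &it->second;
}

// Same shape as the excerpt: the inner declaration shadows the outer one,
// so the outer pointer stays null after a successful fallback lookup.
const Cluster* lookupShadowed(const ClusterMap& m, const std::string& route_name,
                              const std::string& header_name) {
  const Cluster* cluster = find(m, route_name);
  if (!cluster) {
    const Cluster* cluster = find(m, header_name); // new variable, not the outer one
    if (!cluster) {
      return nullptr;
    }
  }
  return cluster; // refers to the outer variable
}

// Assigning to the existing variable keeps the fallback result.
const Cluster* lookupAssigned(const ClusterMap& m, const std::string& route_name,
                              const std::string& header_name) {
  const Cluster* cluster = find(m, route_name);
  if (!cluster) {
    cluster = find(m, header_name);
  }
  return cluster;
}
```

With a map that contains only the header-named cluster, `lookupShadowed` returns a null pointer while `lookupAssigned` returns the cluster.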

@rshriram
Member

rshriram commented Nov 6, 2017 via email

rshriram pushed a commit to rshriram/envoy that referenced this issue Oct 30, 2018
Signed-off-by: Wayne Zhang <qiwzhang@google.com>
jpsim added a commit that referenced this issue Nov 28, 2022
This reverts commit aeab7fe443e41a06687137a52002223f084f9db8.

Unfortunately still not working on M1.

E.g.

```
$ ./bazelw build //library/common:envoy_main_interface_lib
ERROR: While resolving toolchains for target @com_envoyproxy_protoc_gen_validate//:protoc-gen-validate: no matching toolchains found for types @io_bazel_rules_go//go:toolchain
ERROR: Analysis of target '//library/common:envoy_main_interface_lib' failed; build aborted: no matching toolchains found for types @io_bazel_rules_go//go:toolchain
```

Signed-off-by: JP Simard <jp@jpsim.com>