Feature request: Option to automatically create DNS record for virtual service #330

rizblie · 2021-03-12T16:07:05Z

I've just been hit by the issue documented at: #71, where my application front-end could not resolve my back-end virtual service name because there was no DNS entry for it.

While the docs have been updated to address this issue, I don't think the recommended approach makes sense in situations where a virtual service is associated with a virtual router (or wherever there is a possibility of multiple real service implementations).

The virtual service docs say:

You can choose any name, but the service discovery name of the real service that you're targeting, such as my-service.default.svc.cluster.local, is recommended to make it easier to correlate your virtual services to real services and so that you don't need to change your code to reference a different name than your code currently references.

One of the purposes served by a virtual service is that you can route traffic to one of many underlying real service implementations, each of which has its own distinct name, e.g. colorteller-red, colorteller-blue etc. In such cases, the name of the virtual service should be neutral, and should not be tied to any one specific underlying service implementation, i.e. in this case the best name would simply be colorteller.

Another use of virtual services is to support blue/green deployments when moving to a new version e.g. serviceA-v1 moving to serviceA-v2, and for this reason it would not make sense for the virtual service name to include the version number of the first (or any) real version - it should just be serviceA.

For cases like the above, where the virtual service name needs to be distinct from all the underlying real service names, it would be great if App Mesh could offer the option to automatically create a DNS record for a virtual service when it is created.

bigdefect · 2021-03-13T00:08:27Z

Hey Mike. I can see why that paragraph is interpreted that way. I think the intent was to work forward from an existing application which is already referencing some service by a name. So the virtual service could be serviceA.local to match the application, and the initial virtual node would probably also have serviceA.local as service discovery (since a service is already being served by that name), until new revisions come up which could arbitrarily change to serviceA-v2.local and so on just for the service discovery component of the virtual nodes.

So, to be clear, your last paragraph is the intent for virtual services; stable name while the services behind it change.

The problem arises because today the proxy isn't intercepting DNS, so the application still needs to resolve the name before Envoy takes over and performs the real service discovery lookup. A bunch of our examples use a Cloud Map private DNS namespace and have the virtual service name match the service discovery name, so the records are created. Other workarounds include creating a dummy record in Route 53 or in the local hosts file; the application just needs to send the request out to the proxy.
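As an illustration of the Route 53 workaround mentioned above, here is a minimal sketch of the change batch for a placeholder record. It assumes a private hosted zone already exists for the namespace. The helper name `dummy_record_change_batch`, the record name `colorteller.demo.local`, and the address `10.10.10.10` are hypothetical; the address only has to make the lookup succeed so that the request leaves the application and reaches the proxy.

```python
def dummy_record_change_batch(virtual_service_name, placeholder_ip="10.10.10.10", ttl=300):
    """Build a Route 53 change batch that upserts a placeholder A record.

    The IP is a stand-in (an assumption for this sketch): the proxy is
    expected to route the request based on the mesh configuration.
    """
    return {
        "Comment": "Placeholder record so the virtual service name resolves",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": virtual_service_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": placeholder_ip}],
                },
            }
        ],
    }


batch = dummy_record_change_batch("colorteller.demo.local")

# With boto3 installed and credentials configured, the batch could be applied
# roughly like this (the hosted zone ID is a placeholder, not a real value):
# import boto3
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your-private-zone-id>", ChangeBatch=batch)
```

The batch is built separately from the API call so it can be inspected or reviewed before anything is written to the hosted zone.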

I'm not sure if App Mesh creating DNS records is the right mechanism for this; removing the need to have resolvable DNS for the virtual service name may be a better solution, so that the proxy takes over that initial DNS response for the application. I thought we already had an open issue I could point you to (to see if it matches up with your current understanding); I'll check with the team.

Did I miss anything?

rajal-amzn · 2021-03-15T23:58:55Z

I agree with @efe-selcuk. We have an open issue to track naming Virtual Services by any name instead of FQDN.
@rizblie Would you mind if I close this in favor of this issue?

rizblie · 2021-03-16T14:23:24Z

Hi @efe-selcuk - I understand your explanation, and it makes sense where there is an existing application that has been "meshified". But where a new application is being deployed from scratch, it feels wrong to create the expectation that version 1 should assume the "versionless" name, and that only subsequent versions should have different names. What is so special about version 1 that it should follow a different naming convention than subsequent versions?

Right now I am building a pipeline that will deploy a new ECS Service which includes the CommitID in the service name, so that each version has a different ECS service name. The intention is that the mesh virtual service+router can then be used to shift traffic to the new version. If I follow the AWS docs recommendation, then I cannot deploy version 1 using my pipeline because the name would need to be "versionless".

In such a situation I think that the "versionless" name belongs to the virtual service, and not to any specific version of the service.
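To make the pipeline scenario concrete, here is a rough sketch of how versioned names and a weighted route could fit together. The nested shape (`httpRoute`, `action`, `weightedTargets`, `virtualNode`, `weight`) follows my understanding of the App Mesh route spec and should be checked against the API reference. The helper names, the commit IDs, and the 7-character short hash are hypothetical choices made for illustration.

```python
def versioned_node_name(service, commit_id):
    """Derive a per-version name from the service name and a short commit ID."""
    return f"{service}-{commit_id[:7]}"


def weighted_route_spec(weights_by_node, prefix="/"):
    """Build an HTTP route spec that splits traffic across virtual nodes."""
    if not weights_by_node:
        raise ValueError("at least one virtual node is required")
    targets = [
        {"virtualNode": node, "weight": weight}
        for node, weight in weights_by_node.items()
    ]
    return {
        "httpRoute": {
            "match": {"prefix": prefix},
            "action": {"weightedTargets": targets},
        }
    }


# Hypothetical commit IDs for the current and the newly deployed version.
current = versioned_node_name("serviceA", "3f2c1ab9d0e4")
candidate = versioned_node_name("serviceA", "9b7e4c2a1f55")

# The virtual service keeps the versionless name "serviceA"; only the route
# behind it changes as traffic shifts to the new version.
spec = weighted_route_spec({current: 90, candidate: 10})
```

In this model every version, including the first, gets a versioned name, and the versionless name appears only on the virtual service.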

I understand why the problem arises. I just question whether the recommended solution is always the right solution, and that is why I thought having the option of adding the DNS name as part of the virtual service creation could work. But I agree that removing the need to have resolvable DNS for the virtual service name would be a better solution.

rizblie · 2021-03-16T14:41:43Z

Hi @rajal-amzn - issue #65 seems to be about using simple hostnames to call dependencies. If the solution for this means that I can also use a FQDN which is not resolvable by DNS, then that works for me. But this is not clear from the description on the ticket - it refers to ticket #71, but it does not explicitly state that the unresolvable FQDN issue will be addressed by the same solution as for simple hostnames.

saleem-mirza · 2021-03-19T01:00:36Z

The documentation suggests that if an application needs to access external services, I need to give my virtual services the same names as the external services.

For example, if the application is accessing google.com, I need to create a virtual service named google.com too. I am wondering what the reason was behind this rigid design decision.

If my application needs to point to a different resource, I need to update my application, virtual node, and virtual services. This is a lot to change and defeats the purpose of using a service mesh.

What I'm proposing is that the application can choose any generic name for a service, and the virtual node should be responsible for routing traffic to the desired service based on its DNS hostname. This could simplify things greatly.

rizblie · 2021-03-19T18:27:05Z

Seems to me that there are at least 2 considerations which give rise to various use cases. Consider a situation where a downstream service A is calling an upstream service B which is configured as a virtual service in the mesh.

1. Downstream service reference to upstream service name
a. The downstream service code may already call an upstream service using a FQDN DNS name which (for whatever reason) cannot be changed easily e.g. serviceB.demo.local. In this case we want the virtual service to respond to the same FQDN name.
b. If the downstream service is new, or can easily be modified to change the target upstream service name, then a simple name e.g. serviceB will suffice i.e. does not need to be a FQDN.

2. Name of the upstream service implementing the virtual service
a. An existing upstream service is being meshified, and the virtual service name can reflect the current name of the upstream service - whether it is 1a or 1b above. Future versions of the service may have different names, but the virtual service name remains unchanged and is set up to route to new versions of the service as they are deployed.
b. A virtual service is being created for a new upstream service, where the service name always contains a version number (or Commit ID) as might be deployed by a pipeline. In this case the virtual service name is not the same as the first version of the service. Again this can be combined with both 1a and 1b.

In all combinations of these cases, the best solution would be for the Envoy proxy for the downstream service to map the referenced upstream name to the correct service (based on the virtual service/router config) WITHOUT any need for DNS resolution.
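A toy model of that idea, assuming the proxy holds a table keyed by the name the downstream service asks for. This is not how Envoy or App Mesh implement it; `ROUTES`, `pick_backend`, and all hostnames are hypothetical. It only shows that both a simple name (case 1b) and a FQDN (case 1a) can map to versioned backends without a DNS record for the virtual service name.

```python
import random

# Both spellings of the virtual service name map to the same weighted backends.
SERVICE_B_TARGETS = [
    ("serviceB-v1.demo.local", 90),
    ("serviceB-v2.demo.local", 10),
]
ROUTES = {
    "serviceB": SERVICE_B_TARGETS,             # case 1b: simple name
    "serviceB.demo.local": SERVICE_B_TARGETS,  # case 1a: existing FQDN
}


def pick_backend(requested_host, routes, rng=random):
    """Return a backend for a known virtual service name, or None otherwise.

    None means the name is not a virtual service, so the caller would fall
    back to ordinary DNS resolution.
    """
    targets = routes.get(requested_host)
    if targets is None:
        return None
    names = [name for name, _ in targets]
    weights = [weight for _, weight in targets]
    return rng.choices(names, weights=weights, k=1)[0]


backend = pick_backend("serviceB", ROUTES)
```

Only the backend names in the table would ever need real service discovery; the name used by the downstream service is matched by the proxy itself.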

If the downstream also wishes to call upstream services outside the mesh then IMHO it is useful to model these external upstream services within the mesh, using a virtual node and virtual router as described here. That way it is possible to change the target upstream service through the virtual service e.g. to move to a new version of an external API, without changing the downstream service.

In this case though the Envoy proxy of the downstream service would need to perform a DNS lookup, as there is no Envoy proxy associated with the external service to do this. Maybe we should have Envoy-only UpstreamProxies for external services, just as we have Envoy-only virtual gateways for ingress?

rizblie mentioned this issue Mar 29, 2021

Intercept and respond to DNS queries for Virtual Services using Envoy's DNS filter #65

Open

Y0Username added the feature label Mar 31, 2021

herrhound self-assigned this Apr 29, 2021

herrhound added the Phase: To Be Prioritized label Apr 29, 2021

shsahu removed the Phase: To Be Prioritized label Feb 5, 2022
