Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#849 #1496 Extend the route key format used for load balancing making it unique #1944

Merged
merged 26 commits into from
Feb 29, 2024

Conversation

sliekens
Copy link
Contributor

@sliekens sliekens commented Jan 27, 2024

Fixes #849 #1496

Proposed Changes

  • Make the RouteKeyCreator aware of UpstreamHost routing
  • Make the RouteKeyCreator aware of ServiceDiscovery with ServiceName and ServiceNamespace

Multiple routes with identical UpstreamMethod and UpstreamPath, but different UpstreamHost and ServiceName or DownstreamHostAndPort will now be load-balanced correctly.

@sliekens sliekens changed the base branch from main to develop January 27, 2024 05:28
@raman-m raman-m changed the title Fix the route key format used for load balancing #1496 Extend the route key format used for load balancing making it unique Jan 27, 2024
@raman-m raman-m added bug Identified as a potential bug Service Discovery Ocelot feature: Service Discovery Load Balancer Ocelot feature: Load Balancer Jan'24 January 2024 release labels Jan 27, 2024
@raman-m raman-m added this to the January'24 milestone Jan 27, 2024
Copy link
Member

@raman-m raman-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Steven,
Your proposal for the key format is

UpstreamHttpMethod,+|UpstreamHost|UpstreamPathTemplate|ServiceNamespace|ServiceName
OR
UpstreamHttpMethod,+|UpstreamHost|UpstreamPathTemplate|Host:Port,+

Maybe to include Type of Load Balancer for the second variant instead of Host:Port?
Because UpstreamPathTemplate cannot vary, so it is unique, seems Host:Port is redundant... and type of load balancer cannot vary too...

src/Ocelot/Configuration/Creator/RouteKeyCreator.cs Outdated Show resolved Hide resolved
src/Ocelot/Configuration/Creator/RouteKeyCreator.cs Outdated Show resolved Hide resolved
@raman-m
Copy link
Member

raman-m commented Jan 27, 2024

@ggnaegi Welcome to code review!

@ggnaegi
Copy link
Member

ggnaegi commented Jan 27, 2024

@ggnaegi Welcome to code review!

@raman-m yes but a I said i will have some time on Wednesday night.

@sliekens
Copy link
Contributor Author

Steven, Your proposal for the key format is

UpstreamHttpMethod,+|UpstreamHost|UpstreamPathTemplate|ServiceNamespace|ServiceName
OR
UpstreamHttpMethod,+|UpstreamHost|UpstreamPathTemplate|Host:Port,+

Maybe to include Type of Load Balancer for the second variant instead of Host:Port? Because UpstreamPathTemplate cannot vary, so it is unique, seems Host:Port is redundant... and type of load balancer cannot vary too...


I thought about it, what I want to do is make the route key format the same for all types of configs, because the differences in the format are causing me confusion.

Proposed format, which now includes the load balancer options instead of downstream host/port:

# sticky sessions
CookieStickySessions:LoadBalancerOptions.Key

# everything else
UpstreamHttpMethod|UpstreamPathTemplate|UpstreamHost|ServiceNamespace|ServiceName|LoadBalancerOptions.Type|LoadBalancerOptions.Key

However, for this to work correctly, empty parts must be kept, because removing empty parts will again lead to ambiguous route keys.

For example:

GET|/categories|my-api|LeastConnection

It's not clear if this means my-api is the UpstreamHost or ServiceName. The empty parts must be kept to solve this ambiguity.

@sliekens
Copy link
Contributor Author

sliekens commented Jan 28, 2024

I had to make a few more changes based on new information and test results. The route key is now calculated uniformly for all kinds of load balancers (except sticky session load balancing which uses static keys.)

The format is now:

UpstreamHttpMethod|UpstreamPathTemplate|UpstreamHost|DownstreamHostAndPort|ServiceNamespace|ServiceName|LoadBalancerType|LoadBalancerKey

I had to keep the DownstreamHostAndPort because a test started failing without it. Route configuration can be changed during runtime when Ocelot.Administration is used, so downstream host and ports must be included in the route key. A better solution would be to clear the LoadBalancerHouse when the route configuration is changed, but I think that is out of scope for this PR.

I added LoadBalancerType and LoadBalancerKey as suggested.

For undefined route properties, the undefined parts are replaced by default values:

-GET,POST,PUT|/api/product||10.0.0.1:8080||||
+GET,POST,PUT|/api/product|no-host|10.0.0.1:8080|no-svc-ns|no-svc-name|no-lb-type|no-lb-key

The default values don't serve any technical purpose, but it came up in the review that empty parts look like a mistake when it's actually not.

@raman-m
Copy link
Member

raman-m commented Jan 31, 2024

@sliekens added 4 commits on Jan 28

Thanks for issues resolving and working on code review more!
I'm busy these days... and sorry, I'll review after the release we have this week...

@sliekens
Copy link
Contributor Author

Let me know if you have any more suggestions. Otherwise, I'm happy with the current state of this branch.

Copy link
Member

@ggnaegi ggnaegi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks

src/Ocelot/Configuration/Creator/RouteKeyCreator.cs Outdated Show resolved Hide resolved
@sliekens
Copy link
Contributor Author

@raman-m do you have time this week to review?

Copy link
Member

@raman-m raman-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raman-m do you have time this week to review?

@sliekens Yes, I do..now... 😉
We have some unit tests. Nice!
It would be good if we add 1 acceptance test too.
I guess we have some acceptance tests already: so we could update old ones.
But better to add new test which covers UpstreamHost user scenario and other props of the route.

@raman-m raman-m changed the title #1496 Extend the route key format used for load balancing making it unique #849 #1496 Extend the route key format used for load balancing making it unique Feb 15, 2024
@sliekens
Copy link
Contributor Author

sliekens commented Feb 16, 2024

@raman-m I added a test with two services, each with a different UpstreamHost and ServiceName and response body.

I have verified that it fails as expected on the develop branch, before my changes:

(Ocelot reuses the route of the first service, load balancing is not working correctly)

image

When I run the test on this branch, it passes.

Additional info:

  • I changed GivenThereIsAFakeConsulServiceDiscoveryProvider because it did not already support multiple services
  • I changed GivenOcelotIsRunningWithConsul to accept a list of URLs to listen on, because it's needed for UpstreamHost routing

As before, let me know if you have any more questions or suggestions, otherwise, I'm happy with these changes as is.

@raman-m
Copy link
Member

raman-m commented Feb 27, 2024

Questionz

  • Have you checked different types of load balancers? See LoadBalancers folder or search for them
  • Which type were you focused on?

@raman-m raman-m self-assigned this Feb 27, 2024
@sliekens
Copy link
Contributor Author

I did all of my testing with the LeastConnection load balancer. However I want to emphasize that the bug was in the LoadBalancerHouse, so the issue should present itself with any load balancer, probably including the NoLoadBalancer implementation of ILoadBalancer.

@raman-m
Copy link
Member

raman-m commented Feb 28, 2024

so the issue should present itself with any load balancer

I've converted the acceptance fact to theory:
https://github.com/sliekens/Ocelot/blob/bugfix/route-keys/test/Ocelot.AcceptanceTests/ServiceDiscoveryTests.cs#L461-L468
And now it also passes. So, the theory is 🟢 🤔
I expected that at least one "CookieStickySessions" test fails, but it is green too.
Could you explain me why the theory works for all types of load balancers?

Also, in debug session I noticed that key is build always without reading from the cache...
Seems the acceptance test scenario should be longer! It must contain more steps!
Otherwise it may be useless.


I have verified that it fails as expected on the develop branch, before my changes:

(Ocelot reuses the route of the first service, load balancing is not working correctly)
image

I hope that is correct.

@raman-m
Copy link
Member

raman-m commented Feb 28, 2024

Based on docs in case of Upstream Host in a route
The quote:

This works by looking at the Host header the client has used and then using this as part of the information we use to identify a Route.

and

The Route above will only be matched when the Host header value is somedomain.com.

and

If you do not set UpstreamHost on a Route then any Host header will match it.

Question: How do you setup the header in acceptance tests?
Is it added implicitly?

@sliekens
Copy link
Contributor Author

Could you explain me why the theory works for all types of load balancers?

It's important to understand that the LoadBalancerHouse was causing the problem, the actual load balancers don't use the route key. So, the issue is the same for all types of load balancers, including sticky session.

You can copy the acceptance test to the develop branch and run it to see it fail. If you put a breakpoint in the LoadBalancerHouse, you should see the problem when calling the second service.

Also, in debug session I noticed that key is build always without reading from the cache...
Seems the acceptance test scenario should be longer! It must contain more steps!

That is intended. The cache should not be used when calling two different services exactly once. I could add a step to call the same service more than once, which then uses the cache instead of making another call to Consul. However, I felt that it is already sufficiently covered by other existing tests.

@sliekens
Copy link
Contributor Author

Question: How do you setup the header in acceptance tests?
Is it added implicitly?

Yep, it is added implicitly, when you do this:

_ocelotClient.GetAsync("http://us-shop/products");

It is translated to a request like:

GET /products HTTP/1.1
Host: us-shop

@raman-m
Copy link
Member

raman-m commented Feb 28, 2024

Thanks for the explanation! It's much clearer now.


_ocelotClient.GetAsync("http://us-shop/products");

It is translated to a request like:

GET /products HTTP/1.1
Host: us-shop

But you don't call http://us-shop/products in the test, but http://us-shop only.
Also in the lines
https://github.com/sliekens/Ocelot/blob/bugfix/route-keys/test/Ocelot.AcceptanceTests/ServiceDiscoveryTests.cs#L521-L531
ServiceName has value product-us, but not products.
My expectation that ServiceName should be the same as products or product, but they differ: product-us and product-eu...
Is my assumption wrong? I see that you use two Consul service entries... Or in your scenario must Consul have 2 entries?
So, I'm confused by different ServiceNames in both routes...

@raman-m
Copy link
Member

raman-m commented Feb 28, 2024

That is intended.

Intended not to load balancer? )) But this is the test about load balancing 🤣


The cache should not be used when calling two different services exactly once.

Clear!


I could add a step to call the same service more than once, which then uses the cache instead of making another call to Consul.

Yes, please make the test complete.


However, I felt that it is already sufficiently covered by other existing tests.

Other acceptance tests even use counter approach to check times of calling!
Moreover, even your test regarding UpstreamHost should check the number of calls via counter.
The Counter approach is provided by 2 helpers:

private void GivenProductServiceOneIsRunning(string url, int statusCode)
private void GivenProductServiceTwoIsRunning(string url, int statusCode)

Copy link
Member

@raman-m raman-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found major issue with usage of GivenThereIsAServiceRunningOn which is based on the one private variable. But the status of both downstream services must use different private variables!

Comment on lines 546 to 547
this.Given(x => x.GivenThereIsAServiceRunningOn(downstreamServiceUrlUS, "/", 200, responseBodyUS))
.And(x => x.GivenThereIsAServiceRunningOn(downstreamServiceUrlEU, "/", 200, responseBodyEU))
Copy link
Member

@raman-m raman-m Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is wrong!
Pay attention to other tests which use 2 services.
Valid setup should be:

Suggested change
this.Given(x => x.GivenThereIsAServiceRunningOn(downstreamServiceUrlUS, "/", 200, responseBodyUS))
.And(x => x.GivenThereIsAServiceRunningOn(downstreamServiceUrlEU, "/", 200, responseBodyEU))
this.Given(x => x.GivenProductServiceOneIsRunning(downstreamServiceUrlUS, "/", 200, responseBodyUS))
.And(x => x.GivenProductServiceTwoIsRunning(downstreamServiceUrlEU, "/", 200, responseBodyEU))

In current GivenThereIsAServiceRunningOn helper you use shared private variable _downstreamPath deciding on status OK. But I guess it is wrong!
You have to decide on url + path, but not only on path.

Optional

Also, _serviceHandler should be different! So, you have to start two instances of webservers of downstream services.

@sliekens
Copy link
Contributor Author

sliekens commented Feb 28, 2024

My expectation that ServiceName should be the same as products or product, but they differ: product-us and product-eu...
Is my assumption wrong? I see that you use two Consul service entries... Or in your scenario must Consul have 2 entries?
So, I'm confused by different ServiceNames in both routes...

The setup is: 2 different product services, with different names, and 2 corresponding Consul entries. Requests to Ocelot with Host: us-shop should re-route to product-us while Host: eu-shop should re-route to product-eu.

image

image

There must be no load balancing between product-us and product-eu because they return different results (e.g. phone chargers with different plugs for US and EU markets).

The bug in develop branch is that Host: us-shop and Host: eu-shop are both routed to product-us (or product-eu, depending which route was cached first) because the routes, although they are different, have the same route key.

PS: I originally said that calls to Consul are cached, but that is false. Only responses from ILoadBalancerFactory are cached. The created ILoadBalancer still calls Consul each time. I think this is the intended design and correct.

@sliekens
Copy link
Contributor Author

sliekens commented Feb 28, 2024

Intended not to load balancer? )) But this is the test about load balancing 🤣

@raman-m with the explanation above, do you still feel the test is incomplete? While it is true that the bug is in the LoadBalancerHouse, it actually has very little to do with load balancing, which is known and proven to work by other existing tests.

/edit: I added a Consul request counter and repetitive Ocelot calls, to verify load balancing and service discovery are working correctly. I hope these changes are in line with your expectations, otherwise don't hesitate to comment.

@raman-m
Copy link
Member

raman-m commented Feb 29, 2024

@sliekens commented on Feb 28

I love this explanation! ❤️ 😉 Thanks!


PS: I originally said that calls to Consul are cached, but that is false. Only responses from ILoadBalancerFactory are cached.

Correct!


The created ILoadBalancer still calls Consul each time. I think this is the intended design and correct.

Correct! Default implicit type of poller is Ocelot.Provider.Consul.Consul which behaves as transient service calling Consul each request. There is another PollConsul type which has caching mechanism.

@raman-m
Copy link
Member

raman-m commented Feb 29, 2024

@sliekens wrote

@raman-m with the explanation above, do you still feel the test is incomplete? While it is true that the bug is in the LoadBalancerHouse, it actually has very little to do with load balancing, which is known and proven to work by other existing tests.

After your adding last 4 commits the test is complete now! It looks great! 😍
Regarding LoadBalancerHouse class... OK, we 've fixed it. But yesterday I noticed one Clean Code issue aka code smell related to SOLID vs DRY principles. Going to fix it as the last commit of the PR...


/edit: I added a Consul request counter and repetitive Ocelot calls, to verify load balancing and service discovery are working correctly. I hope these changes are in line with your expectations, otherwise don't hesitate to comment.

Now the acceptance test is awesome thing! It shows most complete user scenario which can be described for UpstreamHost case.
Thanks for refactoring the test! 🤩

Finally, I'd say: we have Done development!
Thank you for your great contribution! 😋

Copy link
Member

@raman-m raman-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready for delivery! ✅

  • Code review ✔️
  • Team approvals ✔️
  • Unit tests ✔️
  • Acceptance test(s) ✔️

@raman-m raman-m merged commit 8845d1b into ThreeMammals:develop Feb 29, 2024
2 checks passed
@raman-m raman-m mentioned this pull request Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Identified as a potential bug Jan'24 January 2024 release Load Balancer Ocelot feature: Load Balancer Service Discovery Ocelot feature: Service Discovery
Projects
None yet
4 participants