Downside of Round-Robin load balancing, and cache-awareness #10

Closed · metalalive opened this issue Mar 20, 2023 · 4 comments

Comments

metalalive (Contributor) commented Mar 20, 2023

Hi, could you please explain how the lack of cache-awareness becomes a downside of Round-Robin load balancing?

According to the round-robin part of the article:

What's not good about it? It's not caching-aware, meaning multiple clients will face higher latencies because they're asking uncached servers.

Here is my understanding: when multiple clients ask for uncached content, the latency of the concurrent client requests comes from request coalescing (the Nginx directive proxy_cache_lock). Only one request is passed to the upstream server, and all the other client requests simply wait until the fresh content is cached. That is probably not related to Round-Robin load balancing, is it?
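For reference, here is a minimal Python sketch of the coalescing behavior I mean (the per-key lock and fetch_from_origin are my own stand-ins for what proxy_cache_lock does inside Nginx):

```python
import threading

cache = {}                   # key -> cached body
inflight = {}                # key -> Event set when the fetch finishes
state_lock = threading.Lock()

def fetch_from_origin(key):
    # hypothetical upstream call; imagine this takes a while
    return f"body-of-{key}"

def get(key):
    with state_lock:
        if key in cache:
            return cache[key]            # hit: served without an upstream trip
        waiter = inflight.get(key)
        if waiter is None:
            inflight[key] = threading.Event()
    if waiter is not None:
        waiter.wait()                    # coalesce: wait on the single fetch
        return cache[key]
    body = fetch_from_origin(key)        # only this request reaches upstream
    with state_lock:
        cache[key] = body
        inflight.pop(key).set()          # wake every waiting request
    return body
```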

Thanks for the reply.

leandromoreira (Owner) commented Mar 21, 2023

Hi @metalalive =) sure, I'll try to provide my reasoning behind that sentence:

Here is my understanding: when multiple clients ask for uncached content, the latency of the concurrent client requests comes from request coalescing (the Nginx directive proxy_cache_lock). Only one request is passed to the upstream server, and all the other client requests simply wait until the fresh content is cached.

Your understanding seems right to me :)! The main idea is to serve fast, save bandwidth, save storage, avoid extra hops, etc.

Think about a CDN with 100 nodes, where you have an unpopular content video.mp4 being accessed by 100 users. Since you're using RR as the load balancing policy, every SINGLE user will pay the latency, because each server does not have the content cached in its area (that's the cache unawareness). On top of that, the disks/SSD storage will be filled with (not so hot) content that could be better distributed.
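To make that concrete, here's a toy simulation (hypothetical numbers, purely to illustrate the point): with RR over 100 nodes, 100 requests for the same unpopular object can produce 100 origin fetches, while a caching-aware policy would need only one:

```python
import itertools

NODES, USERS = 100, 100
caches = [set() for _ in range(NODES)]   # per-node cached keys
rr = itertools.cycle(range(NODES))       # round-robin rotation

origin_fetches = 0
for _ in range(USERS):                   # every user asks for the same file
    node = next(rr)
    if "video.mp4" not in caches[node]:
        origin_fetches += 1              # cold node: the user pays the latency
        caches[node].add("video.mp4")

print(origin_fetches)  # 100 under RR; a caching-aware policy would print 1
```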

However, if you have hugely popular content with millions of users, RR might be good enough at a CDN level, assuming that the user was directed to the closest CDN PoP, see figure 1 (like in their ISP, or city). There's even a better algorithm than RR alone with, I'd say, almost the same characteristics: the random of two choices (a.k.a. the power of two choices).

figure 1 (image)
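For illustration, a minimal sketch of the power of two choices (the in-memory load counters are an assumption; a real balancer would use live metrics such as open connections):

```python
import random

load = {"server1": 0, "server2": 0, "server3": 0}  # hypothetical load counters

def pick_power_of_two(servers):
    # sample two servers at random, send the request to the less loaded one
    a, b = random.sample(servers, 2)
    chosen = a if load[a] <= load[b] else b
    load[chosen] += 1
    return chosen

for _ in range(9):
    print(pick_power_of_two(list(load)))
```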

The graph below shows that Alice, Bob, and Carol all paid the extra latency cost, and the CDN itself paid the cost of leaving the DC to fetch the source object (assuming a pull-only nature). One can argue that you could organize your nodes, within the CDN, to avoid some of these issues.

I believe that everything comes down to acting differently depending on the circumstance; offering fast access to popular content and to unpopular content demands different tactics. Given that the resources (storage, RAM, network, etc.) are limited and that content is not consumed linearly, finding a single balancing policy that meets all requirements is challenging, if not impossible.
Does that make sense?

```mermaid
flowchart LR
    A[Alice] -->|GET /video.mp4| RR{RR}
    RR -->|Sends Alice to| S1{Server1}
    B[Bob] -->|GET /video.mp4| RR
    RR -->|Sends Bob to| S2{Server2}
    C[Carol] -->|GET /video.mp4| RR
    RR -->|Sends Carol to| S3{Server3}
    S1 -->|Fetches from origin| O{Origin}
    S2 -->|Fetches from origin| O{Origin}
    S3 -->|Fetches from origin| O{Origin}
```

vs

```mermaid
flowchart LR
    A[Alice] -->|GET /video.mp4| RR{Caching-aware}
    RR -->|Sends Alice to| S1{Server1}
    B[Bob] -->|GET /video.mp4| RR
    RR -->|Sends Bob to| S1
    C[Carol] -->|GET /video1.mp4| RR
    RR -->|Sends Carol to| S1
    D[Dave] -->|GET /video2.mp4| RR
    RR -->|Sends Dave to| S2{Server2}
    S1 -->|Fetches from origin| O{Origin}
    S2 -->|Fetches from origin| O{Origin}
```
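A minimal sketch of the caching-aware idea from the second diagram, assuming the balancer simply hashes the URL to pick a server, so the same content always lands on the same node and its cache stays warm (in practice you'd use consistent hashing, e.g. nginx's hash $request_uri consistent; in an upstream block, so keys don't all move when servers are added or removed):

```python
import hashlib

SERVERS = ["server1", "server2", "server3"]

def pick_server(url):
    # the same URL always maps to the same server, keeping its cache warm
    digest = hashlib.sha256(url.encode()).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]

for user, url in [("Alice", "/video.mp4"), ("Bob", "/video.mp4"),
                  ("Carol", "/video1.mp4"), ("Dave", "/video2.mp4")]:
    print(user, "->", pick_server(url))
```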

metalalive (Contributor, Author) commented Mar 21, 2023

Can I view the blocks RR, Caching-aware, Server1, Server2 (in the diagrams above) as different nodes (like nginx servers) in a CDN?
So the CDN PoP (with its set of nodes) will face the caching issue you mentioned when:

  • multiple frontend clients send requests (for the same unpopular content) to the same node (say node1) running round-robin load balancing
  • since the round-robin policy is not aware of the cache, node1 passes the requests to all the other nodes (e.g. node2, node3, etc.) under the CDN PoP
  • then all those nodes connect to the origin server (or upstream backend server) and cache the unpopular content, leaving several copies across the nodes, which causes an unnecessary waste of storage space in the CDN

leandromoreira (Owner) commented Mar 21, 2023

All the drawings/graphics were just to demonstrate abstract concepts. In reality, a user request might flow like this:

  1. A user requests content XYZ from API/DNS routing/Anycast
    • this part might work as the PoP load balancing
      • picking the closest and available PoP (set of nodes), considering resourcefulness, etc.
    • and also as the node balancing within the PoP
      • picking the best node, caching/resource aware
  2. API/DNS routing/Anycast resolves to the closest NODE
  3. The user watches the stream / downloads the game, assets, etc.
  4. If the content is in the cache, the node just serves it; if not, the request goes through the tiered network of PoPs/nodes (you can also think of it as a "tree"), as sketched below
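To make step 4 concrete, a minimal sketch of a tiered lookup, assuming a simple edge to mid-tier to origin chain (the tier layout and fetch_origin are hypothetical):

```python
edge_cache = {}        # this node's local cache
mid_tier_cache = {}    # parent tier shared by several edge nodes

def fetch_origin(key):
    return f"body-of-{key}"   # placeholder for the real origin fetch

def serve(key):
    if key in edge_cache:             # 1. cheapest: local hit
        return edge_cache[key]
    if key in mid_tier_cache:         # 2. next hop: parent tier hit
        body = mid_tier_cache[key]
    else:                             # 3. last resort: go to the origin
        body = fetch_origin(key)
        mid_tier_cache[key] = body    # fill the tiers on the way back
    edge_cache[key] = body
    return body

print(serve("video.mp4"))  # first call reaches origin, later calls hit the edge
```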


metalalive (Contributor, Author) commented Mar 21, 2023

Yes, in reality the PoP load balancing will do things more intelligently. Thanks for explaining!
