# Rate limiting in async code

This post describes three ways to rate limit async Python code while keeping throughput high and latency low. The first two methods use native asyncio primitives; the third uses aiochan, a third-party library that provides channels similar to those in Go.

## Definitions

A few definitions to set the context for this post:

- "rate limit" means to cap the number of operations allowed per period
- an "operation" is a single call to the rate-limited function
- the "period" is the time window over which operations are counted
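To make these definitions concrete, here is a small sketch of a checker that decides whether a list of call timestamps stays within a limit. The name `respects_limit` and its parameters are hypothetical, and it assumes timestamps in seconds and a sliding window (any window of length `period`, not just windows aligned to the clock).

```python
def respects_limit(timestamps: list[float], max_ops: int, period: float) -> bool:
    """Return True if no window of length `period` holds more than `max_ops` calls."""
    ts = sorted(timestamps)
    # If call i + max_ops happens less than `period` after call i,
    # then max_ops + 1 calls fall inside a single window.
    return all(later - earlier >= period for earlier, later in zip(ts, ts[max_ops:]))


# Two calls per second, spaced evenly: within the limit.
ok = respects_limit([0.0, 0.5, 1.0, 1.5], max_ops=2, period=1.0)
# Three calls inside half a second: over the limit.
too_fast = respects_limit([0.0, 0.25, 0.5], max_ops=2, period=1.0)
```

A check like this is handy for verifying the timestamps that a rate-limited run produces.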

## Setup

Typically, rate limiting is used in the context of a web server. For this post, I'll use a function that returns a timestamp to mock the request to the server. I'll also have that function sleep for a random amount of time to simulate the work that the server is doing.

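In [3]:
import asyncio
import random

import httpx
import pendulum as pdl

In [None]:
async def server():
    await asyncio.sleep(random.random())  # simulate server work; the 0-1 s range is an assumption
    return pdl.now().format("x")  # Unix timestamp in milliseconds, as a string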

## Method 1: simple semaphore context manager

This is the simplest way to rate limit async code. It uses a semaphore to limit the number of concurrent tasks, and a context manager to ensure that the semaphore is released when the task is done. At the end of each task, the context manager sleeps for the required period before releasing the semaphore.
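The idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not necessarily the exact implementation used in this post: the names `rate_limited` and `demo`, and the values for `limit` and `period`, are made up for the example, and the task body just records its start time instead of calling a server.

```python
import asyncio
import time
from contextlib import asynccontextmanager


@asynccontextmanager
async def rate_limited(semaphore: asyncio.Semaphore, period: float):
    """Hold one semaphore slot for the task, then keep it for `period` more seconds."""
    async with semaphore:
        try:
            yield
        finally:
            # Sleep before the slot is released, so each slot
            # admits at most one task per period.
            await asyncio.sleep(period)


async def demo(n_tasks: int = 6, limit: int = 2, period: float = 0.2) -> list[float]:
    semaphore = asyncio.Semaphore(limit)
    start = time.monotonic()

    async def task() -> float:
        async with rate_limited(semaphore, period):
            # Stand-in for the real request: record when the task started.
            return time.monotonic() - start

    return await asyncio.gather(*(task() for _ in range(n_tasks)))


start_times = asyncio.run(demo())
```

With `limit=2`, tasks start in pairs roughly one period apart. Inside a notebook, where an event loop is already running, `await demo()` would replace `asyncio.run(demo())`. One consequence of this design is that the sleep is added on top of the task's own duration, so the achieved rate can fall below the allowed limit when tasks are slow.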