[Feature Request]: Rate Limiting on LLM API calls #913
Replies: 2 comments
-
@roablep We use LiteLLM under the hood. They provide a proxy implementation in which you can do all rate limiting, budget planning, etc. You then pass this proxy URL as the base URL, and LiteLLM (underneath Crawl4AI) will route your queries through the proxy, which throttles and manages them before handing the results back. This way Crawl4AI stays agnostic of your LLM provider choice and setup.
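For concreteness, here is a rough sketch of what such a proxy config might look like, with per-model throttling. The model names, rpm/tpm values, and port are illustrative only; check LiteLLM's proxy docs for the authoritative schema (budgets and retries are configured there as well).

```yaml
# config.yaml -- start the proxy with:  litellm --config config.yaml --port 4000
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
      rpm: 60          # requests per minute allowed for this deployment (example value)
      tpm: 100000      # tokens per minute (example value)

litellm_settings:
  num_retries: 3       # retry throttled calls before surfacing an error
```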
-
Great! Perhaps we add this to the documentation at https://docs.crawl4ai.com/extraction/llm-strategies/ Also, a typo in the docs: s/LightLLM/LiteLLM/g. Here's a proposed addition to the documentation:

Advanced: Rate Limiting & Budgeting with LiteLLM Proxy

If you're working with LLM APIs that have rate limits, quota caps, or cost controls, you can get full control over throttling, retries, and budgets by using LiteLLM's proxy. Crawl4AI uses LiteLLM under the hood to make its LLM calls. By default, it talks directly to the model provider (e.g., OpenAI, Ollama, etc.), but you can optionally route all LLM requests through your own LiteLLM proxy.

How to Use It in Crawl4AI
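As a starting point for that section, here is a minimal sketch. It assumes a LiteLLM proxy is already running at http://localhost:4000 and that the installed Crawl4AI version exposes LLMConfig with a base_url parameter; parameter names may differ between versions, and the proxy key and URL below are placeholders.

```python
import asyncio

from crawl4ai import AsyncWebCrawler, CrawlerRunConfig, LLMConfig
from crawl4ai.extraction_strategy import LLMExtractionStrategy


async def main():
    # Point the LLM at the LiteLLM proxy instead of the provider directly.
    # base_url and the api_token are placeholders for illustration; check the
    # Crawl4AI LLM-strategies docs for the exact parameter names in your version.
    llm_config = LLMConfig(
        provider="openai/gpt-4o-mini",      # model the proxy will route
        api_token="sk-litellm-proxy-key",   # key issued by the proxy, not by OpenAI
        base_url="http://localhost:4000",   # the LiteLLM proxy endpoint
    )
    strategy = LLMExtractionStrategy(
        llm_config=llm_config,
        instruction="Extract the page title and a one-sentence summary.",
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://example.com",
            config=CrawlerRunConfig(extraction_strategy=strategy),
        )
        print(result.extracted_content)


if __name__ == "__main__":
    asyncio.run(main())
```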
-
What needs to be done?
Right now, I believe Crawl4AI's rate limiting applies to the web crawl and not to the LLM calls, right?
I'd like to implement rate limits per LLM vendor and per LLM model so that I can avoid hitting LLM API rate caps (see the illustrative sketch at the end of this post).
What problem does this solve?
Rate-limit errors from LLM APIs during extraction.
Target users/beneficiaries
No response
Current alternatives/workarounds
No response
Proposed approach
No response
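For illustration only (this is not part of the original request and not Crawl4AI's API): a minimal client-side throttle keyed by vendor/model strings, of the kind the request describes, could look like the following sketch.

```python
import asyncio
import time
from collections import defaultdict


class PerModelRateLimiter:
    """Sliding-window limiter keyed by "vendor/model" strings.

    Hypothetical helper for illustration; Crawl4AI does not ship this class.
    """

    def __init__(self, max_requests_per_window: int, window_seconds: float = 60.0):
        self.max_requests = max_requests_per_window
        self.window = window_seconds
        self._timestamps = defaultdict(list)      # key -> recent call times
        self._locks = defaultdict(asyncio.Lock)   # key -> serialization lock

    async def acquire(self, key: str) -> None:
        """Block until a call for `key` (e.g. "openai/gpt-4o-mini") is allowed."""
        async with self._locks[key]:
            while True:
                now = time.monotonic()
                # Keep only timestamps still inside the window.
                self._timestamps[key] = [
                    t for t in self._timestamps[key] if now - t < self.window
                ]
                if len(self._timestamps[key]) < self.max_requests:
                    self._timestamps[key].append(now)
                    return
                # Wait until the oldest call falls out of the window.
                await asyncio.sleep(self.window - (now - self._timestamps[key][0]))


# Usage: call `await limiter.acquire("openai/gpt-4o-mini")` before each LLM request.
```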