Hi unclecode,

I've been struggling with a lot of 429 error codes and noticed that DeepCrawlStrategy ignores my dispatcher settings and creates its own internal MemoryAdaptiveDispatcher, which only controls the delays and spawns 20 tasks right off the bat by default.
I added this change on my side to control the concurrency, and it works well for me.
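The core of the patch is the guard that derives the permit count from the run config. That expression can be sanity-checked on its own; here's a quick standalone check (the stand-in config objects are hypothetical, only the `semaphore_count` name and the fallback of 20 come from the patch):

```python
from types import SimpleNamespace

def derive_max_session_permit(cfg, default=20):
    """Mirror of the patch's guard: fall back to `default` when
    semaphore_count is missing, None, or 0, and clamp to at least 1."""
    return max(1, int(getattr(cfg, "semaphore_count", default) or default))

# Missing attribute -> falls back to the default of 20
print(derive_max_session_permit(SimpleNamespace()))                     # 20
# Explicit value is honored
print(derive_max_session_permit(SimpleNamespace(semaphore_count=5)))    # 5
# None / 0 fall back to the default instead of blocking all work
print(derive_max_session_permit(SimpleNamespace(semaphore_count=None))) # 20
print(derive_max_session_permit(SimpleNamespace(semaphore_count=0)))    # 20
```

The `or default` part matters: without it, a config that sets `semaphore_count = None` would crash the `int()` call, and `0` would produce a dispatcher that can never run anything.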
+++ b/.venv/lib/python3.13/site-packages/crawl4ai/async_webcrawler.py
@@
         if dispatcher is None:
             primary_cfg = config[0] if isinstance(config, list) else config
             mean_delay = getattr(primary_cfg, "mean_delay", 0.1)
             max_range = getattr(primary_cfg, "max_range", 0.3)
+            max_session_permit = max(1, int(getattr(primary_cfg, "semaphore_count", 20) or 20))
             dispatcher = MemoryAdaptiveDispatcher(
+                max_session_permit=max_session_permit,
                 rate_limiter=RateLimiter(
                     base_delay=(mean_delay, mean_delay + max_range),
                     max_delay=60.0,
                     max_retries=3,
                 ),
             )
For reference, the internal dispatcher is created here:
crawl4ai/crawl4ai/async_webcrawler.py (line 1032 at commit 1debe5f)
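Until the deep-crawl path honors a user-supplied dispatcher, a workaround that avoids patching site-packages is to build a throttled dispatcher up front and pass it in explicitly. This is only a sketch: the constructor parameters mirror the diff above, while the import path and the `dispatcher=` argument to `arun_many` are assumptions based on crawl4ai's documented API.

```python
# Build a dispatcher capped well below the internal default of 20 sessions.
# Import is guarded so the sketch can be read/run where crawl4ai isn't installed.
try:
    from crawl4ai import MemoryAdaptiveDispatcher, RateLimiter
except ImportError:
    MemoryAdaptiveDispatcher = RateLimiter = None

def build_throttled_dispatcher(max_concurrency=4):
    """Return a MemoryAdaptiveDispatcher capped at `max_concurrency`
    concurrent sessions, or None when crawl4ai is unavailable."""
    if MemoryAdaptiveDispatcher is None:
        return None
    return MemoryAdaptiveDispatcher(
        max_session_permit=max_concurrency,  # cap parallel sessions
        rate_limiter=RateLimiter(
            base_delay=(0.5, 1.5),  # jittered per-request delay window
            max_delay=60.0,         # back off up to 60s after 429s
            max_retries=3,
        ),
    )

# Assumed usage, based on crawl4ai's documented arun_many signature:
#   results = await crawler.arun_many(
#       urls, config=run_cfg, dispatcher=build_throttled_dispatcher(4))
```

This only helps for non-deep crawls until the patch (or an equivalent fix) lands, since the deep-crawl path currently discards the passed dispatcher.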