crawl4ai version
0.7.4
Expected Behavior
Concurrent URL crawling
Current Behavior
I encountered a memory leak when running the crawler in Docker. Our production server has 1 CPU and 8 GB of memory, and the leak occurs even when crawling only 5 URLs concurrently. The same code works normally on Windows. In addition, the memory_threshold_percent parameter of MemoryAdaptiveDispatcher has no effect at all.
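from crawl4ai import AsyncWebCrawler
from crawl4ai.async_dispatcher import MemoryAdaptiveDispatcher, RateLimiter

default_memory_dispatcher = MemoryAdaptiveDispatcher(
    memory_threshold_percent=85,  # expected to hold back new crawls above 85% memory usage
    check_interval=1,             # seconds between memory checks
    max_session_permit=5,         # at most 5 concurrent crawl sessions
    rate_limiter=RateLimiter(
        base_delay=(3, 15),       # random delay range between requests, in seconds
        max_delay=20.0,           # upper bound on the backoff delay
        max_retries=1
    )
)

# Runs inside an async method; urls, config, and
# self.general_browser_config are defined elsewhere.
async with AsyncWebCrawler(config=self.general_browser_config) as crawler:
    results = await crawler.arun_many(
        urls=urls,
        config=config,
        dispatcher=default_memory_dispatcher
    )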
These are the memory usage statistics in production. The more complex a page's content, the more memory it uses.
These are the memory usage statistics on my local Windows machine.
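One thing that might be worth checking: inside a container, host-wide memory figures can differ a lot from the container's own cgroup accounting, so a percentage threshold may never trip if it is computed from host totals. I have not verified how MemoryAdaptiveDispatcher measures memory, so that is only an assumption. Below is a minimal, stdlib-only sketch for reading the container's own usage as a percentage of its limit. The helper names `_read_int` and `container_memory_percent` are hypothetical and not part of crawl4ai; the cgroup v1 and v2 file paths are the standard ones.

```python
from pathlib import Path
from typing import Optional


def _read_int(path: Path) -> Optional[int]:
    """Return the integer stored in a cgroup file, or None if it is
    missing, unreadable, or not a plain number (cgroup v2 uses "max"
    to mean "no limit")."""
    try:
        text = path.read_text().strip()
    except OSError:
        return None
    return int(text) if text.isdigit() else None


def container_memory_percent(cgroup_root: str = "/sys/fs/cgroup") -> Optional[float]:
    """Return container memory usage as a percentage of its cgroup limit.

    Returns None when no usable usage/limit pair is found, for example
    on Windows or when cgroup v2 reports no limit. Under cgroup v1 an
    unlimited container reports a very large limit, so the result is
    then close to zero rather than None.
    """
    root = Path(cgroup_root)
    candidates = [
        # cgroup v2 (unified hierarchy)
        (root / "memory.current", root / "memory.max"),
        # cgroup v1 (legacy hierarchy)
        (root / "memory" / "memory.usage_in_bytes",
         root / "memory" / "memory.limit_in_bytes"),
    ]
    for usage_file, limit_file in candidates:
        usage = _read_int(usage_file)
        limit = _read_int(limit_file)
        if usage is not None and limit:
            return 100.0 * usage / limit
    return None
```

Logging `container_memory_percent()` next to the crawler's own memory readings during a run should show whether the container gets close to its limit while the value compared against memory_threshold_percent stays low.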
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
OS
Linux (Docker)
Python version
3.12.0
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response