A Python application for crawling and scraping documentation using the Firecrawl API.
- Crawls websites and follows child links
- Converts scraped content to markdown format
- Saves documentation files with sanitized filenames
- Handles duplicate filenames automatically

Requirements:
- Python 3.x
- Firecrawl API key
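
The filename sanitizing and duplicate handling listed among the features could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names `sanitize_filename` and `unique_path`, the allowed character set, and the `_1`, `_2` suffix scheme are all assumptions.

```python
import re
from pathlib import Path


def sanitize_filename(title: str, max_length: int = 100) -> str:
    """Turn an arbitrary page title or URL into a filesystem-safe stem (hypothetical helper)."""
    # Replace every run of characters outside [A-Za-z0-9._-] with one underscore,
    # then trim leading/trailing dots and underscores.
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("._")
    # Fall back to a fixed name if nothing usable is left.
    return stem[:max_length] or "untitled"


def unique_path(folder: Path, stem: str, suffix: str = ".md") -> Path:
    """Return a path in `folder` that does not exist yet (hypothetical helper).

    If `stem + suffix` is taken, try `stem_1`, `stem_2`, ... until a free name is found.
    """
    candidate = folder / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = folder / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

With these helpers, a page titled `Getting Started: Intro/Setup` would be saved under the stem `Getting_Started_Intro_Setup`, and a second page with the same title would receive a numeric suffix instead of overwriting the first file.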

Install the dependency: `pip install firecrawl-py`
- Set your Firecrawl API key (recommended: use environment variables)
- Update the `url` and `max_pages` variables in `firecrawlbasics.py`
- Run the script: `python firecrawlbasics.py`

The script can be configured by modifying variables in `firecrawlbasics.py`:
- `url`: The starting URL to crawl
- `max_pages`: Maximum number of pages to crawl
- `output_folder`: Folder to save markdown files
- `include_paths`: Path filters for crawling
- `exclude_paths`: Paths to exclude from crawling
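
As a rough sketch of what the configuration could look like, the block below sets the variables described above and reads the API key from the environment. The concrete values are placeholders, and the environment variable name `FIRECRAWL_API_KEY` and the helper `load_api_key` are assumptions rather than something taken from `firecrawlbasics.py`. The accepted pattern syntax for the path filters depends on the Firecrawl version, so check its documentation before relying on the examples here.

```python
import os

# Placeholder values for the variables the script exposes.
url = "https://example.com/docs"   # starting URL to crawl
max_pages = 50                     # maximum number of pages to crawl
output_folder = "docs_output"      # folder to save markdown files
include_paths = ["docs/.*"]        # path filters for crawling (pattern syntax is an assumption)
exclude_paths = ["blog/.*"]        # paths to exclude from crawling (pattern syntax is an assumption)


def load_api_key(env_var: str = "FIRECRAWL_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical helper).

    Raises RuntimeError with a clear message if the variable is missing or empty,
    so the key never has to be hard-coded in the script.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the script.")
    return key
```

Keeping the key in the environment means the script can be committed or shared without exposing credentials; on a Unix shell it could be set with `export FIRECRAWL_API_KEY=...` before running `python firecrawlbasics.py`.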
[Add your license here]