v0.2.0 - The Advanced Reconnaissance Update
This is a massive step forward for caniscrape, evolving it from a simple analysis tool into a flexible reconnaissance engine for serious web scraping tasks. This release introduces two major features, proxy rotation support and integrated CAPTCHA solving, and refactors the core to be faster and more stable.
This is a beta release. I am seeking feedback from the community to help stabilize these new features before the official v1.0.0 launch.
🚀 What's New
Feature: Integrated CAPTCHA Solving
You can now not only detect CAPTCHAs but also attempt to solve them using popular third-party services. This allows for a much deeper analysis of a site's defenses.
- Detect vs. Solve: By default, caniscrape only detects. To solve, you must provide your API key.
- Supported Services: capsolver and 2captcha
- New Flags: Use --captcha-service and --captcha-api-key to enable solving.
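The detect-versus-solve rule above can be sketched roughly as follows. This is an illustration only, not caniscrape's actual code: the function name `resolve_captcha_mode` is hypothetical, and the requirement that both flags be supplied together is an assumption based on the flag descriptions.

```python
from typing import Optional

# Service names taken from the release notes.
SUPPORTED_SERVICES = {"capsolver", "2captcha"}


def resolve_captcha_mode(service: Optional[str], api_key: Optional[str]) -> str:
    """Return "detect" or "solve" based on the two CAPTCHA flags.

    Hypothetical helper: mirrors the documented default (detect only)
    and assumes solving needs both a service name and an API key.
    """
    if service is None and api_key is None:
        return "detect"  # default behaviour: detection only
    if service is None or api_key is None:
        raise ValueError(
            "--captcha-service and --captcha-api-key must be given together"
        )
    if service not in SUPPORTED_SERVICES:
        raise ValueError(
            f"unsupported service {service!r}; expected one of {sorted(SUPPORTED_SERVICES)}"
        )
    return "solve"
```

With no flags the sketch stays in detect-only mode; passing a supported service plus a key switches to solving, and any other combination is rejected early.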
Feature: Full Proxy Support & Rotation
Anonymize your analysis and bypass basic IP blocks with full proxy support.
- Rotation Pool: Provide the --proxy flag multiple times to create a pool of proxies. caniscrape will randomly rotate through them for each request.
- Supported Protocols: Works with standard http and socks5 proxies.
- Full Coverage: All network requests made by the tool, including WAF detection and headless browser sessions, are now routed through your proxies.
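For readers curious what per-request random rotation looks like, here is a minimal sketch. It is not taken from caniscrape's source: `pick_proxy` is a hypothetical helper, and the returned mapping simply follows the `{"http": ..., "https": ...}` shape that HTTP clients such as requests accept for their proxy settings.

```python
import random
from typing import Dict, List, Optional


def pick_proxy(
    pool: List[str], rng: Optional[random.Random] = None
) -> Optional[Dict[str, str]]:
    """Pick one proxy URL at random from the pool for a single request.

    Hypothetical helper. Returns None when the pool is empty, meaning
    the request should go out directly.
    """
    if not pool:
        return None
    chooser = rng if rng is not None else random
    proxy = chooser.choice(pool)
    # Route both plain and TLS traffic through the chosen proxy.
    return {"http": proxy, "https": proxy}


# Example pool, as it might be collected from repeated --proxy flags.
pool = [
    "http://user:pass@host1:8080",
    "socks5://user:pass@host2:1080",
]
proxies_for_this_request = pick_proxy(pool)
```

Calling `pick_proxy` once per request gives uniform random rotation; the optional `rng` argument only exists so the choice can be made deterministic in tests.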
✨ Other Improvements
- Smarter URL Handling: The tool now automatically adds http:// to URLs that are missing a scheme, preventing common DNS errors.
- Bug Fixes: Numerous small bugs and error-handling improvements have been made across all analyzers.
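The scheme handling described above amounts to something like the sketch below. The helper name `ensure_scheme` and the exact matching rule are assumptions for illustration; the tool's real implementation may differ.

```python
import re

# A URL scheme is a letter followed by letters, digits, "+", "." or "-",
# then "://". Anchored at the start so "://" later in the URL is ignored.
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def ensure_scheme(url: str, default_scheme: str = "http") -> str:
    """Prepend default_scheme:// when the URL has no scheme (hypothetical helper)."""
    url = url.strip()
    if _SCHEME_RE.match(url):
        return url  # already has a scheme, leave untouched
    return f"{default_scheme}://{url}"
```

Under this sketch, `example.com` becomes `http://example.com`, while `https://example.com` passes through unchanged.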
🛠️ How to Use the New Features
1. Using a Proxy Pool:
Simply add the --proxy flag for each proxy you want to use.
caniscrape https://example.com --proxy "http://user:pass@host1:port" --proxy "socks5://user:pass@host2:port"
2. Solving a Detected CAPTCHA:
Provide the service name and your API key. The tool will only attempt to solve if a CAPTCHA is detected. Only Capsolver and 2Captcha are supported so far.
caniscrape https://some-captcha-site.com --captcha-service capsolver --captcha-api-key "YOUR_CAPSOLVER_API_KEY"
Seeking Your Feedback
This is a big update, and I can't test every edge case. I also haven't been able to verify that the CAPTCHA solving works end to end: I know the tool reaches the solving step, but I haven't tested whether the API call to the service works correctly. Please try the new features introduced in v0.2.0.
Test the proxy rotation.
Test the CAPTCHA solving.
The next update will likely focus on improving detection on tough sites like Amazon and YouTube.
Report any bugs, crashes, or unexpected results by opening an issue on GitHub.