Releases: Shardj/ccrawler

0.2.0

05 Jul 11:59
Pre-release
  • Added regex filter on URL.
  • Fixed issues with multiple h1 tags.
  • Added checks for fragments and query strings in URLs so the same page isn't crawled endlessly as a duplicate.
  • Now randomises user-agent on every request (why did I do this? Guess it'll become profiles with proxies later).
  • Better formatting and validation on strings.
  • Files returned are now raw text instead of holding tags.
  • Handles redirects.
  • Improved some attribute setting and error logging.
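
The URL-related items above (regex filter, fragment and query string checks) can be illustrated with a minimal sketch. This is not ccrawler's actual code: the names `normalize_url`, `should_crawl`, `seen`, and `url_pattern` are hypothetical. It also assumes that dropping the query string is acceptable for deduplication, which may not hold for sites that page content through query parameters.

```python
import re
from urllib.parse import urldefrag, urlsplit, urlunsplit


def normalize_url(url, keep_query=False):
    """Return url without its fragment and, by default, without its query string."""
    url, _fragment = urldefrag(url)  # strip "#section" style fragments
    parts = urlsplit(url)
    query = parts.query if keep_query else ""
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def should_crawl(url, seen, url_pattern=None):
    """Return True only for URLs that pass the regex filter and are not yet seen."""
    key = normalize_url(url)
    # Hypothetical regex filter: skip URLs that do not match the pattern.
    if url_pattern is not None and not re.search(url_pattern, key):
        return False
    # Two URLs differing only by fragment or query collapse to the same key.
    if key in seen:
        return False
    seen.add(key)
    return True
```

With an empty `seen` set, `http://example.com/a#top` would be accepted once, and `http://example.com/a?page=2` would then be treated as a duplicate of it.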

0.1.1

03 Jul 09:48
Pre-release

The same as 0.1.0, except that .pyc files, which shouldn't have been in the repo, are now gitignored.