
v2.0.0-beta

Pre-release
@z7r1k3 released this 12 May 01:38
a511523

New Feature

  • Added unique logging, the new default logging option. It abandons the tree format and simply logs each discovered URL once, and only once. Standard and redundant logging, which keep the usual tree format, are still available via user input.
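
For illustration, a minimal sketch of the idea behind unique logging, assuming a simple seen-set; the names here are illustrative, not the crawler's actual identifiers:

```python
seen = set()

def log_unique(url, log_file):
    """Write url to log_file only the first time it is seen."""
    if url not in seen:
        seen.add(url)
        log_file.write(url + "\n")

with open("crawl.log", "w") as log:
    for url in ["https://a.example", "https://b.example", "https://a.example"]:
        log_unique(url, log)  # the repeated URL is silently skipped
```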

Bugfix

  • Redundantly logged URLs are now fully accurate in both the logs and terminal output. If you see a URL in the log/output tree without any child URLs, it either has no children or wasn't crawled (unless the crawler hit the depth limit, of course).
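
For context, the tree format indents each child URL under the page it was discovered on, roughly like this (illustrative output, not verbatim):

```
https://example.com
    https://example.com/about
        https://example.com/about/team
    https://example.com/contact
```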

Changes to Logging and Terminal Output

  • Standard logging now omits the parent URL when it is skipped because it was already crawled. This makes the log/output more streamlined and less confusing.
  • Standard output now prints all URLs, but no Error/Info messages (those are still logged); see the sketch below.
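
A rough sketch of that split using Python's standard logging module; the setup and names are assumptions, not the project's actual code:

```python
import logging

logging.basicConfig(filename="crawl.log", level=logging.INFO)
log = logging.getLogger("crawler")

def report(url):
    print(url)                   # terminal: every crawled URL
    log.info("Crawled %s", url)  # log file: URL plus Info diagnostics

def report_error(url, err):
    log.error("Failed to open %s: %s", url, err)  # logged, never printed
```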

Improvements to User Input

  • All input options now individually check for valid input. This means if you mess up one option, you won't have to re-input all of them.
  • Added more defaults, allowing the user to just mash enter after inputting the URL(s) to crawl.
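
A hypothetical sketch of per-option prompting; the option names, defaults, and valid ranges are made up for illustration:

```python
def ask(prompt, default, valid):
    """Re-prompt for this one option until the answer is valid.
    An empty answer (just pressing enter) falls back to the default."""
    while True:
        answer = input(f"{prompt} [{default}]: ").strip() or default
        if answer in valid:
            return answer
        print(f"Invalid choice {answer!r}, try again.")

depth = ask("Crawl depth", "3", {str(n) for n in range(1, 10)})
mode = ask("Logging mode", "unique", {"unique", "standard", "redundant"})
```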

Decreased Timeout

  • The timeout for opening a URL has been decreased to 20 seconds. If the crawler is hanging on a specific URL, this forces it to move on sooner.
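
In code, the change amounts to passing a shorter timeout when opening each URL. A minimal sketch assuming urllib; the fetch helper below is hypothetical:

```python
import socket
from urllib.request import urlopen
from urllib.error import URLError

TIMEOUT_SECONDS = 20  # the new, shorter limit

def fetch(url):
    """Return the page body, or None if the URL errors out or times out."""
    try:
        with urlopen(url, timeout=TIMEOUT_SECONDS) as response:
            return response.read()
    except (URLError, socket.timeout):
        return None  # the crawl moves on instead of hanging on one URL
```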

Added Caching

  • The crawler now caches prefixes. This streamlines the debug log, rather than spamming it with "No prefix detected" messages.
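
A minimal sketch of the caching idea; detect_prefix and the debug output below are stand-ins, not the real implementation:

```python
prefix_cache = {}

def detect_prefix(url):
    """Stand-in for the real prefix-detection logic."""
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            return scheme
    return None

def get_prefix(url):
    # Resolve each URL's prefix once; repeat lookups hit the cache,
    # so the "No prefix detected" debug line fires at most once per URL.
    if url not in prefix_cache:
        prefix = detect_prefix(url)
        if prefix is None:
            print("DEBUG: No prefix detected:", url)
        prefix_cache[url] = prefix
    return prefix_cache[url]
```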

Refactored Default Variables

  • All default options are now placed at the top of the file. This lets the user change options that are not prompted for at runtime, such as the log file location or timeout.
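
The layout looks something like this; the variable names and values are illustrative, not the file's actual contents:

```python
# --- Default options: edit here to change settings not prompted for at runtime ---
LOG_FILE_PATH = "logs/crawl.log"  # where logs are written
TIMEOUT_SECONDS = 20              # per-URL timeout
DEFAULT_DEPTH = 3                 # crawl depth used when defaults are accepted
DEFAULT_LOG_MODE = "unique"       # unique / standard / redundant
```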