Releases: z7r1k3/creeper

v2.0.0-beta

12 May 01:38
a511523
Pre-release

Feature Add

  • Added unique logging, the new default logging option. It abandons the tree format and logs each discovered URL exactly once, as a flat list. Standard and redundant logging, which keep the usual tree format, are still available as input options.
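
The idea behind unique logging can be sketched in a few lines of Python. This is only an illustration, not the crawler's actual code: the function name `log_unique` and the sample URLs are hypothetical.

```python
def log_unique(discovered_urls):
    """Return each URL once, in the order it was first seen."""
    seen = set()      # URLs that have already been logged
    unique = []       # flat list that replaces the tree format
    for url in discovered_urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


# Hypothetical crawl result in which the start page is discovered twice.
urls = [
    "https://example.com/",
    "https://example.com/a",
    "https://example.com/",
]
print("\n".join(log_unique(urls)))
```

A set makes the "already logged" check cheap, while the separate list preserves discovery order.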

Bugfix

  • Redundantly logged URLs are now fully accurate in both the logs and terminal output. If you see a URL in the log/output tree without any child URLs, it doesn't have any or wasn't crawled (unless the crawler hit the depth limit, of course).

Changes to Logging and Terminal Output

  • Standard logging now omits a parent URL when it is skipped because it was already crawled. This makes the log/output more streamlined and less confusing.
  • Standard output now prints all URLs, but does not print any Error/Info messages (they are still logged).

Improvements to User Input

  • All input options now individually check for valid input. This means if you mess up one option, you won't have to re-input all of them.
  • Added more defaults, allowing the user to just mash enter after inputting the URL(s) to crawl.
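
Per-option validation with a default might look roughly like the sketch below. The helper `prompt_int`, its `read` parameter, and the prompt text are assumptions made for illustration; the crawler's real prompts may differ.

```python
def prompt_int(message, default, read=input):
    """Ask for one integer option; on bad input, re-ask only this option."""
    while True:
        raw = read(f"{message} [{default}]: ").strip()
        if raw == "":
            return default  # pressing Enter selects the default
        try:
            return int(raw)
        except ValueError:
            print("Please enter a whole number.")


# Scripted answers stand in for a user who first mistypes, then corrects.
scripted = iter(["abc", "3"])
depth = prompt_int("Crawl depth", 1, read=lambda _prompt: next(scripted))
print(depth)
```

Because each option has its own loop, a mistake in one prompt never discards answers already given to earlier prompts.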

Decreased Timeout

  • The timeout for opening a URL has been decreased to 20 seconds. If the crawler is hanging on a specific URL, this forces it to move on sooner.
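
As a rough sketch of how such a timeout can be applied, assuming the standard library's `urllib.request` (the release notes do not say which library the crawler uses), with hypothetical names `URL_TIMEOUT_SECONDS` and `fetch`:

```python
import urllib.request

URL_TIMEOUT_SECONDS = 20  # value taken from the release note above


def fetch(url, timeout=URL_TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Return the response body, or None if the URL fails or times out."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both OSError subclasses,
        # so a hanging or unreachable URL is skipped here.
        return None
```

Returning `None` instead of raising lets the calling loop move on to the next URL once the timeout is reached.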

Added Caching

  • The crawler now caches prefixes. This results in a more streamlined debug log, rather than having it spam "No prefix detected".

Refactored Default Variables

  • All default options are now placed at the top of the file. This allows the user to change options that are not requested during runtime, such as the log file location or timeout.

v1.4.2-beta

05 May 01:59
b3a6818
Pre-release

Improved Logging

This pre-release brings some small but significant improvements to logging. Debug logs contain more information, and URL logs have a more accurate structure.

v1.4.1-beta

21 Apr 01:04
1ce3a22
Pre-release

Code Refactor

Code now meets typical Python style standards.

v1.4.0-beta

20 Apr 00:42
f7a023b
Pre-release

Debug Logging

The new debug log is much more robust, with critical non-error information that will help solve issues in the future.

Code Refactoring

The code as a whole has been refactored further, with a primary focus on making it easy to add and modify log entries.

Prompt Defaults

The program now features a default selection which, if user input is empty, will automatically be used.

Beta

This project is still in beta. All tags have been updated to reflect that.

v1.3.0-beta

24 Mar 02:08
Pre-release

Feature Implementations

Added a new option that allows the user to disable redundant logging for URLs that were already crawled and logged. Doing so will speed up the overall crawl, as writing to a .txt file and the console output takes time.

User input aside from the URL is now checked. If it is of the wrong type, the program prompts for the correct input instead of just throwing an exception.

v1.2.0-beta

29 Sep 05:25
Pre-release

Error Logging
This release introduces proper error logging. Whenever an error occurs, it is now logged to the .error folder, along with the full exception thrown if applicable. Errors shown in the program output have a unique code that can be used to look up the error in the applicable log. Errors are always logged to a file, regardless of user settings.
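
One way to pair a short lookup code with a logged exception is sketched below. Everything here is an assumption for illustration: the helper `log_error`, the file name `error.log`, and the 8-character code format are not taken from the crawler itself, and the demo writes to a temporary directory rather than `.error`.

```python
import os
import tempfile
import traceback
import uuid


def log_error(message, exc=None, log_dir=".error"):
    """Append an error entry to a log file and return its lookup code."""
    code = uuid.uuid4().hex[:8]  # short code to show in program output
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, "error.log")
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(f"[{code}] {message}\n")
        if exc is not None:
            # Write the full exception, including any traceback.
            lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
            log_file.write("".join(lines))
    return code


demo_dir = tempfile.mkdtemp()  # keeps the demo out of the working directory
try:
    int("not a number")
except ValueError as error:
    code = log_error("Could not parse input", error, log_dir=demo_dir)
    print(f"Error {code}: see {os.path.join(demo_dir, 'error.log')}")
```

The terminal shows only the short code, while the log keeps the full details under the same code.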

Code Refactoring
Further progress has been made to make the code more readable and efficient. While it most definitely isn't perfect, it seems satisfactory for the moment.

v1.1.0-beta

04 Sep 02:49
Pre-release

Various Bugfixes
URLs are now handled much more accurately, meaning more discovered URLs will be eligible for crawling.
Other previously unknown bugs have been fixed as well.

Feature Improvements
Tag, attribute, and file ending lists have been expanded to account for more links that were previously being ignored.

Code Refactoring
Major improvements to the code structure. This is still a work in progress, but it is vastly better than before. If you value your eyes, I recommend avoiding the previous commits.

v1.0.6-beta

17 May 21:11
Pre-release

The original URL is now completely whitelisted from any qualifying checks. This serves as a failsafe if, for example, a link is incorrectly skipped because it was detected as an unqualified file type. The user can then take that link's URL and crawl it separately. Assuming it is a valid HTML or FTP page, it will be crawled properly.
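
A minimal sketch of this kind of whitelist follows. The names `SKIPPED_ENDINGS` and `should_crawl`, and the list of endings, are hypothetical and do not reflect the crawler's actual checks.

```python
# Hypothetical file endings that the crawler would normally refuse to crawl.
SKIPPED_ENDINGS = (".jpg", ".png", ".zip")


def should_crawl(url, is_original=False):
    """Decide whether a URL qualifies for crawling."""
    if is_original:
        return True  # the user-supplied URL bypasses every check
    return not url.lower().endswith(SKIPPED_ENDINGS)


print(should_crawl("https://example.com/archive.zip"))
print(should_crawl("https://example.com/archive.zip", is_original=True))
```

The first call is rejected by the file-ending check, while the second is accepted because the URL was supplied directly by the user.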

The fully "production ready" release will include a detailed README.md, but the crawler itself is production ready.