Releases: z7r1k3/creeper
v2.0.0-beta
Feature Addition
- Added unique logging, the new default logging option. It abandons the tree format and instead logs each discovered URL once, and only once, as a flat list (see the sketch below). Standard and redundant logging, which keep the usual tree format, are still available upon user input.
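A minimal sketch of the idea behind unique logging, assuming a simple set-based de-duplication; the names here are illustrative, not the crawler's actual internals:

```python
# Hypothetical sketch: each discovered URL is written to the log once,
# no matter how many pages link to it.
logged_urls = set()

def log_unique(url, log_file="urls.log"):
    """Append the URL to the log only the first time it is seen."""
    if url in logged_urls:
        return
    logged_urls.add(url)
    with open(log_file, "a") as log:
        log.write(url + "\n")
```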
Bugfix
- Redundantly logged URLs are now fully accurate in both the logs and the terminal output. If a URL appears in the log/output tree without any child URLs, it either has none or was not crawled (unless the crawler hit the depth limit, of course).
Changes to Logging and Terminal Output
- Standard logging now omits the parent URL if it is being skipped because it was already crawled. This makes the log/output more streamlined and less confusing.
- Standard output now prints all URLs, but does not print any Error/Info messages (they are still logged).
Improvements to User Input
- All input options are now validated individually, so a mistake in one option no longer forces you to re-enter all of them (see the sketch after this list).
- Added more defaults, allowing the user to simply press Enter through the remaining prompts after entering the URL(s) to crawl.
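A rough sketch of how per-option prompting with defaults can work; the function name, prompt text, and default value below are assumptions for illustration:

```python
# Hypothetical sketch: each option validates its own input and falls back to a
# default on an empty answer, so one bad entry never forces a full restart.
def prompt_int(message, default):
    """Keep asking until the user enters a whole number or accepts the default."""
    while True:
        raw = input(f"{message} [{default}]: ").strip()
        if raw == "":
            return default  # pressing Enter accepts the default
        try:
            return int(raw)
        except ValueError:
            print("Please enter a whole number.")

depth = prompt_int("Crawl depth", 3)
```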
Decreased Timeout
- The timeout for opening a URL has been decreased to 20 seconds. If the crawler is hanging on a specific URL, this forces it to move on sooner.
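As a hedged illustration of the behavior (not the crawler's exact fetch code), the standard library makes this a one-argument change:

```python
# Hypothetical sketch: cap how long the crawler waits when opening a URL.
from urllib.request import urlopen

TIMEOUT_SECONDS = 20  # illustrative constant name

def fetch(url):
    """Return the page body, or None if the request fails or times out."""
    try:
        with urlopen(url, timeout=TIMEOUT_SECONDS) as response:
            return response.read()
    except OSError:  # covers URLError, timeouts, and other socket errors
        return None
```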
Added Caching
- The crawler now caches prefixes, resulting in a more streamlined debug log instead of repeated "No prefix detected" spam.
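A minimal sketch of the caching idea, assuming a dictionary keyed by host; the names and the debug message handling are illustrative only:

```python
# Hypothetical sketch: remember the detected prefix per host so the debug log
# records "No prefix detected" once instead of on every repeated lookup.
from urllib.parse import urlparse

prefix_cache = {}

def get_prefix(url):
    """Return the scheme prefix for a URL, computing and logging it once per host."""
    parsed = urlparse(url)
    host = parsed.netloc or url
    if host in prefix_cache:
        return prefix_cache[host]
    prefix = parsed.scheme + "://" if parsed.scheme else None
    if prefix is None:
        print("DEBUG: No prefix detected for", host)
    prefix_cache[host] = prefix
    return prefix
```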
Refactored Default Variables
- All default options are now placed at the top of the file. This allows the user to change options that are not prompted for at runtime, such as the log file location or the timeout.
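For example, a script structured this way keeps all tweakable values in one visible block; the specific names and values below are examples, not the real defaults:

```python
# Hypothetical sketch: module-level defaults at the top of the file, editable
# without touching the rest of the code.
LOG_DIRECTORY = "logs/"
ERROR_LOG_DIRECTORY = "logs/error/"
TIMEOUT_SECONDS = 20
DEFAULT_CRAWL_DEPTH = 3
DEFAULT_LOG_MODE = "unique"
```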
v1.4.2-beta
Improved Logging
This pre-release brings some small but significant improvements to logging. Debug logs contain more information, and URL logs have a more accurate structure.
v1.4.1-beta
Code Refactor
Code now meets typical Python style standards.
v1.4.0-beta
Debug Logging
The new debug log is much more robust, with critical non-error information that will help solve issues in the future.
Code Refactoring
The code as a whole has been refactored further, with a primary focus on making it easy to add and modify log entries.
Prompt Defaults
The program now features default selections: if user input is left empty, the default is automatically used.
Beta
This project is still in beta. Updating all tags to reflect that.
v1.3.0-beta
Feature Implementations
Added a new option that allows the user to disable redundant logging for URLs that were already crawled and logged. Doing so speeds up the overall crawl, since writing to the .txt log and the console output takes time.
User input other than the URL is now checked and, if the type is incorrect, the program prompts for valid input instead of simply throwing an exception.
v1.2.0-beta
Error Logging
This release introduces proper error logging. Whenever an error occurs, it is logged to the .error folder, along with the full exception thrown if applicable. Errors shown in the program output include a unique code that can be used to look up the error in the applicable log. Errors are always logged to a file, regardless of user settings.
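A hedged sketch of the mechanism, with the code format, file path, and function name invented for illustration:

```python
# Hypothetical sketch: log the error with a short unique code and, if an
# exception is available, the full traceback; print the same code to the terminal.
import traceback
from datetime import datetime
from uuid import uuid4

def log_error(message, exc=None, log_path="error.log"):
    code = uuid4().hex[:8]  # short code the user can search for in the log
    with open(log_path, "a") as log:
        log.write(f"[{datetime.now().isoformat()}] [{code}] {message}\n")
        if exc is not None:
            log.write("".join(traceback.format_exception(type(exc), exc, exc.__traceback__)))
    print(f"Error {code}: {message} (see {log_path} for details)")
    return code
```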
Code Refactoring
Further progress has been made to make the code more readable and efficient. While it most definitely isn't perfect, it seems satisfactory for the moment.
v1.1.0-beta
Various Bugfixes
URLs are now handled much more accurately, meaning more discovered URLs will be eligible for crawling.
Other previously unknown bugs have been fixed as well.
Feature Improvements
Tag, attribute, and file ending lists have been expanded to account for more links that were previously being ignored.
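To illustrate the list-driven approach (with example lists, not the crawler's exact ones), extraction can be as simple as checking each start tag against configurable lists:

```python
# Hypothetical sketch: tags and attributes to inspect live in plain lists,
# so widening link coverage is a matter of adding entries.
from html.parser import HTMLParser

LINK_TAGS = ["a", "link", "img", "script", "iframe"]  # example values
LINK_ATTRIBUTES = ["href", "src"]                     # example values

class LinkExtractor(HTMLParser):
    """Collect attribute values that look like links from the listed tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in LINK_TAGS:
            return
        for name, value in attrs:
            if name in LINK_ATTRIBUTES and value:
                self.links.append(value)
```

Feeding a page's HTML to an instance via `feed()` and reading its `links` list yields every candidate URL found in those tags.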
Code Refactoring
Major improvements to the code structure. This is still a work in progress, but it is vastly better than before. If you value your eyes, I recommend avoiding the previous commits.
v1.0.6-beta
Completely whitelists the original URL from any qualifying checks. This serves as a failsafe if, for example, a link is incorrectly not crawled because it is detected as an unqualified file type. The user can then take that URL and crawl it separately; assuming it is a valid HTML or FTP page, the crawler will handle it properly.
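A minimal sketch of that failsafe, with a placeholder filter list and function name:

```python
# Hypothetical sketch: the user-supplied URL bypasses the file-type filter,
# so it is always eligible for crawling; everything else is checked as usual.
SKIPPED_ENDINGS = (".jpg", ".png", ".pdf", ".zip")  # example filter, not the real list

def should_crawl(url, original_url):
    if url == original_url:
        return True
    return not url.lower().endswith(SKIPPED_ENDINGS)
```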
Fully "production ready" release will include a detailed README.md, but the crawler itself is production ready.