A website has a badly coded relative URL link, which keeps appending the link's path to the end of the URL. It would look something like this: http://www.example.com/contact-us/contact-us/contact-us/contact-us/contact-us/ (with the contact-us/ segment repeated hundreds of times).
SiteMaster has logic to prevent such a URL from stalling the system. It does this by taking the md5 hash of the URL and looking for another page in the scan that has the same hash.
This does not appear to be working correctly in some circumstances. I believe this is because of URI truncation in the database.
The script first looks up the full URL, but the page is then inserted with the truncated URL. Because the truncated URL has a different md5 hash than the full URL, the lookup never matches and duplicate pages are scanned.
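The mismatch described above can be reproduced with a small sketch. This is not SiteMaster's code: `MAX_URI_LENGTH`, `FakePageTable`, and `queue_page` are hypothetical names, the 255-character limit is an assumed column width, and an in-memory list stands in for the database table that silently truncates long values.

```python
import hashlib

# Assumption: the real column width is not stated in the issue; 255 is illustrative.
MAX_URI_LENGTH = 255


def md5_of(url: str) -> str:
    """Return the hex md5 digest of a URL string."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


class FakePageTable:
    """In-memory stand-in for a table whose uri column truncates long values."""

    def __init__(self):
        self.rows = []  # list of (stored_uri, hash_of_stored_uri)

    def insert(self, url: str) -> None:
        stored = url[:MAX_URI_LENGTH]  # silent truncation on insert
        self.rows.append((stored, md5_of(stored)))

    def has_hash(self, url_hash: str) -> bool:
        return any(h == url_hash for _, h in self.rows)


def queue_page(table: FakePageTable, url: str) -> bool:
    """Buggy check: looks up the hash of the FULL url, then inserts it truncated."""
    if table.has_hash(md5_of(url)):
        return False  # already known, skip
    table.insert(url)
    return True  # queued (again)


long_url = "http://www.example.com/" + "contact-us/" * 100
table = FakePageTable()
first = queue_page(table, long_url)
second = queue_page(table, long_url)  # same URL, but the lookup misses
```

In this sketch both calls queue the page, so the table ends up with two rows for the same URL, while a URL shorter than the limit is deduplicated as intended.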
fixes UNLSiteMaster#115
URLs that were longer than the MySQL uri field length were being truncated, so the same URL was scanned over and over again and the distinct page limit was ignored. This fixes the issue by truncating the URL before it ever reaches MySQL.
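A minimal sketch of the approach described here, assuming a hypothetical `MAX_URI_LENGTH` of 255 characters (the actual field length is not given) and hypothetical helper names `normalize_uri`, `uri_hash`, and `queue_page`. The idea is that the URL is cut to the column width once, up front, so the lookup and the insert both use the hash of the same string.

```python
import hashlib

# Assumption: stands in for the real width of the uri column.
MAX_URI_LENGTH = 255


def normalize_uri(url: str, max_length: int = MAX_URI_LENGTH) -> str:
    """Truncate the URL to the column width before it reaches the database."""
    return url[:max_length]


def uri_hash(url: str) -> str:
    """Hash the truncated form, so lookups and inserts always agree."""
    return hashlib.md5(normalize_uri(url).encode("utf-8")).hexdigest()


def queue_page(seen: dict, url: str) -> bool:
    """Queue a page unless a page with the same (truncated) hash already exists."""
    key = uri_hash(url)
    if key in seen:
        return False
    seen[key] = normalize_uri(url)
    return True


seen_pages = {}
url_a = "http://www.example.com/" + "contact-us/" * 100
url_b = "http://www.example.com/" + "contact-us/" * 200

queued_a = queue_page(seen_pages, url_a)
queued_b = queue_page(seen_pages, url_b)  # shares url_a's first 255 characters
```

One consequence of this design is that two over-long URLs sharing the same leading characters collapse into a single page, which is the behavior that stops the runaway `contact-us/` chain from being scanned repeatedly.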