Noisy alerts about 401s without auth challenge #158
A 401 response is supposed to include an auth challenge (a WWW-Authenticate header), but in practice a lot of sites erroneously return 401 without one (they should really be using 403).
When Heritrix encounters such a response, it logs the error in such a manner that it is added to the alerts log. As this isn't an issue with the crawler, the entry isn't very useful, and the flood of such errors may hide other, more serious and actionable errors.
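To make the condition concrete, here is a minimal Python sketch of the check. It is not Heritrix code: the logger name and both function names are hypothetical, and it assumes the alerts log is fed by messages at WARNING level and above, so logging at INFO would keep these entries out of it.

```python
import logging

logger = logging.getLogger("crawler.sketch")  # hypothetical logger name


def is_401_without_challenge(status: int, headers: dict) -> bool:
    """True when a 401 response carries no WWW-Authenticate header."""
    # Header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    return status == 401 and "www-authenticate" not in names


def report_response(uri: str, status: int, headers: dict) -> int:
    """Log a challenge-less 401 at INFO; return the level used (NOTSET if nothing was logged)."""
    if is_401_without_challenge(status, headers):
        # Assumption: alerts are driven by WARNING and above, so INFO stays out of them.
        logger.info("401 without auth challenge: %s", uri)
        return logging.INFO
    return logging.NOTSET
```

The point of splitting detection from reporting is that the site-side misconfiguration is still recorded, just at a level that does not compete with actionable crawler errors.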
Example entry from the alerts log:
Suggest we modify how these errors are handled and log them in the