Bug in non-fatal-error log #160

Open
kris-sigur opened this issue May 2, 2016 · 2 comments
Labels
bug

Comments

Collaborator

@kris-sigur kris-sigur commented May 2, 2016

While resolving #158 I noticed that the resulting entry in the non-fatal-error log contained a redundant stack trace, i.e. a single exception produced the same trace twice:

2016-05-02T12:58:56.239Z   401       4758 http://aktravel.is/en/fundir-og-radstefnur/framkvaemd-radstefnu - - text/html #001 20160502125856064+163 sha1:KKGWJBFE2H4XPVTWXRNIMTCEQTJ4N76N - -
 java.lang.IllegalStateException: Missing auth challenge headers for uri with response status 401: http://aktravel.is/en/fundir-og-radstefnur/framkvaemd-radstefnu
    at org.archive.modules.fetcher.FetchHTTP.extractChallenges(FetchHTTP.java:884)
    at org.archive.modules.fetcher.FetchHTTP.handle401(FetchHTTP.java:802)
    at org.archive.modules.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:743)
    at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
    at org.archive.modules.Processor.process(Processor.java:142)
    at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
    at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
 java.lang.IllegalStateException: Missing auth challenge headers for uri with response status 401: http://aktravel.is/en/fundir-og-radstefnur/framkvaemd-radstefnu
    at org.archive.modules.fetcher.FetchHTTP.extractChallenges(FetchHTTP.java:884)
    at org.archive.modules.fetcher.FetchHTTP.handle401(FetchHTTP.java:802)
    at org.archive.modules.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:743)
    at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
    at org.archive.modules.Processor.process(Processor.java:142)
    at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
    at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)

Probably a bug somewhere in

org.archive.crawler.io.NonFatalErrorFormatter.format()

Collaborator Author

@kris-sigur kris-sigur commented May 2, 2016

Hmm, looking into this a bit more closely, this may actually be a bug in webarchive-commons, in

org.archive.io.GenerationFileHandler.publish()

  ((Preformatter)f).preformat(record); 
  super.publish(record);

It seems that both lines ultimately invoke NonFatalErrorFormatter.format(), whereas publish() should be using the string already prepared by preformat().
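
To make the intended behavior concrete, here is a minimal sketch of the preformat-then-publish pattern built on plain java.util.logging. All class names (CachingStackTraceFormatter, PreformattingHandler, PreformatDemo) are hypothetical and this is not the actual webarchive-commons or Heritrix code. It only illustrates the assumption that format() can reuse the string cached by preformat(), so the stack trace is rendered once per record.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.StreamHandler;

/** Hypothetical formatter: caches the text built by preformat() for reuse in format(). */
class CachingStackTraceFormatter extends Formatter {
    private LogRecord cachedRecord; // record most recently preformatted
    private String cachedText;      // its rendered text
    int renderCount = 0;            // how many times the rendering actually ran

    /** Render once, ahead of publish(), and remember the result. */
    synchronized void preformat(LogRecord record) {
        cachedText = render(record);
        cachedRecord = record;
    }

    @Override
    public synchronized String format(LogRecord record) {
        // Reuse the preformatted text when it belongs to this exact record.
        if (record == cachedRecord) {
            return cachedText;
        }
        return render(record);
    }

    /** Message line followed by the stack trace of the attached exception, if any. */
    private String render(LogRecord record) {
        renderCount++;
        StringWriter sw = new StringWriter();
        sw.append(formatMessage(record)).append('\n');
        Throwable t = record.getThrown();
        if (t != null) {
            t.printStackTrace(new PrintWriter(sw, true));
        }
        return sw.toString();
    }
}

/** Hypothetical handler mirroring the two-line preformat/publish sequence quoted above. */
class PreformattingHandler extends StreamHandler {
    PreformattingHandler(ByteArrayOutputStream out, CachingStackTraceFormatter f) {
        super(out, f);
    }

    @Override
    public synchronized void publish(LogRecord record) {
        ((CachingStackTraceFormatter) getFormatter()).preformat(record);
        super.publish(record); // StreamHandler calls getFormatter().format(record)
        flush();
    }
}

final class PreformatDemo {
    /** Publishes one record carrying an exception and returns the text written. */
    static String publishOnce(CachingStackTraceFormatter f) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PreformattingHandler h = new PreformattingHandler(out, f);
        LogRecord r = new LogRecord(Level.WARNING, "fetch failed");
        r.setThrown(new IllegalStateException("Missing auth challenge headers"));
        h.publish(r);
        return out.toString();
    }

    public static void main(String[] args) {
        CachingStackTraceFormatter f = new CachingStackTraceFormatter();
        System.out.print(publishOnce(f));
        System.out.println("renders: " + f.renderCount);
    }
}
```

The identity check (record == cachedRecord) is a design choice for the sketch: it keeps format() correct for records that were never preformatted, at the cost of holding a reference to the last record.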

Collaborator Author

@kris-sigur kris-sigur commented May 2, 2016

Also, I've confirmed that this bug was introduced between 3.0.0 and 3.1.0-RC1. It first shows up in our 2011-02 crawl, which is when we switched from 3.0.0 to 3.1.0-RC1.

@ato ato added the bug label Aug 2, 2018