Bug in non-fatal-error log #160

Open
kris-sigur opened this Issue May 2, 2016 · 2 comments

Collaborator

kris-sigur commented May 2, 2016

While resolving #158, I noticed that the resulting entry in the non-fatal-error log contained a duplicated stack trace. That is, a single exception produced the following:

2016-05-02T12:58:56.239Z   401       4758 http://aktravel.is/en/fundir-og-radstefnur/framkvaemd-radstefnu - - text/html #001 20160502125856064+163 sha1:KKGWJBFE2H4XPVTWXRNIMTCEQTJ4N76N - -
 java.lang.IllegalStateException: Missing auth challenge headers for uri with response status 401: http://aktravel.is/en/fundir-og-radstefnur/framkvaemd-radstefnu
    at org.archive.modules.fetcher.FetchHTTP.extractChallenges(FetchHTTP.java:884)
    at org.archive.modules.fetcher.FetchHTTP.handle401(FetchHTTP.java:802)
    at org.archive.modules.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:743)
    at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
    at org.archive.modules.Processor.process(Processor.java:142)
    at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
    at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)
 java.lang.IllegalStateException: Missing auth challenge headers for uri with response status 401: http://aktravel.is/en/fundir-og-radstefnur/framkvaemd-radstefnu
    at org.archive.modules.fetcher.FetchHTTP.extractChallenges(FetchHTTP.java:884)
    at org.archive.modules.fetcher.FetchHTTP.handle401(FetchHTTP.java:802)
    at org.archive.modules.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:743)
    at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
    at org.archive.modules.Processor.process(Processor.java:142)
    at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
    at org.archive.crawler.framework.ToeThread.run(ToeThread.java:148)

Probably a bug somewhere in

org.archive.crawler.io.NonFatalErrorFormatter.format()

Collaborator

kris-sigur commented May 2, 2016

Hmm, looking into this a bit closer, this may actually be a bug in the webarchive-commons

org.archive.io.GenerationFileHandler.publish()

  ((Preformatter)f).preformat(record); 
  super.publish(record);

It seems that both lines ultimately invoke NonFatalErrorFormatter.format(), which would explain the duplicated stack trace. publish() should instead use the string already prepared by preformat().
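
To illustrate how a preformat-then-publish sequence could end up writing the trace twice, here is a minimal, self-contained sketch. It is not the Heritrix or webarchive-commons code: CachingFormatter, BuggyFormatter, FixedFormatter and the helper methods are hypothetical names, and the caching behaviour is an assumption about one way the duplication might arise. Only java.util.logging.Formatter and LogRecord are real APIs.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

class PreformatSketch {

    /** Renders the record's throwable, if any, as a stack trace string. */
    static String trace(LogRecord record) {
        Throwable t = record.getThrown();
        if (t == null) {
            return "";
        }
        StringWriter out = new StringWriter();
        t.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    /** Hypothetical base class: preformat() caches a line for the next format() call. */
    static class CachingFormatter extends Formatter {
        private String cached;

        void preformat(LogRecord record) {
            cached = format(record); // virtual call, so subclass additions get cached too
        }

        @Override
        public String format(LogRecord record) {
            if (cached == null) {
                return record.getMessage() + System.lineSeparator();
            }
            String line = cached;
            cached = null;
            return line;
        }
    }

    /** Assumed bug: the trace is appended even when super.format() returns the cached line. */
    static class BuggyFormatter extends CachingFormatter {
        @Override
        public String format(LogRecord record) {
            return super.format(record) + trace(record);
        }
    }

    /** One possible fix: a single render() step whose result format() reuses. */
    static class FixedFormatter extends Formatter {
        private String cached;

        void preformat(LogRecord record) {
            cached = render(record);
        }

        @Override
        public String format(LogRecord record) {
            String line = (cached != null) ? cached : render(record);
            cached = null;
            return line;
        }

        private String render(LogRecord record) {
            return record.getMessage() + System.lineSeparator() + trace(record);
        }
    }

    /** Builds a made-up record carrying an exception, similar in shape to the log entry above. */
    static LogRecord sampleRecord() {
        LogRecord record = new LogRecord(Level.WARNING, "401 http://example.org/");
        record.setThrown(new IllegalStateException("Missing auth challenge headers"));
        return record;
    }

    /** Counts how many stack traces appear in a formatted entry. */
    static int countTraces(String entry) {
        String marker = "java.lang.IllegalStateException";
        int count = 0;
        for (int i = entry.indexOf(marker); i >= 0; i = entry.indexOf(marker, i + 1)) {
            count++;
        }
        return count;
    }

    static int buggyTraceCount() {
        LogRecord record = sampleRecord();
        BuggyFormatter f = new BuggyFormatter();
        f.preformat(record);                  // stands in for ((Preformatter)f).preformat(record)
        return countTraces(f.format(record)); // stands in for the format() call inside super.publish(record)
    }

    static int fixedTraceCount() {
        LogRecord record = sampleRecord();
        FixedFormatter f = new FixedFormatter();
        f.preformat(record);
        return countTraces(f.format(record));
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + buggyTraceCount() + ", fixed: " + fixedTraceCount());
    }
}
```

Running main prints `buggy: 2, fixed: 1`. In the buggy variant the cached line already contains the trace, so the second format() call appends it again. Whether the real formatter fails in this way would need to be checked against the actual source.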

Collaborator

kris-sigur commented May 2, 2016

Also, I've confirmed that this bug was introduced between 3.0.0 and 3.1.0-RC1. It first shows up in our 2011-02 crawl, which is when we switched from 3.0.0 to 3.1.0-RC1.

ato added the bug label Aug 2, 2018
