
Cookbook: Dealing with misleading HTTP response codes

Mark Jordan edited this page Nov 8, 2016 · 4 revisions

Toolchains that get content from remote websites, such as the CONTENTdm and OAI-PMH toolchains, can be difficult to troubleshoot because the remote web server may not return useful or accurate HTTP response codes. (There's even an example of this in the MIK source code.*) For example, some repository platforms return a 404 Not Found response when the requested item is embargoed, where you might expect a 401 Unauthorized response instead.

Some tips for interpreting the HTTP response codes that show up in your mik.log, or for figuring out why content files contain HTML instead of image, video, or other content:

  • The most common situation is that the remote web server returns a 200 OK response code along with a human-readable HTML page containing a message like "The resource you have requested could not be found", instead of returning a 404 Not Found response.
  • In the case of embargoed content, the returned error may be 404 Not Found, which is written to your mik.log, but if you visit the same URL in Firefox or Chrome, you get a page that says something like "This object is under embargo."

In general, if you see HTTP response codes in your mik.log, verify that they are accurate. Either use curl to request the suspicious URL and see if you can replicate the error (for example, curl -v http://example.com/path/to/something), or open the URL in a web browser to see whether the remote server provides a "helpful" page instead of the content or response you expect.
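
If you have many URLs to check, the same verification can be scripted. The following is a minimal sketch in Python, not part of MIK: the function names diagnose and check_url, the default expected type of image/, and the wording of the notes are all assumptions made for illustration. It compares the status code with the Content-Type header, flagging a 200 OK that carries HTML where a content file was expected, and a 404 that arrives with an HTML page that may explain an embargo.

```python
import urllib.error
import urllib.request


def diagnose(status, content_type, expected_type_prefix):
    """Return a short note on whether a response looks trustworthy.

    Hypothetical helper: the rules below are illustrative heuristics only.
    """
    # Drop parameters such as "; charset=utf-8" from the header value.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if status == 200 and not media_type.startswith(expected_type_prefix):
        return f"suspicious: 200 OK but body is {media_type or 'unknown type'}"
    if status == 404 and media_type == "text/html":
        return "check in a browser: 404 with an HTML page (may be embargoed)"
    if status == 200:
        return "looks fine"
    return f"error status {status}"


def check_url(url, expected_type_prefix="image/"):
    """Fetch url and return (status, content_type, note)."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            status = response.status
            content_type = response.headers.get("Content-Type")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses, but the exception still
        # carries the status code and headers the server sent.
        status = error.code
        content_type = error.headers.get("Content-Type")
    note = diagnose(status, content_type, expected_type_prefix)
    return status, content_type, note
```

For example, check_url("http://example.com/path/to/something") returns the status code, the Content-Type header, and a note for that URL. A "suspicious" or "check in a browser" note is only a prompt to look at the URL by hand; it does not prove that the server is misreporting.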

* CONTENTdm doesn't return a 404 when it should.

Cookbook table of contents
