Incorrect content-type for raw pod #258

rwstauner opened this Issue · 7 comments

2 participants


Looking at a module on metacpan and clicking the raw source link takes you straight to the api. For example:

That doc looks terrible in a browser because it's not utf-8 encoded,
however our headers say it is:

HTTP/1.1 200 OK
Server: nginx/0.7.67
Date: Fri, 29 Mar 2013 05:13:53 GMT
Content-Type: text/plain; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Content-Encoding: gzip

The current behavior is wrong because we say it's UTF-8 when it's not.

  • We could detect =encoding\s+(\S+) and alter the charset header. This would be inconsistent with other docs that come from the api, but user-agents are certainly capable of dealing with a per-response encoding.
  • We could convert it to utf-8 but then it wouldn't really be "raw".
  • We could set the content-type for raw files to application/octet-stream but then the browser wouldn't display it at all.
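As a rough illustration of the first option, here is a minimal Python sketch that looks for an `=encoding` directive in the raw bytes and builds a per-response Content-Type header from it. The function name `content_type_for_raw` and the UTF-8 fallback are assumptions made for the example; nothing here is taken from the metacpan codebase.

```python
import re

# Hypothetical helper for illustration only; not part of the MetaCPAN API.
# Matches a pod "=encoding NAME" directive at the start of a line, using the
# same pattern suggested above: =encoding\s+(\S+)
ENCODING_RE = re.compile(rb"^=encoding\s+(\S+)", re.MULTILINE)


def content_type_for_raw(body: bytes, default_charset: str = "UTF-8") -> str:
    """Build a Content-Type value, honoring =encoding when one is declared."""
    charset = default_charset
    match = ENCODING_RE.search(body)
    if match:
        try:
            # Charset names are expected to be plain ASCII; anything else is
            # ignored and the default is kept.
            charset = match.group(1).decode("ascii")
        except UnicodeDecodeError:
            pass
    return f"text/plain; charset={charset}"
```

Note that this only reads the first directive it finds and trusts it for the whole file, which is exactly the weakness discussed in the replies.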

Any thoughts?

This issue could probably be applied to any raw file.


Refs CPAN-API/metacpan-web@502a86f

I don't feel like detecting the encoding of a file by looking at =encoding is correct. Imagine a module that has both use utf8; and =encoding jpn. The source code might contain utf8 while the pod is jpn encoded. No way to make the right decision.
It just feels too magical, and I'd rather see a consistent (i.e. utf8) response than something unreliable. I think we have to ask what the raw response is being used for. And I'd say people are interested in the byte sequence (to create diffs, download, etc.).
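The mixed-encoding problem can be shown with a small Python sketch. The file contents are made up for the example, and `shift_jis` stands in for the "jpn" encoding mentioned above (an assumption, since "jpn" is not an actual codec name).

```python
# Assumed example data: a file whose code is UTF-8 but whose pod is Shift_JIS.
source_part = 'use utf8;\nmy $name = "café";\n'.encode("utf-8")
pod_part = "=encoding shift_jis\n\n=head1 名前\n".encode("shift_jis")
mixed = source_part + pod_part


def try_decode(data: bytes, charset: str):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# Decoding the whole file as UTF-8 fails on the pod section.
as_utf8 = try_decode(mixed, "utf-8")

# Decoding the whole file as Shift_JIS succeeds, but the UTF-8 string literal
# in the code comes out as mojibake instead of "café".
as_sjis = try_decode(mixed, "shift_jis")
```

Neither single charset label describes the whole byte sequence correctly, so whichever one the server announces will be wrong for part of the file.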

@monken monken referenced this issue from a commit in CPAN-API/metacpan-web
@rwstauner rwstauner Honor =encoding directive when decoding raw resonses 502a86f

Are you in favor of setting the content-type to octet-stream then?


what about not providing a charset at all?


I considered that too... I'd have to look up how that's supposed to be interpreted


I guess it's up to the browser then.
My point is that source code is supposed to be ASCII, or UTF-8 if we're talking about Perl code. So the /source endpoint should naturally declare an encoding that lets you view the source, not the documentation. We have the /pod endpoint for displaying documentation in the correct encoding (if provided).


Yeah, that makes sense.
That's an even better argument than "the file could be mixed" (which is sufficiently valid).
There is the encoding pragma for writing perl in other encodings but that's been deprecated.
It's a UTF-8 world now.


I guess we should just leave it the way it is (since it will be right for most cases).


HTTP/1.1 (RFC 2616, section 3.7.1) says that the default charset for "text" media types is ISO-8859-1. But there are too many unlabeled documents in other encodings, so browsers use the reader's preferred encoding when there is no explicit charset parameter.


   The "charset" parameter is used with some media types to define the
   character set (section 3.4) of the data. When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined to have a default charset value of "ISO-8859-1" when
   received via HTTP. Data in character sets other than "ISO-8859-1" or
   its subsets MUST be labeled with an appropriate charset value.

   Some HTTP/1.0 software has interpreted a Content-Type header without
   charset parameter incorrectly to mean "recipient should guess."
   Senders wishing to defeat this behavior MAY include a charset
   parameter even when the charset is ISO-8859-1 and SHOULD do so when
   it is known that it will not confuse the recipient.

   Unfortunately, some older HTTP/1.0 clients did not deal properly with
   an explicit charset parameter. HTTP/1.1 recipients MUST respect the
   charset label provided by the sender; and those user agents that have
   a provision to "guess" a charset MUST use the charset from the
   content-type field if they support that charset, rather than the
   recipient's preference, when initially displaying a document.
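To make the quoted rule concrete, here is a minimal Python sketch of how a strictly spec-following recipient would pick a charset from a Content-Type value. The function name `effective_charset` is made up for the example, and the code models only the quoted text, not what browsers actually do when the charset is missing.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Charset a recipient should assume, following the quoted HTTP/1.1 text."""
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset label from the sender always wins.
    explicit = msg.get_content_charset()  # lowercased, or None if absent
    if explicit is not None:
        return explicit

    # Unlabeled "text" subtypes default to ISO-8859-1 when received via HTTP.
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"

    # Other media types have no default charset under this rule.
    return None
```

Under this reading, dropping the charset from `text/plain` would not make the response neutral: a conforming client would treat it as ISO-8859-1, and a guessing client would pick something based on the reader's settings.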
@rwstauner rwstauner closed this