PDF not rendering and title displayed as \376\377\000A\000f\000s\000c\000h\000r\000i\000f\000t\000 \000b\000r\000i\000e\000f\000 \000a\000a\000n\000 \000k\000o\000n\000i\000n\000g\000 \000B\000e\000a\000t\000r\000i\000x\000 \000o\000v\000e\000r\000 \000o\ #1597

Closed
user1113 opened this issue Apr 23, 2012 · 7 comments · Fixed by #1734

@user1113

Hi,

This PDF does not render in PDF.js, and the title is shown as \376\377\000A\000f\000s\000c\000h\000r\000i\000f\000t\000 \000b\000r\000i\000e\000f\000 \000a\000a\000n\000 \000k\000o\000n\000i\000n\000g\000 \000B\000e\000a\000t\000r\000i\000x\000 \000o\000v\000e\000r\000 \000o\000n\000t\000s\000l\000a\000g\000 \000k\000a\000b\000i\000n\000e\000t instead of the actual title, "Afschrift brief aan koning Beatrix over ontslag kabinet".

Adobe gives the following additional information about it:

  • Title
    • Afschrift brief aan koning Beatrix over ontslag kabinet
  • Author
    • rvd527
  • Application
    • PDFCreator Version 0.9.9
  • PDF-version
    • 1.4
  • Included fonts
    • Calibri
    • Verdana regular+bold+italic
    • (All are TrueType)
  • It meets the PDF/A spec

(Of course, you can also download the document itself at the link given.)

[Screenshots: PDF.js, Adobe Reader]

My information:
Firefox add-on version 0.2.537
User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:11.0) Gecko/20100101 Firefox/11.0

@gigaherz
Contributor

The title looks like it may be UTF-16 that was written out as ASCII, with the non-ASCII bytes escaped in octal...
@saebekassebil may be interested in that since he wrote the metadata decoding stuff.

There don't seem to be any messages in the console, so I can't say anything about the display issue.
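
To illustrate the observation about octal escapes, here is a minimal sketch that turns backslash-octal sequences back into byte values. `unescapeOctal` is a hypothetical helper written for this example; it is not a PDF.js function and may differ from how PDF.js handles string escapes.

```javascript
// Hypothetical helper (not part of PDF.js): convert backslash-octal escapes
// such as "\376" into byte values; every other character is kept as its
// low 8 bits.
function unescapeOctal(text) {
  const bytes = [];
  let i = 0;
  while (i < text.length) {
    // An escape is a backslash followed by one to three octal digits.
    const match = /^\\([0-7]{1,3})/.exec(text.slice(i, i + 4));
    if (match) {
      bytes.push(parseInt(match[1], 8) & 0xff);
      i += match[0].length;
    } else {
      bytes.push(text.charCodeAt(i) & 0xff);
      i += 1;
    }
  }
  return bytes;
}

// First few characters of the reported title.
const titleBytes = unescapeOctal('\\376\\377\\000A\\000f');
console.log(titleBytes.map((b) => b.toString(16).padStart(2, '0')).join(' '));
// prints "fe ff 00 41 00 66"
```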

@gigaherz
Contributor

For reference: the string begins with 0xFE, 0xFF, which is the byte order mark for UTF-16BE (character U+FEFF). That makes it clear how to decode the string to Unicode, or to whatever JavaScript uses as its internal string representation.
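
A rough sketch of that decoding step, assuming the octal escapes have already been turned into an array of byte values. `decodeUtf16BE` is a name made up for this example, not the PDF.js implementation. JavaScript strings are sequences of UTF-16 code units, so each big-endian byte pair maps directly onto one string element.

```javascript
// Hypothetical helper (not part of PDF.js): decode bytes that start with
// the UTF-16BE byte order mark (0xFE 0xFF) into a JavaScript string.
function decodeUtf16BE(bytes) {
  if (bytes.length < 2 || bytes[0] !== 0xfe || bytes[1] !== 0xff) {
    throw new Error('missing UTF-16BE byte order mark');
  }
  let result = '';
  // Skip the BOM, then combine each high/low byte pair into one code unit.
  for (let i = 2; i + 1 < bytes.length; i += 2) {
    result += String.fromCharCode((bytes[i] << 8) | bytes[i + 1]);
  }
  return result;
}

const sample = [0xfe, 0xff, 0x00, 0x41, 0x00, 0x66, 0x00, 0x73];
console.log(decodeUtf16BE(sample)); // prints "Afs"
```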

@saebekassebil
Contributor

The title in the embedded Metadata XMP document is:
"\\376\\377\\000A\\000f\\000s\\000c\\000h\\000r\\000i\\000f\\000t\\000 \\000b\\000r\\000i\\000e\\000f\\000 \\000a\\000a\\000n\\000 \\000k\\000 o\\000n\\000i\\000n\\000g\\000 \\000B\\000e\\000a\\000t\\000r\\000i\\000x\\000 \\000o\\000v\\000e\\000r\\000 \\000o\\000n\\000t\\000s\\000l\\000a\\000g \\000 \\000k\\000a\\000b\\000i\\000n\\000e\\000t"

So the whole title is actually there, but interleaved with NUL characters. This is probably because your PDF generator is using a wide character encoding (UTF-16) instead of UTF-8. I think yury is fixing this with the PR right above.

@gigaherz
Contributor

Yes he is, and yes, it is exactly that. As I said, it's UTF-16BE: the octal escapes \376\377 translate to 0xFE, 0xFF, which is the big-endian byte order for 0xFEFF, the byte order mark (a zero-width no-break space code that apps loading UTF-16 use to deduce the endianness).
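
The arithmetic and the endianness check can be sketched as follows. `detectUtf16ByteOrder` is a hypothetical name used only for this illustration.

```javascript
// Octal 376 and 377 are 0xFE and 0xFF.
console.log(parseInt('376', 8).toString(16), parseInt('377', 8).toString(16));
// prints "fe ff"

// Hypothetical helper: deduce the UTF-16 byte order from the first two bytes.
// U+FEFF stored big-endian is FE FF; stored little-endian it is FF FE.
function detectUtf16ByteOrder(bytes) {
  if (bytes.length >= 2) {
    if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'big-endian';
    if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'little-endian';
  }
  return 'unknown';
}

console.log(detectUtf16ByteOrder([0xfe, 0xff, 0x00, 0x41])); // prints "big-endian"
```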

@yurydelendik
Contributor

Inside the PDF format structures the title is set and is read by pdf.js properly. The XMP/XML data is set to exactly the same values as the fields inside the PDF format structures, which is wrong -- the XML encoding should be used there.

The information above is taken from Adobe Reader's PDF metadata; could you also check the XMP tab, if one is present?
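
As a rough illustration of the difference: in XML the title would appear as ordinary Unicode text with XML entity escapes, not as a PDF string with octal escapes. The sketch below assumes that reading of "XML encoding"; `escapeXmlText` is a hypothetical helper and has nothing to do with how PDFCreator or PDF.js actually produce or read XMP.

```javascript
// Hypothetical helper: escape the characters that are special in XML text.
function escapeXmlText(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };
  return text.replace(/[&<>]/g, (ch) => entities[ch]);
}

// What a generator copied verbatim from the PDF string (the wrong form).
const pdfStyle = '\\376\\377\\000A\\000f\\000s';
// What the XML text should carry: the decoded title, XML-escaped.
const xmlStyle = escapeXmlText('Afschrift brief aan koning Beatrix over ontslag kabinet');

console.log(pdfStyle);
console.log(xmlStyle);
```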

@user1113
Author

> could you also check xmp tab if one is present?

I don't see any XMP tab.

@brendandahl
Contributor

The issue should be resolved with the above pull requests. If I overlooked anything please leave a comment and we'll re-open the issue.
