index.qhp malformed when using special characters in \addindex (Origin: bugzilla #686259) #4926

Closed
doxygen opened this Issue Jul 2, 2018 · 0 comments

doxygen commented Jul 2, 2018

status RESOLVED severity major in component general for ---
Reported in version 1.8.2-SVN on platform Other
Assigned to: Dimitri van Heesch

Original attachment names and IDs:

On 2012-10-17 02:14:47 +0000, Bastiaan wrote:

Created attachment 226600
patch

In an attempt to test the improvements in doxygenw20120930_1_8_2.zip I encountered the following:

\addindex coördinaten

produces this item in index.qhp:

As a consequence, qhelpgenerator fails with the message "Encountered incorrectly encoded content."

I am attaching a patch that %-encodes the anchor fragment and handles Unicode correctly. It is modelled after QUrl of Qt 4.7. It does not touch the name and id arguments, but it solves the above-mentioned problem.

On 2012-10-28 11:31:17 +0000, Dimitri van Heesch wrote:

Hi Bastiaan,

I'm not sure your solution would work for general UTF-8 characters (i.e. multi-byte characters).

Did you configure your input encoding correctly (see INPUT_ENCODING)? Then I would expect that the anchor would be UTF-8 encoded as well.

Can you attach a self-contained example (source+config file in a zip) that allows me to reproduce this issue?

On 2012-10-29 08:43:05 +0000, Bastiaan wrote:

Created attachment 227507
Example

Hi Dimitri,

Here is the example. Yes, INPUT_ENCODING is left at UTF-8. As far as I can see, ö is a multi-byte character, as it gives two characters in the name and id fields of the keyword (coÃ¶rdinaten) when index.qhp is viewed in ISO 8859-15 encoding (Kate). Still the anchor displays as "#coördinaten". So it looks like the name and id are encoded but not the anchor. Consequently, it cannot be opened in UTF-8.
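The encoding mismatch described above can be reproduced outside doxygen. The following is a minimal Python sketch, not part of the attached example. It assumes the anchor was written as a single ISO 8859-15 byte, which is what the observation above suggests but which has not been confirmed against the generated file. The helper name `is_valid_utf8` is hypothetical.

```python
# Illustration only: how the bytes of "ö" look under different encodings.
word = "coördinaten"

# In UTF-8, ö becomes the two bytes 0xC3 0xB6.
utf8_bytes = word.encode("utf-8")

# What a viewer set to ISO 8859-15 shows for those UTF-8 bytes.
as_latin9 = utf8_bytes.decode("iso8859-15")

# In ISO 8859-15, ö is the single byte 0xF6 (assumed form of the anchor).
latin9_bytes = word.encode("iso8859-15")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes)
print(as_latin9)
print(is_valid_utf8(utf8_bytes), is_valid_utf8(latin9_bytes))
```

A file that mixes both byte forms is not valid UTF-8 as a whole, which would be consistent with the "Encountered incorrectly encoded content." message from qhelpgenerator.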

Apart from that, the individual bytes of the multi-byte character need to be percent-encoded in the URI, giving "co%C3%B6rdinaten". Many browsers will decode that URI and display the encoded characters normally, as "#coördinaten" in this case. The patch is a generic solution: it also works for spaces, quotes, etc.
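To make the percent-encoding step concrete, here is a small Python sketch. It is not the attached patch, which is C++ modelled after QUrl. It uses `urllib.parse.quote` from the standard library as a stand-in, and the function name `encode_fragment` is hypothetical.

```python
from urllib.parse import quote, unquote


def encode_fragment(text: str) -> str:
    """Percent-encode every byte of the UTF-8 form of text that is not
    an unreserved URI character (letters, digits, '-', '.', '_', '~')."""
    # safe="" so that "/", spaces and quotes are escaped as well.
    return quote(text, safe="", encoding="utf-8")


encoded = encode_fragment("coördinaten")
print("#" + encoded)
print(encode_fragment('a b"c'))

# Decoding restores the original text, as a browser would display it.
print(unquote(encoded))
```

Because the encoded fragment contains only ASCII characters, it reads the same whether the file is opened as UTF-8 or as ISO 8859-15.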

With the patch index.qhp can be opened in UTF-8, then the name and id show as "coördinaten" and the anchor is percent-encoded; and qhelpgenerator is happy.

Hope this helps,
Bastiaan.

On 2013-05-11 19:49:26 +0000, Dimitri van Heesch wrote:

Thanks, I'll include the patch in the next subversion update.

On 2013-05-19 12:35:32 +0000, Dimitri van Heesch wrote:

This bug was previously marked ASSIGNED, which means it should be fixed in
doxygen version 1.8.4. Please verify if this is indeed the case. Reopen the
bug if you think it is not fixed and please include any additional information
that you think can be relevant.

@doxygen doxygen closed this Jul 2, 2018
