Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Xml module does not normalize the document, making it difficult to use addenda #76

Closed
mquinson opened this issue Feb 6, 2018 · 8 comments

Comments

@mquinson
Copy link
Owner

mquinson commented Feb 6, 2018

Initially reported on Alioth by me, back in 2005:

For the addenda to work, it is better to have short lines (so that the 
translator can choose between which ones to put there addenda). 
The Sgml module does so since the bug #263300, in Debian.
@osamuaoki
Copy link
Contributor

This bug is preventing to move debian-history to migrate to XML.

It's not the bug in the normalization of output to the file but the way output strings are stored on @{$self->{TT}{doc_out}}. Although code expects array item to contain one line, it actually contain new line character for XML case.

If you carefully check output under debug, this is obvious since it lists multiple lines as the matched string. (Yes, it's too much to realize. I actually needed to change source to clearly mark matched line ... but that is another story) This reviled that array item contained multiple lines.

So what is needed seems to be something like:
@{$self->{TT}{doc_out}} = map {"$\n"} map {split '\n', $ } @{$self->{TT}{doc_out}};

This is a simple code to split multi-line item into multiple items of array. This sort-of works right if this is inserted after encoding setting before checking match with $position around line 651.

Now the question is the detail around "\n" handling. Avoiding perl odd rule using -1 for split seemed to be good thing to consider in order not to drop tailing "\n" for split. I thought doing the following seems to be better if each array item itself

@{$self->{TT}{doc_out}} = map {"$\n"} map {split '\n', $, -1 } @{$self->{TT}{doc_out}};

But this seems to produce too many "\n"s.

The initial map {"$_\n"} looks stupid time consuming code. So Setting one of:

  • IO::Handle->output_record_separator( EXPR )
  • $OUTPUT_RECORD_SEPARATOR
  • $ORS
  • $
    seems to be good before "print" if you are perl speaker. But I only see "write" in code which doesn't do this trick. I am not good enough to judge code style.

So I propose to add following as patch to address this bug. This works OK to add translator names into legalnotice within bookinfo tag.

0001-Ensure-one-line-per-item-of-array.patch.txt

@osamuaoki
Copy link
Contributor

If you check my patch doesn't have any unwanted side effects, please upload new package with this fix. Than many XML doc get benefit of addendum for buster.

Please.

@mquinson
Copy link
Owner Author

mquinson commented Sep 3, 2018

I will test your patch ASAP, but that's the beginning of the new term at university here. My free time is really scarce, sorry... Do not hesitate to ping me regularly if you don't see anything coming.

@osamuaoki
Copy link
Contributor

I am not familiar with test suite. Now that I have some idea how to change its behavior of parser for docbook without patching the source, test was done via "-o".

I tested on 4 simple XML book and article files based on examples in https://tdg.docbook.org/tdg/4.5/ with minor edits.

I think applying OPTION1 gives best result. po4a-normalize doesn't work, so there is some space for improvement. But this should give good reference test for book and article.

OPTIONS1 :=
-o nodefault=" "
-o break=" "
-o untranslated=" "
-o translated=" "

So default needs to be updated. I will propose patch later.

Here is the test run on 0.54
po4a-docbook-test.tar.gz

@osamuaoki
Copy link
Contributor

Apparently, pasting XML look-like doesn't work. See it in tarball. Sorry.

@mquinson
Copy link
Owner Author

mquinson commented Jul 1, 2020

Hello again @osamuaoki

I have the feeling that this issue can be closed, since #153 was merged to master. Could you please confirm?

Thanks in advance,

@osamuaoki
Copy link
Contributor

Yah... this is non-issue now, I think

@mquinson
Copy link
Owner Author

mquinson commented Jul 1, 2020

Thanks for confirming. Thus closing.

@mquinson mquinson closed this as completed Jul 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants