Why numbers getting reversed when formatted in RTL Arabic - Rails application? #700

Closed
myselfrajeshco opened this Issue Mar 28, 2014 · 19 comments

Projects

None yet

4 participants

@myselfrajeshco

I'm using Prawn gem in my Rails app to generate PDF reports.

I read the documentation for putting the text in Arabic with text_direction RTL in arabic. But, issue is that numbers are getting reversed here.

I wanted semester 1234 as الفصل الدراسي 1234,

but in my app the output is الفصل الدراسي 4321.

My two lines of code is here:
pdftable = Prawn::Document.new
pdftable.text(t('org.semester') + " " + @semester)

@semester = '1234' (The reason would be that it is being treated as a text/string, thus changes to RTL (reversed))

Hope this is an issue to be resolved in prawn.

@practicingruby
prawnpdf member

I don't think our RTL text support is content aware, so it probably just renders the string in reverse order. This is almost certainly not what people want, so what I'd like is a bunch of test-cases that reflect how we want this feature to work, ideally ones that reflect how RTL text works elsewhere.

@practicingruby
prawnpdf member

Another thing I would like is a good online resource that describes how RTL text should work. Does anyone have a reference they can share?

@hectorvs-gxg

I found this: http://unicode.org/reports/tr9/
it is the unicode bidirectional algorithm

it may help with the BiDi buisiness

Also this http://sourceforge.net/projects/rubybidi/

it's an old project, but it may work?

@practicingruby
prawnpdf member

Thanks @hectorvs-gxg! I don't know when I'll have time to look at this myself, but it does look like it will be helpful, and anyone interested in investing further is welcome do so.

We do have a policy of not relying on third-party dependencies, but looking at another library for inspiration (or to vendor a bit of code) is welcome!

@hectorvs-gxg

I played a bit with the rubybidi library, it seems to do a good job in detecting the text itself and printing out a visual representation of what the text should look like.

I tried massaging the text with the library before sending it over to prawn, but there needs to be some changes for this to work.

Basically, the library removes the need to reverse the text itself when printing in RTL, prawn does this by default when in RTL mode, so first step is to prevent this reversing.

After this, the words of the sentence still need to be printed from right to left in order for the word wrapping and alignment to be correct (this means that the last word of an RTL sentence has to be printed first, but the word itself should be printed "as is" in LTR mode).

This may be the wrong approach, but I will try to hack prawn a bit in my own local fork to see if this works. I know this is not a good enough solution, but all means (still needs to handle things like formatting tags and such) but it's a good proof of concept.

@practicingruby
prawnpdf member

@hectorvs-gx I ❤️ proof-of-concept work. Even if it's a hack that won't get merged, please open a pull request when you get something working so we can discuss.

@hectorvs-gxg

Ok, so I did some serious code spelunking today trying to get prawn to display the text generated by the rubyBidi library correctly and got mostly there.

to render in prawn I used a simple text method with no formatting, and set text_direction = :rtl in the prawn document

rubyBidi basically renders a visually correct version of the text in question with the following snippet

    @b = Bidi.new
    processed_text = @b.to_visual some_text, "R"

    text processed_text

I'm attaching a picture of the results:

screen_shot_2014-06-11_at_5_20_20_pm

All I did was comment out the lines of code that reverse the text in prawn, specifically in

https://github.com/prawnpdf/prawn/blob/1.0.0/lib/prawn/text/formatted/fragment.rb#L215-220

and

https://github.com/prawnpdf/prawn/blob/1.0.0/lib/prawn/text/formatted/wrap.rb#L90

As you can see from the picture, the text is visually correct on screen, but it's being printed in reverse order.

Could you offer some guidance @sandal? I kinda got the gist of how the line wrapping works, but I couldn't make it so the lines get read in the correct direction.

Here's the html file I used as well

<!DOCTYPE html>
<head>
    <meta charset="unicode">
</head>
<body>
  <div dir="RTL" align="right">במעבדתנו בוצעה בדיקת צ'יפ גנטי המאתרת שינויים כמותיים בכרומוזומים (חסרים או תוספות במספר עותקי הדנ"א). hello tgere בבדיקה נבחנו 716,503 אתרים על פני כל הגנום, כפי שהם מיוצגים על גבי צ'יפ מסוג SNP של חברת Illumina.</div>
</body>
</html>

so you can copy / paste the text if needed.

Thanks!

@practicingruby
prawnpdf member

@hectorvs-gxg: Thanks for working on this. I'm not super familiar with our RTL text support, so I wouldn't be able to comment further without doing some research, but hopefully someone who does know a bit about this area of Prawn will comment.

If not, this at least gives me a starting point for whenever I get around to reviewing this ticket.

@elad

Just found this. Please see #514 and elad/ruby-bidi.

I'd close this except that @hectorvs-gxg pointed out Prawn reverses the string, which shouldn't happen if there's proper bidi support. I agree because I specifically added a render_visual method to reverse the string in ruby-bidi.

@sandal, I would suggest testing the following:

  • Remove string reversal in Prawn
  • Change recently added documentation about right-to-left text to refer to to_visual
  • Remove render_visual from ruby-bidi and update README

Less code, less cycles, same results. :)

@practicingruby
prawnpdf member

@elad: My extremely limited understanding of the existing behavior is that we reverse the strings to support pure RTL text without any pre-processing. So removing reversal would break that case, or force users to install ruby-bidi.

Please correct me if I'm wrong, this is an area of Prawn that was worked on while I wasn't involved in the project.

@elad

@sandal I understand where you're coming from, but I somewhat disagree with the assertion that simply reversing the strings supports "pure RTL text."

My reasoning is this: Reversing works as long as your text contains just characters and nothing else. A PDF is a document though, and as soon as you try using richer paragraphs - with punctuation, numbers, mixed languages (especially relevant for technical documents) - things break. At the very least this is broken behavior, and in the worst case scenario this might come as an inconvenient surprise while working in production.

I'd argue that when RTL support is mentioned, the necessity of using a real bidi implementation be underscored so that people who really need RTL text (rather than right-aligned reversed strings ;) know how to make it work from the very beginning.

That said, you're absolutely right in that someone might be relying on that behavior, which is where your judgment comes into the picture. If you believe that's the case, then I agree it would be surprising to "unbreak" it, especially without a major version bump. Personally, as long as there's documentation indicating how to get a solution that works, I'm happy. It just might be the case that if people start relying on render_visual then we're only digging a deeper hole. :)

@practicingruby
prawnpdf member

@elad: That's all fair. What I meant by "pure RTL text" is that we don't consider any edge cases at all, which probably does mean we've got a broken implementation.

So I think this is all the more reason for us to figure out how to extract RTL support into a prawn-bidi extension that marries Prawn and RubyBIDI. We'd replace the existing functionality in Prawn with low level extension points, and then tell those who need support to use the gem.

If I lay the groundwork for that gem, can you help maintain it? (In particular, be able to answer questions about how it should work, because I don't have any experience here).

@practicingruby
prawnpdf member

@elad: I'm not worried about API compatibility right now, we're going to probably have three major releases in the next 12 months, so these changes change happen in any one of those.

@elad

@sandal - sure, I'd be happy to help maintain it. Just keep in mind I'm not fluent in Ruby. :)

@hectorvs-gxg
@elad

@hectorvs-gxg, if it's not too late, I suggest taking another look at Prawn in combination with ruby-bidi. It very neatly solves exactly the problems you pointed out.

@sandal, are you okay with closing this issue, potentially in favor of opening another one about creating a prawn-bidi gem with proper integration?

@practicingruby
prawnpdf member

@elad, yeah we can close this. For now the solution is "use ruby-bidi", and hopefully in the future a better solution will be "use prawn-bidi".

I'm not going to re-open a ticket for the latter just yet, because realistically I don't know when I'm going to find time to work on it. (especially because we have a workable alternative for the moment). But feel free to remind me if a long time passes and you haven't seen progress on this.

@elad

@sandal shouldn't the manual on prawnpdf.org be regenerated to reflect the resolution?

@practicingruby
prawnpdf member

@elad Yes, though the manual tracks released versions of Prawn, and your change to the manual is currently in master. It'll get included when the next release comes out, which will hopefully happen in the next few weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment