Support more than ISO-8859-1 (Latin-1) #85

Open
oskusalerma opened this issue Jan 4, 2012 · 53 comments

Comments

Owner

@oskusalerma oskusalerma commented Jan 4, 2012

A common request is for Trelby to support more than the ISO-8859-1 (also known as Latin-1) character set for writing scripts. We already support most Western European languages (see http://en.wikipedia.org/wiki/ISO/IEC_8859-1 for the full list), but many characters used in other languages are not found in Latin-1.

Things that would need to be implemented to support arbitrary Unicode characters in scripts, ranked in decreasing order of difficulty:

  • Unicode characters in PDF. Hard problem; PDF spec is oriented towards Latin-1 and basically says that to use anything else, you must embed your own font, subset the characters you use, and do complicated lookup tables to encode the text. That's how it was 5 years ago anyway when I last looked at it; maybe the spec has improved since then? See http://stackoverflow.com/questions/128162/unicode-in-pdf and http://blogs.adobe.com/insidepdf/2008/07/text_content_in_pdf_files.html for some discussions about this.
  • Unicode font metrics for PDF output (fontinfo.py). Standard PDF fonts only have metrics available for Latin-1 characters; how do we generate PDF without knowing the sizes for characters we are using?
  • Text input. Right now, OnKeyChar gets a key event that contains an 8-bit character code. To support input of Unicode characters, we need to do more. Maybe use the wxKeyEvent::GetUnicodeKey function? See http://docs.wxwidgets.org/2.8/wx_wxkeyevent.html. Note that Latin-1 characters are easy to input; they're a single keypress on most keyboards. Fancier Unicode characters sometimes have very sophisticated input systems where people type in something, a popup shows up, they select the character they want, etc. I have no idea how those input systems integrate with wxWidgets, or if they do at all.
  • All the text processing in Trelby would need to be redone to do it in Unicode instead. util.py has a whole bunch of string handling functions, for example.
  • Probably more that I'm forgetting right now.
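
To make the Latin-1 limit concrete, here is a minimal sketch that reports which characters of a string cannot be encoded as ISO-8859-1. It is written in Python 3 for illustration only (Trelby itself ran on Python 2 at the time), and the helper name `non_latin1_chars` is hypothetical, not part of Trelby.

```python
def non_latin1_chars(text):
    """Return the distinct characters in `text` that ISO-8859-1 cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# French fits in Latin-1; Czech only partly; Russian not at all.
print(non_latin1_chars("déjà vu"))   # []
print(non_latin1_chars("Dvořák"))    # ['ř']
print(non_latin1_chars("Привет"))    # every Cyrillic letter is reported
```

Latin-1 covers exactly the code points U+0000 to U+00FF, so any script outside that range needs the PDF, metrics and input work listed above.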

@gnaag gnaag commented Jan 5, 2012

Would it be easier to add support for ISO-8859-2, and eventually other ISO-based charsets? If so, that would be sufficient.

@oskusalerma oskusalerma commented Jan 5, 2012

No, the PDF spec makes a special case for Latin-1. Supporting anything besides that is hard, and it makes little difference whether you support a small subset of Unicode or a bigger one; the effort is about the same.

@ghost ghost commented May 25, 2012

Hi,
First of all, I must say that Trelby is GREAT! Unfortunately, I write all my screenplays in Czech, and I would like to use Trelby for my professional work. I am not a programmer, but I would like to help with beta testing if you plan to add Unicode support (for Czech and Slovak). I really need Trelby because it is easy to use, and I do not like heavyweight programs like Celtx.

Contributor

@anilgulecha anilgulecha commented May 26, 2012

Unfortunately no plans as of yet.

One of our limitations is the built-in PDF module. We could always forgo it in favor of newer Python PDF libraries that have become available since Trelby was first written, such as pyPDF, Python Imaging Library, reportlab, etc. The page below lists Python PDF libraries.

http://vermeulen.ca/python-pdf.html

@ghost ghost commented Jun 1, 2012

Too sad... :-( Any chance to help?

@vinidiktov vinidiktov commented Mar 16, 2013

I'd love to be able to use Trelby for writing scripts in Russian. Can support for Cyrillic charsets be added even if it is limited (no pdf export)?

@anilgulecha anilgulecha commented Mar 16, 2013

No, supporting any other character set would take an equal amount of effort.

Also, PDF generation is not really an issue; we can switch to one of
the several libraries that have appeared in the years since Trelby was
originally written. It's more a matter of the developer time and effort
needed to rewrite the core.

@oskusalerma oskusalerma commented Mar 19, 2013

Switching PDF libraries would not help at all. It is not a matter of code, it is a matter of fonts. The only thing the PDF spec standardizes is ISO-8859-1 character mappings in fonts; to do anything else, you need to know the font you're using, where the characters in it are, etc. It is impossible to write generic Unicode PDF output without having extensive knowledge of the font you're using. And we don't control the fonts a user is using, so...

@oskusalerma oskusalerma commented Mar 19, 2013

The PDF generation IS the issue, in other words. There's no need to rewrite anything in the core; all internal processing of text could be trivially switched to be unicode instead of latin-1; in fact, it would make much of the code simpler.

@HuBandiT HuBandiT commented Oct 7, 2015

> all internal processing of text could be trivially switched to be unicode instead of latin-1; in fact, it would make much of the code simpler.

Where would I start?

@HuBandiT HuBandiT commented Oct 8, 2015

I now have Unicode input/editing, saving, loading, exporting to HTML in place; also backwards compatible PDF export (non-Latin-1 characters get replaced by their XML escaped sequences). Most other import/export functionality should also be mostly working, but untested, so could be buggy.
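
The fallback described here, where non-Latin-1 characters become XML character references, matches what Python's standard `xmlcharrefreplace` error handler does. The sketch below (Python 3, with a hypothetical helper name) only illustrates that behaviour; whether the branch uses this handler or its own code is an assumption.

```python
def to_latin1_with_escapes(text):
    """Encode text as Latin-1 bytes, replacing anything else with &#NNN;."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# "ř" (U+0159, decimal 345) is outside Latin-1; "á" (U+00E1) is inside it.
print(to_latin1_with_escapes("Dvořák"))  # b'Dvo&#345;\xe1k'
```

The output stays valid Latin-1, so an unchanged PDF writer can consume it, at the cost of showing the escape text instead of the character.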

@HuBandiT HuBandiT commented Oct 9, 2015

> Switching PDF libraries would not help at all. It is not a matter of code, it is a matter of fonts. The only thing the PDF spec standardizes is ISO-8859-1 character mappings in fonts; to do anything else, you need to know the font you're using, where the characters in it are, etc. It is impossible to write generic Unicode PDF output without having extensive knowledge of the font you're using. And we don't control the fonts a user is using, so...

If you have the Unicode TTF in hand (which I assume we do, since there is parsing for TTF), you have all the information needed to output Unicode text using that TTF into PDF. Whether we want to bother reimplementing the necessary non-trivial logic in Trelby, or opt for using an existing implementation (PyFPDF looks usable: https://code.google.com/p/pyfpdf/wiki/Unicode ), is a different question. PyFPDF does not look hard to integrate. Bundle a few monospaced/typewriter TTF fonts (e.g. the ones PyFPDF offers in their package, or https://en.wikipedia.org/wiki/GNU_FreeFont ) into the Trelby installer and we've covered at least 99% of the users, since people very rarely want to print screenplays in anything other than a monospaced typewriter font.

@gnaag gnaag commented Oct 9, 2015

> I now have Unicode input/editing, saving, loading, exporting to HTML in place; also backwards compatible PDF export (non-Latin-1 characters get replaced by their XML escaped sequences). Most other import/export functionality should also be mostly working, but untested, so could be buggy.

Great. What is the next step? Do you need us to test it?

@HuBandiT HuBandiT commented Oct 9, 2015

Testing would be nice, yes, since I basically just played whack-a-mole: wherever a bug occurred, I squashed it quickly, without a real understanding of the complete system. Similarly some direction from someone who actually knows how the thing works would be nice.

Next visible steps will be changing the PDF output code to actually support Unicode characters.

@gnaag gnaag commented Oct 9, 2015

Ok, as soon as you commit, let us know. Thanks for your effort.

@anilgulecha anilgulecha commented Oct 9, 2015

Hi HuBandit,

Which is your branch on github?

Thanks

@anilgulecha anilgulecha commented Oct 9, 2015

There are tests BTW which test a lot of the code, so you can just launch
the test and see what breaks there.

@HuBandiT HuBandiT commented Oct 9, 2015

@anilgulecha I tried to launch the tests, but on my Windows box in a git bash shell the system() calls of do_tests.py fail, and I would rather work on PDF export than making the tests run in this environment.
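
The thread does not show do_tests.py or say why its system() calls fail under git bash, so the following is only a sketch of a shell-independent way to launch a child Python process. It uses Python 3's `subprocess.run` with `sys.executable`; the helper name `run_python_snippet` is hypothetical and not taken from Trelby.

```python
import subprocess
import sys


def run_python_snippet(code):
    """Run `code` in a child interpreter without going through a shell."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same interpreter as the parent
        capture_output=True,           # collect stdout/stderr instead of printing
        text=True,                     # decode output to str
    )
    return result.returncode, result.stdout.strip()


status, output = run_python_snippet("print('ok')")
print(status, output)
```

Passing an argument list avoids depending on which shell, quoting rules or PATH lookup the platform provides, which is a common source of differences between Linux and Windows.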

@oskusalerma oskusalerma commented Oct 9, 2015

We are never going to accept patches which do not pass the testsuite, or
big changes like this that have not been tested on all supported platforms.
Sorry, but those are the facts. We don't have the time to do things for
other people that they didn't do themselves, for whatever reason.

@HuBandiT HuBandiT commented Oct 9, 2015

That's fine; it works for me as it is, I'll probably also work on making the PDF export Unicode-aware just enough so that it works for me. Then whoever wants to take my improvements and polish them up for an official release is free to do so.

@HuBandiT HuBandiT commented Oct 9, 2015

@gnaag you are welcome. I just stomped all over the code to come up with a proof of concept to get things going again. Happy to cooperate further if others have good input.

@oskusalerma oskusalerma commented Oct 9, 2015

Ah, the old "I'll do the fun bits and let others do the boring bits"
strategy. Good luck with that.

@HuBandiT HuBandiT commented Oct 9, 2015

@oskusalerma I am getting mixed signals here. anilgulecha seems to be interested in having Unicode PDF support, then you trump his comment by saying you are not interested in that because it cannot be done properly. So nothing was done (not even an attempt at it). I am interested in having Unicode PDF support as well, and have actually put in work towards making that happen. Now you are telling me off too.

Why don't you come and join the party instead? Your insight would be very valuable in saving others lots of time rediscovering the hows and whys of the system that you already know, since you wrote it.

We might not get it right and perfect the very first time, but there are (or at least, were, in 2012) users willing to test even an imperfect version, and together we can get there and make it happen.

@HuBandiT HuBandiT commented Oct 9, 2015

@oskusalerma Another question: what is our stance on moving to python 3? If we move to Unicode, it might make more sense to also move to python 3 where the str type is changed to be (Unicode) text; versus having to change every str to unicode in the source as would be required for correctness for python 2, only to have to change it all back to str again when eventually moving to python 3 down the road?
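
For readers unfamiliar with the distinction being discussed: in Python 2, `str` holds bytes and `unicode` holds text, while in Python 3 `str` is the text type and `bytes` is separate. A small Python 3 illustration (not Trelby code):

```python
text = "Dvořák"               # str: a sequence of Unicode code points
data = text.encode("utf-8")   # bytes: what is written to a file

print(len(text))                     # 6 characters
print(len(data))                     # 8 bytes, since "ř" and "á" take two each
print(data.decode("utf-8") == text)  # True
```

Under Python 2 the same separation requires writing `u"..."` literals and `unicode` explicitly, which is the extra conversion work the question refers to.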

@oskusalerma oskusalerma commented Oct 9, 2015

If reality collides with your worldview, it's your worldview that needs
adjusting, not reality.

You are proposing to merge in a change that you can't be bothered to even
run the tests for, let alone make sure it works on both Linux and windows.
And you think I'm in the wrong here for saying that's not going to happen?

Your thinking that random users "testing" stuff is somehow valuable, and
that that's the scarce resource missing, not developer time, is completely
wrong. Any developer capable of implementing Unicode PDF support is by
definition capable of testing it.

@HuBandiT HuBandiT commented Oct 9, 2015

Oh, I see where the misunderstanding comes from: I did not propose to merge in these changes. There was no pull request sent, was there?

I merely indicated that there is an effort going on to add Unicode support to Trelby, in the hope that there are other interested parties who would want to join and work together on it.

@HuBandiT HuBandiT commented Oct 9, 2015

@oskusalerma another question: what is our stance on using the printing subsystem of wx? Would it make sense to switch to it instead of our native PDF? Add it as another option next to our native PDF? Or not even try it, for some reason?

@oskusalerma oskusalerma commented Oct 9, 2015

What does printing have to do with PDF creation?

Anyway, wx code is too buggy and platform-dependent to ever use it for
anything not absolutely critical.

@oskusalerma oskusalerma commented Oct 9, 2015

No mainstream Linux distro ships python3 as default. If/when they do, and
all our deps work on python3, we will switch to python3, and not a moment
sooner.

@gnaag gnaag commented Oct 9, 2015

@oskusalerma I appreciate your work on Trelby, but I could never understand your attitude to Unicode support. More than 70% of the world simply cannot use your great app because it is useless for them (for me as well). Please stop acting like someone is stealing your child when they are actually trying to spread its impact. HuBandiT is trying to help. Accept his effort and please try to help us as well. The worst (for you) that can happen is that HuBandiT would fork the code.

I have followed this topic since 2012 and have used other tools instead, even though I know that Trelby would be the most suitable for me. Are you willing to reject all the non-Latin-1 users just because you don't like the idea of someone else working on your code?

I am sorry that this comment is not related to coding. However, I am not a coder, so I cannot help in any other way than by testing the branches.

@oskusalerma oskusalerma commented Oct 9, 2015

I am not rejecting anyone's work. If somebody submits a fully functioning,
fully tested patch to Trelby that fulfills all our requirements and
guidelines, introduces no regressions, and adds no new maintenance
burdens, I will be happy to merge it.

So far nobody has done so, and blaming me for other people's failures to do
so, or going further and blaming me because I personally have not invested
the considerable effort to implement such support, is completely uncalled
for.

@gnaag gnaag commented Oct 9, 2015

@HuBandiT I get this when trying to run your version. (I cannot file an issue on your fork; I am not familiar with GitHub. Is that normal?)

/opt/trelby/bin/trelby
15:28:15: Debug: Adding duplicate image handler for 'PNG file'
Traceback (most recent call last):
  File "/opt/trelby/bin/trelby", line 9, in <module>
    trelby.main()
  File "/opt/trelby//src/trelby.py", line 2688, in main
    myApp = MyApp(0)
  File "/usr/lib/python2.7/dist-packages/wx-3.0-gtk2/wx/_core.py", line 8628, in __init__
    self._BootstrapApp()
  File "/usr/lib/python2.7/dist-packages/wx-3.0-gtk2/wx/_core.py", line 8196, in _BootstrapApp
    return _core_.PyApp__BootstrapApp(*args, **kwargs)
  File "/opt/trelby//src/trelby.py", line 2661, in OnInit
    mainFrame = MyFrame(None, -1, "Trelby")
  File "/opt/trelby//src/trelby.py", line 1870, in __init__
    self.statusCtrl = misc.MyStatus(self, -1, getCfgGui)
  File "/opt/trelby//src/misc.py", line 175, in __init__
    TAB_BAR_HEIGHT // 2 + 6, wx.FONTFAMILY_DEFAULT, wx.NORMAL, wx.NORMAL)
  File "/opt/trelby//src/util.py", line 390, in createPixelFont
    h = getFontHeight(fn)
  File "/opt/trelby//src/util.py", line 367, in getFontHeight
    return permDc.GetTextExtent("_\xC5")[1]
  File "/usr/lib/python2.7/dist-packages/wx-3.0-gtk2/wx/_gdi.py", line 4127, in GetTextExtent
    return _gdi_.DC_GetTextExtent(*args, **kwargs)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc5 in position 1: unexpected end of data
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC

@HuBandiT HuBandiT commented Oct 9, 2015

@oskusalerma thank you, clear on staying with python 2 then. I'll adjust my effort accordingly.

@HuBandiT HuBandiT commented Oct 9, 2015

> What does printing have to do with PDF creation?

@oskusalerma the menu item is labeled "Print (via PDF)" and not "Export PDF", so I wanted to know whether the actual intention/requirement was indeed to print (for which using the printing subsystem of wx could potentially make sense as a substitute), or to create a PDF as a means for data interchange.

@HuBandiT HuBandiT commented Oct 9, 2015

@gnaag thank you, this is the first of many bugs that we will chase out. I just pushed an update; please pull and try again. You are using GTK and I'm on Windows, so expect to run into a dozen such bugs before Trelby even launches. If it still does not work after this fix, I'll go over the whole Trelby source and unconditionally convert each and every illegal string literal to unicode.
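
The failure in the traceback above can be reproduced without wx. The byte string "_\xC5" from util.py is valid Latin-1, but when something decodes it as UTF-8 the byte 0xC5 announces a two-byte sequence that never completes. A Python 3 sketch, with a hypothetical helper name:

```python
def try_decode(raw, encoding):
    """Return (text, None) on success or (None, reason) on a decode error."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as err:
        return None, err.reason


raw = b"_\xc5"  # the same bytes as the literal in the traceback

print(try_decode(raw, "iso-8859-1"))  # ('_Å', None)
print(try_decode(raw, "utf-8"))       # (None, 'unexpected end of data')
```

This is why byte-string literals containing characters above 0x7F need to become unicode literals once the toolkit expects UTF-8 or unicode input.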

@oskusalerma oskusalerma commented Oct 9, 2015

The labeling of the menu item is only for users, so they're not confused
about how to print (which we want them to do through their chosen PDF
reader).

Having written plenty of native printing code in my time, the decision not
to do that for Trelby was a deliberate choice that will not be revisited.
Trying to do native printing would result in weird problems that we
couldn't reproduce, thus couldn't fix, thus worsening the user experience.

Whereas writing PDF files using our own code is always going to be
reproducible and 100% the same on all platforms, thus something we can
actually support.

@HuBandiT HuBandiT commented Oct 9, 2015

Sure thing, all I needed to hear was that it was a deliberate choice.

@gnaag gnaag commented Oct 9, 2015

Ok, now it starts well. However there are two issues.

First: Even though it works, in terminal it runs into the error below whatever I do. It does not influence the behaviour of the program (I wasn't doing any real writing yet).

Traceback (most recent call last):
  File "/opt/trelby//src/misc.py", line 504, in OnPaint
    dc.DrawText("�", xpos + tabW - self.paddingX * 2, self.textY)
  File "/usr/lib/python2.7/dist-packages/wx-3.0-gtk2/wx/_gdi.py", line 3734, in DrawText
    return _gdi_.DC_DrawText(*args, **kwargs)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd7 in position 0: unexpected end of data

Second: I am not quite sure how far you got with PDF printing. Typing non-Latin-1 characters works fine, but when I print them, only the Unicode codes of the characters are printed (see screenshot). Saving and loading of files works as expected.

[Screenshot: trelby-export-pdf]

@gnaag gnaag commented Oct 9, 2015

Another error is on exiting the program:

** (trelby:28886): CRITICAL **: os_bar_hide: assertion 'OS_IS_BAR (bar)' failed
(trelby:28886): Gtk-CRITICAL **: IA__gtk_widget_hide: assertion 'GTK_IS_WIDGET (widget)' failed
** (trelby:28886): CRITICAL **: os_bar_set_parent: assertion 'OS_IS_BAR (bar)' failed
Segmentation fault.

@HuBandiT HuBandiT commented Oct 9, 2015

@gnaag I pushed an update for this, please pull again. \xd7 is the Unicode Character 'MULTIPLICATION SIGN' (U+00D7) "×" and indeed misc.py:504 draws the closing "×" symbols for the document tabs, I see now. So this display was what you were probably missing (indeed missing on your screenshot).

Enhancing PDF generation with Unicode is still to be done. As I wrote before, I only quick-fixed it so it does not break on Unicode (non Latin-1) characters; it instead displays the character code in XML notation for such characters.

@gnaag gnaag commented Oct 9, 2015

Great, the error is gone and segmentation fault as well. Moreover the x sign on tabs is back. Now on start there are just these debug notices:

/opt/trelby/bin/trelby
22:38:32: Debug: Adding duplicate image handler for 'PNG file'
22:38:32: Debug: Adding duplicate image handler for 'JPEG file'

These exit warnings remain, though some of them are probably related more to GTK than to the app itself.

** (trelby:20309): CRITICAL **: os_bar_hide: assertion 'OS_IS_BAR (bar)' failed
(trelby:20309): Gtk-CRITICAL **: IA__gtk_widget_hide: assertion 'GTK_IS_WIDGET (widget)' failed
** (trelby:20309): CRITICAL **: os_bar_set_parent: assertion 'OS_IS_BAR (bar)' failed
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC

However, it starts and exits without visible complaints if you ignore the terminal.

@HuBandiT HuBandiT commented Oct 9, 2015

Sweet. You might want to test other functionality, while I focus my efforts on the PDF generation side. I think these are harmless enough to leave for later. Also, you might want to compare to the last original version to see whether these were indeed introduced by my changes, or were already present before.

@gnaag gnaag commented Oct 10, 2015

Ok, the warnings and notices while starting and exiting are the same in latest original version

14:26:58: Debug: Adding duplicate image handler for 'PNG file'
14:26:58: Debug: Adding duplicate image handler for 'JPEG file'
** (trelby:20722): CRITICAL **: os_bar_hide: assertion 'OS_IS_BAR (bar)' failed
(trelby:20722): Gtk-CRITICAL **: IA__gtk_widget_hide: assertion 'GTK_IS_WIDGET (widget)' failed
** (trelby:20722): CRITICAL **: os_bar_set_parent: assertion 'OS_IS_BAR (bar)' failed
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC
../src/gtk/dcclient.cpp(250): assert "Assert failure" failed in wxFreePoolGC(): Wrong GC

@HuBandiT HuBandiT commented Oct 11, 2015

I have largely completed Unicode PDF functionality by transitioning to ReportLab.

Make sure you load a Unicode font into your document before generating the PDF (the default is still to use the default built-in limited fonts of PDF); otherwise you will see boxes for missing characters.

@gnaag Please pull and try.

Any suggestions for (a) Unicode font(s) to include with Trelby as defaults?

@HuBandiT HuBandiT commented Oct 11, 2015

Another update: on Windows the (Unicode) system fonts Courier New, Arial and Times New Roman are now automatically used for PDF generation as well (they were already used for on-screen display), unless the document specifies its own fonts.

This makes Trelby on Windows now mostly WYSIWYG out of the box in the sense of displaying the same font and characters on-screen as in the eventual PDF.

@HuBandiT HuBandiT commented Oct 11, 2015

@oskusalerma now that Unicode characters are accepted into documents, the hardcoded character width tables in fontmetric.py for the 12 fonts are getting overindexed by Unicode characters (characters above code point 255). As a quick-fix I hacked around this by using the default character width for such characters. But this does not result in perfect string widths for title strings. The theoretically correct solution would be to eliminate the hardcoded tables and load the final TTF fonts already at this stage for the width calculation, but this means the FontMetric will get coupled to ReportLab. Alternatively, we could make calculations through wx based on the fonts used for the screen, but I dislike that because that calculates things with one font and applies the result to another, potentially causing issues down the road (in the '90s MS Word was infamous for doing this, and unable to be used as a professional layout solution for high-end pre-press work as a result - I don't want to go down that slope again).

What is your stance on this?
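
For reference, the quick fix amounts to something like the sketch below. The table contents, the names, and the 1/1000-of-font-size unit are assumptions for illustration; the real tables in Trelby hold per-character values for each font:

```python
DEFAULT_WIDTH = 600  # assumed fallback width, in 1/1000 of the font size

# Hypothetical stand-in for one hardcoded table: one width per Latin-1
# code point (0..255).
LATIN1_WIDTHS = [600] * 256


def char_width(ch, widths=LATIN1_WIDTHS, default=DEFAULT_WIDTH):
    """Look up a character width, falling back for code points > 255."""
    code = ord(ch)
    if code < len(widths):
        return widths[code]
    # Outside the table: use the default instead of raising IndexError.
    return default


def string_width(text, font_size, widths=LATIN1_WIDTHS):
    """Approximate the width of text in points at the given font size."""
    units = sum(char_width(ch, widths) for ch in text)
    return units * font_size / 1000.0
```

With a fixed-width font the fallback is harmless, but with a proportional font every character above code point 255 gets the same guessed width, which is why title strings come out slightly off.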

Owner Author

@oskusalerma oskusalerma commented Oct 12, 2015

Using screen fonts for calculating anything for the final layout is a
no-no, as you yourself correctly observe.

I don't think I want a dependency on ReportLab in Trelby. Trelby is meant
to stand the test of time; I hope it's still around in 50 years' time. I
doubt ReportLab will be around in 50 years, and if it is, it will surely go
through many, many revisions of API changes that we'd need to devote time
(that doesn't exist) to dealing with.

I'm sorry you, and other people, have grander goals for Trelby than what I
see as maintainable forever with the extremely limited resources available
(basically, my non-existent free time). You're free to fork Trelby and have
a go at maintaining it yourself forever, if that's what you want, but I can
not in good conscience accept changes that I can not see myself having the
resources to maintain forever.

If somebody was to observe that that's not how most open source programs
work, they'd be correct. Trelby is a normal open source program however;
just look at the commit statistics to see that. There are only two people
who have committed a non-trivial amount of changes, me and Anil. Last
commit by Anil was three years ago. So anyone arguing that other people
will magically jump in and help me maintain the program is not arguing
from facts; they are arguing from hope.

@HuBandiT HuBandiT commented Oct 12, 2015

Ok, I could argue with that, but let me ask instead: how about adding Unicode support natively (enhancing our own PDF code)?

Owner Author

@oskusalerma oskusalerma commented Oct 12, 2015

That would be fine, as long as it doesn't introduce any regressions. By this
I mean for people writing scripts in Latin-1 languages nothing should
change. We're not going to start using non-standard fonts on them by
default, etc.

I typoed my previous mail, I meant to say "Trelby is NOT a normal open
source program".

@HuBandiT HuBandiT commented Oct 12, 2015

(Yes, I realized a not was missing.)

Ok, so how about starting to advertise Trelby as having Unicode support, to see whether it gathers momentum? If it does, we can reevaluate whether building native Unicode support is worth it.

Owner Author

@oskusalerma oskusalerma commented Oct 12, 2015

I'm not sure exactly what you're proposing.

@gnaag gnaag commented Oct 15, 2015

@HuBandiT This time I obviously had to install python-reportlab to install and run the code. Without changing the fonts, there is a new warning when printing to PDF and, of course, black boxes instead of Unicode characters.

** (pdf:14251): WARNING **: Couldn't register with accessibility bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

After changing the font to Ubuntu Mono (I have tried some non-mono fonts as well, all of them worked fine) everything works as expected. However, the warning still appears.

[Screenshot from 2015-10-15 14-53-06]

Now I am not quite sure whether you are going to fork the code or add native support for Unicode. However, your proof-of-concept with ReportLab is working well.
