Punctuation wrongly affects character count for hyphenation #109

Smylers opened this issue Jul 10, 2013 · 2 comments




@Smylers Smylers commented Jul 10, 2013

Normally short words aren't ever split across lines and hyphenated, but if they have adjacent punctuation then WeasyPrint (version 0.19.2) wrongly treats them as though they were a longer word.

We've found “(LST)” being split as “(L-” at the end of one line and “ST)” at the start of the next. Evidence that the parens are being counted as word characters: setting this property avoids “(LST)” being split:

-weasy-hyphenate-limit-chars: 6 3;

(That workaround could still hyphenate a 4-letter word with 2 adjacent punctuation marks, though. In the general case it also requires setting hyphenate-limit-chars higher than you'd want, which disallows hyphenating some words without punctuation that you would want to allow.)

CSS says to strip punctuation characters between words for counting their characters:

Pyphen say that punctuation-stripping should be done outside of Pyphen:
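To make the expected behaviour concrete, here is a minimal Python sketch of stripping punctuation before applying the character limits. It is not WeasyPrint's or Pyphen's actual code: the function names `strip_punctuation` and `may_hyphenate` are hypothetical, and the limits of 5, 2 and 2 are assumed for illustration.

```python
import unicodedata


def strip_punctuation(token):
    """Split a token into (prefix, core, suffix).

    prefix and suffix contain only leading/trailing punctuation
    (Unicode general categories starting with "P").
    """
    start, end = 0, len(token)
    while start < end and unicodedata.category(token[start]).startswith("P"):
        start += 1
    while end > start and unicodedata.category(token[end - 1]).startswith("P"):
        end -= 1
    return token[:start], token[start:end], token[end:]


def may_hyphenate(token, total=5, left=2, right=2):
    """Hypothetical check: is the word long enough to be hyphenated at all?

    Only the core of the token is counted, so adjacent punctuation
    cannot push a short word over the limit. The default limits are
    an assumption made for this sketch.
    """
    _, core, _ = strip_punctuation(token)
    return len(core) >= max(total, left + right)


print(strip_punctuation("(LST)"))
print(may_hyphenate("(LST)"), may_hyphenate("(hyphenation)"))
```

With these assumed limits, “(LST)” has 5 characters in total but only 3 once the parens are stripped, so it would no longer qualify for hyphenation, while longer words in parens still would. Any break positions found in the core would need to be shifted by the length of the prefix before being applied to the original text.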

Let me know if you'd like a sample document showing this happening.


@liZe liZe commented Jul 10, 2013

Yes, that's a problem. We must rewrite the way WP handles text; that's on our TODO list, and there are some annoying bugs related to that limitation (#74, #100, #106).

Of course, it may also be possible to write a quick fix for this bug, but rewriting the whole module will be necessary for the other bugs.


@liZe liZe commented Jan 2, 2019

Now that #74 and #100 are fixed, it was easier to handle this case.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Feb 21, 2019
Version 45

Released on 2019-02-20.

WeasyPrint now has a code of conduct.

A new website has been launched, with beautiful and useful graphs about speed
and memory use across versions: check WeasyPerf.


* Python 3.5+ is now needed, Python 3.4 is not supported anymore

Bug fixes:

* `798 <>`_:
  Prevent endless loop and index out of range in pagination
* `767 <>`_:
  Add a ``--quiet`` CLI parameter
* `784 <>`_:
  Fix library loading on Alpine
* `791 <>`_:
  Use path2url in tests for Windows
* `789 <>`_:
  Add LICENSE file to distributed sources
* `788 <>`_:
  Fix pending references
* `780 <>`_:
  Don't draw patterns for empty page backgrounds
* `774 <>`_:
  Don't crash when links include quotes
* `637 <>`_:
  Fix a problem with justified text
* `763 <>`_:
  Launch tests with Python 3.7
* `704 <>`_:
  Fix a corner case with tables
* `804 <>`_:
  Don't logger handlers defined before importing WeasyPrint
* `109 <>`_,
  `748 <>`_:
  Don't include punctuation for hyphenation
* `770 <>`_:
  Don't crash when people use uppercase words from old-fashioned Microsoft
  fonts in tables, especially when there's a 5th column
* Use a `separate logger
  <>`_ to
  report the rendering process
* Add a ``--debug`` CLI parameter and set debug level for unknown prefixed CSS
* Define minimal versions of Python and setuptools in setup.cfg


* `796 <>`_:
  Fix a small typo in the tutorial
* `792 <>`_:
  Document no alignment character support
* `773 <>`_:
  Fix phrasing in Hacking section
* `402 <>`_:
  Add a paragraph about fontconfig error
* `764 <>`_:
  Fix list of dependencies for Alpine
* Fix API documentation of HTML and CSS classes
@liZe liZe added the bug label Dec 17, 2019