Honor language-specific rules for text-transform #1749

Open
DidierLoiseau opened this issue Oct 20, 2022 · 2 comments
Labels
feature: New feature that should be supported
good first issue: Issues that can be quite easily solved by Python developers with a good CSS background

Comments

@DidierLoiseau

text-transform: uppercase defines some language-specific rules such as i/İ for Turkic languages and άι/ΑΪ for Greek.

As the following HTML shows, WeasyPrint does not seem to respect those rules:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="tr">
<body>
	<table>
		<tr>
			<th>Turkish:</th>
			<td style="text-transform: uppercase">a b c ç d e f g ğ h ı i j k l m n o ö p r s ş t u ü v y z</td>
		</tr>
		<tr>
			<th>Expected:</th>
			<td>A B C Ç D E F G Ğ H I İ J K L M N O Ö P R S Ş T U Ü V Y Z</td>
		</tr>
		<tr>
			<th>Greek:</th>
			<td lang="el" style="text-transform: uppercase">ά ή άι</td>
		</tr>
		<tr>
			<th>Expected:</th>
			<td>Α Ή ΑΪ</td>
		</tr>
	</table>
</body>
</html>

Renders as:
image

Firefox and Chrome handle it properly:
image

I’m also attaching the PDF generated with WeasyPrint 57.0 (pdf.pdf).
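For reference, Python's built-in `str.upper()` applies Unicode's default case mappings and takes no language argument. The short check below shows what it returns for the characters in question. Whether WeasyPrint's `text-transform` handling goes through `str.upper()` is an assumption here, not something confirmed in this thread.

```python
# str.upper() is language-independent: it has no notion of lang="tr" or lang="el".
samples = {
    "tr": "ı i",      # dotless i, dotted i
    "el": "ά ή άι",   # accented vowels and an accented diphthong
}

results = {lang: text.upper() for lang, text in samples.items()}

for lang, upper in results.items():
    print(lang, upper)
# tr I I
# el Ά Ή ΆΙ
```

Both Turkish letters collapse to a plain `I`, and the Greek accents are kept instead of being dropped or turned into a diaeresis, which is the gap this issue describes.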

@liZe added the "feature" label Oct 20, 2022
@liZe changed the title from "test-transform: uppercase does not honnor language-specific rules?" to "Honor language-specific rules for test-transform" Oct 20, 2022
@liZe
Member

liZe commented Oct 20, 2022

We could rely on pyICU, but it should be possible to handle all the exceptions manually instead of depending on an external library. According to ICU’s repository, it looks like we have only 3 exceptions:

  • Turkish/Azeri
  • Greek
  • Lithuanian

The goal is to change these functions to take style['lang'] as a parameter and to handle the language-specific differences using naive Python code (there’s no need to optimize it, as it shouldn’t be called often).
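A minimal sketch of that shape for the Turkish/Azeri case could look like the following. The names `uppercase`, `lowercase`, `TURKIC_LANGS` and `_primary_subtag` are hypothetical and are not WeasyPrint's actual functions. The sketch also assumes that `style['lang']` holds a language tag such as `tr` or `tr-TR`, or `None`.

```python
TURKIC_LANGS = {"tr", "az"}  # hypothetical: languages using the dotted/dotless i pair


def _primary_subtag(lang):
    """Return the primary language subtag, e.g. 'tr' for 'tr-TR' (hypothetical helper)."""
    return (lang or "").replace("_", "-").split("-")[0].lower()


def uppercase(text, lang=None):
    """Uppercase text, mapping dotted i to İ (U+0130) for Turkish and Azeri."""
    if _primary_subtag(lang) in TURKIC_LANGS:
        # Dotless ı (U+0131) already maps to plain I in str.upper(),
        # so only the dotted i needs special handling.
        text = text.replace("i", "\u0130")
    return text.upper()


def lowercase(text, lang=None):
    """Lowercase text, mapping I to ı (U+0131) and İ to i for Turkish and Azeri."""
    if _primary_subtag(lang) in TURKIC_LANGS:
        text = (
            text.replace("I\u0307", "i")  # decomposed İ (I + combining dot above)
            .replace("\u0130", "i")       # precomposed İ
            .replace("I", "\u0131")       # plain I becomes dotless ı
        )
    return text.lower()


print(uppercase("ı i", "tr"))  # I İ
print(uppercase("ı i", "en"))  # I I
```

Doing the language-specific replacements first and then falling back to `str.upper()` or `str.lower()` keeps the special cases small, in line with the "naive Python code" approach suggested above.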

The "hard" part is understanding exactly what’s defined in ICU’s file 😁️.
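To give a feel for the Greek case, here is a simplified sketch that covers only the three behaviours from the test case in this issue: dropping the tonos (ά/Α), keeping it on a standalone ή (ή/Ή), and moving it to a diaeresis on the second vowel of a diphthong (άι/ΑΪ). It is not a port of ICU's rules. The name `uppercase_greek` and the helpers are hypothetical, polytonic accents are ignored, and "standalone" is approximated as "surrounded by whitespace".

```python
import re
import unicodedata

TONOS = "\u0301"      # combining acute accent: the tonos after NFD decomposition
DIALYTIKA = "\u0308"  # combining diaeresis
UPPER_VOWELS = "\u0391\u0395\u0397\u0399\u039f\u03a5\u03a9"  # Α Ε Η Ι Ο Υ Ω
STANDALONE_ETA = ("\u03ae", "\u0389")  # ή and Ή

# An accented vowel followed by a bare Ι or Υ (no accent or diaeresis of its own).
_ACCENTED_DIPHTHONG = re.compile(
    f"([{UPPER_VOWELS}]){TONOS}([\u0399\u03a5])(?![{TONOS}{DIALYTIKA}])"
)
# Any remaining tonos on a Greek vowel, optionally after a diaeresis.
_TONOS_ON_VOWEL = re.compile(f"([{UPPER_VOWELS}]{DIALYTIKA}?){TONOS}")


def _uppercase_greek_word(word):
    # The disjunctive eta keeps its accent when it stands alone.
    if unicodedata.normalize("NFC", word) in STANDALONE_ETA:
        return "\u0389"
    # Decompose so that accents become separate combining code points.
    upper = unicodedata.normalize("NFD", word.upper())
    # Accent on the first vowel of a diphthong becomes a diaeresis on the second.
    upper = _ACCENTED_DIPHTHONG.sub(
        lambda m: m.group(1) + m.group(2) + DIALYTIKA, upper
    )
    # Drop every other tonos on a Greek vowel.
    upper = _TONOS_ON_VOWEL.sub(lambda m: m.group(1), upper)
    return unicodedata.normalize("NFC", upper)


def uppercase_greek(text):
    """Uppercase Greek text word by word, preserving the original whitespace."""
    parts = re.split(r"(\s+)", text)
    return "".join(_uppercase_greek_word(part) for part in parts)


print(uppercase_greek("ά ή άι"))  # Α Ή ΑΪ
```

Working on the NFD form keeps the accent handling to a couple of regular expressions. A complete implementation would still need to be checked against ICU's definitions, as noted above.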

@liZe added the "good first issue" label Oct 20, 2022
@liZe
Member

liZe commented Oct 23, 2022

That’s a good issue for a first-time contributor. If anyone’s interested in this feature, we’ll be happy to help you add some code for it!

@liZe changed the title from "Honor language-specific rules for test-transform" to "Honor language-specific rules for text-transform" Mar 17, 2024