Text correction breaks HTML-colors #224

Closed
Nadyita opened this issue May 29, 2024 · 1 comment

Nadyita commented May 29, 2024

Recently, a lot of subtitles, especially for reality shows, come with different colors for the speakers, because they speak over each other. When I run text corrections on these, Gaupol tries to replace every </font> <font color="XXX"> with just a space, even if the <font color="XXX"> actually switches to a new color. Example:

<font color="#00ff00">Yes, exactly.</font> <font color="#00ffff">Oh, wow!</font> <font color="#ffff00">Nice.</font>

would become

<font color="#00ff00">Yes, exactly. Oh, wow! Nice.</font>

It would be nice to have a switch to turn this off, because it makes it a lot harder for a deaf person to understand who said what.
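
The behavior described above can be reproduced with a small Python sketch. The pattern and the name `naive_merge` are assumptions made for illustration; this is not Gaupol's actual correction pattern, only a guess at the kind of rule that would produce this output:

```python
import re

# Hypothetical clean-up pattern (an assumption, not Gaupol's real one):
# a closing font tag, spaces, and ANY opening font tag collapse to the spaces.
NAIVE_BOUNDARY = re.compile(r'</font>( +)<font color="[^"]*">')

def naive_merge(text):
    """Merge adjacent font spans without checking that the colors match."""
    return NAIVE_BOUNDARY.sub(r"\1", text)

line = ('<font color="#00ff00">Yes, exactly.</font> '
        '<font color="#00ffff">Oh, wow!</font> '
        '<font color="#ffff00">Nice.</font>')
print(naive_merge(line))
```

Because the pattern never compares the two color values, all three speakers end up inside the first speaker's color.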

@otsaloma
Owner

Thanks for noticing! This is not one of the correction patterns that can be turned off, because it's supposed to be a harmless clean-up step (often sensible after corrections). But yes, this is wrong, and I'll fix it. I'm guessing the reason this hasn't come up before is that these cases usually have dialogue with line breaks. With line breaks between the font tags, the bug is not triggered:

<font color="#00ff00">- Yes, exactly.</font>
<font color="#00ffff">- Oh, wow!</font>
<font color="#ffff00">- Nice.</font>
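
One way to restrict the clean-up to font tags with the same value is a regex backreference. This is a minimal sketch under assumptions: the pattern and the name `merge_same_color` are hypothetical and do not show how Gaupol actually implemented the fix.

```python
import re

# Hypothetical pattern (not Gaupol's actual fix): only match when the
# reopened font tag repeats the same color value, enforced by \2.
SAME_COLOR = re.compile(
    r'(<font color="([^"]*)">[^<]*)</font>( +)<font color="\2">'
)

def merge_same_color(text):
    """Merge space-separated font spans only if their colors are identical."""
    previous = None
    # Repeat until stable, since one pass merges only non-overlapping pairs.
    while previous != text:
        previous = text
        text = SAME_COLOR.sub(r"\1\3", text)
    return text

same = '<font color="#00ff00">Yes,</font> <font color="#00ff00">exactly.</font>'
mixed = '<font color="#00ff00">Yes, exactly.</font> <font color="#00ffff">Oh, wow!</font>'
print(merge_same_color(same))
print(merge_same_color(mixed))
```

In this sketch the spans with identical colors are joined, while the spans with different colors are left untouched. Since the separator group matches spaces only, tags separated by a line break are never merged either.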

bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this issue Jun 24, 2024
https://build.opensuse.org/request/show/1182929
by user 1Antoine1 + anag+factory
- Update to 1.15:
  * Don't merge font tags with different values
    (gh#otsaloma/gaupol#224).
  * Drop dependency on chardet.
  * Add dependency on charset-normalizer.
  * Raise Python dependency to >= 3.5.