[BUG] ISDB subtitles issues with encoding of special characters #999

@jakubvojacek

CCExtractor version (using the --version parameter preferably) : e9d2a89768f10e6d269dcd0b9245895f3899a72d

In raising this issue, I confirm the following (please check boxes, eg [X] - and delete unchecked ones):

  • I have read and understood the contributors guide.
  • I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • I have checked that the issue I'm posting isn't already reported.
  • I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
  • I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • I have used the latest available version of CCExtractor to verify this issue exists.

My familiarity with the project is as follows (check one, eg [X] - and delete unchecked ones):

  • I absolutely love CCExtractor, but have not contributed previously.

Necessary information

  • Is this a regression (did it work before)? [x] NO | [ ] YES - please specify the last known working version
  • What platform did you use? [ ] Windows - [x] Linux - [ ] Mac
  • What were the used arguments? ccextractor -datapid 0x116 -o test.vtt nsc.mp4

Video links (replace text below with your links)

nsc.mp4 - https://goo.gl/iiKTAQ

Additional information
Hello,

The issue is with Portuguese accented characters such as á, ã, ê. Instead of these characters, ccextractor outputs ?. This is probably an encoding issue. Am I doing something wrong, or is this a bug in ccextractor? Below are samples from the generated test.vtt file, followed by a manually corrected version for comparison.

4
00:00:14,421 --> 00:00:15,925
Ah, vamos l?!                   


5
00:00:15,926 --> 00:00:17,430
Que horas s?o agora?       

vs

4
00:00:14,421 --> 00:00:15,925
Ah, vamos lá!                   


5
00:00:15,926 --> 00:00:17,430
Que horas são agora?    
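
For reference, the same `?` substitution can be reproduced outside ccextractor. The sketch below is only an illustration of one possible cause, assuming the subtitle text is being converted to a character set that lacks the accented letters; it is not ccextractor code, and the helper name `to_ascii_lossy` is hypothetical.

```python
import unicodedata


def to_ascii_lossy(text: str) -> str:
    """Convert text to ASCII, replacing every unsupported character with '?'.

    Hypothetical helper for illustration; it mimics a lossy charset
    conversion and is not taken from ccextractor.
    """
    # Normalize to precomposed form so each accented letter is one code point.
    composed = unicodedata.normalize("NFC", text)
    return composed.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    for line in ("Ah, vamos lá!", "Que horas são agora?"):
        # The lossy conversion matches the broken output shown above.
        print(to_ascii_lossy(line))
        # A UTF-8 round trip keeps the accented characters intact.
        print(line.encode("utf-8").decode("utf-8"))
```

If the output file were written as UTF-8 (or Latin-1) end to end, the accented characters would survive, which is the behaviour shown in the manually fixed sample.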

Thank you
Jakub
