Subtitles webvtt fixed encoding strings #366
Wait before merging, I want to do some tests for the RTL problem. |
Related to xbmc/xbmc#17076 ? |
@enen92 |
You might add too
and
|
An HTML escape converter already exists in Kodi, but I don't know which module handles the subtitles coming from inputStream. I'll have to add a few log points across the code to figure out which module is used; for now it's just a theory. |
A bit of clarity, because from what I understand you're mixing things up a bit. I took the first one you mentioned, Vikings, as an example.
So this is the raw Netflix subtitle: there is not LRM, but RLM. After a bit you said:
This is not the same thing! I need you to specify some movies with real RLM subtitles from the Netflix platform, already visible from the browser. The RLM problem you mention with files from HDD is a problem of a different nature, in how Kodi handles subtitles from HDD. |
Vikings is
Try the Irishman, it has The.Irishman.WEBRip.Netflix.ar.zip
|
Only for 64-bit: http://www.mediafire.com/file/1094j8d1gab0uh8/KodiBuildW64.zip/file |
Thank you so much. My PC is a very low-end one, so I can only install Windows 10 32-bit. |
give air to the money😄 |
No, the full stop is on the right; it should be on the left. The easiest way to know whether the punctuation marks are in the correct position is to compare them with the browser, where they are correct. The full stop is just the most noticeable of the Arabic punctuation marks. It might seem a simple thing to you, but lines are hard to read when they contain many punctuation marks and most of them are in the wrong position. It should be like this. |
You could install this extension and you will get many more subtitle languages on Netflix, especially for the Netflix originals. https://chrome.google.com/webstore/detail/language-learning-with-ne/hoombieeljmmljlkjmnheibnpciblicm This extension shows the Arabic subtitles in WebVTT format in the correct position in the browser when you use it to learn languages. You could investigate how it works by looking inside the extension file "pageScript_netflix.min.js" and searching for rlm; the unpacked extension files are located at C:\Users\essam\AppData\Local\Google\Chrome\User Data\Default\Extensions In the extension interface, turn it off if you only need more subtitle languages in the browser. |
I tested your Kodi Leia 64bit on windows 7 64bit with the original inputstream adaptive 2.4.2 and it converts But this could only work if Kodi merged your PR with Leia or merged it in the master branch and Kodi 19 got fixed to run the Netflix plugin. |
How did you come to these conclusions? It works differently: with my PR in Kodi, Kodi replaces: Using a hex editor on the raw Netflix subtitle file shows that there is no other hidden control for the direction of the text, only the I've done hundreds of tests, and I'm starting to think there's a problem in the fribidi library used in Kodi. Can you provide me a working RTL Arabic and/or Hebrew .SRT file for Shtisel S1E1? |
I used to download the Netflix subtitles and adjust the direction of the punctuation marks using Subtitle Edit https://github.com/SubtitleEdit/subtitleedit/releases by removing the tags and replacing
https://mega.nz/#!h0Mg3IDT!x5koqUfhWGLrmabnSNy2CHNLB_aaMex62V2vROryhl0
Shtisel.S01E01.NF.WEBRip.VTT Shtisel.S01E01.NF.WEBRip.VTT.zip
Shtisel.S01E01.NF.WEBRip.SRT (after removing the tags and fixing the text direction) Shtisel.S01E01.NF.WEBRip.SRT.zip
In front of http://unicode.scarfboy.com/?s=%E2%80%AB
This is similar to how Subtitle Edit fixes the text direction in RTL languages: open an Arabic subtitle, choose "Select All" from the Edit menu, then choose "Fix RTL via Unicode control characters" from the Edit menu. Subtitle Edit then adds the hidden control character Right-to-Left Embedding (U+202B, UTF-8 bytes E2 80 AB) to all the lines.
You can use this Arabic subtitle downloaded from primevideo.com to test how Subtitle Edit fixes the text direction for RTL languages. The.Other.Boleyn.Girl.WEBRip.Amazon.ar-ar.zip
Sorry if my not so good English confuses you. Sometimes I use the same word for two different meanings, like saying "the punctuation marks are in the 'right' position" instead of "the punctuation marks are in the 'correct' position"; some may think that by 'right' I mean the opposite of "left" when I mean "correct".
The Arabic subtitles with The Arabic subtitles with |
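A minimal sketch of the approach described above (prefixing every line with Right-to-Left Embedding, U+202B), under the assumption that a cue is a UTF-8 string with lines separated by `\n`. The names `kRle` and `prefix_lines_with_rle` are hypothetical; this is not Subtitle Edit's or the add-on's code.

```cpp
#include <sstream>
#include <string>

// U+202B RIGHT-TO-LEFT EMBEDDING, encoded as UTF-8 (hypothetical constant name).
const std::string kRle = "\xE2\x80\xAB";

// Hypothetical helper: put RLE in front of every line of a cue.
// A trailing newline in the input is not preserved.
inline std::string prefix_lines_with_rle(const std::string& cue)
{
  std::istringstream in(cue);
  std::string line;
  std::string out;
  bool first = true;
  while (std::getline(in, line))
  {
    if (!first)
      out += '\n'; // re-insert the separator consumed by getline
    out += kRle + line;
    first = false;
  }
  return out;
}
```

Each line gets its own prefix because a renderer may lay out every line of a cue separately.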
I think I found the error, and now it seems to work. |
I cannot install it. Could you upload a version for Kodi 18.5 Leia 64-bit? |
It is for 18.5... open the zip and overwrite the file in the ISA folder. |
Shit, I don't understand why, when I compile it, it only works on my computer... |
a9af569
to
d4fe62e
Compare
Basically, I never even noticed, but the escape char |
src/parser/WebVTT.cpp
Outdated
strText.replace(0, 5, "\0xE2\0x80\0xAB");
else if (strText.find("&lrm;", 0, 5) == 0)
strText.replace(0, 5, "\0xE2\0x80\0xAA");
replace_string(strText, "&rlm;", "\xE2\x80\xAB");
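A minimal sketch of what differs between the two literal spellings in the lines above, assuming `strText` is a `std::string`. In a C++ string literal `\0` is an octal escape for a NUL byte, so `"\0xE2\0x80\0xAB"` begins with a terminator and converts to an empty `std::string`; `"\xE2\x80\xAB"` is the three-byte UTF-8 encoding of U+202B. The function names are hypothetical and only wrap the literals.

```cpp
#include <string>

// Hypothetical names, for illustration only.

// "\0" is an octal escape (NUL); the "xE2" after it are ordinary characters.
// Converting the literal through const char* stops at the first NUL.
inline std::string from_outdated_literal()
{
  return "\0xE2\0x80\0xAB";
}

// "\xE2\x80\xAB" is three bytes: the UTF-8 encoding of U+202B.
inline std::string from_fixed_literal()
{
  return "\xE2\x80\xAB";
}
```

With the outdated spelling, a call such as `strText.replace(0, 5, ...)` would therefore remove the five characters without inserting any control character.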
Wouldn't it be enough to only remove the else in line 116?
No, because the replacement position is fixed from 0 to 5; instead, the replace needs to work at any point in the string.
WebVTT does not always have &lrm; at the start of the string.
Another example is: <c.arabic>&lrm;.لست جباناً ؟من يقول إنّي جبان</c.arabic>
so we need to search inside the string.
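The difference between the two checks can be sketched as follows, assuming the cue text is a `std::string` that still contains the literal entity. Both helper names are hypothetical and are not part of the add-on.

```cpp
#include <string>

// Hypothetical helpers, for illustration only.

// Position-bound check: true only when the text starts with the entity.
inline bool starts_with_entity(const std::string& text, const std::string& entity)
{
  return text.compare(0, entity.size(), entity) == 0;
}

// Position-independent check: true when the entity occurs anywhere in the text.
inline bool contains_entity(const std::string& text, const std::string& entity)
{
  return text.find(entity) != std::string::npos;
}
```

For a cue such as `<c.arabic>&lrm;...</c.arabic>` the first check fails because the ten characters of the opening tag come before the entity, while the second one still finds it.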
OK, fine, the few CPU cycles should not hurt.
Does this finally fix your issues? I don't see any reason not to let it in then.
But: do we have multiple lrm in one line? Or even lrm and rlm in one single line?
If not, would removing the 5 lead to the same solution?
This solves the problems with all right-to-left languages.
I'm waiting for confirmation from @Essam311,
so if you can, please also review xbmc/xbmc#17085
But: do we have multiple lrm in one line? Or even lrm and rlm in one single line?
If not, would removing the 5 lead to the same solution?
I have not tried; I'll check now.
Change done. In practice it doesn't differ much from the original code, but it can also be reused; unfortunately, "replace" does not directly support find-and-replace with two string parameters.
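Since `std::string::replace` only works on a position and a length, a reusable find-and-replace has to loop over `find`. A minimal sketch of such a helper, which also covers several entities in one line; `replace_all` is a hypothetical name and not necessarily how the helper in this PR is written.

```cpp
#include <string>

// Hypothetical helper: replace every occurrence of `from` in `text` with `to`.
inline void replace_all(std::string& text, const std::string& from, const std::string& to)
{
  if (from.empty())
    return; // nothing to search for; also avoids an endless loop

  std::string::size_type pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos)
  {
    text.replace(pos, from.size(), to);
    pos += to.size(); // continue after the inserted text so it is never rescanned
  }
}
```

Calling it once per entity handles lines that contain both `&lrm;` and `&rlm;`, whatever their positions.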
Wait until tomorrow before merging, I want to do another test.
No, nothing, it's okay.
Still waiting for the files to get compiled. |
You need to use the built ISA artifacts, not the master branch. |
No artifacts for windows yet. |
Thanks for this fix, much appreciated. Will it be available for Leia too? |
@CastagnaIT Try this modified version of your add-on and test whether you get the Romanian subtitles in text format in your country; you will need to enable the subtitles in the Kodi player. |
I tried, but it does not work as expected. The problem is not in the manifest parameters; there's a problem with the requests on the MSL side: they all point to the main profile instead of being directed to the right profile. |
The modification you made has no effect on my system. The only way to show the Romanian language in the Netflix add-on is to set Romanian on the main profile and, due to the MSL request problem described above, it will then also apply to all other profiles. |
I have only one profile, maybe this is why it works for me. |
@Essam311 My email is masca6689@gmail.com if you can send the tool used for extracting NF VTT archive. Would really help me to organize my NF library when downloading a whole show for watching where I have no internet connection. Work travels are a nuisance, also night shifts. |
@masca90021 If you have add-ons that do not work with the latest nightlies of Kodi 19 for Windows 32-bit (add-ons that do not work with Python 3), install this older nightly of Kodi 19 for Windows 32-bit, then install this modified inputstream adaptive for Windows 32-bit with the latest fix for LRM (Matrix), and the add-ons should work for you (all add-ons must be installed manually from zip files). Before installing this older version of Kodi 19, you first need to delete the Kodi folder in C:\Users\username\AppData\Roaming This version of Kodi 19 is the same as Kodi 18.5, which means any add-on that needs Python 3 support in Kodi will not work on it; only add-ons that support Python 2, or both Python 2 and 3, will work. The only important difference between this older version of Kodi 19 and Kodi 18.5 is that it works with inputstream adaptive from the Matrix branch (after a minor modification to addon.xml in inputstream adaptive). |
Can't install any addon for this version. It gives an error about the python 3 dependency. The only addon that interests me on an x86 version at this time is Jellyfin or Emby. Both do not work atm. Thanks for the x86 ISA. |
Can you add all your add-ons in one zip file and upload them here? |
I think I misunderstood you: I thought you needed inputstream adaptive for Kodi 18 win32bit because your add-ons do not work with the latest Kodi 19, which only runs add-ons that support Python 3. If you need the version of inputstream adaptive for Kodi 19 win32bit, download the one named "inputstream.adaptive+windows-i686" from here. |
This is the only addon that does not work. It installs but fails to load. ISA 2.5.4 that you uploaded works great for Kodi 19 x86, the build from 6th of February. I needed an ISA for Kodi 18.5 x86 so I don't have to use jellyfin on Kodi Leia and Netflix on Kodi 19.
Basically this is what the log says. |
If any add-ons do not work with Kodi 19 32bit and Kodi 18.5 32bit you should ask the developers of the add-ons to fix them. The Netflix add-on should work without any problems on Kodi 18.5 Leia. |
The problem with Kodi 18.5 is that I do not have a working inputstream.adaptive addon for x86. It only works for Kodi 18.5 ( only x64 ) or Kodi 19 ( both x86 and x64 ) so I am forced to use Kodi 19 at work for Netflix and 18.5 for jellyfin ( they started already working on updating for Matrix but it will take some time I suppose ). Netflix isn't the issue for Kodi 18, the lack of an updated inputstream.adaptive x86 for Kodi 18 is. |
If you could find someone to compile to 32bit this file of inputstream adaptive repository of the Leia branch with the latest fixes for LRM it might work with Kodi 18.5 32bit. I just ported the fixes for LRM to the Leia branch but I'm not sure if what I did will work or not (this file can not be installed as it is). |
I shall try it and let you know. Also an updated x64 version is needed for Leia, to incorporate the new changes made to LRM. |
It could be compiled for win32 and win64, I think. |
Now we need someone who can compile this. Thanks for all your hard work, guys. As a user and not a developer, I truly appreciate the time and effort put into this. |
@masca90021 I updated this comment, read it again, and try this version of Kodi 19 only with the Netflix add-on and jellyfin, I tested the Netflix add-on but did not test jellyfin because I do not know how it works yet, the installation process of jellyfin works without any problems. |
@Essam311 Yes, the problem is when it starts. It gives an error, same error as Plex and Emby. I will give it a try in a couple of hours when I get home.
Getting this while trying to install any addon. Even Inputstream. Can't install anything from the official repo. |
Do not install anything from any official repo; install from the zip files only. |
Copy the files here to the addons folder in Kodi at C:\Users\username\AppData\Roaming\Kodi\addons and install this inputstream version manually as a zip file (do not install inputstream from within Kodi) https://github.com/peak3d/inputstream.adaptive/files/4177537/inputstream.adaptive-2.5.4.zip https://github.com/CastagnaIT/plugin.video.netflix/archive/v0.16.4.zip After a new installation of Kodi you should wait a minute until it updates itself. |
Nope, same dependency error. Tried it both ways earlier too. Doesn't want to allow it, just spits that dependency error. That's it, I give up, I will use Matrix for Netflix and Leia for Jellyfin until all addons migrate to the new Matrix, LE: Managed to get an updated Jellyfin from the developer until Matrix is out. Everything is working now ( both x86 and x64 Kodi Matrix ). |
@CastagnaIT You can check this plugin out: https://github.com/arvvoid/plugin.video.hbogoeu It's a plugin for HBO Go and can mark episodes as seen with the HBO Go platform. Maybe you get an idea for the NF plugin. |
at first glance i found nothing useful |
Sorry, I had in mind to post to the NF plugin, but I messed up the tabs. |
@masca90021 In case you need to use add-ons that support Python 2 only, here are inputstream.adaptive-2.4.3.win32.Kodi.18.5.Leia and inputstream.adaptive-2.4.3.win64.Kodi.18.5.Leia for Kodi 18.5 Leia with the latest fix for LRM. I tested them with the latest nightlies for Kodi 18.5 Leia and they worked great. inputstream.adaptive-2.4.3.win32.Kodi.18.5.Leia.zip inputstream.adaptive-2.4.3.win64.Kodi.18.5.Leia.zip The new version of inputstream.adaptive-2.4.3 was downloaded from this branch. https://github.com/peak3d/inputstream.adaptive/tree/Leia_backport |
I had to change the identification of the encoding strings. Unfortunately, the search here was set to start from the beginning of the string, and this does not work in all cases.
This is an example of a Netflix Arabic subtitle string:
<c.arabic>&lrm;.لست جباناً ؟من يقول إنّي جبان</c.arabic>
As you can see, in this case there are also tags,
so you need to search anywhere in the string, not at a fixed position.
I've also added recognition of the missing escapes often displayed in Netflix subtitles in any language.
I followed the example of another WebVTT parser that performs the replacement more or less as you implemented it:
https://github.com/caitp/webvtt
For this, wait for the approval of: xbmc/xbmc#17085
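A minimal sketch of decoding the remaining escapes, under assumptions: the WebVTT cue text syntax defines `&amp;`, `&lt;`, `&gt;`, `&nbsp;`, `&lrm;` and `&rlm;`; the table below covers only the first four and leaves the two directional entities to the separate handling discussed in this PR. The name `decode_basic_escapes` is hypothetical, and the exact set of escapes handled by this PR is not shown here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: decode a few WebVTT escapes in a UTF-8 cue string.
inline std::string decode_basic_escapes(std::string text)
{
  // "&amp;" comes last so that "&amp;lt;" becomes the literal "&lt;" and not "<".
  static const std::vector<std::pair<std::string, std::string>> table = {
      {"&lt;", "<"},
      {"&gt;", ">"},
      {"&nbsp;", "\xC2\xA0"}, // U+00A0 NO-BREAK SPACE in UTF-8
      {"&amp;", "&"},
  };

  for (const auto& entry : table)
  {
    std::string::size_type pos = 0;
    while ((pos = text.find(entry.first, pos)) != std::string::npos)
    {
      text.replace(pos, entry.first.size(), entry.second);
      pos += entry.second.size(); // skip what was just inserted
    }
  }
  return text;
}
```

The ordering matters: decoding `&amp;` first would turn an escaped `&amp;lt;` into `&lt;` and a later pass would decode it a second time.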
It would be very much appreciated if you would also apply this fix to the Leia branch,
because a lot of users complain about this inconvenience.
refs:
https://forum.kodi.tv/showthread.php?tid=329767&pid=2910574#pid2910574
CastagnaIT/plugin.video.netflix#120
Tested on Kodi 18.5 with windows build
let me know @peak3d and merry Christmas!