Adding scrape for bg #1163

Closed
wants to merge 5 commits into from

zinonino
Contributor

Adding scrape for Episode, Season for Bulgarian lang.

@azlm8t
Contributor

azlm8t commented Sep 25, 2018

Thanks, looks good.

It fails one test (in support/testdata/eitscrape/bg), "кулинарно предаване, 3 епизода", but I think this is unavoidable? The scrape gives episode 3. When I asked last year, some people said it should give "no episode"; I think sometimes it means episode 3 and sometimes it doesn't? I couldn't quite understand.

https://tvheadend.org/issues/4509

@zinonino
Contributor Author

Yes, you are right: "кулинарно предаване, 3 епизода" must not match, because "3 епизода" means that 3 episodes will be shown one after another in a single broadcast, not the 3rd episode. I don't know how to skip only "епизода" while still matching "епизод", and I don't understand why the "а" is caught when it is not in the file. Any idea is welcome.
When I try "([0-9]+) епизод " (with a trailing space), it catches neither; when I try "([0-9]+) епизод", it catches both "епизода" and "епизод".
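
The behaviour described above is what an unanchored regex search would produce: the pattern matches wherever it occurs as a substring, so "епизод" is also found inside "епизода". A minimal sketch in Python, assuming its `re` module behaves like tvheadend's regex engine for these simple patterns; the second sample string is made up for illustration:

```python
import re

samples = [
    "кулинарно предаване, 3 епизода",  # from the test data: 3 episodes in one broadcast
    "сериал, 3 епизод",                # hypothetical: the 3rd episode
]

def episode(pattern, text):
    """Return the first capture group of an unanchored search, or None."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

# No trailing space: "епизод" is a substring of "епизода", so both match.
no_space = [episode(r"([0-9]+) епизод", s) for s in samples]

# Trailing space: a literal space must follow, so neither "епизода"
# nor "епизод" at the end of the text matches.
with_space = [episode(r"([0-9]+) епизод ", s) for s in samples]

print(no_space)
print(with_space)
```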

@azlm8t
Contributor

azlm8t commented Sep 26, 2018

Ah, I understand now. So "3 епизода" could be for cartoons where it means 3 episodes of "Scooby Doo" are shown together in one broadcast 9.30am--11am. Thanks. I'll take another look in a couple of hours.

@azlm8t
Contributor

azlm8t commented Sep 26, 2018

Try changing the last two lines in episode_num (and then re-running an OTA scan).
(I can't tell in my browser whether it has pasted correctly, so note that the "а" inside the brackets is the Cyrillic "а".)

```
"episode_num": [
    "([0-9]+) серия",
    "еп. ([0-9]+)",
    "[, ] ([0-9]+) еп[.]",
    "[, ] еп.([0-9]+)",
    "([0-9]+) еп.[,]",
    "сериал[, ] еп.([0-9]+)",
    "епизод ([0-9]+)",
    "Епизод ([0-9]+)",
    "[, ] ([0-9]+) епизод([^а]|$)",
    "([0-9]+) епизод([^а]|$)"
],
```
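
To illustrate why the trailing `([^а]|$)` helps: it requires that "епизод" is followed either by any character other than Cyrillic "а" or by the end of the text. The sketch below assumes Python's `re` is a fair stand-in for tvheadend's regex engine; the sample strings other than the test-data one are hypothetical, and the Cyrillic letter is written as an escape so a look-alike Latin "a" cannot slip in:

```python
import re

CYR_A = "\u0430"  # CYRILLIC SMALL LETTER A, not the Latin "a"
pattern = re.compile("([0-9]+) епизод([^" + CYR_A + "]|$)")

def episode(text):
    """Return the captured episode number, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

plural = episode("кулинарно предаване, 3 епизод" + CYR_A)  # "3 епизода": rejected
at_end = episode("сериал, 3 епизод")                       # end of text: accepted
mid = episode("сериал, 3 епизод, драма")                   # followed by ",": accepted

print(plural, at_end, mid)
```

Note that the second group consumes the character after "епизод" (for example the comma), so it is part of the overall match even though only the first group holds the number.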

@zinonino
Contributor Author

Thank you for the suggestion, everything works perfectly now. I also added some new scrape patterns; the final result should look like this.

@zinonino zinonino closed this Sep 27, 2018
@zinonino zinonino reopened this Sep 27, 2018
@zinonino
Contributor Author

[Screenshots: bnt, episodes]

Now waiting for @perexg to merge to master.

@azlm8t
Contributor

azlm8t commented Sep 27, 2018

Glad it works.

But, I don't think the "2018/19" lines are correct. (Though it will look ok in the EPG).

echo "сезон 1, епизод 13, драма, романтичен, САЩ, 2014" | egrep "сезон ([0-9]+)([^2018/19]|$)"   

==> "1," (one with a comma) but should be "1" (without comma).

Do you have an example of what you are trying to match / not match with that line?

The examples I have are in tvheadend/support/testdata/eitscrape/bg.
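
A short illustration of the point about the "2018/19" lines: `[^2018/19]` is a negated character class, meaning "one character that is not 2, 0, 1, 8, / or 9", not "anything except the text 2018/19". This is a sketch using Python's `re` as a stand-in for the regex engine; the two football strings are made-up examples:

```python
import re

pattern = re.compile(r"сезон ([0-9]+)([^2018/19]|$)")

# The comma is not in the excluded set, so the second group captures it.
series = pattern.search("сезон 1, епизод 13, драма, романтичен, САЩ, 2014")
series_groups = series.groups()

# Rejected, but only because every character that could follow the
# digits happens to be one of 2, 0, 1, 8, / or 9.
football = pattern.search("футбол, сезон 2018/19")

# A later season slips through: after backtracking to "202", the next
# character "3" is not in the excluded set.
later = pattern.search("футбол, сезон 2023/24")
later_groups = later.groups() if later else None

print(series_groups, football, later_groups)
```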

@zinonino
Contributor Author

zinonino commented Sep 27, 2018

I am trying not to match it, because the pattern catches the football season 2018/19 and labels it like a series. See the picture.

[Screenshot: fut season]

@azlm8t
Contributor

azlm8t commented Sep 27, 2018

OK, try this:

   "season_num": [
        "сезон ([0-9]{1,3})[^0-9]",
        "[, ] сезон ([0-9]{1,3})[^0-9]",

This means "3 digits only" (does not match 4 digits) so will work for 2019/2020 too.
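
A minimal sketch of how the bounded quantifier behaves, again assuming Python's `re` is representative of the engine in use; the football strings are hypothetical:

```python
import re

pattern = re.compile(r"сезон ([0-9]{1,3})[^0-9]")

def season(text):
    """Return the captured season number, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

# One to three digits followed by a non-digit: accepted.
series = season("сезон 1, епизод 13, драма, романтичен, САЩ, 2014")

# Four-digit years: however the engine backtracks, the character after
# the 1-3 captured digits is still a digit, so nothing matches.
football_a = season("футбол, сезон 2018/19")
football_b = season("футбол, сезон 2019/2020")

print(series, football_a, football_b)
```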

@zinonino
Contributor Author

Now it does not catch every season: some are caught, some are missing.
[Screenshots: missing, missing2]

@azlm8t
Contributor

azlm8t commented Sep 28, 2018

One final try:

"season_num": [
        "сезон ([0-9]{1,3})([^0-9]|$)",
        "[, ] сезон ([0-9]{1,3})([^0-9]|$)",

@azlm8t suggestion "3 digits only" (does not match 4 digits) 
Thanks again
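
One plausible reason some seasons were missed with the earlier `[^0-9]` version is that it needs a character after the number, so a season at the very end of the text cannot match; the `|$` alternative in the final patterns covers that case. A sketch under the same assumption that Python's `re` stands in for the real engine, with hypothetical sample strings:

```python
import re

earlier = re.compile(r"сезон ([0-9]{1,3})[^0-9]")
final = re.compile(r"сезон ([0-9]{1,3})([^0-9]|$)")

def season(pattern, text):
    """Return the captured season number, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

at_end = "сериал, сезон 4"          # season number is the last thing in the text
football = "футбол, сезон 2018/19"  # must still be rejected

earlier_at_end = season(earlier, at_end)   # nothing follows "4", so no match
final_at_end = season(final, at_end)       # "$" matches at the end of the text
final_football = season(final, football)   # a digit follows, and it is not the end

print(earlier_at_end, final_at_end, final_football)
```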
@perexg
Contributor

perexg commented Oct 2, 2018

Merged.

@perexg perexg closed this Oct 2, 2018
@perexg
Contributor

perexg commented Oct 3, 2018

e6a0731
