Fix SiCKRAGETV/sickrage-issues/issues/3347 #2905

Merged
merged 1 commit into from Oct 18, 2015

Conversation

ncksol
Contributor

@ncksol ncksol commented Oct 17, 2015

HD-Torrents has some invalid html on the page with search results. Using
the default html parser wasn't returning the correct data. Substituted
it with html5 parser to fix the problem.

P.S. I've also created another PR with a different fix for that issue. Not sure which one you would prefer.

@fernandog
Contributor

@ncksol
this does not work for me:

That's why I added that log line about "Could not find table of torrents mainblockcontenttt".

2015-10-17 17:54:46 ERROR    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Could not find table of torrents mainblockcontenttt
2015-10-17 17:54:42 DEBUG    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Search string: Modern Family S07E04
2015-10-17 17:54:42 DEBUG    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Search URL: https://hd-torrents.org/torrents.php?search=Modern+Family+S07E04&active=1&options=0&category[]=59&category[]=60&category[]=30&category[]=38
2015-10-17 17:54:42 DEBUG    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Search Mode: Episode
2015-10-17 17:54:42 INFO     SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Performing episode search for Modern Family
AANone
2015-10-17 17:54:42 ERROR    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Could not find table of torrents mainblockcontenttt
2015-10-17 17:54:32 DEBUG    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Search URL: https://hd-torrents.org/torrents.php?search=&active=1&options=0&category[]=59&category[]=60&category[]=30&category[]=38

@ncksol
Contributor Author

ncksol commented Oct 17, 2015

Turns out simply upgrading soup to v4 fixed the problem. No need for the html5 parser. I've just tested this PR on my server and it found the episodes that were missing.

@fernandog
Contributor

@ncksol

It still does not work for me:

2015-10-17 18:52:44 INFO     SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Unable to find a download for: [Modern Family - 7x04 - She Crazy]
2015-10-17 18:52:44 ERROR    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Could not find table of torrents mainblockcontenttt
2015-10-17 18:52:40 INFO     SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Performing episode search for Modern Family
AANone
2015-10-17 18:52:40 ERROR    SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: Could not find table of torrents mainblockcontenttt

@fernandog
Contributor

@ncksol

here is my html:

                data = self.getURL(searchURL)
                # Dump the raw response to disk so the served HTML can be inspected
                with open('hdt.html', 'w') as g:
                    g.write(str(data))

https://gist.github.com/fernandog/e11a882d060ebcbc237e

@ncksol
Contributor Author

ncksol commented Oct 17, 2015

Yeah, I can recreate the issue from your html. Somehow it's different from the one I am getting served. I am completely baffled atm as to why the parser cannot find the data. This applies to both of my approaches: neither of them works now.
I will need to try this again with a fresh head tomorrow.

@fernandog
Contributor

NP! This issue has been bugging me for quite a while!
Thanks!

@miigotu
Contributor

miigotu commented Oct 18, 2015

@ncksol Yeah, I had the same problem. It works fine for me but not for fernandog and duramato -.-

I sort of know what is wrong: it's the broken html in each item for the date, i.e. <bWed, 22, Whatever>

@ncksol
Contributor Author

ncksol commented Oct 18, 2015

Ok. So I've started to play with the html that fernandog gets served and found out that if I take just the portion that contains the search result data, the parser is fine. I decided to cut out everything before the table and it seems to work now, both on the html fernandog gave and on the one I am getting served.
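
A minimal sketch of that trimming idea, not the code from this PR. It assumes the results table can be recognised by the `mainblockcontenttt` name that appears in the error log above; the helper name `trim_to_results_table` and the sample page are hypothetical. The match is case insensitive, and the function returns `None` when the table is missing so the caller can log the error.

```python
import re

# Assumption: the results table carries this name somewhere in its <table> tag,
# as hinted by the "Could not find table of torrents mainblockcontenttt" log line.
TABLE_MARKER = "mainblockcontenttt"


def trim_to_results_table(html, marker=TABLE_MARKER):
    """Return html starting at the <table ...> tag that mentions marker, else None."""
    pattern = re.compile(r"<table[^>]*" + re.escape(marker), re.IGNORECASE)
    match = pattern.search(html)
    if match is None:
        return None
    # Drop everything before the table so broken markup earlier in the page
    # never reaches the HTML parser.
    return html[match.start():]


# Hypothetical page with invalid markup in front of the results table.
page = '<div><b broken><TABLE class="mainblockcontenttt"><tr><td>row</td></tr></table>'
trimmed = trim_to_results_table(page)
```

The trimmed string can then be handed to whichever parser the provider already uses.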

@fernandog
Contributor

@ncksol Awesome! It's working!!

[HDTorrents] :: Discarding torrent because it doesn't meet the minimum seeders or leechers: Modern.Family.S07E04.720p.HDTV.x264-FLEET (S:0 L:0)
[HDTorrents] :: Discarding torrent because it doesn't meet the minimum seeders or leechers: Modern.Family.S07E04.INTERNAL.720p.HDTV.x264-BATV (S:0 L:0)

But the seeds are wrong. Do you know how to fix that?
[screenshot]

@fernandog
Contributor

Size is working: [HDTorrents] :: Size found is: 699148533.76 but it's not detecting the KB, MB, etc.

                                # Need size for failed downloads handling
                                if re.match(r'[0-9]+,?\.?[0-9]* [KkMmGg]+[Bb]+', cells[7].text):
                                    size = self._convertSize(cells[7].text)
                                    if not size:
                                        size = -1

@fernandog
Contributor

@ncksol can you check what's wrong with this?

https://gist.github.com/fernandog/9f4c3c2417bbc776bd69

No results in SR, but 5 minutes ago I got results.

@fernandog
Contributor

I found out that there are no cells in the page:

                        if not cells:
                            logger.log(u"No cells in page", logger.ERROR)
                            continue

2015-10-18 10:15:38 ERROR SEARCHQUEUE-MANUAL-95011 :: [HDTorrents] :: No cells in page

@ncksol
Contributor Author

ncksol commented Oct 18, 2015

@fernandog I've got the fix for seeders.
I believe the size was actually reported like that originally. I've not changed any of that code. It takes the data from the size cell, determines whether it's KB/MB/GB/TB, and converts it all to bytes.
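
A minimal sketch of that kind of conversion, not the provider's actual `_convertSize`. It assumes binary units (1 KB = 1024 bytes) and returns -1 when no size is recognised, mirroring the `size = -1` fallback quoted above; the names `convert_size`, `UNITS` and `SIZE_RE` are hypothetical.

```python
import re

# Assumption: binary multipliers, i.e. 1 KB = 1024 bytes.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}

# A number (optionally with thousands separators and decimals), then a unit.
SIZE_RE = re.compile(r"([0-9][0-9,]*\.?[0-9]*)\s*([KMGT])B", re.IGNORECASE)


def convert_size(text):
    """Convert a size cell such as '666.76 MB' to bytes; return -1 if unparsable."""
    match = SIZE_RE.search(text)
    if match is None:
        return -1
    number = float(match.group(1).replace(",", ""))
    return number * UNITS[match.group(2).upper()]


size_in_bytes = convert_size("666.76 MB")
```

Under the binary-unit assumption, a cell reading 666.76 MB comes out at about 699148533.76 bytes, the same magnitude as the value in the log above, which is why the logged size carries no unit.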

@ncksol
Contributor Author

ncksol commented Oct 18, 2015

As for the cells not being found, I can't recreate this even with your html. If I feed it to my code it just comes back with valid results:
Found result: Modern.Family.S07E04.INTERNAL.720p.HDTV.x264-BATV
Found result: Modern.Family.S07E04.720p.HDTV.x264-FLEET

@fernandog
Contributor

Ah! that's right. Sorry!

@miigotu any comments on the PR?

@fernandog
Contributor

@ncksol can you squash into one commit please?

@ncksol
Contributor Author

ncksol commented Oct 18, 2015

@fernandog unfortunately I have no idea how to do that =D
literally never used git before.

@fernandog
Contributor

Are you using only the git web interface?

@ncksol
Contributor Author

ncksol commented Oct 18, 2015

No, I have GitHub for Windows, but its functionality is pretty basic. It only lets you push and pull stuff.

@duramato
Contributor

@ncksol It should have installed git along with that.
Try opening a terminal in the project folder and run

git remote -v

It should show two origin lines and two upstream lines (one fetch and one push each).
If it doesn't show upstream, run:

git remote add upstream https://github.com/SiCKRAGETV/SickRage.git

After that, run

git fetch upstream
git rebase -i upstream/develop

and, in the editor that opens,
"leave your first commit as "pick" and the rest set to "fixup" or "squash", do not change anyone else's commits. Save and exit"
Finally, force-push your branch:

git push origin yourbranchname -f

HD-Torrents has some invalid html on the page with search results. Using
the default html parser wasn't returning the correct data. Substituted
it with html5 parser to fix the problem.

Update soup to v4

Cutting out invalid portions of html before feeding it to parser.

Added error handling and case insensitive match

Fixed detection of seeders/leechers and improved size detection
@ncksol
Contributor Author

ncksol commented Oct 18, 2015

@duramato thanks! Looks like I've managed to do it.

@fernandog
Contributor

@miigotu any comments before merge?

@miigotu
Contributor

miigotu commented Oct 18, 2015

I must test this before merge =P

@fernandog
Contributor

I tested it and it's working fine so far.
Don't know if @duramato tested it.

@fernandog
Contributor

@miigotu So, did you test it?

fernandog added a commit that referenced this pull request Oct 18, 2015
Fix SiCKRAGETV/sickrage-issues/issues/3347
@fernandog fernandog merged commit 47f834e into SiCKRAGE:develop Oct 18, 2015
@fernandog
Contributor

@ncksol

Don't know whether HDT is passing the correct char to SR or whether it's only the log:

From website: Carnivàle S02 1080p WEB-DL DD5.1 H.264-BS

2015-10-18 20:01:26 DEBUG    SEARCHQUEUE-DAILY-SEARCH :: [HDTorrents] :: Attempting to add item to cache: Carniv%C3%A0le.S01.1080p.WEB-DL.DD5.1.H.264-BS
2015-10-18 20:01:26 DEBUG    SEARCHQUEUE-DAILY-SEARCH :: [HDTorrents] :: Unable to parse the filename Carniv%C3%A0le.S02.1080p.WEB-DL.DD5.1.H.264-BS into a valid sho
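
For what it's worth, `%C3%A0` is the percent-encoded UTF-8 form of `à`, so the name in the log looks URL-encoded rather than corrupted. A minimal sketch of decoding it, assuming Python 3's `urllib.parse.unquote` (Python 2 exposes the same function as `urllib.unquote`). Whether the provider should decode titles before caching them is a separate question that this snippet does not settle.

```python
from urllib.parse import unquote

# Name exactly as it appears in the second log line above.
logged_name = "Carniv%C3%A0le.S02.1080p.WEB-DL.DD5.1.H.264-BS"

# unquote() decodes %XX escapes as UTF-8 by default.
decoded_name = unquote(logged_name)
print(decoded_name)
```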

@ncksol ncksol deleted the branch-fix_hdtorrents branch October 23, 2015 23:00
4 participants