Hdtorrentsit error parsing seeders/leechers #4839

Closed
Jorman opened this issue Mar 8, 2019 · 9 comments

@Jorman
Contributor

Jorman commented Mar 8, 2019

Hi,

I'm getting a strange error with hdtorrentsit:

CardigannIndexer (hdtorrentsit): Error while parsing row '
<tr>
	<td align="center" style="padding: 0px">
		<a href="browse.php?cat=3">
			<img border="0" src="./pic//cats/HDDVDBD1080Rip.gif" alt="BDRip 1080p">
		</a>
	</td>
	<td align="left" onmousemove="dialog_tor('img_tor',Xmouse,Ymouse,'https://img.ibs.it/images/5050582922622_0_0_300_75.jpg')">
		<a href="details.php?id=14921">
			<b>S. Darko (2009)</b>
		</a>
		<a href="download.php?id=14921&amp;name=Dark.S.Blu.Ray.Rip.DTS%20ITA%20ENG..mkv.torrent">
			<img src="pic/download.gif" border="0" alt="Scaricare" title="Scaricare">
		</a>
		<br>
		<i>2019-02-25 08:58:11</i>
	</td>
	<td align="center">19.45<br>GB</td>
</tr>':

System.Exception: Error while parsing field=seeders, selector=td:nth-child(4), value=<null>: Selector "td:nth-child(4)" didn't match 
<tr>
	<td align="center" style="padding: 0px">
		<a href="browse.php?cat=3">
			<img border="0" src="./pic//cats/HDDVDBD1080Rip.gif" alt="BDRip 1080p">
		</a>
	</td>
	<td align="left" onmousemove="dialog_tor('img_tor',Xmouse,Ymouse,'https://img.ibs.it/images/5050582922622_0_0_300_75.jpg')">
		<a href="details.php?id=14921">
			<b>S. Darko (2009)</b>
		</a>
		<a href="download.php?id=14921&amp;name=Dark.S.Blu.Ray.Rip.DTS%20ITA%20ENG..mkv.torrent">
			<img src="pic/download.gif" border="0" alt="Scaricare" title="Scaricare">
		</a>
		<br>
		<i>2019-02-25 08:58:11</i>
	</td>
	<td align="center">19.45<br>GB</td>
</tr>
   at Jackett.Common.Indexers.CardigannIndexer.PerformQuery(TorznabQuery query)

The definition looks correct to me:

      seeders:
        selector: td:nth-child(4)
        filters:
          - name: re_replace
            args: ["(\\d*) \\(\\+\\d*\\)\n? \\| (\\d*) \\(\\+\\d*\\)", "$1"]
      leechers:
        selector: td:nth-child(4)
        filters:
          - name: re_replace
            args: ["(\\d*) \\(\\+\\d*\\)\n? \\| (\\d*) \\(\\+\\d*\\)", "$2"]

Do you know if there's a better way to extract the seeders and leechers?
Originally I planned to extract the text beside the link, but I never got that working.
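
One way to check the two re_replace filters outside Jackett is to run the same pattern in Python. This is only a rough sketch: the sample cell text is an assumption inferred from the regex itself, not taken from the site, and `split_peers` and `SAMPLE_CELL` are hypothetical names.

```python
import re

# Same pattern as the re_replace args above, written as a Python raw string.
# In the YAML double-quoted string, \\d becomes \d and \n is a literal newline,
# which the regex escape \n used here also matches.
PEERS_PATTERN = re.compile(r"(\d*) \(\+\d*\)\n? \| (\d*) \(\+\d*\)")


def split_peers(cell_text):
    # The definition uses $1 and $2; Python's re module spells these \1 and \2.
    seeders = PEERS_PATTERN.sub(r"\1", cell_text)
    leechers = PEERS_PATTERN.sub(r"\2", cell_text)
    return seeders, leechers


# Hypothetical cell text; the real markup on the site may differ.
SAMPLE_CELL = "12 (+0) | 3 (+1)"

if __name__ == "__main__":
    print(split_peers(SAMPLE_CELL))
```

If the pattern does not match, the text comes back unchanged, so a wrong sample shows up immediately. This only tests the filter; it says nothing about whether the selector finds the cell in the first place.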

@garfield69
Contributor

The issue appears to be that the seeders and leechers are not always present in the row block, as the HTML you dumped from the log shows.
So add a default seeders entry above the existing one, and mark the original entry as optional so that a missing selector match no longer raises an error.

      seeders:
        text: 1
      seeders:
        selector: td:nth-child(4)
        optional: true

Repeat for leechers.
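
To make the intent of the default-plus-optional pattern concrete, here is a small Python model of it. It is an illustration of the behaviour described above, not Jackett's actual implementation; `resolve_field`, the `cell` key and the sample rows are all hypothetical.

```python
def resolve_field(entries, row_cells):
    """Apply field entries in order; a later entry overrides an earlier one
    only when its cell actually exists in the row."""
    value = None
    for entry in entries:
        if "text" in entry:
            # Fixed default, like "text: 1" in the definition.
            value = str(entry["text"])
            continue
        index = entry["cell"] - 1  # nth-child is 1-based
        if index < len(row_cells):
            value = row_cells[index]
        elif not entry.get("optional", False):
            # Without "optional: true" a missing cell is an error.
            raise LookupError(f"cell {entry['cell']} didn't match")
    return value


# Mirrors the two seeders entries above: default first, optional selector second.
SEEDERS = [
    {"text": 1},
    {"cell": 4, "optional": True},
]

if __name__ == "__main__":
    three_cells = ["BDRip 1080p", "S. Darko (2009)", "19.45 GB"]
    print(resolve_field(SEEDERS, three_cells))          # row without a 4th cell
    print(resolve_field(SEEDERS, three_cells + ["7"]))  # row with a 4th cell
```

Under this model, a row that never contains a fourth cell always falls back to the default value.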

@Jorman
Contributor Author

Jorman commented Mar 8, 2019

That could work as a workaround!
On the site I always see a number for both seeders and leechers.

Do you think this is the best way to "fix" it?
Could it depend on the row selector?

    rows:
      selector: tbody#highlighted tr

Because when I apply the workaround I get this:
[image]
which is not correct:
[image]

What do you think?

@garfield69
Contributor

Try:

      seeders:
        text: 1
      seeders:
        selector: td:nth-child(4) b a font
        optional: true
      leechers:
        text: 1
      leechers:
        selector: td:nth-child(4) b:nth-child(2) a 
        optional: true

@Jorman
Contributor Author

Jorman commented Mar 9, 2019

Hi, I just tried it but got the same result: always 1 seeder and 1 leecher.

This is very strange to me :D
How is that possible?

Jorman changed the title from "Hdtorrentsit" to "Hdtorrentsit error parsing seeders/leechers" on Mar 9, 2019
@garfield69
Contributor

I would need to see the site's full HTML to examine this properly.
If you send an invite to garfieldsixtynine @ gmail.com I can take a closer look.

Alternatively, if you provide the full enhanced log file I might be able to see what's going on, but without being able to test my changes there is a high risk of error.

  1. Scroll down to the bottom of the Jackett Dashboard and tick the enhanced Logging checkbox.
  2. Scroll up a bit and click on the apply server settings button.
  3. Repeat the test.
  4. Find the log.txt file (Linux: ~/.config/Jackett/, Windows: %ProgramData%\Jackett).
  5. Edit it with a plain text editor, redacting any personal details: usernames, passwords, passkeys, hashes, etc.
  6. Save, then drag and drop it here for us to take a look at.

Thanks.

@Jorman
Contributor Author

Jorman commented Mar 9, 2019

Hi, sorry, I don't have any invites available, but I can post the log here:
https://pastebin.com/XP0EbKhz
Let me know if you need more logs or anything else.

Jo

@garfield69
Contributor

The log you provided shows that the seeders and leechers are not part of the HTML body returned by the HTTP GET/POST.
This means they are being added to the displayed page dynamically, most likely using AJAX.
Cardigann (the engine behind the YAML indexers) cannot process this dynamically generated data.
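
A quick way to confirm this from the log is to count the `<td>` cells in the row that was actually fetched. The sketch below uses only Python's standard library; `LOGGED_ROW` is an abbreviated copy of the row from the log at the top of this issue (some attributes and the download link are trimmed), and `CellCounter` and `count_cells` are hypothetical helper names.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Counts every <td> start tag it sees (it does not distinguish nested tables)."""

    def __init__(self):
        super().__init__()
        self.cells = 0

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.cells += 1


def count_cells(row_html):
    counter = CellCounter()
    counter.feed(row_html)
    counter.close()
    return counter.cells


# Abbreviated from the logged row; the cell structure is unchanged.
LOGGED_ROW = """
<tr>
  <td align="center" style="padding: 0px"><a href="browse.php?cat=3"><img border="0" src="./pic//cats/HDDVDBD1080Rip.gif" alt="BDRip 1080p"></a></td>
  <td align="left"><a href="details.php?id=14921"><b>S. Darko (2009)</b></a><br><i>2019-02-25 08:58:11</i></td>
  <td align="center">19.45<br>GB</td>
</tr>
"""

if __name__ == "__main__":
    print(count_cells(LOGGED_ROW))
```

With only three cells in the fetched row, `td:nth-child(4)` has nothing to match, however the selector is refined.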

@Jorman
Contributor Author

Jorman commented Mar 9, 2019

If I understand correctly, the seeders/leechers data is added after the HTML is generated.
That only affects the last column, which is why the title, size and other fields parse fine.

So does this mean the indexer must be written in C#?
If so, do I need to open a request?

@garfield69
Contributor

So does this mean the indexer must be written in C#? If so, do I need to open a request?

Yes to both.

Jorman added a commit to Jorman/Jackett that referenced this issue Mar 10, 2019
Workaround due to Jackett#4839
Waiting for a c version of this tracker
Jorman added a commit to Jorman/Jackett that referenced this issue Mar 10, 2019
Workaround due to Jackett#4839
Waiting for a c version of this tracker
garfield69 pushed a commit that referenced this issue Mar 10, 2019
Workaround due to #4839
Waiting for a c# version of this tracker