Broken Scrapers #123

bnkai · 2020-08-09T15:06:23Z

Any issues with scrapers not working should be mentioned here.
The name of the scraper and the XPath or part that is not working would be appreciated.

Known Issues

IAFD may need a couple of tries to scrape (Cloudflare detection issues)
nhentai scraper is broken (blocked/detected by the site or by Cloudflare?)

updated 2022-09-25

Belleyy · 2020-08-21T12:22:01Z

Looks like the JavLibrary scraper can be broken sometimes.
You get the Cloudflare DDoS protection page, which blocks it (you normally need to wait 5 seconds to be redirected to the site).
I tried with useCDP, but that doesn't fix it.

Idea:
JavLibrary has mirrors/clones, so maybe it would be good to have an option so that, if a scrape fails, it changes the URL and tries one of these sites.
Example (all of these are the same page):

https://www.javlibrary.com/en/?v=javlilbj7e
https://www.m45e.com/en/?v=javlilbj7e
https://www.u44r.com/en/?v=javlilbj7e
https://www.g46e.com/en/?v=javlilbj7e

But I don't think it would be useful for other scrapers.
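A minimal sketch of the mirror fallback idea above, in Python. The host list is taken from the URLs in this comment; the function names and the `fetch` callback are hypothetical and are not part of stash or of the existing scraper.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts copied from the example URLs above; any other mirror is an assumption.
MIRROR_HOSTS = ["www.javlibrary.com", "www.m45e.com", "www.u44r.com", "www.g46e.com"]


def mirror_urls(url, hosts=MIRROR_HOSTS):
    """Return the same page URL once per mirror host, keeping path and query."""
    parts = urlsplit(url)
    return [urlunsplit(parts._replace(netloc=host)) for host in hosts]


def fetch_first_working(url, fetch, hosts=MIRROR_HOSTS):
    """Try each mirror in order; `fetch` is a caller-supplied function that
    returns the page body or raises OSError when the request is blocked."""
    for candidate in mirror_urls(url, hosts):
        try:
            return candidate, fetch(candidate)
        except OSError:
            continue  # blocked or unreachable, try the next mirror
    return None, None
```

Passing `fetch` in as a parameter keeps the sketch independent of how the request is actually made (plain HTTP or CDP).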

bnkai · 2020-09-02T21:05:11Z

@brumouta thanks for the feedback. welivetogether and babes have now been moved to a separate scraper.
Edit: added momsbang, momslickteens and propertysex as well.

budislov · 2020-09-06T16:56:33Z

RealityKings has some more broken domains:
bellesafilms.com, danejones.com, lesbea.com and sexyhub.com only parse the image. Will work fine if they are moved to RealityKingsOL

bnkai · 2020-09-06T18:48:34Z

Thanks for the feedback @budislov
The relevant scrapers have been updated

budislov · 2020-09-11T02:09:06Z

Looks like RealityKingsOL is broken. Tried to scrape from both babes.com and bellesafilms.com and only the tags came through. It appears that the div classes used in the scraper have changed. Will investigate further.

bnkai · 2020-09-11T20:25:34Z

~~Pending PR is available for RealityKingsOL and Brazzers~~
relevant PRs merged

Ziatexataor · 2020-10-06T08:21:05Z

iafd.com performer scraper not working

bnkai · 2020-10-06T19:39:58Z

IAFD fixed, thanks for the report @Ziatexataor and for the fix @Belleyy

malibustacynewhat · 2020-10-09T20:04:09Z

TransSensual.yml seems to be broken. Tested with new and older scenes and can't pull the data

bnkai · 2020-10-11T11:40:12Z

@malibustacynewhat thanks for the report
The relevant PR by @Belleyy fixes the issue

mmenanno · 2020-10-11T21:36:59Z

JAVLibrary is broken https://github.com/stashapp/CommunityScrapers/blob/master/scrapers/javlibrary.yml

Looks to be a Cloudflare error but using the CDP driver didn't resolve it for me when testing:

<!DOCTYPE html><html lang="en-US"><head>
  <meta charset="UTF-8"/>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  <meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1"/>
  <meta name="robots" content="noindex, nofollow"/>
  <meta name="viewport" content="width=device-width,initial-scale=1"/>
  <title>Just a moment...</title>
  <style type="text/css">
    html, body {width: 100%; height: 100%; margin: 0; padding: 0;}
    body {background-color: #ffffff; color: #000000; font-family:-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen, Ubuntu, "Helvetica Neue",Arial, sans-serif; font-size: 16px; line-height: 1.7em;-webkit-font-smoothing: antialiased;}
    h1 { text-align: center; font-weight:700; margin: 16px 0; font-size: 32px; color:#000000; line-height: 1.25;}
    p {font-size: 20px; font-weight: 400; margin: 8px 0;}
    p, .attribution, {text-align: center;}
    #spinner {margin: 0 auto 30px auto; display: block;}
    .attribution {margin-top: 32px;}
    @keyframes fader     { 0% {opacity: 0.2;} 50% {opacity: 1.0;} 100% {opacity: 0.2;} }
    @-webkit-keyframes fader { 0% {opacity: 0.2;} 50% {opacity: 1.0;} 100% {opacity: 0.2;} }
    #cf-bubbles > .bubbles { animation: fader 1.6s infinite;}
    #cf-bubbles > .bubbles:nth-child(2) { animation-delay: .2s;}
    #cf-bubbles > .bubbles:nth-child(3) { animation-delay: .4s;}
    .bubbles { background-color: #f58220; width:20px; height: 20px; margin:2px; border-radius:100%; display:inline-block; }
    a { color: #2c7cb0; text-decoration: none; -moz-transition: color 0.15s ease; -o-transition: color 0.15s ease; -webkit-transition: color 0.15s ease; transition: color 0.15s ease; }
    a:hover{color: #f4a15d}
    .attribution{font-size: 16px; line-height: 1.5;}
    .ray_id{display: block; margin-top: 8px;}
    #cf-wrapper #challenge-form { padding-top:25px; padding-bottom:25px; }
    #cf-hcaptcha-container { text-align:center;}
    #cf-hcaptcha-container iframe { display: inline-block;}
  </style>

    <meta http-equiv="refresh" content="12"/>
<script type="text/javascript">
  //<![CDATA[
  (function(){
    
    window._cf_chl_opt={
      cvId: "1",
      cType: "non-interactive",
      cNounce: "90957",
      cRay: "5e0bb321ef7bca98",
      cHash: "da202b537a470c2",
      cFPWv: "g",
      cRq: {
        ru: "aHR0cDovL3d3dy5qYXZsaWJyYXJ5LmNvbS9lbi8/dj1qYXZtZXpiZTNh",
        ra: "TW96aWxsYS81LjAgKE1hY2ludG9zaDsgSW50ZWwgTWFjIE9TIFggMTBfMTVfNSkgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzgzLjAuNDEwMy4xMDYgU2FmYXJpLzUzNy4zNg==",
        rm: "R0VU",
        d: "q4jiR7WSBtf4fLzLz9igfZOdIxwSKG18lkM8oKJ2oB8n30GM2iyW8aiQ9atzUZsOBOiOCY1F45Ok0xoQE9LhBiZfXlfVJaHdOBUlqNu1cCbboEIdvJX1FuypXHYYwXjfaKTC2p4xeTL5nAkfqvaQqkt1H/1p0rqFLGuv5JXJ3gBxB6Y/uALdxdsFi+lSlCG6Qe3X2Lj+WYyKl3todU7QjK8vUNythAJOrMTlR1fGrfbfXESvY4tSMJo7OEhwZymfB+AKhpzlHeTcuo+T40qfUHcXUDFRZCqSIvBynJ532Jn2bbqiZ1XffuBhRCVhBxK+kkJ9NurfuchvBr0bA3lk+Dnyykdr0hUr5lE34hioN0t6bDwXnGSBMCsX40Hx6TDDQa+utstnZqYk3G1jtYupvATJXzjvxhaNDHgOwHJomiUip/glK6aw52FuNwxXEj7ZJmdJPg4omti3B/1l7wy5+Z1rERc/nHgZE2JBxsOMDFpFXx6oNX/ZCk1//+mIVxGVFfNCBIGI1eyIKCP6LkCcsw1+aeO2YHmOzBkz9Ebx3drg5ouDQU0bmnNNsuh6vtMZ2eydA3b8y1H2mfO+UoUwB7Ej5u0cR1gJGbuSHpK+imsOFpqmwJdDPhqXYl5xcy6nVCnU2xeyqXJP/HMHGjU4h3Op/vlZKIuhtqFPC6Guk0FIUbFTI4JGMG7u3UwcuuYUrnmYXFX1vupeVrqsjsRFJXnqRhnWc+EJ62b3QYIqf/pFpb/eKU8DpE4wKEmd05vkzLCS1DZQ29AxACho6Zf0brScVV2/qvY5qVsNlk9QCSJdmmR7eyfAPju4BoRmFWdRVEymwQHM7raS1XGdZvcFDw==",
        t: "MTYwMjQ1MjAwOS4yNzIwMDA=",
        m: "jAJ8FygcOeJXMeDg2+r+pIbPCZvv7uD3AA/cCQ2MIkQ=",
        i1: "2tfaQpq68/qtCUW9AL9YZA==",
        i2: "DT4KsCiUsfsu8FZXKRmHjg==",
        uh: "TprDV0CpLyfpdzs+8x+WX/Btsv1e+OQLx8NzEGjSfMY=",
        hh: "3htzUBXaqug0moZaVaRPWNYG1rRQQxdDndKhxQafs0M=",
      }
    }
    window._cf_chl_enter = function(){window._cf_chl_opt.p=1};
    
    var a = function() {try{return !!window.addEventListener} catch(e) {return !1} },
    b = function(b, c) {a() ? document.addEventListener("DOMContentLoaded", b, c) : document.attachEvent("onreadystatechange", b)};
    b(function(){
      var cookiesEnabled=(navigator.cookieEnabled)? true : false;
      var cookieSupportInfix=cookiesEnabled?'/nocookie':'/cookie';
      var a = document.getElementById('cf-content');a.style.display = 'block';
      var isIE = /(MSIE|Trident\/|Edge\/)/i.test(window.navigator.userAgent);
      var trkjs = isIE ? new Image() : document.createElement('img');
      trkjs.setAttribute("src", "/cdn-cgi/images/trace/jschal/js"+cookieSupportInfix+"/transparent.gif?ray=5e0bb321ef7bca98");
      trkjs.id = "trk_jschal_js";
      trkjs.setAttribute("alt", "");
      document.body.appendChild(trkjs);
      
      var cpo = document.createElement('script');
      cpo.type = 'text/javascript';
      cpo.src = "/cdn-cgi/challenge-platform/h/g/orchestrate/jsch/v1";
      var done = false;
      cpo.onload = cpo.onreadystatechange = function() {
        if (!done && (!this.readyState || this.readyState === "loaded" || this.readyState === "complete")) {
          done = true;
          cpo.onload = cpo.onreadystatechange = null;
          window._cf_chl_enter()
        }
      };
      document.getElementsByTagName('head')[0].appendChild(cpo);
    
    }, false);
  })();
  //]]>
</script>


</head>
<body>
  <div style="display: none;"><a href="http://bt50.org/nonalignedfrequent.php?pl=0">table</a></div><table width="100%" height="100%" cellpadding="20">
    <tbody><tr>
      <td align="center" valign="middle">
          <div class="cf-browser-verification cf-im-under-attack">
  <noscript>
    <h1 data-translate="turn_on_js" style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1>
  </noscript>
  <div id="cf-content" style="display:none">
    
    <div id="cf-bubbles">
      <div class="bubbles"></div>
      <div class="bubbles"></div>
      <div class="bubbles"></div>
    </div>
    <h1><span data-translate="checking_browser">Checking your browser before accessing</span> javlibrary.com.</h1>
    
    <div id="no-cookie-warning" data-translate="turn_on_cookies" style="display:none">
      <p data-translate="turn_on_cookies" style="color:#bd2426;">Please enable Cookies and reload the page.</p>
    </div>
    <p data-translate="process_is_automatic">This process is automatic. Your browser will redirect to your requested content shortly.</p>
    <p data-translate="allow_5_secs">Please allow up to 5 seconds…</p>
  </div>
   
  <form class="challenge-form" id="challenge-form" action="/en/?v=javmezbe3a&amp;__cf_chl_jschl_tk__=c44b146f044ddd9d0b23bf4928759e99e7ddef0e-1602452009-0-Ab8hTl3noYmOwwAWI1D0d_6zhaYO-4vHBJD8JW4VCFmZKjqal-xVCdpCdbztfKStCEp8QJa2ganoOGB_Jnq-Qwtu6BnG7zySJxaY_Oc54OgSHPG3Mt1wJ-nYfmFjU8ShDtM6t2VT15V5I0rsRAGRc5RZPs1OE8Vi3aozMxTjxatgWYLmnk0ozVyDVudpWURh7xhqtqs9M9vv_jAfqIUgHIwFe1MVURVaxrV4jOsccyGYHvJ8ZLFmpzrqf8LPPa2N3M1SG-T4vUDhsLgjgeIkfOC6_U3zZBVNKUY8HU47JaiTLjHHnOMHfzeA4iz76Sb2MQ" method="POST" enctype="application/x-www-form-urlencoded">
    <input type="hidden" name="r" value="c77fde06d76dccbdf1aa275a6824657ec7878994-1602452009-0-AefHkw7YBHV4yapfdyGgNFofr2bk+ZNLsmu1vxzyTAyFPQickf2DVbsdFnOKYI9Zs5D6PO21kZcj5siVtnYOhmEJ7HOBLBCp4lS+GBW8iyR62pXG9ezmP6Fu4qRomUkK8uCSsqveohhquzDEYroSgMpZT0eIJXFIprAfC6uIux7NSx6mo8wGMKFoW3TJJFmAN4FKgZdHpkLShowC8AaRocTx86yZzOOrEywJ5CGsOzw5vNg4GvS4gK6MB+pR3iKfGRnXamisWHrWYZWDyfiGHOfcD8LmcCWzeIEMfD+nADV4477P2jWOHIDvEqtS7Yi0G3qKvH16LmR28qALhOLv8PAhv2GBzp8EOUcdXkJfFN1Jloqm5JU2eoCn/5uBxE0xl80s8Xfaa9vhkhqRicv3XnmHpJRhXgNvauGiYLcmaJ0189RtB6eEhZ6j1N9o9pfstDcSa00ur7vPLgDCd2AqiVrVz8SG8zb+8L+wlfrTaBCIlAiecjoTFLHTPEZW2V4eaVYzY9ECAb69YOhnGBhUXDiDk8wjSLZv8uZYMIxwW+jEsdzAtJ9TkMq5VXrE/sORd24lamS6K3Lr8g9BasZTjJdR3Omni9UmlQVaVDXUIPQBAb6x1nhf57/47lvWjDgrjuEw47NDosN3IHSDoyKYUMg="/>
    <input type="hidden" value="3715604b2b146b25182bb17d479ebda2" id="jschl-vc" name="jschl_vc"/>
    <!-- <input type="hidden" value="" id="jschl-vc" name="jschl_vc"/> -->
    <input type="hidden" name="pass" value="1602452013.272-qzCPIXiuVG"/>
    <input type="hidden" id="jschl-answer" name="jschl_answer"/>
  </form>
  
  <div id="trk_jschal_nojs" style="background-image:url(&#39;/cdn-cgi/images/trace/jschal/nojs/transparent.gif?ray=5e0bb321ef7bca98&#39;)"> </div>
</div>

          
          <div class="attribution">
            DDoS protection by <a href="https://www.cloudflare.com/5xx-error-landing/" target="_blank">Cloudflare</a>
            <br/>
            <span class="ray_id">Ray ID: <code>5e0bb321ef7bca98</code></span>
          </div>
      </td>
     
    </tr>
  </tbody></table>


</body></html>
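For anyone debugging a similar failure, here is a small Python sketch that recognises this kind of response. The markers are strings that appear in the page pasted above; Cloudflare can change its challenge page at any time, so treat them as an assumption. The helper names are hypothetical.

```python
import base64
import re

# Strings that appear in the challenge page pasted above.
CHALLENGE_MARKERS = (
    "<title>Just a moment...</title>",
    "cf-browser-verification",
    'id="challenge-form"',
)


def is_cloudflare_challenge(html):
    """True when at least two of the known markers are present."""
    return sum(marker in html for marker in CHALLENGE_MARKERS) >= 2


def requested_url(html):
    """Decode the base64 `ru` field of the challenge options, if present."""
    match = re.search(r'\bru:\s*"([A-Za-z0-9+/=]+)"', html)
    if match is None:
        return None
    return base64.b64decode(match.group(1)).decode("utf-8", errors="replace")
```

For the page above, the `ru` value decodes to the scene URL that was requested, which confirms that the response is the challenge page and not the scene page.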

Ziatexataor · 2020-11-06T02:19:03Z

teamskeet.com
not working

bnkai · 2020-11-06T17:47:40Z

Teamskeet only works for a single query and then Cloudflare blocks the IP, I think.
Not much can be done.

For javlibrary, with the last update you can change the URL to one of the mirrors and it should work.

SpedNSFW · 2020-11-08T02:34:27Z

Vixen Network sites now require you to login when opening a scene page, thus the scraper no longer works.

Belleyy · 2020-11-08T11:28:44Z

> Vixen Network sites now require you to login when opening a scene page, thus the scraper no longer works.

Already solved in discord, but for other people:
If you are on a performer page, the link to the scene will have members. in the URL (https://members.tushy.com/inauguration)
Just remove the members. to get to the scene. 😃
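If you want to do this for many URLs, the rewrite can be sketched in Python as below. The function name is hypothetical, and it only strips a leading `members.` label from the host.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_members_subdomain(url):
    """Drop a leading 'members.' from the host, leaving the rest of the URL alone."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.startswith("members."):
        host = host[len("members."):]
    return urlunsplit(parts._replace(netloc=host))
```

With the example above, `https://members.tushy.com/inauguration` becomes `https://tushy.com/inauguration`.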

Threak · 2020-11-17T00:23:45Z

teenfidelity.com doesn't work (part of /scrapers/KellyMadisonMedia.yml)
the comment states the first scraping attempt should set a cookie, the second attempt should work, but it doesn't

bnkai · 2020-11-17T08:52:27Z

@Threak are you sure you set up CDP correctly? I just tried and it seems to work. The first request has something to do with their site protection, not necessarily a cookie.
You can append this at the end of the scraper file, then refresh the scrapers

debug:
  printHTML: true

and have a look at the log so that you can see what the site returns to stash.

Belleyy · 2020-11-17T10:10:21Z

@bnkai I think there is a difference between headless chromium and using normal chrome.
@Threak Which CDP setup do you use, headless Chrome or a Chromium executable?

I use a Chromium executable and this scraper doesn't work for me, just like the Teamskeet scraper, so I think there is a difference between headless and classic.

bnkai · 2020-11-18T09:22:59Z

@Belleyy you might be right
I am using a headless chrome docker container so that might be it.
Teamskeet works only for 1-2 queries max, but teenfidelity works ok after the first query

bnkai · 2020-11-18T11:55:37Z

@Belleyy upon further investigation it seems that the docker container method maintains some cookies, which I assume the executable one doesn't.
@Belleyy @Threak can you try the stash version from this PR stashapp/stash#934 (download links below) with this scraper file https://pastebin.com/UBuHFkfm? (Make sure to remove the old scraper file.)

stash-osx uploaded to url: "https://gofile.io/d/lq6J3w"
stash-win.exe uploaded to url: "https://gofile.io/d/DozNwz"
stash-linux uploaded to url: "https://gofile.io/d/5Tl0hi"

First make sure to set the log level to debug. Then do a scrape. After the scrape, take the nats values that are printed in the log and use them to replace the Value: "" entries in the yml file. Refresh the scrapers from stash and the scraper should work for pornfidelity. As a bonus you can set CDP to false, as it no longer seems to be needed (use it first, though, to verify that everything works with the plain chrome executable).

Belleyy · 2020-11-18T15:26:44Z

@bnkai Just tested with a few scenes and it works 👍 (with and without CDP)

~~Edit: I just found that the chromium process was still in background, will try it more later to know if i was doing something wrong or it's a issue to your PR.~~ Can't reproduce it 🤷‍♂️ .

bnkai · 2020-11-18T22:17:42Z

@Belleyy this seems to verify what I thought. We'll probably have to update the scraper to mention that a CDP remote instance is required (a plain executable is not enough) until the cookies PR is merged.

JDRanpariya · 2020-12-28T15:23:11Z

I have the following 4 errors regarding the field cookies.
time="2020-12-28T20:50:20+05:30" level=error msg="Error loading scraper C:\\Users\\...\\.stash\\scrapers\\Colette.yml: yaml: unmarshal errors:\n line 63: field cookies not found in type scraper.scraperDriverOptions"

time="2020-12-28T20:50:20+05:30" level=error msg="Error loading scraper C:\\Users\\...\\.stash\\scrapers\\KellyMadisonMedia.yml: yaml: unmarshal errors:\n line 42: field cookies not found in type scraper.scraperDriverOptions"

time="2020-12-28T20:50:20+05:30" level=error msg="Error loading scraper C:\\Users\\...\\.stash\\scrapers\\javdb.yml: yaml: unmarshal errors:\n line 89: field cookies not found in type scraper.scraperDriverOptions"

time="2020-12-28T20:50:20+05:30" level=error msg="Error loading scraper C:\\Users\\...\\.stash\\scrapers\\mgstage.yml: yaml: unmarshal errors:\n line 29: field cookies not found in type scraper.scraperDriverOptions"

Belleyy · 2020-12-28T15:31:15Z

@JDRanpariya Are you using the dev build? These scrapers need a version of stash >= v0.4.0-14.

JDRanpariya · 2020-12-28T15:38:54Z

I'm using following build
https://github.com/stashapp/stash/releases/tag/v0.4.0

The 24 Nov one

bnkai · 2020-12-29T15:59:23Z

@JDRanpariya you need to switch to a recent dev version, at least v0.4.0-14 as stated in the scrapers list, for cookie support. The one you have doesn't support that, as it's v0.4.0 (14 commits older than what you need).
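To make the version comparison concrete, here is a rough Python sketch. It assumes version strings of the form `vMAJOR.MINOR.PATCH` with an optional `-N` commit count, as used in this thread; the function names are hypothetical and this is not how stash itself checks versions.

```python
import re

# Assumed format: v0.4.0 (release) or v0.4.0-14 (14 commits after the release).
VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:-(\d+))?")


def parse_version(tag):
    match = VERSION_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"unrecognised version string: {tag!r}")
    major, minor, patch, commits = match.groups()
    # A plain release counts as 0 commits after the tag.
    return (int(major), int(minor), int(patch), int(commits or 0))


def supports_cookies(tag, minimum="v0.4.0-14"):
    """True when `tag` is at least the build that added the cookies field."""
    return parse_version(tag) >= parse_version(minimum)
```

Under these assumptions the v0.4.0 release compares as older than v0.4.0-14, which matches the "field cookies not found" errors above.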

bkbd3177 · 2024-03-22T14:44:45Z

@Maista6969 Thank you! I'm not sure how to reopen an issue, so I left a comment on the one you referenced.

LeGrosFromage · 2024-03-22T16:08:56Z

DesperateAmateurs:

I have the scraper installed (Stash has been restarted a few times since it was installed) but on a scene, clicking "Scrape with..." DA will not be in the list, pasting in a DA URL and clicking the white download/scrape button returns nothing at all (no dialog box, no message, no data) and clicking "Scrape with URL" from the "Scrape with..." list displays the message "No scenes found" - so either I'm doing something completely wrong or the scraper is 100% borked... could be either. Or both. The scraper is listed as the current/latest version.

smcallah · 2024-03-22T16:24:19Z

I just installed the scraper through the community scrapers installer in v0.25.0 and it is able to get data when entering a DA URL and clicking the scrape button.

Make sure you are clicking the reload scraper button, as well as refreshing the browser window where you are attempting to scrape the URL on a scene. I had to refresh my scene page or else the scrape button was greyed out.

LeGrosFromage · 2024-03-22T19:56:16Z

I've:
Reloaded scrapers and restarted Stash.
Quit/restarted browser.
The button is available, but no data is returned. When you click it you get the "busy circle" for 0.5 seconds then it disappears.

Maista6969 · 2024-03-22T23:38:42Z

What URL are you scraping? This could be a networking issue, but we can rule out the scraper itself being broken: I have tested it with several scenes now (like this one) and it works as expected. The only problem I can see is that the URL pattern in this scraper is too liberal: it will accept any URL that contains desperateamateurs.com/, but that covers a lot of URLs that aren't scrapable scenes 😅 I've pushed a fix for that in 75d5337

Can you use other scrapers without any issues or is this the first/only one you've tried?
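To illustrate the "too liberal" URL pattern mentioned above, here is a Python sketch contrasting a substring check with a stricter host-and-path check. The `/scenes/<slug>` path shape is purely hypothetical and is not taken from the site or from the fix in 75d5337; substitute whatever shape real scene URLs have.

```python
import re
from urllib.parse import urlsplit


def loose_match(url):
    """The liberal behaviour described above: any URL containing the domain matches."""
    return "desperateamateurs.com/" in url


# Hypothetical scene path shape, for illustration only.
SCENE_PATH = re.compile(r"^/scenes/[^/]+/?$")


def strict_match(url, path_pattern=SCENE_PATH):
    """Match only URLs on the site whose path looks like a scene page."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    on_site = host == "desperateamateurs.com" or host.endswith(".desperateamateurs.com")
    return on_site and path_pattern.match(parts.path) is not None
```

A URL that passes the loose check but is not a scene page would be routed to the scraper and come back empty, which could look like "No scenes found".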

Eviepayne · 2024-03-23T01:05:06Z

Having an issue with dc-onlyfans scraper

2024-03-22   20:47:59 Info     Retrieved latest version: v0.25.1 (bf7cb78d)
2024-03-22   20:47:50 Info     Retrieved latest version: v0.25.1 (bf7cb78d)
2024-03-22   20:29:15 Error    scrapeSingleScene: input: scrapeSingleScene scraper dc-onlyfans: could not unmarshal json from script output: EOF
2024-03-22   20:29:15 Error    could not unmarshal json from script output: EOF
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans] FileNotFoundError: [Errno 2] No such file or directory: 'data/vaultshare/OF/defiantpanda/Posts/Free'
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans]                 ^^^^^^^^^^^^^^^^
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans]     for name in os.listdir(self):
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans]   File "/usr/lib/python3.11/pathlib.py", line 932, in iterdir
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans]     for child in p.iterdir():
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans]   File "/opt/of/scrapers/community/dc-onlyfans/dc-onlyfans.py", line 159, in <module>
2024-03-22   20:29:15 Error    [Scrape / DC Onlyfans] Traceback (most recent call last):
2024-03-22   20:25:31 Error    scrapeSingleScene: input: scrapeSingleScene scraper dc_onlyfans_fansdb: scraper script error: exit status 1
2024-03-22   20:25:31 Error    [Scrape / DC OnlyFans (FansDB)] Could not find username or network in path: data/vaultshare/OF/defiantpanda/Posts/Free/Videos/0h1c8tuxrptucffl1cmvx_source.mp4
2024-03-22   20:25:20 Info     Version v0.25.1 (bf7cb78d) is already the latest released
2024-03-22   20:25:19 Info     stash is running at http://localhost:999/
2024-03-22   20:25:19 Info     stash is listening on 0.0.0.0:999
2024-03-22   20:25:19 Info     stash version: v0.25.1 (bf7cb78d) - Official Build - 2024-03-13 03:32:11
2024-03-22   20:25:19 Info     [InitHWSupport] Supported HW codecs:
2024-03-22   20:25:19 Info     using config file: /opt/of/config.yml

Maista6969 · 2024-03-23T01:17:00Z

> Having an issue with dc-onlyfans scraper

This scraper is very particular about file structures: your folder is named OF but the scraper expects a folder named exactly OnlyFans
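As a rough illustration of why the folder name matters, here is a Python sketch of a path-based lookup. It assumes the username is the folder directly under the network folder; the function name is hypothetical and this is not the scraper's actual code.

```python
from pathlib import PurePosixPath


def find_username(path, network_dir="OnlyFans"):
    """Return the folder that follows `network_dir` in the path, or None."""
    parts = PurePosixPath(path).parts
    if network_dir not in parts:
        return None  # e.g. the folder is called 'OF' instead of 'OnlyFans'
    index = parts.index(network_dir)
    if index + 1 >= len(parts):
        return None
    return parts[index + 1]
```

With the path from the log above, the `OF` folder yields no match, while the same path with `OnlyFans` in its place yields `defiantpanda`.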

Eviepayne · 2024-03-23T02:55:25Z

I still can't seem to get it working.
Any ideas? I wish the errors were more verbose and useful

2024-03-22 22:36:05 Error   scrapeSingleScene: input: scrapeSingleScene scraper dc-onlyfans: could not unmarshal json from script output: EOF
2024-03-22 22:36:05 Error   could not unmarshal json from script output: EOF
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans]                 ^^^^^^^^^^^^^^^^
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans]     for name in os.listdir(self):
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans] FileNotFoundError: [Errno 2] No such file or directory: 'data/vaultshare/OnlyFans/defiantpanda/Posts/Free'
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans]     for child in p.iterdir():
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans]   File "/opt/of/scrapers/community/dc-onlyfans/dc-onlyfans.py", line 159, in <module>
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans] Traceback (most recent call last):
2024-03-22 22:36:05 Error   [Scrape / DC Onlyfans]   File "/usr/lib/python3.11/pathlib.py", line 932, in iterdir

Eviepayne · 2024-03-23T18:03:21Z

Got it working.
Thanks to Maista on the Discord, who directed me to Fanscrape.

LeGrosFromage · 2024-03-29T15:09:37Z

I Want Clips:
Trying to scrape either a scene or a performer times out after about 30 seconds with:
Response: Not successful Returned status code:504

Maista6969 · 2024-03-29T15:16:28Z

> I Want Clips: Trying to scrape either a scene or a performer times out after about 30 seconds with: Response: Not successful Returned status code:504

I am unable to reproduce this so the scraper isn't broken. Status code 504 is gateway timeout so it's definitely a networking issue, but not necessarily something you can do something about. It could just be a transient problem that will pass on its own 🙂
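If you want to check whether a 504 is transient, retrying after a short pause is one option. Below is a minimal Python sketch using only the standard library; the function name, attempt count and delays are assumptions, and this is not something stash does for you.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=3, delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying only on HTTP 504 with a growing pause in between."""
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Anything other than a gateway timeout, or the last attempt, is re-raised.
            if err.code != 504 or attempt == attempts:
                raise
            sleep(delay * attempt)
```

The `opener` and `sleep` parameters exist only so the function can be exercised without a network connection.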

LeGrosFromage · 2024-03-29T15:21:52Z

> > I Want Clips: Trying to scrape either a scene or a performer times out after about 30 seconds with: Response: Not successful Returned status code:504
>
> I am unable to reproduce this so the scraper isn't broken. Status code 504 is gateway timeout so it's definitely a networking issue, but not necessarily something you can do something about. It could just be a transient problem that will pass on its own 🙂

That's fair. Thanks for looking at it.

Tany9696 · 2024-04-15T21:01:52Z

scraper Brazzers: error running scraper script
Please help! I can't scrape Brazzers scenes

Maista6969 · 2024-04-16T01:08:15Z

> scraper Brazzers: error running scraper script plz help! i cant Scrap brazzers scenes

Need more info to be able to help with this, but first: please look at the README for some manual steps required to use Python scrapers at this time

Tany9696 · 2024-04-16T11:49:27Z

Thanks for the reply. I installed Python but the Brazzers scraper still doesn't work. In a scene I click Edit, then "Scrape with..." and Brazzers, and it shows "error running scraper script". I tried another site but the same thing happened.

Maista6969 · 2024-04-17T03:29:06Z

> Thanks for reply I installed python but brazzers scrap not work .in scene i click on edit and clicking on scrape with and brazzers and it showed error running scraper script i tried another site but same things happend

Can you check the logs at Debug level to see what's going wrong? I can't see your screen from where I'm sitting

MyDirtyAccount · 2024-04-24T14:57:58Z

The X-Art scraper was recently updated to support galleries by @Ksrx01 in #1698 (thanks!). It returns blank details on some galleries, due to inconsistent HTML structures by the studio.

In Bohemian Rhapsody and First Loves, the description is on the paragraph inside the one with ID desc:

<p id="desc"><p>It's a "Bohemian [...] Colette</p></p>

<p id="desc"><p>Chelsea is [...] so cute!</p></p>

The XPath expression is on line 39:

    gallery:
      Title: //div[@class="small-12 medium-12 large-6 columns info"]/h1[@class="show-for-large-up"]
      Details: //div[@class="small-12 medium-12 large-6 columns info"]/p[@id="desc"]
      Date:
        selector: //div[@class="small-12 medium-12 large-6 columns info"]/h2[1]/text()

Ksrx01 · 2024-04-24T20:01:05Z

> The X-Art scraper was recently updated to support galleries by @Ksrx01 in #1698 (thanks!). It returns blank details on some galleries, due to inconsistent HTML structures by the studio.

Noticed that issue too, shortly after updating it.
Unfortunately I didn't have the time to take a proper look. I had a few instances where it wasn't simply a nested P, some had DIV too.

Maista6969 · 2024-04-25T02:39:23Z

> In Bohemian Rhapsody and First Loves, the description is on the paragraph inside the one with ID desc:

The descriptions are actually adjacent to the paragraph with the ID desc! I couldn't find any galleries that had div elements like @Ksrx01 described but I'd love it if I had some examples.

In the meantime I've pushed a fix that ensures that we can scrape the full description for galleries on X-Art 🙂
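The adjacency follows from how HTML is parsed: a new `<p>` start tag implicitly closes an open `<p>`, so the "nested" paragraph ends up as a sibling of an empty `p#desc`. Here is a small Python sketch of that single rule using the standard library's `html.parser`. The class and function names are hypothetical, and this is an illustration only, not the fix that was pushed.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects (id, text) per <p>, treating a new <p> as closing any open one."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # list of [id, text]
        self._open = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Starting a new paragraph implicitly ends the previous one.
            self.paragraphs.append([dict(attrs).get("id"), ""])
            self._open = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._open = False

    def handle_data(self, data):
        if self._open:
            self.paragraphs[-1][1] += data


def description(html):
    """Text of p#desc, falling back to the next paragraph when p#desc is empty."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    for index, (pid, text) in enumerate(collector.paragraphs):
        if pid == "desc":
            if text.strip():
                return text.strip()
            if index + 1 < len(collector.paragraphs):
                return collector.paragraphs[index + 1][1].strip()
    return ""
```

Both structures then give the same result: text directly inside `p#desc`, or text in the paragraph that follows an empty `p#desc`.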

Ksrx01 · 2024-04-25T04:02:01Z

> > In Bohemian Rhapsody and First Loves, the description is on the paragraph inside the one with ID desc:
>
> The descriptions are actually adjacent to the paragraph with the ID desc! I couldn't find any galleries that had div elements like @Ksrx01 described but I'd love it if I had some examples.
>
> In the meantime I've pushed a fix that ensures that we can scrape the full description for galleries on X-Art 🙂

Thanks! Unfortunately I can't remember which galleries I had issues with.

MyDirtyAccount · 2024-04-25T20:59:41Z

> In the meantime I've pushed a fix that ensures that we can scrape the full description for galleries on X-Art 🙂

Confirmed fixed! Thanks!

thickconfusion · 2024-04-26T19:55:22Z

I'm having an issue with Redgifs

ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] HTTP Error: 404
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] Traceback (most recent call last):
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] File "/root/.stash/scrapers/community/Redgifs/Redgifs.py", line 183, in
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] result = json.dumps([scraper.getParseId(id)])
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] ^^^^^^^^^^^^^^^^^^^^^^
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] File "/root/.stash/scrapers/community/Redgifs/Redgifs.py", line 124, in getParseId
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] gif = req.get("gif")
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] ^^^^^^^
ERRO[2024-04-26 19:54:50] [Scrape / Redgifs] AttributeError: 'NoneType' object has no attribute 'get'
ERRO[2024-04-26 19:54:50] could not unmarshal json from script output: EOF
ERRO[2024-04-26 19:54:50] scrapeSingleScene: input: scrapeSingleScene error while name scraping with scraper Redgifs: could not unmarshal json from script output: EOF
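The log shows two failures chained together: the HTTP request returned 404, so the response object was `None`, and the script then crashed on `req.get("gif")` before printing anything, which is why Stash reports EOF when it tries to read JSON from the script output. A minimal sketch of guarding against that, assuming the response is either a parsed JSON dict or `None`; `parse_gif`, `to_script_output`, and the `id`/`title` fields are hypothetical and not taken from `Redgifs.py`:

```python
import json
from typing import Optional


def parse_gif(response: Optional[dict]) -> Optional[dict]:
    """Return a scene dict, or None if the request failed or had no gif."""
    if response is None:          # e.g. the HTTP 404 case from the log
        return None
    gif = response.get("gif")
    if not gif:
        return None
    # Hypothetical mapping: the real scraper fills in more fields.
    return {"title": gif.get("id")}


def to_script_output(response: Optional[dict]) -> str:
    """Always produce valid JSON so the caller never sees an empty output."""
    scene = parse_gif(response)
    return json.dumps([scene] if scene else [])


print(to_script_output(None))
# → []
```

Returning an empty list on failure keeps the output parseable, so the underlying 404 is the only error left to investigate.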

andbigdata · 2024-04-29T14:28:51Z

IAFD updated their URL for performers. It used to be "https://www.iafd.com/person.rme/perfid=" and it is now "https://www.iafd.com/person.rme/id=". I didn't check all parts of the scraper, but all of the data returned from the scraper looks correct.

Maista6969 · 2024-04-29T14:38:28Z

> IAFD updated their URL for performers. It used to be "https://www.iafd.com/person.rme/perfid=" and it is now "https://www.iafd.com/person.rme/id=". I didn't check all parts of the scraper, but all of the data returned from the scraper looks correct.

Thank you for bringing this up! I've pushed a new version of the scraper YAML that triggers on the new pattern as well as the old one, since old URLs redirect to the new ones and still scrape fine 🙂
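As a rough illustration of triggering on both patterns, here is a sketch that assumes the URL triggers are plain substring matches. `PERFORMER_URL_PATTERNS` and `matches_performer_url` are hypothetical names, and the performer ID in the usage lines is a placeholder, not a real one.

```python
# Both prefixes quoted in the report above: old first, then new.
PERFORMER_URL_PATTERNS = (
    "https://www.iafd.com/person.rme/perfid=",
    "https://www.iafd.com/person.rme/id=",
)


def matches_performer_url(url: str) -> bool:
    """True if the URL contains either the old or the new performer prefix."""
    return any(pattern in url for pattern in PERFORMER_URL_PATTERNS)


# Placeholder IDs, for illustration only.
print(matches_performer_url("https://www.iafd.com/person.rme/perfid=example"))
# → True
print(matches_performer_url("https://www.iafd.com/person.rme/id=example"))
# → True
```

Keeping the old pattern costs nothing as long as the site keeps redirecting old links, and it means previously saved performer URLs continue to work.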

Maista6969 · 2024-05-02T06:47:11Z

Closing this in favor of creating individual issues for broken scrapers: if you've come here to report a broken scraper, please open a new issue

bnkai pinned this issue Aug 9, 2020

Belleyy mentioned this issue Oct 6, 2020: Fixing IAFD #202 (Merged)

Belleyy mentioned this issue Oct 10, 2020: Update Mindgeek (TransSensual & Movie) #214 (Merged)

mmenanno mentioned this issue Nov 9, 2020: Remove Cloudflare errors from JAVLibrary #267 (Merged)

bkbd3177 mentioned this issue Mar 22, 2024: [Bug Report] XPath Scraper shouldn't remove newlines for Detail fields stashapp/stash#591 (Open)

DogmaDragon mentioned this issue Mar 25, 2024: Fix nhentai scrapper #1707 (Closed)

DogmaDragon mentioned this issue Apr 3, 2024: Create scraper for itspov.com and its sub sites #1737 (Closed)

Maista6969 closed this as completed May 2, 2024

Maista6969 unpinned this issue May 2, 2024
