How to get to the "bottom" of an infinite scroll page? #625

Closed
nick2012 opened this issue May 7, 2016 · 3 comments

@nick2012

nick2012 commented May 7, 2016

Nightmare.action('scrollPage', function (done) {
  this.evaluate_now(function () {
    var hitRockBottom = false;
    while (!hitRockBottom) {
      // console.log("1");
      // Scroll the page (not sure if this is the best way to do so...)
      this.scrollTo(100000, 0);

      // Check if we've hit the bottom
      hitRockBottom = this.evaluate(function() {
        // console.log("0");
        return this.exists('selector') === null;
      });
    }
  }, done)
})
@Mr0grog
Contributor

Mr0grog commented May 7, 2016

It looks like maybe you are trying to call nightmare functions from within evaluate_now, which won’t work. The code passed to evaluate_now runs inside the browser window, where this represents the same window object any JavaScript running inside the page would have access to.

Also, unless the page specifically does something special when the page scrolls, scrolling the page isn’t going to make the result of exists(selector) change.
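To illustrate the point about where the code runs, here is a minimal sketch of a page-side function that only touches `window` and `document`, plus a Node-side wrapper that hands it to `nightmare.evaluate`. The names `scrollAndCheckForMore` and `scrollStep` are hypothetical, and the sketch assumes the selector marks an element that exists only while more content can still be loaded.

```javascript
// Runs inside the page (for example when passed to nightmare.evaluate), so
// `window` and `document` here are the page's own globals, not Nightmare.
// `selector` is a hypothetical marker that exists while more content can load.
function scrollAndCheckForMore(selector) {
  // Scroll to the current bottom of the document.
  window.scrollTo(0, document.body.scrollHeight);
  // Report whether the "more content" marker is still present.
  return document.querySelector(selector) !== null;
}

// Runs in Node. `nightmare` is an existing Nightmare instance; arguments
// after the function are passed through to it when it runs in the page.
function scrollStep(nightmare, selector) {
  return nightmare.evaluate(scrollAndCheckForMore, selector);
}
```

Because the page-side function is serialized and executed in the browser, it cannot call Nightmare methods or refer to variables from the Node side; anything it needs has to be passed in as an argument.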

@nick2012
Author

nick2012 commented May 7, 2016

.goto("url")
.scrollPage()

The selector that I use in exists() is available until no further scrolling down is possible.

On a side note, I have a different idea. There are pages where you can't easily figure out how the infinite scroll is implemented, or which selector to use, at least if you are a beginner like me.

I want it to scroll down until the size of the returned document does not increase anymore.
How can I achieve this?

Javascript kills me, in Perl this would have been so easy, but Mechanize::Firefox is outdated and very slow.

@rosshinkley
Contributor

Another nit: your initial example will also only ever scroll to 100000 px, which is a lot, but "infinite" scroll pages could be longer.

> I want it to scroll down until the size of the returned document does not increase anymore.
> How can I achieve this?

This is a very naive method to answer your question:

var Nightmare = require('nightmare');
var vo = require('vo');
var nightmare = Nightmare({
  show: true
});

var run = function * () {
  yield nightmare.goto('http://someInfiniteScrollPage.tld');

  // Keep scrolling until the document height stops growing.
  var previousHeight, currentHeight = 0;
  while (previousHeight !== currentHeight) {
    previousHeight = currentHeight;
    // Measure the page height inside the browser.
    currentHeight = yield nightmare.evaluate(function() {
      return document.body.scrollHeight;
    });
    // Scroll to the bottom and give new content time to load.
    yield nightmare.scrollTo(currentHeight, 0)
      .wait(3000);
  }
  yield nightmare.end();
};

vo(run)(function(err) {
  // Log the error, if there was one.
  console.dir(err);
  console.log('done');
});

This approach has problems: when you're going against a page that actually is an infinite scroll, the above will never end. Also, the .wait() call could be replaced with waiting for the scroll element count to change to possibly reduce latency and increase robustness.
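One way to guard against the never-ending case is to cap the number of scrolls. The following is only a sketch in the same generator style: `shouldStopScrolling`, `scrollUntilStable`, and the `maxScrolls` parameter are hypothetical names, and the fixed 3000 ms wait is kept from the example above rather than replaced with an element-count check.

```javascript
// Hypothetical helper: stop when the height is stable or the cap is reached.
function shouldStopScrolling(previousHeight, currentHeight, scrollCount, maxScrolls) {
  return currentHeight === previousHeight || scrollCount >= maxScrolls;
}

// `nightmare` is an existing Nightmare instance; `maxScrolls` is an assumed
// safety cap so a truly infinite page cannot keep the loop running forever.
function* scrollUntilStable(nightmare, maxScrolls) {
  var previousHeight = -1;
  var scrollCount = 0;
  while (true) {
    // Measure the page height inside the browser.
    var currentHeight = yield nightmare.evaluate(function () {
      return document.body.scrollHeight;
    });
    if (shouldStopScrolling(previousHeight, currentHeight, scrollCount, maxScrolls)) {
      return { height: currentHeight, scrolls: scrollCount };
    }
    previousHeight = currentHeight;
    scrollCount += 1;
    // Scroll to the measured bottom and give new content time to load.
    yield nightmare.scrollTo(currentHeight, 0).wait(3000);
  }
}
```

Inside a `vo`-driven generator such as `run`, this could be delegated to with `var result = yield* scrollUntilStable(nightmare, 50);`, where 50 is an arbitrary cap chosen for illustration.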

> Javascript kills me, in Perl this would have been so easy, but Mechanize::Firefox is outdated and very slow.

If Perl is your language of choice, you might want to check out Selenium. I haven't personally tried it, but there is a Perl port of the remote driver.
