Can't scrape airplane prices #142
Comments
@aragar I have the same issue in #141. I am also attempting to scrape an airline page, and I can see you are running into the same issue as me. The site you are scraping makes AJAX (XHR) requests between the initial page load and the display of the actual ticket price results. There are some suggestions in #81, but none of them worked in my case. I would give those a shot; maybe you will have better luck. I think the key to solving this is using https://github.com/rchipka/node-libxmljs-dom to simulate a browser, but I have not been able to implement that correctly.
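To illustrate the AJAX point above: a static scraper only sees the HTML of the first response, while the prices arrive later in a separate XHR response. The sketch below uses made-up sample data (the HTML, the JSON shape, and the field names `flights`, `price`, and `currency` are all assumptions, not taken from any real airline site) to show why selectors find nothing in the initial page and why reading the XHR payload directly can work.

```javascript
// Hypothetical sample data: neither the markup nor the JSON shape comes from a real site.
const initialHtml = '<div id="results"></div><script src="/app.js"></script>';
const xhrBody = JSON.stringify({
  flights: [{ from: 'SOF', to: 'LON', price: 59.99, currency: 'EUR' }],
});

// Look for price-like text (e.g. "59.99 EUR") in the HTML a static scraper receives.
function pricesInStaticHtml(html) {
  return html.match(/\d+(?:\.\d{2})?\s?(?:EUR|USD|BGN)/g) || [];
}

// Read the prices from the (assumed) JSON body of the XHR response instead.
function pricesFromXhr(body) {
  const payload = JSON.parse(body);
  return (payload.flights || []).map((f) => `${f.price} ${f.currency}`);
}

console.log('static HTML:', pricesInStaticHtml(initialHtml)); // empty: the results container is not filled yet
console.log('XHR payload:', pricesFromXhr(xhrBody));
```

On a real site, the browser's network tab shows which XHR request carries the results; whether that endpoint can be called directly depends on the site (cookies, tokens, and so on).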
nightmare.js works with Ajax pages.
(Sent by email in reply to Taylor McClure's comment of Jan 10, 2017, 4:05 PM.)
@bchr02 That library looks rather spiffy. I think I will give it a shot! Thank you for the input.
@taylorsmcclure thanks for the link. I've tried them already, but with no success. I will try the idea with libxmljs.
@aragar I played around with nightmare yesterday and it solved the issue I was having with AJAX. I think I will continue to use that library for my project. If you are running in a headless environment, check out segment-boneyard/nightmare#224. Here is my proof of concept using nightmare: https://gist.github.com/taylorsmcclure/76d1ecd7f999b009f6b4f8c03c600a97 It's a shame libxmljs-dom and osmosis don't seem to work for my use case; by comparison, osmosis is much more lightweight. With nightmare you need to emulate a screen with
Wow, thank you very much @taylorsmcclure. This is really helpful. I agree that osmosis looked better :\ I hope @rchipka can help us with this.
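For readers who want a rough idea of the nightmare approach discussed above, here is a minimal sketch. It is not the code from the linked gist; the function names `scrapePrices` and `parsePrice`, and any URL or selector passed in, are hypothetical. It assumes the `nightmare` package is installed (`npm install nightmare`) and relies on `.wait(selector)` to pause until the AJAX-rendered elements exist.

```javascript
// Turn a price label such as "59.99 EUR" or "120,50 BGN" into a number.
// Simplification: thousands separators (e.g. "1,299.00") are not handled.
function parsePrice(text) {
  const match = text.replace(',', '.').match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// Load a page in nightmare's headless browser and collect price texts.
// `url` and `priceSelector` are placeholders supplied by the caller.
function scrapePrices(url, priceSelector) {
  // Required lazily so this file still loads where nightmare is not installed.
  const Nightmare = require('nightmare');
  return Nightmare({ show: false })
    .goto(url)
    .wait(priceSelector) // wait until the AJAX results have been rendered
    .evaluate((selector) => {
      // Runs inside the page: gather the text of every matching element.
      return Array.from(document.querySelectorAll(selector)).map((el) =>
        el.textContent.trim()
      );
    }, priceSelector)
    .end()
    .then((texts) => texts.map(parsePrice));
}

// Usage (hypothetical URL and selector):
// scrapePrices('https://example.com/search?from=SOF&to=LON', '.price')
//   .then(console.log)
//   .catch(console.error);
```

The trade-off raised in the thread still applies: this starts a full browser, so it is much heavier than osmosis.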
I tried the advice in #81.
I have the same issue too. I also tried what @aragar mentioned in his comment, but the issue persisted. I hope someone can help with this. I would prefer to use osmosis over the much heavier nightmarejs (although I did get it working with nightmarejs).
Hello, guys. I've restarted the project after a long delay, and after reading the Osmosis source code I realised that the result of the submit (which is a POST request) is actually stripped from the available data. You need to reconfigure Osmosis to be able to see it. Here is the code I use for getting the result with WizzAir:
As a result, you have the response of the query saved in the response field of the current context.
I am trying to get airplane ticket prices between two cities from https://www.air.bg/en
I started with the following code:
From then on, I can't get any more information. Every time I try .find(SOMETHING), I receive "no results for ...".
Can somebody help me continue and get some information from the next page with the results?