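The page contains JavaScript that runs after the HTML has loaded in the browser and fills in the price. The Website Agent only sees the raw HTML and does not execute that script, so `<div id="productInfoPrice">` is still empty when it parses the page.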
From a quick look, the price you are after seems to be sent in the same HTML document (not retrieved through a subsequent request to an API), so take a look at the raw HTML and search for `var componentsData = {`, which is a giant JS object containing all of the product info.
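To illustrate the idea outside Huginn, here is a minimal Python sketch that pulls such an object out of raw HTML. It assumes the object literal after `var componentsData =` happens to be valid JSON, which may not hold for the real page. The sample HTML and the key names (`product`, `price`, `amount`, `currency`) are hypothetical; the actual structure has to be checked in the page source.

```python
import json
import re


def extract_components_data(html: str) -> dict:
    """Return the object assigned to `var componentsData` in raw HTML.

    Assumption: the object literal is valid JSON. If it is not,
    json raises JSONDecodeError (a subclass of ValueError).
    """
    match = re.search(r"var\s+componentsData\s*=\s*", html)
    if match is None:
        raise ValueError("componentsData not found in the HTML")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (the trailing `;` and the rest of the page).
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


# Hypothetical stand-in for the raw page; the real object is much larger
# and its keys will differ.
SAMPLE_HTML = """
<html><body>
<div id="productInfoPrice"></div>
<script>
var componentsData = {"product": {"name": "Example shelf", "price": {"amount": 1234, "currency": "SEK"}}};
</script>
</body></html>
"""

if __name__ == "__main__":
    data = extract_components_data(SAMPLE_HTML)
    print(data["product"]["price"]["amount"])
```

Note that in the sample the `<div>` is empty while the price is still present inside the script, which mirrors what the dry run shows.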
An alternative would be to run the page through browserless or PhantomJS to get the HTML as it is after the JavaScript has fired, and parse that with a normal Website Agent.
Hi all,
I'm new to Huginn, and while I have managed to scrape 3 websites so far, there are others from which there seems to be no way of pulling any data.
This is more of a question than a real issue, I suppose, as I think I might be doing something wrong.
Here is my case.
I want to scrape this link, which seems rather ordinary to me.
List of jobs
And this is my Website Agent
While the same approach worked on 3 other HTML pages, it doesn't seem to work on this one.
The dry run returns an empty list (as does the saved agent), and although I have tried both XPath and CSS, the result doesn't change at all (an empty list of events).
I encounter the very same issue when I try to scrape [the price of this bookshelf on Trademax](https://www.trademax.se/f%C3%B6rvaring/hyllor/bokhylla/skanelija-bokhylla-svart-p882980).
It doesn't matter whether I use XPath or CSS: to Huginn, it looks like there is nothing after the id `#productInfoPrice`. It looks like `<div id="productInfoPrice">....</div>` is totally empty.
This is my Website Agent for scraping the bookshelf.
The dry run shows that `<div id="productInfoPrice">` looks completely empty.
As I said, the same approach worked on 3 other websites, but in this case it simply doesn't and returns an empty list of events.
Do you have any suggestions? I'm really grasping at straws here :-(