Doesn't seem to handle Amazon partner link redirects? #61
Comments
Hello! Thanks for opening this issue. In |
@jshemas thanks, I think that’s definitely a good addition. Have you checked it against my example though? I think something more complicated might be happening here - if I view source on the page the scraper sees it doesn’t have the open graph tags yet so I don’t know what the Facebook scraper is doing differently? |
Hello, I'm not able to use the Facebook debugger since I don't have a Facebook account. Facebook is probably doing a lot more than just scraping Open Graph info. What other information would you like OGS to pull back? I can probably do a one-off thing to get Amazon reviews, if that is what you need. |
Hi @jshemas, again thanks for responding. It's not the reviews bit per se (that just seems to be something Amazon adds into its thumbnails) it's more just figuring out what's weird about how / when Amazon exposes og tags for its products. Seems weird what comes back from curl even. |
It might not even be exposing og tags! But I'm sure I saw them on at least one product page! |
No, it does seem that Amazon doesn't expose Open Graph tags after all, so Facebook must be making an exception to get this (maybe hitting an Amazon API). This all seems beyond the scope of |
In case anyone else needs this, a solution for Amazon is described here: jhy/jsoup#976. You just need to set a User-Agent header.
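For anyone who wants to try the User-Agent idea outside of jsoup, here is a rough Python sketch using only the standard library. It is not the code from the linked thread and not how OGS works internally. The `BROWSER_UA` value and the helper names (`build_request`, `extract_og_tags`, `fetch_og_tags`) are made up for illustration, and as noted above Amazon may simply not serve og tags for a given page, in which case the result will be empty.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Assumption: any mainstream browser User-Agent string; this value is illustrative.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_og_tags(html):
    """Return a dict of og:* properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


def build_request(url, user_agent=BROWSER_UA):
    """Build a request that identifies itself with a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_og_tags(url, user_agent=BROWSER_UA):
    """Fetch a page (urlopen follows redirects by default) and parse its og tags."""
    with urlopen(build_request(url, user_agent), timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    return extract_og_tags(html)
```

Calling `fetch_og_tags("https://amzn.to/2Is8sCR")` would be the equivalent of the request in this issue. Whether it returns anything depends on what Amazon decides to serve for that User-Agent.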
If you're a noob like me, here is the full answer, following @TylerAHolden's guidance:
|
When I scrape this Amazon url: https://amzn.to/2Is8sCR, I expect to see the same results as the Facebook debugger (here) which follows a redirect to here, but for some reason I always get this:
Here's what I see
Here's what I want
Thanks, hope somebody can help!