Doesn't seem to handle Amazon partner link redirects? #61

Closed
michaelforrest opened this issue Apr 12, 2018 · 8 comments

Comments

@michaelforrest

When I scrape this Amazon url: https://amzn.to/2Is8sCR, I expect to see the same results as the Facebook debugger (here) which follows a redirect to here, but for some reason I always get this:

{
  ogDescription: 'Buy The Anatomy of Story: 22 Steps to Becoming a Master Storyteller Reprint by John Truby (ISBN: 8601200418156) from Amazon\'s Book Store. Everyday low prices and free delivery on eligible orders.',
  ogImage: {
    url: 'https://images-eu.ssl-images-amazon.com/images/G/02/gno/sprites/nav-sprite-global_bluebeacon-V3-1x_optimized._CB516557022_.png'
  },
  ogTitle: 'The Anatomy of Story: 22 Steps to Becoming a Master Storyteller: Amazon.co.uk: John Truby: 8601200418156: Books'

}

Here's what I see:

[image: seen]

Here's what I want:

[image: wanted]

Thanks, hope somebody can help!

@jshemas
Owner

jshemas commented Apr 13, 2018

Hello! Thanks for opening this issue.

In open-graph-scraper@3.2.0 I updated the module to send back an array of all of the images on the page if there aren't any Open Graph images.

@michaelforrest
Author

@jshemas thanks, I think that’s definitely a good addition. Have you checked it against my example, though? I think something more complicated might be happening here: if I view source on the page the scraper sees, it doesn’t have the Open Graph tags yet, so I don’t know what the Facebook scraper is doing differently.

@jshemas
Owner

jshemas commented Apr 15, 2018

Hello, I'm not able to use the Facebook debugger since I don't have a Facebook account. Facebook is probably doing a lot more than just scraping Open Graph info. What other information would you like OGS to pull back?

I can probably do a one off thing to get amazon reviews, if that is what you need.

@michaelforrest
Author

Hi @jshemas, again thanks for responding. It's not the reviews bit per se (that just seems to be something Amazon adds into its thumbnails); it's more just figuring out what's weird about how and when Amazon exposes og tags for its products. Even what comes back from curl seems weird.

@michaelforrest
Author

It might not even be exposing og tags! But I'm sure I saw them on at least one product page!

@michaelforrest
Author

No, it does seem that Amazon doesn't expose Open Graph tags after all, so Facebook must be making an exception to get this (maybe hitting an Amazon API). This all seems beyond the scope of open-graph-scraper, so I'm happy to close this issue.

@TylerAHolden

In case anyone else needs this, a solution for Amazon is described in jhy/jsoup#976: you just need to set the User-Agent header.

@t-lochhead

t-lochhead commented Jan 23, 2021

If you're a noob like me, here is the full answer, following @TylerAHolden's guidance:

const ogs = require("open-graph-scraper");
const options = {
  url: "https://amzn.to/2Is8sCR",
  headers: {
    "user-agent": "Googlebot/2.1 (+http://www.google.com/bot.html)",
  },
};
ogs(options, (error, results, response) => {
  console.log("error:", error); // true or false: true if there was an error. The error itself is inside the results object.
  console.log("results:", results); // This contains all of the Open Graph results
  console.log("response:", response); // This contains the HTML of the page
});
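
A minimal sketch building on the snippet above, for anyone who wants to try more than one User-Agent string. Both function names are hypothetical and not part of open-graph-scraper. `scrape` is assumed to be any function that takes the same options object and returns a Promise of the results (for example, a thin wrapper around `ogs`), and `accept` is a caller-supplied check, since the first post in this thread shows that a result can come back with an `ogImage` that is not the one you want.

```javascript
// Builds the options object used in the snippet above.
// `buildOptions` is a hypothetical helper name, not part of open-graph-scraper.
function buildOptions(url, userAgent) {
  return { url, headers: { "user-agent": userAgent } };
}

// Tries each user-agent string in order and returns the first result that
// `accept` approves of, or null if none does.
// `scrape` is an assumption: any function that takes an options object and
// returns a Promise of a results object.
async function scrapeWithUserAgents(scrape, url, userAgents, accept) {
  for (const userAgent of userAgents) {
    try {
      const results = await scrape(buildOptions(url, userAgent));
      if (accept(results)) {
        return { userAgent, results };
      }
    } catch (err) {
      // A failed request for one user agent should not stop the others.
    }
  }
  return null;
}
```

Passing the scraper in as an argument keeps the helper independent of which version of the module is installed, and makes it easy to try out with a stub before sending real requests.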

@jshemas jshemas mentioned this issue Feb 16, 2021