Spiderable returns empty body #721

Closed
raix opened this Issue Feb 21, 2013 · 10 comments

3 participants
@raix
Contributor

raix commented Feb 21, 2013

I've added the spiderable package and it kind of renders, but when I check with curl the body is empty:

curl meteorrain.meteor.com

I really just need an img tag and some text for Twitter/Facebook to use.

EDIT: At first I tested with Facebook, which didn't find any img tags or text. I've read the spiderable code and can see that only the Facebook bot is supported, and there might be an issue with node?

Knowing that spiderable is a temporary approach, should I try fixing it, or try getting templates rendered server-side? (I guess that would require some input from core, e.g. a strategy for JS/non-JS bots: use robots.txt, or only support major bots.)

svasva commented Feb 22, 2013

Spiderable only works for requests that match the AJAX Crawling Specification ( https://developers.google.com/webmasters/ajax-crawling/ ). Try curl http://meteorrain.meteor.com/?_escaped_fragment_=
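
To illustrate the point about the `_escaped_fragment_` parameter, here is a minimal sketch of how a server could tell a crawler-style request from a normal one. The helper name `hasEscapedFragment` is hypothetical and this is not the spiderable package's actual code; it only shows the kind of check the spec implies.

```javascript
// Hypothetical helper (not part of spiderable): returns true when the
// request URL carries the _escaped_fragment_ query parameter, even if
// its value is empty, as in "/?_escaped_fragment_=".
function hasEscapedFragment(requestUrl) {
  // The base URL is only there so that relative paths parse;
  // its value is an arbitrary assumption.
  const parsed = new URL(requestUrl, "http://localhost");
  return parsed.searchParams.has("_escaped_fragment_");
}
```

With this sketch, the plain `curl meteorrain.meteor.com` request from the first comment would not qualify, while the URL suggested above would.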

raix commented Feb 22, 2013

Thanks @erundook, I've read the Google spec. So spiderable is only useful for AJAX-enabled crawlers that actually set _escaped_fragment_. I guess it doesn't solve Facebook, LinkedIn, etc.

Just for fun I tried adding a dummy parameter to the Facebook link, e.g. meteorrain.meteor.com?hash=, and this works in Facebook, but it's a bit ugly compared with a simple meteorrain.meteor.com, which doesn't work in Facebook/LinkedIn.

Would it be possible to check the user agents and, e.g., set ?_escaped_fragment_= for these if they don't do it themselves, before passing the request on to PhantomJS? (I'll have a look at spiderable.js, I think.)

svasva commented Feb 22, 2013

@raix you could do that (user-agent detection) with some kind of reverse proxy (nginx, for example).

raix commented Feb 22, 2013

@erundook: nope, I don't have to use a proxy, I just have to make spiderable.js work... It has a regular expression that tests the agent for 'facebookexternalhit'. I'm going to figure out why it doesn't seem to work, and add a test for LinkedIn too.
But thanks for your suggestion.

raix added a commit to raix/meteor that referenced this issue Feb 22, 2013

fixes meteor#721
Facebook bot header gone CamelCase instead of lowercase.
* made user agent lowercase
* added linkedin bot in reg.ex. test

raix commented Feb 22, 2013

@erundook: I guess Facebook's user agent changed from lowercase to CamelCase, and spiderable only tested for lowercase. Spiderable also only supported the Facebook bot.

I've fixed the test and added the LinkedIn bot.
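
A small sketch of why the casing mattered, assuming the check is a plain regular expression test on the User-Agent header. The user-agent strings and the function name `isKnownBot` are illustrative assumptions, not copied from spiderable.js or from real crawler traffic.

```javascript
// Case-sensitive pattern, similar in spirit to the original check.
const caseSensitive = /facebookexternalhit/;
// Case-insensitive pattern with the 'i' flag and an extra LinkedIn entry.
const caseInsensitive = /facebookexternalhit|linkedinbot/i;

// Illustrative header values only (assumed for this example).
const camelCaseAgent = "FacebookExternalHit/1.1";
const linkedInAgent = "LinkedInBot/1.0";

// Hypothetical helper: true when the header matches a known bot.
function isKnownBot(userAgent) {
  // A missing header is treated as an empty string.
  return caseInsensitive.test(userAgent || "");
}

// The case-sensitive pattern misses the CamelCase header,
// while the case-insensitive one matches it.
const missedBefore = !caseSensitive.test(camelCaseAgent);
const matchedAfter = isKnownBot(camelCaseAgent);
```

Neither pattern uses the `g` flag, so repeated calls to `test` do not carry state between requests.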

raix closed this Feb 22, 2013

raix commented Feb 22, 2013

@erundook: ... and thanks for your input :)

raix added a commit to raix/meteor that referenced this issue Feb 24, 2013

Fixed meteor#721
Added 'i' and linkedinbot to reg.ex

Member

glasser commented Feb 28, 2013

Pushed @raix's patch.

glasser added a commit that referenced this issue Feb 28, 2013

Fixed #721
Testing for bots should be case insensitive - facebook bots are not all
lowercase - they adapted CamelCase on some servers. I've added the
linkedin bot just for the sake of it.

svasva commented Mar 1, 2013

It would be awesome if this were configurable on a per-app basis.

raix commented Mar 1, 2013

Just to clarify,
like e.g.: Meteor.spiderableAgents = [/^onlyBotAllowed/]?

svasva commented Mar 1, 2013

Yes, just like that, but with a way to add agents instead of overriding the whole array. Like Meteor.spiderableAgents.push('someBot')
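
One way the proposal could look, as a rough sketch only. `Meteor.spiderableAgents` is just an idea from this thread, so the example uses a standalone array, and the names `spiderableAgents`, `addSpiderableAgent` and `matchesSpiderableAgent` are all hypothetical.

```javascript
// Assumed default list; apps append to it rather than replace it.
const spiderableAgents = [/facebookexternalhit/i, /linkedinbot/i];

// Escape regex metacharacters so a plain string is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Accepts either a RegExp or a plain string such as 'someBot'.
// Strings become case-insensitive patterns.
function addSpiderableAgent(pattern) {
  const re = pattern instanceof RegExp
    ? pattern
    : new RegExp(escapeRegExp(pattern), "i");
  spiderableAgents.push(re);
}

// True when any configured pattern matches the User-Agent header.
function matchesSpiderableAgent(userAgent) {
  return spiderableAgents.some((re) => re.test(userAgent || ""));
}

// Usage: add one bot without touching the defaults.
addSpiderableAgent("someBot");
```

Converting strings to case-insensitive patterns is a design choice here, made so the casing problem fixed earlier in this issue doesn't come back for agents added by an app.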
