Turpentine esi request loop when using a google bot #599
Comments
For some reason when setting Googlebot as my user-agent I am seeing expires as 1970, without crawler-session:
|
From what I can tell, crawlers don't actually get a cookie set. I was just playing around with that a few days ago when updating our cache warming tool. |
@eth8505 The issue I have is that if I use Google Webmaster Tools and do "Fetch as Google", I quite often get a redirect or a "temporarily unavailable" message. If I set my user agent to Googlebot in Firefox, I just get a redirect loop. If no cookie is being set, could that cause the redirect loop, since no cookies exist? I am not sure whether the same thing happens in Google Webmaster Tools or whether that is a separate issue. generate_session_expires sets the TTL of the cookie, and it only gets called if req.http.X-Varnish-Faked-Session is set. Is that set for a crawler? I am not sure! Actually, looking at the code, it should be. Also, I see in the VCL
I don't see X-Varnish-Set-Cookie anywhere in the debug output :( On the Turpentine demo site, no cookie is set at all when using Googlebot, so I wonder why one is set on mine! However, when browsing unofficial Varnish sites, there is one set |
@eth8505 I've discovered that if I disable our Cm_RedisSession module, I no longer get the redirect with Googlebot. Do you have any ideas? |
In vcl_recv() the request cookie "frontend" is set in req.http.Cookie. That's the one passed to Magento to be used as the internal session ID. However, for crawlers generate_session() is never called, so req.http.X-Varnish-Faked-Session is never filled. In vcl_fetch() the response cookie is read from beresp.http.Set-Cookie and stored in beresp.http.X-Varnish-Set-Cookie. In vcl_deliver(), generate_session_expires() is only called if req.http.X-Varnish-Faked-Session is set, which is never the case for crawlers (see vcl_recv()). Hence, no resp.http.Set-Cookie is returned to the client. |
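To make that flow easier to follow, here is a rough Python model of the three subroutines. It is only a sketch of the behaviour described in the comment above, not the actual Turpentine VCL. The `CRAWLER_UA` regex, the `COOKIE_TTL` value, the `handle()` helper, the backend cookie `frontend=abc123`, the fixed `crawler-session` ID for crawlers, and the idea that the original Set-Cookie header is dropped after being copied are all assumptions made for illustration.

```python
import re
import time
import uuid
from email.utils import formatdate

# Assumption: a stand-in for whatever crawler user-agent regex is configured.
CRAWLER_UA = re.compile(r"googlebot", re.IGNORECASE)
COOKIE_TTL = 3600  # hypothetical cookie lifetime in seconds


def vcl_recv(req):
    """Ensure the request passed to Magento carries a 'frontend' cookie."""
    if "frontend=" in req.get("Cookie", ""):
        return req  # the client already sent a session cookie
    if CRAWLER_UA.search(req.get("User-Agent", "")):
        # Crawlers: generate_session() is skipped, so no faked-session header.
        # Assumption: they share the fixed "crawler-session" ID instead.
        req["Cookie"] = "frontend=crawler-session"
    else:
        # Stand-in for generate_session(): invent an ID and remember it.
        faked = "frontend=" + uuid.uuid4().hex
        req["X-Varnish-Faked-Session"] = faked
        req["Cookie"] = faked
    return req


def vcl_fetch(beresp):
    """Store the backend's Set-Cookie in X-Varnish-Set-Cookie."""
    if "Set-Cookie" in beresp:
        # Assumption: the original header is removed once it has been copied.
        beresp["X-Varnish-Set-Cookie"] = beresp.pop("Set-Cookie")
    return beresp


def vcl_deliver(req, resp):
    """Only send a cookie to the client if a session was faked in vcl_recv()."""
    if "X-Varnish-Faked-Session" in req:
        # Stand-in for generate_session_expires(): compute the expiry date.
        expires = formatdate(time.time() + COOKIE_TTL, usegmt=True)
        resp["Set-Cookie"] = "%s; expires=%s; path=/" % (
            req["X-Varnish-Faked-Session"], expires)
    return resp


def handle(headers, backend_headers=None):
    """Run one request through the three modelled subroutines."""
    req = vcl_recv(dict(headers))
    beresp = vcl_fetch(dict(backend_headers or {"Set-Cookie": "frontend=abc123"}))
    return req, vcl_deliver(req, beresp)
```

Calling `handle({"User-Agent": "Googlebot/2.1"})` yields a response without a Set-Cookie header, while `handle({"User-Agent": "Mozilla/5.0"})` yields one carrying the faked session ID, which matches the difference between crawlers and regular visitors described above.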
@craigcarnell We use Cm_RedisSession as well. We never had any problems with redirect loops though. At least not afaik. |
@eth8505 Thanks for that explanation. I am already running the latest code from git however :( |
@eth8505 Out of interest, do you use Cm_RedisSession to share session information across multiple hosts via load balancing? Have you tried Googlebot in that scenario? |
@craigcarnell we have two webservers sharing the session data behind a load balancer. |
@eth8505 Ensuring that bots get a cookie has resolved the issue for me with Redis sessions. It's a workaround for now, but it's important to let it do its thing |
If I set my user agent to Googlebot, I see a loop on the Turpentine ESI requests. Each request also returns the whole page instead of just the requested block (the header, for example).
This might be the cause of our issues with indexing the site. Firebug reports "302 forced.302 redirect" for the Turpentine ESI request.
I also don't see the cookie being set to crawler-session. In addition, the home page keeps reloading in Firebug's Net panel.
VCL content: