Skip to content
This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Stop rewriting 404s #170

Open
GoogleCodeExporter opened this issue Apr 6, 2015 · 11 comments
Open

Stop rewriting 404s #170

GoogleCodeExporter opened this issue Apr 6, 2015 · 11 comments

Comments

@GoogleCodeExporter
Copy link

What steps will reproduce the problem?
1. Go to http://www.modpagespeed.com/mod_pagespeed_beacon
2. Look at the source and notice that it was rewritten even though it's a 404 
page

My first thought was that a 404 page should not be rewritten. But maybe it 
doesn't hurt. In this case we are adding instrumentation to the 404 page, which 
seems like a bad idea because it seems like a 404 page shouldn't have resources.

What does everyone else think?

Original issue reported on code.google.com by sligocki@google.com on 30 Dec 2010 at 9:28

@jeffkaufman
Copy link
Contributor

It may make sense to revisit this. 404 pages are often relatively rich, with lots to optimize. Neither the introduction of PLT beacons nor the DOS attack reasoning makes much sense to me. Is there any other reason to keep this restriction?

@jeffkaufman jeffkaufman reopened this Jun 15, 2015
@jeffkaufman
Copy link
Contributor

@sligocki @jmaessen

@sligocki
Copy link
Contributor

My 4 year old memory is a bit fuzzy on this :) but the reasoning I remember is: A) We do not expect much content on 404s, B) We don't expect high percentage of traffic to be for 404, C) We really don't want to break 404 pages (ex: somehow obfuscate the error message).

But if we're not worried about (C), then maybe there's no harm. I certainly don't feel strongly and think it would be fine if we changed this back.

@jmaessen
Copy link
Contributor

On Mon, Jun 15, 2015 at 5:39 PM, Shawn Ligocki notifications@github.com
wrote:

My 4 year old memory is a bit fuzzy on this :) but the reasoning I
remember is: A) We do not expect much content on 404s, B) We don't expect
high percentage of traffic to be for 404, C) We really don't want to break
404 pages (ex: somehow obfuscate the error message).

But if we're not worried about (C), then maybe there's no harm. I
certainly don't feel strongly and think it would be fine if we changed this
back.

I seem to recall in particular that folks serving huge 404 pages and sites
with lots of missing links (or just a missing favicon.ico) ran into
problems whenever (say) googlebot came a-visiting.

Arguably we should actually be replacing some 404 pages with
content-length: 0 404s, as this has produced high returns for Chrome mobile
bandwidth reduction.


Reply to this email directly or view it on GitHub
#170 (comment)
.

@jeffkaufman
Copy link
Contributor

Arguably we should actually be replacing some 404 pages with
content-length: 0 404s, as this has produced high returns for Chrome mobile
bandwidth reduction.

I don't think this is something we could do by default. Many sites use their 404 pages to offer people suggestions about how to find what they're looking for.

I do think 0-length 404s would be good for favicon.ico, apple-touch-icon.png, and apple-touch-icon-precomposed.png. On mobile browsers that fetch these by default (not Chrome anymore) they were accounting for 3%-4% of bytes downloaded: https://bugs.webkit.org/show_bug.cgi?id=104585

@morlovich
Copy link
Contributor

Note that there are practical difficulties rewriting 404s on Apache 2.4
with our current filter and the way it's configured.
I don't think it's unsolvable, but config backwards compat may be tricky.

IIRC it implements AddOutputFilterByType in mod_filter (2.2 had a
different impl in core.c), which eventually dispatches to:

static apr_status_t filter_harness(ap_filter_t *f, apr_bucket_brigade *bb)
{
apr_status_t ret;
#ifndef NO_PROTOCOL
const char *cachecontrol;
#endif
harness_ctx *ctx = f->ctx;
ap_filter_rec_t *filter = f->frec;

if (f->r->status != 200
    && !apr_table_get(f->r->subprocess_env, "filter-errordocs")) {
    ap_remove_output_filter(f);
    return ap_pass_brigade(f->next, bb);
}

... I guess the filter-errordocs may be an option, too.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants