Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add body part for redirects #3

Closed
wants to merge 1 commit into from
Closed

Conversation

wreis
Copy link

@wreis wreis commented Feb 1, 2013

No description provided.

@ap
Copy link
Owner

ap commented Feb 2, 2013

Code review:

  1. Spaces for indentation is not what the rest of the code is doing.
  2. As far as I can tell, this will erroneously generate a body for body-less non-redirect responses like status 201. It should do a positive check for all statuses for which the Location header is defined, instead of skipping on 304.
  3. $location needs entity escaping before interpolation.
  4. A Content-Type header needs to be set.
  5. I’d do the tests by adding a _no_body debug flag, to avoid the need for regexing the body. Just have a separate test that the expected body gets generated without that flag.

@ap
Copy link
Owner

ap commented Feb 2, 2013

I’ll write a patch that accounts for all of these, no problem, but I do have one question first: why?

There isn’t strictly any need to have a body in redirect requests. As far as I know, it is merely a courtesy to old browsers – very old browsers, at this point. Am I missing something?

And that brings me to the next point – why XHTML? Without a Content-Type: application/xhtml+xml it’s not going to be parsed as XHTML anyway (in fact as it stands, my guess is it’d be considered text/plain – or even application/octet-stream, not sure). But is anything even going to parse this? The stone-age browsers for which this would be necessary certainly don’t understand XHTML. And indentation – what for? Is anyone really going to view-source this? I’ll rather use tag-minimised HTML5 served as text/html, i.e.:

my $body = qq[<!DOCTYPE html><title>Moved</title>This resource has moved to <a href="$href">a new address</a>.];

(Yes, that is valid. I hate links labelled “here”, btw.)

@wreis
Copy link
Author

wreis commented Feb 2, 2013

  1. Argh! You still use tabs for indentation? For God's sake!
  2. That is why it does -not- generate body for 201 responses (as test at line 51 proves that).
  3. Right, forgot that.
  4. IMO, adding logic into code in order to benefit tests isn't a good design. So, either regex body or parse it with HTML::TreeBuilder.

@wreis
Copy link
Author

wreis commented Feb 2, 2013

Because it's a recommendation from W3C to do that.

No special reason for being XHTML. I was just lazy and copied that from my own code and as I said, I forgot the content-type header. It's not only for old browsers...but yeah, with correct content-type header they do parse that.

The indentation was more for us maintaining code as any best-practice recommendation is to do so for long strings.

It's up to you to use HTML5. I hardly believe that robots will parse that tho.

As last thing, I suggest adding a flag (defaulting to false) used in assertion for adding the body, this way we keep backcompat for users who don't need/want/like having that.

@ap
Copy link
Owner

ap commented Feb 2, 2013

It’s a recommendation from W3C to do that.

Ah.

Argh! You still use tabs for indentation? For God’s sake!

Spaces for indentation? For God’s sake! 😆

IMO, adding logic into code in order to benefit tests isn’t a good design. So, either regex body or parse it with HTML::TreeBuilder.

TreeBuilder?! Even worse! I don’t want to go groping around in the content at all, let alone add dependencies to do that.

But you do have a point, and the solution is obvious if I’d stopped to think about it: set a flag variable within the wrapped dummy app. Then the test can check that the flag didn’t get set in case of a redirect, implying that the wrapped app wasn’t called.

I suggest adding a flag (defaulting to false) used in short-circuit assertion for adding the body, this way we keep backcompat for users who don't need/want/like having that.

I don’t think it’s necessary to make that configurable. If someone explicitly does return [ 301, [], [ '' ] ] then no body will be auto-generated for them anyway. So the answer is if you want that then spell it out.

@wreis
Copy link
Author

wreis commented Feb 2, 2013

I don’t think it’s necessary to make that configurable. If someone explicitly does return [ 301, [], [ '' ] ] then no body will be auto-generated for them anyway. So the answer is if you want that then spell it out.

-Backcompat-.

And that is not true as Plack::Util::content_length() will return 0 from [''] in body.

@ap
Copy link
Owner

ap commented Feb 3, 2013

OK, I’ve commited 78cdd62.

Can you pull and and see if my master works for you?

That is why it does -not- generate body for 201 responses (as test at line 51 proves that).

It didn’t btw. When I added an explicit test for it, the test failed – as expected by the fact that I couldn’t find any logic in your patch to exempt such statuses. Try running your code against the rewrite.t from my commit.

@wreis
Copy link
Author

wreis commented Feb 3, 2013

@wreis
Copy link
Author

wreis commented Feb 3, 2013

And RE 201 response, actually the better would be to use Plack::Util::status_with_no_entity_body() instead of hardcoding checking "== 304", but it's very clear as demonstrated above that for 201 it does -not- generate body.

@ap
Copy link
Owner

ap commented Feb 4, 2013

Gosh, what an inattentive idiot I am. I misread your patch, sorry. Excuse the bother, and sorry I’ve been such a pain on this issue.

@ap
Copy link
Owner

ap commented Feb 4, 2013

Actually, not entity-encoding, but URI escaping which should be ok for cases where we auto generate the Location header but still required for app-generated ones.

I don’t follow here.

  1. If they’re explicitly setting a Location header then I don’t want to touch it, and if they’re letting Rewrite generate the header, it’ll be URI-escaped automatically courtesy of Plack::Request::uri returning a URI object.
  2. Even if it’s URI-escaped it still needs to also be HTML-escaped before it’s interpolated into HTML (think of ampersands in query strings).

But I don’t even understand whether you’re arguing that something is wrong, or unnecessary, or are trying to say something else entirely. What was your aim here?

I’m feeling obtuse.

@ap
Copy link
Owner

ap commented Feb 4, 2013

I’ve pushed another commit, 58833d2 – it’s now, finally maybe, pretty much what you wrote, give or take a few of my idiosyncrasies. Please pull and take a look.

@wreis
Copy link
Author

wreis commented Feb 12, 2013

You're right, the URI should be HTML-econded there. Looks good to me.

@ap
Copy link
Owner

ap commented Feb 12, 2013

I just pushed 540ee4e and released it to CPAN as 1.007, so I am closing this ticket. Feel free to reopen if anything new crops up on this topic. HTH, thank you for the request.

@ap ap closed this Feb 12, 2013
@wreis
Copy link
Author

wreis commented Feb 13, 2013

Oh It seems that HTML content is missing the root tag after doctype declaration?!

@ap
Copy link
Owner

ap commented Feb 13, 2013

Nope. It’s valid.

@wreis
Copy link
Author

wreis commented Feb 13, 2013

I guess you're aware that not all UA (specially here bots) support HTML5 yet eh?!

@ap
Copy link
Owner

ap commented Feb 13, 2013

  1. HTML 5 is essentially tag soup with well-defined error recovery. For older user agents which “do not support HTML 5” it is simply tag soup period (unless you employ the newer tags, which I haven’t). It’s sent with text/html and there’s an <a> somewhere in there. That’s good enough for things that understand tag soup.

  2. Did you forget that bots are going to be looking at the status and Location header and not even bother to look at the body? Are you expecting visits from bots that were written before HTTP/1.0 (i.e. older than Perl 5!) which still operate with no updates to their code?

    Note that the majority of the contemporary web will be invisible to such ancient user agents anyway since they wouldn’t support the Host header either.

  3. Note that at that age, there was only tag soup – the W3C was just a year and a half old.

I really don’t understand what non-hypothetical use case is supposed to be unsupported by the code as it stands, and considering your own patch started off with XHTML without entity encoding and with no Content-Type, I’m not inclined to trust your understanding of markup flavours and their parsing and history over mine.

But maybe I am wrong. I am open to changing my mind if you can cite any real-world example at all of a contemporarily used user agent that would suffer – i.e. one that neither supports redirects nor succeeds in parsing as tag soup the body generated by Plack::Middleware::Rewrite.

(I can tell you what use case my use of elliptical markup is supposed to support: minimising the number of bytes wasted on of perfunctory boiler plate on otherwise very small HTTP responses that a high-traffic website may generate in large numbers.)

@wreis
Copy link
Author

wreis commented Feb 13, 2013

On Feb 13, 2013, at 2:51 PM, Aristotle Pagaltzis notifications@github.com wrote:

• HTML 5 is essentially tag soup with well-defined error recovery. For older user agents which “do not support HTML 5” it is simply tag soup period (unless you employ the newer tags, which I haven’t). It’s sent with text/html and there’s an somewhere in there. That’s good enough for things that understand tag soup.

Agreed.

• Did you forget that bots are going to be looking at the status and Location header and not even bother to look at the body? Are you expecting visits from bots that were written before HTTP/1.0 (i.e. older than Perl 5!) which still operate with no updates to their code?

Note that the majority of the contemporary web will be invisible to such ancient user agents anyway since they wouldn’t support the Host header either.

• Note that at that age, there was only tag soup – the W3C was just a year and a half old.

Well, possibly but not 100% sure they will. If I were the one writing a robot yes, but not always the case.

I really don’t understand what non-hypothetical use case is supposed to be unsupported by the code as it stands, and considering your own patch started off with XHTML without entity encoding and with no Content-Type, I’m not inclined to trust your understanding of markup flavours and their parsing and history over mine.
[snip]

Step back please. I told you ages ago that I fucking forgot to set the content-type and entity-encode the URI as you fucking did as well for a plenty of things that I submitted as a patch here.

I have been trying to have a productive discussion with you since when this PR was opened, but then you have chosen to act as an idiot instead. I am done.

@ap
Copy link
Owner

ap commented Feb 13, 2013

Yes, I am an idiot. (I read only the patches and then misunderstood what exactly you did because of the broken indentation; but never thought to actually read the final code from your commit.) I did step back – I said I may be wrong, and that I am open to changing my mind, and I meant those both. My question is serious: is there some issue you are actually seeing? It does sound like a wildly hypothetical eventuality to me, because as far as I can tell, for it to actually occur in practice would require a confluence of factors I find hard to believe exists. But maybe my imagination is limited, my knowledge, or both. Convince me!

@wreis
Copy link
Author

wreis commented Feb 13, 2013

I said -idiot- and not -ignorant- cause you were trying to judge my understanding or skills instead of focusing on real thing here. You didn't even mind to carefully read the patch on the beginning, but just claimed quickly it was wrong.
Anyway, yes, it was a wild guess about possibilities on HTML5 processing, but you raised concise arguments and made sense to me as I said "Agreed". It's still a new thing for me and just wanted to discuss with opportunity to learn. Finally, I am sorry about the broken indentation.

@wreis wreis deleted the body_for_redirects branch February 13, 2013 20:21
@ap
Copy link
Owner

ap commented Feb 13, 2013

I wasn’t dinging you for the broken indentation! Sorry if you got that impression. It’s an easy mistake to make, so no worries. I was just saying it’s what misled me, combined with my own carelessness.

As for judging your understanding – you seemed dismissively convinced, so I had to gauge your level of understanding to decide how much of a stickler to be. The cues were inconclusive, though tentatively seemed to point more in one direction, so I decided not to follow your lead – just yet…

I’m not good at putting these things in a way that goes down easy, sorry. I have strong opinions, but I don’t hold onto them very fast – unfortunately the former is a lot easier to see than the latter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants