Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#7879] Fix documentation for HTTP client to, among other things, mention that Agent does HTTPS verification. #646

Merged
merged 11 commits into from
Jan 14, 2017

Conversation

glyph
Copy link
Member

@glyph glyph commented Dec 24, 2016

No description provided.

adiroiban and others added 8 commits November 19, 2015 07:15
git-svn-id: svn://svn.twistedmatrix.com/svn/Twisted/branches/doc-htt-client-ssl-valid-7879@46256 bbbe8e31-12d6-0310-92fd-ac37d47ddeeb
git-svn-id: svn://svn.twistedmatrix.com/svn/Twisted/branches/doc-htt-client-ssl-valid-7879@46257 bbbe8e31-12d6-0310-92fd-ac37d47ddeeb
git-svn-id: svn://svn.twistedmatrix.com/svn/Twisted/branches/doc-htt-client-ssl-valid-7879@46258 bbbe8e31-12d6-0310-92fd-ac37d47ddeeb
git-svn-id: svn://svn.twistedmatrix.com/svn/Twisted/branches/doc-htt-client-ssl-valid-7879@46259 bbbe8e31-12d6-0310-92fd-ac37d47ddeeb
Copy link
Member

@exarkun exarkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Generally quite good. A lot of nitpicking follows but it's not all grammar hijinx (not that I wish anyone to discount the significance of grammar).

Please address to your satisfaction and merge.


By default, since version 15.0.0 (if you're using an earlier version, please upgrade, as this is a critical security feature!), Twisted will automatically validate ``https`` URLs against your platform's trust store to the best of its ability.
Be sure to ``pip install twisted[tls]`` in order to get all the dependencies necessary to do TLS properly, but beyond that you should not need to do much.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit hand-wavey. "Should not"? When might this not hold? "do much"? Like what?

I think it would be better to just refer to installation docs here, if you think folks reading this document will need help getting a working installation of Twisted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'll just delete the whole "but beyond that" bit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are about 45 different ways that getting system trust roots can fail, unfortunately, but the way to address that is not to document all the problems repeatedly, but just to fix them.


By default, since version 15.0.0 (if you're using an earlier version, please upgrade, as this is a critical security feature!), Twisted will automatically validate ``https`` URLs against your platform's trust store to the best of its ability.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the warning here will read better if it's pulled out of this parenthetical. We probably have some kind of .. warning thing? I'd try sticking it in such a thing at the top of this HTTPS section.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"By default" and "automatically" feel somewhat redundant. I'd say "Twisted will automatically validate ..." or "By default, Twisted will validate ..."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree; there's no ..warning:: that I can see, so I think I'l use a ..note:: for this.


``Agent``'s constructor takes an optional second argument, which allows you to customize its behavior with respect to HTTPS.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... second argument which allows ..." (no comma)

Copy link
Member Author

@glyph glyph Jan 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, your comment here is just backwards. "which allows you to customize its behavior" is (I believe) a nonrestrictive phrase, the closest analog of which I can find in this list of examples https://www.grammarly.com/blog/comma-before-which/ is "Rob tripped over his shoes, which he had left lying in the middle of the floor" and "Everyone loved Robin’s video, which she had filmed in her garage", both of which are listed as requiring a comma (they don't even seem to think it's optional). So I'm going to ignore it.

In the future, if you're going to provide grammar rule feedback like this, can you provide some a citation the author can confirm their understanding of the rule you're trying to suggest adherence to? I have to admit I'm not confident that your understanding of the rules here is wrong and mine is right, but my searching isn't turning up counterexamples. If it's a stylebook thing, having a link also simplifies future feedback for other authors, for consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just to be clear, I originally typed "restrictive" but then corrected it to "nonrestrictive" above)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of the statement is that it is a restrictive clause. The context of the sentence makes the HTTPS customization the critical information being conveyed here. Maybe that actually means the sentence should be something more like "You can customize the behavior of Agent with respect to HTTPS by using its optional second argument." and that's the edit I should have suggested rather than just the comma removal (since technically "Agent's constructor takes an optional second argument." makes perfect sense on its own).

wrt grammar citations, I've long thought we should officially adhere to a particular style guide. Then folks would have something to look at to figure this stuff out in advance and there'd be no confusion about where the rules come from (and we really don't want to allow just any random citation as justification since there are conflicting rules).

(no action required, in any case, just wanted to follow-up)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of the statement is that it is a restrictive clause. The context of the sentence makes the HTTPS customization the critical information being conveyed here. Maybe that actually means the sentence should be something more like "You can customize the behavior of Agent with respect to HTTPS by using its optional second argument." and that's the edit I should have suggested rather than just the comma removal (since technically "Agent's constructor takes an optional second argument." makes perfect sense on its own).

My understanding of "restrictive" is that, while it is often colloquially described as "provides important information", what it actually means is that it provides essential information, i.e. information without which the meaning would be different. If you can provide the additional information in separate sentences, then the clause, while it might be important, is not restrictive.

To phrase this in terms of the things we're talking about here, in this case the clause would be restrictive:

Agent's constructor takes one argument which allows you to customize its behavior with respect to HTTPS.

but non-restrictive in this case:

Agent's constructor takes a second argument, which allows you to customize its behavior with respect to HTTPS.

since in the first case if you eliminate the "which" clause, the sentence could legitimately be interpreted as "Agent's constructor takes only one argument", whereas in the second case you could write it as:

Agent's constructor takes a second argument. It allows you to customize its behavior with respect to HTTPS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrt grammar citations, I've long thought we should officially adhere to a particular style guide. Then folks would have something to look at to figure this stuff out in advance and there'd be no confusion about where the rules come from (and we really don't want to allow just any random citation as justification since there are conflicting rules).

I definitely agree. The key thing that's stopped me from doing exactly that is that the two style guides I might want to adopt, http://www.apstylebook.com or http://www.chicagomanualofstyle.org/home.html, both require a site-license.

Is there a good freely-available style guide we could adopt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(no action required, in any case, just wanted to follow-up)

Yeah, definitely came across that way, but thanks for making this explicit. No action required for any of my responses either. :)


``Agent``'s constructor takes an optional second argument, which allows you to customize its behavior with respect to HTTPS.
The object passed here must implement the :api:`twisted.web.iweb.IPolicyForHTTPS` interface.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not "implement". "provide", yes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Good catch.




So, here's our example.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line doesn't add a lot (and the example doesn't start for two more sentences 😄 ).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken, deleted.

self._normalPolicy = BrowserLikePolicyForHTTPS()
def creatorForNetloc(self, hostname, port):
if hostname == b"wrong.host.badssl.com":
return optionsForClientTLS(u"badssl.com")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would self._normalPolicy.creatorForNetloc(u"badssl.com", port) work here? Are you making a point by using optionsForClientTLS instead?

What about the reverse question about the other case?

ie, can both branches call the same function and just vary the arguments? That might be clearer and you can remove one concept from the example of you make these consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was trying to make a point (illustrating that you can construct these objects yourself if you want to, pointing out the API that you might want to use is optionsForClientTLS to try to steer people away from building their own context factory) but you're right, fewer concepts is better. The TLS doc already exists and those concepts are explained there.

else:
return self._normalPolicy.creatorForNetloc(hostname, port)

@react
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, don't think I didn't notice this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you are not the biggest fan of decorator abuse :-) but in the case of twisted.internet.task.react, I actually think the argument-handling stuff is by far worse than useless (if you want to do any error-handling at all, you need to take *args and then pass it to a different function) and using it this way just encourages the user to think of it as "here's how you say where the main function is".

I'd actually like to do a react++ which is designed to be used this way, and only runs the function when it's defined in __main__, still allowing it to be imported, and maybe do something else with the command-line args. We should discuss that elsewhere though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, would use "react++" ;)


So, here's our example.
Using the very helpful ``badssl.com`` web API, we will construct a requst that fails to validate, because the certificate has the wrong hostname in it.
The following Python program should produce a verification error, when run as ``python example.py https://wrong.host.badssl.com/``.

.. code-block:: python
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make this a Python 2 code block or is it unspecified whether this is Python 2 or Python 3 code? Is it meant to be polyglot?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consulting the sphinx docs, apparently this specifically means "python 2": http://www.sphinx-doc.org/en/1.5.1/config.html#confval-highlight_language but my intention is that it be polyglot, because readers of the documentation might always be using either version.

@react
def main(reactor):
agent = Agent(reactor, OneHostnameWorkaroundPolicy())
requested = agent.request(b"GET", sys.argv[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked about Python 2 vs Python 3 because I wonder what type sys.argv[1] is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wince.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Thanks for pointing this out. Lucky for us, implicitly, the type of IAgent.request's second argument is actually "ASCII-only bytes", because IRIs officially aren't 8-bit clean.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… which further means that simply adding an .encode("ascii") to the end here manages to make it work with py2 & py3 simultaneously, thanks to str's confusion on py2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and testing the script manually on both just to make sure seems to work)

Now, invoking ``python example.py https://wrong.host.badssl.com/`` will happily give us a ``200`` status code; however, running it with ``https://expired.badssl.com/`` or ``https://self-signed.badssl.com/`` or any of the other error hostnames should still give an error.

Using this TLS policy mechanism, you can customize Agent to use any feature of TLS that Twisted has support for, including the examples given above; client certificates, alternate trust roots, and so on.
For a more detailed explanation of what options exist for client TLS configuration in Twisted, check out the documentation for the :api:`twisted.internet.ssl.optionsForClientTLS <optionsForClientTLS>` API and the :doc:`Using SSL in Twisted <../../core/howto/ssl>` chapter of this documentation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to standardize on a "Further Reading" section in the docs which contains tips like this one (providing users with the benefit of navigational simplicity in accessing the information).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might want to make that as a separate change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, in this case, I really want to mention optionsForClientTLS first, because there's a lot of ways to screw that up, and putting them in a sentence rather than a list makes the order more significant.

@glyph
Copy link
Member Author

glyph commented Jan 14, 2017

Hopefully all the buildbots are alive and happy now.

@codecov-io
Copy link

Current coverage is 91.16% (diff: 100%)

Merging #646 into trunk will decrease coverage by 0.05%

@@              trunk       #646   diff @@
==========================================
  Files           838        838          
  Lines        146956     146956          
  Methods           0          0          
  Messages          0          0          
  Branches      12984      12984          
==========================================
- Hits         134059     133979    -80   
- Misses        10665      10739    +74   
- Partials       2232       2238     +6   

Powered by Codecov. Last update 80ce0cc...b341886

@glyph glyph merged commit 87db6da into trunk Jan 14, 2017
@glyph glyph deleted the doc-htt-client-ssl-valid-7879 branch January 14, 2017 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants