New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
contrib.appengine.AppEngineManager #664
Conversation
5d24451
to
45394b5
Compare
|
||
try: | ||
from urllib.parse import urlencode | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except ImportError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@Lukasa @sigmavirus24 Would love your thoughts on some of the open questions inline:
Hm, I've been thinking about this. I think it may make more sense to have the AppEngine URLFetch driver be at the Manager-level (see poolmanager.py) rather than ConnectionPool-level because it doesn't make sense for our PoolManager/ProxyManager/etc to manage pools of GAEConnectionPool, right? This should make things a fair bit more straight-forward. Any other opinions/ideas?
Ideally there shouldn't be too many duplicate tests. Any overlapping functionality should be inherited (and thus won't need to be exercised separately) whereas any new functionality should be tested explicitly, right?
This is a good question, and it'll depend if we go with the ConnectionPool-level or Manager-level. The other question is: How automatic/implicit vs explicit should this be?
Honestly I was hoping that's something you could inform us about. We're not too sure how URLFetch differs, as we don't use it ourselves. :/ Another thought: It may be better to start this off as a urllib3/contrib/ submodule. Our standards for those are somewhat lower, so it'll be easier to get this work merged and start to be used by the community while we figure out how we want to integrate it into the core API. |
@jonparrott Btw, thank you again for volunteering to tackle this. We very much appreciate your time and effort. :) |
The downside of this is that GAE no longer becomes transparent to users. The ideal case would be that libraries that use urllib3 just magically work on GAE, and I don't think that'd be true if the AppEngine driver was a replacement for PoolManager instead of a replacement for ConnectionPool. Though I'm a little concern that I missed what you were aiming for here.
My vote is for automatic and implicit. The reason to do explicit is to force users to know that something different happens on GAE, but what we're doing there is pushing responsibility for testing and debugging those differences onto our users, when we should be the ones bearing that burden. |
Cool, I'll inherit from the other test and null out the cases that aren't applicable.
I can help, absolutely, but as I mentioned before I'm not intimately familiar with how URLFetch works under the covers (though I guess I could look). Most importantly, I'd like to know what things you guys are already aware of.
Fine with me.
I'm cool with it either way, it seems the implementation is basically the same (urlopen). Let me know what you guys decide.
No worries, I'm here to help make GCP a great place for Python developers.
We should be aware of what we're doing to users if it's automatic and implicit, and the user should always be able to switch. Here's a comparison of App Engine's environments and the caveats I'm aware of:
So, we need to figure out what the best implicit behavior is for each environment (I have code that can reliably detect this) and how to give users the ability to change it. |
@Lukasa Why not? We could swap out PoolManager with GAEPoolManager at runtime same way we swap in VerifiedHTTPSConnectionPool. Or maybe we need to finally add another layer above PoolManager which would do this implicitly (we've been talking about adding a Session object? Or another name?). The silly thing is that ConnectionPool objects are there to manage pools of connections (via our queue collections), which URLFetch does not have (those are managed behind the scenes by URLFetch). In effect, URLFetch is an interface to a ConnectionPool-type object. Then the PoolManager would make an instance of these ConnectionPool managers per host, which would really add nothing in terms of functionality—runtime complexity and stack depth (and possibly GAE-incompatible bugs therein). Plus, it's not like any of our Manager-features would work with URLFetch without more interception too. For example, ProxyManager wouldn't work. |
@shazow I've moved the pool into the contrib directory. |
I suppose, but...ewww. It feels really weird to do that at runtime if we can possibly avoid it, though I suppose it's an acceptable approach. Basically, I don't like it, but I can't articulate a good reason. Consider me -0 on that idea. =P |
Actually, no, I think that might be fine. If it were me I'd probably take the stack depth hit, but if that's off the table, the 'patch at runtime' approach is fine. |
@Lukasa Please do me a favour: sleep on it and try to articulate why you feel negative about it, I'm curious. :) Either way it's going to be a patch at runtime, right? Whether we swap the ConnectionPool or the PoolManager. Only difference is that a ConnectionPool of ConnectionPools is an extra unnecessary layer of abstraction, as far as I can tell. There's no benefit, and some potential surprises (e.g. ProxyManager not working with it). I'd really like to get somekind of consensus on this as I think this decision will inform how we move forward in our future designs. I still suspect that adding another layer above PoolManager which automagically manages the correct things might be the best and least-surprising option. |
It's your library, but adding another layer of complexity just to support GAE's sandbox doesn't sound fun to me. If there's other use cases beyond just supporting GAE for this then I'm cool with it. |
Yea it's something we've talked about before. urllib3 in general is largely about being very manual/explicit, but more and more we move towards autodetecting/autoconfiguring which creates a schism. Putting all that magic in one place might help keep things more sane: Things like setting up good TLS defaults (use certifi if available), proxy settings (load from .netrc/env vars), platform-specific injections (like AppEngine, or PyOpenSSL), maybe someday cookie handling (ugh). |
Gotcha, sounds good to me. |
AFAIK URLfetch does not support chunked encoding, so when we use it implicitly streaming APIs will not work anymore. |
status=urlfetch_resp.status_code | ||
) | ||
|
||
def _absolute_url(self, path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better move the definition from HTTPConnectionPool
to ConnectionPool
and use it from there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will hold off on this until the decision of pool vs manager is settled.
Are you serious? From RFC 7230 Section 4.1:
The only way URLFetch gets around that requirement is by advertising themselves as a HTTP/1.0 client. Do they do that? |
**response_kw): | ||
|
||
retries = self._get_retries(retries, redirect) | ||
log.info(retries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to keep logging messages consistent with other urlopens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debugging statement I accidentally left in, fixed.
Looks sensible, just need to cleanup some debugging code and ideally match up logging messages with other pools. Could use another pair of eyes, though. (@Lukasa @t-8ch @sigmavirus24?) |
Removed the debugging stuff (that's what I get for trying to get this pushed before I left yesterday). |
@Lukasa Can you give this a quick look-over? I want to include this in the v1.11 early next week. @jonparrott Would you be up for adding a bit of docs for how to use this module? It can live here for now: https://github.com/shazow/urllib3/blob/master/docs/contrib.rst -> https://urllib3.readthedocs.org/en/latest/contrib.html Something along the lines of... from urllib3 import PoolManager
from urllib3.contrib.appengine import AppEngineManager, is_appengine
# This substitution will be done automagically once appengine code
# graduates from the contrib module.
if is_appengine():
# AppEngineManager uses AppEngine's URLFetch API behind the scenes
http = AppEngineManager()
else:
# PoolManager uses a socket-level API behind the scenes
http = PoolManager()
# The client API should be consistent across managers, though some features are not available
# in URLFetch and you'll get warnings when you try to use them (like granular timeouts).
r = http.request('GET', 'https://google.com/') Overall and example and little blurb for detecting which AppEngine environment you're in, a recipe for swapping out the respective manager, and that this will happen automagically as soon as AppEngineManager graduates from contrib-land. :) |
Connection manager for Google App Engine sandbox applications. | ||
|
||
This manager uses the URLFetch service directly instead of using the | ||
emulated httplib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a list of features that AppEngineManager will throw PlatformError/Warnings for in the docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@shazow I'm certainly happy to have this in contrib land for the next release. |
7ce31ae
to
28adf54
Compare
contrib: App Engine Connection Pool
@jonparrott This is fantastic and invaluable. Thank you so much for helping us out here, I'm very excited about this and the urllib3/requests community really appreciates your contribution but even moreso: I'm sure all the AppEngine users will be just ecstatic! Make sure to link to this thread in your promo packet, you really deserve some recognition for your work. :) ❤️ 🍮 ✨ |
Happy to help. I'm sure this will have bugs and fun edge cases, so please mention me on any bugs that get filed on this. |
@jonparrott Thanks so much for responding when I sent up the Bat-Signal! |
This is currently very rough and early. I want to make sure I'm on the right track before continuing.
Outstanding questions:
[shazow edit: Fixes #664]