Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve client "hello world" in README #4272

Closed
ksamuel opened this issue Oct 26, 2019 · 8 comments · Fixed by #4329
Closed

Improve client "hello world" in README #4272

ksamuel opened this issue Oct 26, 2019 · 8 comments · Fixed by #4329

Comments

@ksamuel
Copy link
Contributor

ksamuel commented Oct 26, 2019

When looking at aiohttp README, the hello word looks like that:

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        html = await fetch(session, 'http://python.org')
        print(html)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

For requests, it looks like this:

>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'
>>> r.json
{u'private_gists': 419, u'total_private_repos': 77, ...}

I can see three issues with the way aiohttp present itself:

  • The snippet seems to have unnecessary complexity
  • The snippet doesn't feature any quick scripting setup
  • There is no explanation for the necessary complexity

Expected behaviour

First, extra complexity should be removed. I understand the benefit of if __name__ == "__main__", but in a hello word type of code, it's noise that makes the code seems bigger than necessary. If you have the skill to learn async, you know when to add it later.

Also, can the code go from:

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        html = await fetch(session, 'http://python.org')
        print(html)

To:

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('http://python.org') as response:
            return await response.text()

?

I'm guessing the first form exhibit some kind of best practice I'm missing but:

  • since I'm missing it, I have no way to understand why it's best, and won't use it in the future
  • it makes the code feels heavy, and hide the expressiveness of aiohttop behind the verbosity of the coroutine system
  • the first time you grab for aiohttp, you probably just want to make a quick script out of it, not build a system with best practices

And lastly, even this is weird:

async def main():
    async with aiohttp.ClientSession() as session: # why session ?
        async with session.get('http://python.org') as response: # why can't I do that in one line ?
            return await response.text()

This is necessary complexity. It's done for a reason, but coming from requests, it's strange and seems uncalled for. The README should acknowledge that, then offer a link leading to and explanation of why this is necessary for an async lib, and the consequences of it. It there are shortcuts to that syntax for quick scripts with less performance/best practice needs, the README should mention THAT first, give a quick warning, then follow with the best practice.

aiohttp is the requests of async as for now. It's the first face a newcommer see when discovering the async world nowaday (well, after playing with asyncio.sleep in tutorials).

@asvetlov
Copy link
Member

Thanks for raising an interesting question.

Making the best 'hello world' example is very important, sure.

I don't know how to achieve it; please let me add some comments. Consider it as an invite to the discussion, not a strong objection for your proposal.

async def main():
    async with aiohttp.ClientSession() as session: # why session ?
        async with session.get('http://python.org') as response: # why can't I do that in one line ?
            return await response.text()

why session?

How do you want to name it? The client also sounds good. Why is session needed at all? To control resources' lifetime. In sync request's world, everything is released on __del__, mostly implicitly. In asyncio it should be async operation, e.g. await resource.close(). async with provides better and more idiomatic approach than try/finally with await close() in finally block.

The same for async with session.get() as resp: the response object should be closed as well.

Why requests don't use the session concept? It does (see requests.Session class). The problem with implicit sessions is that the connection is recreated every time; it leads to performance lose if people use just resp = requests.get(url) because the lack of HTTP keep-alive.
I saw it very many times.

Splitting the example into 2 functions emphasizes that the session can be created once and reused by many fetches. Unfortunately, it doesn't work well; as I see newbies recreate sessions very often in their code :(

We had top-level get() / post() etc functions like requests have but these functions was designed bad. A backward-compatible fix was impossible, that's why these functions were removed. In aiohttp 4.x we can resurrect these functions with the following usage:

async with aiohttp.get(url) as resp:
    print(await resp.text())

I still not sure if we should do this: any real code should use ClientSession for better controls of resources; shortcut is good for 'hello world' only.
But the shortcut notation is too tempting and short; I expect that people will start using it everywhere without careful thinking.

Should we introduce a way that saves one line with 100% performance lost? I'm not sure.

What do you think?

@ksamuel
Copy link
Contributor Author

ksamuel commented Oct 26, 2019

That's the most open minded answer I received in a while :)

For the README part, here what it looks like when you simplify the snippet and add a quick explanation:

Getting Started
===============

Client example::

    import aiohttp
    import asyncio

    async def main():
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                print(await response.text())

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

Coming from requests or urllib ? Check out why your need all this with our explanations on `steps of an asynchronous HTTP request<LINK HERE>`_.

This assumes there is a documentation page named "steps of an asynchronous HTTP request", of course.

Now, about the simple API, it's a more controversial issue. Please consider it independently of my README proposal. Maybe I should open a new ticket ?

My take on it is that Python does a lot of things this way: easy first, most performant possible later if you want.

So I would definitely add some "quick and dirty" shortcut, built on top of the clean aiohttp recommended way. Something like:

def get(*args, **kwargs):
    # PreloadedClientResponse.text() returns self.body.decode(), not an awaitable
    async with aiohttp.ClientSession(response_class=PreloadedClientResponse) as session:
        async with session.get(*args, **kwargs) as response:
            await response.read()
            return response

So that you can do:

response = await aiohttp.get(url) # call .read() internally
print(response.text())

Of course, you need a warning below that with a link to the best practices.

I know it seems dangerous, but let's consider that:

  • even inefficient aiohttp is faster than any sync alternative
  • in simple scripts this is enough
  • people needing the perf will read the doc eventually, and find the article on best practices. Migrating their code to use Session after the fact is not hard, thank to your very good design (love reading the aiohttp source code BTW, it's a pleasure).
  • bad practices didn't prevent requests to power most HTTP related Python during the last decades.
  • having to write this part of the doc would be nice anyway. When you look at the session object, you wonder when and where to create a session, for how long, when to destroy it, how to pass it around, how it plays out with contextvar, what are the consequences of having several sessions opened at the same time, should the same session be used for different domain names, etc. Having a definitive answer to this is great.

If you choose to go that road, then the README could look like this:

Getting Started
===============

 Simplest client example::

    import aiohttp
    import asyncio

    async def main():
        response = await aiohttp.get(url)  
        print(response.text())

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main()) 

Please note **this syntax is intended for simple scripts only**. To get the best out of an aiohttp client, you should use `proper session management<LINK HERE>`_.

If this is something you are interested in, I'm ok with working on a PR under your guidance.

@asvetlov
Copy link
Member

Agree, but I prefer to make aiohttp.get() and family async context manager that closes all resources: both client session and response at the exit from async with block.

Another thing can be the replacement of get_event_loop()/run_until_complete() with asyncio.run(). Python 3.7+ is enough widespread nowdays I guess.

@ksamuel
Copy link
Contributor Author

ksamuel commented Oct 28, 2019 via email

@asvetlov
Copy link
Member

Let's start from chatting.
Does google talk or gitter work for you?

@ksamuel
Copy link
Contributor Author

ksamuel commented Oct 29, 2019

I just created a gitter account for that purpose and pinged you on it. I'm available this morning up to 11am. Same with tomorrow. Let me know when you have a moment.

@webknjaz
Copy link
Member

@ksamuel FWIW it's usually important to mention the timezone on the Internet :)

@ksamuel
Copy link
Contributor Author

ksamuel commented Oct 30, 2019

Indeed. I realized my mistake a few moments later and sent the timezone, no worry.

@helpr helpr bot added the pr-available label Nov 8, 2019
@helpr helpr bot added pr-merged and removed pr-available labels Nov 12, 2019
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Oct 24, 2020
This fixes py-yarl in pkgsrc being too new for py-aiohttp.


3.7.0 (2020-10-24)
==================

Features
--------

- Response headers are now prepared prior to running ``on_response_prepare`` hooks, directly before headers are sent to the client.
  `#1958 <https://github.com/aio-libs/aiohttp/issues/1958>`_
- Add a ``quote_cookie`` option to ``CookieJar``, a way to skip quotation wrapping of cookies containing special characters.
  `#2571 <https://github.com/aio-libs/aiohttp/issues/2571>`_
- Call ``AccessLogger.log`` with the current exception available from ``sys.exc_info()``.
  `#3557 <https://github.com/aio-libs/aiohttp/issues/3557>`_
- `web.UrlDispatcher.add_routes` and `web.Application.add_routes` return a list
  of registered `AbstractRoute` instances. `AbstractRouteDef.register` (and all
  subclasses) return a list of registered resources registered resource.
  `#3866 <https://github.com/aio-libs/aiohttp/issues/3866>`_
- Added properties of default ClientSession params to ClientSession class so it is available for introspection
  `#3882 <https://github.com/aio-libs/aiohttp/issues/3882>`_
- Don't cancel web handler on peer disconnection, raise `OSError` on reading/writing instead.
  `#4080 <https://github.com/aio-libs/aiohttp/issues/4080>`_
- Implement BaseRequest.get_extra_info() to access a protocol transports' extra info.
  `#4189 <https://github.com/aio-libs/aiohttp/issues/4189>`_
- Added `ClientSession.timeout` property.
  `#4191 <https://github.com/aio-libs/aiohttp/issues/4191>`_
- allow use of SameSite in cookies.
  `#4224 <https://github.com/aio-libs/aiohttp/issues/4224>`_
- Use ``loop.sendfile()`` instead of custom implementation if available.
  `#4269 <https://github.com/aio-libs/aiohttp/issues/4269>`_
- Apply SO_REUSEADDR to test server's socket.
  `#4393 <https://github.com/aio-libs/aiohttp/issues/4393>`_
- Use .raw_host instead of slower .host in client API
  `#4402 <https://github.com/aio-libs/aiohttp/issues/4402>`_
- Allow configuring the buffer size of input stream by passing ``read_bufsize`` argument.
  `#4453 <https://github.com/aio-libs/aiohttp/issues/4453>`_
- Pass tests on Python 3.8 for Windows.
  `#4513 <https://github.com/aio-libs/aiohttp/issues/4513>`_
- Add `method` and `url` attributes to `TraceRequestChunkSentParams` and `TraceResponseChunkReceivedParams`.
  `#4674 <https://github.com/aio-libs/aiohttp/issues/4674>`_
- Add ClientResponse.ok property for checking status code under 400.
  `#4711 <https://github.com/aio-libs/aiohttp/issues/4711>`_
- Don't ceil timeouts that are smaller than 5 seconds.
  `#4850 <https://github.com/aio-libs/aiohttp/issues/4850>`_
- TCPSite now listens by default on all interfaces instead of just IPv4 when `None` is passed in as the host.
  `#4894 <https://github.com/aio-libs/aiohttp/issues/4894>`_
- Bump ``http_parser`` to 2.9.4
  `#5070 <https://github.com/aio-libs/aiohttp/issues/5070>`_


Bugfixes
--------

- Fix keepalive connections not being closed in time
  `#3296 <https://github.com/aio-libs/aiohttp/issues/3296>`_
- Fix failed websocket handshake leaving connection hanging.
  `#3380 <https://github.com/aio-libs/aiohttp/issues/3380>`_
- Fix tasks cancellation order on exit. The run_app task needs to be cancelled first for cleanup hooks to run with all tasks intact.
  `#3805 <https://github.com/aio-libs/aiohttp/issues/3805>`_
- Don't start heartbeat until _writer is set
  `#4062 <https://github.com/aio-libs/aiohttp/issues/4062>`_
- Fix handling of multipart file uploads without a content type.
  `#4089 <https://github.com/aio-libs/aiohttp/issues/4089>`_
- Preserve view handler function attributes across middlewares
  `#4174 <https://github.com/aio-libs/aiohttp/issues/4174>`_
- Fix the string representation of ``ServerDisconnectedError``.
  `#4175 <https://github.com/aio-libs/aiohttp/issues/4175>`_
- Raising RuntimeError when trying to get encoding from not read body
  `#4214 <https://github.com/aio-libs/aiohttp/issues/4214>`_
- Remove warning messages from noop.
  `#4282 <https://github.com/aio-libs/aiohttp/issues/4282>`_
- Raise ClientPayloadError if FormData re-processed.
  `#4345 <https://github.com/aio-libs/aiohttp/issues/4345>`_
- Fix a warning about unfinished task in ``web_protocol.py``
  `#4408 <https://github.com/aio-libs/aiohttp/issues/4408>`_
- Fixed 'deflate' compression. According to RFC 2616 now.
  `#4506 <https://github.com/aio-libs/aiohttp/issues/4506>`_
- Fixed OverflowError on platforms with 32-bit time_t
  `#4515 <https://github.com/aio-libs/aiohttp/issues/4515>`_
- Fixed request.body_exists returns wrong value for methods without body.
  `#4528 <https://github.com/aio-libs/aiohttp/issues/4528>`_
- Fix connecting to link-local IPv6 addresses.
  `#4554 <https://github.com/aio-libs/aiohttp/issues/4554>`_
- Fix a problem with connection waiters that are never awaited.
  `#4562 <https://github.com/aio-libs/aiohttp/issues/4562>`_
- Always make sure transport is not closing before reuse a connection.

  Reuse a protocol based on keepalive in headers is unreliable.
  For example, uWSGI will not support keepalive even it serves a
  HTTP 1.1 request, except explicitly configure uWSGI with a
  ``--http-keepalive`` option.

  Servers designed like uWSGI could cause aiohttp intermittently
  raise a ConnectionResetException when the protocol poll runs
  out and some protocol is reused.
  `#4587 <https://github.com/aio-libs/aiohttp/issues/4587>`_
- Handle the last CRLF correctly even if it is received via separate TCP segment.
  `#4630 <https://github.com/aio-libs/aiohttp/issues/4630>`_
- Fix the register_resource function to validate route name before splitting it so that route name can include python keywords.
  `#4691 <https://github.com/aio-libs/aiohttp/issues/4691>`_
- Improve typing annotations for ``web.Request``, ``aiohttp.ClientResponse`` and
  ``multipart`` module.
  `#4736 <https://github.com/aio-libs/aiohttp/issues/4736>`_
- Fix resolver task is not awaited when connector is cancelled
  `#4795 <https://github.com/aio-libs/aiohttp/issues/4795>`_
- Fix a bug "Aiohttp doesn't return any error on invalid request methods"
  `#4798 <https://github.com/aio-libs/aiohttp/issues/4798>`_
- Fix HEAD requests for static content.
  `#4809 <https://github.com/aio-libs/aiohttp/issues/4809>`_
- Fix incorrect size calculation for memoryview
  `#4890 <https://github.com/aio-libs/aiohttp/issues/4890>`_
- Add HTTPMove to _all__.
  `#4897 <https://github.com/aio-libs/aiohttp/issues/4897>`_
- Fixed the type annotations in the ``tracing`` module.
  `#4912 <https://github.com/aio-libs/aiohttp/issues/4912>`_
- Fix typing for multipart ``__aiter__``.
  `#4931 <https://github.com/aio-libs/aiohttp/issues/4931>`_
- Fix for race condition on connections in BaseConnector that leads to exceeding the connection limit.
  `#4936 <https://github.com/aio-libs/aiohttp/issues/4936>`_
- Add forced UTF-8 encoding for ``application/rdap+json`` responses.
  `#4938 <https://github.com/aio-libs/aiohttp/issues/4938>`_
- Fix inconsistency between Python and C http request parsers in parsing pct-encoded URL.
  `#4972 <https://github.com/aio-libs/aiohttp/issues/4972>`_
- Fix connection closing issue in HEAD request.
  `#5012 <https://github.com/aio-libs/aiohttp/issues/5012>`_
- Fix type hint on BaseRunner.addresses (from ``List[str]`` to ``List[Any]``)
  `#5086 <https://github.com/aio-libs/aiohttp/issues/5086>`_
- Make `web.run_app()` more responsive to Ctrl+C on Windows for Python < 3.8. It slightly
  increases CPU load as a side effect.
  `#5098 <https://github.com/aio-libs/aiohttp/issues/5098>`_


Improved Documentation
----------------------

- Fix example code in client quick-start
  `#3376 <https://github.com/aio-libs/aiohttp/issues/3376>`_
- Updated the docs so there is no contradiction in ``ttl_dns_cache`` default value
  `#3512 <https://github.com/aio-libs/aiohttp/issues/3512>`_
- Add 'Deploy with SSL' to docs.
  `#4201 <https://github.com/aio-libs/aiohttp/issues/4201>`_
- Change typing of the secure argument on StreamResponse.set_cookie from ``Optional[str]`` to ``Optional[bool]``
  `#4204 <https://github.com/aio-libs/aiohttp/issues/4204>`_
- Changes ``ttl_dns_cache`` type from int to Optional[int].
  `#4270 <https://github.com/aio-libs/aiohttp/issues/4270>`_
- Simplify README hello word example and add a documentation page for people coming from requests.
  `#4272 <https://github.com/aio-libs/aiohttp/issues/4272>`_
- Improve some code examples in the documentation involving websockets and starting a simple HTTP site with an AppRunner.
  `#4285 <https://github.com/aio-libs/aiohttp/issues/4285>`_
- Fix typo in code example in Multipart docs
  `#4312 <https://github.com/aio-libs/aiohttp/issues/4312>`_
- Fix code example in Multipart section.
  `#4314 <https://github.com/aio-libs/aiohttp/issues/4314>`_
- Update contributing guide so new contributors read the most recent version of that guide. Update command used to create test coverage reporting.
  `#4810 <https://github.com/aio-libs/aiohttp/issues/4810>`_
- Spelling: Change "canonize" to "canonicalize".
  `#4986 <https://github.com/aio-libs/aiohttp/issues/4986>`_
- Add ``aiohttp-sse-client`` library to third party usage list.
  `#5084 <https://github.com/aio-libs/aiohttp/issues/5084>`_


Misc
----

- `#2856 <https://github.com/aio-libs/aiohttp/issues/2856>`_, `#4218 <https://github.com/aio-libs/aiohttp/issues/4218>`_, `#4250 <https://github.com/aio-libs/aiohttp/issues/4250>`_
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants