
Add tracking signals for getting request/response bodies. #2767

Merged

Conversation

kowalski (Contributor)

This is the PR for issue #2755.

@codecov-io

codecov-io commented Feb 26, 2018

Codecov Report

Merging #2767 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2767      +/-   ##
==========================================
+ Coverage   97.98%   97.99%   +<.01%     
==========================================
  Files          39       39              
  Lines        7347     7377      +30     
  Branches     1289     1296       +7     
==========================================
+ Hits         7199     7229      +30     
  Misses         47       47              
  Partials      101      101
Impacted Files Coverage Δ
aiohttp/client_reqrep.py 97.44% <100%> (+0.05%) ⬆️
aiohttp/tracing.py 100% <100%> (ø) ⬆️
aiohttp/http_writer.py 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5382822...d7c995a. Read the comment docs.

@asvetlov (Member) left a comment:

I like the idea but the implementation needs some polishing.

"""Writes chunk of data to a stream.

write_eof() indicates end of stream.
writer can't be used after write_eof() method being called.
write() return drain future.
"""
self.on_chunk_sent.freeze()
Member:

I'm pretty sure there is a better place for signal freezing than the first write attempt.

Contributor Author:

Actually, I was quite surprised by the whole concept of freeze(). It's very tricky! Do you know why it's necessary at all?
The only reason I can think of is that the author wanted to prevent modifications to the list of handlers while send() is being processed.

Still, it's possible that we will decide to remove this Signal altogether, but if we don't, what do you think is the correct place to call freeze()?

  • I can't do this in __init__ because I want to add handlers later
  • I can't delegate this responsibility to the user of the class, because this introduces a breaking change where code that currently works, like:
    writer = StreamWriter(...)
    await writer.write(...)
    
    will start raising RuntimeError (see the sketch below).
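
For reference, a minimal runnable sketch of the freeze() semantics under discussion, using a simplified stand-in for aiohttp's Signal (the MiniSignal class below is an illustrative assumption, not the real implementation):

    import asyncio

    class MiniSignal(list):
        """Stand-in signal: handlers may only be added before freeze(),
        and send() may only be called after freeze()."""

        def __init__(self):
            super().__init__()
            self.frozen = False

        def freeze(self):
            self.frozen = True

        def append(self, handler):
            if self.frozen:
                raise RuntimeError("cannot add a handler to a frozen signal")
            super().append(handler)

        async def send(self, *args, **kwargs):
            if not self.frozen:
                raise RuntimeError("cannot send on a non-frozen signal")
            for handler in self:
                await handler(*args, **kwargs)

    # If freeze() were left to the caller, existing code that never calls it
    # would hit the RuntimeError in send() on its very first write.
    signal = MiniSignal()
    try:
        asyncio.get_event_loop().run_until_complete(signal.send(b"chunk"))
    except RuntimeError as exc:
        print(exc)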

@@ -56,13 +59,18 @@ def _write(self, chunk):
raise asyncio.CancelledError('Cannot write to closing transport')
self._transport.write(chunk)

def write(self, chunk, *, drain=True, LIMIT=64*1024):
def write(self, chunk, *, drain=True, LIMIT=64 * 1024):
Member:

Don't touch it if you don't change the logic.
If you want to update the code style -- do it in a separate PR.
For this particular line I'm pretty happy with the status quo.

Contributor Author:

That wasn't me, it's my emacs ;) I will revert this change.

Contributor Author:

Fixed

Contributor Author:

Also I fixed setup.cfg so that this line is not altered by anyone else who uses autopep8

Member:

Cool

"""Writes chunk of data to a stream.

write_eof() indicates end of stream.
writer can't be used after write_eof() method being called.
write() return drain future.
"""
self.on_chunk_sent.freeze()
self.loop.create_task(
Member:

Spawning a new task is not what we need.
Technically .write is a coroutine; we need to convert it into a genuine async function.
Maybe I'll do it very soon in a separate PR.

Contributor Author:

Are you sure we don't want this in another task? Whatever the handler does asynchronously will slow down writing the response.

Also, .write isn't effectively a coroutine, because it's currently called like:

                for chunk in self.body:
                    writer.write(chunk)

(without await), so the coroutine returned is never actually run.
I guess I would have to track down all the places where StreamWriter.write is used and add await there. That's not undoable, but it's a rather big refactor.

Member:

Yes, I'm sure. Spawning a task without waiting for the result has an ugly smell.
More importantly, spawning a task may reorder the received signals, e.g. the user may get a request chunk after the request has finished.

.write should be an async method. I'm working on this refactoring (required for #2698 anyway).
I suggest waiting until my work is done.
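
A hedged illustration of the ordering concern (a standalone script, not aiohttp code): a handler fired via create_task() may run after later events, while awaiting it inside an async write() keeps the order deterministic.

    import asyncio

    async def handler(event):
        print("handler:", event)

    async def write_with_task(chunk):
        # fire-and-forget: the handler may run after later events
        asyncio.get_event_loop().create_task(handler("chunk sent: %r" % chunk))

    async def write_awaited(chunk):
        # awaited: the handler always runs before write() returns
        await handler("chunk sent: %r" % chunk)

    async def main():
        await write_with_task(b"a")
        print("request finished")   # prints before the chunk handler runs
        await asyncio.sleep(0)      # give the pending task a chance to run

        await write_awaited(b"b")
        print("request finished")   # always prints after the chunk handler

    asyncio.get_event_loop().run_until_complete(main())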

Contributor Author:

Got it, I will wait until the refactoring is merged.

@@ -573,6 +587,9 @@ def __init__(self, method, url, *,
self._auto_decompress = auto_decompress
self._cache = {} # reqired for @reify method decorator

# avoid circular reference so that __del__ works
self.on_chunk_received = Signal(owner=None)
Member:

Not sure if we need a signal here.
It looks like overuse of the signal concept: you are creating an on_chunk_received signal whose only user is aiohttp internals itself. It is not very efficient; the signal subscription is literally a waste of time.
I suggest passing traces directly to the response object.

Contributor Author:

Here the approach with passing traces will actually work quite well. I introduced the signal to make it symmetrical with the situation in the request. I will update this to pass traces and remove the signal.
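
A hedged sketch (not the actual aiohttp code) of the "pass traces directly" approach on the response side; the class and method names below are illustrative assumptions:

    class ResponseSketch:
        """Response-side stand-in: notify traces directly, no internal Signal."""

        def __init__(self, *, traces=None):
            # an empty list keeps the hot path cheap when tracing is disabled
            self._traces = [] if traces is None else traces

        async def _notify_chunk_received(self, chunk):
            for trace in self._traces:
                await trace.send_response_chunk_received(chunk)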

@@ -31,6 +32,8 @@ def __init__(self, protocol, transport, loop):
self._compress = None
self._drain_waiter = None

self.on_chunk_sent = Signal(self)
Member:

Again pass traces instead of creating a new internal-only signal.

Contributor Author:

Well, please consider that StreamWriter is used by both the HTTP client and the server. If I pass the traces in __init__() I will have to write something like:

async def write(self, chunk):
    for trace in self.traces:
        await trace.send_request_chunk_sent(...)
    ...

Now if you consider that code in the context of usage by the HTTP server, it clearly stops making sense.
This is why I introduced a signal and named it on_chunk_sent, without specifying whether it's an HTTP request being sent (by a client) or an HTTP response (by a server).

Member:

Let's avoid signals for internal-only usage.
Here we have two options: pass a callback for sending signals, or derive a ClientStreamWriter class. Personally I prefer the second approach.

Contributor Author:

I will remove the Signal in favor of accepting an optional callable to be used as a callback.
I prefer not to create a child class for this -- too many layers of inheritance hurt code readability.

Member:

Ok
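
For reference, a hedged sketch of the agreed callback approach using simplified stand-ins rather than the real aiohttp classes: the writer accepts an optional coroutine callback and awaits it per chunk, so the client can wire it to its traces while the server passes nothing.

    class WriterSketch:
        def __init__(self, transport, *, on_chunk_sent=None):
            self._transport = transport
            self._on_chunk_sent = on_chunk_sent   # stays None on the server side

        async def write(self, chunk):
            if self._on_chunk_sent is not None:   # skip the await entirely when no callback was given
                await self._on_chunk_sent(chunk)
            self._transport.write(chunk)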

@asvetlov (Member):

Documentation should be updated as well

@asvetlov asvetlov mentioned this pull request Feb 27, 2018
@asvetlov (Member):

@kowalski #2774 is done, write() is async now

@kowalski kowalski force-pushed the feature/add-signals-for-reqres-chunks branch from e093c00 to 24e1db9 Compare February 27, 2018 13:24
@kowalski (Contributor Author):

@asvetlov all fixes done, including adding documentation.
I'm not squashing commits so that you can see what changed. I can squash them after review to keep the history clean.

@asvetlov (Member) left a comment:

Looks good, but please fix a couple of notes.

@@ -475,7 +477,14 @@ def keep_alive(self):
if self.url.raw_query_string:
path += '?' + self.url.raw_query_string

writer = StreamWriter(conn.protocol, conn.transport, self.loop)
async def on_chunk_sent(chunk):
Member:

Convert the function into a ClientRequest method. There is no need to create a nested function on every HTTP request.

Contributor Author:

done

@@ -62,6 +70,9 @@ def _write(self, chunk):
writer can't be used after write_eof() method being called.
write() return drain future.
"""
if self._on_chunk_sent:
Member:

Please check self._on_chunk_sent is not None instead. It is more idiomatic.

Contributor Author:

changed

'TraceDnsResolveHostStartParams', 'TraceDnsResolveHostEndParams',
'TraceDnsCacheHitParams', 'TraceDnsCacheMissParams',
'TraceRequestRedirectParams'
'TraceConfig', 'TraceRequestStartParams',
Member:

Do you really need to touch all these lines?
Appending the two new classes to the end of the sequence should be enough.

Contributor Author:

Ok, reverted and added the new classes at the end.

@@ -147,6 +147,20 @@ TraceConfig

``params`` is :class:`aiohttp.TraceRequestStartParams` instance.

.. attribute:: on_request_chunk_sent
Member:

Please also update the flow diagrams at the beginning of the file.

Contributor Author:

done. Here I added the new signals to the description. Please let me know if you meant something more elaborate.
[screenshot: zrzut ekranu z 2018-02-27 15-04-09]

@asvetlov (Member):

Not squashing is fine, I'll do it on merge anyway. Well, the GitHub button will do it, actually.

@kowalski kowalski force-pushed the feature/add-signals-for-reqres-chunks branch from c6a1ebd to 6e3819f Compare February 27, 2018 14:02
@kxepal (Member) left a comment:

A few more notes.

@@ -168,7 +168,8 @@ def __init__(self, method, url, *,
proxy=None, proxy_auth=None,
timer=None, session=None, auto_decompress=True,
ssl=None,
proxy_headers=None):
proxy_headers=None,
traces=[]):
Member:

Let's not use mutables as defaults. These may cause some nasty bugs we could easily avoid.

Contributor Author:

true, changed to None
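
For context, a short self-contained illustration of why the mutable default bites and the None idiom applied in the fix (the function names are purely illustrative):

    def bad(traces=[]):          # the same list object is reused on every call
        traces.append("trace")
        return traces

    bad()
    print(bad())                 # ['trace', 'trace'] -- state leaks between calls

    def good(traces=None):       # the idiom used in the fix
        traces = [] if traces is None else traces
        traces.append("trace")
        return traces

    good()
    print(good())                # ['trace'] every time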

@@ -572,6 +583,7 @@ def __init__(self, method, url, *,
self._timer = timer if timer is not None else TimerNoop()
self._auto_decompress = auto_decompress
self._cache = {} # reqired for @reify method decorator
self._traces = traces
Member:

Why is traces a public attribute in ClientRequest while in ClientResponse it's private?

Member:

I can answer: ClientResponse is a public class visible to the user, so the attribute should be private.
About the request I don't care.

Member:

Ok, fine for me.

@kowalski (Contributor Author) commented Feb 27, 2018:

Agreed. I have now switched to _traces in both classes.

@@ -555,7 +565,8 @@ class ClientResponse(HeadersMixin):

def __init__(self, method, url, *,
writer=None, continue100=None, timer=None,
request_info=None, auto_decompress=True):
request_info=None, auto_decompress=True,
traces=[]):
Member:

Same defaults story.

Contributor Author:

also changed to None

@@ -34,8 +34,8 @@ Overview
exception[shape=flowchart.terminator, description="on_request_exception"];

acquire_connection[description="Connection acquiring"];
got_response;
send_request;
got_response[description="on_response_chunk_received"];
Member:

Please rename the got_response and send_request nodes so they are closer to the actual signal names. Maybe those long names could be shortened on the diagram -- be creative.

Also, I'd like to see arrows that explicitly show that the chunk_sent and chunk_received events can occur more than once.

@asvetlov (Member):

Wait, send_request and got_response are, in my mind, mostly for headers. We need different nodes for body chunks.

@asvetlov (Member):

Please add .. versionadded:: 3.1 for the new signals and data classes.

setup.cfg Outdated
@@ -1,5 +1,6 @@
[pep8]
max-line-length=79
ignore=E225,E226
Member:

I doubt that we should ignore these. Any reason to disable them globally?

Contributor Author:

Well, I'm not in a place to judge this. I did it to be able to save a file that has the 64*1024 literal without spaces around *; otherwise autopep8 just fixes it for me.

Member:

That autopep8 suggestion looks correct. You may also write 64000 instead without any loss in readability.

Contributor Author:

I'm with you. @asvetlov specifically asked me to keep it as it was.

Member:

For me the strange thing is that the flake8 checks pass without the ignore setting.
I'm totally fine with 64000, or even better 0x10000. Pretty sure it should be 64 KiB rather than 64 kB.

Member:

Please drop the change. Feel free to replace the limit with 0x10000 if needed -- I'm +-0 for it.

Contributor Author:

done

@kowalski (Contributor Author):

@asvetlov
If I split the nodes for headers and chunks, this is what I get. What do you think?
[screenshot: zrzut ekranu z 2018-02-27 15-25-33]

@asvetlov (Member):

The diagram looks better, but I don't see arrows from the sent/received nodes to exception.
Maybe the resulting picture would be unreadable; in that case I suggest extracting the headers/body signals into a separate diagram -- like I did for connection establishment and DNS lookup.
I don't know -- please do your best, but if it doesn't work out I'll polish the diagram after merging.

Also, the diagram suggests we may need on_request_headers_sent and on_response_headers_received signals :) -- but in a separate PR, obviously.

@@ -147,6 +155,23 @@ TraceConfig

``params`` is :class:`aiohttp.TraceRequestStartParams` instance.

.. attribute:: on_request_chunk_sent

.. versionadded:: 3.1
Member:

Move the line after the signal description, below the ``params`` is :class:`aiohttp.TraceRequestChunkSentParams` instance. line.

This is our style guide for the aiohttp documentation.

when a chunk of response body is received.

``params`` is :class:`aiohttp.TraceResponseChunkReceivedParams` instance.

Member:

Please add versionadded here as well

Contributor Author:

done
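
For context, a hedged end-to-end usage sketch of the signals documented here (on_request_chunk_sent / on_response_chunk_received); the handler signature follows the TraceConfig convention, and the exact attributes on the params objects are an assumption based on this PR:

    import asyncio
    import aiohttp

    async def on_chunk_sent(session, ctx, params):
        print("request chunk sent:", len(params.chunk), "bytes")

    async def on_chunk_received(session, ctx, params):
        print("response chunk received:", len(params.chunk), "bytes")

    async def main():
        trace_config = aiohttp.TraceConfig()
        trace_config.on_request_chunk_sent.append(on_chunk_sent)
        trace_config.on_response_chunk_received.append(on_chunk_received)
        async with aiohttp.ClientSession(trace_configs=[trace_config]) as session:
            async with session.post('http://httpbin.org/post', data=b'hello') as resp:
                await resp.read()

    asyncio.get_event_loop().run_until_complete(main())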

@kxepal (Member):

kxepal commented Feb 28, 2018

@asvetlov
Ok, approving so as not to be a blocker. It seems you're on the pulse of the further changes here.

@asvetlov (Member):

@kxepal thanks

@pfreixes (Contributor):

Just missing the proper tests in test_http_write.py and test_client_response.py that check the changes at the function level, for both the default path -- traces and on_chunk_sent as None -- and the non-default path. We already have them for test_connection.py right now.

I know that globally the new handlers are tested via test_client_session.py, but that is more of an integration test; local tests allow us to check all branches explicitly.
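
A hedged sketch of the kind of function-level tests being asked for, covering both the default path (callback left as None) and the callback path; MiniWriter is a stand-in, not the real aiohttp StreamWriter or its test helpers:

    import asyncio

    class MiniWriter:
        def __init__(self, on_chunk_sent=None):
            self.buffer = bytearray()
            self._on_chunk_sent = on_chunk_sent

        async def write(self, chunk):
            if self._on_chunk_sent is not None:
                await self._on_chunk_sent(chunk)
            self.buffer.extend(chunk)

    def test_write_default_path():
        writer = MiniWriter()                 # on_chunk_sent is None
        asyncio.get_event_loop().run_until_complete(writer.write(b"data"))
        assert writer.buffer == b"data"

    def test_write_callback_path():
        sent = []

        async def on_chunk_sent(chunk):
            sent.append(chunk)

        writer = MiniWriter(on_chunk_sent=on_chunk_sent)
        asyncio.get_event_loop().run_until_complete(writer.write(b"data"))
        assert sent == [b"data"]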

@asvetlov (Member):

@pfreixes agree.
@kowalski please add required tests

@kowalski (Contributor Author):

kowalski commented Mar 1, 2018

all done

@pfreixes (Contributor):

pfreixes commented Mar 1, 2018

Thanks!

@asvetlov (Member) left a comment:

Excellent work!

@asvetlov (Member) left a comment:

Oh wait.
One last point is missing.
Please add a ./CHANGES log record.

@kowalski (Contributor Author):

kowalski commented Mar 1, 2018

@asvetlov done (added ./CHANGES/2767.feature)

@asvetlov (Member) left a comment:

Thanks

@asvetlov (Member):

asvetlov commented Mar 1, 2018

Great job, thanks to everyone.
Will merge when tests pass.
