Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Request and Response picklable #1579

Merged
merged 6 commits into from Apr 21, 2021
Merged

Conversation

hannseman
Copy link
Contributor

@hannseman hannseman commented Apr 16, 2021

Refs: #1562

I took the liberty of creating a PR as I felt that there was a consensus about implementing this.

Some open questions is what state the instances should end up in after being loaded. I.e should we override is_closed, is_stream_consumed and if so to what? I also think we need to set stream to something to avoid unexpected AttributeError being raised on access. Currently set it to an empty ByteStream to make mypy happy.

It might also be cleaner to set the overrides in the dict produced in __getstate__ instead of directly in __setstate__.

An alternative to the __attrs__ pattern would be to pop invalid attributes from __dict__ in __getstate__ but being explicit about it feels safer.

@hannseman hannseman changed the title Make Request and Response picklable (#1562) Make Request and Response picklable Apr 16, 2021
@hannseman hannseman force-pushed the pickles branch 3 times, most recently from e69822a to 23b4be2 Compare April 16, 2021 19:54
@tomchristie
Copy link
Member

tomchristie commented Apr 19, 2021

Okay, so thoughts here...

Firstly, I don't really like the __attrs__ thing. Although it looks like a Python built-in, it turns out that's just a bit of requests internal naming that they decided on.

I think we probably want to approach this like so:

class UnattachedStream(AsyncByteStream, SyncByteStream):
    """
    If a request or response is serialized using pickle, then it is no longer attached to a
    stream for I/O purposes. Any stream operations should result in `httpx.StreamClosed`.
    """

    def __iter__(self) -> Iterator[bytes]:
        raise StreamClosed()

    async def __aiter__(self) -> AsyncIterator[bytes]:
        raise StreamClosed()
class Response:
    ...

    def __getstate__(self):
        return {
            name: value
            for name, value in self.__dict__.items()
            if name not in ['stream', 'is_closed', '_decoder']
        }

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)
        self.is_closed = True
        self.stream = UnattachedStream()
class Request:
    ...

    def __getstate__(self):
        return {
            name: value
            for name, value in self.__dict__.items()
            if name not in ['stream']
        }

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)
        self.stream = UnattachedStream()

I'd consider #1584 to be a pre-requisite.

The tidying up in #1583 is also relevant here.

(Aside: We might well end up with an is_closed on the Request class too at some point in the future, but we don't really need to consider that here.)

@tomchristie tomchristie added this to the v0.18 milestone Apr 19, 2021
@tomchristie tomchristie added the enhancement New feature or request label Apr 19, 2021
@hannseman hannseman force-pushed the pickles branch 4 times, most recently from 676eafa to 2201d8b Compare April 19, 2021 17:07
@hannseman
Copy link
Contributor Author

@tomchristie thanks for the review. I agree that the __attrs__ thing felt a bit awkward, much better 👍

@hannseman hannseman force-pushed the pickles branch 2 times, most recently from d8a99d8 to 0ad63e3 Compare April 19, 2021 17:16

async def __aiter__(self) -> AsyncIterator[bytes]:
raise ResponseClosed() # TODO: StreamClosed
yield b"" # pragma: nocover
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the yield here to avoid TypeError:

    async def aread(self) -> bytes:
        """
        Read and return the request content.
        """
        if not hasattr(self, "_content"):
            assert isinstance(self.stream, typing.AsyncIterable)
>           self._content = b"".join([part async for part in self.stream])
E           TypeError: 'async for' received an object from __aiter__ that does not implement __anext__: coroutine

httpx/_content.py Outdated Show resolved Hide resolved
httpx/_content.py Outdated Show resolved Hide resolved
httpx/_content.py Outdated Show resolved Hide resolved
@tomchristie
Copy link
Member

Right, let's go with this.

Great work @hannseman!

@tomchristie tomchristie merged commit 2d57104 into encode:master Apr 21, 2021
@hannseman
Copy link
Contributor Author

@tomchristie thanks a lot for the review and fixups! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants