replace requests with httpx and factor out clients #1574

technillogue · 2024-03-12T23:40:00Z

this is a step towards merging #1530 into the async release channel. it's a repeat of #1508 without actually adding concurrency or the less runner stuff from #1499. hopefully the remainder of #1530 is easier to review once this is done.

as I recall, the main reason we couldn't release async runner was that cog.Path/File using requests would block the event loop and prevent work from advancing. because of that, these are probably the first changes that need to be released into mainline cog to proceed.
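To illustrate the problem described above, here is a minimal stdlib-only sketch. It does not use Cog code: `time.sleep` stands in for a blocking `requests` call and `asyncio.sleep` stands in for an awaited async client call. All function names are hypothetical.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple


async def heartbeat(ticks: List[float], interval: float = 0.01) -> None:
    # Represents other work on the event loop (e.g. other predictions).
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_download() -> None:
    # Stand-in for a synchronous requests.get(...): never yields to the loop.
    time.sleep(0.2)


async def async_download() -> None:
    # Stand-in for an awaited async HTTP call: yields while waiting.
    await asyncio.sleep(0.2)


async def count_ticks(download: Callable[[], Awaitable[None]]) -> int:
    ticks: List[float] = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat run its first step
    await download()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return len(ticks)


def run_demo() -> Tuple[int, int]:
    blocked = asyncio.run(count_ticks(blocking_download))
    cooperative = asyncio.run(count_ticks(async_download))
    return blocked, cooperative


if __name__ == "__main__":
    blocked, cooperative = run_demo()
    print(f"ticks during blocking call: {blocked}")
    print(f"ticks during awaited call: {cooperative}")
```

The heartbeat makes almost no progress while the blocking call runs, which is the behavior that would stall an async runner.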

* input downloads, output uploads, and webhooks are now handled by ClientManager, which persists for the lifetime of the runner. this lets us reuse connections, which may significantly help with large uploads.
* although I was originally going to drop output_file_prefix, it's not actually hard to maintain. the behavior has changed: objects are now uploaded as soon as they're output rather than after the prediction is completed.
* there's a kind of ugly hack: we upload an empty body to get the redirect, instead of making the api time out from trying to upload a 140GB file. that can be fixed by implementing an MPU endpoint and/or a "fetch upload url" endpoint.
* the behavior of the non-idempotent endpoint is changed; the id is now randomly generated if it's not provided in the body. this isn't strictly required for this change alone, but is hard to carve out.
* the behavior of Path is changed significantly. see https://www.notion.so/replicate/Cog-Setup-Path-Problem-2fc41d40bcaf47579ccd8b2f4c71ee24
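The id fallback for the non-idempotent endpoint can be sketched as follows. This is only an illustration: the helper name `with_prediction_id` and the use of a UUID hex string are assumptions, since the description does not say how the random id is generated.

```python
import uuid
from typing import Any, Dict


def with_prediction_id(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request body that always carries an id.

    Hypothetical helper: a caller-supplied id is kept as-is, otherwise a
    random one is generated (the UUID format is an assumption).
    """
    if body.get("id"):
        return dict(body)
    return {**body, "id": uuid.uuid4().hex}


if __name__ == "__main__":
    print(with_prediction_id({"id": "abc", "input": {}}))
    print(with_prediction_id({"input": {}}))
```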

.github/workflows/ci.yaml

mattt · 2024-03-13T21:45:17Z

python/cog/server/clients.py

+
+
+def httpx_webhook_client() -> httpx.AsyncClient:
+    return httpx.AsyncClient(headers=webhook_headers(), follow_redirects=True)


Looks like we need to opt-in to enable HTTP/2

Suggested change

-return httpx.AsyncClient(headers=webhook_headers(), follow_redirects=True)
+return httpx.AsyncClient(headers=webhook_headers(), follow_redirects=True, http2=True)

technillogue · 2024-03-15

I will note there is likely no benefit to using http2 for director; it only really matters for file downloads and maybe uploads from the internet.

python/cog/server/clients.py

mattt · 2024-03-13T21:46:09Z

python/cog/server/clients.py

+        self.webhook_client = httpx_webhook_client()
+        self.retry_webhook_client = httpx_retry_client()
+        self.file_client = httpx_file_client()
+        self.download_client = httpx.AsyncClient(follow_redirects=True)


Extract this into a helper method like the others?

Also, initialize with http2=True

technillogue · 2024-03-15

🤷‍♀️ it's one line; all the other ones have more involved configuration

mattt · 2024-03-13T21:49:01Z

python/tests/server/test_webhook.py

 import requests
 import responses
 from cog.schema import WebhookEvent
-from cog.server.webhook import webhook_caller, webhook_caller_filtered
+
+#from cog.server.webhook import webhook_caller, webhook_caller_filtered


Suggested change

-#from cog.server.webhook import webhook_caller, webhook_caller_filtered

mattt · 2024-03-13T21:58:23Z

python/cog/types.py


 class URLFile(io.IOBase):
    """
    URLFile is a proxy object for a :class:`urllib3.response.HTTPResponse`
    object that is created lazily. It's a file-like object constructed from a
    URL that can survive pickling/unpickling.
+
+    This is the only place Cog uses requests


What would it take to get rid of requests outright?

Would something like this work?

    @property
    def __wrapped__(self) -> Any:
        try:
            return object.__getattribute__(self, "__target__")
        except AttributeError:
            url = object.__getattribute__(self, "__url__")
            with httpx.stream("GET", url) as resp:
                resp.raise_for_status()
                resp.raw.decode_content = True
                object.__setattr__(self, "__target__", resp.raw)
                return resp.raw

technillogue · 2024-03-13

I think that would work. I just didn't bother yet because nobody anywhere uses File, and this solution is still a little unsatisfying (it would block the event loop). I also suspect requests is always installed because pip depends on it.

mattt · 2024-03-13

> requests is always installed because pip depends on it

🤨

Maybe that's no longer the case? According to this, pip has no external dependencies.

Not urgent or blocking, but it'd be nice to make a clean break.

technillogue · 2024-03-14

oh yeah, they vendored it in a while ago, I forgot. and I'm mistaken, python:3.11-slim doesn't have requests by default.

I'd still lean towards leaving it obviously broken rather than seeming okay but actually blocking the event loop. good ways to do this include subprocesses with a pipe, some kind of thread nonsense, or a hypothetical pget-py binding
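One possible reading of the "thread" option, as a stdlib-only sketch: push the blocking read onto the default thread pool so the event loop keeps running. `blocking_read` is a hypothetical stand-in for a synchronous file-like read; nothing here is Cog code.

```python
import asyncio
import time
from typing import Tuple


def blocking_read(url: str) -> bytes:
    # Stand-in for a synchronous, blocking read of a remote file.
    time.sleep(0.1)
    return b"data from " + url.encode()


async def read_in_thread(url: str) -> bytes:
    # Run the blocking call on the default executor instead of the loop thread.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_read, url)


async def demo() -> Tuple[bytes, int]:
    ticks = 0

    async def ticker() -> None:
        # Other event-loop work that should keep advancing during the read.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(ticker())
    data = await read_in_thread("https://example.com/file")
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return data, ticks


if __name__ == "__main__":
    data, ticks = asyncio.run(demo())
    print(data, ticks)
```

The trade-off is that each in-flight read occupies a worker thread, which is presumably part of why the comment calls it "nonsense".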

mattt · 2024-03-13T21:59:32Z

test-integration/test_integration/test_run.py

@@ -24,7 +24,7 @@ def test_run_with_secret(tmpdir_factory):
    with open(tmpdir / "cog.yaml", "w") as f:
        cog_yaml = """
 build:
-  python_version: "3.8"
+  python_version: "3.9"


Any reason why this is targeting 3.9 instead of 3.8?

* format
* stick a %s on line 190 clients.py (#1707)
* local upload server can be called cluster.local in addition to .internal (#1714)

Co-authored-by: Mattt <mattt@replicate.com>
Signed-off-by: technillogue <technillogue@gmail.com>

technillogue force-pushed the syl/httpx-only branch 2 times, most recently from b4e958c to 224e593 Compare March 12, 2024 23:50

technillogue requested a review from a team March 13, 2024 20:02

mattt reviewed Mar 13, 2024

technillogue and others added 3 commits March 28, 2024 16:57

try to carve out httpx changes only

0983433

Signed-off-by: technillogue <technillogue@gmail.com>

Apply suggestions from code review

6505c10

Co-authored-by: Mattt <mattt@replicate.com> Signed-off-by: technillogue <wisepoison@gmail.com> Signed-off-by: technillogue <technillogue@gmail.com>

more suggestions from code review

1f43754

Signed-off-by: technillogue <technillogue@gmail.com>

technillogue force-pushed the syl/httpx-only branch from 237977a to 1f43754 Compare March 28, 2024 20:57

technillogue merged commit 7ee96ba into async Mar 29, 2024
10 checks passed

technillogue deleted the syl/httpx-only branch March 29, 2024 05:22

This was referenced May 17, 2024

fix flaky runner test #1669

Merged

[async] Include prediction id upload request #1680

Closed

technillogue mentioned this pull request Jun 4, 2024

fix upload redirect handling #1714

Merged

technillogue mentioned this pull request Jun 19, 2024

async but refactored #1752

Closed

technillogue mentioned this pull request Jul 23, 2024

syl/fix setup shutdown bug #1819

Merged

technillogue mentioned this pull request Jul 24, 2024

httpx from scratch #1823

Closed

aron mentioned this pull request Oct 17, 2024

[async] Support custom filename to be provided to URLFile #1997

Merged

aron mentioned this pull request Nov 4, 2024

[async] Fix cog predict running file outputs #2043

Draft
