invoke_startup() is not run in some conditions, e.g. gunicorn/uvicorn workers, breaking lots of things #1955
Comments
This is definitely a regression: Datasette is meant to work in those environments, and I didn't think to test them when I added the Coincidentally I actually built a plugin for running Datasette with Gunicorn just a couple of months ago: https://datasette.io/plugins/datasette-gunicorn And I just tested and it has the same bug you describe here! Filed: |
Datasette 0.63 is the release that broke this, thanks to this issue: |
It's possible the fix for this might be for the first incoming HTTP request to trigger Lines 1728 to 1731 in e054704
This would be a much more elegant fix, I could remove those multiple |
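A minimal sketch of the "first incoming request triggers startup" idea, assuming a generic ASGI app. The class name StartupOnFirstRequest and the startup callable are hypothetical stand-ins for illustration (the callable plays the role of invoke_startup); this is not Datasette's actual implementation:

```python
import asyncio


class StartupOnFirstRequest:
    """ASGI wrapper that awaits a startup coroutine once, before the first HTTP request."""

    def __init__(self, app, startup):
        self.app = app          # the wrapped ASGI application
        self.startup = startup  # hypothetical async callable, standing in for invoke_startup
        self._started = False
        self._lock = None       # created lazily, inside the running event loop

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and not self._started:
            if self._lock is None:
                self._lock = asyncio.Lock()
            async with self._lock:
                # Re-check: another request may have completed startup while we waited
                if not self._started:
                    await self.startup()
                    self._started = True
        await self.app(scope, receive, send)
```

The lock is there so that several requests arriving at the same time on a fresh worker still only run the startup step once. This approach does not depend on the server delivering ASGI lifespan events.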
Wow, just spotted this in the code - it turns out I solved this problem a different (and better) way long before I introduced Lines 1416 to 1440 in e054704
|
That may not be the best fix here. It turns out this pattern:
async def get(self, path, **kwargs):
    async with httpx.AsyncClient(app=self.app) as client:
        return await client.get(self._fix(path), **kwargs)
Doesn't trigger that I wrote about that previously in this TIL: https://til.simonwillison.net/asgi/lifespan-test-httpx |
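To illustrate why a client that only sends HTTP scopes never runs lifespan startup code, here is a rough sketch that delivers the lifespan messages by hand. The toy app and the run_lifespan_cycle helper are hypothetical names made up for this example; the message types come from the ASGI lifespan protocol:

```python
import asyncio


async def app(scope, receive, send):
    """Toy ASGI app: records when lifespan startup is delivered."""
    if scope["type"] == "lifespan":
        while True:
            message = await receive()
            if message["type"] == "lifespan.startup":
                app.started = True
                await send({"type": "lifespan.startup.complete"})
            elif message["type"] == "lifespan.shutdown":
                await send({"type": "lifespan.shutdown.complete"})
                return


app.started = False


async def run_lifespan_cycle(asgi_app):
    """Send lifespan.startup then lifespan.shutdown; return the type of the startup reply."""
    to_app = asyncio.Queue()    # messages the "server" sends to the app
    from_app = asyncio.Queue()  # messages the app sends back
    task = asyncio.ensure_future(
        asgi_app({"type": "lifespan"}, to_app.get, from_app.put)
    )
    await to_app.put({"type": "lifespan.startup"})
    reply = await from_app.get()
    # Shut down cleanly so the app coroutine finishes
    await to_app.put({"type": "lifespan.shutdown"})
    await from_app.get()
    await task
    return reply["type"]
```

Unless something plays the server's role like this, app.started stays False no matter how many HTTP requests are made against the app.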
So actually that I'm inclined to ditch |
It looks like that fix almost works... except it seems to push the tests into an infinite loop or similar? They're not finishing their runs from what I can see. |
Running:
On my laptop to see if I can replicate. |
I added this to
@pytest.fixture(autouse=True)
def log_name_of_test_before_test(request):
    # To help identify tests that are hanging
    name = str(request.node)
    with open("/tmp/test.log", "a") as f:
        f.write(name + "\n")
    yield
This logs out the name of each test to |
This is surprising! The logs suggest that the test suite hung running this test here: Lines 55 to 58 in dc18f62
I find that very hard to believe. |
I added So this time it was hanging at So it's clearly not the individual tests themselves that are the problem - something about running the entire test suite in one go is incompatible with this change for some reason. |
When I hit
It looks to me like this relates to |
Possibly related issue: |
Just noticed this: https://github.com/simonw/datasette/actions/runs/3706504228/jobs/6281796135 This suggests that the regular tests passed in CI fine, but the non-serial ones failed. I'm going to try running everything using |
I'm trying this fix again, after a bunch of work on the test suite in: |
|
Hitting
|
I think that's this test: datasette/tests/test_cli_serve_server.py Lines 6 to 13 in 63fb750
Using this fixture: Lines 155 to 175 in 63fb750
|
This bit here looks like it could hang!
# Loop until port 8041 serves traffic
while True:
    try:
        httpx.get("http://localhost:8041/")
        break
    except httpx.ConnectError:
        time.sleep(0.1) |
Improved version of that fixture:
diff --git a/tests/conftest.py b/tests/conftest.py
index 44c44f87..69dee68b 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -27,6 +27,17 @@ UNDOCUMENTED_PERMISSIONS = {
 _ds_client = None
+def wait_until_responds(url, timeout=5.0, client=httpx, **kwargs):
+    start = time.time()
+    while time.time() - start < timeout:
+        try:
+            client.get(url, **kwargs)
+            return
+        except httpx.ConnectError:
+            time.sleep(0.1)
+    raise AssertionError("Timed out waiting for {} to respond".format(url))
+
+
 @pytest_asyncio.fixture
 async def ds_client():
     from datasette.app import Datasette
@@ -161,13 +172,7 @@ def ds_localhost_http_server():
         # Avoid FileNotFoundError: [Errno 2] No such file or directory:
         cwd=tempfile.gettempdir(),
     )
-    # Loop until port 8041 serves traffic
-    while True:
-        try:
-            httpx.get("http://localhost:8041/")
-            break
-        except httpx.ConnectError:
-            time.sleep(0.1)
+    wait_until_responds("http://localhost:8041/")
     # Check it started successfully
     assert not ds_proc.poll(), ds_proc.stdout.read().decode("utf-8")
     yield ds_proc
@@ -202,12 +207,7 @@ def ds_localhost_https_server(tmp_path_factory):
         stderr=subprocess.STDOUT,
         cwd=tempfile.gettempdir(),
     )
-    while True:
-        try:
-            httpx.get("https://localhost:8042/", verify=client_cert)
-            break
-        except httpx.ConnectError:
-            time.sleep(0.1)
+    wait_until_responds("http://localhost:8042/", verify=client_cert)
     # Check it started successfully
     assert not ds_proc.poll(), ds_proc.stdout.read().decode("utf-8")
     yield ds_proc, client_cert
@@ -231,12 +231,7 @@ def ds_unix_domain_socket_server(tmp_path_factory):
     # Poll until available
     transport = httpx.HTTPTransport(uds=uds)
     client = httpx.Client(transport=transport)
-    while True:
-        try:
-            client.get("http://localhost/_memory.json")
-            break
-        except httpx.ConnectError:
-            time.sleep(0.1)
+    wait_until_responds("http://localhost/_memory.json", client=client)
     # Check it started successfully
     assert not ds_proc.poll(), ds_proc.stdout.read().decode("utf-8")
     yield ds_proc, uds |
... and it turns out those tests saved me. Because I forgot to check if
|
Now the only failure is in the
That's this test: datasette/tests/test_cli_serve_server.py Lines 16 to 24 in 63fb750
And this fixture: Lines 178 to 215 in 63fb750
|
During the polling loop it constantly raises:
|
Maybe the reason the ASGI lifespan stuff broke was this line: Lines 630 to 632 in 8b73fc6
|
I added the TLS support here: |
I used the steps to test manually from this comment: #1221 (comment) In one terminal:
Then in another terminal:
This worked correctly, outputting the expected JSON. So the feature still works, it's just the test that is broken for some reason. |
This issue might be relevant, but I tried the suggested fix in there ( |
Rather than continue to bang my head against this, I'm tempted to rewrite this test to happen outside of Python world - in a bash script run by GitHub Actions, for example. |
Various attempts at a fix which didn't work: diff --git a/tests/conftest.py b/tests/conftest.py
index 69dee68b..899d36fd 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -1,4 +1,3 @@
-import asyncio
import httpx
import os
import pathlib
@@ -6,6 +5,7 @@ import pytest
import pytest_asyncio
import re
import subprocess
+import sys
import tempfile
import time
import trustme
@@ -27,13 +27,23 @@ UNDOCUMENTED_PERMISSIONS = {
_ds_client = None
-def wait_until_responds(url, timeout=5.0, client=httpx, **kwargs):
+def wait_until_responds(url, timeout=5.0, client=None, **kwargs):
+ client = client or httpx.Client(**kwargs)
start = time.time()
while time.time() - start < timeout:
try:
- client.get(url, **kwargs)
+ if "verify" in kwargs:
+ print(kwargs["verify"])
+ print(
+ "Contents of verify file: {}".format(
+ open(kwargs.get("verify")).read()
+ )
+ )
+ print("client = {}, kwargs = {}".format(client, kwargs))
+ client.get(url)
return
- except httpx.ConnectError:
+ except (httpx.ConnectError, httpx.RemoteProtocolError) as ex:
+ print(ex)
time.sleep(0.1)
raise AssertionError("Timed out waiting for {} to respond".format(url))
@@ -166,7 +176,7 @@ def check_permission_actions_are_documented():
@pytest.fixture(scope="session")
def ds_localhost_http_server():
ds_proc = subprocess.Popen(
- ["datasette", "--memory", "-p", "8041"],
+ [sys.executable, "-m", "datasette", "--memory", "-p", "8041"],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
# Avoid FileNotFoundError: [Errno 2] No such file or directory:
@@ -180,7 +190,7 @@ def ds_localhost_http_server():
ds_proc.terminate()
-@pytest.fixture(scope="session")
+@pytest.fixture
def ds_localhost_https_server(tmp_path_factory):
cert_directory = tmp_path_factory.mktemp("certs")
ca = trustme.CA()
@@ -194,6 +204,8 @@ def ds_localhost_https_server(tmp_path_factory):
ca.cert_pem.write_to_path(path=client_cert)
ds_proc = subprocess.Popen(
[
+ sys.executable,
+ "-m",
"datasette",
"--memory",
"-p",
@@ -207,7 +219,11 @@ def ds_localhost_https_server(tmp_path_factory):
stderr=subprocess.STDOUT,
cwd=tempfile.gettempdir(),
)
- wait_until_responds("http://localhost:8042/", verify=client_cert)
+ wait_until_responds(
+ "http://localhost:8042/_memory.json",
+ verify=client_cert,
+ headers={"Connection": "close"},
+ )
# Check it started successfully
assert not ds_proc.poll(), ds_proc.stdout.read().decode("utf-8")
yield ds_proc, client_cert
diff --git a/tests/test_cli_serve_server.py b/tests/test_cli_serve_server.py
index 1c31e2a3..9320b623 100644
--- a/tests/test_cli_serve_server.py
+++ b/tests/test_cli_serve_server.py
@@ -16,7 +16,11 @@ def test_serve_localhost_http(ds_localhost_http_server):
@pytest.mark.serial
def test_serve_localhost_https(ds_localhost_https_server):
_, client_cert = ds_localhost_https_server
- response = httpx.get("https://localhost:8042/_memory.json", verify=client_cert)
+ response = httpx.get(
+ "https://localhost:8042/_memory.json",
+ verify=client_cert,
+ headers={"Connection": "close"},
+ )
assert {
"database": "_memory",
"path": "/_memory", |
Asked ChatGPT:
It gave me: #!/bin/bash
# Start the server in the background
datasette -p 8002 &
# Store the background process ID in a variable
server_pid=$!
# Make a test request using curl
curl http://localhost:8002
# Shut down the server
kill $server_pid |
This #!/bin/bash
# Generate certificates
python -m trustme
# This creates server.pem, server.key, client.pem
# Start the server in the background
datasette --memory \
--ssl-keyfile=server.key \
--ssl-certfile=server.pem \
-p 8152 &
# Store the background process ID in a variable
server_pid=$!
# Wait for the server to start
sleep 2
# Make a test request using curl
curl -f --cacert client.pem 'https://localhost:8152/_memory.json'
# Save curl's exit code (-f option causes it to return one on HTTP errors)
curl_exit_code=$?
# Shut down the server
kill $server_pid
sleep 1
# Clean up the certificates
rm server.pem server.key client.pem
echo $curl_exit_code
exit $curl_exit_code |
https://github.com/simonw/datasette/actions/runs/3722908296/jobs/6314093163 shows that new test passing in CI:
|
... and with this change, the following now works correctly:
So this issue is now fixed! |
You were super-close on the Python version of the test here, changing
diff --git a/tests/conftest.py b/tests/conftest.py
index 69dee68b4a3f..ba07a11d37f6 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -207,7 +207,7 @@ def ds_localhost_https_server(tmp_path_factory):
         stderr=subprocess.STDOUT,
         cwd=tempfile.gettempdir(),
     )
-    wait_until_responds("http://localhost:8042/", verify=client_cert)
+    wait_until_responds("https://localhost:8042/", verify=client_cert)
     # Check it started successfully
     assert not ds_proc.poll(), ds_proc.stdout.read().decode("utf-8")
     yield ds_proc, client_cert
My speculation about what was happening here: when A
|
In the past (pre-September 14, #1809) I had a running deployment of Datasette on Azure WebApps by emulating the call in cli.py to Gunicorn:
gunicorn -w 2 -k uvicorn.workers.UvicornWorker app:app
My most recent deployment, however, fails loudly by shouting that Datasette.invoke_startup() was not called. It does not seem to be possible to call invoke_startup when running using a uvicorn command directly like this (I've reproduced this locally using uvicorn). Two candidates that I have tried:
- The --factory option, but the app factory has to be synchronous, so no await invoke_startup there
- asyncio.get_event_loop().run_until_complete is also not an option because uvicorn already has the event loop running.
One additional option is: invoke_startup. These are also synchronous, but I might be able to get ahead of the event loop starting here.
In my current deployment setup, it does not appear to be possible to use datasette serve directly, so I'm stuck either
Questions for the maintainers:
Almost forgot, minimal reproducer:
Save as app.py in the same folder as global-power-plants.db, and then try running uvicorn app:app. Opening the resulting Datasette instance in the browser will show the error message.