-
-
Notifications
You must be signed in to change notification settings - Fork 703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
await datasette.client.get(path) mechanism for executing internal requests #943
Comments
Could be as simple as Bit weird calling it |
Should it default to treating things as if they had the I think I'm OK with people having to add |
Alternative name possibilities:
|
Actually no -
(I'll probably add |
Should internal requests executed in this way be handled by plugins that used the Hard to be sure one way or the other. I'm worried about logging middleware triggering twice - but actually anyone doing serious logging of their Datasette instance is probably doing it in a different layer (uvicorn logs or nginx proxy or whatever) so they wouldn't be affected. There aren't any ASGI logging middlewares out there that I've seen. Also: if you run into a situation where your stuff is breaking because So I think it DOES execute |
What about authentication checks etc? Won't they run twice? I think that's OK too, in fact it's desirable: think of the case of |
Right now calling I think a single |
One thing to consider here: Datasette's table and database name escaping rules can be a little bit convoluted. If a plugin wants to get back the first five rows of a table, it will need to construct a URL Here's how the datasette/datasette/templates/row.html Lines 19 to 23 in b21ed23
It would be an improvement to have this logic abstracted out somewhere and documented so plugins can use it. |
Maybe allow this:
This could cause problems if users ever need to pass literal
Not convinced this is useful - it's a bit unintuitive. |
I just realised that this mechanism is kind of like being able to use microservices - make API calls within your application - except that everything runs in the same process against SQLite databases so calls will be lightning fast. It also means that a plugin can add a new internal API to Datasette that's accessible to other plugins by registering a new route with |
Also fun: the inevitable plugin that exposes this to the template language - so Datasette templates can stitch together data from multiple other internal API calls. Fun way to take advantage of |
Need to decide what to do about JSON responses. When called from a template it's likely the intent will be to further loop through the JSON data returned. It would be annoying to have to run Maybe a |
I'm leaning towards defaulting to JSON as the requested format - you can pass But weird that it's different from the web UI. |
Maybe |
I'm not going to mess around with formats - you'll get back the exact response that a web client would receive. Question: what should the response object look like? e.g. if you do:
What should I could reuse the Datasette |
So what should I do about streaming responses? I could deliberately ignore them - through an exception if you attempt to run I could load the entire response into memory and return it as a wrapped object. I could support some kind of asynchronous iterator mechanism. This would be pretty elegant if I could decide the right syntax for it - it would allow plugins to take advantage of other internal URLs that return streaming content without needing to load that content entirely into memory in order to process it. |
Maybe these methods become the way most Datasette tests are written, replacing the existing |
I'm tempted to create a |
What if This would make It would also solve the return type problem: I would return whatever |
I could solve streaming using something like this: async with datasette.stream("GET", "/fixtures/compound_three_primary_keys.csv?_stream=on&_size=max") as response:
async for chunk in response.aiter_bytes():
print(chunk) Which would be a wrapper around |
I think I can use |
Maybe instead of implementing response = await datasette.client.get("/") Or perhaps this should be a method in case I ever need to be able to response = await (await datasette.client()).get("/") This is a bit cosmetically ugly though, I'd rather avoid that if possible. Maybe I could get this working by returning an object from response = await datasette.client().get("/") I don't think there's any benefit to that over |
Should I instantiate a single https://www.python-httpx.org/advanced/#why-use-a-client says that the main benefit of a Client instance is HTTP connection pooling - which isn't an issue for these internal requests since they won't be using the HTTP protocol at all, they'll be calling the ASGI application directly. So I'm leaning towards instantiating a fresh client for every internal request. I'll run a microbenchmark to check that this doesn't have any unpleasant performance implications. |
dogsheep-beta could do with this too. It currently makes a call to
|
I put together a minimal prototype of this and it feels pretty good: diff --git a/datasette/app.py b/datasette/app.py
index 20aae7d..fb3bdad 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -4,6 +4,7 @@ import collections
import datetime
import glob
import hashlib
+import httpx
import inspect
import itertools
from itsdangerous import BadSignature
@@ -312,6 +313,7 @@ class Datasette:
self._register_renderers()
self._permission_checks = collections.deque(maxlen=200)
self._root_token = secrets.token_hex(32)
+ self.client = DatasetteClient(self)
async def invoke_startup(self):
for hook in pm.hook.startup(datasette=self):
@@ -1209,3 +1211,25 @@ def route_pattern_from_filepath(filepath):
class NotFoundExplicit(NotFound):
pass
+
+
+class DatasetteClient:
+ def __init__(self, ds):
+ self.app = ds.app()
+
+ def _fix(self, path):
+ if path.startswith("/"):
+ path = "http://localhost{}".format(path)
+ return path
+
+ async def get(self, path, **kwargs):
+ async with httpx.AsyncClient(app=self.app) as client:
+ return await client.get(self._fix(path), **kwargs)
+
+ async def post(self, path, **kwargs):
+ async with httpx.AsyncClient(app=self.app) as client:
+ return await client.post(self._fix(path), **kwargs)
+
+ async def options(self, path, **kwargs):
+ async with httpx.AsyncClient(app=self.app) as client:
+ return await client.options(self._fix(path), **kwargs) Used like this in
|
This adds |
How important is it to use https://www.python-httpx.org/async/#opening-and-closing-clients says:
The The transport I am using is a class called The |
Even smaller class DatasetteClient:
def __init__(self, ds):
self._client = httpx.AsyncClient(app=ds.app())
def _fix(self, path):
if path.startswith("/"):
path = "http://localhost{}".format(path)
return path
async def get(self, path, **kwargs):
return await self._client.get(self._fix(path), **kwargs)
async def post(self, path, **kwargs):
return await self._client.post(self._fix(path), **kwargs)
async def options(self, path, **kwargs):
return await self._client.options(self._fix(path), **kwargs) |
I may as well implement all of the HTTP methods supported by the
|
class DatasetteClient:
def __init__(self, ds):
self._client = httpx.AsyncClient(app=ds.app())
def _fix(self, path):
if path.startswith("/"):
path = "http://localhost{}".format(path)
return path
async def get(self, path, **kwargs):
return await self._client.get(self._fix(path), **kwargs)
async def options(self, path, **kwargs):
return await self._client.options(self._fix(path), **kwargs)
async def head(self, path, **kwargs):
return await self._client.head(self._fix(path), **kwargs)
async def post(self, path, **kwargs):
return await self._client.post(self._fix(path), **kwargs)
async def put(self, path, **kwargs):
return await self._client.put(self._fix(path), **kwargs)
async def patch(self, path, **kwargs):
return await self._client.patch(self._fix(path), **kwargs)
async def delete(self, path, **kwargs):
return await self._client.delete(self._fix(path), **kwargs) |
Am I going to rewrite ALL of my tests to use this instead? It would clean up a lot of test code, at the cost of quite a bit of work. It would make for much neater plugin tests too, and neater testing documentation: https://docs.datasette.io/en/stable/testing_plugins.html |
I want this in Datasette 0.50, so I can use it in |
Documentation (from #1006): https://docs.datasette.io/en/latest/internals.html#client |
datasette-graphql
works by making internal requests to the TableView class (in order to take advantage of existing pagination logic, plus options like?_search=
and?_where=
) - see #915I want to support a
mod_rewrite
style mechanism for putting nicer URLs on top of Datasette pages - I botched that together for a project here using an internal ASGI proxying trick: natbat/rockybeaches@ec102c6If the
datasette
object provided a documented method for executing internal requests (in a way that makes sense with logging etc - i.e. doesn't get logged as a separate request) both of these use-cases would be much neater.The text was updated successfully, but these errors were encountered: