Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ?_extra and new API design for TableView #1729

Open
Tracked by #1914
simonw opened this issue Apr 28, 2022 · 12 comments
Open
Tracked by #1914

Implement ?_extra and new API design for TableView #1729

simonw opened this issue Apr 28, 2022 · 12 comments

Comments

@simonw
Copy link
Owner

simonw commented Apr 28, 2022

Part of:

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

I'm going to change the default API response to look like this:

{
  "rows": [
    {
      "pk": 1,
      "created": "2019-01-14 08:00:00",
      "planet_int": 1,
      "on_earth": 1,
      "state": "CA",
      "_city_id": 1,
      "_neighborhood": "Mission",
      "tags": "[\"tag1\", \"tag2\"]",
      "complex_array": "[{\"foo\": \"bar\"}]",
      "distinct_some_null": "one",
      "n": "n1"
    },
    {
      "pk": 2,
      "created": "2019-01-14 08:00:00",
      "planet_int": 1,
      "on_earth": 1,
      "state": "CA",
      "_city_id": 1,
      "_neighborhood": "Dogpatch",
      "tags": "[\"tag1\", \"tag3\"]",
      "complex_array": "[]",
      "distinct_some_null": "two",
      "n": "n2"
    }
  ],
  "next": null,
  "next_url": null
}

Basically https://latest.datasette.io/fixtures/facetable.json?_shape=objects but with just the rows, next and next_url fields returned by default.

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

Then I'm going to implement the following ?_extra= options:

  • ?_extra=facet_results - to see facet results
  • ?_extra=suggested_facets - for suggested facets
  • ?_extra=count - for the count of total rows
  • ?_extra=columns - for a list of column names
  • ?_extra=primary_keys - for a list of primary keys
  • ?_extra=query - a {"sql" "select ...", "params": {}} object

I thought about having ?_extra=facet_results returned automatically if the user specifies at least one ?_facet - but that doesn't work for default facets configured in metadata.json - how can the user opt out of those being returned? So I'm going to say you don't see facets at all if you don't include ?_extra=facet_results.

I'm tempted to add ?_extra=_all to return everything, but I can decide if that's a good idea later.

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

This means filtered_table_rows_count is going to become count. I had originally picked that terrible name to avoid confusion between the count of all rows in the table and the count of rows that were filtered.

I'll add ?_extra=table_count for getting back the full table count instead. I think count is clear enough!

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

(I remain keen on the idea of shipping a plugin that restores the old default API shape to people who have written pre-Datasette-1.0 code against it, but I'll tackle that much later. I really like how jQuery has a culture of doing this.)

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

I may be able to implement this mostly in the json_renderer() function:

def json_renderer(args, data, view_name):
"""Render a response as JSON"""
status_code = 200
# Handle the _json= parameter which may modify data["rows"]
json_cols = []

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

I'm not sure what to do about the "truncated": true/false key.

It's not really relevant to table results, since they are paginated whether or not you ask for them to be.

It plays a role in query results, where you might run select * from table and get back 1000 results because Datasette truncates at that point rather than returning everything.

Adding it to every table result and always setting it to "truncated": false feels confusing.

I think I'm going to keep it exclusively in the default representation for the /db?sql=... query endpoint, and not return it at all for tables.

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

OK, the prototype of this is looking really good - it's very pleasant to use.

http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_sort=id&_size=5&_extra=query_ms&_extra=count&_col=body returns this:

{
  "rows": [
    {
      "id": 338854988,
      "body": "    /database-name/table-name?name__contains=simon&sort=id+desc\r\n\r\nNote that if there's a column called \"sort\" you can still do sort__exact=blah\r\n\r\n"
    },
    {
      "id": 346427794,
      "body": "Thanks. There is a way to use pip to grab apsw, which also let's you configure it (flags to build extensions, use an internal sqlite, etc). Don't know how that works as a dependency for another package, though.\n\nOn November 22, 2017 11:38:06 AM EST, Simon Willison <notifications@github.com> wrote:\n>I have a solution for FTS already, but I'm interested in apsw as a\n>mechanism for allowing custom virtual tables to be written in Python\n>(pysqlite only lets you write custom functions)\n>\n>Not having PyPI support is pretty tough though. I'm planning a\n>plugin/extension system which would be ideal for things like an\n>optional apsw mode, but that's a lot harder if apsw isn't in PyPI.\n>\n>-- \n>You are receiving this because you authored the thread.\n>Reply to this email directly or view it on GitHub:\n>https://github.com/simonw/datasette/issues/144#issuecomment-346405660\n"
    },
    {
      "id": 348252037,
      "body": "WOW!\n\n--\nPaul Ford // (646) 369-7128 // @ftrain\n\nOn Thu, Nov 30, 2017 at 11:47 AM, Simon Willison <notifications@github.com>\nwrote:\n\n> Remaining work on this now lives in a milestone:\n> https://github.com/simonw/datasette/milestone/6\n>\n> —\n> You are receiving this because you were mentioned.\n> Reply to this email directly, view it on GitHub\n> <https://github.com/simonw/datasette/issues/153#issuecomment-348248406>,\n> or mute the thread\n> <https://github.com/notifications/unsubscribe-auth/AABPKHzaVPKwTOoHouK2aMUnM-mPnPk6ks5s7twzgaJpZM4Qq2zW>\n> .\n>\n"
    },
    {
      "id": 391141391,
      "body": "I'm going to clean this up for consistency tomorrow morning so hold off\nmerging until then please\n\nOn Tue, May 22, 2018 at 6:34 PM, Simon Willison <notifications@github.com>\nwrote:\n\n> Yeah let's try this without pysqlite3 and see if we still get the correct\n> version.\n>\n> —\n> You are receiving this because you authored the thread.\n> Reply to this email directly, view it on GitHub\n> <https://github.com/simonw/datasette/pull/280#issuecomment-391076458>, or mute\n> the thread\n> <https://github.com/notifications/unsubscribe-auth/AAihfMI-H6CBt-Py0xdBbH2xDK0KsjT2ks5t1EwYgaJpZM4UI_2m>\n> .\n>\n"
    },
    {
      "id": 391355030,
      "body": "No objections;\r\nIt's good to go @simonw\r\n\r\nOn Wed, 23 May 2018, 14:51 Simon Willison, <notifications@github.com> wrote:\r\n\r\n> @r4vi <https://github.com/r4vi> any objections to me merging this?\r\n>\r\n> —\r\n> You are receiving this because you were mentioned.\r\n> Reply to this email directly, view it on GitHub\r\n> <https://github.com/simonw/datasette/pull/280#issuecomment-391354237>, or mute\r\n> the thread\r\n> <https://github.com/notifications/unsubscribe-auth/AAihfM_2DN5WR2mkO-VK6ozDmkUQ4IMjks5t1WlcgaJpZM4UI_2m>\r\n> .\r\n>\r\n"
    }
  ],
  "next": "391355030,391355030",
  "next_url": "http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_size=5&_extra=query_ms&_extra=count&_col=body&_next=391355030%2C391355030&_sort=id",
  "count": 57,
  "query_ms": 21.780223003588617
}

@simonw
Copy link
Owner Author

simonw commented Apr 28, 2022

That prototype is a very small amount of code so far:

diff --git a/datasette/renderer.py b/datasette/renderer.py
index 4508949..b600e1b 100644
--- a/datasette/renderer.py
+++ b/datasette/renderer.py
@@ -28,6 +28,10 @@ def convert_specific_columns_to_json(rows, columns, json_cols):
 
 def json_renderer(args, data, view_name):
     """Render a response as JSON"""
+    from pprint import pprint
+
+    pprint(data)
+
     status_code = 200
 
     # Handle the _json= parameter which may modify data["rows"]
@@ -43,6 +47,41 @@ def json_renderer(args, data, view_name):
     if "rows" in data and not value_as_boolean(args.get("_json_infinity", "0")):
         data["rows"] = [remove_infinites(row) for row in data["rows"]]
 
+    # Start building the default JSON here
+    columns = data["columns"]
+    next_url = data.get("next_url")
+    output = {
+        "rows": [dict(zip(columns, row)) for row in data["rows"]],
+        "next": data["next"],
+        "next_url": next_url,
+    }
+
+    extras = set(args.getlist("_extra"))
+
+    extras_map = {
+        # _extra=   :    data[field]
+        "count": "filtered_table_rows_count",
+        "facet_results": "facet_results",
+        "suggested_facets": "suggested_facets",
+        "columns": "columns",
+        "primary_keys": "primary_keys",
+        "query_ms": "query_ms",
+        "query": "query",
+    }
+    for extra_key, data_key in extras_map.items():
+        if extra_key in extras:
+            output[extra_key] = data[data_key]
+
+    body = json.dumps(output, cls=CustomJSONEncoder)
+    content_type = "application/json; charset=utf-8"
+    headers = {}
+    if next_url:
+        headers["link"] = f'<{next_url}>; rel="next"'
+    return Response(
+        body, status=status_code, headers=headers, content_type=content_type
+    )
+
+
     # Deal with the _shape option
     shape = args.get("_shape", "arrays")
     # if there's an error, ignore the shape entirely

simonw added a commit that referenced this issue Apr 30, 2022
The tests don't pass yet but this will let me deploy a preview using
the mechanism from ##1442
@simonw
Copy link
Owner Author

simonw commented Apr 30, 2022

Deployed a preview of this here: https://latest-1-0-alpha.datasette.io/

Examples:

Second example produces:

{
  "rows": [],
  "next": null,
  "next_url": null,
  "count": 15,
  "facet_results": {
    "state": {
      "name": "state",
      "type": "column",
      "hideable": true,
      "toggle_url": "/fixtures/facetable.json?_size=0&_extra=facet_results&_extra=count",
      "results": [
        {
          "value": "CA",
          "label": "CA",
          "count": 10,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=CA",
          "selected": false
        },
        {
          "value": "MI",
          "label": "MI",
          "count": 4,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MI",
          "selected": false
        },
        {
          "value": "MC",
          "label": "MC",
          "count": 1,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MC",
          "selected": false
        }
      ],
      "truncated": false
    }
  }
}

@simonw
Copy link
Owner Author

simonw commented Apr 30, 2022

Related:

Which talks about how there was confusion in this example: https://latest.datasette.io/fixtures/facetable.json?_facet=created&_facet_date=created&_facet=tags&_facet_array=tags&_nosuggest=1&_size=0

Which I fixed in #625 by introducing tags and tags_2 keys, but actually the facet results would be better if they were a list rather than a dictionary.

@simonw
Copy link
Owner Author

simonw commented Apr 30, 2022

I had to look up what hideable means - it means that you can't hide the current facet because it was defined in metadata, not as a ?_facet= parameter:

"hideable": source != "metadata",

That's a bit of a weird thing to expose in the API. Maybe change that to source so it can be metadata or request? That's very slightly less coupled to how the UI works.

@simonw
Copy link
Owner Author

simonw commented Apr 30, 2022

but actually the facet results would be better if they were a list rather than a dictionary

I think facet_results in the JSON should match this (used by the HTML) instead:

"sorted_facet_results": sorted(
facet_results.values(),
key=lambda f: (len(f["results"]), f["name"]),
reverse=True,
),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant