Support column descriptions in metadata.json #942

Closed
simonw opened this issue Aug 18, 2020 · 18 comments

Comments

@simonw
Owner

simonw commented Aug 18, 2020

Could look something like this:

{
    "title": "Five Thirty Eight",
    "license": "CC Attribution 4.0 License",
    "license_url": "https://creativecommons.org/licenses/by/4.0/",
    "source": "fivethirtyeight/data on GitHub",
    "source_url": "https://github.com/fivethirtyeight/data",
    "databases": {
        "fivethirtyeight": {
            "tables": {
                "mueller-polls/mueller-approval-polls": {
                    "description_html": "<p>....</p>",
                    "columns": {
                        "name_of_column": "column_description goes here"
                    }
                }
            }
        }
    }
}
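
A minimal sketch of how descriptions could be read back out of a metadata document with this nesting. The helper name `column_descriptions` is hypothetical and is not part of Datasette; only the key layout is taken from the example above.

```python
import json

# Trimmed copy of the proposed metadata.json shape from the example above.
METADATA_JSON = """
{
    "databases": {
        "fivethirtyeight": {
            "tables": {
                "mueller-polls/mueller-approval-polls": {
                    "columns": {
                        "name_of_column": "column_description goes here"
                    }
                }
            }
        }
    }
}
"""


def column_descriptions(metadata, database, table):
    """Return {column_name: description} for a table, or {} if none is set.

    Hypothetical helper: every level is optional, so each lookup falls
    back to an empty dict instead of raising KeyError.
    """
    databases = metadata.get("databases") or {}
    tables = (databases.get(database) or {}).get("tables") or {}
    return (tables.get(table) or {}).get("columns") or {}


metadata = json.loads(METADATA_JSON)
descriptions = column_descriptions(
    metadata, "fivethirtyeight", "mueller-polls/mueller-approval-polls"
)
print(descriptions.get("name_of_column"))
```

Falling back to `{}` at each level means a table without a `"columns"` block simply has no descriptions.
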
@simonw
Owner Author

simonw commented Aug 18, 2020

Could display these as tooltips on icons something like this (from the experimental datasette-inspect-columns plugin):

[Screenshot: tooltip icons on the fixtures facetable table]

This would need to take accessibility into account, and would need a different display for the mobile web layout. Need to consider how it will interact with the column menu suggested in #690.

@simonw
Owner Author

simonw commented Aug 18, 2020

Easiest solution: if you provide column metadata it gets displayed above the table, something like on https://fivethirtyeight.datasettes.com/fivethirtyeight/antiquities-act%2Factions_under_antiquities_act

[Screenshot: fivethirtyeight antiquities-act/actions_under_antiquities_act table, 344 rows]

HTML title= tooltips are also added to the table headers, which won't be visible on touch devices but that's OK because the information is visible on the page already.

@simonw
Owner Author

simonw commented Aug 18, 2020

Is columns the right key for this in the table metadata block? I might want to use that for initial values for ?_col= in #615.

Alternative names:

  • column_descriptions
  • column_info

simonw added this to the Datasette 0.52 milestone Nov 1, 2020

@simonw
Owner Author

simonw commented Nov 15, 2020

This will also benefit from the metadata plugin hook: #860

@zaneselvans

Are there common patterns for storing column-based metadata inside SQLite itself? I know Postgres allows "comment" fields, which this is kind of trying to replicate. Should the units and description and possibly other per-column metadata fields be combined into a single (tabular?) structure, that would be displayed above the data on the table / query results page?

@simonw
Owner Author

simonw commented Dec 2, 2020

SQLite does let you add comments in your CREATE TABLE statements:

CREATE TABLE something (
    id integer primary key, -- integer primary key
    created text -- created date as ISO datetime
);

But the only mechanism for reading those back is to retrieve that CREATE TABLE block of SQL from the sqlite_master table and run a parser against it.

I've so far resisted adding a SQL syntax parser to Datasette for complexity reasons - though I'm increasingly thinking I'll need to do it at some point.

I think I'll leave this to plugins. I'm definitely going to build a plugin that lets you store metadata for tables and columns in a SQLite database table, which will then support interactively editing metadata through a UI.

A plugin which extracts column comments from the SQLite CREATE TABLE comments would be feasible too, if I design the plugin hooks well.
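
A rough sketch of that extraction idea using only the standard library. It reads the stored `CREATE TABLE` text from `sqlite_master` and applies a deliberately naive line-by-line regular expression rather than a real SQL parser, so it assumes one column per line and no `-` characters before the comment. The function name `column_comments` is hypothetical.

```python
import re
import sqlite3

CREATE_SQL = """
CREATE TABLE something (
    id integer primary key, -- integer primary key
    created text -- created date as ISO datetime
);
"""

# Naive pattern: optional quoting, column name, the rest of the definition
# (no hyphens allowed), then a trailing "-- comment".
COLUMN_COMMENT_RE = re.compile(r'\s*["\[`]?(\w+)["\]`]?\s+[^-]*--\s*(.+?)\s*$')


def column_comments(conn, table):
    """Return {column_name: comment} parsed from the stored CREATE TABLE SQL."""
    row = conn.execute(
        "select sql from sqlite_master where type = 'table' and name = ?",
        (table,),
    ).fetchone()
    if row is None or row[0] is None:
        return {}
    comments = {}
    for line in row[0].splitlines():
        match = COLUMN_COMMENT_RE.match(line)
        if match:
            comments[match.group(1)] = match.group(2)
    return comments


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_SQL)
comments = column_comments(conn, "something")
print(comments)
```

Anything beyond this simple layout (multiple columns per line, defaults such as `-1`, block comments) would need a proper parser, which is the complexity concern raised above.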

@zaneselvans

Are you thinking that those metadata tables would be added to the SQLite DB by Datasette, when you tell it to wrap up the database, with the metadata coming from the metadata.json? Would it be easy to allow the prepopulation of those tables in the database itself? We've been struggling with the best way to make sure that the data is always accompanied by metadata, and baking it all into the database itself would be nice, since then we wouldn't need to worry about separately distributing different files in different contexts.

@simonw
Owner Author

simonw commented Dec 2, 2020

My idea is that if you installed my proposed plugin you wouldn't need metadata.json at all. Your metadata would instead live in a table in the connected SQLite database files: either one table per database (so the metadata can live in the same place as the data) or maybe also in a dedicated separate database file, for when you want to add metadata to an otherwise read-only database.

The plugin would then provide a UI for editing that metadata - maybe by configuring some writable canned queries or maybe something more custom than that. Or you could edit the metadata by manually editing the SQLite database file (or loading data into it using a tool like yaml-to-sqlite).
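
A minimal sketch of what such a metadata table might look like. The table name `_column_metadata` and its columns are assumptions made for illustration; no plugin with this schema is described in this thread. The example rows reuse the `something` table from the earlier comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical layout: one row per described column.
    create table _column_metadata (
        table_name text not null,
        column_name text not null,
        description text not null,
        primary key (table_name, column_name)
    );
    insert into _column_metadata (table_name, column_name, description) values
        ('something', 'id', 'integer primary key'),
        ('something', 'created', 'created date as ISO datetime');
    """
)


def descriptions_for_table(conn, table):
    """Return {column_name: description} for one table from the metadata table."""
    rows = conn.execute(
        "select column_name, description from _column_metadata "
        "where table_name = ?",
        (table,),
    ).fetchall()
    return dict(rows)


# Editing metadata is then an ordinary SQL write.
conn.execute(
    "update _column_metadata set description = ? "
    "where table_name = ? and column_name = ?",
    ("row identifier", "something", "id"),
)
table_descriptions = descriptions_for_table(conn, "something")
print(table_descriptions)
```

Because the descriptions are plain rows, they travel with the database file and can be edited with any SQLite tool.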

@mroswell
Sponsor Contributor

I like this idea. Though it might be nice to have some kind of automated system from database to file, so that developers could easily track diffs.
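
One way this could work, sketched under assumptions: the metadata lives in a hypothetical `_column_metadata` table, and an export step writes it out as deterministic JSON (sorted keys, fixed indentation) so that diffs stay small. The `"columns"` key follows the shape proposed at the top of this issue; the function name is made up.

```python
import json
import sqlite3


def export_metadata(conn, database_name):
    """Serialize the hypothetical _column_metadata table as metadata-style JSON."""
    tables = {}
    query = (
        "select table_name, column_name, description "
        "from _column_metadata order by table_name, column_name"
    )
    for table_name, column_name, description in conn.execute(query):
        table = tables.setdefault(table_name, {"columns": {}})
        table["columns"][column_name] = description
    metadata = {"databases": {database_name: {"tables": tables}}}
    # sort_keys and a fixed indent keep the output stable between runs.
    return json.dumps(metadata, indent=4, sort_keys=True) + "\n"


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table _column_metadata ("
    "table_name text, column_name text, description text)"
)
conn.execute(
    "insert into _column_metadata values (?, ?, ?)",
    ("something", "created", "created date as ISO datetime"),
)
exported = export_metadata(conn, "mydb")
print(exported)
```

Committing the exported file alongside the database would give developers a text artifact to review in version control.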

@simonw
Owner Author

simonw commented Aug 12, 2021

I'm going with "columns": {"name-of-column": "description-of-column"}.

If I decide to make "col" and "nocol" available in metadata I'll use those as the keys in the metadata, for consistency with the existing query string parameters.

I'm OK with having both "columns": ... and "col": ... keys in the metadata, even though they could be a tiny bit confusing without the documentation.

@simonw
Owner Author

simonw commented Aug 12, 2021

Prototype:

[Screenshot: fixtures sortable table, 201 rows]

diff --git a/datasette/static/app.css b/datasette/static/app.css
index c6be1e9..5ca64cb 100644
--- a/datasette/static/app.css
+++ b/datasette/static/app.css
@@ -784,9 +784,14 @@ svg.dropdown-menu-icon {
     font-size: 0.7em;
     color: #666;
     margin: 0;
-    padding: 0;
     padding: 4px 8px 4px 8px;
 }
+.dropdown-menu .dropdown-column-description {
+    margin: 0;
+    color: #666;
+    padding: 4px 8px 4px 8px;
+    max-width: 20em;
+}
 .dropdown-menu li {
     border-bottom: 1px solid #ccc;
 }
diff --git a/datasette/static/table.js b/datasette/static/table.js
index 991346d..a903112 100644
--- a/datasette/static/table.js
+++ b/datasette/static/table.js
@@ -9,6 +9,7 @@ var DROPDOWN_HTML = `<div class="dropdown-menu">
   <li><a class="dropdown-not-blank" href="#">Show not-blank rows</a></li>
 </ul>
 <p class="dropdown-column-type"></p>
+<p class="dropdown-column-description"></p>
 </div>`;
 
 var DROPDOWN_ICON_SVG = `<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
@@ -166,6 +167,14 @@ var DROPDOWN_ICON_SVG = `<svg xmlns="http://www.w3.org/2000/svg" width="14" heig
     } else {
       columnTypeP.style.display = "none";
     }
+
+    var columnDescriptionP = menu.querySelector(".dropdown-column-description");
+    if (th.dataset.columnDescription) {
+      columnDescriptionP.innerText = th.dataset.columnDescription;
+      columnDescriptionP.style.display = 'block';
+    } else {
+      columnDescriptionP.style.display = 'none';
+    }
     menu.style.position = "absolute";
     menu.style.top = menuTop + 6 + "px";
     menu.style.left = menuLeft + "px";
diff --git a/datasette/templates/_table.html b/datasette/templates/_table.html
index d765937..649f517 100644
--- a/datasette/templates/_table.html
+++ b/datasette/templates/_table.html
@@ -4,7 +4,7 @@
         <thead>
             <tr>
                 {% for column in display_columns %}
-                    <th class="col-{{ column.name|to_css_class }}" scope="col" data-column="{{ column.name }}" data-column-type="{{ column.type }}" data-column-not-null="{{ column.notnull }}" data-is-pk="{% if column.is_pk %}1{% else %}0{% endif %}">
+                    <th {% if column.description %}data-column-description="{{ column.description }}" {% endif %}class="col-{{ column.name|to_css_class }}" scope="col" data-column="{{ column.name }}" data-column-type="{{ column.type }}" data-column-not-null="{{ column.notnull }}" data-is-pk="{% if column.is_pk %}1{% else %}0{% endif %}">
                         {% if not column.sortable %}
                             {{ column.name }}
                         {% else %}
diff --git a/datasette/views/table.py b/datasette/views/table.py
index 456d806..486a613 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -125,6 +125,7 @@ class RowTableShared(DataView):
         """Returns columns, rows for specified table - including fancy foreign key treatment"""
         db = self.ds.databases[database]
         table_metadata = self.ds.table_metadata(database, table)
+        column_descriptions = table_metadata.get("columns") or {}
         column_details = {col.name: col for col in await db.table_column_details(table)}
         sortable_columns = await self.sortable_columns_for_table(database, table, True)
         pks = await db.primary_keys(table)
@@ -147,6 +148,7 @@ class RowTableShared(DataView):
                     "is_pk": r[0] in pks_for_display,
                     "type": type_,
                     "notnull": notnull,
+                    "description": column_descriptions.get(r[0]),
                 }
             )

@simonw
Owner Author

simonw commented Aug 12, 2021

I like this. Need to solve for mobile though where the cog menu isn't visible - I think I'll do that with a definition list at the top of the page.

@zaneselvans

zaneselvans commented Aug 12, 2021 via email

@simonw
Owner Author

simonw commented Aug 12, 2021

Prototype with a <dl>:

[Screenshot: fixtures sortable table, 201 rows]

diff --git a/datasette/static/app.css b/datasette/static/app.css
index c6be1e9..bf068fd 100644
--- a/datasette/static/app.css
+++ b/datasette/static/app.css
@@ -836,6 +841,16 @@ svg.dropdown-menu-icon {
     background-repeat: no-repeat;
 }
 
+dl.column-descriptions dt {
+    font-weight: bold;
+}
+dl.column-descriptions dd {
+    padding-left: 1.5em;
+    white-space: pre-wrap;
+    line-height: 1.1em;
+    color: #666;
+}
+
 .anim-scale-in {
     animation-name: scale-in;
     animation-duration: 0.15s;
diff --git a/datasette/templates/table.html b/datasette/templates/table.html
index 211352b..466e8a4 100644
--- a/datasette/templates/table.html
+++ b/datasette/templates/table.html
@@ -51,6 +51,14 @@
 
 {% block description_source_license %}{% include "_description_source_license.html" %}{% endblock %}
 
+{% if metadata.columns %}
+<dl class="column-descriptions">
+    {% for column_name, column_description in metadata.columns.items() %}
+        <dt>{{ column_name }}</dt><dd>{{ column_description }}</dd>
+    {% endfor %}
+</dl>
+{% endif %}
+
 {% if filtered_table_rows_count or human_description_en %}
     <h3>{% if filtered_table_rows_count or filtered_table_rows_count == 0 %}{{ "{:,}".format(filtered_table_rows_count) }} row{% if filtered_table_rows_count == 1 %}{% else %}s{% endif %}{% endif %}
         {% if human_description_en %}{{ human_description_en }}{% endif %}
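
For readers less familiar with Jinja, the `<dl>` block added in this diff can be approximated in plain Python. This is an illustrative sketch, not Datasette code: the function name is hypothetical, and `html.escape` stands in for the escaping the template engine is assumed to apply.

```python
from html import escape


def render_column_descriptions(columns):
    """Render {column_name: description} as a definition list, or '' if empty."""
    if not columns:
        # Mirrors the template's {% if metadata.columns %} guard.
        return ""
    items = "\n".join(
        "    <dt>{}</dt><dd>{}</dd>".format(escape(name), escape(description))
        for name, description in columns.items()
    )
    return '<dl class="column-descriptions">\n{}\n</dl>'.format(items)


rendered = render_column_descriptions(
    {"name_of_column": "column_description goes here"}
)
print(rendered)
```

Unlike the cog menu, the list is part of the page body, so it remains readable on mobile layouts.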

@simonw
Owner Author

simonw commented Aug 12, 2021

I like this enough that I'm going to ship it as an alpha and try it out on a couple of live projects.

@simonw
Owner Author

simonw commented Aug 12, 2021

@simonw
Owner Author

simonw commented Aug 13, 2021

And on mobile:

[Screenshot: mobile layout]

@kokes

kokes commented Oct 19, 2021

@simonw I know this is closed, just found this via the annotated release notes, but I wanted to note this one thing:

Not sure how widely used this is, but I've seen CSVW a couple times in the wild. It is trying to address these metadata challenges in a standardised way.

See e.g.

I'm not suggesting you change the syntax you've implemented, just letting you know of this effort by W3C.

simonw added a commit that referenced this issue Oct 24, 2021
simonw removed this from the Datasette Next milestone Jan 13, 2022