Plugin hook for instance/database/table metadata #860

Closed
simonw opened this issue Jun 21, 2020 · 10 comments

Comments

@simonw
Owner

simonw commented Jun 21, 2020

I'm not happy with how metadata.(json|yaml) keeps growing new features. Rather than having a single plugin hook for all of metadata.json I'm going to split out the feature that shows actual real metadata for tables and databases - source, license etc - into its own plugin-powered mechanism.

Originally posted by @simonw in #357 (comment)

@simonw
Owner Author

simonw commented Jun 21, 2020

This is also relevant to #639, and may mean I can close that ticket in favour of this one. I'm going to get this to at least a proof-of-concept stage first, though.

@simonw
Owner Author

simonw commented Nov 22, 2020

There are three layers of metadata: table, database and instance.

Currently the metadata fields are (ignoring not-quite-metadata like sort and sort_desc):

  • title
  • description (or description_html)
  • about / about_url
  • source / source_url
  • license / license_url

@simonw
Owner Author

simonw commented Nov 22, 2020

Open question: how should cascading work? If a table is missing a field but the database or instance has it, should that value cascade down to the table?

It feels like license should definitely cascade: if an instance lists a certain license that should absolutely filter through to all databases and tables.

But... should the other fields cascade? Cascading description doesn't feel right at all, and neither does title.

What about about and about_url and source and source_url? I'm a bit torn on whether they should cascade or not. I'm leaning towards cascading them.
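
One way to picture the cascading question is a lookup where only some keys are allowed to fall back to a parent level. The sketch below is illustrative only: `CASCADING_KEYS`, `lookup()` and the example values are made up, and the choice of which keys cascade simply follows the leanings described above rather than any settled behaviour.

```python
# Hypothetical policy: keys that may fall back to the database or instance level.
CASCADING_KEYS = {
    "license", "license_url",
    "source", "source_url",
    "about", "about_url",
}

# Example metadata in the metadata.json layout (values are made up).
METADATA = {
    "title": "My instance",
    "license": "CC BY 4.0",
    "databases": {
        "fixtures": {
            "source": "Example source",
            "tables": {
                "facetable": {"description": "A demo table"},
            },
        }
    },
}


def lookup(metadata, key, database=None, table=None):
    """Return the value for key, cascading only if the key is in CASCADING_KEYS."""
    db_meta = (metadata.get("databases") or {}).get(database) or {}
    table_meta = (db_meta.get("tables") or {}).get(table) or {}
    # Order the levels from most specific to least specific.
    levels = []
    if table is not None:
        levels.append(table_meta)
    if database is not None:
        levels.append(db_meta)
    levels.append(metadata)
    if key not in CASCADING_KEYS:
        # Non-cascading keys are read only from the level that was asked about.
        levels = levels[:1]
    for level in levels:
        if key in level:
            return level[key]
    return None


print(lookup(METADATA, "license", "fixtures", "facetable"))
print(lookup(METADATA, "title", "fixtures", "facetable"))
```

With this policy the table inherits the instance license, while asking for the table's title returns None because title is not in the cascading set.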

@simonw
Owner Author

simonw commented Nov 22, 2020

Documented behaviour right now, for metadata set at the instance level, is: https://docs.datasette.io/en/stable/metadata.html

The above metadata will be displayed on the index page of your Datasette-powered site. The source and license information will also be included in the footer of every page served by Datasette.

...

Metadata at the top level of the JSON will be shown on the index page and in the footer on every page of the site. The license and source is expected to apply to all of your data.

@simonw
Owner Author

simonw commented Nov 24, 2020

I see two ways this plugin hook could work. It could be asked about a specific instance, database or table and return the full metadata for that object. Or it could be asked for a specific metadata field - e.g. source_url for table X - and return just that value.

The more finely grained one would allow plugins to implement their own cascading rules pretty easily. Is there a reason it would be better for the hook to return an entire block of JSON for a specific table or database?

I also need to decide if this hook is just going to be about source/license/about displayed metadata, or if it will include the functionality that has been sneaking into metadata.json over time - stuff like page size, default sort order or default facets.

Perhaps I should split those out into a "configuration" concept first, after renaming --config to --setting in #992.

@simonw
Owner Author

simonw commented Nov 24, 2020

I'm going to go with a plugin hook (and Datasette method) that returns individual values - so you ask it for e.g. the license_url for a specific table and it returns a string or None.

The default plugin hook implementation that ships with Datasette will then implement cascading lookups against metadata.json - but other plugins will be able to provide their own implementations, which should make it easy to build a plugin that lets you keep metadata in a database file and edit it interactively.
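
A rough sketch of that shape, using plain Python callables rather than Datasette's real plugin machinery: each "provider" is asked for a single value and the first non-None answer wins. All of the names here (`Provider`, `make_json_provider`, `get_metadata`, `override_provider`) and the example URLs are assumptions for illustration, not an actual Datasette API.

```python
from typing import Callable, List, Optional

# A provider takes (key, database, table) and returns a value or None.
Provider = Callable[[str, Optional[str], Optional[str]], Optional[str]]


def make_json_provider(metadata: dict) -> Provider:
    """Default-style provider: cascading lookup against a metadata.json-like dict."""

    def provider(key, database, table):
        db_meta = (metadata.get("databases") or {}).get(database) or {}
        table_meta = (db_meta.get("tables") or {}).get(table) or {}
        # Most specific level first, instance level last.
        for level in (table_meta, db_meta, metadata):
            if key in level:
                return level[key]
        return None

    return provider


def get_metadata(providers: List[Provider], key, database=None, table=None):
    """Ask each provider in turn; the first non-None answer wins."""
    for provider in providers:
        value = provider(key, database, table)
        if value is not None:
            return value
    return None


# Stand-in for a plugin that keeps editable metadata somewhere else,
# for example in a database file. Here it is just an in-memory dict.
overrides = {
    ("fixtures", "facetable", "license_url"): "https://example.com/table-license",
}


def override_provider(key, database, table):
    return overrides.get((database, table, key))


json_provider = make_json_provider({"license_url": "https://example.com/license"})
providers = [override_provider, json_provider]

print(get_metadata(providers, "license_url", "fixtures", "facetable"))
print(get_metadata(providers, "license_url", "fixtures", "other_table"))
```

Because every provider answers one key at a time, each one is free to apply its own cascading rules, and a key that no provider knows about simply comes back as None.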

@simonw
Owner Author

simonw commented Nov 24, 2020

I'll also allow any key to be looked up - so if users want to invent their own metadata keys other than the default license_url etc they can do so.

@simonw
Owner Author

simonw commented Nov 24, 2020

In #942 I want to add support for per-column metadata - which means this new lookup mechanism will need to be able to answer the question "what description is available for this column".

So what should the .metadata() method look like? A couple of options:

  • datasette.metadata("description", table=x, database=y) - can take optional column= too.
  • datasette.table_metadata("description", table=x, database=y) and datasette.database_metadata("description", database=y) and so on - multiple methods for the different types of metadata.
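
To compare the two options side by side, here is a small sketch in which the per-level methods are thin wrappers around the single keyword-based method. `MetadataLookup` is a hypothetical helper, not part of Datasette, it does no cascading, and the nested "columns" layout for per-column metadata is an assumption made purely for the example.

```python
class MetadataLookup:
    """Hypothetical helper used to compare the two API shapes."""

    def __init__(self, data):
        self._data = data

    def _level(self, database=None, table=None, column=None):
        if table is not None and database is None:
            raise ValueError("table= requires database=")
        if column is not None and table is None:
            raise ValueError("column= requires table=")
        # Walk down to the dictionary for the requested level.
        level = self._data
        if database is not None:
            level = (level.get("databases") or {}).get(database) or {}
        if table is not None:
            level = (level.get("tables") or {}).get(table) or {}
        if column is not None:
            # Assumed layout: per-column metadata nested under "columns".
            level = (level.get("columns") or {}).get(column) or {}
        return level

    # Option 1: one method, keyword arguments select the level.
    def metadata(self, key, database=None, table=None, column=None):
        return self._level(database, table, column).get(key)

    # Option 2: one method per level, written as thin wrappers.
    def table_metadata(self, key, table, database):
        return self.metadata(key, database=database, table=table)

    def database_metadata(self, key, database):
        return self.metadata(key, database=database)


EXAMPLE = {
    "databases": {
        "db": {
            "description": "A database",
            "tables": {
                "t": {
                    "description": "A table",
                    "columns": {"c": {"description": "A column"}},
                }
            },
        }
    }
}

lookup = MetadataLookup(EXAMPLE)
print(lookup.metadata("description", database="db", table="t", column="c"))
print(lookup.table_metadata("description", table="t", database="db"))
```

The single-method form absorbs a new level such as column= with one extra keyword argument, whereas the multi-method form would need another method for each new level.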

@simonw
Owner Author

simonw commented Nov 24, 2020

Here's what I have today - it's an undocumented datasette.metadata() method that returns a full JSON dictionary of values OR a single value if the optional key= argument is provided:

datasette/datasette/app.py

Lines 357 to 388 in f2e2bfc

def metadata(self, key=None, database=None, table=None, fallback=True):
    """
    Looks up metadata, cascading backwards from specified level.
    Returns None if metadata value is not found.
    """
    assert not (
        database is None and table is not None
    ), "Cannot call metadata() with table= specified but not database="
    databases = self._metadata.get("databases") or {}
    search_list = []
    if database is not None:
        search_list.append(databases.get(database) or {})
    if table is not None:
        table_metadata = ((databases.get(database) or {}).get("tables") or {}).get(
            table
        ) or {}
        search_list.insert(0, table_metadata)
    search_list.append(self._metadata)
    if not fallback:
        # No fallback allowed, so just use the first one in the list
        search_list = search_list[:1]
    if key is not None:
        for item in search_list:
            if key in item:
                return item[key]
        return None
    else:
        # Return the merged list
        m = {}
        for item in search_list:
            m.update(item)
        return m
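
To see how this method behaves, the sketch below copies its logic into a standalone function that takes the metadata dictionary as its first argument. The name `lookup_metadata` and the example data are made up for illustration; the control flow mirrors the snippet above.

```python
def lookup_metadata(metadata, key=None, database=None, table=None, fallback=True):
    """Standalone copy of the lookup logic, for experimentation only."""
    assert not (
        database is None and table is not None
    ), "Cannot call with table= specified but not database="
    databases = metadata.get("databases") or {}
    search_list = []
    if database is not None:
        search_list.append(databases.get(database) or {})
    if table is not None:
        table_metadata = ((databases.get(database) or {}).get("tables") or {}).get(
            table
        ) or {}
        search_list.insert(0, table_metadata)
    search_list.append(metadata)
    if not fallback:
        # Only the most specific level is consulted.
        search_list = search_list[:1]
    if key is not None:
        for item in search_list:
            if key in item:
                return item[key]
        return None
    # No key given: merge every level in search order.
    merged = {}
    for item in search_list:
        merged.update(item)
    return merged


EXAMPLE = {
    "title": "Instance title",
    "license": "CC BY 4.0",
    "databases": {
        "fixtures": {
            "source": "Example source",
            "tables": {"facetable": {"title": "Facetable"}},
        }
    },
}

# Single-key lookups cascade from table to database to instance.
print(lookup_metadata(EXAMPLE, "license", "fixtures", "facetable"))
# With fallback=False only the table level is checked.
print(lookup_metadata(EXAMPLE, "license", "fixtures", "facetable", fallback=False))
# Without key=, the levels are merged into one dictionary.
merged = lookup_metadata(EXAMPLE, database="fixtures", table="facetable")
print(sorted(merged))
```

One detail worth noticing when experimenting: as the snippet is written, the merged dictionary is built by calling update() from the most specific level to the least specific, so a key defined at several levels ends up with the instance-level value there, while the single-key lookup returns the most specific value.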

@simonw
Owner Author

simonw commented Jun 26, 2021

This work is continuing in #1384.

@simonw simonw closed this as completed Jun 26, 2021
@simonw simonw removed this from the Datasette Next milestone Jan 13, 2022