datasette publish needs support for the new config/metadata split #2195

Open
simonw opened this issue Sep 21, 2023 · 12 comments

Comments

@simonw
Owner

simonw commented Sep 21, 2023

... which raises the challenge that datasette publish doesn't yet know what to do with a config file!

Originally posted by @simonw in #2194 (comment)

@simonw
Owner Author

simonw commented Sep 21, 2023

As soon as datasette publish cloudrun has this I can re-enable this bit of the demo deploy:

- name: Make some modifications to metadata.json
  run: |
    cat fixtures.json | \
      jq '.databases |= . + {"ephemeral": {"allow": {"id": "*"}}}' | \
      jq '.plugins |= . + {"datasette-ephemeral-tables": {"table_ttl": 900}}' \
      > metadata.json
    cat metadata.json

Which should fix this broken demo from https://simonwillison.net/2022/Dec/2/datasette-write-api/

https://todomvc.datasette.io/
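
For readers less familiar with jq, the two filters in that workflow step can be sketched in Python. This is an illustrative equivalent rather than part of the actual deploy script, and the function name add_ephemeral_config is hypothetical.

```python
import json


def add_ephemeral_config(metadata):
    """Return a copy of metadata with the two entries the jq filters add.

    Mirrors `.databases |= . + {...}` and `.plugins |= . + {...}`:
    a shallow merge where the new key wins if it already exists.
    """
    updated = dict(metadata)
    updated["databases"] = {
        **(metadata.get("databases") or {}),
        "ephemeral": {"allow": {"id": "*"}},
    }
    updated["plugins"] = {
        **(metadata.get("plugins") or {}),
        "datasette-ephemeral-tables": {"table_ttl": 900},
    }
    return updated


# Small in-memory example (the real input would be the contents of fixtures.json).
example = add_ephemeral_config({"title": "Demo", "databases": {"fixtures": {}}})
print(json.dumps(example, indent=2))
```

To reproduce the workflow step, the input would be loaded from fixtures.json with json.load and the result written to metadata.json with json.dump.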

@simonw
Owner Author

simonw commented Sep 21, 2023

The @add_common_publish_arguments_and_options decorator described here is bad. If I update it to support a new config option all plugins that use it will break.

@hookimpl
def publish_subcommand(publish):
    @publish.command()
    @add_common_publish_arguments_and_options
    @click.option(
        "-k",
        "--api_key",
        help="API key for talking to my hosting provider",
    )

I want to deprecate it and switch to a different, better design to address the same problem.
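
The breakage follows from how Click hands every declared option to the command function as a keyword argument: a plugin function with a fixed signature raises TypeError as soon as the shared decorator starts supplying an option it does not accept. Here is a minimal sketch of that failure mode without any Click dependency. All names are hypothetical stand-ins, and "config" represents the newly added common option.

```python
import functools


def add_common_options(func):
    """Stand-in for a shared decorator that decides which options get passed."""

    @functools.wraps(func)
    def wrapper(**cli_values):
        # Hypothetical defaults; "config" is the option added after plugins shipped.
        values = {"metadata": None, "config": None}
        values.update(cli_values)
        return func(**values)

    return wrapper


@add_common_options
def old_plugin_command(metadata):
    # Written before "config" existed, so it only accepts "metadata".
    return f"published with {metadata}"


@add_common_options
def tolerant_plugin_command(metadata, **future_options):
    # Accepts options it does not know about, so new ones do not break it.
    return f"published with {metadata}"


try:
    old_plugin_command(metadata="metadata.json")
except TypeError as error:
    print("old plugin breaks:", error)

print(tolerant_plugin_command(metadata="metadata.json"))
```

Accepting unknown keyword arguments is only one possible mitigation; the thread does not say which replacement design is intended.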

@simonw
Owner Author

simonw commented Sep 21, 2023

I think the actual design of this is pretty simple. Current help starts like this:

Usage: datasette publish cloudrun [OPTIONS] [FILES]...

  Publish databases to Datasette running on Cloud Run

Options:
  -m, --metadata FILENAME         Path to JSON/YAML file containing metadata
                                  to publish
  --extra-options TEXT            Extra options to pass to datasette serve

The -s and -c short options are not being used.

So I think -c/--config can point to a JSON or YAML datasette.yaml file, and -s/--setting key value can mirror the new -s/--setting option in datasette serve itself (a shortcut for populating the config file directly from the CLI).

Here's the relevant help section from datasette serve:

  -m, --metadata FILENAME         Path to JSON/YAML file containing
                                  license/source metadata
  -c, --config FILENAME           Path to JSON/YAML Datasette configuration
                                  file
  -s, --setting SETTING...        nested.key, value setting to use in
                                  Datasette configuration
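
To make the "shortcut for populating the config file" idea concrete, here is a rough sketch of how repeated -s/--setting pairs could be folded into a nested configuration structure. It is not Datasette's own implementation: the function name settings_to_config is hypothetical, and parsing each value as JSON with a fallback to a plain string is an assumption made for this example.

```python
import json


def settings_to_config(pairs):
    """Fold (dotted.key, value) pairs into a nested dict."""
    config = {}
    for dotted_key, raw_value in pairs:
        # Assumption: treat values as JSON where possible, else keep the string.
        try:
            value = json.loads(raw_value)
        except json.JSONDecodeError:
            value = raw_value
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            # Walk down, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


pairs = [
    ("settings.sql_time_limit_ms", "5000"),
    ("plugins.datasette-ephemeral-tables.table_ttl", "900"),
    ("settings.allow_download", "on"),
]
print(json.dumps(settings_to_config(pairs), indent=2))
```

A publish command built this way could write the resulting structure to a config file inside the deployed container, which is what would let -s and -c share one code path.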

@simonw
Owner Author

simonw commented Sep 21, 2023

Here's the full help for Cloud Run at the moment:

datasette publish cloudrun --help
Usage: datasette publish cloudrun [OPTIONS] [FILES]...

  Publish databases to Datasette running on Cloud Run

Options:
  -m, --metadata FILENAME         Path to JSON/YAML file containing metadata
                                  to publish
  --extra-options TEXT            Extra options to pass to datasette serve
  --branch TEXT                   Install datasette from a GitHub branch e.g.
                                  main
  --template-dir DIRECTORY        Path to directory containing custom
                                  templates
  --plugins-dir DIRECTORY         Path to directory containing custom plugins
  --static MOUNT:DIRECTORY        Serve static files from this directory at
                                  /MOUNT/...
  --install TEXT                  Additional packages (e.g. plugins) to
                                  install
  --plugin-secret <TEXT TEXT TEXT>...
                                  Secrets to pass to plugins, e.g. --plugin-
                                  secret datasette-auth-github client_id xxx
  --version-note TEXT             Additional note to show on /-/versions
  --secret TEXT                   Secret used for signing secure values, such
                                  as signed cookies
  --title TEXT                    Title for metadata
  --license TEXT                  License label for metadata
  --license_url TEXT              License URL for metadata
  --source TEXT                   Source label for metadata
  --source_url TEXT               Source URL for metadata
  --about TEXT                    About label for metadata
  --about_url TEXT                About URL for metadata
  -n, --name TEXT                 Application name to use when building
  --service TEXT                  Cloud Run service to deploy (or over-write)
  --spatialite                    Enable SpatialLite extension
  --show-files                    Output the generated Dockerfile and
                                  metadata.json
  --memory TEXT                   Memory to allocate in Cloud Run, e.g. 1Gi
  --cpu [1|2|4]                   Number of vCPUs to allocate in Cloud Run
  --timeout INTEGER               Build timeout in seconds
  --apt-get-install TEXT          Additional packages to apt-get install
  --max-instances INTEGER         Maximum Cloud Run instances
  --min-instances INTEGER         Minimum Cloud Run instances
  --help                          Show this message and exit.

@simonw
Owner Author

simonw commented Sep 21, 2023

I'd really like to remove --extra-options. I think the new design makes that completely obsolete?

Maybe it doesn't. You still need --extra-options for the --crossdb option for example.

@simonw
Owner Author

simonw commented Sep 21, 2023

https://github.com/search?q=datasette+publish+extra-options+language%3AShell&type=code&l=Shell shows 17 matches. I'll copy in some illustrative examples here:

--extra-options="--setting sql_time_limit_ms 5000"
--extra-options="--config default_cache_ttl:3600 --config hash_urls:1"
--extra-options "--setting sql_time_limit_ms 3500 --setting default_page_size 20 --setting trace_debug 1"
--extra-options="--config default_page_size:50 --config sql_time_limit_ms:30000 --config facet_time_limit_ms:10000"
--extra-options="--setting sql_time_limit_ms 5000"
--extra-options "--setting suggest_facets off --setting allow_download on --setting truncate_cells_html 0 --setting max_csv_mb 0 --setting sql_time_limit_ms 2000"

@simonw
Owner Author

simonw commented Sep 21, 2023

Found more when I searched for YAML.

Here's the most interesting: https://github.com/labordata/warehouse/blob/0029a72fc1ceae9091932da6566f891167179012/.github/workflows/build.yml#L59

--extra-options="--crossdb --setting sql_time_limit_ms 100000 --cors --setting facet_time_limit_ms 500 --setting allow_facet off --setting trace_debug 1"

Uses both --cors and --crossdb.
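
One way to see how much of a given --extra-options string a -s/--setting option on datasette publish could absorb is to split the string into settings and everything else. The sketch below is only an illustration: split_extra_options is a hypothetical helper, and it assumes that --config key:value in these older examples is the legacy spelling of --setting key value.

```python
import shlex


def split_extra_options(extra_options):
    """Separate settings from other flags in an --extra-options string."""
    tokens = shlex.split(extra_options)
    settings = {}
    other_flags = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "--setting" and i + 2 < len(tokens):
            # New style: --setting key value
            settings[tokens[i + 1]] = tokens[i + 2]
            i += 3
        elif token == "--config" and i + 1 < len(tokens) and ":" in tokens[i + 1]:
            # Assumed legacy style: --config key:value
            key, value = tokens[i + 1].split(":", 1)
            settings[key] = value
            i += 2
        else:
            # Anything else (e.g. --crossdb, --cors) still needs another route.
            other_flags.append(token)
            i += 1
    return settings, other_flags


settings, other_flags = split_extra_options(
    "--crossdb --setting sql_time_limit_ms 100000 --cors "
    "--setting facet_time_limit_ms 500 --setting allow_facet off "
    "--setting trace_debug 1"
)
print(settings)
print(other_flags)
```

For the labordata example, the leftover flags are exactly --crossdb and --cors, which is why settings support alone does not make --extra-options obsolete.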

@simonw
Owner Author

simonw commented Sep 21, 2023

Maybe I should add --cors and --crossdb to datasette publish cloudrun as well?

@simonw
Owner Author

simonw commented Sep 21, 2023

Worth noting that it already sets --cors automatically without you needing to specify it:

cmd.extend(["--cors", "--inspect-file", "inspect-data.json"])

I wonder if that's actually surprising behaviour that we should change before 1.0.
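
If --cors were to become opt-in alongside a new --crossdb flag, the command assembly might look roughly like the following. This is a hypothetical sketch, not the existing Datasette code: build_serve_args and its parameters are invented for illustration, and only the --cors, --crossdb, and --inspect-file flags come from the thread.

```python
import shlex


def build_serve_args(files, cors=False, crossdb=False, extra_options=None):
    """Assemble a datasette serve argument list with opt-in flags."""
    cmd = ["datasette", "serve"]
    cmd.extend(files)
    if cors:
        # Opt-in here, whereas the current code always adds --cors.
        cmd.append("--cors")
    if crossdb:
        cmd.append("--crossdb")
    cmd.extend(["--inspect-file", "inspect-data.json"])
    if extra_options:
        # Keep --extra-options as an escape hatch for anything not modelled.
        cmd.extend(shlex.split(extra_options))
    return cmd


print(build_serve_args(["fixtures.db"], cors=True, crossdb=True))
```

Changing the default would alter behaviour for existing deployments that rely on the implicit --cors, which is presumably why the question is framed as something to settle before 1.0.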

@simonw
Owner Author

simonw commented Feb 6, 2024

Once I have this working I should use it to ship a demo of a datasette.yml file somewhere, so I can then add a link to that demo to the new documentation for:

It should go in this section: https://docs.datasette.io/en/latest/introspection.html#config

@benhur07b

Hi @simonw,

Awesome work with Datasette as always. We talked previously with a colleague about a project we were interested in hosting on Datasette Cloud. While waiting for DS Cloud to be ready, we decided to publish Datasette ourselves first. We were able to deploy the latest version (1.0a13) on Google Cloud Run but I did notice that datasette publish did not support an option for the datasette.yaml/json.

Because of this, our deployed instance lacks features that are available when run locally. Specifically, these options (previously in metadata.yaml, now moved to datasette.yaml) are missing from our instance:

  • extra_css_urls (i.e. the custom styles aren't being rendered)
  • canned queries (i.e. the canned queries aren't being listed w/ the database)

As a test, we published the same database with Datasette stable (0.64.6) using the old format of metadata.yaml file (i.e. contains extra_css_urls and canned queries) and the custom styling and canned queries work as expected. From what I understand, we should be able to use the same metadata.yaml file with version 1.0a13. However, when we tested deploying Datasette latest using the same metadata.yaml file (using the old format), the custom CSS and canned queries did not work but everything else was working fine. We used the same parameters for datasette publish in both cases with the only difference being one used 0.64.6 while the other used 1.0a13.

I wanted to ask if there are any workarounds or solutions that would allow me to specify the aforementioned options (extra_css_urls, canned queries) during the deployment of 1.0a13 using datasette publish (similar to --extra-options perhaps?). Maybe I missed something in the documentation. Or do you think we'd need to deploy datasette manually for this to work?

Lastly, although the stable version works as intended, we plan to use the 1.0 version because we want to include the datasette-write-ui plugin (which requires v1.0a3) in the deployment. As such, we're also interested to know the status of datasette publish's support for the datasette.yaml file.

Any help or pointers would be much appreciated.

Thanks!

@benhur07b

Hi Simon,

Just an update: we managed to overcome the limitations mentioned above by deploying datasette 1.0a13 on a dedicated VPS (there were other requirements that Cloud Run wasn't able to address, such as having mutable databases and using the upload-csvs plugin), but we're still very much interested in seeing, and maybe even helping, datasette publish eventually support settings in datasette.yaml.
