Replace most of .inspect() (and datasette inspect) with table counting #462

Closed
simonw opened this issue May 11, 2019 · 4 comments

simonw commented May 11, 2019

This is the last part of #419 - with the move to supporting mutable databases by default, the inspect-data mechanism currently in use no longer makes much sense.

The one optimization I think is worth keeping for databases opened in immutable mode is the cached table counts. I think datasette inspect should be cut down to only counting the rows in the tables - the other things done by inspect (figuring out columns, foreign key relationships, FTS etc) should all be fast enough that they can be reliably performed at runtime even against large databases.

If performing them at runtime has performance issues, I would rather cache those results internally within Datasette after they are first calculated than continue to support them in the datasette inspect command - to keep things simpler.
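For illustration, the row counting that would remain in datasette inspect could look roughly like the sketch below. It uses only the standard library sqlite3 module; the function name count_table_rows is hypothetical and this is not Datasette's actual implementation.

```python
import sqlite3


def count_table_rows(conn):
    """Return {table_name: row_count} for every table in the database.

    Hypothetical helper for illustration, not part of Datasette.
    """
    names = [
        row[0]
        for row in conn.execute(
            "select name from sqlite_master where type = 'table'"
        )
    ]
    counts = {}
    for name in names:
        # Table names cannot be bound as SQL parameters, so quote the identifier.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(
            "select count(*) from {}".format(quoted)
        ).fetchone()[0]
    return counts


# Small in-memory demonstration
conn = sqlite3.connect(":memory:")
conn.execute("create table facetable (id integer primary key, state text)")
conn.executemany(
    "insert into facetable (state) values (?)", [("CA",), ("MI",), ("MC",)]
)
print(count_table_rows(conn))
```

A full count(*) scan is the slow part on large tables, which is why caching it ahead of time is attractive for immutable files.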

simonw added the medium label May 11, 2019
simonw added this to the 0.28 milestone May 11, 2019

simonw commented May 11, 2019

test_inspect.py currently just contains two tests that exercise a small portion of what .inspect() does - I'm going to repurpose that module and have it only test the datasette inspect CLI command instead.

Here's the current contents of that file: https://github.com/simonw/datasette/blob/ce09e5d2d392634eced44c3c8d603d7c628e2822/tests/test_inspect.py


simonw commented May 11, 2019

So I think datasette inspect fixtures.db other.db should output something like this:

{
  "fixtures": {
    "hash": "894870db97229e9e18b40921dc32b581da813465d672445e96e040ab2adbd229",
    "file": "fixtures.db",
    "size": 225280,
    "tables": {
      "facetable": {
        "count": 34
      }
    }
  }
}

It currently writes it out to a file called inspect-data.json. Should I keep that as the default behaviour or switch it to outputting to stdout instead?
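A minimal sketch of how a structure of that shape might be assembled, using only the standard library. Several things here are assumptions: that "hash" is a SHA-256 of the file contents (the 64-character hex value above is consistent with that, but it is not confirmed here), that the top-level key is the filename without its extension, and the function names inspect_file and inspect_files, which are hypothetical.

```python
import hashlib
import sqlite3
from pathlib import Path


def inspect_file(path):
    """Build the per-database dict: hash, file, size and table counts."""
    path = Path(path)
    sha = hashlib.sha256()  # assumption: the hash is SHA-256 of the file bytes
    with path.open("rb") as fp:
        # Read in 1 MB chunks so large database files are not loaded into memory.
        for chunk in iter(lambda: fp.read(1024 * 1024), b""):
            sha.update(chunk)
    conn = sqlite3.connect(str(path))
    try:
        names = [
            row[0]
            for row in conn.execute(
                "select name from sqlite_master where type = 'table'"
            )
        ]
        tables = {}
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'
            count = conn.execute(
                "select count(*) from {}".format(quoted)
            ).fetchone()[0]
            tables[name] = {"count": count}
    finally:
        conn.close()
    return {
        "hash": sha.hexdigest(),
        "file": path.name,
        "size": path.stat().st_size,
        "tables": tables,
    }


def inspect_files(paths):
    """Key each database by its filename stem, e.g. fixtures.db -> fixtures."""
    return {Path(p).stem: inspect_file(p) for p in paths}
```

Passing the result of inspect_files(["fixtures.db", "other.db"]) to json.dumps(..., indent=2) would give output in the format shown above.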

Here's the current datasette inspect --help:

Usage: datasette inspect [OPTIONS] [FILES]...

Options:
  --inspect-file TEXT
  --load-extension PATH  Path to a SQLite extension to load
  --help                 Show this message and exit.


simonw commented May 11, 2019

I'm going to change it to output to stdout unless you pass it the --inspect-file argument.
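As a rough sketch of that behaviour, independent of how the real CLI is wired up: write the JSON to stdout by default, and to a file only when a path is supplied. The function name write_inspect_output and its parameters are hypothetical.

```python
import json
import sys


def write_inspect_output(data, inspect_file=None, stdout=None):
    """Write inspect data as JSON to stdout, or to inspect_file if given.

    Hypothetical helper: inspect_file stands in for the --inspect-file option.
    """
    text = json.dumps(data, indent=2)
    if inspect_file is None:
        # Default: print to stdout so the output can be piped or redirected.
        (stdout or sys.stdout).write(text + "\n")
    else:
        with open(inspect_file, "w") as fp:
            fp.write(text)


# With no file argument the JSON goes to stdout
write_inspect_output({"fixtures": {"tables": {"facetable": {"count": 34}}}})
```

Defaulting to stdout means the old behaviour is still one shell redirect away, while the explicit option keeps existing scripts that pass a path working.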

simonw added a commit that referenced this issue May 11, 2019
Refs #462

* inspect command now just outputs table counts
* test_inspect.py now only contains tests for that CLI command
* Updated some relevant documentation
* Removed docs for /-/inspect since that is about to change

simonw commented May 11, 2019

I now need to update datasette serve ... --inspect-data=X to understand and correctly handle the new format.
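One possible shape for that handling, as a sketch only: pull the cached counts out of the new format and fall back to a live query when a table is not present. The names cached_table_counts and table_count are hypothetical and do not reflect Datasette's internals; the sample data reuses the values from the example output earlier in this thread.

```python
def cached_table_counts(inspect_data):
    """Map database name -> {table name: cached row count}."""
    return {
        db_name: {
            table: info["count"]
            for table, info in db_info.get("tables", {}).items()
        }
        for db_name, db_info in inspect_data.items()
    }


def table_count(cached, db_name, table, count_live):
    """Prefer a cached count; otherwise call count_live() to run the query."""
    try:
        return cached[db_name][table]
    except KeyError:
        return count_live()


# Parsed contents of an inspect-data file in the new format
inspect_data = {
    "fixtures": {
        "hash": "894870db97229e9e18b40921dc32b581da813465d672445e96e040ab2adbd229",
        "file": "fixtures.db",
        "size": 225280,
        "tables": {"facetable": {"count": 34}},
    }
}
cached = cached_table_counts(inspect_data)
print(table_count(cached, "fixtures", "facetable", count_live=lambda: None))
```

Cached counts are only safe to trust for databases opened in immutable mode, since a mutable database can change after the counts were taken.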
