Replace most of `.inspect()` (and `datasette inspect`) with table counting #462

Closed
simonw opened this issue May 11, 2019 · 4 comments

@simonw
Owner

commented May 11, 2019

This is the last part of #419 - with the move to supporting mutable databases by default, the inspect-data mechanism currently in use no longer makes much sense.

The one optimization I think is worth keeping for databases opened in immutable mode is the cached table counts. I think datasette inspect should be cut down to only counting the rows in the tables. The other things done by inspect (figuring out columns, foreign key relationships, FTS, etc.) should all be fast enough that they can be reliably performed at runtime even against large databases.
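A minimal sketch of the table-counting step described above, using only Python's standard `sqlite3` module. The function name `count_table_rows` is hypothetical and is not taken from the Datasette codebase; it only illustrates what "counting the rows in the tables" could look like.

```python
import sqlite3


def count_table_rows(db_path):
    """Return {table_name: row_count} for every table in a SQLite file.

    Hypothetical helper for illustration, not Datasette's implementation.
    """
    # Open read-only via a URI so the file is never modified.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "select name from sqlite_master where type = 'table'"
            )
        ]
        counts = {}
        for name in names:
            # Table names cannot be bound as parameters, so quote the identifier.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"select count(*) from {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

A full `count(*)` scan is the part that can be slow on large tables, which is why it is the one result worth computing ahead of time for immutable files.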

If performing them at runtime has performance issues, I would rather cache those results internally within Datasette after they are first calculated than continue to support them in the datasette inspect command, to keep things simpler.
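One way to picture "cache those results internally after they are first calculated" is a memoized lookup. The sketch below is an assumption for illustration only: `table_columns` is a hypothetical name, and `functools.lru_cache` is swapped in as the simplest standard-library cache, not as the mechanism Datasette uses.

```python
import sqlite3
from functools import lru_cache


@lru_cache(maxsize=None)
def table_columns(db_path, table):
    """Return column names for a table, computed once per (db_path, table).

    Hypothetical helper; the cache is never invalidated, so this is only
    safe for files that do not change while the process is running.
    """
    conn = sqlite3.connect(db_path)
    try:
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        return tuple(row[1] for row in rows)
    finally:
        conn.close()
```

For a mutable database the cached value could go stale after a schema change, so a real implementation would need some form of invalidation.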

simonw added the medium label May 11, 2019

simonw added this to the 0.28 milestone May 11, 2019

@simonw

Owner Author

commented May 11, 2019

test_inspect.py currently just contains two tests that exercise a small portion of what .inspect() does - I'm going to repurpose that module and have it only test the datasette inspect CLI command instead.

Here's the current contents of that file: https://github.com/simonw/datasette/blob/ce09e5d2d392634eced44c3c8d603d7c628e2822/tests/test_inspect.py

@simonw

Owner Author

commented May 11, 2019

So I think datasette inspect fixtures.db other.db should output something like this:

{
  "fixtures": {
    "hash": "894870db97229e9e18b40921dc32b581da813465d672445e96e040ab2adbd229",
    "file": "fixtures.db",
    "size": 225280,
    "tables": {
      "facetable": {
        "count": 34
      }
    }
  }
}
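A rough sketch of how one entry of that structure could be assembled. Several things here are assumptions rather than facts from this thread: the name `describe_database` is hypothetical, the `hash` is assumed to be a SHA-256 digest of the file contents (the example value is 64 hex characters, which is consistent with that), and the top-level key is assumed to be the filename without its extension.

```python
import hashlib
import os


def describe_database(path, table_counts):
    """Return (name, entry) for one database file.

    table_counts is a {table_name: row_count} dict computed elsewhere.
    Hypothetical helper for illustration only.
    """
    sha = hashlib.sha256()
    with open(path, "rb") as fp:
        # Read in 1 MB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fp.read(1024 * 1024), b""):
            sha.update(chunk)
    # Assumption: "fixtures.db" is keyed as "fixtures".
    name = os.path.splitext(os.path.basename(path))[0]
    entry = {
        "hash": sha.hexdigest(),
        "file": path,
        "size": os.path.getsize(path),
        "tables": {t: {"count": n} for t, n in table_counts.items()},
    }
    return name, entry
```

Calling this once per file and collecting the results into a single dict would give one top-level key per database, which is the shape shown above.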

It currently writes it out to a file called inspect-data.json. Should I keep that as the default behaviour or switch it to outputting to stdout instead?

Here's the current datasette inspect --help:

Usage: datasette inspect [OPTIONS] [FILES]...

Options:
  --inspect-file TEXT
  --load-extension PATH  Path to a SQLite extension to load
  --help                 Show this message and exit.
@simonw

Owner Author

commented May 11, 2019

I'm going to change it to output to stdout unless you pass it the --inspect-file argument.
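The decision above amounts to a small branch on whether a file path was supplied. A sketch under stated assumptions: `write_inspect_data` and its parameters are hypothetical names, and the `stdout` argument exists only to make the function easy to test.

```python
import json
import sys


def write_inspect_data(data, inspect_file=None, stdout=None):
    """Write inspect data as JSON to a file if given, otherwise to stdout.

    Hypothetical helper; inspect_file stands in for the value of the
    --inspect-file option.
    """
    text = json.dumps(data, indent=2)
    if inspect_file is None:
        out = sys.stdout if stdout is None else stdout
        out.write(text + "\n")
    else:
        with open(inspect_file, "w") as fp:
            fp.write(text)
```

Defaulting to stdout makes the command composable with shell redirection, while the explicit option still covers the previous write-to-file behaviour.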

simonw added a commit that referenced this issue May 11, 2019

"datasette inspect foo.db" now just calculates table counts
Refs #462

* inspect command now just outputs table counts
* test_inspect.py is now only tests for that CLI command
* Updated some relevant documentation
* Removed docs for /-/inspect since that is about to change
@simonw

Owner Author

commented May 11, 2019

I now need to update datasette serve ... --inspect-data=X to understand and correctly handle the new format.
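For illustration, reading the new format back comes down to pulling the per-table counts out of each database entry. The sketch below assumes the JSON shape proposed earlier in this thread; `load_cached_counts` is a hypothetical name and does not describe how datasette serve actually consumes the file.

```python
import json


def load_cached_counts(inspect_data_path):
    """Return {database_name: {table_name: row_count}} from an inspect file.

    Hypothetical helper; assumes each entry has a "tables" mapping whose
    values are dicts containing a "count" key.
    """
    with open(inspect_data_path) as fp:
        data = json.load(fp)
    return {
        db_name: {
            table: info["count"]
            for table, info in db.get("tables", {}).items()
        }
        for db_name, db in data.items()
    }
```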

simonw added a commit that referenced this issue May 16, 2019

Removed .inspect() and /-/inspect.json
Refs #462

/-/inspect.json may return in some shape in #465

simonw closed this in 21b57cd May 16, 2019
