Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mechanism to default a table to sorting by multiple columns #1653

Open
simonw opened this issue Mar 7, 2022 Discussed in #1652 · 2 comments
Open

Mechanism to default a table to sorting by multiple columns #1653

simonw opened this issue Mar 7, 2022 Discussed in #1652 · 2 comments

Comments

@simonw
Copy link
Owner

simonw commented Mar 7, 2022

Discussed in #1652

Originally posted by zaneselvans March 7, 2022
It's easy to tell datasette to sort tables using a single column, as described in the docs:

databases:
  ferc1:
    tables:
      f1_edcfu_epda:
        sort: created_time

But is there some way to tell it to sort using a composite key, like you would in an ORDER BY clause instead? For example, the way it's being done in this query on our Datasette?

SELECT
  respondent_id,
  report_year,
  spplmnt_num,
  row_number,
  row_seq,
  row_prvlg,
  acct_num,
  depr_plnt_base,
  est_avg_srvce_lf,
  net_salvage,
  apply_depr_rate,
  mrtlty_crv_typ,
  avg_remaining_lf,
  report_prd
FROM
  f1_edcfu_epda
WHERE
  respondent_id = 210
  AND report_year = 2020
ORDER BY
  report_year, report_prd, respondent_id, spplmnt_num, row_number
LIMIT
  1000

The problem here is that by default it's using rowid (the SQLite assigned autoincrementing integer key) to order the records, but the table should have a natural composite primary key, but the original database that this data is being migrated from doesn't enforce unique primary keys, so there are dupes, and we don't want to drop those rows, and the records are somehow getting jumbled in the database (the rowid ordering isn't lined up with the expected ordering based on the composite primary key, though it's close) and this jumbling is confusing to users that expect to see the data ordered based on the natural primary key.

I've tried setting the sort metadata parameter to a list of column names, a tuple of column names, a quoted string of comma-separated column names, a quoted string of a tuple of column names...

databases:
  ferc1:
    tables:
      f1_edcfu_epda:
        sort: "(report_year, report_prd, respondent_id, spplmnt_num, row_number)"

and they all give me server errors like:

Cannot sort table by (report_year, report_prd, respondent_id, spplmnt_num, row_number)
```</div>
@simonw
Copy link
Owner Author

simonw commented Mar 7, 2022

This is currently blocked on the fact that Datasette doesn't have a mechanism for sorting by more than one column:

@simonw
Copy link
Owner Author

simonw commented Mar 7, 2022

There may be a short-term fix for this: table view could start accepting a ?_sort_sql=SQLfragment parameter, similar to the ?_where= parameter described here: https://docs.datasette.io/en/stable/json_api.html#special-table-arguments

That fragment could then be pre-populated in metadata. Makes me think maybe that ?_where= should be optionally settable in metadata too?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant