Skip to content

sqlite3 one-liner(ish) for benchmarking #70

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
simonw opened this issue Jun 20, 2022 · 8 comments
Closed

sqlite3 one-liner(ish) for benchmarking #70

simonw opened this issue Jun 20, 2022 · 8 comments

Comments

@simonw
Copy link

simonw commented Jun 20, 2022

Follows #69 - the README at https://github.com/multiprocessio/dsq/tree/b7af2679038f8b09802c88f2f088bcc78fa8872c#benchmark says sqlite3 can't do one-liners, but it sort-of-can - this works for running the taxi query:

sqlite3 :memory: <<EOS
.mode csv
.import taxi.csv taxi
SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count
EOS
@simonw
Copy link
Author

simonw commented Jun 20, 2022

On my MacBook Pro (2.6Ghz 32MB 2019) it takes 12.4s using time, I've not tried with a proper benchmark.

@eatonphil
Copy link
Member

Yeah I just am not considering that a reasonable one-liner. Open to other wording but the popularity of all these tools kind of demonstrates how that is not what everyone is settling for.

@simonw
Copy link
Author

simonw commented Jun 20, 2022

This works too:

sqlite3 :memory: -cmd '.mode csv' -cmd '.import taxi.csv taxi' 'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count'

@eatonphil
Copy link
Member

That's a good one! And wow it's 4s on my machine. Adding to the list.

@eatonphil
Copy link
Member

Closed in 8f68cde

@simonw
Copy link
Author

simonw commented Jun 20, 2022

Turned this into a TIL: https://til.simonwillison.net/sqlite/one-line-csv-operations - I just found out you can output in different formats too, e.g.:

sqlite3 :memory: -cmd '.mode csv' -cmd '.import taxi.csv taxi' -cmd '.mode markdown' \
    'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count'

Which outputs markdown you can embed in an issue like this:

passenger_count COUNT(*) AVG(total_amount)
128020 32.2371511482553
0 42228 17.0214016766151
1 1533197 17.6418833067999
2 286461 18.0975870711456
3 72852 17.9153958710923
4 25510 18.452774990196
5 50291 17.2709248175672
6 32623 17.6002964166367
7 2 87.17
8 2 95.705
9 1 113.6

@jakiestfu
Copy link

Really slick, @simonw

@ryenus
Copy link

ryenus commented Jun 22, 2022

Meanwhile I have this cq bash function for interactive use:

# Query a csv file with sqlite
# Usage: cq <csv_file> [<table>]
cq() {
  local csv_file="${1:?Usage: cq <csv_file> [<table>]}" table="${2:-T}"
  sqlite3 -interactive -cmd '.headers on' -cmd '.prompt "\nsqlite> "' \
    -cmd '.mode csv' -cmd ".import '$csv_file' '$table'" -cmd ".mode column"
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants