A CLI for running SQLs over various data sources.

License

Apache-2.0 (LICENSE_Apache2.txt)
MIT (LICENSE_MIT.txt)
zhxiaogg/dfq

DFQ - DataFusion Query

A CLI tool for running SQL queries over various data sources, using the Apache Arrow DataFusion SQL query engine.

Usage

$ dfq --help
A CLI for running SQLs over various data sources.

Usage: dfq [OPTIONS] [DATA_AND_SQL]...

Arguments:
  [DATA_AND_SQL]...  data sources and SQL, e.g. `sample.csv "select * from t0"`

Options:
  -d, --dialect <DIALECT>  
  -o, --output <OUTPUT>    [default: terminal] [possible values: json, csv, terminal]
  -h, --help               Print help
$ dfq samples/users.csv samples/orders.csv "select count(*) as num_orders, t0.name from t0 join t1 on t0.id = t1.user group by t0.name order by num_orders"
+------------+--------+
| num_orders | name   |
+------------+--------+
| 1          | Henry  |
| 2          | Taylor |
+------------+--------+
$ dfq samples/orders.csv "describe t0"
+-------------+-------------------------+-------------+
| column_name | data_type               | is_nullable |
+-------------+-------------------------+-------------+
| id          | Int64                   | YES         |
| user        | Int64                   | YES         |
| ts          | Timestamp(Second, None) | YES         |
| status      | Utf8                    | YES         |
+-------------+-------------------------+-------------+

Status

Supported Data Sources

  1. Local line-delimited JSON file, ending with .json or .json.gz
  2. (TODO) Local JSON array file
  3. Local CSV file, ending with .csv or .csv.gz
  4. Parquet file, ending with .parquet or .prq

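As a rough sketch of how the file extensions above come into play, the snippet below runs `describe` against a gzipped line-delimited JSON file and a Parquet file. The file names `events.json.gz` and `users.parquet` are hypothetical placeholders, and the table name `t0` is assumed from the usage example, where the first data source is exposed as `t0`.

```shell
# Hypothetical input files; substitute your own paths.
json_file=events.json.gz      # line-delimited JSON, gzip-compressed
parquet_file=users.parquet    # Parquet

# Only run when dfq is installed and both placeholder files exist.
if command -v dfq >/dev/null 2>&1 && [ -f "$json_file" ] && [ -f "$parquet_file" ]; then
  # Assumption: a single data source is registered as table t0,
  # as in the usage example above.
  dfq "$json_file" "describe t0"
  dfq "$parquet_file" "describe t0"
fi
```

Inspecting the inferred schema first is a cheap way to confirm that a file was picked up in the expected format before writing longer queries.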
Supported Output Formats

  1. Printed table format (default)
  2. JSON array format
  3. Line-delimited JSON format
  4. CSV
  5. Parquet

All output currently goes to stdout; pipe or redirect it to a file if you need to save it.
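Since results only go to stdout, saving them is a matter of ordinary shell redirection. A minimal sketch follows: `dfq_to_file` is a hypothetical helper (not part of dfq) and `orders_copy.csv` is an arbitrary output name. The `-o csv` option and `samples/orders.csv` come from the usage section above.

```shell
# Hypothetical helper (not part of dfq): run dfq with the given
# arguments and redirect its stdout to the file named first.
dfq_to_file() {
  out_file=$1
  shift
  dfq "$@" > "$out_file"
}

# Example call, only attempted when dfq and the sample file are available.
if command -v dfq >/dev/null 2>&1 && [ -f samples/orders.csv ]; then
  dfq_to_file orders_copy.csv -o csv samples/orders.csv "select * from t0"
fi
```

The same pattern should work with any output value listed by `dfq --help`; a plain `dfq ... > file` on the command line is equivalent.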
