Execute SQL on any json data.
echo '[{"id": "1", "content": {"price": 2.3}}, {"id": "2", "content": {"price": 2.3}}]' | \
jsql query 'SELECT id, data.content.price FROM main'
┌────┬───────┐
│ id │ price │
├────┼───────┤
│ 1 │ 2.3 │
├────┼───────┤
│ 2 │ 2.3 │
└────┴───────┘
JSql is powered by Apache Calcite and enables a lot of SQL functionalities including window functions. To know about the permitted SQL syntax and the available functions, visit the Apache Calcite SQL Reference page.
Requires a JDK 11 or higher.
brew tap wlezzar/jsql https://github.com/wlezzar/jsql
brew install wlezzar/jsql/jsql
# use jsql
jsql --help
docker run -i --rm wlezzar/jsql:latest --help
To have a native like CLI experience, you can create an alias in your ~/.bashrc
or ~/.zshrc
:
alias jsql='docker run -i --rm wlezzar/jsql:latest'
jsql --help
Requires a JDK 11 or higher.
# Get the latest release version
JSQL_VERSION=$(curl -s https://api.github.com/repos/wlezzar/jsql/releases/latest | jq -r .tag_name)
# Download the tar (or zip) package and unarchive it somewhere in your host (ex. /opt)
wget -O /tmp/jsql.zip https://github.com/wlezzar/jsql/releases/download/${JSQL_VERSION}/jsql.zip
unzip /tmp/jsql.zip -d /opt
# Add jsql in your path (add the following line to your ~/.bashrc or ~/.zshrc to make it permanent)
export PATH="${PATH}:/opt/jsql/bin"
# Check that this is working
jsql --help
jsql
can SQL query any json data piped into it. It handles nested data and both streaming and non streaming input.
Input data is available in jsql
under the main
table name. Thus, you can use: SELECT * FROM main
to access it.
To inspect how jsql is parsing the input data, use the describe
command to print the inferred schema of the data:
echo '[{"id": 1}, {"id": 2}]' | jsql describe
┌────────────────────────┐
│ fields │
├────────────────────────┤
│ Record │
│ - _ROW_UUID : VARCHAR │
│ - id : DOUBLE │
│ │
└────────────────────────┘
_ROW_UUID
is a generated field.
Then, to query the data:
echo '[{"id": 1}, {"id": 2}]' | jsql query 'SELECT id FROM main'
┌─────┐
│ id │
├─────┤
│ 1.0 │
├─────┤
│ 2.0 │
└─────┘
If you expect streaming data (one json per line), use the --streaming
option:
curl -s https://stream.wikimedia.org/v2/stream/recentchange \
| grep data \
| grep '/mediawiki/recentchange/1.0.0' \
| sed 's/^data: //g' \
| jsql --streaming --take 5 query "SELECT id, title, server_name FROM main WHERE server_name = 'www.wikidata.org'"
┌───────────────┬──────────────────────────────────────────────────┬───────────────────────┐
│ id │ title │ server_name │
├───────────────┼──────────────────────────────────────────────────┼───────────────────────┤
│ 4.1684978E8 │ Catégorie:Page dont la protection est à vérifier │ fr.wikipedia.org │
├───────────────┼──────────────────────────────────────────────────┼───────────────────────┤
│ 1.639128E7 │ Категория:Страницы с графиками │ ru.wikinews.org │
├───────────────┼──────────────────────────────────────────────────┼───────────────────────┤
│ 1.41757828E9 │ Category:Lassen Peak │ commons.wikimedia.org │
├───────────────┼──────────────────────────────────────────────────┼───────────────────────┤
│ 1.281655368E9 │ Gustave (crocodile) │ en.wikipedia.org │
├───────────────┼──────────────────────────────────────────────────┼───────────────────────┤
│ 4.16849781E8 │ Element (logiciel) │ fr.wikipedia.org │
└───────────────┴──────────────────────────────────────────────────┴───────────────────────┘
Contributors of the most recent commits in the kubernetes repository:
http https://api.github.com/repos/kubernetes/kubernetes/commits \
| jsql query 'SELECT main.author.login, COUNT(*) as total FROM main GROUP BY main.author.login'
┌───────────────┬───────┐
│ login │ total │
├───────────────┼───────┤
│ hh │ 1.0 │
├───────────────┼───────┤
│ MikeSpreitzer │ 1.0 │
├───────────────┼───────┤
│ giuseppe │ 1.0 │
├───────────────┼───────┤
│ k8s-ci-robot │ 25.0 │
├───────────────┼───────┤
│ rajansandeep │ 1.0 │
└───────────────┴───────┘
Use zoe to fetch the lags for a kafka consumer group and jsql
to compute the total lag per topic:
zoe --silent -o json groups offsets my_group \
| jsql query 'SELECT topic, SUM("lag") as total_lag FROM main GROUP BY topic'
┌─────────┬───────────┐
│ topic │ total_lag │
├─────────┼───────────┤
│ topic_1 │ 470.0 │
│ topic_2 │ 470.0 │
└─────────┴───────────┘
Apache Calcite parser is a bit peaky. When accessing nested data using the .
operator, you need to give the full path to your field including the table name. So:
- Instead of:
SELECT author.login FROM main
- Use:
SELECT main.author.login FROM main
This is an Apache Calcite parse error. First check you didn't do any syntax error on your query.
One common issue that causes this problem is when you have a field that has the same name as a SQL keyword. In this case, you need to quote your field name with "
. So:
- Instead of:
SELECT main.commit.author FROM main
- Use:
SELECT main."commit".author FROM main
Because commit
is a Calcite SQL keyword, and clashes with the field named commit
.
If you encounter a bug, or you have any question, do not hesitate to open an issue in the repository.