A lightweight query engine that treats a folder of CSV/JSON files as a relational database. Write queries in a compact GraphQL-like syntax and get back nested JSON — no SQL, no database server.
# Core engine only
pip install .
# With web UI (Gradio)
pip install ".[ui]"
# Development (editable)
pip install -e .After installation the nestql command is available globally:
nestql --help
nestql query.nestql
nestql query.nestql --data-folder ./my-data
nestql query.nestql --no-ast -vThe nestql-vscode/ folder contains a VS Code extension that provides:
- Syntax highlighting for
.nestqlfiles - File icons for
.nestqlfiles in the Explorer - Live linting (structural brace checks + Python-backed deep parsing)
Option A — install from folder (recommended for local use)
-
Open VS Code.
-
Open the Command Palette (
Ctrl+Shift+P/Cmd+Shift+P) and run Extensions: Install from VSIX… -
If you have already packaged the extension, select the
.vsixfile and click Install.Or, to install directly from the source folder without packaging:
- Open the Command Palette and run Developer: Install Extension from Location…
- Select the
nestql-vscode/folder.
Option B — package and install
# Install the VS Code Extension CLI (once)
npm install -g @vscode/vsce
# From inside the extension folder
cd nestql-vscode
vsce package # produces nestql-1.0.0.vsix
code --install-extension nestql-1.0.0.vsix| Setting | Default | Description |
|---|---|---|
nestql.pythonPath |
"python" |
Python executable used for deep linting. Set to a venv path if needed (e.g. "/home/user/.venv/bin/python"). |
The extension automatically lints on open and on every keystroke (debounced 500 ms). Structural errors (unmatched braces, bad alias syntax) are shown immediately; Python-backed errors appear within a second.
from nestql import FolderAdapter
results = FolderAdapter(folder_path="./data").execute("""
<directors> {
name
country : {"India", "USA"}
}
""")
# results → {"directors": [{"name": "Rajamouli", "country": "India"}, …]}$global { # optional macro block
var : value
}
$tables { # optional temporary intermediate tables (excluded from output)
alias := <source> { fields … }
}
<table_name> { # select a CSV/JSON file
field # include field
field : filter # filter field (equality, range, IN list, LIKE)
field := alias # rename field in output
~field # internal field (used in logic, hidden from output)
!<sub_table> { } # strict sub-table (hide parent row if sub is empty)
<sub_table> *{ } # single pluck — return list of value arrays
<sub_table> **{ } # double pluck — flatten all values into one array
func(<table> { }) # sub-table aggregate: count(), sum(), avg(), min(), max()
count() := alias # in-row aggregate (use with @group)
avg(field) := alias # in-row field aggregate (use with @group)
(expr) := alias # computed column — arithmetic on row fields
@sort : f, -f2 # ORDER BY (prefix - = descending)
@limit : N # LIMIT
@offset: N # OFFSET
@distinct # DISTINCT
@group : field # GROUP BY
@having: f op val # HAVING (requires @group)
} := result_alias
| Syntax | Meaning |
|---|---|
<name> |
Select table / file |
field |
Include field in output |
field : value |
Equality or range filter |
field : {"a","b"} |
IN filter (list) |
field : [min, max] |
Inclusive range filter |
field : (min, max) |
Exclusive range filter |
field := alias |
Rename field / table in output |
~field |
Internal — used in filters, excluded from output |
!<table> |
Strict — hide parent row when sub-table is empty |
*{ } |
Single pluck — return [[v1,v2],[v3,v4],…] |
**{ } |
Double pluck — flatten to [v1,v2,v3,v4,…] |
field = parent_var |
Join on parent row's context variable |
$var |
Reference a global macro |
count(<table>{}) |
Count rows in sub-table |
sum(<table>{field}) |
Sum a numeric field across sub-table rows |
avg(<table>{field}) |
Average a numeric field across sub-table rows |
min(<table>{field}) |
Minimum value across sub-table rows |
max(<table>{field}) |
Maximum value across sub-table rows |
field : "prefix%" |
LIKE / wildcard filter — % or * match any substring |
(expr) := alias |
Computed column — arithmetic on row fields |
Place these inside a { } block to control result shape.
| Directive | Meaning |
|---|---|
@sort : field, -field2 |
ORDER BY — prefix - for descending |
@limit : N |
Return at most N rows |
@offset : N |
Skip the first N rows |
@distinct |
Remove duplicate rows |
@group : field1, field2 |
GROUP BY — collapses rows into groups |
@having : field op value |
Filter on grouped results (requires @group) |
| Syntax | Meaning |
|---|---|
count() := alias |
Count rows in each group |
sum(field) := alias |
Sum a field across each group |
avg(field) := alias |
Average a field across each group |
min(field) := alias |
Minimum field value in each group |
max(field) := alias |
Maximum field value in each group |
Macros support arithmetic, bitwise ops, ranges, lists, f-strings, destructuring, and print() for debugging.
$global {
min_b : 50
range : [$min_b, $min_b * 2] # range → {__ge__: 50, __le__: 100}
mask : 1 << 3 # bitwise → 8
flags : {$mask, 16} # list → [8, 16]
label : f"Budget ≥ {$min_b}" # f-string
{a, *rest, z} : {1,2,3,4,5} # destructuring
print($range) # debug print during parsing
}
Supported expression operators: + - * / // % ** << >> & | ^ ~
Pre-execute a sub-query and bind the result to a name. Temporary tables are available as join targets inside subsequent blocks but are excluded from the final output.
$tables {
big_movies := <movies> {
name
director_id
budget : [100, 999]
rating
}
}
<directors> {
~id
name
<big_movies> {
name := title
~director_id = id
rating
} := hits
}
All examples use the sample data in data/:
directors.csv
| id | name | country | experience |
|---|---|---|---|
| d1 | Rajamouli | India | 25 |
| d2 | Hari | India | 15 |
| d3 | Shankar | India | 30 |
| d4 | James Cameron | USA | 40 |
movies.json
| id | name | director_id | budget | rating | genre |
|---|---|---|---|---|---|
| 1 | Bahubali | d1 | 200 | 8.1 | Action |
| 2 | Singam | d2 | 50 | 6.5 | Action |
| 3 | Robo | d3 | 150 | 7.1 | Sci-Fi |
| 4 | S3 | d2 | 5 | 6.2 | Action |
| 5 | Avatar | d4 | 237 | 7.9 | Sci-Fi |
Sample Queries/basic/basic.nestql
$global {
min_budget : 50
low_movie_budget : [$min_budget, 100]
high_movie_budget: [101, 500]
target_country : {"India", "Sri Lanka"}
}
<directors> {
id := dir_var
country : $target_country
name
<movies> *{
name
~budget : $low_movie_budget
~director_id = dir_var := directors_id
} := low_budget_movies
<movies> *{
name
budget : $high_movie_budget
~director_id = dir_var := directors_id
} := high_budget_movies
} := directors_india
<directors> {
id := dir_var
country : "USA"
name
<movies> {
name
~budget : $low_movie_budget
~director_id = dir_var := directors_id
} := low_budget_movies
<movies> {
name
budget : $high_movie_budget
~director_id = dir_var := directors_id
} := high_budget_movies
} := directors_usa
Output
{
"directors_india": [
{"dir_var": "d1", "country": "India", "name": "Rajamouli",
"low_budget_movies": [], "high_budget_movies": [{"name": "Bahubali", "budget": "200"}]},
{"dir_var": "d2", "country": "India", "name": "Hari",
"low_budget_movies": [["Singam"]], "high_budget_movies": []},
{"dir_var": "d3", "country": "India", "name": "Shankar",
"low_budget_movies": [], "high_budget_movies": [{"name": "Robo", "budget": "150"}]}
],
"directors_usa": [
{"dir_var": "d4", "country": "USA", "name": "James Cameron",
"low_budget_movies": [], "high_budget_movies": [{"name": "Avatar", "budget": "237"}]}
]
}Sample Queries/arthimetic/arthimetic.nestql
$global {
min_budget : 50
print("min_budget :", $min_budget)
double_min : $min_budget * 2
low_movie_budget : [$min_budget, $min_budget * 2]
high_movie_budget : [$double_min + 1, 1 << 9]
mask : 1 << 3
print($mask)
flags : {$mask, 8}
target_country : {"India", "USA"}
print($target_country)
}
<directors> {
id := dir_var,
country : $target_country,
name,
<movies> {
name,
~budget : $low_movie_budget,
~director_id = dir_var := directors_id
} := low_budget_movies,
<movies> {
name,
budget : $high_movie_budget,
~director_id = dir_var := directors_id
} := high_budget_movies
} := directors_math_1
Console output (from print() calls during parsing):
min_budget : 50
8
{'India', 'USA'}
Sample Queries/count/count.nestql
<directors> {
id := dir_var
country,
name,
count(<movies> {
name,
budget,
~director_id = dir_var := directors_id
}) := alias_count,
count(<movies> {
name,
budget,
~director_id = dir_var := directors_id
})
}
Output
{
"directors": [
{"dir_var": "d1", "country": "India", "name": "Rajamouli", "alias_count": 1, "count": 1},
{"dir_var": "d2", "country": "India", "name": "Hari", "alias_count": 2, "count": 2},
{"dir_var": "d3", "country": "India", "name": "Shankar", "alias_count": 1, "count": 1},
{"dir_var": "d4", "country": "USA", "name": "James Cameron", "alias_count": 1, "count": 1}
]
}Sample Queries/sum/sum.nestql
<directors> {
id := dir_var,
country,
name,
sum(<movies> {
budget,
~director_id = dir_var := directors_id
}) := budget_consumed
}
Output
{
"directors": [
{"dir_var": "d1", "country": "India", "name": "Rajamouli", "budget_consumed": 200},
{"dir_var": "d2", "country": "India", "name": "Hari", "budget_consumed": 55},
{"dir_var": "d3", "country": "India", "name": "Shankar", "budget_consumed": 150},
{"dir_var": "d4", "country": "USA", "name": "James Cameron", "budget_consumed": 237}
]
}<movies> {
name
budget
rating
(budget * 1.1) := adjusted_budget
(rating * 10) := score
@sort : -adjusted_budget
@limit : 3
}
Output
{
"movies": [
{"name": "Avatar", "budget": "237", "rating": "7.9", "adjusted_budget": 260.7, "score": 79},
{"name": "Bahubali", "budget": "200", "rating": "8.1", "adjusted_budget": 220.0, "score": 81},
{"name": "Robo", "budget": "150", "rating": "7.1", "adjusted_budget": 165, "score": 71}
]
}Use % or * in a string filter for SQL-style pattern matching:
<movies> {
name
genre : "Sci%"
}
Output
{
"movies": [
{"name": "Robo"},
{"name": "Avatar"}
]
}<movies> {
name
budget
rating
@sort : -budget, -rating
@limit : 3
@offset : 1
}
Output
{
"movies": [
{"name": "Bahubali", "budget": "200", "rating": "8.1"},
{"name": "Robo", "budget": "150", "rating": "7.1"},
{"name": "Singam", "budget": "50", "rating": "6.5"}
]
}<directors> {
country
@distinct
}
Output
{
"directors": [
{"country": "India"},
{"country": "USA"}
]
}<movies> {
genre
count() := num_movies
avg(budget) := avg_budget
max(rating) := top_rating
@group : genre
@having : num_movies >= 2
@sort : -avg_budget
}
Output
{
"movies": [
{"genre": "Sci-Fi", "num_movies": 2, "avg_budget": 193.5, "top_rating": 7.9},
{"genre": "Action", "num_movies": 3, "avg_budget": 85.0, "top_rating": 8.1}
]
}<directors> {
~id
name
avg(<movies> { budget, ~director_id = id }) := avg_budget
min(<movies> { budget, ~director_id = id }) := min_budget
max(<movies> { budget, ~director_id = id }) := max_budget
@sort : -max_budget
}
Output
{
"directors": [
{"name": "James Cameron", "avg_budget": 237.0, "min_budget": 237.0, "max_budget": 237.0},
{"name": "Rajamouli", "avg_budget": 200.0, "min_budget": 200.0, "max_budget": 200.0},
{"name": "Shankar", "avg_budget": 150.0, "min_budget": 150.0, "max_budget": 150.0},
{"name": "Hari", "avg_budget": 27.5, "min_budget": 5.0, "max_budget": 50.0}
]
}Sample Queries/pluck/pluck.nestql — * returns a 2-D list; ** flattens to 1-D.
<directors> {
id := dir_var
country,
name,
<movies> *{
name,
budget,
~director_id = dir_var := directors_id
} := movies_single_pluck
<movies> **{
name,
budget,
~director_id = dir_var := directors_id
} := movies_double_pluck
}
Output (single pluck gives `[[name, budget], …]`; double pluck flattens everything)
{
"directors": [
{
"dir_var": "d1", "country": "India", "name": "Rajamouli",
"movies_single_pluck": [["Bahubali", "200"]],
"movies_double_pluck": ["Bahubali", "200"]
},
{
"dir_var": "d2", "country": "India", "name": "Hari",
"movies_single_pluck": [["Singam", "50"], ["S3", "5"]],
"movies_double_pluck": ["Singam", "50", "S3", "5"]
}
]
}Sample Queries/string/string.nestql
$global {
s_hello : "Hello"
s_world : "World"
s_combined : $s_hello + ", " + $s_world + "!"
laugh : "ha" * 4
names : {"Rajamouli", "Hari", "Shankar", "James Cameron"}
pi : 3.14159
greeting_fmt : f"{ $s_hello } { $s_world } - PI={ $pi:.2f }"
print($s_combined)
print($laugh)
print($names)
print($greeting_fmt)
}
<directors> {
id := dir_var,
name,
country,
name : $names
} := directors_named
Console output:
Hello, World!
hahahaha
{'Rajamouli', 'Hari', 'Shankar', 'James Cameron'}
Hello World - PI=3.14
Sample Queries/list/list.nestql
$global {
list_1 : {1, 2, 3}
list_2 : {"a", 'b', 4.5}
list_3 : {*list_1, *list_2} # python-style splat
list_4 : list_1 + list_2
{list_a, *list_b, list_c} : list_4 # destructuring with rest
list_5 : list_1 * 2
list_6 : {...list_1, ...list_2} # js-style spread
print("List 1 :", $list_1)
print("List 2 :", $list_2)
print("List 3 :", $list_3)
print("List 4 :", $list_4)
print("List 5 :", $list_5)
print("List 6 :", $list_6)
print("List a :", $list_a)
print("List b :", $list_b)
print("List c :", $list_c)
}
Console output:
List 1 : [1, 2, 3]
List 2 : ['a', 'b', 4.5]
List 3 : [1, 2, 3, 'a', 'b', 4.5]
List 4 : [1, 2, 3, 'a', 'b', 4.5]
List 5 : [1, 2, 3, 1, 2, 3]
List 6 : [1, 2, 3, 'a', 'b', 4.5]
List a : 1
List b : [2, 3, 'a', 'b']
List c : 4.5
from nestql import FolderAdapter, GQLParser, GQLConfig
# ── Simple: pass the query string directly ────────────────────────────────
adapter = FolderAdapter(folder_path="./data")
results = adapter.execute(open("query.nestql").read())
# returns {"alias": [rows…], …} — $tables blocks handled automatically
# ── Advanced: parse first, then execute individual tables ─────────────────
config = GQLConfig(data_folder="./data", max_rows_per_table=50_000)
adapter = FolderAdapter(folder_path="./data", config=config)
ast = GQLParser().parse(open("query.nestql").read())
tables_ast = ast.pop("__tables__", {})
adapter.load_temp_tables(tables_ast)
results = {}
for table_name, node in ast.items():
table_source = node["__meta__"]["table_source"]
results[node["__meta__"].get("alias", table_name)] = adapter.execute(table_source, node)| Variable | Default | Description |
|---|---|---|
NESTQL_DATA_FOLDER |
./data |
Data directory |
NESTQL_MAX_DEPTH |
10 |
Max query nesting depth |
NESTQL_MAX_ROWS |
100000 |
Max rows per table load |
NESTQL_LOG_LEVEL |
WARNING |
Log level |
NESTQL_VERBOSE |
(unset) | Set to 1 for DEBUG output |
GQLError
├── GQLParseError # syntax or macro errors
├── GQLAdapterError
│ └── GQLFileNotFoundError # missing data file
└── GQLQueryComplexityError # nesting limit exceeded
python app.py
# open http://localhost:7865Features: data-file browser, schema viewer, query editor, Results / AST / Console output views.
from nestql.adapters.base import BaseAdapter
from typing import Any
class MyAdapter(BaseAdapter):
def execute(
self,
table_key: str,
node: dict[str, Any],
context: dict[str, Any] | None = None,
) -> list[Any]:
meta = node.get("__meta__", {})
table = meta.get("table_source", table_key)
# load data, apply filters from node, recurse for sub-tables …
return []nestql/ # installable Python package
├── __init__.py # public API re-exports
├── parser.py # GQLParser: tokenizer + AST builder + macro resolver
├── exceptions.py # GQLError hierarchy
├── config.py # GQLConfig (env-var aware)
├── utils.py # ql_to_json() helper
├── cli.py # `nestql` CLI entry point
└── adapters/
├── base.py # BaseAdapter ABC
└── folder.py # FolderAdapter (CSV/JSON)
nestql-vscode/ # VS Code extension
├── extension.js # activation + linting logic
├── package.json # extension manifest
├── syntaxes/
│ └── nestql.tmLanguage.json # TextMate grammar
├── icons/
│ └── nestql-file.svg # file icon
└── nestql-icon-theme.json # icon theme mapping
app.py # Gradio web UI
pyproject.toml # pip install config
data/ # sample data files
Sample Queries/ # example queries — each in its own subfolder
├── basic/
│ ├── basic.nestql # query file
│ └── basic.json # expected output
├── group_by/
│ ├── group_by.nestql
│ └── group_by.json
└── … # arthimetic, computed, count, distinct, sort_limit, wildcard, temp_tables, …
MIT
