# BIQL Tutorial Guide

Welcome to the BIQL (BIDS Query Language) tutorial! This guide will walk you through
using BIQL to query BIDS neuroimaging datasets. We'll start with basic queries and
progressively explore more advanced features.

## What is BIQL?

BIQL is a SQL-like query language designed specifically for querying Brain Imaging 
Data Structure (BIDS) datasets. It allows you to:

- Search for specific files based on BIDS entities (subject, session, task, etc.)
- Filter data using metadata from JSON sidecars
- Access participant information from participants.tsv
- Perform aggregations and grouping operations
- Export results in various formats

## Prerequisites

First, let's set up our environment and get the example data:

In [1]:
import tempfile
from pathlib import Path
from biql import create_query_engine

# Set up paths - use a temporary directory that works in different environments  
bids_examples_dir = Path(tempfile.gettempdir()) / "bids-examples"

# Clone bids-examples if it doesn't exist
if not bids_examples_dir.exists():
    !git clone https://github.com/bids-standard/bids-examples.git {bids_examples_dir}
else:
    print(f"✅ bids-examples already exists at {bids_examples_dir}")

Cloning into '/tmp/bids-examples'...




## Part 1: Basic Queries

Let's start with the synthetic dataset from bids-examples. This is a simple dataset
that's perfect for learning BIQL basics.

In [2]:
dataset_path = bids_examples_dir / "synthetic"
q = create_query_engine(dataset_path)

q.dataset_stats()

{'total_files': 60,
 'total_subjects': 5,
 'files_by_datatype': {'anat': 10, 'func': 30, 'beh': 5},
 'subjects': ['01', '02', '03', '04', '05'],
 'datatypes': ['anat', 'beh', 'func']}

### Simple Entity Queries

The most basic BIQL queries filter files by BIDS entities. You can query by any
BIDS entity that appears in your filenames:

In [3]:
q.run_query("sub=01", format="dataframe").head(5)

Unnamed: 0,filepath,relative_path,filename,sub,ses,suffix,datatype,extension,metadata,participants,task,run
0,/tmp/bids-examples/synthetic/sub-01/ses-02/ana...,sub-01/ses-02/anat/sub-01_ses-02_T1w.nii,sub-01_ses-02_T1w.nii,1,2,T1w,anat,.nii,{},age=34; sex=F,,
1,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-nback_ru...,sub-01_ses-02_task-nback_run-02_bold.nii,1,2,bold,func,.nii,{},age=34; sex=F,nback,2.0
2,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-nback_ru...,sub-01_ses-02_task-nback_run-01_bold.nii,1,2,bold,func,.nii,{},age=34; sex=F,nback,1.0
3,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-rest_bol...,sub-01_ses-02_task-rest_bold.nii,1,2,bold,func,.nii,{},age=34; sex=F,rest,
4,/tmp/bids-examples/synthetic/sub-01/ses-01/ana...,sub-01/ses-01/anat/sub-01_ses-01_T1w.nii,sub-01_ses-01_T1w.nii,1,1,T1w,anat,.nii,{},age=34; sex=F,,


In [4]:
results = q.run_query("datatype=func")
len(results)  # Number of functional files

30

In [5]:
q.run_query("SELECT DISTINCT task WHERE datatype=func", format="dataframe")

Unnamed: 0,task
0,nback
1,rest
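
The entity names used in these queries (`sub`, `ses`, `task`, `run`) correspond to the `key-value` pairs in BIDS filenames. As a rough illustration of that mapping, here is a small sketch that splits a filename into entities, suffix, and extension. The helper `parse_entities` is hypothetical and is not part of BIQL; BIQL's own parser may handle more cases.

```python
from pathlib import PurePosixPath


def parse_entities(filename: str) -> dict:
    """Split a BIDS-style filename into entities, suffix, and extension.

    Illustrative only: this hypothetical helper is not part of BIQL.
    """
    name = PurePosixPath(filename).name
    stem, dot, ext = name.partition(".")   # split at the first dot
    parts = stem.split("_")
    entities = {}
    for part in parts[:-1]:                # every chunk except the trailing suffix
        key, sep, value = part.partition("-")
        if sep:                            # keep only well-formed key-value chunks
            entities[key] = value
    entities["suffix"] = parts[-1]
    entities["extension"] = dot + ext
    return entities


info = parse_entities("sub-01_ses-02_task-nback_run-02_bold.nii")
print(info)
```

A query such as `sub=01` can then be read as "keep the files whose `sub` entity equals `01`".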


### Combining Conditions

You can combine multiple conditions using AND, OR, and NOT operators:

In [6]:
q.run_query("datatype=anat AND suffix=T1w", format="dataframe").head(5)

Unnamed: 0,filepath,relative_path,filename,sub,ses,suffix,datatype,extension,metadata,participants
0,/tmp/bids-examples/synthetic/sub-01/ses-02/ana...,sub-01/ses-02/anat/sub-01_ses-02_T1w.nii,sub-01_ses-02_T1w.nii,1,2,T1w,anat,.nii,{},age=34; sex=F
1,/tmp/bids-examples/synthetic/sub-01/ses-01/ana...,sub-01/ses-01/anat/sub-01_ses-01_T1w.nii,sub-01_ses-01_T1w.nii,1,1,T1w,anat,.nii,{},age=34; sex=F
2,/tmp/bids-examples/synthetic/sub-04/ses-02/ana...,sub-04/ses-02/anat/sub-04_ses-02_T1w.nii,sub-04_ses-02_T1w.nii,4,2,T1w,anat,.nii,{},age=21; sex=F
3,/tmp/bids-examples/synthetic/sub-04/ses-01/ana...,sub-04/ses-01/anat/sub-04_ses-01_T1w.nii,sub-04_ses-01_T1w.nii,4,1,T1w,anat,.nii,{},age=21; sex=F
4,/tmp/bids-examples/synthetic/sub-05/ses-02/ana...,sub-05/ses-02/anat/sub-05_ses-02_T1w.nii,sub-05_ses-02_T1w.nii,5,2,T1w,anat,.nii,{},age=42; sex=M


In [7]:
q.run_query("task=nback OR task=rest", format="dataframe")

Unnamed: 0,filepath,relative_path,filename,sub,ses,task,run,suffix,datatype,extension,metadata,participants
0,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-nback_ru...,sub-01_ses-02_task-nback_run-02_bold.nii,1,2,nback,2.0,bold,func,.nii,{},age=34; sex=F
1,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-nback_ru...,sub-01_ses-02_task-nback_run-01_bold.nii,1,2,nback,1.0,bold,func,.nii,{},age=34; sex=F
2,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-rest_bol...,sub-01_ses-02_task-rest_bold.nii,1,2,rest,,bold,func,.nii,{},age=34; sex=F
3,/tmp/bids-examples/synthetic/sub-01/ses-01/fun...,sub-01/ses-01/func/sub-01_ses-01_task-nback_ru...,sub-01_ses-01_task-nback_run-02_bold.nii,1,1,nback,2.0,bold,func,.nii,{},age=34; sex=F
4,/tmp/bids-examples/synthetic/sub-01/ses-01/fun...,sub-01/ses-01/func/sub-01_ses-01_task-rest_bol...,sub-01_ses-01_task-rest_bold.nii,1,1,rest,,bold,func,.nii,{},age=34; sex=F
5,/tmp/bids-examples/synthetic/sub-01/ses-01/fun...,sub-01/ses-01/func/sub-01_ses-01_task-nback_ru...,sub-01_ses-01_task-nback_run-01_bold.nii,1,1,nback,1.0,bold,func,.nii,{},age=34; sex=F
6,/tmp/bids-examples/synthetic/sub-04/ses-02/fun...,sub-04/ses-02/func/sub-04_ses-02_task-nback_ru...,sub-04_ses-02_task-nback_run-02_bold.nii,4,2,nback,2.0,bold,func,.nii,{},age=21; sex=F
7,/tmp/bids-examples/synthetic/sub-04/ses-02/fun...,sub-04/ses-02/func/sub-04_ses-02_task-nback_ru...,sub-04_ses-02_task-nback_run-01_bold.nii,4,2,nback,1.0,bold,func,.nii,{},age=21; sex=F
8,/tmp/bids-examples/synthetic/sub-04/ses-02/fun...,sub-04/ses-02/func/sub-04_ses-02_task-rest_bol...,sub-04_ses-02_task-rest_bold.nii,4,2,rest,,bold,func,.nii,{},age=21; sex=F
9,/tmp/bids-examples/synthetic/sub-04/ses-01/fun...,sub-04/ses-01/func/sub-04_ses-01_task-nback_ru...,sub-04_ses-01_task-nback_run-02_bold.nii,4,1,nback,2.0,bold,func,.nii,{},age=21; sex=F
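
To make the logic of `AND`, `OR`, and `NOT` concrete, here is a plain-Python sketch that applies the same kind of conditions to a few hand-written records. The records and variable names are illustrative assumptions; BIQL evaluates these operators itself, and this is not how it is implemented.

```python
# A few hand-written records shaped like BIQL results (illustrative data).
files = [
    {"sub": "01", "datatype": "anat", "suffix": "T1w", "task": None},
    {"sub": "01", "datatype": "func", "suffix": "bold", "task": "nback"},
    {"sub": "01", "datatype": "func", "suffix": "bold", "task": "rest"},
    {"sub": "01", "datatype": "beh", "suffix": "beh", "task": "stroop"},
]

# AND: both conditions must hold (anatomical files with a T1w suffix).
anat_t1w = [f for f in files if f["datatype"] == "anat" and f["suffix"] == "T1w"]

# OR: either condition may hold (nback files or rest files).
nback_or_rest = [f for f in files if f["task"] == "nback" or f["task"] == "rest"]

# NOT: functional files whose task is not rest.
func_not_rest = [f for f in files if f["datatype"] == "func" and not f["task"] == "rest"]

print(len(anat_t1w), len(nback_or_rest), len(func_not_rest))
```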


### Using WHERE Clause

For more SQL-like queries, you can use the WHERE clause:

In [8]:
q.run_query("WHERE sub=01 AND datatype=func", format="dataframe")

Unnamed: 0,filepath,relative_path,filename,sub,ses,task,run,suffix,datatype,extension,metadata,participants
0,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-nback_ru...,sub-01_ses-02_task-nback_run-02_bold.nii,1,2,nback,2.0,bold,func,.nii,{},age=34; sex=F
1,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-nback_ru...,sub-01_ses-02_task-nback_run-01_bold.nii,1,2,nback,1.0,bold,func,.nii,{},age=34; sex=F
2,/tmp/bids-examples/synthetic/sub-01/ses-02/fun...,sub-01/ses-02/func/sub-01_ses-02_task-rest_bol...,sub-01_ses-02_task-rest_bold.nii,1,2,rest,,bold,func,.nii,{},age=34; sex=F
3,/tmp/bids-examples/synthetic/sub-01/ses-01/fun...,sub-01/ses-01/func/sub-01_ses-01_task-nback_ru...,sub-01_ses-01_task-nback_run-02_bold.nii,1,1,nback,2.0,bold,func,.nii,{},age=34; sex=F
4,/tmp/bids-examples/synthetic/sub-01/ses-01/fun...,sub-01/ses-01/func/sub-01_ses-01_task-rest_bol...,sub-01_ses-01_task-rest_bold.nii,1,1,rest,,bold,func,.nii,{},age=34; sex=F
5,/tmp/bids-examples/synthetic/sub-01/ses-01/fun...,sub-01/ses-01/func/sub-01_ses-01_task-nback_ru...,sub-01_ses-01_task-nback_run-01_bold.nii,1,1,nback,1.0,bold,func,.nii,{},age=34; sex=F


## Part 2: SELECT Clause and Field Selection

By default, BIQL returns all available fields. Use SELECT to choose specific fields:

In [9]:
q.run_query(
    "SELECT sub, task, run, filename WHERE datatype=func",
    format="dataframe"
)

Unnamed: 0,sub,task,run,filename
0,1,nback,2.0,sub-01_ses-02_task-nback_run-02_bold.nii
1,1,nback,1.0,sub-01_ses-02_task-nback_run-01_bold.nii
2,1,rest,,sub-01_ses-02_task-rest_bold.nii
3,1,nback,2.0,sub-01_ses-01_task-nback_run-02_bold.nii
4,1,rest,,sub-01_ses-01_task-rest_bold.nii
5,1,nback,1.0,sub-01_ses-01_task-nback_run-01_bold.nii
6,4,nback,2.0,sub-04_ses-02_task-nback_run-02_bold.nii
7,4,nback,1.0,sub-04_ses-02_task-nback_run-01_bold.nii
8,4,rest,,sub-04_ses-02_task-rest_bold.nii
9,4,nback,2.0,sub-04_ses-01_task-nback_run-02_bold.nii


In [10]:
q.run_query(
    "SELECT sub, relative_path WHERE suffix=T1w",
    format="dataframe"
)

Unnamed: 0,sub,relative_path
0,1,sub-01/ses-02/anat/sub-01_ses-02_T1w.nii
1,1,sub-01/ses-01/anat/sub-01_ses-01_T1w.nii
2,4,sub-04/ses-02/anat/sub-04_ses-02_T1w.nii
3,4,sub-04/ses-01/anat/sub-04_ses-01_T1w.nii
4,5,sub-05/ses-02/anat/sub-05_ses-02_T1w.nii
5,5,sub-05/ses-01/anat/sub-05_ses-01_T1w.nii
6,2,sub-02/ses-02/anat/sub-02_ses-02_T1w.nii
7,2,sub-02/ses-01/anat/sub-02_ses-01_T1w.nii
8,3,sub-03/ses-02/anat/sub-03_ses-02_T1w.nii
9,3,sub-03/ses-01/anat/sub-03_ses-01_T1w.nii


## Part 3: Pattern Matching

BIQL supports wildcards and regular expressions for flexible matching:

In [11]:
results = q.run_query("suffix=*bold*")
len(results)  # Count of files with 'bold' in suffix

30

In [12]:
q.run_query(
    "SELECT DISTINCT task WHERE task~=\".*back.*\"",
    format="dataframe"
)

Unnamed: 0,task
0,nback
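
If the difference between the two pattern styles is unclear, the following sketch reproduces both matches with Python's standard library. It only illustrates the idea: `fnmatch` and `re` are stand-ins chosen here, and whether BIQL's `~=` operator matches the whole value or any part of it is not stated in this tutorial (for this particular pattern the result is the same either way).

```python
import fnmatch
import re

suffixes = ["T1w", "bold", "beh"]
tasks = ["nback", "rest", "stroop"]

# Wildcard (glob) matching: "*" stands for any run of characters.
wildcard_hits = [s for s in suffixes if fnmatch.fnmatchcase(s, "*bold*")]

# Regular-expression matching with the same pattern as the query above.
regex_hits = [t for t in tasks if re.fullmatch(r".*back.*", t)]

print(wildcard_hits)  # ['bold']
print(regex_hits)     # ['nback']
```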


## Part 4: Ranges and Lists

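BIQL supports range queries and the `IN` operator for matching several values at once. The first query below uses `IN` with a list of subject labels; the second groups the functional files by task and run: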

In [13]:
q.run_query(
    "SELECT sub, ARRAY_AGG(DISTINCT task) as tasks, COUNT(*) as total_files "
    "WHERE sub IN ['01', '02', '03'] "
    "GROUP BY sub",
    format="json"
)

[{'sub': '01', 'tasks': ['nback', 'rest', 'stroop'], 'total_files': 12},
 {'sub': '02', 'tasks': ['nback', 'rest', 'stroop'], 'total_files': 12},
 {'sub': '03', 'tasks': ['nback', 'rest', 'stroop'], 'total_files': 12}]

In [14]:
q.run_query(
    "SELECT task, run, COUNT(*) as file_count, "
    "COUNT(DISTINCT sub) as subjects "
    "WHERE datatype=func "
    "GROUP BY task, run "
    "ORDER BY task, run",
    format="dataframe"
)

Unnamed: 0,task,run,file_count,subjects
0,nback,1.0,10,5
1,nback,2.0,10,5
2,rest,,10,5


## Part 5: Grouping and Aggregation

BIQL supports SQL-like grouping and aggregation functions:

In [15]:
q.run_query("SELECT sub, COUNT(*) GROUP BY sub", format="dataframe")

Unnamed: 0,sub,count
0,1,12
1,4,12
2,5,12
3,2,12
4,3,12


In [16]:
q.run_query(
    "SELECT sub, datatype, COUNT(*) GROUP BY sub, datatype",
    format="json"
)

[{'sub': '01', 'datatype': 'anat', 'count': 2},
 {'sub': '01', 'datatype': 'func', 'count': 6},
 {'sub': '04', 'datatype': 'anat', 'count': 2},
 {'sub': '04', 'datatype': 'func', 'count': 6},
 {'sub': '05', 'datatype': 'anat', 'count': 2},
 {'sub': '05', 'datatype': 'func', 'count': 6},
 {'sub': '02', 'datatype': 'anat', 'count': 2},
 {'sub': '02', 'datatype': 'func', 'count': 6},
 {'sub': '03', 'datatype': 'anat', 'count': 2},
 {'sub': '03', 'datatype': 'func', 'count': 6},
 {'sub': '01', 'datatype': None, 'count': 3},
 {'sub': '01', 'datatype': 'beh', 'count': 1},
 {'sub': '04', 'datatype': None, 'count': 3},
 {'sub': '04', 'datatype': 'beh', 'count': 1},
 {'sub': '05', 'datatype': None, 'count': 3},
 {'sub': '05', 'datatype': 'beh', 'count': 1},
 {'sub': '02', 'datatype': None, 'count': 3},
 {'sub': '02', 'datatype': 'beh', 'count': 1},
 {'sub': '03', 'datatype': None, 'count': 3},
 {'sub': '03', 'datatype': 'beh', 'count': 1}]
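
If `GROUP BY` with `COUNT(*)` is new to you, this plain-Python sketch shows the same idea on a handful of made-up records: rows are bucketed by their `(sub, datatype)` pair, and each bucket is counted. The data and names are illustrative, and this is not BIQL's implementation.

```python
from collections import Counter

# Made-up records; only the grouping fields are included.
records = [
    {"sub": "01", "datatype": "anat"},
    {"sub": "01", "datatype": "func"},
    {"sub": "01", "datatype": "func"},
    {"sub": "02", "datatype": "anat"},
    {"sub": "02", "datatype": "func"},
]

# GROUP BY sub, datatype with COUNT(*): count records per (sub, datatype) key.
counts = Counter((r["sub"], r["datatype"]) for r in records)

rows = [
    {"sub": sub, "datatype": datatype, "count": n}
    for (sub, datatype), n in sorted(counts.items())
]
for row in rows:
    print(row)
```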

## Part 6: Working with Metadata

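BIQL can query JSON sidecar metadata using the `metadata.` namespace.
The synthetic dataset has task-level metadata files like `task-nback_bold.json`.
The queries below summarize the files by task and by datatype, which is a useful first look at a dataset before filtering on metadata fields: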

In [17]:
q.run_query(
    "SELECT task, COUNT(*) as file_count, "
    "ARRAY_AGG(DISTINCT sub) as subjects_with_task, "
    "ARRAY_AGG(DISTINCT datatype) as datatypes "
    "GROUP BY task",
    format="json"
)

[{'task': None,
  'file_count': 25,
  'subjects_with_task': ['01', '02', '03', '04', '05'],
  'datatypes': ['anat']},
 {'task': 'nback',
  'file_count': 20,
  'subjects_with_task': ['01', '02', '03', '04', '05'],
  'datatypes': ['func']},
 {'task': 'rest',
  'file_count': 10,
  'subjects_with_task': ['01', '02', '03', '04', '05'],
  'datatypes': ['func']},
 {'task': 'stroop',
  'file_count': 5,
  'subjects_with_task': ['01', '02', '03', '04', '05'],
  'datatypes': ['beh']}]

In [18]:
q.run_query(
    "SELECT datatype, COUNT(*) as total_files, "
    "COUNT(DISTINCT sub) as subjects, "
    "ARRAY_AGG(DISTINCT sub) as subject_list "
    "GROUP BY datatype "
    "ORDER BY total_files DESC",
    format="json"
)

[{'datatype': 'anat',
  'total_files': 10,
  'subjects': 5,
  'subject_list': ['01', '02', '03', '04', '05']},
 {'datatype': 'func',
  'total_files': 30,
  'subjects': 5,
  'subject_list': ['01', '02', '03', '04', '05']},
 {'datatype': None,
  'total_files': 15,
  'subjects': 5,
  'subject_list': ['01', '02', '03', '04', '05']},
 {'datatype': 'beh',
  'total_files': 5,
  'subjects': 5,
  'subject_list': ['01', '02', '03', '04', '05']}]

## Part 7: Participant Information

Access participant demographics using the `participants.` namespace:

In [19]:
q.run_query(
    "SELECT DISTINCT sub, participants.age, participants.sex",
    format="dataframe"
)

Unnamed: 0,sub,participants.age,participants.sex
0,1,34,F
1,4,21,F
2,5,42,M
3,2,38,M
4,3,22,M


In [20]:
q.run_query(
    "SELECT sub, task, participants.age WHERE participants.age > 25",
    format="dataframe"
)

Unnamed: 0,sub,task,participants.age
0,1,,34
1,1,nback,34
2,1,nback,34
3,1,rest,34
4,1,,34
5,1,nback,34
6,1,rest,34
7,1,nback,34
8,5,,42
9,5,rest,42
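
For readers who want to see where these values come from, here is a sketch that reads a `participants.tsv`-style table with the standard library and applies the same age condition. The table contents are modeled on the demographics shown above, and the in-memory string stands in for the real file; none of this is BIQL's own code.

```python
import csv
import io

# Contents modeled on the demographics shown above; in a real dataset this
# text would be read from participants.tsv at the dataset root.
participants_tsv = (
    "participant_id\tage\tsex\n"
    "sub-01\t34\tF\n"
    "sub-02\t38\tM\n"
    "sub-03\t22\tM\n"
    "sub-04\t21\tF\n"
    "sub-05\t42\tM\n"
)

reader = csv.DictReader(io.StringIO(participants_tsv), delimiter="\t")
# Key each row by the subject label without the "sub-" prefix.
participants = {row["participant_id"].removeprefix("sub-"): row for row in reader}

# Same condition as participants.age > 25 (values are read as text, so convert).
older = sorted(sub for sub, row in participants.items() if int(row["age"]) > 25)
print(older)
```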


## Part 8: Advanced Queries

Let's combine multiple features for more complex queries:

In [21]:
q.run_query("""
    SELECT sub, ses, task, COUNT(*) as n_runs
    WHERE datatype=func AND task != rest
    GROUP BY sub, ses, task
    HAVING COUNT(*) > 1
    ORDER BY sub, task
""", format="json")

[{'sub': '01', 'ses': '02', 'task': 'nback', 'n_runs': 2},
 {'sub': '01', 'ses': '01', 'task': 'nback', 'n_runs': 2},
 {'sub': '02', 'ses': '02', 'task': 'nback', 'n_runs': 2},
 {'sub': '02', 'ses': '01', 'task': 'nback', 'n_runs': 2},
 {'sub': '03', 'ses': '02', 'task': 'nback', 'n_runs': 2},
 {'sub': '03', 'ses': '01', 'task': 'nback', 'n_runs': 2},
 {'sub': '04', 'ses': '02', 'task': 'nback', 'n_runs': 2},
 {'sub': '04', 'ses': '01', 'task': 'nback', 'n_runs': 2},
 {'sub': '05', 'ses': '02', 'task': 'nback', 'n_runs': 2},
 {'sub': '05', 'ses': '01', 'task': 'nback', 'n_runs': 2}]
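
The difference between `WHERE` and `HAVING` is easy to miss. The sketch below walks through the usual SQL order of operations in plain Python, on illustrative records for a single subject. It assumes BIQL follows the same order as SQL, which the query above is consistent with.

```python
from collections import Counter

# Functional files for one subject (illustrative records).
func_files = [
    {"sub": "01", "ses": "01", "task": "nback", "run": "01"},
    {"sub": "01", "ses": "01", "task": "nback", "run": "02"},
    {"sub": "01", "ses": "01", "task": "rest", "run": None},
    {"sub": "01", "ses": "02", "task": "nback", "run": "01"},
    {"sub": "01", "ses": "02", "task": "nback", "run": "02"},
    {"sub": "01", "ses": "02", "task": "rest", "run": None},
]

# WHERE runs first and drops individual rows ...
kept = [f for f in func_files if f["task"] != "rest"]

# ... then GROUP BY counts what is left per (sub, ses, task) ...
n_runs = Counter((f["sub"], f["ses"], f["task"]) for f in kept)

# ... and HAVING finally filters whole groups by their aggregate.
multi_run = {group: n for group, n in n_runs.items() if n > 1}
print(multi_run)
```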

In [22]:
print(q.run_query("""
    SELECT sub, task,
           ARRAY_AGG(filename WHERE suffix='bold') as imaging_files,
           ARRAY_AGG(filename WHERE run='01') as run01_files,
           ARRAY_AGG(filename WHERE run='02') as run02_files
    WHERE datatype=func
    GROUP BY sub, task
""", format="table"))  # Table format, since arrays don't display well in dataframes

| imaging_files   | run01_files     | run02_files     | sub | task  |
| --------------- | --------------- | --------------- | --- | ----- |
| [...4 items...] | [...2 items...] | [...2 items...] | 01  | nback |
| [...2 items...] | [...0 items...] | [...0 items...] | 01  | rest  |
| [...4 items...] | [...2 items...] | [...2 items...] | 04  | nback |
| [...2 items...] | [...0 items...] | [...0 items...] | 04  | rest  |
| [...2 items...] | [...0 items...] | [...0 items...] | 05  | rest  |
| [...4 items...] | [...2 items...] | [...2 items...] | 05  | nback |
| [...2 items...] | [...0 items...] | [...0 items...] | 02  | rest  |
| [...4 items...] | [...2 items...] | [...2 items...] | 02  | nback |
| [...4 items...] | [...2 items...] | [...2 items...] | 03  | nback |
| [...2 items...] | [...0 items...] | [...0 items...] | 03  | rest  |
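
The conditional form `ARRAY_AGG(field WHERE condition)` collects a value only from the rows in a group that satisfy the condition, while the group itself is kept even when nothing matches. A plain-Python sketch of that behavior, using made-up records and names (not BIQL's implementation):

```python
from collections import defaultdict

# Made-up functional files for one subject.
func_files = [
    {"sub": "01", "task": "nback", "run": "01",
     "filename": "sub-01_ses-01_task-nback_run-01_bold.nii"},
    {"sub": "01", "task": "nback", "run": "02",
     "filename": "sub-01_ses-01_task-nback_run-02_bold.nii"},
    {"sub": "01", "task": "rest", "run": None,
     "filename": "sub-01_ses-01_task-rest_bold.nii"},
]

groups = defaultdict(lambda: {"run01_files": [], "run02_files": []})
for f in func_files:
    group = groups[(f["sub"], f["task"])]  # touching the key creates the group
    if f["run"] == "01":
        group["run01_files"].append(f["filename"])
    if f["run"] == "02":
        group["run02_files"].append(f["filename"])

for key, arrays in groups.items():
    print(key, {name: len(files) for name, files in arrays.items()})
```

The `rest` group ends up with two empty lists, which mirrors the `[...0 items...]` cells in the output above.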

## Part 9: Output Formats

BIQL supports multiple output formats for different use cases:

In [23]:
sample_query = "SELECT sub, task, run WHERE datatype=func AND sub=01"

print(q.run_query(sample_query, format="table"))

| run | sub | task  |
| --- | --- | ----- |
| 02  | 01  | nback |
| 01  | 01  | nback |
|     | 01  | rest  |
| 02  | 01  | nback |
|     | 01  | rest  |
| 01  | 01  | nback |


In [24]:
print(q.run_query(sample_query, format="csv"))

run,sub,task
02,01,nback
01,01,nback
,01,rest
02,01,nback
,01,rest
01,01,nback



In [25]:
results_json = q.run_query(sample_query, format="json")
results_json[:2]  # Show first 2 entries

[{'sub': '01', 'task': 'nback', 'run': '02'},
 {'sub': '01', 'task': 'nback', 'run': '01'}]

In [26]:
print(q.run_query("WHERE sub=01 AND suffix=T1w", format="paths"))

/tmp/bids-examples/synthetic/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii
/tmp/bids-examples/synthetic/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii


In [27]:
q.run_query(sample_query, format="dataframe")

Unnamed: 0,sub,task,run
0,1,nback,2.0
1,1,nback,1.0
2,1,rest,
3,1,nback,2.0
4,1,rest,
5,1,nback,1.0
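
If you start from the `json` format (a list of dictionaries), the other text formats are straightforward to derive yourself. The sketch below converts a few illustrative rows to CSV and to a pipe table using only the standard library. It is an approximation for illustration; BIQL's own formatters may order columns or pad cells differently.

```python
import csv
import io

rows = [
    {"sub": "01", "task": "nback", "run": "02"},
    {"sub": "01", "task": "nback", "run": "01"},
    {"sub": "01", "task": "rest", "run": None},
]
columns = ["sub", "task", "run"]

# CSV: csv.DictWriter writes None as an empty field.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()


def as_cell(value):
    """Render a missing value as an empty table cell."""
    return "" if value is None else str(value)


# Pipe table: a header line, a separator line, then one line per row.
lines = [
    "| " + " | ".join(columns) + " |",
    "| " + " | ".join("---" for _ in columns) + " |",
]
lines += ["| " + " | ".join(as_cell(r[c]) for c in columns) + " |" for r in rows]
table_text = "\n".join(lines)

print(csv_text)
print(table_text)
```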


## Part 10: Real-World Examples

Let's look at some practical queries you might use in neuroimaging research:

In [28]:
q.run_query("""
    SELECT sub, 
           COUNT(*) as total_files,
           COUNT(DISTINCT datatype) as datatypes,
           ARRAY_AGG(DISTINCT datatype) as available_data
    GROUP BY sub
""", format="json")

[{'sub': '01',
  'total_files': 12,
  'datatypes': 3,
  'available_data': ['anat', 'beh', 'func']},
 {'sub': '04',
  'total_files': 12,
  'datatypes': 3,
  'available_data': ['anat', 'beh', 'func']},
 {'sub': '05',
  'total_files': 12,
  'datatypes': 3,
  'available_data': ['anat', 'beh', 'func']},
 {'sub': '02',
  'total_files': 12,
  'datatypes': 3,
  'available_data': ['anat', 'beh', 'func']},
 {'sub': '03',
  'total_files': 12,
  'datatypes': 3,
  'available_data': ['anat', 'beh', 'func']}]

In [29]:
q.run_query("""
    SELECT sub, ses,
           COUNT(*) as files_per_session,
           ARRAY_AGG(DISTINCT task) as tasks_in_session
    GROUP BY sub, ses
""", format="json")

[{'sub': '01',
  'ses': '02',
  'files_per_session': 5,
  'tasks_in_session': ['nback', 'rest']},
 {'sub': '01',
  'ses': '01',
  'files_per_session': 6,
  'tasks_in_session': ['nback', 'rest', 'stroop']},
 {'sub': '04',
  'ses': '02',
  'files_per_session': 5,
  'tasks_in_session': ['nback', 'rest']},
 {'sub': '04',
  'ses': '01',
  'files_per_session': 6,
  'tasks_in_session': ['nback', 'rest', 'stroop']},
 {'sub': '05',
  'ses': '02',
  'files_per_session': 5,
  'tasks_in_session': ['nback', 'rest']},
 {'sub': '05',
  'ses': '01',
  'files_per_session': 6,
  'tasks_in_session': ['nback', 'rest', 'stroop']},
 {'sub': '02',
  'ses': '02',
  'files_per_session': 5,
  'tasks_in_session': ['nback', 'rest']},
 {'sub': '02',
  'ses': '01',
  'files_per_session': 6,
  'tasks_in_session': ['nback', 'rest', 'stroop']},
 {'sub': '03',
  'ses': '02',
  'files_per_session': 5,
  'tasks_in_session': ['nback', 'rest']},
 {'sub': '03',
  'ses': '01',
  'files_per_session': 6,
  'tasks_in_session': ['nback', 'rest', 'stroop']}]

In [30]:
q.run_query("""
    SELECT sub,
           COUNT(DISTINCT task) as unique_tasks,
           ARRAY_AGG(DISTINCT task) as completed_tasks,
           COUNT(*) as total_functional_files
    WHERE datatype=func
    GROUP BY sub
    HAVING COUNT(DISTINCT task) > 1  # Subjects with multiple tasks
""", format="json")

[{'sub': '01',
  'unique_tasks': 2,
  'completed_tasks': ['nback', 'rest'],
  'total_functional_files': 6},
 {'sub': '04',
  'unique_tasks': 2,
  'completed_tasks': ['nback', 'rest'],
  'total_functional_files': 6},
 {'sub': '05',
  'unique_tasks': 2,
  'completed_tasks': ['nback', 'rest'],
  'total_functional_files': 6},
 {'sub': '02',
  'unique_tasks': 2,
  'completed_tasks': ['nback', 'rest'],
  'total_functional_files': 6},
 {'sub': '03',
  'unique_tasks': 2,
  'completed_tasks': ['nback', 'rest'],
  'total_functional_files': 6}]

## Summary

You've learned how to:

1. **Basic queries**: Filter by BIDS entities
2. **Logical operators**: Combine conditions with AND, OR, NOT
3. **SELECT clause**: Choose specific fields to return
4. **Pattern matching**: Use wildcards and regex
5. **Ranges and lists**: Query multiple values efficiently
6. **Aggregations**: Count and group data
7. **Metadata queries**: Access JSON sidecar information
8. **Participant data**: Query demographics
9. **Complex queries**: Combine multiple features
10. **Output formats**: Export results in different formats

## Next Steps

- Check out the [Language Reference](language.md) for complete syntax details
- Explore more [examples](../examples/) for specific use cases
- Use the CLI tool `biql` for command-line queries
- Integrate BIQL into your Python analysis pipelines
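
As a starting point for pipeline integration, here is a minimal sketch of a helper that wraps a query and returns a list of paths. It assumes an engine object like the one returned by `create_query_engine()` in this tutorial, and that `run_query(..., format="paths")` returns newline-separated paths as one string, as the printed output in Part 9 suggests. The helper `t1w_paths` and the `FakeEngine` stand-in are hypothetical names used so the sketch runs without a dataset.

```python
def t1w_paths(engine, subject: str) -> list[str]:
    """Return T1w file paths for one subject.

    Assumes engine.run_query(..., format="paths") returns newline-separated
    paths as a single string (an assumption based on this tutorial's output).
    """
    text = engine.run_query(f"WHERE sub={subject} AND suffix=T1w", format="paths")
    return [line for line in text.splitlines() if line.strip()]


class FakeEngine:
    """Stand-in used here so the sketch runs without a dataset or BIQL."""

    def run_query(self, query: str, format: str = "json") -> str:
        return (
            "/tmp/bids-examples/synthetic/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii\n"
            "/tmp/bids-examples/synthetic/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\n"
        )


paths = t1w_paths(FakeEngine(), "01")
print(paths)
```

In a real pipeline you would pass the engine created from your dataset instead of `FakeEngine()`.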

Happy querying!