A Python library for reading kdb+ splayed tables directly from disk, without requiring a running q process. kdb+ is a high-performance time-series database commonly used in finance. Read your data q-less!
pip install -e .

Requires: numpy, pandas
This library memory-maps the column files and parses the binary format directly, giving efficient read access without needing q/kdb+. As in kdb+, you can read only the columns and rows you need.
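The memory-mapping idea can be illustrated with NumPy alone. The sketch below is not the library's implementation: it writes a toy column file and maps it back, and the 16-byte header size, the helper names `write_fake_column` and `map_column`, and the file name are all assumptions made for the example. It shows why slices and boolean masks are cheap: only the requested rows are copied out of the mapping.

```python
import os
import tempfile

import numpy as np

# Assumption for illustration only: a fixed-size header precedes the raw values.
# The real kdb+ column header layout is not reproduced here.
HEADER_BYTES = 16


def write_fake_column(path, values):
    """Write a toy column file: HEADER_BYTES of padding, then little-endian int64 values."""
    with open(path, "wb") as f:
        f.write(b"\x00" * HEADER_BYTES)
        f.write(np.asarray(values, dtype="<i8").tobytes())


def map_column(path, dtype="<i8"):
    """Memory-map the values after the header without loading them into RAM."""
    return np.memmap(path, dtype=dtype, mode="r", offset=HEADER_BYTES)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "size")  # hypothetical column file name
    write_fake_column(path, range(1_000))

    col = map_column(path)
    first_rows = np.array(col[:5])  # slice: copies only the rows requested

    mask = np.zeros(len(col), dtype=bool)
    mask[[10, 20, 30]] = True
    picked = np.array(col[mask])  # boolean mask: copies only the matching rows

    del col  # release the mapping before the temporary directory is removed
```

Because `first_rows` and `picked` are plain copies, they remain valid after the mapping is released.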
import qless
# Mount typical NYSE TAQ database
hdb = qless.HDB('TAQ_DATA/')
# Read the entire splayed table into a pandas DataFrame (34 million rows, 15 fields; takes almost a minute)
hdb.read_table(table='trade')
# Read the whole Symbol column, then use a boolean mask to load only the trades for symbol A (about 15k rows and 4 fields, takes ~100ms)
mask = hdb.read_table(table='trade', columns=['Symbol'])['Symbol'] == 'A'
hdb.read_table(table='trade', columns=['Time', 'Symbol', 'Trade_Price', 'Trade_Volume'], rows=mask)
# Simply read the first 100 rows (takes 20ms)
hdb.read_table(table='trade', columns=['Time', 'Symbol', 'Trade_Price', 'Trade_Volume'], rows=slice(0, 100))
# List partitions and read a specific date from a partitioned table
hdb.partitions # ['2024.01.02', '2024.01.03', ...]
hdb.read_table('quote', partition='2024.01.02')

| kdb+ Type | Code | Python Type |
|---|---|---|
| boolean | 1 | bool |
| byte | 4 | uint8 |
| short | 5 | int16 |
| int | 6 | int32 |
| long | 7 | int64 |
| real | 8 | float32 |
| float | 9 | float64 |
| char | 10 | bytes (S1) |
| symbol | enum | Categorical |
| timestamp | 12 | datetime64[ns] |
| date | 14 | datetime64[ns] |
| timespan | 16 | timedelta64[ns] |
| time | 19 | timedelta64[ms] |
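For the temporal rows in the table, kdb+ stores timestamps as nanoseconds and dates as days counted from 2000-01-01, and times as milliseconds since midnight. A minimal sketch of mapping such raw integers to the NumPy types listed above follows. The function names are hypothetical, null values are not handled, and whether the library converts values exactly this way is an assumption.

```python
import numpy as np

# kdb+ counts timestamps and dates from 2000-01-01 rather than the Unix epoch.
KDB_EPOCH_NS = np.datetime64("2000-01-01", "ns")
KDB_EPOCH_D = np.datetime64("2000-01-01", "D")


def timestamps_from_raw(raw_ns):
    """Nanoseconds since 2000-01-01 -> datetime64[ns]."""
    return KDB_EPOCH_NS + np.asarray(raw_ns, dtype="int64").astype("timedelta64[ns]")


def dates_from_raw(raw_days):
    """Days since 2000-01-01 -> datetime64[ns]."""
    days = np.asarray(raw_days, dtype="int64").astype("timedelta64[D]")
    return (KDB_EPOCH_D + days).astype("datetime64[ns]")


def times_from_raw(raw_ms):
    """Milliseconds since midnight -> timedelta64[ms]."""
    return np.asarray(raw_ms, dtype="int64").astype("timedelta64[ms]")


ts = timestamps_from_raw([0, 86_400_000_000_000])  # midnight, then one day later
dates = dates_from_raw([0, 8767])                  # day 8767 is 2024-01-02
times = times_from_raw([34_200_000])               # 09:30:00.000
```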
- Read-only (no write support)
- Column attributes are not yet used to speed up reads (but attributed files can still be read)
pip install -e ".[dev]" # install with dev dependencies

pytest # run all tests
pytest -v # verbose output

ruff check . # check for issues
ruff check . --fix # auto-fix issues
ruff format . # format code

MIT