v2.0.0
This is a full rewrite of bids2table aiming to make the project much simpler while also adding a few useful new features. The major changes are:
- Reduce the codebase from 2150 lines (across bids2table and elbow) to 720 lines.
- Reduce dependencies to only bidschematools and pyarrow (and tqdm).
- Improve runtime performance. Indexing bids-examples on my laptop dropped from 2.5 s to 650 ms (roughly a 4x speedup), although thorough benchmarking is still to be done.
- Add support for indexing datasets hosted in the cloud via cloudpathlib. This makes it possible, for example, to index all of OpenNeuro (~1400 datasets, 1.2M files) in less than 15 minutes.
(bids2table) clane$ b2t2 index -o openneuro.parquet -j 8 --use-threads s3://openneuro.org/ds*
100%|█████████████████████████████████████| 1408/1408 [12:25<00:00, 1.89it/s, ds=ds006193, N=1.2M]
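As a quick sanity check on the figures quoted above, the short sketch below recomputes the speedup and the indexing throughput from the numbers in these notes. It uses only the Python standard library and does not call bids2table. The variable names are illustrative, and the file count is the rounded 1.2M value, so the files-per-second figure is only approximate.

```python
# Figures copied from the release notes above.
old_runtime_s = 2.5     # bids-examples indexing time before the rewrite
new_runtime_s = 0.650   # bids-examples indexing time after the rewrite

n_datasets = 1408            # dataset count shown in the progress bar
elapsed_s = 12 * 60 + 25     # elapsed time 12:25, in seconds
n_files = 1.2e6              # rounded file count (N=1.2M), so approximate

speedup = old_runtime_s / new_runtime_s
datasets_per_s = n_datasets / elapsed_s
files_per_s = n_files / elapsed_s

print(f"speedup: {speedup:.2f}x")
print(f"throughput: {datasets_per_s:.2f} datasets/s")
print(f"throughput: roughly {files_per_s:.0f} files/s")
```

The computed dataset rate agrees with the 1.89it/s reported by the progress bar, and the speedup comes out slightly under the rounded 4x.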
The new API is a complete break from the old API. Please see the updated documentation for details, and consider opening an issue if you need help transitioning to the new API.
Credit to @nx10 for suggesting these improvements and providing high-level direction.
What's Changed
- bids2table 2.0 by @clane9 in #48
- Add documentation by @clane9 in #49
- Pre-release fixes by @clane9 in #50
Full Changelog: v0.2.0...v2.0.0