Use sqlite3 database for reference data

@pbashyal-nmdp pbashyal-nmdp released this 15 Oct 22:10
· 431 commits to master since this release
e06cea7

Use sqlite3 database for data

Offload MAC codes from memory to a sqlite3 database (sqlite3 is supported natively by Python)
to reduce the memory footprint. All MAC lookups now go through the database. The alleles and
G group expansions are still held in memory.
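
As a rough illustration of this kind of lookup, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `mac_codes`, its columns, the helper functions, and the sample rows are all assumptions made for this example; they are not taken from py-ard's actual schema or code.

```python
import sqlite3


def create_mac_table(conn, mac_codes):
    """Store a {code: "allele/allele/..."} mapping in a table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mac_codes ("
        "code TEXT PRIMARY KEY, alleles TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO mac_codes (code, alleles) VALUES (?, ?)",
        mac_codes.items(),
    )
    conn.commit()


def lookup_mac(conn, code):
    """Return the expansion of one MAC code as a list, or None if it is unknown."""
    row = conn.execute(
        "SELECT alleles FROM mac_codes WHERE code = ?", (code,)
    ).fetchone()
    return row[0].split("/") if row else None


# Illustrative sample rows only; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
create_mac_table(conn, {"AB": "01/02", "AC": "01/03"})
print(lookup_mac(conn, "AB"))  # ['01', '02']
```

Because the primary key is indexed, each lookup reads only the matching row, so the full code list never has to be loaded into a Python dictionary.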

In addition, all generated data is saved as tables in the same database, so all reference
data is stored in a single file in a standard format.
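
A small sketch of the single-file idea, assuming each generated mapping is stored as its own two-column table and read back into a dictionary at startup. The helper names, the table name `g_group`, the column names, and the sample row are hypothetical and chosen only for illustration.

```python
import sqlite3


def save_dict(conn, table, mapping):
    """Persist a str -> str mapping as a two-column table.

    The table name is interpolated into the SQL, so pass trusted names only.
    """
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(name TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.executemany(
        f"INSERT OR REPLACE INTO {table} (name, value) VALUES (?, ?)",
        mapping.items(),
    )
    conn.commit()


def load_dict(conn, table):
    """Read a table written by save_dict back into an in-memory dict."""
    return dict(conn.execute(f"SELECT name, value FROM {table}"))


def list_tables(conn):
    """Return the names of all tables stored in this database file."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# Use a file path instead of ":memory:" to get a single reusable data file.
conn = sqlite3.connect(":memory:")
g_group = {"A*01:01:01:01": "A*01:01:01G"}  # illustrative sample row
save_dict(conn, "g_group", g_group)
restored = load_dict(conn, "g_group")
print(list_tables(conn), restored == g_group)
```

With a layout like this, a later startup can check which tables already exist and skip regenerating them.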

This led to a drastic reduction in memory usage and startup time.

| Version | First Time | Prebuilt Data |
|---------|------------|---------------|
| 0.1.0   | 10.5 sec   | 4.92 sec      |
| 0.2.0   | 814 msec   | 598 msec      |
| 0.3.0   | 24.1 msec  | 24.7 msec     |

Heap memory used by ARD reference data after ard = pyard.ARD(3290)

| Version | Memory (MB) |
|---------|-------------|
| 0.1.0   | 2977.86     |
| 0.2.0   | 420.76      |
| 0.3.0   | 3.74        |
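
These notes do not say how the figures were collected. One possible way to take comparable measurements is sketched below with the standard-library `time` and `tracemalloc` modules. The `measure_startup` helper is hypothetical, and a plain dictionary stands in for the real object so the sketch runs without py-ard installed.

```python
import time
import tracemalloc


def measure_startup(factory):
    """Call factory() and return (object, elapsed seconds, bytes still allocated)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    start = time.perf_counter()
    obj = factory()
    elapsed = time.perf_counter() - start
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return obj, elapsed, after - before


# Stand-in workload; with py-ard installed one could pass
# lambda: pyard.ARD(3290) instead.
data, seconds, heap_bytes = measure_startup(
    lambda: {i: str(i) for i in range(10_000)}
)
print(f"{seconds * 1000:.1f} msec, {heap_bytes / 1e6:.2f} MB")
```

Tracing allocations slows execution, so timing and memory are better measured in separate runs when precise numbers matter.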