Releases: aprilweilab/picovcf
v2.0
This release increments the IGD file format version from v2 to v3. There are no significant VCF-related changes.
- Simplified missing data handling. Previously, missing data was stored in its own table as sparse lists and loaded all at once, which could use a fair amount of RAM for any non-trivial amount of missing data. Now it is just another row among the "regular" data rows.
- Each row can be either sparse or non-sparse. Sparse rows are lists of sample indexes (like missing data was before). Non-sparse rows are bit vectors (like all the other data was before). For large datasets, this change makes the resulting file significantly smaller than before.
- The API for writing IGD data rows was simplified.
- Faster processing of the bitvector representation.
- Example in igdpp for computing allele frequency.
- In-memory storage of allele values (the strings) was reworked to be significantly smaller: more than 6x smaller for most alleles.
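
To illustrate why allowing each row to be either sparse or a bit vector can shrink the file, here is a minimal, self-contained sketch. It does not use picovcf's API: every name in it (`SampleList`, `BitVector`, `preferSparse`, and so on) is hypothetical. The 32-bit index width, the least-significant-bit-first bit order, and the "pick whichever is smaller" rule are assumptions made for the example, not a description of the IGD v3 on-disk layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; these are NOT picovcf's API.
using SampleList = std::vector<uint32_t>; // sparse row: indexes of samples carrying the allele
using BitVector = std::vector<uint8_t>;   // non-sparse row: one bit per sample

// Approximate payload size of each representation (ignoring any per-row header).
inline size_t sparseBytes(size_t numCarriers) { return numCarriers * sizeof(uint32_t); }
inline size_t denseBytes(size_t numSamples) { return (numSamples + 7) / 8; }

// Assumed selection rule: use the sparse form only when it is strictly smaller.
inline bool preferSparse(size_t numCarriers, size_t numSamples) {
    return sparseBytes(numCarriers) < denseBytes(numSamples);
}

// Expand a sparse row into a bit vector (bit i of byte i/8, least significant bit first).
inline BitVector toBitVector(const SampleList& carriers, size_t numSamples) {
    BitVector bits(denseBytes(numSamples), 0);
    for (uint32_t idx : carriers) {
        assert(idx < numSamples);
        bits[idx / 8] = static_cast<uint8_t>(bits[idx / 8] | (1u << (idx % 8)));
    }
    return bits;
}

// Count the set bits in a bit vector row, i.e. the number of carrier samples.
inline size_t countCarriers(const BitVector& bits) {
    size_t count = 0;
    for (uint8_t byte : bits) {
        while (byte != 0) {
            byte = static_cast<uint8_t>(byte & (byte - 1)); // clear the lowest set bit
            ++count;
        }
    }
    return count;
}

// Fraction of samples that carry the allele.
inline double alleleFrequency(size_t numCarriers, size_t numSamples) {
    return numSamples == 0
               ? 0.0
               : static_cast<double>(numCarriers) / static_cast<double>(numSamples);
}
```

Under these assumptions, a variant carried by 3 of 1000 samples needs 12 bytes as a sparse list versus 125 bytes as a bit vector, while one carried by 100 samples needs 400 bytes as a list, so the bit vector is the smaller choice. Either form yields the same carrier count, and therefore the same allele frequency.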