v0.30.2 - gget virus updates

@lauraluebbert lauraluebbert released this 08 Feb 19:23
· 257 commits to main since this release
d34e1ce
  • gget virus updates: Metadata streaming optimization, improved protein filtering, and enhanced error handling and retry logic
    • Metadata now streams to disk during fetch to prevent memory exhaustion on large datasets (100,000+ records)
    • Fixed metadata CSV mapping (camelCase → snake_case) for organism name, host, and collection date
    • Enhanced protein filtering for segmented viruses with improved FASTA header parsing
    • Added annotated=False option for filtering unannotated sequences
    • Added progress bars to batched sequence downloads
    • Fixed collection date naming bug
    • Improved error messages for invalid filter dates
    • Enhanced retry logic for virus name resolution
    • Added more verbose output to the influenza A and COVID-19 checking steps
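
To illustrate the streaming and key-mapping changes listed above, here is a minimal sketch of the general technique. It is not the gget implementation: `camel_to_snake` and `stream_metadata_to_csv` are hypothetical helper names, and the record keys are assumed examples of camelCase metadata fields. Each record is written to disk as soon as it arrives, so memory use stays roughly constant regardless of how many records are fetched.

```python
import csv
import re
from typing import Dict, Iterable


def camel_to_snake(name: str) -> str:
    """Convert a camelCase key such as 'collectionDate' to 'collection_date'."""
    # Insert an underscore at every lowercase/digit -> uppercase boundary.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def stream_metadata_to_csv(records: Iterable[Dict[str, str]], path: str) -> int:
    """Write records to a CSV file one row at a time (hypothetical helper).

    Nothing is accumulated in memory: each record is converted and written
    as soon as the iterable yields it. Returns the number of rows written.
    """
    writer = None
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        for record in records:
            row = {camel_to_snake(key): value for key, value in record.items()}
            if writer is None:
                # Assumption: the first record defines the CSV columns.
                # Keys missing from later records are left empty, and
                # unexpected extra keys are dropped.
                writer = csv.DictWriter(
                    handle, fieldnames=list(row), extrasaction="ignore"
                )
                writer.writeheader()
            writer.writerow(row)
            count += 1
    return count
```

Passing a generator (for example, one that yields records page by page from a paginated API) keeps the whole pipeline lazy, which is the point of streaming when a query returns 100,000+ records.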