
v0.3.0


@nvictus nvictus released this 31 Aug 21:44

Maintenance

  • Drop support for Python 3.6, add support for 3.9
  • Pandas dependency bumped to 1.3
  • Lots of new documentation

Conceptual changes

We formulated strict definitions for genomic intervals, dataframes, and their various properties. All bioframe functions are expected to adhere to these definitions.

API changes

Reorganize modules

  • ops - operations on genomic interval dataframes
  • extras - miscellaneous operations, most involving
    genomic sequences and gene annotations
  • vis - visualizations of genomic interval dataframes
  • core.arrops - operations on genomic interval arrays
  • core.checks - tests for definitions of genomic interval dataframes
  • core.construction - construction and sanitization of genomic interval dataframes
  • core.specs - specifications for the implementation of genomic intervals using pandas dataframes
    (e.g. column names, datatypes)
  • core.stringops - operations on genomic interval strings
  • io.fileops - I/O on common file formats for genomic data
  • io.schemas - schemas for standard tabular formats for genomic data storage
  • io.resources - interfaces to popular online genomic data resources

New functions

  • extras.pair_by_distance, ops.sort_bedframe, ops.assign_view, dataframe constructors
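
As a rough illustration of why a view-aware sort such as ops.sort_bedframe is useful, here is a plain-Python sketch that orders intervals by a user-supplied region order rather than alphabetically. It does not call bioframe; the helper sort_by_view and the sample data are hypothetical, and the actual function's parameters and behavior should be checked against the bioframe documentation.

```python
# Hypothetical helper, NOT bioframe's implementation: sort (chrom, start, end)
# tuples by the order in which chromosomes appear in a "view", then by position.
def sort_by_view(intervals, view_order):
    rank = {chrom: i for i, chrom in enumerate(view_order)}
    return sorted(intervals, key=lambda iv: (rank[iv[0]], iv[1], iv[2]))


intervals = [
    ("chr2", 50, 80),
    ("chr1", 300, 400),
    ("chr10", 5, 9),
    ("chr1", 100, 200),
]
view_order = ["chr1", "chr2", "chr10"]  # assumed order of regions in the view

print(sort_by_view(intervals, view_order))
# [('chr1', 100, 200), ('chr1', 300, 400), ('chr2', 50, 80), ('chr10', 5, 9)]

# A naive lexicographic sort would place chr10 before chr2:
print(sorted(intervals))
# [('chr1', 100, 200), ('chr1', 300, 400), ('chr10', 5, 9), ('chr2', 50, 80)]
```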

Existing functions

  • expand: accept negative and fractional values
  • overlap: change default suffixes, keep_order=True
  • subtract: add return_index and keep_order
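
To make the expand change more concrete, the sketch below shows what a negative pad and a fractional scale could mean for a single interval. It is plain Python and not bioframe's implementation: the helper expand_interval, its parameter names, the rounding, and the collapse-to-midpoint rule are all assumptions made for illustration, and clipping at chromosome boundaries is not handled.

```python
# Hypothetical helper, NOT bioframe's implementation.
def expand_interval(start, end, pad=None, scale=None):
    """Return a padded or rescaled copy of the interval [start, end)."""
    if (pad is None) == (scale is None):
        raise ValueError("supply exactly one of pad or scale")
    if pad is not None:
        # A negative pad contracts the interval from both sides.
        new_start, new_end = start - pad, end + pad
        if new_start > new_end:
            # Assumed rule: an over-contracted interval collapses to its midpoint.
            mid = start + (end - start) // 2
            new_start = new_end = mid
    else:
        # A scale below 1 shrinks the interval around its midpoint.
        half_change = 0.5 * (scale - 1) * (end - start)
        new_start = round(start - half_change)
        new_end = round(end + half_change)
    return new_start, new_end


print(expand_interval(100, 200, pad=10))     # (90, 210)
print(expand_interval(100, 200, pad=-10))    # (110, 190)
print(expand_interval(100, 200, scale=0.5))  # (125, 175)
print(expand_interval(100, 200, scale=2))    # (50, 250)
```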

Enable pd.NA for missing values, typecasting

Data additions

  • add schemas for bedpe, gap, UCSCmRNA, pgsnp
  • add tables with curated detailed genome assembly information

Miscellaneous

  • frac_gc is faster now