Skip to content

v1.0.0b4

Compare
Choose a tag to compare
@chschroeder chschroeder released this 04 May 18:20

This release adds two no query strategies, improves the Dataset interface, and introduces optional dependencies.

Added

  • General:
    • We now have a concept for optional dependencies which allows components to rely on soft dependencies, i.e. python dependencies which can be installed on demand (and only when certain functionality is needed).
  • Datasets:
    • The Dataset interface now has a clone() method that creates an identical copy of the respective dataset.
  • Query Strategies:

Changed

  • Datasets:
    • Separated the previous DatasetView implementation into interface (DatasetView) and implementation (SklearnDatasetView).
    • Added clone() method which creates an identical copy of the dataset.
  • Query Strategies:
    • EmbeddingBasedQueryStrategy now only embeds instances that are either in the label or in the unlabeled pool (and no longer the entire dataset).
  • Code examples:
    • Code structure was unified.
    • Number of iterations can now be passed via an cli argument.
  • small_text.integrations.pytorch.utils.data:
    • Method get_class_weights() now scales the resulting multi-class weights so that the smallest class weight is equal to 1.0.