v0.3

Released by @jacobkahn on 16 Apr 22:30

First stable release following the consolidation with wav2letter. Flashlight is now split into four parts (a brief usage sketch follows the list):

  • flashlight/lib contains kernels and standalone utilities for sequence losses, beam-search decoding, text processing, and more.
  • flashlight/fl is the core neural network library, built on the ArrayFire tensor library.
  • flashlight/app contains applications of the core library to machine learning across domains.
  • flashlight/ext contains extensions on top of Flashlight and ArrayFire that are useful across apps.
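
The core library is what a downstream C++ program typically touches first. A minimal sketch of using it (assuming an ArrayFire-backed build; the umbrella header path and exact signatures may vary slightly across versions):

```cpp
#include <arrayfire.h>
#include "flashlight/fl/flashlight.h" // umbrella header for the core library (flashlight/fl)

int main() {
  // Wrap an ArrayFire array in an autograd Variable.
  auto input = fl::Variable(af::randu(10, 1), /* calcGrad = */ true);

  // A module from the core NN library: a 10 -> 4 affine transform.
  fl::Linear linear(10, 4);

  auto output = linear(input);
  output.backward(); // populates gradients for `input` and the layer's parameters
  return 0;
}
```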

Major Features

  • Automatic mixed precision training (AMP) -- typed tensor and autograd operators
  • Framework for building custom memory managers on top of ArrayFire (docs)
  • OneDNN as a backend for primitive operations on the CPU
  • New dataset abstractions in core (flashlight/fl/dataset) -- see the sketch after this list
  • Application libraries:
      • Audio augmentation library (ASR)
      • Tools for training models using iterative pseudo-labeling (IPL) (ASR)
  • [early] OpenCL support with both ROCm and Intel
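
Composing the new dataset abstractions looks roughly like the following. This is a minimal sketch built on the core dataset classes (fl::TensorDataset, fl::ShuffleDataset, fl::BatchDataset); constructor details and defaults may differ from what is shown:

```cpp
#include <cassert>
#include <memory>
#include <vector>

#include <arrayfire.h>
#include "flashlight/fl/flashlight.h"

int main() {
  // 1000 examples: 10-dim features and scalar targets, one example per column.
  af::array features = af::randu(10, 1000);
  af::array targets = af::randu(1, 1000);

  // TensorDataset yields one {feature, target} sample per example.
  auto base = std::make_shared<fl::TensorDataset>(
      std::vector<af::array>{features, targets});

  // Compose transforms: shuffle, then batch 32 examples at a time.
  auto shuffled = std::make_shared<fl::ShuffleDataset>(base);
  auto batched = std::make_shared<fl::BatchDataset>(shuffled, /* batchsize = */ 32);

  for (auto& sample : *batched) {
    // sample[0] is a 10 x 32 feature batch; sample[1] is a 1 x 32 target batch.
    assert(sample.size() == 2);
  }
  return 0;
}
```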

Build Changes/Improvements

  • C++17 support -- gcc 7 / clang 6 or newer required.
  • Support for vcpkg via FL_BUILD_STANDALONE
  • Consolidation of wav2letter and app-based build selection
  • CMake 3.10 minimum, better support for shared objects
  • First-class support for CUDA and Halide kernels
  • Improved support for downloading not-found dependencies (Gloo, KenLM, libsndfile)
  • Improved support for dependency management for downstream projects using Flashlight's installed CMake config (cmake/flashlightConfig.cmake.in)
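
With the installed package config, a downstream project can pick up Flashlight roughly as follows (a sketch; the exported flashlight::flashlight target name is an assumption based on the installed flashlightConfig.cmake):

```cmake
cmake_minimum_required(VERSION 3.10)
project(myDownstreamProject LANGUAGES CXX)

# Locate Flashlight via the installed flashlightConfig.cmake
find_package(flashlight CONFIG REQUIRED)

add_executable(myBinary src/main.cpp)
# Exported target name assumed from the installed package config.
target_link_libraries(myBinary PRIVATE flashlight::flashlight)
```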

Other Improvements

  • Support for padding in transformer/multihead attention
  • SpecAugment for raw waveforms (implemented via a low-pass filter)
  • Conformer implementation
  • Improved autograd for the indexing operator (repeated indices now supported)
  • Improved Python bindings build, with support for setup.py install
  • Lots of new documentation

Improvements/new features in wav2letter (flashlight/app/asr)

  • Fixed padding issues in s2s models: pre-training window, encoder attention, encoder-decoder attention
  • Refactored the s2s codebase
  • Fixes to memory allocations for s2s beam-search decoder (less memory, no OOM issues)
  • Fixes to beam-search decoder to support non-empty surround
  • Fixes to dataset pipeline + dynamic batching support