Build System for Learning Research

Overview

BuSLR knows about various packages that are useful for machine learning (originally speech and language processing), and how to go get them and build them. BuSLR supports two build systems:

  • One is built around cmake's ExternalProject module. It uses make dependencies to handle package dependencies.
  • The other is built around conda. It functions as a repository for conda's meta.yaml and build.sh metadata.

Not all packages are supported in both systems.

There is a balance to strike between BuSLR, the wider conda infrastructure and the Linux distribution packaging systems. In general, if something is normally in a Linux distribution (e.g., sox) then there's no point handling it here. If it's in conda then the same argument applies, but more subjectively: pytorch is better in conda, kaldi perhaps not.

Also, with conda, bear in mind that this is not a joke; the thing marked "another PIP?" does exist.

To use cmake

Clone the repo and do

cd buslr/local
cp Configure.example configure.sh # Edit if necessary
./configure.sh
make <package name>

The package is built in local and installed to local unless the appropriate line in configure.sh is changed. You can set:

export PATH=<path-to-buslr>/local/bin:$PATH

to access the builds, or do source <path-to-buslr>/local/etc/buslrvars.sh to set other appropriate variables too. Set the INHIBIT line in configure.sh to inhibit building of packages for which you might have a system version (typically cuda or mkl).
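As a minimal sketch of the PATH step, the snippet below prepends the BuSLR bin directory while keeping the existing entries, and avoids adding it twice if the file is sourced repeatedly. `BUSLR_ROOT` and `buslr_prepend_path` are hypothetical names introduced for illustration; they are not part of BuSLR, and this is not a reproduction of what buslrvars.sh does.

```shell
#!/bin/sh
# Hypothetical location of the clone; adjust to your checkout.
BUSLR_ROOT="${BUSLR_ROOT:-$HOME/buslr}"

# Prepend a directory to PATH unless it is already listed.
buslr_prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;     # otherwise put it in front
    esac
}

buslr_prepend_path "$BUSLR_ROOT/local/bin"
export PATH
```

Putting the directory in front means BuSLR builds shadow system versions of the same tool; append instead if the system versions should win.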

To use conda

Clone the repo and do

cd buslr
conda build src/<package name>

As long as the conda-bld directory is on your channel list (it is indexed and functions as a local channel), you can do this:

conda install <package name>
conda build purge

Many of the packages were initialised with this command

conda-skeleton pypi <name-of-pip-package>

It allows conda versions of PIP packages to be built, thus avoiding the problem of multiple PIPs and of conda being unaware of PIP.

Some individual package instructions

  • HTS requires the HTK sources to be downloaded manually.
  • SRILM also requires a manual download.
  • Some packages (festival, kaldi, SRILM) don't really support a make install. See the in-place build section below.

Guidelines for creating new packages

Packages

There is a directory for each package. Typically there are only CMakeLists.txt and meta.yaml files, but there can also be patched or whole files to be copied into the tree. In the case of HTS and SRILM, the manually downloaded files are placed there too.

Patches

Following the man page for patch, patches can be generated by copying the original file to <path-to>/<file>.org, modifying the file, then running

diff -Naur <path-to>/<old-file> <path-to>/<new-file>

This is typically run relative to a directory called package/package-prefix/src/package. At patch time, cmake will cd to that directory. The patch can be applied using

PATCH_COMMAND     patch -p0 < ${CMAKE_CURRENT_SOURCE_DIR}/patch.txt

in the CMakeLists.txt file. A precedent for this is the sctk package, which patches the installation directory of a deep makefile.
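The single-file workflow can be tried end to end in a scratch directory. The sketch below is illustrative only: the `src/Makefile` file and its `PREFIX` contents are made up, and the temporary directory merely stands in for package/package-prefix/src/package. It assumes `diff` and `patch` are installed.

```shell
#!/bin/sh
set -e
# Scratch tree standing in for the package source directory (hypothetical contents).
work=$(mktemp -d)
mkdir -p "$work/src"
cd "$work"
printf 'PREFIX=/usr/local\n' > src/Makefile

# 1. Keep a pristine copy, then edit the real file.
cp src/Makefile src/Makefile.org
printf 'PREFIX=/opt/buslr\n' > src/Makefile

# 2. Generate the patch. diff exits with status 1 when the files differ,
#    so mask that from `set -e`.
diff -Naur src/Makefile.org src/Makefile > patch.txt || true

# 3. Put the original back (as in a fresh checkout, with no .org file left)
#    and apply the patch the same way PATCH_COMMAND would.
mv src/Makefile.org src/Makefile
patch -p0 < patch.txt

patched=$(cat src/Makefile)
echo "$patched"
```

Because the diff was taken from the directory that cmake later changes into, the paths in patch.txt already match and no components need stripping, hence `-p0`.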

If there are multiple patched files, it's better to run diff on a copy of the whole directory. In this case, every path in the patch carries a leading directory name, so we need patch -p1 to strip it.
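A sketch of the whole-directory case, again in a scratch tree with made-up file names (`package`, `package.new`, `a.txt`, `lib/b.txt` are all hypothetical). It shows why one leading path component has to be stripped when the patch is applied from inside the source directory.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p package/lib
printf 'one\n' > package/a.txt
printf 'two\n' > package/lib/b.txt

# Copy the whole tree, then edit several files in the copy.
cp -r package package.new
printf 'ONE\n' > package.new/a.txt
printf 'TWO\n' > package.new/lib/b.txt

# Paths in the patch now start with package/ or package.new/.
# diff exits with status 1 when differences are found, so mask that from `set -e`.
diff -Naur package package.new > patch.txt || true

# Apply from inside the source directory, stripping that first component.
cd package
patch -p1 < ../patch.txt

result="$(cat a.txt) $(cat lib/b.txt)"
echo "$result"
```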

If the package is git based then git can generate the patch using git diff. It functions like the directory case, so use patch -p1. However, patching git checkouts causes problems on updates; see irstlm.

Where a package doesn't even have a build system, a cmake file can be copied directly into the tree. This approach is taken in sph2pipe.

Installing using CMake

Some packages don't have an install step. The native CMake install can work well in these cases. CMake's install() command actually writes things to a file called cmake_install.cmake. The trick is to use this file as the INSTALL_COMMAND for these cases. The simplest precedent is libresample. So, define this:

set(CMAKE_INSTALL_SCRIPT ${CMAKE_CURRENT_BINARY_DIR}/cmake_install.cmake)

add this command

INSTALL_COMMAND   ${CMAKE_COMMAND} -P ${CMAKE_INSTALL_SCRIPT}

and specify the files using install(FILES <files> DESTINATION <where>).

In-place builds

Some packages, notably kaldi and the festvox family, don't really support being installed. For these, we set SOURCE_DIR to something at top level (rather than buried in the src tree) and set INSTALL_COMMAND true to suppress installation. true here is the unix command that does nothing and returns 0 (success); empty strings don't survive the BuSLR_Add wrapper.
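The behaviour of `true` is easy to confirm in a shell. This is only a quick check of the exit status, not a reproduction of how BuSLR_Add or ExternalProject invokes the command; the `status` and `outcome` variables are illustrative.

```shell
#!/bin/sh
# `true` does nothing and reports success.
true
status=$?
echo "true exited with status $status"

# A build tool treats a zero status as "step succeeded", so the no-op passes.
outcome="failed"
if true; then
    outcome="install step skipped successfully"
fi
echo "$outcome"
```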
