Vowpal Wabbit

This is the Vowpal Wabbit fast online learning code. For Windows-specific information, see README.Windows.md.

Why Vowpal Wabbit?

Vowpal Wabbit is a machine learning system that pushes the frontier of machine learning with techniques such as online learning, hashing, allreduce, reductions, learning2search, and active and interactive learning. There is a specific focus on reinforcement learning, with several contextual bandit algorithms implemented; the system's online nature lends itself well to these problems. Vowpal Wabbit is a destination for implementing and maturing state-of-the-art algorithms with performance in mind.
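
For instance, contextual bandit training takes data where each line records the action taken, its observed cost, and the probability with which that action was chosen, followed by the context features. The following is a minimal sketch (the file name and toy features are hypothetical, not from this README), using the --cb option with 4 possible actions:

```shell
# Each line: action:cost:probability, then '|' and the context features.
cat > cb_train.txt <<'EOF'
1:2:0.4 | a c
3:0.5:0.2 | b d
4:1.2:0.5 | a b c
2:1:0.3 | b c
EOF

# Learn a policy for a contextual bandit problem with 4 actions.
vw --cb 4 cb_train.txt
```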

  • Input Format. The input format for the learning algorithm is substantially more flexible than might be expected. Examples can have features consisting of free-form text, which is interpreted in a bag-of-words way. There can even be multiple sets of free-form text in different namespaces (see the sketch after this list).
  • Speed. The learning algorithm is fast -- comparable to the few other online algorithm implementations available. Several optimization algorithms are available, with the baseline being sparse gradient descent (GD) on a loss function.
  • Scalability. This is not the same as speed. The key property is that the memory footprint of the program is bounded independent of the data: the training set is not loaded into main memory before learning starts, and the size of the feature set is bounded independent of the amount of training data via the hashing trick.
  • Feature Interaction. Subsets of features can be paired internally so that the algorithm is linear in the cross-product of the subsets. This is useful for ranking problems. The alternative of explicitly expanding the features before feeding them into the learning algorithm can be both computation- and space-intensive, depending on how it is handled. The sketch after this list shows internal pairing via the -q option.
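
The input format, the hashing trick, and internal feature pairing can be made concrete with a short command-line sketch. This is a minimal illustration, not taken from this README: the file, namespace, and feature names are hypothetical, while the flags (-b for hash-table bits, -q for quadratic namespace interactions, -f/-i for saving/loading a model) are standard vw options.

```shell
# Two training examples in VW's native format: a label, then one or more
# '|namespace' blocks of features (bare text tokens or name:value pairs;
# a bare token gets the value 1).
cat > train.txt <<'EOF'
1 |user age:25 region_us |doc sports baseball
-1 |user age:63 region_uk |doc politics election
EOF

# Train with 2^24 hashed feature slots (-b 24, the hashing trick) and
# quadratic interactions between the namespaces starting with 'u' and 'd'
# (-q ud); memory stays bounded no matter how large train.txt grows.
vw train.txt -b 24 -q ud -f model.vw

# Score the data with the saved model in test-only mode.
vw -t -i model.vw train.txt -p predictions.txt
```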

Visit the wiki to learn more.

Getting Started

For the most up-to-date instructions for getting started on Windows, macOS, or Linux, please see the wiki. This includes: