Add fairseq to PyPI (facebookresearch#495)
Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.
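A minimal sketch of the new workflow (the dataset path and arguments below are only illustrative):

```
pip install fairseq
fairseq-preprocess --help   # console entry points are now on PATH
fairseq-train data-bin/my-dataset --arch fconv_iwslt_de_en   # illustrative arguments
```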
Pull Request resolved: facebookresearch#495

Differential Revision: D14017761

Pulled By: myleott

fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
myleott authored and facebook-github-bot committed Feb 9, 2019
1 parent cea0e4b commit fbd4cef
Showing 30 changed files with 143 additions and 136 deletions.
14 changes: 11 additions & 3 deletions README.md
@@ -45,10 +45,18 @@ Please follow the instructions here: https://github.com/pytorch/pytorch#installa
If you use Docker, make sure to increase the shared memory size either with
`--ipc=host` or `--shm-size` as command line options to `nvidia-docker run`.

After PyTorch is installed, you can install fairseq with:
After PyTorch is installed, you can install fairseq with `pip`:
```
pip install -r requirements.txt
python setup.py build develop
pip install fairseq
```

**Installing from source**

To install fairseq from source and develop locally:
```
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable .
```
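Either installation route should place the `fairseq-*` console scripts on your `PATH`. A quick, informal check (not part of the official instructions):

```
pip show fairseq        # confirms the package is installed and prints its version
fairseq-train --help    # should list the training options
```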

# Getting Started
66 changes: 33 additions & 33 deletions docs/command_line_tools.rst
@@ -5,81 +5,81 @@ Command-line Tools

Fairseq provides several command-line tools for training and evaluating models:

- :ref:`preprocess.py`: Data pre-processing: build vocabularies and binarize training data
- :ref:`train.py`: Train a new model on one or multiple GPUs
- :ref:`generate.py`: Translate pre-processed data with a trained model
- :ref:`interactive.py`: Translate raw text with a trained model
- :ref:`score.py`: BLEU scoring of generated translations against reference translations
- :ref:`eval_lm.py`: Language model evaluation
- :ref:`fairseq-preprocess`: Data pre-processing: build vocabularies and binarize training data
- :ref:`fairseq-train`: Train a new model on one or multiple GPUs
- :ref:`fairseq-generate`: Translate pre-processed data with a trained model
- :ref:`fairseq-interactive`: Translate raw text with a trained model
- :ref:`fairseq-score`: BLEU scoring of generated translations against reference translations
- :ref:`fairseq-eval-lm`: Language model evaluation


.. _preprocess.py:
.. _fairseq-preprocess:

preprocess.py
~~~~~~~~~~~~~
fairseq-preprocess
~~~~~~~~~~~~~~~~~~
.. automodule:: preprocess

.. argparse::
:module: preprocess
:func: get_parser
:prog: preprocess.py
:module: fairseq.options
:func: get_preprocessing_parser
:prog: fairseq-preprocess


.. _train.py:
.. _fairseq-train:

train.py
~~~~~~~~
fairseq-train
~~~~~~~~~~~~~
.. automodule:: train

.. argparse::
:module: fairseq.options
:func: get_training_parser
:prog: train.py
:prog: fairseq-train


.. _generate.py:
.. _fairseq-generate:

generate.py
~~~~~~~~~~~
fairseq-generate
~~~~~~~~~~~~~~~~
.. automodule:: generate

.. argparse::
:module: fairseq.options
:func: get_generation_parser
:prog: generate.py
:prog: fairseq-generate


.. _interactive.py:
.. _fairseq-interactive:

interactive.py
~~~~~~~~~~~~~~
fairseq-interactive
~~~~~~~~~~~~~~~~~~~
.. automodule:: interactive

.. argparse::
:module: fairseq.options
:func: get_interactive_generation_parser
:prog: interactive.py
:prog: fairseq-interactive


.. _score.py:
.. _fairseq-score:

score.py
~~~~~~~~
fairseq-score
~~~~~~~~~~~~~
.. automodule:: score

.. argparse::
:module: score
:module: fairseq_cli.score
:func: get_parser
:prog: score.py
:prog: fairseq-score


.. _eval_lm.py:
.. _fairseq-eval-lm:

eval_lm.py
~~~~~~~~~~
fairseq-eval-lm
~~~~~~~~~~~~~~~
.. automodule:: eval_lm

.. argparse::
:module: fairseq.options
:func: get_eval_lm_parser
:prog: eval_lm.py
:prog: fairseq-eval-lm
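The net effect of the renames above: each former top-level script now appears as an installed console command whose options are documented from parsers in `fairseq.options` (and, for scoring, `fairseq_cli.score`). A before/after sketch of the same preprocessing call, with flags borrowed from the translation example later in this commit:

```
# before: run the script from a fairseq checkout
python preprocess.py --source-lang de --target-lang en ...
# after: call the installed console script from any directory
fairseq-preprocess --source-lang de --target-lang en ...
```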
4 changes: 2 additions & 2 deletions docs/conf.py
@@ -60,9 +60,9 @@
# built documents.
#
# The short X.Y version.
version = '0.6.0'
version = '0.6.1'
# The full version, including alpha/beta/rc tags.
release = '0.6.0'
release = '0.6.1'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
2 changes: 0 additions & 2 deletions docs/data.rst
@@ -46,8 +46,6 @@ Dictionary
Iterators
---------

.. autoclass:: fairseq.data.BufferedIterator
:members:
.. autoclass:: fairseq.data.CountingIterator
:members:
.. autoclass:: fairseq.data.EpochBatchIterator
26 changes: 13 additions & 13 deletions docs/getting_started.rst
@@ -15,17 +15,17 @@ done with the
script using the ``wmt14.en-fr.fconv-cuda/bpecodes`` file. ``@@`` is
used as a continuation marker and the original text can be easily
recovered with e.g. ``sed s/@@ //g`` or by passing the ``--remove-bpe``
flag to :ref:`generate.py`. Prior to BPE, input text needs to be tokenized
flag to :ref:`fairseq-generate`. Prior to BPE, input text needs to be tokenized
using ``tokenizer.perl`` from
`mosesdecoder <https://github.com/moses-smt/mosesdecoder>`__.
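As a concrete illustration of the continuation marker (the example sentence is made up):

```
echo "newco@@ mer arrives" | sed "s/@@ //g"
# -> newcomer arrives
```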

Let's use :ref:`interactive.py` to generate translations
Let's use :ref:`fairseq-interactive` to generate translations
interactively. Here, we use a beam size of 5:

.. code-block:: console
> MODEL_DIR=wmt14.en-fr.fconv-py
> python interactive.py \
> fairseq-interactive \
--path $MODEL_DIR/model.pt $MODEL_DIR \
--beam 5 --source-lang en --target-lang fr
| loading model(s) from wmt14.en-fr.fconv-py/model.pt
@@ -66,7 +66,7 @@ datasets: IWSLT 2014 (German-English), WMT 2014 (English-French) and WMT
> bash prepare-iwslt14.sh
> cd ../..
> TEXT=examples/translation/iwslt14.tokenized.de-en
> python preprocess.py --source-lang de --target-lang en \
> fairseq-preprocess --source-lang de --target-lang en \
--trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
--destdir data-bin/iwslt14.tokenized.de-en
@@ -76,17 +76,17 @@ This will write binarized data that can be used for model training to
Training
--------

Use :ref:`train.py` to train a new model. Here are a few example settings that work
Use :ref:`fairseq-train` to train a new model. Here are a few example settings that work
well for the IWSLT 2014 dataset:

.. code-block:: console
> mkdir -p checkpoints/fconv
> CUDA_VISIBLE_DEVICES=0 python train.py data-bin/iwslt14.tokenized.de-en \
> CUDA_VISIBLE_DEVICES=0 fairseq-train data-bin/iwslt14.tokenized.de-en \
--lr 0.25 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 \
--arch fconv_iwslt_de_en --save-dir checkpoints/fconv
By default, :ref:`train.py` will use all available GPUs on your machine. Use the
By default, :ref:`fairseq-train` will use all available GPUs on your machine. Use the
``CUDA_VISIBLE_DEVICES`` environment variable to select specific GPUs and/or to
change the number of GPU devices that will be used.
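For instance, to train on two specific GPUs, reusing the IWSLT settings from above:

```
CUDA_VISIBLE_DEVICES=0,1 fairseq-train data-bin/iwslt14.tokenized.de-en \
    --lr 0.25 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 \
    --arch fconv_iwslt_de_en --save-dir checkpoints/fconv
```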

@@ -98,12 +98,12 @@ Generation
----------

Once your model is trained, you can generate translations using
:ref:`generate.py` **(for binarized data)** or
:ref:`interactive.py` **(for raw text)**:
:ref:`fairseq-generate` **(for binarized data)** or
:ref:`fairseq-interactive` **(for raw text)**:

.. code-block:: console
> python generate.py data-bin/iwslt14.tokenized.de-en \
> fairseq-generate data-bin/iwslt14.tokenized.de-en \
--path checkpoints/fconv/checkpoint_best.pt \
--batch-size 128 --beam 5
| [de] dictionary: 35475 types
@@ -136,7 +136,7 @@ to training on 8 GPUs:

.. code-block:: console
> CUDA_VISIBLE_DEVICES=0 python train.py --update-freq 8 (...)
> CUDA_VISIBLE_DEVICES=0 fairseq-train --update-freq 8 (...)
Training with half precision floating point (FP16)
--------------------------------------------------
@@ -152,7 +152,7 @@ Fairseq supports FP16 training with the ``--fp16`` flag:

.. code-block:: console
> python train.py --fp16 (...)
> fairseq-train --fp16 (...)
Lazily loading large training datasets
--------------------------------------
@@ -178,7 +178,7 @@ replacing ``node_rank=0`` with ``node_rank=1`` on the second node:
> python -m torch.distributed.launch --nproc_per_node=8 \
--nnodes=2 --node_rank=0 --master_addr="192.168.1.1" \
--master_port=1234 \
train.py data-bin/wmt16_en_de_bpe32k \
$(which fairseq-train) data-bin/wmt16_en_de_bpe32k \
--arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
--optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
--lr-scheduler inverse_sqrt --warmup-init-lr 1e-07 --warmup-updates 4000 \
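One note on the distributed example above: `torch.distributed.launch` spawns each worker as `python <script> ...`, so it expects a path to a Python file rather than a console-command name, which is why the snippet resolves the installed entry point with `$(which fairseq-train)`. A trimmed sketch of the pattern:

```
python -m torch.distributed.launch --nproc_per_node=8 \
    $(which fairseq-train) data-bin/wmt16_en_de_bpe32k (...)
```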
2 changes: 1 addition & 1 deletion docs/lr_scheduler.rst
@@ -29,6 +29,6 @@ epoch boundaries via :func:`step`.
.. autoclass:: fairseq.optim.lr_scheduler.reduce_lr_on_plateau.ReduceLROnPlateau
:members:
:undoc-members:
.. autoclass:: fairseq.optim.lr_scheduler.reduce_angular_lr_scheduler.TriangularSchedule
.. autoclass:: fairseq.optim.lr_scheduler.triangular_lr_scheduler.TriangularSchedule
:members:
:undoc-members:
9 changes: 6 additions & 3 deletions docs/overview.rst
@@ -49,7 +49,10 @@ new plug-ins.

**Loading plug-ins from another directory**

New plug-ins can be defined in a custom module stored on the user's system. In order to import the module and make the plug-in available to *fairseq*, the command line supports the ``--user-dir`` flag, which specifies a custom location for additional modules to load into *fairseq*.
New plug-ins can be defined in a custom module stored on the user's system. In
order to import the module and make the plug-in available to *fairseq*, the
command line supports the ``--user-dir`` flag, which specifies a custom
location for additional modules to load into *fairseq*.

For example, assuming this directory tree::

@@ -65,6 +68,6 @@ with ``__init__.py``::
def transformer_mmt_big(args):
transformer_vaswani_wmt_en_de_big(args)

it is possible to invoke the ``train.py`` script with the new architecture with::
it is possible to invoke the :ref:`fairseq-train` script with the new architecture with::

python3 train.py ... --user-dir /home/user/my-module -a my_transformer --task translation
fairseq-train ... --user-dir /home/user/my-module -a my_transformer --task translation
10 changes: 5 additions & 5 deletions docs/tutorial_classifying_names.rst
@@ -28,7 +28,7 @@ train, valid and test sets.
Download and extract the data from here:
`tutorial_names.tar.gz <https://dl.fbaipublicfiles.com/fairseq/data/tutorial_names.tar.gz>`_

Once extracted, let's preprocess the data using the :ref:`preprocess.py`
Once extracted, let's preprocess the data using the :ref:`fairseq-preprocess`
command-line tool to create the dictionaries. While this tool is primarily
intended for sequence-to-sequence problems, we're able to reuse it here by
treating the label as a "target" sequence of length 1. We'll also output the
@@ -37,7 +37,7 @@ enhance readability:

.. code-block:: console
> python preprocess.py \
> fairseq-preprocess \
--trainpref names/train --validpref names/valid --testpref names/test \
--source-lang input --target-lang label \
--destdir names-bin --output-format raw
@@ -324,19 +324,19 @@ following contents::
4. Training the Model
---------------------

Now we're ready to train the model. We can use the existing :ref:`train.py`
Now we're ready to train the model. We can use the existing :ref:`fairseq-train`
command-line tool for this, making sure to specify our new Task (``--task
simple_classification``) and Model architecture (``--arch
pytorch_tutorial_rnn``):

.. note::

You can also configure the dimensionality of the hidden state by passing the
``--hidden-dim`` argument to :ref:`train.py`.
``--hidden-dim`` argument to :ref:`fairseq-train`.

.. code-block:: console
> python train.py names-bin \
> fairseq-train names-bin \
--task simple_classification \
--arch pytorch_tutorial_rnn \
--optimizer adam --lr 0.001 --lr-shrink 0.5 \
12 changes: 6 additions & 6 deletions docs/tutorial_simple_lstm.rst
@@ -341,7 +341,7 @@ function decorator. Thereafter this named architecture can be used with the
3. Training the Model
---------------------

Now we're ready to train the model. We can use the existing :ref:`train.py`
Now we're ready to train the model. We can use the existing :ref:`fairseq-train`
command-line tool for this, making sure to specify our new Model architecture
(``--arch tutorial_simple_lstm``).

@@ -352,7 +352,7 @@

.. code-block:: console
> python train.py data-bin/iwslt14.tokenized.de-en \
> fairseq-train data-bin/iwslt14.tokenized.de-en \
--arch tutorial_simple_lstm \
--encoder-dropout 0.2 --decoder-dropout 0.2 \
--optimizer adam --lr 0.005 --lr-shrink 0.5 \
@@ -362,12 +362,12 @@
| epoch 052 | valid on 'valid' subset | valid_loss 4.74989 | valid_ppl 26.91 | num_updates 20852 | best 4.74954
The model files should appear in the :file:`checkpoints/` directory. While this
model architecture is not very good, we can use the :ref:`generate.py` script to
model architecture is not very good, we can use the :ref:`fairseq-generate` script to
generate translations and compute our BLEU score over the test set:

.. code-block:: console
> python generate.py data-bin/iwslt14.tokenized.de-en \
> fairseq-generate data-bin/iwslt14.tokenized.de-en \
--path checkpoints/checkpoint_best.pt \
--beam 5 \
--remove-bpe
@@ -498,7 +498,7 @@ Finally, we can rerun generation and observe the speedup:
# Before
> python generate.py data-bin/iwslt14.tokenized.de-en \
> fairseq-generate data-bin/iwslt14.tokenized.de-en \
--path checkpoints/checkpoint_best.pt \
--beam 5 \
--remove-bpe
@@ -508,7 +508,7 @@
# After
> python generate.py data-bin/iwslt14.tokenized.de-en \
> fairseq-generate data-bin/iwslt14.tokenized.de-en \
--path checkpoints/checkpoint_best.pt \
--beam 5 \
--remove-bpe
Expand Down
6 changes: 3 additions & 3 deletions examples/language_model/README.md
@@ -24,20 +24,20 @@ $ cd ../..
# Binarize the dataset:
$ TEXT=examples/language_model/wikitext-103
$ python preprocess.py --only-source \
$ fairseq-preprocess --only-source \
--trainpref $TEXT/wiki.train.tokens --validpref $TEXT/wiki.valid.tokens --testpref $TEXT/wiki.test.tokens \
--destdir data-bin/wikitext-103
# Train the model:
# If it runs out of memory, try to reduce max-tokens and max-target-positions
$ mkdir -p checkpoints/wikitext-103
$ python train.py --task language_modeling data-bin/wikitext-103 \
$ fairseq-train --task language_modeling data-bin/wikitext-103 \
--max-epoch 35 --arch fconv_lm_dauphin_wikitext103 --optimizer nag \
--lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 \
--clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion adaptive_loss \
--adaptive-softmax-cutoff 10000,20000,200000 --max-tokens 1024 --tokens-per-sample 1024
# Evaluate:
$ python eval_lm.py data-bin/wikitext-103 --path 'checkpoints/wiki103/checkpoint_best.pt'
$ fairseq-eval-lm data-bin/wikitext-103 --path 'checkpoints/wiki103/checkpoint_best.pt'
```
