# LSTM Benchmarking

## Setup environment

Make sure you're on a machine with CUDA available, then install pytorch and torchvision in the following order:

```shell
# Install torchvision. It comes with the pytorch stable release binary.
conda install pytorch torchvision -c pytorch

# Install the latest pytorch master from source.
# It should supersede the installation from the release binary.
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"
```

Test the fastrnns benchmarking scripts with the following: `python -m fastrnns.test --rnns jit`

For most stable results, do the following:

- Set the CPU governor to performance mode (as opposed to powersave)
- Turn off turbo for all CPUs (assuming Intel CPUs)
- Shield CPUs via `cset shield` when running benchmarks
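On a typical Linux machine, these steps can be sketched as follows. This is illustrative, not part of fastrnns: the CPU range is arbitrary, `cpupower` and `cset` (cpuset) must be installed, and the `no_turbo` knob assumes the intel_pstate driver.

```shell
# Set the CPU governor to performance mode on all cores.
sudo cpupower frequency-set -g performance

# Turn off turbo (intel_pstate driver; writing 1 disables turbo).
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo

# Shield CPUs 1-3 for benchmarking (-k on moves kernel threads off them too),
# then run the benchmark inside the shield.
sudo cset shield -c 1-3 -k on
sudo cset shield --exec python -- -m fastrnns.bench --rnns cudnn aten jit
```

Remember to undo these settings afterwards (`cset shield --reset`, restore the previous governor, and re-enable turbo).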

## Run benchmarks

`python -m fastrnns.bench --rnns cudnn aten jit` should give a good comparison.
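The core pattern behind benchmarks like these is a set of warmup iterations (excluded from timing, so caches and JIT compilation settle first) followed by timed iterations. A minimal, framework-free sketch of that pattern, where the `benchmark` helper and the toy workload are illustrative rather than part of fastrnns:

```python
import timeit

def benchmark(fn, warmup=10, iters=100):
    """Time fn, excluding warmup iterations; returns seconds per call."""
    for _ in range(warmup):  # warm up caches / JIT before timing
        fn()
    total = timeit.timeit(fn, number=iters)
    return total / iters

# Example: time a toy workload standing in for one RNN forward pass.
per_call = benchmark(lambda: sum(range(1000)))
print(f"{per_call * 1e6:.2f} us per call")
```

For CUDA workloads the same structure applies, but timing must account for asynchronous kernel launches (e.g. by synchronizing the device, or using CUDA events, before reading the clock).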

## Run nvprof

`python -m fastrnns.profile --rnns aten jit` should output an nvprof file somewhere.
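If you want to control where the profile lands, `nvprof` can also wrap the script directly; the output filename here is illustrative:

```shell
# Export a profile that can be opened in the NVIDIA Visual Profiler (nvvp).
nvprof --export-profile fastrnns.nvprof python -m fastrnns.profile --rnns aten jit
```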

## OLD: RNN benchmarks

To run all the benchmarks and get a summary view, use python

To run a specific benchmark, run it as a python script: python benchmarks/ or python benchmarks/ These come with a lot of command-line options for fine-tuning.


Use Linux for the most accurate timing. Many of these tests only run on CUDA.
