Statistical arbitrage simulation, modeling and backtesting with Python.
Latest commit 29b8e41 Jul 28, 2016 @harpone committed on GitHub Update README


UPDATE 2016: don't use this, it's crap :)
Hi! This is a model-dependent equity statistical arbitrage backtest module for Python. Roughly speaking, the input is a universe of N stock prices over a selected time period, and the output is a mean-reverting portfolio which can be used for trading.
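
To make the input/output idea concrete, here is a minimal sketch of going from N price series to a mean-reverting portfolio. It is not the model used in this module: it assumes simulated log prices driven by one common random-walk factor, and it builds the portfolio by plain least-squares regression of one stock on the others. All function names and parameters (`simulate_prices`, `mean_reverting_weights`, `ar1_coefficient`) are made up for illustration.

```python
import numpy as np


def simulate_prices(n_stocks=5, n_days=1000, seed=0):
    """Simulate log prices: one shared random-walk factor plus AR(1) noise per stock."""
    rng = np.random.default_rng(seed)
    factor = np.cumsum(rng.normal(0.0, 0.01, n_days))   # common random walk
    betas = rng.uniform(0.5, 1.5, n_stocks)             # factor loadings
    shocks = rng.normal(0.0, 0.005, (n_days, n_stocks))
    noise = np.zeros((n_days, n_stocks))
    for t in range(1, n_days):
        noise[t] = 0.9 * noise[t - 1] + shocks[t]       # stationary AR(1) part
    return factor[:, None] * betas + noise              # shape (n_days, n_stocks)


def mean_reverting_weights(log_prices, target=0):
    """Regress the target stock on the others (OLS with intercept).

    Returns portfolio weights (+1 on the target, minus the regression
    coefficients on the others) and the fitted intercept.
    """
    y = log_prices[:, target]
    others = np.delete(log_prices, target, axis=1)
    X = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    weights = np.insert(-coef[1:], target, 1.0)
    return weights, coef[0]


def ar1_coefficient(x):
    """Lag-1 autoregression coefficient of a demeaned series (near 1 = random walk)."""
    x = x - x.mean()
    return float(x[1:] @ x[:-1] / (x[:-1] @ x[:-1]))


log_prices = simulate_prices()
weights, intercept = mean_reverting_weights(log_prices)
spread = log_prices @ weights - intercept   # the candidate mean-reverting portfolio

print("weights:", np.round(weights, 3))
print("AR(1) of a raw log price:", round(ar1_coefficient(log_prices[:, 0]), 3))
print("AR(1) of the spread:     ", round(ar1_coefficient(spread), 3))
```

With real data the regression is fitted in-sample, so the spread will always look better behaved than it really is; that is exactly the kind of thing a backtest has to be careful about.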

Please see a more complete introduction in the IPython Notebook file "PyArb Intro.ipynb". If you don't have IPython installed and/or want to just see the results, you can instead view the corresponding HTML version "PyArb Intro.html". Please note that just clicking the file in GitHub opens the raw version, so instead click the following link to see the actual notebook:

Since this is only a backtest module, I've decided to do a "walk-forward" with the optimized parameters from this backtest. In practice this is just another backtest, but with the rule that the parameters cannot be changed, which makes sure there's no data snooping. Feel free to check out the progress at my homepage at

UPDATE: Walk forward cancelled: the code is a bit broken and the backtest results should not be trusted (as if they ever could)... and needs more work anyway... 
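
For reference, the walk-forward idea itself can be sketched roughly like this. This is only an illustration under my own assumptions, not code from this module: the spread is a simulated AR(1) series, the trading rule is a toy z-score threshold rule, and the names `strategy_pnl` and `walk_forward` are hypothetical. The point is that the threshold is picked using the in-sample part only and then frozen for the out-of-sample part.

```python
import numpy as np


def strategy_pnl(spread, entry_z, mean, std):
    """PnL of a toy rule: short the spread when z > entry_z, long when z < -entry_z."""
    z = (spread - mean) / std
    position = np.where(z > entry_z, -1.0, np.where(z < -entry_z, 1.0, 0.0))
    # The position chosen on day t earns the spread change from t to t+1.
    return float(np.sum(position[:-1] * np.diff(spread)))


def walk_forward(spread, split, candidates=(0.5, 1.0, 1.5, 2.0)):
    """Optimize the entry threshold in-sample, then evaluate it frozen out-of-sample."""
    in_sample, out_sample = spread[:split], spread[split:]
    # Mean and std are also estimated in-sample only, so nothing leaks from the future.
    mean, std = in_sample.mean(), in_sample.std()
    best_z = max(candidates, key=lambda z: strategy_pnl(in_sample, z, mean, std))
    oos_pnl = strategy_pnl(out_sample, best_z, mean, std)
    return best_z, oos_pnl


# Simulated mean-reverting spread (AR(1)), standing in for a real portfolio value.
rng = np.random.default_rng(1)
spread = np.zeros(2000)
for t in range(1, len(spread)):
    spread[t] = 0.95 * spread[t - 1] + rng.normal(0.0, 0.01)

best_z, oos_pnl = walk_forward(spread, split=1000)
print("frozen entry threshold:", best_z)
print("out-of-sample PnL:", round(oos_pnl, 4))
```

Because `best_z` depends only on `spread[:split]`, changing the out-of-sample data cannot change the chosen parameter, which is the whole "no data snooping" rule.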

If you have any questions, drop me an email at

NOTE: Unfortunately I can't include the data here because 1) the files are way too big and 2) I don't think I'm allowed to (I guess it's in the TOS/EULA somewhere...). So you have to get your own data. I got some free data at, but their data seems to be pretty dirty. I also got some paid data from (data format is "3F VIP Trading"), which seems to be of higher quality than TBG's. If you want the same data I was using, send me an email and maybe I can send it to you, e.g. via Dropbox or Google Drive.