Improvement: Change Decimal to simple float #20

Closed
ryankennedyio opened this issue May 17, 2016 · 25 comments

Comments

@ryankennedyio
Contributor

Speed has been frustrating me a little while backtesting, and I had a hunch it was due to the Decimal calculations being used for accuracy's sake. A little research brought up this discussion from the quantopian repo.

I have made a small start here, tests are currently broken though.
https://github.com/ryankennedyio/qstrader/tree/decimal-to-float

Some preliminary testing on my MacBook Air (1.7GHz dual-core i7, 8GB RAM) shows:

SP500 backtest at 7c88d26:
15.6sec
15.2sec
15.8sec

SP500 backtest at 2373c59 (from my branch):
2.88sec
3.03sec
2.90sec

Decimal usage seems to lead to a 5x longer backtest (rule of thumb). This might take even longer for strategies that have lots of price calculations.

I understand the motivation for using Decimal, so I will also update here with the value of the difference between the unit test results when using Decimal and float, to see how major the calculation differences end up being.
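
For anyone who wants to reproduce this kind of comparison in isolation, here is a minimal sketch using only the standard library. It is not the QSTrader backtest; the function names, the price and the loop body are made up for illustration, and the ratio will differ between machines and Python versions.

```python
import timeit
from decimal import Decimal


def sum_floats(n=10000):
    """Accumulate n price calculations using plain floats."""
    total = 0.0
    price = 202.50
    for _ in range(n):
        total += price * 1.0001
    return total


def sum_decimals(n=10000):
    """Accumulate the same n price calculations using Decimal."""
    total = Decimal("0")
    price = Decimal("202.50")
    factor = Decimal("1.0001")
    for _ in range(n):
        total += price * factor
    return total


def time_both(repeats=5):
    """Return the best-of-repeats wall time for each variant, in seconds."""
    float_time = min(timeit.repeat(sum_floats, number=1, repeat=repeats))
    decimal_time = min(timeit.repeat(sum_decimals, number=1, repeat=repeats))
    return float_time, decimal_time


if __name__ == "__main__":
    f, d = time_both()
    print("float: %.4fs  Decimal: %.4fs  ratio: %.1fx" % (f, d, d / f))
```

Taking the minimum of several repeats reduces noise from other processes, which matters when the runs are only a few seconds long.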

@hgeorgako

This looks interesting: https://github.com/jrmuizel/pyunum

@mhallsmoore
Owner

Ryan, thanks a lot for carrying out the speed profiling. Very useful!

This is actually quite a tricky issue and we'll need to tread carefully here. The motivation for using Decimal values was to ensure 100% accuracy of the pricing calculations. While this may seem rather onerous for the majority of backtesting purposes, it is an absolute necessity for regulatory/audit purposes in the institutional world.

As an anecdote, in the original fund I worked for, after around 8-10 months, we had a difference of $0.12 in our Net Asset Value calculations when using floating point compared to Decimal. This may seem like small change, but for audit purposes it is essentially a "lost" 12 cents and had to be accounted for.
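
A tiny illustration of how this kind of drift arises (not the NAV calculation from the anecdote, just the standard textbook case): adding 0.1 ten times in binary floating point does not give exactly 1.0, while Decimal does. The helper names are hypothetical.

```python
from decimal import Decimal


def accumulate_float(amount, times):
    """Add a float amount repeatedly, as a naive ledger would."""
    total = 0.0
    for _ in range(times):
        total += amount
    return total


def accumulate_decimal(amount, times):
    """Add the same amount repeatedly, parsed from a string as Decimal."""
    total = Decimal("0")
    step = Decimal(amount)
    for _ in range(times):
        total += step
    return total


float_total = accumulate_float(0.1, 10)
decimal_total = accumulate_decimal("0.1", 10)

print(float_total == 1.0)               # False
print(decimal_total == Decimal("1.0"))  # True
```

Each individual error is tiny, but over months of daily calculations the errors can add up to a visible amount.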

I definitely don't want to penalise speed for 95% of QSTrader users just to ensure institutional compatibility, but I would love to find a way where we can offer both options, perhaps as a configuration setting.

However, I don't see there being an easy way to do this without a lot of code branching or class duplication.

Thoughts?

@ryankennedyio
Contributor Author

ryankennedyio commented May 25, 2016

@mhallsmoore I can imagine the pain! Completely agree, so I'm very wary of making any decisions on this.

One method I've come across in C/C++ for a ticker-plant that seems crude is to multiply out the decimal points so we end up with int or long types. This might be worth exploring? $202.50 becomes 20250, or add some more 0's depending on the required precision.

Just have to define how much precision we want. Would just have to be careful to ensure that this is a blanket rule applied across the entire system; with the only exception being for printing values to screen.
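
A rough sketch of that idea, assuming two decimal places (whole cents) and hypothetical helper names `to_fixed` and `to_display`. Parsing goes through Decimal so that the string-to-integer step itself does not pick up float error.

```python
from decimal import Decimal

PRICE_PLACES = 2  # assumed precision: whole cents


def to_fixed(price_str, places=PRICE_PLACES):
    """Parse a decimal string into an integer count of 10**-places units."""
    scaled = Decimal(price_str).scaleb(places)
    if scaled != scaled.to_integral_value():
        raise ValueError(
            "%s needs more than %d decimal places" % (price_str, places)
        )
    return int(scaled)


def to_display(value, places=PRICE_PLACES):
    """Format a fixed-point integer back into a decimal string for printing."""
    sign = "-" if value < 0 else ""
    whole, frac = divmod(abs(value), 10 ** places)
    return "%s%d.%0*d" % (sign, whole, places, frac)


print(to_fixed("202.50"))   # 20250
print(to_display(20250))    # 202.50
```

All arithmetic inside the system would then happen on the integers, and only `to_display` would be called when printing to screen.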

I think in the end a compromise will have to be made one way or the other -- catering to both will add a scary amount of complexity.

It's not too difficult to change this to one thing or another, so I can't see any reason to do this "right now" -- best to be sure of it first.

@canada4663

Interesting discussion! Good chance I'm naive here... but why not wrap the Decimal function currently used in the code, so that the wrapper checks a config setting and, if simple floats are OK, returns a simple float instead of the rounded Decimal result?

I haven't spent much time inspecting the code for if this approach would work... But thought I would throw it out!
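
Something along these lines, as a minimal sketch. `USE_FLOAT` and `make_number_factory` are hypothetical names, not existing QSTrader settings.

```python
from decimal import Decimal

USE_FLOAT = True  # hypothetical config flag


def make_number_factory(use_float):
    """Return a callable that builds either floats or Decimals."""
    if use_float:
        return float
    # str() first so that a float input is not expanded to its full
    # binary representation by the Decimal constructor.
    return lambda value: Decimal(str(value))


to_number = make_number_factory(USE_FLOAT)
price = to_number("202.50")
```

One caveat: Python raises a TypeError when a Decimal is added to or multiplied by a float, so every literal in the calculation code would have to go through the same factory for this to work.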

Ryan

@polsinello

polsinello commented Jun 7, 2016

After a little searching, it seems that gmplib (https://github.com/aleaxit/gmpy) or cDecimal (https://pypi.python.org/pypi/cdecimal/) might do the trick. It implies adding more dependencies to the project, though that seems worth it given @ryankennedyio's results ...

@ryankennedyio
Contributor Author

For what it's worth, here is some further discussion on how exchanges disseminate data; it seems integer or long is the way to go:

http://www.elitetrader.com/et/index.php?threads/ticks-last-traded-price-float-or-double.262403/page-3

FWIW I'm taking this approach with Cassandra when storing my data now.

I really believe the best result will be had by using 10,000 as a constant price multiplier, and working with values as fixed-point data types with accuracy to 1/100th of a cent.

Whenever displaying numbers out to screen, just divide them through by the price multiplier.

Thoughts @mhallsmoore ?
When running a backtest over ~1000 instruments for a few years it does take pretty long. I'm keen to get that down to as little as possible. We can probably explore parallelisation in order to make use of more than one core too, but at the cost of code complexity. Not fussed about that at the moment as I think there are easier performance bottlenecks to work out.

@mhallsmoore
Owner

I think 10,000 is the way to go (for the price multiplier) as this will easily allow forex positions when we eventually add them down the line.

This will be quite a bit of work, but thankfully I have prior unit tests in place that will enforce the correct prices. I just need to modify them from Decimal to Integer.

@hgeorgako

ZF price looks like: 121.2890625. 10,000 is not enough.

@ryankennedyio
Contributor Author

10,000,000 ?

Integers in Python 3.x are arbitrary precision (the old long type), so I guess technically that's OK. It's just an issue of readability at some point, though that would force the user to apply the multiplier in order to read a value, which sort of enforces best practice.
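
A quick way to check whether a given multiplier is enough for a given price, using the ZF price quoted above. `is_lossless` is a hypothetical helper name.

```python
from decimal import Decimal


def is_lossless(price_str, multiplier):
    """True if price * multiplier is a whole number, i.e. nothing is cut off."""
    scaled = Decimal(price_str) * multiplier
    return scaled == scaled.to_integral_value()


zf_price = "121.2890625"
print(is_lossless(zf_price, 10**4))  # False: 1212890.625 still has a fraction
print(is_lossless(zf_price, 10**7))  # True:  1212890625
```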

@hgeorgako

Yes. That should do it.

@femtotrader
Contributor

femtotrader commented Jun 19, 2016

A clean approach may be to use multiple dispatch https://en.wikipedia.org/wiki/Multiple_dispatch

https://github.com/mrocklin/multipledispatch/
https://bitbucket.org/coady/multimethod

it's a key concept of Julia but I haven't used it with Python
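
To give a feel for the idea without extra dependencies, here is a sketch using `functools.singledispatch` from the standard library. Note that this is single dispatch (it selects on the type of the first argument only), so it is a stand-in for the multipledispatch/multimethod libraries linked above rather than the same thing. The `midpoint` function is just an illustration.

```python
from decimal import Decimal
from functools import singledispatch


@singledispatch
def midpoint(a, b):
    """Fallback for price types that have no registered implementation."""
    raise TypeError("unsupported price type: %r" % type(a))


@midpoint.register(float)
def _midpoint_float(a, b):
    return (a + b) / 2.0


@midpoint.register(Decimal)
def _midpoint_decimal(a, b):
    return (a + b) / Decimal("2")


@midpoint.register(int)
def _midpoint_int(a, b):
    # Fixed-point integers: floor division keeps the result an integer.
    return (a + b) // 2


print(midpoint(1.0, 2.0))                        # 1.5
print(midpoint(Decimal("1.0"), Decimal("2.0")))  # 1.5
print(midpoint(10, 21))                          # 15
```

The calling code stays the same whichever numeric type flows through the system; only the registered implementations differ.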

This code could also help:

To parse string

from decimal import Decimal

TWOPLACES = Decimal("0.01")
FIVEPLACES = Decimal("0.00001")

class Parser(object):
    pass

class FloatParser(Parser):

    def price(self, s):
        return float(s)

    def volume(self, s):
        return float(s)

    def amount(self, s):
        return float(s)

    def midpoint(self, a, b):
        return (a + b) / 2.0

class DecimalParser(Parser):

    def price(self, s):
        return Decimal(s).quantize(FIVEPLACES)

    def volume(self, s):
        return Decimal(s)

    def amount(self, s):
        return Decimal(s).quantize(TWOPLACES)

    def midpoint(self, a, b):
        return (a + b) / Decimal("2.0")

class IntegerParser(Parser):

    PRICE_MULT = 10**5
    VOLUME_MULT = 10**2
    AMOUNT_MULT = 10**2

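    def price(self, s):
        # round() before int(): float(s) * MULT can land just below the
        # exact value, and int() alone would truncate it
        return int(round(float(s) * self.PRICE_MULT))

    def volume(self, s):
        return int(round(float(s) * self.VOLUME_MULT))

    def amount(self, s):
        return int(round(float(s) * self.AMOUNT_MULT))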

    def midpoint(self, a, b):
        return (a + b) // 2  # integer division

To display price, volume, amount...

class Display(object):
    pass

class IntegerDisplay(Display):
    PRICE_DIGITS = 5
    PRICE_FORMAT = "%.5f"
    PRICE_MULT = 10**PRICE_DIGITS

    VOLUME_DIGITS = 2
    VOLUME_FORMAT = "%.2f"
    VOLUME_MULT = 10**VOLUME_DIGITS

    AMOUNT_DIGITS = 2
    AMOUNT_FORMAT = "%.2f"
    AMOUNT_MULT = 10**AMOUNT_DIGITS

    def price(self, x):
        # float() keeps this a true division under Python 2 as well
        return self.PRICE_FORMAT % (x / float(self.PRICE_MULT))

    def volume(self, x):
        return self.VOLUME_FORMAT % (x / float(self.VOLUME_MULT))

    def amount(self, x):
        return self.AMOUNT_FORMAT % (x / float(self.AMOUNT_MULT))

ryankennedyio added a commit to ryankennedyio/qstrader that referenced this issue Jun 21, 2016
@ryankennedyio
Contributor Author

ryankennedyio commented Jun 21, 2016

Okay I've spent 2 or 3 hours this evening revisiting the above, and fortunately the backtests run in about 90% less time. Honestly I think my first profiles were done with Statistics commented out -- so Statistics was definitely responsible for the bulk of that time. Sorry !!! That was honestly the very first thing I ever wrote in Python....

Anyway tl;dr

  • Decimal to Int can save a backtest about 30% of its execution time
  • Using list instead of Series in the Statistics module will save a further 80% of execution time...!
  • The original savings from Decimal->Float were accidentally exaggerated, but a saving of about 30% of execution time is realistic.

All profiling done on 2.5GHz processor, using the example MAC strategy on SP500TR ticker.

At master
13.97s
14.101s
13.876s

At ryankennedyio@4ee3c07
Removed all "decimal", replaced with int. Except for stream_next_tick bit.
10.05s
9.92s
9.88s

At ryankennedyio@e67070c,
further improved speed by 5%, but only relevant for yahoo daily price handler.

At ryankennedyio@f4045ee
After removing pd.Series from statistics and using lists (Duh -- sorry I was very green to Python when I wrote that !! ), I now have the speeds down to:
1.89s
1.88s
1.89s

At ryankennedyio@2b7c74e
After removing the print statement that comes through on every tick, it's now down to;
1.73s
1.72s
1.72s

So, from ~15s down to less than 2 seconds. Loads better.
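
The list-based approach boils down to something like the sketch below. This is not the actual Statistics module; `EquityRecorder` and its methods are hypothetical names, and max drawdown is just one example of a figure computed once at the end.

```python
class EquityRecorder(object):
    """Collect per-tick equity in plain lists and compute stats at the end."""

    def __init__(self):
        self.timestamps = []
        self.equity = []

    def update(self, timestamp, equity):
        # list.append is cheap, so this is safe to call on every tick
        self.timestamps.append(timestamp)
        self.equity.append(equity)

    def max_drawdown(self):
        """Largest peak-to-trough fall, as a fraction of the peak."""
        peak = None
        worst = 0.0
        for value in self.equity:
            if peak is None or value > peak:
                peak = value
            if peak:
                worst = max(worst, (peak - value) / float(peak))
        return worst


recorder = EquityRecorder()
for t, value in enumerate([100, 120, 90, 110]):
    recorder.update(t, value)
print(recorder.max_drawdown())  # 0.25
```

If a Series or DataFrame is still wanted for reporting, it can be built once from the lists after the backtest finishes rather than being grown on every tick.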

@femtotrader that actually looks exactly like what I was thinking of. I didn't know the name, and those examples are great.

Basically, as soon as I include multiple dispatch into my branch and rewrite most of the tests, I'm happy to PR it into master. Feedback on user-friendliness with other people's algorithms will be appreciated.

@mhallsmoore
Owner

This is really good - thanks Ryan and femto. Down from ~15s to 2s is a vast improvement, which will really pay dividends in parameter studies.

I do like the multiple dispatch approach, it nicely separates out the calculation code.

@femtotrader
Contributor

Nice job @ryankennedyio but I really think that an interesting metric for profiling is the number of ticks processed per second. see mhallsmoore/qsforex#18

@mhallsmoore if you like the multiple dispatch approach you should like Julia 😄

@femtotrader
Contributor

@ryankennedyio A possible improvement may also be to use Enum
see mhallsmoore/qsforex#42
Comparing Enum (and so Int) is probably quicker than comparing strings
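
A sketch of what that could look like with the standard `enum` module (Python 3.4+). `EventType`, `TickEvent` and `is_tick` are illustrative names, and the members other than TICK and BAR are assumptions. Whether it is actually faster would need measuring.

```python
from enum import Enum


class EventType(Enum):
    TICK = 1
    BAR = 2
    SIGNAL = 3  # assumed member, for illustration
    ORDER = 4   # assumed member, for illustration
    FILL = 5    # assumed member, for illustration


class TickEvent(object):
    def __init__(self, ticker, time):
        self.type = EventType.TICK
        self.ticker = ticker
        self.time = time


def is_tick(event):
    # Enum members are singletons, so an identity check is enough
    return event.type is EventType.TICK


print(is_tick(TickEvent("SP500TR", None)))  # True
```

A side benefit is that a typo such as `EventType.TIKC` fails loudly with an AttributeError, whereas a mistyped string comparison silently never matches.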

@femtotrader
Contributor

femtotrader commented Jun 22, 2016

I did some speed measurements of qstrader with random data generated using generate_simulated_prices.py (I was only looking at qsforex before but qstrader is more active now)

I added

    # Assumes `import time` at module level and self.t0 = time.time()
    # set when the backtest starts.
    def _speed(self, i):
        return i / (time.time() - self.t0)

    @property
    def speed_iters(self):
        return self._speed(self.iters)

    @property
    def speed_ticks(self):
        return self._speed(self.ticks)

    @property
    def speed_bars(self):
        return self._speed(self.bars)

    def _s_speed(self, s, i):
        return "%d %s processed @ %f %s/s" % (i, s, self._speed(i), s)

    def s_speed_iters(self):
        return self._s_speed("iters", self.iters)

    def s_speed_ticks(self):
        return self._s_speed("ticks", self.ticks)

    def s_speed_bars(self):
        return self._s_speed("bars", self.bars)

to Backtest class and I only printed one tick per 1000 (using modulo)

                    if event.type == 'TICK':
                        self.cur_time = event.time
                        if self.ticks % self.N == 0:
                            print("Tick %s, at %s" % (self.ticks, self.cur_time))
                            print(self.s_speed_ticks())

In the current code, we are processing fewer than 150 ticks per second !!!!

About printing ticks, bars, ...: I think we should have a PrintStrategy that we could pass to Backtest
(in fact it's very important that Backtest be able to apply several strategies, with printing ticks, bars, ... being just one strategy among the others)
see #41
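
Roughly what I have in mind, as a sketch only. `PrintStrategy`, `Strategies` and the `Event` tuple are hypothetical here, and the `calculate_signals(event)` method name is an assumption about the strategy interface.

```python
from collections import namedtuple

# Stand-in for a real event object; only the fields used below
Event = namedtuple("Event", ["type", "time"])


class PrintStrategy(object):
    """Prints one tick out of every `every` and produces no signals."""

    def __init__(self, every=1000):
        self.every = every
        self.ticks = 0
        self.lines = []

    def calculate_signals(self, event):
        if event.type == 'TICK':
            if self.ticks % self.every == 0:
                line = "Tick %s, at %s" % (self.ticks, event.time)
                self.lines.append(line)
                print(line)
            self.ticks += 1


class Strategies(object):
    """Fans each event out to several strategies in turn."""

    def __init__(self, *strategies):
        self.strategies = strategies

    def calculate_signals(self, event):
        for strategy in self.strategies:
            strategy.calculate_signals(event)


printer = PrintStrategy(every=1000)
combined = Strategies(printer)  # a trading strategy would be added here too
for i in range(2500):
    combined.calculate_signals(Event('TICK', i))
```

With this, turning the printing off is just a matter of not including the PrintStrategy, and the Backtest loop itself contains no print statements.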

@femtotrader
Contributor

femtotrader commented Jun 22, 2016

It's also very interesting to see how speed evolves over time

$ python qstrader/examples/test_strategy_backtest.py
Running Backtest...
Tick 0, at 2014-01-01 00:00:01.449000
0 ticks processed @ 0.000 ticks/s
Tick 1000, at 2014-01-01 00:23:24.651000
1000 ticks processed @ 142.403 ticks/s
Tick 2000, at 2014-01-01 00:46:43.550000
2000 ticks processed @ 122.072 ticks/s
Tick 3000, at 2014-01-01 01:09:59.220000
3000 ticks processed @ 104.470 ticks/s
Tick 4000, at 2014-01-01 01:33:21.505000
4000 ticks processed @ 65.427 ticks/s
Tick 5000, at 2014-01-01 01:56:38.939000
5000 ticks processed @ 63.518 ticks/s
Tick 6000, at 2014-01-01 02:19:57.438000
6000 ticks processed @ 61.752 ticks/s
Tick 7000, at 2014-01-01 02:43:21.699000
7000 ticks processed @ 50.453 ticks/s
Tick 8000, at 2014-01-01 03:06:42.067000
8000 ticks processed @ 45.977 ticks/s
Tick 9000, at 2014-01-01 03:30:02.780000
9000 ticks processed @ 43.235 ticks/s
Tick 10000, at 2014-01-01 03:53:25.028000
10000 ticks processed @ 41.235 ticks/s
Tick 11000, at 2014-01-01 04:16:48.905000
11000 ticks processed @ 39.428 ticks/s
Tick 12000, at 2014-01-01 04:40:05.578000
12000 ticks processed @ 38.524 ticks/s
Tick 13000, at 2014-01-01 05:03:23.279000
13000 ticks processed @ 36.974 ticks/s
Tick 14000, at 2014-01-01 05:26:46.582000
14000 ticks processed @ 34.748 ticks/s
Tick 15000, at 2014-01-01 05:50:04.747000
15000 ticks processed @ 32.652 ticks/s

@ryankennedyio
Contributor Author

ryankennedyio commented Jun 22, 2016

@femtotrader If you're testing that on the current master, it'll slow down over time due to the use of Pandas Series. Turns out Series is really bad at inserting data at indexes.

If you add my repo as a Remote Upstream and merge arithmetic-optimization into your branch you should see more like 1000 ticks/s

I guess @mhallsmoore will have to draw some distinction on what he wants "out-of-the-box" for users right away, without being overly complex or daunting (with regard to strategies etc). Having slow example strategies really doesn't matter, as long as the "core" system will support faster ones written by the user.

The thing I find most useful about this system is that it's designed in a really modular way -- I can just about plug and play anything I want into or out of it, while still being able to maintain dependencies on the rest of the codebase as it's updated.

@femtotrader
Contributor

femtotrader commented Jun 23, 2016

Some useful profiling tools:

Running unit tests with nose-timer plugin

$ nosetests -s -v --with-timer

it's very easy to set up (just need to install it)

other tools (maybe more complex to set up)

vbench https://github.com/wesm/vbench
used by pandas
It produces nice (historical) graphs http://pandas.pydata.org/pandas-docs/vbench/index.html
http://pandas.pydata.org/pandas-docs/vbench/vb_indexing.html

Airspeed Velocity https://github.com/spacetelescope/asv
it also produces nice historical graphs http://droettboom.com/astropy-benchmark/

PS: pandas dev are considering moving from vbench to asv
see https://github.com/pydata/pandas/issues/9660 and https://github.com/pydata/pandas/pull/9715

@ryankennedyio
Contributor Author

Ok, aiming to have this wrapped up this weekend. Main thing left is multiple dispatch and fix the numerous merge conflicts to the new master.

@mhallsmoore would be great if you could run through the open PRs and merge them into master when you get a tick, so this will slot right in when I'm done rather than doing another round of merge conflicts :)

@ryankennedyio
Contributor Author

ryankennedyio commented Jul 3, 2016

Phew ! Took longer than expected to get up to speed with these new changes in the codebase. Very nice @femtotrader . Sadly only got to squeeze an hour in this weekend. Bah.

Not happy with how I'm using PRICE_MULTIPLIER and pulling config in too often (ryankennedyio@42ac63f), so need to pull in the above example of multiple dispatch. Very happy with speed now though <3

@femtotrader
Contributor

How many BARS/s or TICKS/s do you get now with the sample strategy?
@ryankennedyio You might also enable Travis on your side,
because https://travis-ci.org/ryankennedyio/qstrader raises "The repository at ryankennedyio/qstrader was not found."

@femtotrader
Contributor

femtotrader commented Jul 3, 2016

Ideally I would prefer a solution with support of all these types:

  • float
  • integer and PRICE_MULTIPLIER
  • decimal

@ryankennedyio
Contributor Author

ryankennedyio commented Jul 4, 2016

Good point, supporting all types seems like it will be fine with multiple dispatch. Very clean.

Buy and hold: 1600 BARS processed @ 6833.806070 BARS/s
MAC: 1600 BARS processed @ 4344.251263 BARS/s

@femtotrader
Contributor

Maybe this issue can be closed
