ENH: added positions computation in 'performance.create_pyfolio_input' #250

luca-s · 2018-01-09T13:07:19Z

'performance.create_pyfolio_input' now computes positions too. It is also now possible to select the 'period' used in the benchmark computation and, for factor returns/positions, to select equal weighting instead of factor weighting.

luca-s · 2018-01-09T13:22:13Z

I noticed the pyfolio Exposure plot is empty. I am not sure whether it is a pyfolio bug or whether the data is missing something. The positions are computed as percentages, and so is the cash; is this correct?

e.g.

A | B | C | D | E | F | cash
-- | -- | -- | -- | -- | -- | --
0.125 | 0.3750 | -0.125000 | -0.375 | 0.0000 | 0.000000 | 1.0
0.125 | 0.1875 | -0.062500 | -0.375 | 0.1875 | -0.062500 | 1.0
0.125 | 0.2500 | -0.083333 | -0.375 | 0.1250 | -0.041667 | 1.0
0.125 | 0.2500 | -0.083333 | -0.375 | 0.1250 | -0.041667 | 1.0
0.125 | 0.3750 | -0.125000 | -0.375 | 0.0000 | 0.000000 | 1.0
0.125 | 0.3750 | -0.125000 | -0.375 | 0.0000 | 0.000000 | 1.0
0.125 | 0.2500 | -0.083333 | -0.375 | 0.1250 | -0.041667 | 1.0

luca-s · 2018-01-09T13:54:45Z

If anybody has a better name for the new API (or the new internal functions), please let me know: I am not very happy with the current names, but I couldn't think of anything better.
I also wonder whether performance is the right place for create_pyfolio_input, or whether it would fit better inside utils or tears.

luca-s · 2018-01-09T15:32:26Z

I've just realized that positions must be in dollars. Only pyfolio's tears.create_perf_attrib_tear_sheet accepts positions either in dollars or as percentages. That's a pity; I have to fix the positions computation then.

mmargenot · 2018-01-09T17:17:56Z

I think that utils might make more sense. An alphalens.tears.create_perf_attrib_tear_sheet might be a good wrapper for create_pyfolio_input -> pyfolio.tears.create_perf_attrib_tear_sheet, though.

luca-s · 2018-01-09T19:06:32Z

I like the idea of moving create_pyfolio_input to utils.

I thought about a wrapper too, but I discarded the idea because it doesn't add anything useful and we would have to keep updating the alphalens API to reflect the changes that happen in pyfolio. More importantly, I don't like the idea of hiding pyfolio calls: it is useful for users to understand which functionality is called so that they can customize the calls for their needs (there are so many parameters in the pyfolio tears functions). Let's see if calling pyfolio becomes more difficult in the future, but as long as it is as simple as it is now we can keep it the way it is. What do you think?

mmargenot · 2018-01-09T19:51:04Z

That makes sense to me. It's a case of doing the whole performance attribution in two lines vs. one line, which I think is okay to leave as two for now.

luca-s · 2018-01-10T12:06:17Z

@twiecki it is ready to be reviewed. Positions are now computed as dollar amounts instead of percentages. Actually, the pyfolio results are identical to before, so I wonder if we could stick to percentage positions: I like them more, and users wouldn't be forced to provide an initial capital in create_pyfolio_input just to transform the positions from percentages to dollar amounts.
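
The percentage-to-dollar conversion discussed here amounts to scaling each allocation by the starting capital. Below is a minimal sketch in plain Python; `to_dollar_positions`, `to_percentages`, and the `capital` argument are hypothetical names used only for illustration and are not the alphalens API.

```python
def to_dollar_positions(pct_positions, capital):
    """Scale fractional allocations (asset -> fraction of capital) into dollars."""
    return {asset: weight * capital for asset, weight in pct_positions.items()}


def to_percentages(dollar_positions, capital):
    """Inverse conversion: divide dollar amounts by the same capital."""
    return {asset: value / capital for asset, value in dollar_positions.items()}


# One row of the percentage table posted earlier in this thread.
row = {"A": 0.125, "B": 0.375, "C": -0.125, "D": -0.375,
       "E": 0.0, "F": 0.0, "cash": 1.0}

# The capital figure is an arbitrary assumption for the example.
dollars = to_dollar_positions(row, capital=100_000)
```

Dividing the dollar amounts by the same capital recovers the original fractions, which is consistent with the observation that the results did not change after the switch.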

twiecki · 2018-01-10T12:15:17Z

Really excited about this. But looking at the NB wondering if there is a bug, e.g.:

luca-s · 2018-01-10T13:06:44Z

I believe that's correct. This is the date when ES is the only short position in the portfolio:

Looking at the factor values for that date we can see that 'ES' has a factor value three orders of magnitude bigger than the other values. This should explain what we are seeing.

twiecki · 2018-01-10T14:53:08Z

But shouldn't the logic just select the top and bottom n stocks? Seems like it's weighting by alpha signal.

luca-s · 2018-01-10T15:38:45Z

So the point of confusion is that we are asking for a portfolio that has these characteristics: long_short=True, equal_weight=True, quantiles=[1,5] and the user would expect to have long positions on quantile 5 and short positions on quantile 1, while the simulated portfolio contains only one short position.

The problem is that the code demeans the factor values, goes long on the positive ones and short on the negative ones, and then computes equal weights. This is the cause of the confusion.

I need to think again about this behavior and make sure it doesn't end up with this kind of inconsistency. Thank you for spotting this, I love it when bugs are found right away :)

twiecki · 2018-01-11T09:56:33Z

> The problem is that the code demeans the factor values, goes long on the positive ones and short on the negative ones, and then computes equal weights. This is the cause of the confusion.

Not sure I understand yet what the problem really is. Shouldn't the 1 and 5 quantile have roughly the same number of stocks despite what weighting is used?

luca-s · 2018-01-11T10:16:18Z

Yes, and I will fix that; it has to work as you say. I believe I was looking at equal weighting from the wrong point of view. I used the factor values to decide which assets should be long and which short, and then I computed the equal weighting. I actually have to use the quantile information to decide which assets go into the short positions and which into the long ones.

twiecki · 2018-01-11T10:49:21Z

One other idea for the future would be the ability to supply a custom weighting function. E.g. could see the case for equal weight, alpha weighted, inv vol etc.
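
A pluggable weighting function of the kind suggested here could look roughly like the sketch below. It is written in plain Python; every name in it (`build_weights`, `equal_weight`, `alpha_weight`, `inverse_vol_weight`, the `vol` mapping) is hypothetical and not part of alphalens. It assumes the factor values are already centered, so that a positive value means long and a negative value means short.

```python
def normalize(raw):
    """Rescale raw weights so that their absolute values sum to 1."""
    gross = sum(abs(w) for w in raw.values())  # raises ZeroDivisionError if all zero
    return {asset: w / gross for asset, w in raw.items()}


def sign(x):
    return (x > 0) - (x < 0)


def equal_weight(factor, vol):
    # Same absolute weight for every asset; only the sign of the signal matters.
    return normalize({a: float(sign(v)) for a, v in factor.items()})


def alpha_weight(factor, vol):
    # Weight proportional to the signal itself.
    return normalize(dict(factor))


def inverse_vol_weight(factor, vol):
    # Sign from the signal, size inversely proportional to volatility.
    return normalize({a: sign(v) / vol[a] for a, v in factor.items()})


def build_weights(factor, vol, weighting=equal_weight):
    """Delegate the weighting scheme to a user-supplied callable."""
    return weighting(factor, vol)


# Made-up signal and volatility values for illustration.
factor = {"A": 2.0, "B": 1.0, "C": -1.0, "D": -2.0}
vol = {"A": 0.1, "B": 0.2, "C": 0.2, "D": 0.1}

w_equal = build_weights(factor, vol)
w_alpha = build_weights(factor, vol, weighting=alpha_weight)
w_inv_vol = build_weights(factor, vol, weighting=inverse_vol_weight)
```

Passing the scheme as a callable keeps the number of function arguments fixed while still letting users plug in their own logic.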

twiecki · 2018-01-12T11:43:20Z

alphalens/performance.py


+from pandas.tseries.offsets import Day


Should that be BDay?

I was in doubt. It is used only to handle an error condition that should never happen (freq not set in the factor_data), so it's just a "safety belt" default value. I can switch it to BDay though.

luca-s · 2018-01-12T22:24:34Z

The issue with the long/short weights should be fixed now and NB updated too.

By the way, the change I made to the weights computation is that factor values above the median become long positions, while factor values below the median become short positions. The previous behaviour was very similar except that I used the mean instead of the median; that's why the extremely large negative factor value for ES made it the only short position.
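
The difference between centering on the mean and centering on the median can be reproduced with a toy example. The sketch below is plain Python and is not the code from this PR; `equal_weight_long_short` and the factor values are made up, with one outlier standing in for the ES case.

```python
from statistics import mean, median


def equal_weight_long_short(factor, center):
    """Long above center(values), short below, equal weight within each side.

    Assets exactly at the pivot get no position. Assumes both sides are
    non-empty; otherwise the division below would fail.
    """
    pivot = center(list(factor.values()))
    longs = [a for a, v in factor.items() if v > pivot]
    shorts = [a for a, v in factor.items() if v < pivot]
    weights = {a: 0.5 / len(longs) for a in longs}
    weights.update({a: -0.5 / len(shorts) for a in shorts})
    return weights


# Hypothetical factor values: "ES" is orders of magnitude below the rest.
factor = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0, "ES": -3000.0}

by_mean = equal_weight_long_short(factor, mean)      # pivot is -497.5
by_median = equal_weight_long_short(factor, median)  # pivot is 2.5

shorts_by_mean = sorted(a for a, w in by_mean.items() if w < 0)
shorts_by_median = sorted(a for a, w in by_median.items() if w < 0)
```

With the mean, the outlier drags the pivot far below every other asset, so only ES ends up short. With the median, the universe is split into two groups of three.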

luca-s · 2018-01-13T20:37:23Z

I also found the reason the Exposure plot was blank: it turned out to be a pyfolio bug.

twiecki · 2018-01-14T20:54:33Z

> By the way, the change I made to the weights computation is that factor values above the median become long positions, while factor values below the median become short positions. The previous behaviour was very similar except that I used the mean instead of the median; that's why the extremely large negative factor value for ES made it the only short position.

That makes sense. Ideally I think that would be configurable as well, e.g.: lower_percentile and upper_percentile.

twiecki · 2018-01-14T20:59:51Z

Also, seems like sub-sampling doesn't quite do what we want: ideally we wouldn't exit the positions but hold them for the whole week. I suppose one would need to have the same signal for all days in the week to achieve that.

luca-s · 2018-01-15T11:23:01Z

> That makes sense. Ideally I think that would be configurable as well, e.g.: lower_percentile and upper_percentile.

For now it's possible to choose which quantile to use, eventually we can add the percentile option if the quantile configuration is not flexible enough.

> Also, seems like sub-sampling doesn't quite do what we want: ideally we wouldn't exit the positions but hold them for the whole week. I suppose one would need to have the same signal for all days in the week to achieve that.

The portfolio is holding the positions for 1 day because the code calls create_pyfolio_input(period='1D', ... ). If we switched period to '5D', which is one of the periods computed by get_clean_factor_and_forward_returns, the positions would be held for 5 days. The reason I used '1D' is that I couldn't find a good example for the 5-day period. I didn't want to give the misleading idea that there is a good reason to trade a 5-day signal every 5 days.

I believe that rebalancing every 5 days is not the best way of trading a 5-day signal. A better way would be to trade 1/5th of the portfolio on each subsequent day and rebalance each 1/5th sub-portfolio every 5 days. This would result in the same transaction cost, but the slippage impact would be 1/5th, the portfolio capacity would be 5 times bigger, the volatility of the portfolio would be lower, and the factor would be traded every single day, making it more statistically robust and independent of the starting day.
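
The staggered scheme described above can be sketched as follows. This is a plain Python illustration, not something alphalens provides; `staggered_schedule` and `combined_weights` are hypothetical helper names, and each sub-portfolio ("sleeve") is assumed to hold an equal share of the capital.

```python
def staggered_schedule(n_days, holding_days=5):
    """Return (day, sleeve) pairs: on each day exactly one sleeve is rebalanced,
    so every sleeve is rebalanced once per holding_days."""
    return [(day, day % holding_days) for day in range(n_days)]


def combined_weights(sleeve_weights):
    """Average the per-sleeve weights, since each sleeve holds 1/N of the capital."""
    n = len(sleeve_weights)
    assets = set().union(*sleeve_weights)  # union of the asset names of all sleeves
    return {a: sum(s.get(a, 0.0) for s in sleeve_weights) / n
            for a in sorted(assets)}


schedule = staggered_schedule(7)

# Two sleeves that currently hold different single assets (made-up data).
total = combined_weights([{"A": 1.0}, {"B": 1.0}])
```

Because only one sleeve trades on any given day, each day's turnover is a fraction of the whole portfolio, which is where the lower slippage impact in the argument above comes from.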

I can still modify the NB to show the usage of 5 days period traded every Monday, except I need a good excuse to show that.

twiecki · 2018-01-15T16:38:38Z

If the quantile is already used, where does the median (or mean) value come in when building the portfolio?

The holding period question is tricky indeed. Although I think trading a 5-day signal every 5 days is a pretty simple method to go with as a default.

luca-s · 2018-01-16T09:48:42Z

> If the quantile is already used, where does the median (or mean) value come in when building the portfolio?

I am not sure I understand your question. This is how I implemented it: the quantiles option of the create_pyfolio_input function selects the quantiles that will be used in the portfolio. The assets belonging to the selected quantiles become long positions if their factor values are above the median and short positions otherwise. This ensures the same number of assets in long and short positions. The user can choose which quantiles to use to increase or decrease the number of assets traded (e.g. quantiles=[1,5] vs quantiles=[1,2,4,5]). That's not exactly the same as choosing percentiles, but it's something.
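
The two-step logic described here (filter by quantile, then split at the median of what remains) can be sketched in plain Python. `select_and_split` is a hypothetical helper, not the function in this PR, and the ten-asset universe is invented for the example.

```python
from statistics import median


def select_and_split(factor, quantile, quantiles):
    """Keep assets whose quantile is in `quantiles`, then split at the median.

    Values above the median go long; everything else goes short.
    """
    selected = {a: v for a, v in factor.items() if quantile[a] in quantiles}
    pivot = median(selected.values())
    longs = sorted(a for a, v in selected.items() if v > pivot)
    shorts = sorted(a for a, v in selected.items() if v <= pivot)
    return longs, shorts


# Ten made-up assets with factor values 1..10, two assets per quantile.
factor = {f"S{i:02d}": float(i) for i in range(1, 11)}
quantile = {f"S{i:02d}": (i - 1) // 2 + 1 for i in range(1, 11)}

narrow = select_and_split(factor, quantile, quantiles=[1, 5])
wide = select_and_split(factor, quantile, quantiles=[1, 2, 4, 5])
```

With symmetric selections such as these, the median falls between the low and high quantiles, so each side receives the same number of assets.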

> The holding period question is tricky indeed. Although I think trading a 5-day signal every 5 days is a pretty simple method to go with as a default.

Ok then, I'll update the NB.

twiecki · 2018-01-16T11:35:55Z

Oh I see. So first you select whatever quantiles the user specified, e.g. [1, 2, 5] (which makes no sense) and then you do a median split inside that selection. So the algo would go long from 2.5 to 5 and short on 1 to 2.5 (there are no 2.5s but it's based on the actual values). Correct?

luca-s · 2018-01-16T11:39:48Z

Exactly but please let me know if you have a better idea. Eventually I'd like to add your idea of a custom weighting function though, so the users can do what they like

twiecki · 2018-01-16T11:42:23Z

OK, that makes sense. An alternative would be to require the user to specify long_quantiles=[4, 5], short_quantiles=[1, 2] to make it explicit. Although I think the current one is simpler and probably foolproof as well.

NB looks great too. I will try to review or find someone to review the code in more detail.

@richafrank Do you know of someone who could help review this new feature?

luca-s · 2018-01-16T13:14:30Z

Making the long/short quantiles explicit would be nicer, but then we would still need the quantiles option to handle the factor-weighted scenario, where the factor value implies the long/short positions. So to avoid the proliferation of too many function arguments I chose this path.

twiecki · 2018-01-23T14:59:32Z

Ping @richafrank.

richafrank · 2018-01-23T17:50:34Z

Thanks for the ping. Sorry I lost track of this. Will find someone!

prsutherland · 2018-01-29T23:23:01Z

alphalens/performance.py

@@ -379,6 +386,11 @@ def cumulative_returns(returns, period, freq=None):
    if freq is None:
        freq = returns.index.freq

+    if freq is None:


You might consider combining this with the if statement on line 387.

You mean moving that "if" inside the previous one? uhmm, I don't like too many levels of indentation

Just that both if statements are evaluating freq is None so the statements within each if statement can be combined into one block.

That would change the logic as freq is changed inside the first "if" statement

Ah! Good point!
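
The point settled in this exchange can be shown with a small standalone sketch. `resolve_freq` mirrors the two-step fallback in the diff using plain values instead of pandas objects, while `resolve_freq_merged` is one naive way of folding the two checks together. Both function names and the string frequencies are assumptions made for illustration.

```python
def resolve_freq(freq, index_freq, default):
    """Two-step fallback: explicit freq, then the index freq, then a default."""
    if freq is None:
        freq = index_freq  # may itself be None
    # This check sees the value assigned above, so it only fires when the
    # index carried no frequency either.
    if freq is None:
        freq = default
    return freq


def resolve_freq_merged(freq, index_freq, default):
    """Naive merge of the two checks: the default always wins over the index."""
    if freq is None:
        freq = index_freq
        freq = default
    return freq


kept_index_freq = resolve_freq(None, "B", "D")
lost_index_freq = resolve_freq_merged(None, "B", "D")
```

The second `if` depends on the assignment made inside the first one, which is why the two blocks cannot simply be merged without adding another level of nesting.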

prsutherland · 2018-01-29T23:24:51Z

alphalens/performance.py

+    if freq is None:
+        freq = weights.index.freq
+
+    if freq is None:


Same comment about this if statement and line 528

prsutherland · 2018-02-05T21:59:26Z

alphalens/performance.py

            group = group.copy()

-        if _equal_weight:
+            if _demeaned:
+                # top assets positive weights, bottom ones negative


Using median instead of mean is slightly confusing. It might need further clarification. Maybe update the docstring to "...If demeaned is True then the factor universe will be split in two equal sized groups,..."?

I like "equal sized groups". I will update that, thanks.


twiecki · 2018-02-08T09:39:01Z

Waiting on @prsutherland's sign-off before merging.

luca-s · 2018-02-13T11:00:43Z

@prsutherland any more comments on this PR?

twiecki · 2018-02-15T13:04:37Z

OK, I think this went through some solid review. Going to merge this -- really cool feature @luca-s!

luca-s requested a review from twiecki January 9, 2018 13:18

luca-s force-pushed the pyfolio_integration branch from 2f8a03d to 459e4ee Compare January 9, 2018 13:42

luca-s force-pushed the pyfolio_integration branch 2 times, most recently from 60a9145 to ada4fff Compare January 10, 2018 12:01

luca-s force-pushed the pyfolio_integration branch from ada4fff to e9d9309 Compare January 10, 2018 12:10

luca-s force-pushed the pyfolio_integration branch from d33335c to e9d9309 Compare January 10, 2018 12:32

luca-s force-pushed the pyfolio_integration branch from e9d9309 to b718b81 Compare January 10, 2018 13:38

luca-s force-pushed the pyfolio_integration branch from b718b81 to 8d95794 Compare January 11, 2018 16:43

twiecki reviewed Jan 12, 2018

luca-s force-pushed the pyfolio_integration branch 3 times, most recently from 55a9eb0 to 93e2872 Compare January 12, 2018 22:19

luca-s force-pushed the pyfolio_integration branch 3 times, most recently from d1bd17b to 63f7ea6 Compare January 13, 2018 18:18

luca-s force-pushed the pyfolio_integration branch from 63f7ea6 to ad0980e Compare January 13, 2018 20:28

luca-s force-pushed the pyfolio_integration branch from ad0980e to f2c7a95 Compare January 16, 2018 10:37

luca-s mentioned this pull request Jan 31, 2018

Release v0.3.0 #261

Closed

prsutherland reviewed Feb 5, 2018

luca-s added 2 commits February 6, 2018 12:33

ENH: added positions computation to performance.create_pyfolio_input

925324e

Plus it is now possible to select the 'period' to be used in benchmark computation

DOC: updated "pyfolio integration" NB to reflect new API

b99d89b

luca-s force-pushed the pyfolio_integration branch from f2c7a95 to b99d89b Compare February 6, 2018 11:33

twiecki assigned prsutherland Feb 8, 2018

twiecki merged commit b9fdc8f into quantopian:master Feb 15, 2018

luca-s deleted the pyfolio_integration branch February 15, 2018 14:11

luca-s mentioned this pull request Feb 15, 2018

Integration with pyfolio and Quantopian's new Risk Model #225

Closed
