ENH: added positions computation in 'performance.create_pyfolio_input' #250

Merged: twiecki merged 2 commits into quantopian:master from pyfolio_integration on Feb 15, 2018


luca-s
Collaborator

@luca-s luca-s commented Jan 9, 2018

'performance.create_pyfolio_input' now computes positions too. It is also now possible to select the 'period' used in the benchmark computation and, for factor returns and positions, to select equal weighting instead of factor weighting.

@luca-s luca-s requested a review from twiecki January 9, 2018 13:18
@luca-s
Collaborator Author

luca-s commented Jan 9, 2018

I noticed the pyfolio Exposure plot is empty. I am not sure whether it is a pyfolio bug or whether the data is missing something. The positions are computed as percentages, and so is the cash; is this correct?

e.g.

A | B | C | D | E | F | cash
-- | -- | -- | -- | -- | -- | --
0.125 | 0.3750 | -0.125000 | -0.375 | 0.0000 | 0.000000 | 1.0
0.125 | 0.1875 | -0.062500 | -0.375 | 0.1875 | -0.062500 | 1.0
0.125 | 0.2500 | -0.083333 | -0.375 | 0.1250 | -0.041667 | 1.0
0.125 | 0.2500 | -0.083333 | -0.375 | 0.1250 | -0.041667 | 1.0
0.125 | 0.3750 | -0.125000 | -0.375 | 0.0000 | 0.000000 | 1.0
0.125 | 0.3750 | -0.125000 | -0.375 | 0.0000 | 0.000000 | 1.0
0.125 | 0.2500 | -0.083333 | -0.375 | 0.1250 | -0.041667 | 1.0


@luca-s
Collaborator Author

luca-s commented Jan 9, 2018

If anybody has a better name for the new API (or the new internal functions), please let me know: I am not so happy with the current names, but I couldn't think of anything better.
I also wonder if performance is the right place for create_pyfolio_input, or if it would be better inside utils or tears.

@luca-s
Collaborator Author

luca-s commented Jan 9, 2018

I've just realized that positions must be in dollars. Only pyfolio.tears.create_perf_attrib_tear_sheet accepts positions in either dollars or percentages. That's a pity; I have to fix the positions computation then.

@mmargenot
Contributor

mmargenot commented Jan 9, 2018

I think that utils might make more sense. An alphalens.tears.create_perf_attrib_tear_sheet might be a good wrapper for create_pyfolio_input -> pyfolio.tears.create_perf_attrib_tear_sheet, though.

@luca-s
Collaborator Author

luca-s commented Jan 9, 2018

I like the idea of moving create_pyfolio_input to utils.

I thought about a wrapper too, but I discarded the idea because it doesn't add anything useful, and we would have to keep updating the alphalens API to reflect the changes that happen in pyfolio. More importantly, I don't like the idea of hiding pyfolio calls: it is useful for users to understand what functionality is called so that they can customize the calls for their needs (there are so many parameters in the pyfolio tears functions). Let's see if calling pyfolio becomes more difficult in the future, but as long as it is as simple as it is now we can keep it the way it is. What do you think?

@mmargenot
Contributor

That makes sense to me. It's a case of doing the whole performance attribution in two lines vs. one line, which I think is okay to leave as two for now.

luca-s force-pushed the pyfolio_integration branch 2 times, most recently from 60a9145 to ada4fff, on January 10, 2018 12:01
@luca-s
Collaborator Author

luca-s commented Jan 10, 2018

@twiecki it is ready to be reviewed. Positions are now computed as dollar amounts instead of percentages. Actually, the pyfolio results are identical to before, so I wonder if we could stick to percentage positions: I like them more, and users wouldn't be forced to provide an initial capital in create_pyfolio_input just to transform the positions from percentages to dollar amounts.
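
As a rough illustration of the conversion discussed here, percentage positions can be scaled by a starting capital to obtain dollar amounts. This is only a sketch, not the code in this PR: the helper name `to_dollar_positions`, the `capital` value, and the treatment of cash as "capital minus net invested" are all assumptions. The sample rows reuse the first two rows of the table posted earlier in the thread.

```python
import pandas as pd

# Fractional positions, taken from the first two rows of the table above.
pct_positions = pd.DataFrame(
    {
        "A": [0.125, 0.125],
        "B": [0.375, 0.1875],
        "C": [-0.125, -0.0625],
        "D": [-0.375, -0.375],
        "E": [0.0, 0.1875],
        "F": [0.0, -0.0625],
    }
)


def to_dollar_positions(pct_positions, capital):
    """Scale fractional positions by a (hypothetical) capital figure."""
    dollars = pct_positions * capital
    # Assumption: cash is whatever capital is not net-invested.
    dollars["cash"] = capital - dollars.sum(axis=1)
    return dollars


dollar_positions = to_dollar_positions(pct_positions, capital=100000.0)
print(dollar_positions)
```

Because the long and short sides offset each other in these rows, the cash column equals the full capital, which matches the cash value of 1.0 in the percentage table.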

@twiecki
Contributor

twiecki commented Jan 10, 2018

Really excited about this. But looking at the NB wondering if there is a bug, e.g.:
image

image

image

@luca-s
Collaborator Author

luca-s commented Jan 10, 2018

I believe that's correct. This is the date when ES is the only short position in the portfolio:

image

Looking at the factor values for that date we can see that 'ES' has a factor value three orders of magnitude larger than the other values. This should explain what we are seeing.

image

@twiecki
Contributor

twiecki commented Jan 10, 2018

But shouldn't the logic just select the top and bottom n stocks? Seems like it's weighting by alpha signal.

@luca-s
Collaborator Author

luca-s commented Jan 10, 2018

So the point of confusion is that we are asking for a portfolio that has these characteristics: long_short=True, equal_weight=True, quantiles=[1,5] and the user would expect to have long positions on quantile 5 and short positions on quantile 1, while the simulated portfolio contains only one short position.

The problem is that the code demeans the factor values, goes long on the positive ones and short on the negative ones, and then computes equal weights. This is the cause of the confusion.

I need to think again about this behavior and make sure it doesn't end up with this kind of inconsistency. Thank you for spotting this, I love it when bugs are found right away :)

@twiecki
Contributor

twiecki commented Jan 11, 2018

> The problem is that the code demeans the factor values, goes long on the positive ones and short on the negative ones, and then computes equal weights. This is the cause of the confusion.

Not sure I understand yet what the problem really is. Shouldn't the 1 and 5 quantile have roughly the same number of stocks despite what weighting is used?

@luca-s
Collaborator Author

luca-s commented Jan 11, 2018

Yes, and I will fix that; it has to work as you say. I believe I was looking at the equal weighting from the wrong point of view. I used the factor values to decide which assets should be long and which short, and then computed the equal weighting. I actually have to use the quantile information to decide which assets should be in the short positions and which in the long ones.

@twiecki
Contributor

twiecki commented Jan 11, 2018

One other idea for the future would be the ability to supply a custom weighting function. E.g. could see the case for equal weight, alpha weighted, inv vol etc.
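
One way such a pluggable weighting scheme could look is sketched below. None of these names (`equal_weight`, `alpha_weight`, `inverse_vol_weight`, `compute_weights`) exist in alphalens; they are hypothetical, and the factor and volatility values are made up for illustration.

```python
import pandas as pd


def equal_weight(factor, vol=None):
    """Long above the median, short at or below it, same absolute weight."""
    signs = (factor > factor.median()).astype(float) * 2.0 - 1.0
    return signs / signs.abs().sum()


def alpha_weight(factor, vol=None):
    """Weights proportional to the demeaned factor value."""
    demeaned = factor - factor.mean()
    return demeaned / demeaned.abs().sum()


def inverse_vol_weight(factor, vol):
    """Sign from the median split, size proportional to 1 / volatility."""
    signs = (factor > factor.median()).astype(float) * 2.0 - 1.0
    raw = signs / vol
    return raw / raw.abs().sum()


def compute_weights(factor, weight_fn, vol=None):
    """Apply a user-supplied weighting function to one date's factor values."""
    return weight_fn(factor, vol)


# Made-up factor values and volatilities for a single date.
factor = pd.Series({"A": 2.0, "B": 1.0, "C": -1.0, "D": -2.0})
vol = pd.Series({"A": 0.1, "B": 0.2, "C": 0.2, "D": 0.1})

w_equal = compute_weights(factor, equal_weight)
w_alpha = compute_weights(factor, alpha_weight)
w_ivol = compute_weights(factor, inverse_vol_weight, vol)
```

Each function returns weights whose absolute values sum to one, so the caller can swap schemes without changing anything downstream.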


from pandas.tseries.offsets import Day
Contributor

Should that be BDay?

Collaborator Author

I was in doubt. That is only used to guard against an error condition that should never happen (freq not set in the factor_data), so it's just a "safety belt" default value. I can switch that to BDay though.
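
For context, the two pandas offsets differ in how they treat weekends. The snippet below is a small standalone illustration with an arbitrarily chosen date, not code from the PR.

```python
import pandas as pd
from pandas.tseries.offsets import BDay, Day

friday = pd.Timestamp("2018-01-12")  # a Friday

# Day() steps one calendar day; BDay() skips Saturday and Sunday.
next_calendar_day = friday + Day()
next_business_day = friday + BDay()

print(next_calendar_day.day_name(), next_business_day.day_name())
```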

luca-s force-pushed the pyfolio_integration branch 3 times, most recently from 55a9eb0 to 93e2872, on January 12, 2018 22:19
@luca-s
Collaborator Author

luca-s commented Jan 12, 2018

The issue with the long/short weights should be fixed now and NB updated too.

By the way, the change I made to the weights computation is that factor values above the median become long positions, while factor values below the median become short positions. The previous behaviour was very similar except that I used the mean instead of the median; that's why the extremely large negative factor value for ES made it the only short position.
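
The effect of an outlier on the two split points can be reproduced with a few made-up numbers. This is a minimal sketch, assuming each side receives half of the gross exposure; the helper `equal_weights` and the factor values are hypothetical and do not come from the PR.

```python
import pandas as pd


def equal_weights(factor, center="median"):
    """Equal long/short weights split around the mean or the median."""
    pivot = factor.median() if center == "median" else factor.mean()
    longs = factor > pivot
    shorts = factor < pivot
    weights = pd.Series(0.0, index=factor.index)
    # Each side gets half of the gross exposure, spread equally.
    weights[longs] = 0.5 / longs.sum()
    weights[shorts] = -0.5 / shorts.sum()
    return weights


# Made-up factor values; "ES" is an extreme negative outlier.
factor = pd.Series(
    {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0, "ES": -3000.0}
)

by_mean = equal_weights(factor, center="mean")
by_median = equal_weights(factor, center="median")

print(by_mean)
print(by_median)
```

With the mean as the pivot, the outlier drags the split point far below every other asset, so ES ends up as the only short. With the median, the six assets split three against three.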

luca-s force-pushed the pyfolio_integration branch 3 times, most recently from d1bd17b to 63f7ea6, on January 13, 2018 18:18
@luca-s
Collaborator Author

luca-s commented Jan 13, 2018

I also found the reason for the Exposure plot being blank: it turned out to be a pyfolio bug.

@twiecki
Contributor

twiecki commented Jan 14, 2018

> By the way, the change I made to the weights computation is that factor values above the median become long positions, while factor values below the median become short positions. The previous behaviour was very similar except that I used the mean instead of the median; that's why the extremely large negative factor value for ES made it the only short position.

That makes sense. Ideally I think that would be configurable as well, e.g.: lower_percentile and upper_percentile.

@twiecki
Contributor

twiecki commented Jan 14, 2018

Also, seems like sub-sampling doesn't quite do what we want: ideally we wouldn't exit the positions but hold them for the whole week. I suppose one would need to have the same signal for all days in the week to achieve that.

@luca-s
Collaborator Author

luca-s commented Jan 15, 2018

> That makes sense. Ideally I think that would be configurable as well, e.g.: lower_percentile and upper_percentile.

For now it's possible to choose which quantile to use, eventually we can add the percentile option if the quantile configuration is not flexible enough.

> Also, seems like sub-sampling doesn't quite do what we want: ideally we wouldn't exit the positions but hold them for the whole week. I suppose one would need to have the same signal for all days in the week to achieve that.

The portfolio holds the positions for 1 day because the code calls create_pyfolio_input(period='1D', ... ). If we switched period to '5D', which is one of the periods computed by get_clean_factor_and_forward_returns, the positions would be held for 5 days. The reason I used '1D' is that I didn't find a good example for the 5-day period. I didn't want to give the misleading idea that there is a good reason to trade a 5-day signal every 5 days.

I believe that rebalancing every 5 days is not the best way of trading a 5-day signal. A better way would be to trade 1/5th of the portfolio on each subsequent day and rebalance each 1/5th sub-portfolio every 5 days. This would result in the same transaction costs, but the slippage impact would be 1/5th, the portfolio capacity would be 5 times bigger, the volatility of the portfolio would be lower, and the factor would be traded every single day, making it more statistically robust and independent of the starting day.
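
The staggered scheme described above can be approximated by averaging the most recent daily target portfolios. The sketch below is an assumption-laden simplification: `staggered_weights` is a hypothetical helper, the target weights are made up, and price drift inside each sub-portfolio between rebalances is ignored.

```python
import pandas as pd


def staggered_weights(target_weights, holding_days):
    """Combine `holding_days` overlapping sub-portfolios.

    Each day one sub-portfolio (1/holding_days of the capital) is set to
    that day's target weights and then held, so the combined book is the
    sum of the latest `holding_days` target rows divided by holding_days.
    During the first days only part of the capital is invested.
    """
    rolled = target_weights.rolling(holding_days, min_periods=1).sum()
    return rolled / holding_days


# Made-up daily target weights for two assets over six business days.
targets = pd.DataFrame(
    {
        "A": [0.5, 0.5, 0.5, -0.5, -0.5, -0.5],
        "B": [-0.5, -0.5, -0.5, 0.5, 0.5, 0.5],
    },
    index=pd.bdate_range("2018-01-08", periods=6),
)

combined = staggered_weights(targets, holding_days=5)
print(combined)
```

The combined weight of asset A moves gradually from long to short as the older sub-portfolios are replaced one at a time, which is where the lower turnover per day comes from.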

I can still modify the NB to show the usage of 5 days period traded every Monday, except I need a good excuse to show that.

@twiecki
Contributor

twiecki commented Jan 15, 2018

If the quantile is already used, where does the median (or mean) value come in when building the portfolio?

The holding period question is tricky indeed. Although I think trading a 5-day signal every 5 days is a pretty simple method to go with as a default.

@luca-s
Collaborator Author

luca-s commented Jan 16, 2018

> If the quantile is already used, where does the median (or mean) value come in when building the portfolio?

I am not sure I understand your question. This is how I implemented it: the quantiles option of the create_pyfolio_input function selects the quantiles that will be used in the portfolio. The assets belonging to the selected quantiles become long positions if their factor values are above the median and short positions otherwise. This ensures the same number of assets in long and short positions. The user can choose which quantiles to use to increase or decrease the number of assets traded (e.g. quantiles=[1,5] vs quantiles=[1,2,4,5]). That's not exactly the same as choosing the percentiles, but it's something.
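
The selection-then-median-split logic described here might be sketched as follows for a single date. This is not the PR's implementation: `quantile_equal_weights` is a hypothetical helper, the data is made up, and only the 'factor' and 'factor_quantile' column names follow the output of get_clean_factor_and_forward_returns.

```python
import pandas as pd


def quantile_equal_weights(factor_data, quantiles):
    """Equal-weight long/short weights for one date.

    Keeps only assets in the requested quantiles, then goes long the
    ones above the median factor value and short the rest.
    """
    selected = factor_data[factor_data["factor_quantile"].isin(quantiles)]
    median = selected["factor"].median()
    longs = selected["factor"] > median
    weights = pd.Series(-1.0, index=selected.index)
    weights[longs] = 1.0
    # Normalise so the absolute weights sum to one.
    return weights / weights.abs().sum()


# Ten made-up assets, two per quantile.
data = pd.DataFrame(
    {
        "factor": [-9.0, -7.0, -3.0, -2.0, 0.0, 1.0, 2.0, 3.0, 6.0, 8.0],
        "factor_quantile": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    },
    index=list("ABCDEFGHIJ"),
)

narrow = quantile_equal_weights(data, quantiles=[1, 5])
wide = quantile_equal_weights(data, quantiles=[1, 2, 4, 5])
```

Here `narrow` trades four assets (two long, two short) and `wide` trades eight, which is the sense in which the quantiles option controls how many assets are traded.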

> The holding period question is tricky indeed. Although I think trading a 5-day signal every 5 days is a pretty simple method to go with as a default.

Ok then, I'll update the NB.

@twiecki
Contributor

twiecki commented Jan 16, 2018

Oh I see. So first you select whatever quantiles the user specified, e.g. [1, 2, 5] (which makes no sense) and then you do a median split inside that selection. So the algo would go long from 2.5 to 5 and short on 1 to 2.5 (there are no 2.5s but it's based on the actual values). Correct?

@luca-s
Collaborator Author

luca-s commented Jan 16, 2018

Exactly but please let me know if you have a better idea. Eventually I'd like to add your idea of a custom weighting function though, so the users can do what they like

@twiecki
Contributor

twiecki commented Jan 16, 2018

OK, that makes sense. An alternative would be to require the user to specify long_quantiles=[4, 5], short_quantiles=[1, 2] to make it explicit. Although I think the current one is simpler and probably foolproof as well.
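
For comparison, the explicit alternative mentioned here could look roughly like this. The `long_quantiles` and `short_quantiles` arguments are the ones proposed in the comment, not an existing alphalens option, and the helper and data are hypothetical.

```python
import pandas as pd


def explicit_quantile_weights(factor_quantile, long_quantiles, short_quantiles):
    """Equal weights from explicit long/short quantile lists (hypothetical)."""
    longs = factor_quantile.isin(long_quantiles)
    shorts = factor_quantile.isin(short_quantiles)
    weights = pd.Series(0.0, index=factor_quantile.index)
    # Half of the gross exposure on each side, spread equally.
    weights[longs] = 0.5 / longs.sum()
    weights[shorts] = -0.5 / shorts.sum()
    return weights


# Made-up quantile assignments for seven assets on one date.
quantile = pd.Series([1, 1, 2, 3, 4, 5, 5], index=list("ABCDEFG"))

weights = explicit_quantile_weights(
    quantile, long_quantiles=[4, 5], short_quantiles=[1, 2]
)
```

Unlike the median split, this version does not guarantee the same number of assets on each side; it only happens to be balanced in this example.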

NB looks great too. I will try to review or find someone to review the code in more detail.

@richafrank Do you know of someone who could help review this new feature?

@luca-s
Collaborator Author

luca-s commented Jan 16, 2018

Making the long/short quantiles explicit would be nicer, but then we would still need the quantiles option to handle the factor-weighted scenario, where the factor value implies the long/short positions. So, to avoid a proliferation of function arguments, I chose this path.

@twiecki
Contributor

twiecki commented Jan 23, 2018

Ping @richafrank.

@richafrank
Member

Thanks for the ping. Sorry I lost track of this. Will find someone!

@luca-s luca-s mentioned this pull request Jan 31, 2018
@@ -379,6 +386,11 @@ def cumulative_returns(returns, period, freq=None):
    if freq is None:
        freq = returns.index.freq

    if freq is None:

You might consider combining this with the if statement on line 387.

Collaborator Author

@luca-s luca-s Feb 6, 2018

You mean moving that "if" inside the previous one? uhmm, I don't like too many levels of indentation

Just that both if statements are evaluating freq is None so the statements within each if statement can be combined into one block.

Collaborator Author

That would change the logic as freq is changed inside the first "if" statement

Ah! Good point!
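
The point settled in this exchange is that the second check sees the value assigned by the first, so the two blocks act as a chain of fallbacks rather than a duplicated test. A standalone sketch follows; `resolve_freq` is a hypothetical helper and the BDay default is an assumption, since the thread does not show the final fallback value.

```python
import pandas as pd
from pandas.tseries.offsets import BDay


def resolve_freq(index, freq=None):
    """Pick a frequency: explicit argument, then the index, then a default."""
    if freq is None:
        # First fallback: the index may or may not carry a frequency.
        freq = index.freq
    if freq is None:
        # Second fallback: reached only when the index has no freq either.
        # Merging this into the block above would apply the default even
        # when the index does provide a frequency.
        freq = BDay()
    return freq


with_freq = pd.date_range("2018-01-08", periods=3, freq="D")
without_freq = pd.DatetimeIndex(["2018-01-08", "2018-01-10", "2018-01-11"])

print(resolve_freq(with_freq))
print(resolve_freq(without_freq))
```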

    if freq is None:
        freq = weights.index.freq

    if freq is None:

Same comment about this if statement and line 528

    group = group.copy()

    if _equal_weight:
        if _demeaned:
            # top assets positive weights, bottom ones negative

Using median instead of mean is slightly confusing. It might need further clarification. Maybe update the docstring to "...If demeaned is True then the factor universe will be split in two equal sized groups,..."?

Collaborator Author

I like "equal sized groups". I will update that, thanks.

@twiecki
Contributor

twiecki commented Feb 8, 2018

Waiting on @prsutherland's sign-off before merging.

@luca-s
Collaborator Author

luca-s commented Feb 13, 2018

@prsutherland any more comments on this PR?

@twiecki
Contributor

twiecki commented Feb 15, 2018

OK, I think this went through some solid review. Going to merge this -- really cool feature @luca-s!

@twiecki twiecki merged commit b9fdc8f into quantopian:master Feb 15, 2018
@luca-s luca-s deleted the pyfolio_integration branch February 15, 2018 14:11
5 participants