Bin edges must be unique #131

Closed
DMTSource opened this issue Jan 30, 2017 · 3 comments

@DMTSource

I'm trying to use Alphalens on Quantopian Research, but I frequently get this error when creating a factor tear sheet:
"ValueError: Bin edges must be unique: array([ 1. , 1.42857143, 1.85714286, 2.42857143, 3.07142857, 3.5 , 3.5 , 3.5 ])"

ValueError Traceback (most recent call last)
in ()
2 prices=pricing,
3 quantiles=7,
----> 4 periods=(1,5,10,20))

/usr/local/lib/python2.7/dist-packages/alphalens/plotting.pyc in call_w_context(*args, **kwargs)
40 # sns.set_style("whitegrid")
41 sns.despine(left=True)
---> 42 return func(*args, **kwargs)
43 else:
44 return func(*args, **kwargs)

/usr/local/lib/python2.7/dist-packages/alphalens/tears.pyc in create_factor_tear_sheet(factor, prices, groupby, show_groupby_plots, periods, quantiles, filter_zscore, groupby_labels, long_short, avgretplot, turnover_for_all_periods)
124 quantile_factor = perf.quantize_factor(factor,
125 by_group=False,
--> 126 quantiles=quantiles)
127
128 def compound_returns(period_ret):

/usr/local/lib/python2.7/dist-packages/alphalens/performance.pyc in quantize_factor(factor, quantiles, by_group)
253
254 factor_percentile = factor.groupby(level=grouper)
--> 255 factor_quantile = factor_percentile.apply(quantile_calc, quantiles=quantiles)
256 factor_quantile.name = 'quantile'
257

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in apply(self, func, *args, **kwargs)
649 # ignore SettingWithCopy here in case the user mutates
650 with option_context('mode.chained_assignment', None):
--> 651 return self._python_apply_general(f)
652
653 def _python_apply_general(self, f):

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in _python_apply_general(self, f)
653 def _python_apply_general(self, f):
654 keys, values, mutated = self.grouper.apply(f, self._selected_obj,
--> 655 self.axis)
656
657 return self._wrap_applied_output(

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in apply(self, f, data, axis)
1525 # group might be modified
1526 group_axes = _get_axes(group)
-> 1527 res = f(group)
1528 if not _is_indexed_like(res, group_axes):
1529 mutated = True

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in f(g)
645 @wraps(func)
646 def f(g):
--> 647 return func(g, *args, **kwargs)
648
649 # ignore SettingWithCopy here in case the user mutates

/usr/local/lib/python2.7/dist-packages/alphalens/performance.pyc in quantile_calc(x, quantiles)
248
249 def quantile_calc(x, quantiles):
--> 250 return pd.qcut(x, quantiles, labels=False) + 1
251
252 grouper = ['date', 'group'] if by_group else ['date']

/usr/local/lib/python2.7/dist-packages/pandas/tools/tile.pyc in qcut(x, q, labels, retbins, precision)
171 bins = algos.quantile(x, quantiles)
172 return _bins_to_cuts(x, bins, labels=labels, retbins=retbins,
--> 173 precision=precision, include_lowest=True)
174
175

/usr/local/lib/python2.7/dist-packages/pandas/tools/tile.pyc in _bins_to_cuts(x, bins, right, labels, retbins, precision, name, include_lowest)
190
191 if len(algos.unique(bins)) < len(bins):
--> 192 raise ValueError('Bin edges must be unique: %s' % repr(bins))
193
194 if include_lowest:

ValueError: Bin edges must be unique: array([ 1. , 1.42857143, 1.85714286, 2.42857143, 3.07142857,
3.5 , 3.5 , 3.5 ])

<matplotlib.figure.Figure at 0x7f27916950d0>

@luca-s
Collaborator

luca-s commented Jan 30, 2017

This is due to the pandas.qcut implementation, which has no workaround for data/quantile combinations that result in non-unique bin edges. There is a fix in pandas 0.20.0, but we have to wait before making use of it in Alphalens.
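For readers hitting this outside Alphalens, here is a minimal sketch of the failure and of the pandas-side fix. It assumes pandas 0.20.0 or later, where `pandas.qcut` accepts a `duplicates` keyword; the `values` series is made-up illustration data with many ties at the top, not the factor from the report.

```python
import pandas as pd

# Hypothetical factor values: half of them are tied at the maximum.
values = pd.Series([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 3.5, 3.5, 3.5, 3.5])

# With 7 quantiles, several quantile edges land on the same tied value,
# so qcut cannot build 7 distinct bins and raises ValueError.
try:
    pd.qcut(values, 7, labels=False)
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

# duplicates="drop" discards the repeated edges instead of raising.
# The result then has fewer buckets than the number requested.
dropped = pd.qcut(values, 7, labels=False, duplicates="drop")
```

Note that dropping edges changes the number of buckets from date to date, which may or may not be acceptable for a quantile analysis.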

Anyway, what you could do is decrease the number of quantiles, or use the "bin" option and set "quantiles=None". For details see here.

Let me know if this helps.
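The two suggestions above can be sketched in plain pandas. This is only an illustration under assumptions: the `values` series is made-up data, and equal-width binning with `pandas.cut` is assumed to be what the "bin" option amounts to; check the linked details for how Alphalens actually applies it.

```python
import pandas as pd

# Hypothetical factor values: half of them are tied at the maximum.
values = pd.Series([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 3.5, 3.5, 3.5, 3.5])

# Fewer quantiles: with 2 buckets the edges (min, median, max) are
# distinct for this data, so qcut succeeds.
fewer = pd.qcut(values, 2, labels=False) + 1

# Equal-width bins: edges are spaced evenly between min and max, so
# tied values cannot make two edges coincide. Some bins may be empty.
equal_width = pd.cut(values, 7, labels=False) + 1
```

How many quantiles are "few enough" depends on how heavily the data is tied, so the first option may need adjusting per dataset.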

@luca-s
Collaborator

luca-s commented Feb 1, 2017

I forgot to add that another workaround would be to rank the factor and pass that to Alphalens (factor.rank(method='first')) instead of the plain factor. Anyway, I am closing this, as it has already been discussed in #87, #114 and #104.
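A minimal sketch of why ranking helps, using made-up data rather than a real factor: `rank(method='first')` breaks ties by order of appearance, so every value gets a distinct rank and the quantile edges can no longer coincide.

```python
import pandas as pd

# Hypothetical factor values: half of them are tied at the maximum.
values = pd.Series([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 3.5, 3.5, 3.5, 3.5])

# Ties are broken by position, so the ranks are all distinct.
ranked = values.rank(method="first")

# qcut on the ranks now yields all 7 requested buckets.
quantile = pd.qcut(ranked, 7, labels=False) + 1
```

The trade-off is that tied values are split across buckets arbitrarily, based on their order in the series.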

@luca-s luca-s closed this as completed Feb 1, 2017
@0xstochastic

using .rank(method='first') when qcut gave error "Bin edges must be unique:" just saved me in a context outside of Alphalens. Thanks @luca-s 👍
