New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add MACD, MA, and AnnVol as built in factors #1588

Merged
merged 11 commits into from Nov 28, 2016

Conversation

Projects
None yet
6 participants
@ahgnaw
Contributor

ahgnaw commented Nov 9, 2016

These factors will be used in Quabbin and might be useful to everyone in the community.

inputs = [USEquityPricing.close]
def compute(self, today, assets, out, closes):
out[:] = nanstd(closes, ddof=1, axis=0) * (252 ** 0.5)

This comment has been minimized.

@ssanderson

ssanderson Nov 9, 2016

Member

We run with trading calendars other than NYSE in zipline. I don't think hard-coding 252 makes sense here.

@coveralls

This comment has been minimized.

coveralls commented Nov 9, 2016

Coverage Status

Coverage decreased (-0.06%) to 86.932% when pulling acc969a on randc-built-in-factors into b309e3c on master.

**Default Inputs:** None
**Default Window Length:** None
"""
def compute(self, today, assets, out, data):

This comment has been minimized.

@richafrank

richafrank Nov 9, 2016

Member

I think this exists already as SimpleMovingAverage!

@coveralls

This comment has been minimized.

coveralls commented Nov 9, 2016

Coverage Status

Coverage decreased (-0.07%) to 86.927% when pulling 57827f2 on randc-built-in-factors into b309e3c on master.

@ahgnaw ahgnaw force-pushed the randc-built-in-factors branch 3 times, most recently from 9ef7ee7 to 0959ed5 Nov 14, 2016

@coveralls

This comment has been minimized.

coveralls commented Nov 15, 2016

Coverage Status

Coverage decreased (-0.4%) to 86.648% when pulling 0959ed5 on randc-built-in-factors into de25694 on master.

@ahgnaw ahgnaw force-pushed the randc-built-in-factors branch from 0959ed5 to e44e5cc Nov 17, 2016

@coveralls

This comment has been minimized.

coveralls commented Nov 17, 2016

Coverage Status

Coverage decreased (-0.04%) to 86.985% when pulling e44e5cc on randc-built-in-factors into 3c94ee7 on master.

@ahgnaw ahgnaw force-pushed the randc-built-in-factors branch from e44e5cc to c7443d0 Nov 17, 2016

slow_period : int >= 0, <= window_length
The window length for the "slow" EMA.
signal_period' : int >= 0, <= slow_period
The window length for the signal line.

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

I might include in here the default values for these parameters (12, 26, 9), or say that they're optional

@coveralls

This comment has been minimized.

coveralls commented Nov 17, 2016

Coverage Status

Coverage increased (+0.01%) to 86.992% when pulling c7443d0 on randc-built-in-factors into 515db18 on master.

*args, **kwargs
)
def calculate_ewa(self, data, length):

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

Is there a reason you call it "EWA" here but "EMA" above? (assuming both of these mean exponentially weighted moving average)

**Default Inputs:** :data:`zipline.pipeline.data.USEquityPricing.close`
"""
inputs = [USEquityPricing.close]

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

According to the wikipedia link cited: "In finance, volatility is the degree of variation of a trading price series over time as measured by the standard deviation of returns." Would this default value be better off as Returns?

This comment has been minimized.

@ssanderson

ssanderson Nov 17, 2016

Member

👍 on this question. Calculating this as stddev of price seems incorrect to me.

@@ -62,4 +64,6 @@
'TrueRange',
'VWAP',
'WeightedAverageValue',
'MovingAverageConvergenceDivergence',
'MACD'

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

AnnualVolatility was not added here

params = {'annualization_factor': 252}
def compute(self, today, assets, out, closes, annualization_factor):
out[:] = nanstd(closes, ddof=1, axis=0) * (annualization_factor ** 0.5)

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

you want axis=1 here. axis=0 would calculate the standard deviation between stocks on a day

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

also ddof can be 0

out.hist[:] = hist
class AnnualVolatility(CustomFactor):

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

AnnualizedVolatility is more clear here

axis=1,
weights=self.weights(length, decay_rate))
def rolling_window(self, a, window):

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 17, 2016

Contributor

this can use rolling_windowin zipline.utils.numpy_utils

self.params['signal_period'])
hist = macd[-1] - signal_line
return macd[-1], signal_line[-1], hist[-1]
except:

This comment has been minimized.

@ssanderson

ssanderson Nov 17, 2016

Member

We should only be catching the exceptions we expect to see here.

@ahgnaw ahgnaw force-pushed the randc-built-in-factors branch 2 times, most recently from 55e902a to 108530b Nov 21, 2016

@ahgnaw ahgnaw force-pushed the randc-built-in-factors branch from 108530b to 435d5ac Nov 21, 2016

@coveralls

This comment has been minimized.

coveralls commented Nov 21, 2016

Coverage Status

Coverage increased (+0.03%) to 87.147% when pulling 435d5ac on randc-built-in-factors into f3a36fe on master.

@coveralls

This comment has been minimized.

coveralls commented Nov 21, 2016

Coverage Status

Coverage increased (+0.03%) to 87.147% when pulling 435d5ac on randc-built-in-factors into f3a36fe on master.

@coveralls

This comment has been minimized.

coveralls commented Nov 21, 2016

Coverage Status

Coverage increased (+0.03%) to 87.147% when pulling 435d5ac on randc-built-in-factors into f3a36fe on master.

out = np.empty(shape=(nassets,), dtype=np.float64)
ann_vol.compute(today, assets, out, returns, 252)
mean = returns.sum(axis=0) / returns.shape[0]

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 22, 2016

Contributor

This could be np.mean(returns, axis=0)

window_length = 252
def compute(self, today, assets, out, returns, annualization_factor):
out[:] = nanstd(returns, ddof=0, axis=0) * (annualization_factor ** .5)

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 22, 2016

Contributor

ddof is zero by default

This comment has been minimized.

@ahgnaw

ahgnaw Nov 22, 2016

Contributor

I wanted to make it explicit since we use ddof=1 in empyrical.

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

@ahgnaw, if we think that ddof=0 is correct here, should we change empyrical? I think they're being used in the same way.

This comment has been minimized.

@twiecki

twiecki Nov 23, 2016

Contributor

Shouldn't we use empyrical for this computation in the first place?

This comment has been minimized.

@ahgnaw

ahgnaw Nov 23, 2016

Contributor

We should, I have a pr for changing empyrical to use ddof=0 consistently. Zipline regression tests will have to be updated so in the meantime I will leave this defined here with the intention of changing to empyrical later.

ann_vol.compute(today, assets, out, returns, 252)
expected_vol = np.array([0] * nassets)

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 22, 2016

Contributor

This could be np.zeros(nassets)

)
expected_macd = np.array([0.01691553] * nassets)
expected_signal = np.array([0.01691553] * nassets)

This comment has been minimized.

@dmichalowicz

dmichalowicz Nov 22, 2016

Contributor

How did you come up with this constant (0.01691553) here?

This comment has been minimized.

@ahgnaw

ahgnaw Nov 22, 2016

Contributor

Good point, I'll use a different (less efficient) method to calculate MACD and check against that.

@dmichalowicz

This comment has been minimized.

Contributor

dmichalowicz commented Nov 22, 2016

Had a couple small things but otherwise looks good to me. @ssanderson Did you have any other comments on this?

@@ -683,3 +679,115 @@ def compute(self, today, assets, out, highs, lows, closes):
)),
2
)
class MovingAverageConvergenceDivergence(_ExponentialWeightedFactor):

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

Why is this inheriting from _ExponentialWeightedFactor I don't think it will work if you call any of the inherited classmethods.

This comment has been minimized.

@ahgnaw

ahgnaw Nov 22, 2016

Contributor

self.weights() is used for calculate_ewma, just realized it's a static method

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

We should just pull that into a free function.

self.params['fast_period'])
macd = fast_EWMA - slow_EWMA
signal_line = self.calculate_ewma(
macd.reshape(-1, self.params['signal_period']),

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

What does the reshape(-1 do here?

def compute(self, today, assets, out, close, fast_period, slow_period,
signal_period):
macd, sig, hist = zip(*map(self.calculate_macd, close.T))

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

Why are calling calclulate_macd separately on each column of close? It seems like we should be able to do it vectorized on the whole 2d array?

This comment has been minimized.

@ahgnaw

ahgnaw Nov 23, 2016

Contributor

GP

slow_EWMA = self.calculate_ewma(
rolling_window(
col,
self.params['slow_period']

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

We should pull out all the params values once instead of multiple times per value.

trading days per year, 252.
"""
inputs = [Returns(window_length=2)]
params = {'annualization_factor': 252}

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

When would you not want annualization_factor to match the window length? If we want to make the length configurable, more natural to me would just be to have:

class Volatility(CustomFactor):
    inputs = [Returns(...)]

    def compute(self, today, assets, out):
        out[:] =  nanstd(returns, ddof=0, axis=0) * (self.window_length ** .5)

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

And then, we could have an AnnualizedVolatility defined as:

class AnnualizedVolatility(Volatility):
    window_length = 252

This comment has been minimized.

@ahgnaw

ahgnaw Nov 23, 2016

Contributor

Annualized volatility doesn't prescribe a minimum or maximum sample size. The annualization factor is derived from the time interval of the data, e.g. daily returns.

@@ -62,4 +65,7 @@
'TrueRange',
'VWAP',
'WeightedAverageValue',
'MovingAverageConvergenceDivergence',

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

The rest of the entries here were in sorted order prior to these additions. It's nice for a reader looking for an entry to maintain that order.

@@ -32,6 +32,9 @@
TrueRange,
VWAP,
WeightedAverageValue,
MovingAverageConvergenceDivergence,

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

Same note as below re: sorting of the imports here.

-------
MACD: The difference between "fast" EWMA and "slow" EWMA.
signal: The EWMA of the MACD line using `signal_period` as span.
hist: Difference between MACD and signal. (Divergence series)

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

Why is this called hist? I don't have much intuition for that name.

Returns
-------
MACD: The difference between "fast" EWMA and "slow" EWMA.

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

Do we expect people to use the outputs from MACD other than signal? Having multiple outputs makes it significantly easier to misuse, so I wonder if we'd be better off with a simpler API.

signal_period=9,
*args,
**kwargs):
return super(MovingAverageConvergenceDivergence, cls).__new__(

This comment has been minimized.

@ssanderson

ssanderson Nov 22, 2016

Member

We should raise an error here if slow_period <= fast_period.

@coveralls

This comment has been minimized.

coveralls commented Nov 23, 2016

Coverage Status

Coverage decreased (-0.05%) to 87.066% when pulling 3363237 on randc-built-in-factors into f3a36fe on master.

@coveralls

This comment has been minimized.

coveralls commented Nov 23, 2016

Coverage Status

Coverage decreased (-0.01%) to 87.107% when pulling 3363237 on randc-built-in-factors into f3a36fe on master.

ssanderson added some commits Nov 28, 2016

TEST: Use parameter_space for randomized tests.
- Use a RandomState with a seed so that we have repeatible results.
- Use `randint` instead of `random_integers.` `random_integers` is
  deprecated.
- Use `parameter_space` to test multiple period lengths.
MAINT: Put exponential_weights where it's used.
`math_utils` is mostly a shim around bottleneck imports.  If we need
this somewhere else, it probably belongs in `numpy_utils`.
MAINT: Tweaks/cleanups in technical.py.
- Use `expect_bounded` to check inputs.
- Add tests for expected failures from `MACDSignal`.
- Use `float64` instead of `float` in a few places.  This prevents
  diverging behavior on 32-bit systems.
- Docstring edits.
@coveralls

This comment has been minimized.

coveralls commented Nov 28, 2016

Coverage Status

Coverage increased (+0.1%) to 87.223% when pulling 37f5826 on randc-built-in-factors into f3a36fe on master.

@ssanderson ssanderson merged commit 74df429 into master Nov 28, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@ssanderson ssanderson deleted the randc-built-in-factors branch Nov 28, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment