Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Support Pipeline arithmetic that combines more than 32 terms #2727

Merged
merged 1 commit into from
Jun 23, 2020

Conversation

richafrank
Copy link
Member

No description provided.

@richafrank
Copy link
Member Author

@llllllllll I'm curious your thoughts on the resulting tree - not very balanced...

@llllllllll
Copy link
Contributor

As an experiment, can you see what the performance looks like if we never merge nodes? I personally believe that arithmetic folding should happen in the engine so that we can make high level scheduling decisions, for example: memory vs speed. I think that folding in the engine would also make it easier to balance, since we have the whole tree at that point.

@coveralls
Copy link

coveralls commented Jun 19, 2020

Coverage Status

Coverage increased (+0.004%) to 88.21% when pulling f35ef51 on i-thought-the-answer-was-42 into 0fe71b2 on master.

@richafrank
Copy link
Member Author

richafrank commented Jun 22, 2020

Local benchmarking of never-merging experiment with following pipeline:

from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import Returns, CustomFactor

terms_list = [Returns(window_length=i) for i in range(2, 33)]  # 31 terms

def make_pipeline():
    combined_factor = sum(terms_list)

    return Pipeline(
        columns={
            'combined_factor': combined_factor
        }
    )

31 terms: merging: 10 years: ~127s
31 terms: no merging: 10 years: ~127s

I tried also with 2 terms:
2 terms: merging: 10 years: 12s
2 terms: no merging: 10 years: 11s

From talking with @llllllllll about further optimization, there's more work to be done, so going to merge this for now, since it unlocks some larger pipelines.

@richafrank richafrank merged commit b661d07 into master Jun 23, 2020
@richafrank richafrank deleted the i-thought-the-answer-was-42 branch June 23, 2020 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants