Documentation example: "as of" cache control flag for backtesting #473

Closed
stevesimmons opened this issue Dec 12, 2021 · 3 comments · Fixed by #528
Labels
docs Documentation and examples

@stevesimmons

Feature description

I'd like an option to use the cache's state as it was at a particular time in the past. By including an "as of" time in the request, the cache would only consider items saved before that time, with staleness and similar criteria calculated relative to it.

This would generally be used with a cache that doesn't expire saved items.
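To make the requested semantics concrete, here is a minimal, library-independent sketch. The `AsOfCache` class and its `save`/`get` methods are hypothetical names for illustration only and are not part of requests-cache. It assumes the cache keeps every saved version of an item, and that "saved before" includes items saved exactly at the "as of" time.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


class AsOfCache:
    """Hypothetical append-only cache that keeps every saved version of a key."""

    def __init__(self) -> None:
        # key -> list of (created_at, value), kept sorted by created_at
        self._versions: Dict[str, List[Tuple[datetime, str]]] = {}

    def save(self, key: str, value: str, created_at: datetime) -> None:
        versions = self._versions.setdefault(key, [])
        versions.append((created_at, value))
        versions.sort(key=lambda item: item[0])

    def get(
        self, key: str, as_of: datetime, max_age: Optional[timedelta] = None
    ) -> Optional[str]:
        versions = self._versions.get(key, [])
        # Number of versions saved at or before as_of (assumption: inclusive)
        idx = bisect_right([created for created, _ in versions], as_of)
        if idx == 0:
            return None  # nothing had been cached yet at as_of
        created_at, value = versions[idx - 1]
        # Staleness is measured relative to as_of, not the present moment
        if max_age is not None and as_of - created_at > max_age:
            return None
        return value


cache = AsOfCache()
cache.save("price", "100", datetime(2021, 1, 1))
cache.save("price", "120", datetime(2021, 6, 1))
assert cache.get("price", as_of=datetime(2021, 3, 1)) == "100"
assert cache.get("price", as_of=datetime(2020, 1, 1)) is None
```

The key point is that both the version lookup and the `max_age` check use `as_of` as "now", which is why this fits best with a cache that never expires saved items.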

Use case

This is a common requirement in financial services, where you want to test algorithms against historical data. Examples include "retrospectives" in credit risk and "back-testing" in trading, where we want to make decisions (approve/decline for credit risk, buy/sell/hold for trading) at some point T in the past, using only data known before that T, and then observe the impact on outcomes between T and right now.

It can also be applied in software engineering, as a source of consistent test data for unit testing.

Workarounds

I'm not aware of any existing workarounds. Caches have a variety of max_age/stale_before flags for querying at the present moment, but nothing (afaik) for querying as if it were a moment in the past.

Plan to implement

Yes, I am interested in implementing such a feature, ideally with guidance to make sure it fits with the design intentions for requests-cache.

@JWCook
Member

JWCook commented Dec 12, 2021

Interesting idea! I think the best way to do this would be to override CachedSession.request(), and check the response.created_at attribute.

Option 1

Here's an example that I think does what you want:

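from datetime import datetime
from typing import Optional

import requests
from requests_cache import CachedSession, set_response_defaults


class BacktestCachedSession(CachedSession):
    # as_of is optional; when omitted, the session behaves like a normal CachedSession
    def request(self, method: str, url: str, as_of: Optional[datetime] = None, **kwargs):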
        response = super().request(method, url, **kwargs)

        # Response was cached after the specified date, so ignore it and send a new request
        if response.created_at and as_of and response.created_at > as_of:
            new_response = requests.request(method, url, **kwargs)
            return set_response_defaults(new_response)
        else:
            return response


def demo():
    session = BacktestCachedSession()
    response = session.get('https://httpbin.org/get')
    response = session.get('https://httpbin.org/get')
    assert response.from_cache is True

    # Response was not cached yet at this point, so we should get a fresh one
    response = session.get('https://httpbin.org/get', as_of=datetime(2020, 1, 1))
    assert response.from_cache is False


if __name__ == '__main__':
    demo()
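One caveat with the comparison in `request()`: Python raises a `TypeError` when ordering a naive datetime against a timezone-aware one. Whether `response.created_at` is naive or aware may depend on the requests-cache version, so that is worth checking. A minimal sketch of a defensive comparison follows; the helper names `_as_utc` and `cached_after` are hypothetical, and it assumes that naive datetimes represent UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def _as_utc(value: datetime) -> datetime:
    """Treat naive datetimes as UTC (an assumption); convert aware ones to UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def cached_after(created_at: Optional[datetime], as_of: Optional[datetime]) -> bool:
    """Return True if a response was cached strictly after the as_of time."""
    if created_at is None or as_of is None:
        # Fresh (uncached) response or no as_of given: nothing to bypass
        return False
    return _as_utc(created_at) > _as_utc(as_of)
```

With a helper like this, the condition in the example above could be written as `if cached_after(response.created_at, as_of):`.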

Option 2

If you're going to be testing other date/time-sensitive parts of your application this way, I'd highly recommend using a tool like time-machine or freezegun. That would let you use a consistent method for backtesting requests-cache and other components without adding a separate keyword arg.

This would just require changing the example above to check against the current system time instead of as_of, and then changing the current time using time_machine.travel():

from datetime import datetime

import requests
import time_machine
from requests_cache import CachedSession, set_response_defaults

class BacktestCachedSession(CachedSession):
    def request(self, method: str, url: str, **kwargs):
        response = super().request(method, url, **kwargs)

        # Response was cached after the (simulated) current time, so ignore it and send a new request
        if response.created_at and response.created_at > datetime.utcnow():
            new_response = requests.request(method, url, **kwargs)
            return set_response_defaults(new_response)
        else:
            return response


def demo():
    session = BacktestCachedSession()
    response = session.get('https://httpbin.org/get')
    response = session.get('https://httpbin.org/get')
    assert response.from_cache is True

    # Response was not cached yet at this point, so we should get a fresh one
    with time_machine.travel(datetime(2020, 1, 1)):
        response = session.get('https://httpbin.org/get')
        assert response.from_cache is False


if __name__ == '__main__':
    demo()

As for where to put this, I think this may be too niche to include in the library itself, but it would be a nice addition to the examples folder, if there are others out there who want to do something similar. What do you think?

@stevesimmons
Author

Thanks for the super quick suggestions, Jordan. I'll do a PR for an example, once my final solution is working.

Also, I hope to include a CosmosDB backend, since I am using Azure. The existing DynamoDB backend looks to be a good starting point.

JWCook added the docs (Documentation and examples) label on Dec 22, 2021
JWCook changed the title from Feature request: "as of" cache control flag for backtesting to Documentation example: "as of" cache control flag for backtesting on Dec 22, 2021
JWCook added this to the v0.9 milestone on Feb 15, 2022
@JWCook
Member

JWCook commented Feb 15, 2022

I added the 2nd example above to the docs. Feel free to add onto examples/time_machine_backtesting.py if you'd like.
