[ENH] add exponential backoff and jitter to embedding calls #1526

Merged

Conversation

rancomp
Contributor

@rancomp rancomp commented Dec 14, 2023

This is a WIP, closes #1524

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • Use tenacity to add exponential backoff and jitter
  • New functionality
    • Let the user control the parameters of the exponential backoff and jitter, or supply their own wait functions from tenacity's API
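
For readers unfamiliar with the technique, here is a minimal standard-library sketch of how an exponential-backoff-with-jitter delay can be computed. It is an illustration only, not the code in this PR and not tenacity's implementation; the function name `backoff_delay` and its default values are assumptions chosen for the example.

```python
import random


def backoff_delay(
    attempt: int,
    initial: float = 1.0,
    exp_base: float = 2.0,
    max_delay: float = 60.0,
    jitter: float = 1.0,
) -> float:
    """Return the delay in seconds before retry number `attempt` (1-based).

    All parameter names and defaults here are illustrative assumptions.
    """
    # Exponential part: initial, initial * base, initial * base**2, ...
    exponential = initial * exp_base ** (attempt - 1)
    # Jitter part: a random offset so that many clients do not retry in lockstep.
    noise = random.uniform(0, jitter)
    # Cap the result so the delay never grows without bound.
    return max(0.0, min(exponential + noise, max_delay))


# Delays for the first seven retries; values vary between runs because of the jitter.
delays = [backoff_delay(n) for n in range(1, 8)]
```

The cap matters in practice: without it, the tenth retry would already wait more than eight minutes.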

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js

Documentation Changes

None


Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

Contributor

@tazarov tazarov left a comment


@rancomp, have a look at my comments.

@@ -31,6 +33,19 @@
logger = logging.getLogger(__name__)


def _retry_call(call):
Contributor


A few comments on this:

  • Does it make sense to move this to a separate util module
  • Does it make sense to have _ for a decorator? Seems odd
  • This decorator seems to erase type info.
  • Does it make sense to add configuration options e.g. how many retries, how much to wait, ignored exceptions?
  • The @retry decorator seems to throw RetryError. Does it make sense to raise the original exception instead, as it will be better DX?

Contributor Author

@rancomp rancomp Dec 15, 2023


Thanks @tazarov!

A few comments on this:

  • Does it make sense to move this to a separate util module

Yes that makes sense. But then again I don't see it used elsewhere at the moment.

  • Does it make sense to have _ for a decorator? Seems odd

I agree with you. It was just a leftover from my first commit where I had it as a private function within the class

  • This decorator seems to erase type info.

Is it because I'm using *args? This can be fixed by explicitly naming the arguments. I should also annotate the return type as Embeddings.

  • Does it make sense to add configuration options e.g. how many retries, how much to wait, ignored exceptions?

Yes, that makes sense. My idea was to have each EmbeddingFunction instance carry its own arguments for tenacity.retry through self.wait, which is what I wanted to achieve with the super().__init__(). Another thing I can do is avoid the super() altogether and just let the call_wrapper_factory check whether self has the attribute wait. If not, set the decorator to use the default wait_exponential_jitter().

  • The @retry decorator seems to throw RetryError. Does it make sense to raise the original exception instead, as it will be better DX?

Important catch here. I'll see how tenacity suggests raising the original exception.

Member


AFAIK there's no way to implement a decorator like this in python <= 3.10 without destroying type info. Our auth and telemetry decorators destroy type info and it's something I would like to fix.

The fact that we can't use a decorator and preserve type info makes me think we shouldn't do it. Instead, we could (in order of my preference off the top of my head):

  • Use a contextmanager
  • Just use tenacity directly wherever we call out to embedding providers.
  • Have an explicit function try_with_retries or something, which takes retry parameters and the relevant method call.
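
To illustrate the third option, here is a minimal standard-library sketch of an explicit helper. Because it takes a zero-argument callable and returns that callable's result type `T`, a type checker can still infer the return type at the call site. The name `try_with_retries` comes from the comment above; its signature, defaults, and the `flaky_embed` function are assumptions made for the example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def try_with_retries(
    fn: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call fn() up to `attempts` times, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original exception
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


calls = {"n": 0}


def flaky_embed(texts: list) -> list:
    """Hypothetical embedding call that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return [[0.0, 1.0] for _ in texts]


# The lambda binds the arguments, so the helper itself needs no *args/**kwargs.
vectors = try_with_retries(lambda: flaky_embed(["a", "b"]), attempts=5)
```

The trade-off is that callers must wrap each call in a lambda or `functools.partial`, which is more verbose than a decorator.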

Contributor Author


Hey @beggers. I pushed a small update addressing @tazarov's remarks.

Regarding your points:

AFAIK there's no way to implement a decorator like this in python <= 3.10 without destroying type info. Our auth and telemetry decorators destroy type info and it's something I would like to fix.

The fact that we can't use a decorator and preserve type info makes me think we shouldn't do it. Instead, we could (in order of my preference off the top of my head):

I'm not familiar with contextlib but I'm looking into it now. I want to ask, IIUC the problem with type-info exists in other places in the repo. Should we push for a solution that solves all of these together?

  • Just use tenacity directly wherever we call out to embedding providers.
  • Have an explicit function try_with_retries or something, which takes retry parameters and the relevant method call.

I can do both. Let me think about these.

@beggers
Member

beggers commented Dec 18, 2023

@rancomp I see this PR is still a draft -- tag me when it's ready for a review and I'll take a look

@rancomp rancomp closed this Dec 19, 2023
@rancomp rancomp reopened this Dec 19, 2023
@rancomp
Contributor Author

rancomp commented Dec 21, 2023

Hey @beggers. I'm finding it challenging to create a straightforward backend along with a user-friendly API for this task. My initial attempt used class attributes to edit the retry parameters, but that doesn't feel right.

Let's start with a simple solution which is your 3rd suggestion.
I added a method EmbeddingFunction.embed_with_retries (in types.py), which simply returns retry(**retry_kwargs)(self.__call__)(input). This basically allows the user to access tenacity directly. I like it because it gives the user direct control over the retry parameters. Anything else either masks retry or makes it cumbersome to edit these parameters. What do you think?

@beggers
Member

beggers commented Dec 23, 2023

Sorry for my lateness here.

I like the approach of having embed_with_retries on the top-level EmbeddingFunction so it's available to all other EmbeddingFunctions. If you get rid of @retry_decorator I'd be happy to have this in our codebase. I agree that giving users direct control over tenacity is the correct flow here.

One other option for us to consider: We could give each EmbeddingFunction a retry_kwargs dict as a field for users to set, and if it's populated we could wrap the actual embedding call in EmbeddingFunctions' __call__s with tenacity retries. In other words, every embedding function's __call__ method would internally check for the existence of the dict and use tenacity to make the call with retries. This could probably be abstracted to embed_with_retries on the top-level EmbeddingFunction but it would accept a function handle, args, and kwargs for the embedding and use the retry_kwargs. WDYT? I'm happy to do this either way.

@rancomp
Contributor Author

rancomp commented Dec 24, 2023

hey @beggers NP.
I was thinking about your other suggestion. I think this could give the user slightly cleaner access to retry through the __call__ method, but at the cost of a "messier" back-end and more parameters for each EmbeddingFunction. Another thing I'm uncertain about is how tenacity plays with multiprocessing. That's another reason why I don't want to change __call__, which should probably be a stable method of the API.

Compare this with the newly added embed_with_retries: Clean back-end, no new parameters to the instantiation, and clear method. The downside is that the user would need to specify the kwargs at every call, or probably wrap it with a lambda function.

I'm leaning towards the embed_with_retries solution because it doesn't change existing methods.

@rancomp rancomp marked this pull request as ready for review December 28, 2023 20:16
Member

@beggers beggers left a comment


hey @rancomp , sorry I dropped this. Looks good! If you fix the merge conflicts I'll run the precommit hooks and get this merged.

@rancomp
Contributor Author

rancomp commented Jan 11, 2024

alright @beggers np!

PS, I got a weird flake8 error message when merging main into my branch:

flake8...................................................................Failed
- hook id: flake8
- exit code: 1

chromadb/utils/embedding_functions.py:717:18: F821 undefined name 'boto3'

Is it worth adding noqa: F821 on this line?

@rancomp
Contributor Author

rancomp commented Jan 16, 2024

merged main (conflicts), but black hook caught some stuff from other modules and corrected it.

@@ -194,6 +195,9 @@ def __call__(self: EmbeddingFunction[D], input: D) -> Embeddings:

        setattr(cls, "__call__", __call__)

    def embed_with_retries(self, input: D, **retry_kwargs: Dict) -> Embeddings:
        return retry(**retry_kwargs)(self.__call__)(input)


Now that I think about it, this retry will kick in even when the errors from OpenAI are non-transient, correct? For example, if a user has reached the maximum spend limit on their OpenAI account, no amount of retrying will fix it until they raise that limit. Do we want to retry on non-transient errors? I guess consumers of ChromaDB should be handling this, no? 🤔

I am a user of ChromaDB, and what I usually see is OpenAI waiting 600 seconds and then returning timeouts. Many people complain about this error on their forum: https://community.openai.com/t/frequent-api-timeout-errors-recently/106903

We should also have a way to avoid waiting 600 seconds and let consumers configure this. Ideally OpenAI would have given us a configuration option, but there does not seem to be one.

thoughts? @tazarov @beggers

Member


I haven't used tenacity directly before (others on the team have and it's what we use elsewhere in the codebase), but it looks like it allows you to set a timeout: https://tenacity.readthedocs.io/en/latest/#stopping . I didn't see anything in my skimming about retrying only certain error codes though I'm sure it's possible.

@rancomp designed this implementation so Chromadb users have full control over the tenacity retry logic so it should be plug-and-play to get this working.

Member

@beggers beggers left a comment


@rancomp sorry for the slog here. CI is currently broken. I'll merge it into this branch and re-run tests once we've fixed it -- no action required from you and we'll get this over the finish line.


@beggers beggers merged commit 9824336 into chroma-core:main Jan 17, 2024
94 of 97 checks passed
@HammadB HammadB changed the title from "[WIP] [ENH] add exponential backoff and jitter to embedding calls" to "[ENH] add exponential backoff and jitter to embedding calls" Jan 17, 2024
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Exponential backoff retries in embedding functions
4 participants