feat: async support for pinecone and openai #326

Merged
jamescalam merged 8 commits into main from james/async-op on Jun 13, 2024

Conversation

@jamescalam (Member) commented on Jun 13, 2024

PR Type

Enhancement


Description

  • Added asynchronous methods (acall) to base encoder class, OpenAI encoder, Azure OpenAI encoder, and Pinecone index client.
  • Implemented asynchronous client initialization for OpenAI, Azure OpenAI, and Pinecone.
  • Added default score threshold values to OpenAI model configurations.
  • Enhanced RouteLayer with asynchronous methods for route retrieval and encoding (see the usage sketch after this list).
  • Added threshold attribute to EncoderInfo class.
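
A minimal usage sketch of the new async path (hypothetical routes, default local index; assumes acall accepts the query text just like the synchronous call, per the description above):

import asyncio

from semantic_router import Route, RouteLayer
from semantic_router.encoders import OpenAIEncoder

# Hypothetical routes, for illustration only.
routes = [
    Route(name="chitchat", utterances=["how are you?", "lovely weather today"]),
    Route(name="politics", utterances=["who should I vote for?"]),
]
layer = RouteLayer(encoder=OpenAIEncoder(), routes=routes)

async def main():
    # acall awaits the async encoder client added in this PR (and, when a
    # PineconeIndex is configured, the async index client) instead of blocking.
    choice = await layer.acall("what do you think of the election?")
    print(choice.name)

asyncio.run(main())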

Changes walkthrough 📝

Relevant files
Enhancement

base.py (semantic_router/encoders/base.py): Add asynchronous call method to base encoder class. (+3/-0)
  • Added acall method to the base encoder class for asynchronous support.

openai.py (semantic_router/encoders/openai.py): Add asynchronous support and default values to OpenAI encoder. (+60/-5)
  • Added asynchronous client initialization.
  • Implemented acall method for asynchronous API calls.
  • Added default score threshold values to model configurations.

zure.py (semantic_router/encoders/zure.py): Add asynchronous support to Azure OpenAI encoder.
  • Added asynchronous client initialization.
  • Implemented acall method for asynchronous API calls.

pinecone.py (semantic_router/index/pinecone.py): Add asynchronous support to Pinecone index client. (+128/-1)
  • Added asynchronous client initialization.
  • Implemented asynchronous methods for index creation, querying, and description.

layer.py (semantic_router/layer.py): Add asynchronous support to RouteLayer. (+99/-1)
  • Added asynchronous methods for route retrieval and encoding.
  • Implemented acall method for asynchronous route layer operations.

schema.py (semantic_router/schema.py): Add threshold attribute to EncoderInfo class. (+1/-0)
  • Added threshold attribute to EncoderInfo class.

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

@jamescalam changed the title from "James/async op" to "feat: async support for pinecone and openai" on Jun 13, 2024
@github-actions bot added the "enhancement (Enhancement to existing features)" and "Review effort [1-5]: 4" labels on Jun 13, 2024

    PR Reviewer Guide 🔍

⏱️ Estimated effort to review [1-5]: 4

🧪 Relevant tests: No

🔒 Security concerns: No

    ⚡ Key issues to review

    Possible Bug:
The exponential backoff used in the asynchronous (acall) methods of the OpenAI and Azure OpenAI encoders does not handle the case where all retries fail: the loop exits without raising or logging a clear error for the final attempt, so failures may go unreported (a sketch of one possible fix follows this list).

    Error Handling:
    In the Pinecone async client initialization (_initialize_async_client), the User-Agent header is set incorrectly with source_tag=semanticrouter. This should be part of the query parameters or another part of the request, not the User-Agent.

    Inconsistent API Key Handling:
    In the Pinecone index client, the API key is fetched from the environment variable and used directly. It would be better to handle this more securely or uniformly, ensuring that API keys are not logged or mishandled.

    Logging and Error Messages:
    There are multiple instances where logging and error handling could be improved. For example, the error handling in the async methods does not always log the exception details, making debugging more difficult.
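
A minimal sketch of one way to close the retry gap noted under "Possible Bug": log each failed attempt and raise once every retry is exhausted. The helper name _embed_with_retries is hypothetical; the client, model, and asleep/logger names mirror the openai.py snippet quoted in the suggestions below.

from asyncio import sleep as asleep
import logging

from openai import OpenAIError

logger = logging.getLogger(__name__)

async def _embed_with_retries(async_client, docs, model, dimensions):
    error_message = ""
    for j in range(1, 7):
        try:
            embeds = await async_client.embeddings.create(
                input=docs, model=model, dimensions=dimensions
            )
            if embeds.data:
                return embeds
        except OpenAIError as e:
            # Log the exception details so failed attempts are debuggable.
            error_message = str(e)
            logger.warning(f"Attempt {j} failed: {error_message}. Retrying in {2**j} seconds...")
            await asleep(2**j)
    # All retries failed: raise instead of leaving the caller with an unbound result.
    raise ValueError(f"OpenAI API call failed after retries. Error: {error_message}")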


    PR Code Suggestions ✨

    Error handling
    Add error handling for the initialization of async_client

    Add error handling for the initialization of async_client to ensure it is properly set up
    before use, similar to the existing error handling for client.

    semantic_router/encoders/openai.py [72-74]

    -self.async_client = openai.AsyncClient(
    -    base_url=base_url, api_key=api_key, organization=openai_org_id
    -)
    +try:
    +    self.async_client = openai.AsyncClient(
    +        base_url=base_url, api_key=api_key, organization=openai_org_id
    +    )
    +except Exception as e:
    +    logger.error(f"Failed to initialize OpenAI AsyncClient. Error: {e}")
    +    raise
     
    Suggestion importance[1-10]: 9

    Why: Adding error handling for the initialization of async_client is crucial for ensuring robustness and preventing potential runtime errors. This suggestion aligns with existing error handling practices in the code.

    Robustness
    Add error handling for the asynchronous retrieval method to manage exceptions

    To ensure that the async_retrieve method behaves consistently with the synchronous
    version, consider handling exceptions that might occur during the asynchronous operations,
    such as network issues or timeouts. This can be done by wrapping the asynchronous calls in
    a try-except block and logging or handling the error appropriately.

    semantic_router/layer.py [499-500]

    -scores, routes = await self.index.aquery(
    -    vector=xq, top_k=top_k, route_filter=route_filter
    -)
    +try:
    +    scores, routes = await self.index.aquery(
    +        vector=xq, top_k=top_k, route_filter=route_filter
    +    )
    +except Exception as e:
    +    logger.error(f"Failed to retrieve routes: {e}")
    +    return []
     
    Suggestion importance[1-10]: 9

    Why: Adding error handling significantly improves the robustness and reliability of the asynchronous method, ensuring that potential issues like network failures are managed gracefully.

    Maintainability
    Replace explicit retry loop with a utility function for exponential backoff

    Replace the explicit loop for retries with a more robust and reusable utility function for
    exponential backoff. This will improve the maintainability and readability of the code.

    semantic_router/encoders/openai.py [156-172]

    -for j in range(1, 7):
    -    try:
    -        embeds = await self.async_client.embeddings.create(
    -            input=docs,
    -            model=self.name,
    -            dimensions=self.dimensions,
    -        )
    -        if embeds.data:
    -            break
    -    except OpenAIError as e:
    -        await asleep(2**j)
    -        error_message = str(e)
    -        logger.warning(f"Retrying in {2**j} seconds...")
    -    except Exception as e:
    -        logger.error(f"OpenAI API call failed. Error: {error_message}")
    -        raise ValueError(f"OpenAI API call failed. Error: {e}") from e
    +embeds = await retry_with_exponential_backoff(self.async_client.embeddings.create, input=docs, model=self.name, dimensions=self.dimensions)
     
    Suggestion importance[1-10]: 8

    Why: This suggestion improves the maintainability and readability of the code by abstracting the retry logic into a reusable utility function. This change is beneficial for reducing code duplication and enhancing clarity.

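The retry_with_exponential_backoff helper referenced in the suggested code above is not defined anywhere in this PR; a minimal sketch of what such a utility might look like (the name and call style come from the suggestion, the body is an assumption):

import asyncio
import logging
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger(__name__)

async def retry_with_exponential_backoff(
    func: Callable[..., Awaitable[Any]], max_retries: int = 6, **kwargs: Any
) -> Any:
    # Await func(**kwargs), retrying with exponential backoff on any exception.
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            return await func(**kwargs)
        except Exception as e:
            last_error = e
            wait = 2**attempt
            logger.warning(f"Attempt {attempt} failed ({e}). Retrying in {wait} seconds...")
            await asyncio.sleep(wait)
    # All retries exhausted: surface the last error instead of failing silently.
    raise ValueError(f"API call failed after {max_retries} retries.") from last_error

The same helper could then back both this openai.py loop and the zure.py loop in the next suggestion.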
    Refactor retry logic into a separate method for better code organization

    Refactor the exponential backoff logic into a separate method to enhance code reuse and
    separation of concerns.

    semantic_router/encoders/zure.py [141-160]

    -for j in range(3):
    -    try:
    -        embeds = await self.async_client.embeddings.create(
    -            input=docs,
    -            model=str(self.model),
    -            dimensions=self.dimensions,
    -        )
    -        if embeds.data:
    -            break
    -    except OpenAIError as e:
    -        import traceback
    -        traceback.print_exc()
    -        await asleep(2**j)
    -        error_message = str(e)
    -        logger.warning(f"Retrying in {2**j} seconds...")
    -    except Exception as e:
    -        logger.error(f"Azure OpenAI API call failed. Error: {error_message}")
    -        raise ValueError(f"Azure OpenAI API call failed. Error: {e}") from e
    +embeds = await retry_with_exponential_backoff(self.async_client.embeddings.create, input=docs, model=str(self.model), dimensions=self.dimensions)
     
    Suggestion importance[1-10]: 8

    Why: Refactoring the retry logic into a separate method enhances code reuse and separation of concerns, making the codebase more maintainable and organized.

    Resolve the TODO comment in _async_semantic_classify by evaluating and implementing the necessary check

    The method _async_semantic_classify uses a TODO comment questioning the necessity of a
    check. It's important to resolve TODOs before merging to ensure that the code is complete
    and all intended features are implemented correctly. Consider evaluating the need for this
    check and implementing it if necessary, or removing the comment if it's decided that the
    check is not needed.

    semantic_router/layer.py [361-362]

    -# TODO do we need this check?
    -route = self.check_for_matching_routes(top_class)
    +# After evaluation, if the check is necessary:
    +if some_condition:  # replace 'some_condition' with the actual condition
    +    route = self.check_for_matching_routes(top_class)
    +else:
    +    route = None  # or handle as appropriate
    +# If the check is not necessary, simply remove the TODO comment.
     
    Suggestion importance[1-10]: 8

    Why: Resolving TODO comments is crucial for maintaining clean and complete code. This suggestion ensures that the code is fully reviewed and any necessary checks are implemented or clarified.

    Readability
    Rename acall to async_call for better readability and clarity

    Consider renaming the acall method to a more descriptive name that indicates its
    asynchronous nature and functionality, such as async_call.

    semantic_router/encoders/base.py [17-18]

    -def acall(self, docs: List[Any]) -> List[List[float]]:
    +def async_call(self, docs: List[Any]) -> List[List[float]]:
         raise NotImplementedError("Subclasses must implement this method")
     
    Suggestion importance[1-10]: 7

    Why: Renaming the method to async_call improves readability and makes the asynchronous nature of the method more explicit. This is a minor but useful improvement for code clarity.

    Possible issue
    Handle the case where route.llm is an instance of BaseLLM when route.function_schemas is not None

    Consider handling the case where route.function_schemas is not None but route.llm is an
    instance of BaseLLM. Currently, the code only checks if route.llm is not an instance of
    BaseLLM and raises a NotImplementedError. This might lead to unexpected behavior if
    route.llm is indeed an instance of BaseLLM but no further action is defined for this case.

    semantic_router/layer.py [294-296]

    -if route.function_schemas and not isinstance(route.llm, BaseLLM):
    -    raise NotImplementedError(
    -        "Dynamic routes not yet supported for async calls."
    -    )
    +if route.function_schemas:
    +    if not isinstance(route.llm, BaseLLM):
    +        raise NotImplementedError(
    +            "Dynamic routes not yet supported for async calls."
    +        )
    +    else:
    +        # Handle the case where route.llm is an instance of BaseLLM
    +        # Example: return await route.llm.acall(text)
    +        pass
     
    Suggestion importance[1-10]: 7

    Why: The suggestion addresses a potential edge case that could lead to unexpected behavior. However, the importance of this edge case depends on the specific use cases and how often route.llm being an instance of BaseLLM occurs in practice.

    Enhancement
    Improve the clarity of the exception message in the acall method

    In the acall method, the exception message "Either text or vector must be provided" could
    be more informative by suggesting what the user should do next. Consider revising the
    message to guide the user more clearly.

    semantic_router/layer.py [282]

     if text is None:
    -    raise ValueError("Either text or vector must be provided")
    +    raise ValueError("Either 'text' or 'vector' must be provided. Please ensure that at least one of them is not None before calling this method.")
     
    Suggestion importance[1-10]: 6

    Why: Improving the clarity of error messages enhances the user experience by providing more actionable feedback. However, this is a minor enhancement compared to functional or robustness improvements.



    codecov bot commented Jun 13, 2024

    Codecov Report

    Attention: Patch coverage is 26.11111% with 133 lines in your changes missing coverage. Please review.

    Project coverage is 73.80%. Comparing base (1916fec) to head (80b6fc1).

    Files Patch % Lines
    semantic_router/layer.py 14.89% 40 Missing ⚠️
    semantic_router/index/pinecone.py 31.57% 39 Missing ⚠️
    semantic_router/encoders/openai.py 20.58% 27 Missing ⚠️
    semantic_router/encoders/zure.py 18.51% 22 Missing ⚠️
    semantic_router/splitters/rolling_window.py 0.00% 2 Missing ⚠️
    semantic_router/encoders/base.py 87.50% 1 Missing ⚠️
    semantic_router/hybrid_layer.py 50.00% 1 Missing ⚠️
    semantic_router/index/base.py 50.00% 1 Missing ⚠️
    Additional details and impacted files
    @@            Coverage Diff             @@
    ##             main     #326      +/-   ##
    ==========================================
    - Coverage   77.01%   73.80%   -3.22%     
    ==========================================
      Files          45       45              
      Lines        2728     2901     +173     
    ==========================================
    + Hits         2101     2141      +40     
    - Misses        627      760     +133     


@jamescalam merged commit 4da3da1 into main on Jun 13, 2024 (6 of 8 checks passed)
@jamescalam deleted the james/async-op branch on June 13, 2024 at 09:06