fix: Added fix to encode documents within rolling window #256

Merged
merged 11 commits into from
Apr 27, 2024

Conversation

mesax1
Contributor

@mesax1 mesax1 commented Apr 26, 2024

fix: BadRequestError: Error code: 400 - {'error': {'message': "'$.input' is invalid. Please check the API reference: https://platform.openai.com/docs/api-reference.", 'type': 'invalid_request_error', 'param': None, 'code': None}}

This occurs when _encode_documents runs splitter(docs) with len(docs) > 2048, because OpenAI accepts at most 2048 inputs in a single embeddings request.

How it fixes the problem:
When len(docs) > 2000, the array is split into multiple smaller arrays, and each one is sent separately to generate the embeddings.
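
For readers without the diff at hand, the batching idea described above can be sketched roughly as follows. This is not the code merged in this PR. The names `encode_in_batches`, `fake_encoder`, and `MAX_BATCH_SIZE` are hypothetical and used only for illustration, and the encoder is assumed to be any callable that maps a list of strings to a list of embedding vectors.

```python
from typing import Callable, List

# Hypothetical constant: the PR description uses 2000 as the split threshold,
# which stays safely below the 2048-input limit of the embeddings endpoint.
MAX_BATCH_SIZE = 2000

Embedding = List[float]


def encode_in_batches(
    docs: List[str],
    encoder: Callable[[List[str]], List[Embedding]],
    batch_size: int = MAX_BATCH_SIZE,
) -> List[Embedding]:
    """Encode docs in consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    embeddings: List[Embedding] = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        # One encoder call per slice; results are appended in input order.
        embeddings.extend(encoder(batch))
    return embeddings


def fake_encoder(batch: List[str]) -> List[Embedding]:
    """Stand-in encoder (assumption): rejects oversized batches the way the
    API would, and returns a one-dimensional 'embedding' per document."""
    if len(batch) > 2048:
        raise ValueError("too many inputs in one request")
    return [[float(len(text))] for text in batch]


if __name__ == "__main__":
    documents = ["x" * i for i in range(5000)]
    vectors = encode_in_batches(documents, fake_encoder)
    print(len(vectors))
```

Keeping the output in input order means callers that index embeddings by document position do not need to change.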


@mesax1 mesax1 changed the title Added fix to _encode_documents within rolling_window.py when len(docs… fix: Added fix to _encode_documents within rolling_window.py Apr 26, 2024
Member

@simjak simjak left a comment


@mesax1 looks good to me, just some checks are failing

@simjak simjak changed the title fix: Added fix to _encode_documents within rolling_window.py fix: Added fix to encode documents within rolling window Apr 26, 2024

codecov bot commented Apr 27, 2024

Codecov Report

Attention: Patch coverage is 55.55556%, with 12 lines in your changes missing coverage. Please review.

Project coverage is 80.95%. Comparing base (302fe17) to head (3b78805).

| Files | Patch % | Lines |
|---|---|---|
| semantic_router/splitters/rolling_window.py | 0.00% | 11 Missing ⚠️ |
| semantic_router/index/pinecone.py | 91.66% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #256      +/-   ##
==========================================
+ Coverage   80.78%   80.95%   +0.16%     
==========================================
  Files          44       44              
  Lines        2389     2399      +10     
==========================================
+ Hits         1930     1942      +12     
+ Misses        459      457       -2     


@jamescalam jamescalam merged commit fd8cc15 into main Apr 27, 2024
7 of 8 checks passed
@jamescalam jamescalam deleted the juan/fix-encoder-arrays branch April 27, 2024 11:24

3 participants