fix: embedder errors in embed length #9584

Merged
merged 8 commits into from Apr 1, 2024

mattkrick
@mattkrick mattkrick commented Apr 1, 2024

Description

The embedder still fails on large chunks of text because splitting relies on a heuristic.
Now, after we split, we verify each chunk's length. If a chunk is still too big, we split it again using a smaller chunk size.
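
A minimal sketch of the split-then-verify idea described above, not the code from this PR. The names `splitAndVerify`, `splitByHeuristic`, `CountTokens`, and the 4-characters-per-token ratio are all assumptions made for illustration; the real embedder would use its model's tokenizer and its own splitter.

```typescript
// Hypothetical token counter; a real embedder would use the model's own tokenizer.
type CountTokens = (text: string) => number

// Assumed heuristic: roughly 4 characters per token.
const CHARS_PER_TOKEN = 4

// Heuristic split by character count. The size is capped at half the text
// length, so any text with 2+ characters yields at least two pieces.
const splitByHeuristic = (text: string, targetTokens: number): string[] => {
  const size = Math.max(
    1,
    Math.min(targetTokens * CHARS_PER_TOKEN, Math.ceil(text.length / 2))
  )
  const pieces: string[] = []
  for (let i = 0; i < text.length; i += size) {
    pieces.push(text.slice(i, i + size))
  }
  return pieces
}

// Split, then verify each piece with the real counter. Pieces that are
// still too big are split again with a smaller target.
const splitAndVerify = (
  text: string,
  maxTokens: number,
  countTokens: CountTokens,
  targetTokens: number = maxTokens
): string[] => {
  if (text.length === 0) return []
  if (countTokens(text) <= maxTokens) return [text]
  if (text.length === 1) {
    throw new Error('A single character exceeds the token limit')
  }
  // Halve the target for the next round so oversized pieces get smaller chunks.
  const nextTarget = Math.max(1, Math.floor(targetTokens / 2))
  const chunks: string[] = []
  for (const piece of splitByHeuristic(text, targetTokens)) {
    // The recursive call re-checks the piece and only re-splits if needed.
    chunks.push(...splitAndVerify(piece, maxTokens, countTokens, nextTarget))
  }
  return chunks
}
```

Because every re-split produces strictly shorter pieces, the recursion terminates even when the heuristic badly underestimates the token count. This sketch cuts on raw character offsets; a production splitter would more likely cut on sentence or paragraph boundaries.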

The embedder should be able to run even if env.AI_GENERATION_MODELS is undefined. Also cleaned up the validation of env vars so we get better error messages.
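
A rough sketch of what this kind of env var validation could look like, under stated assumptions. Only `AI_GENERATION_MODELS` is named in this PR; the `AI_EMBEDDING_MODELS` name, the JSON-array format, and the helpers `parseModelList` and `loadModelConfig` are hypothetical and may not match the actual code.

```typescript
// Assumed shape of one model entry; the real config may carry more fields.
interface ModelConfig {
  model: string
}

type Env = {[key: string]: string | undefined}

// Parse one env var as a JSON array of model entries, naming the variable
// in every error so a misconfiguration is easy to trace.
const parseModelList = (
  env: Env,
  name: string,
  required: boolean
): ModelConfig[] => {
  const raw = env[name]
  if (raw === undefined || raw.trim() === '') {
    if (required) throw new Error(`Missing required env var ${name}`)
    return [] // an optional var that is undefined simply means "no models"
  }
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    throw new Error(`Env var ${name} is not valid JSON`)
  }
  if (!Array.isArray(parsed)) {
    throw new Error(`Env var ${name} must be a JSON array`)
  }
  return parsed.map((entry, idx) => {
    if (typeof entry !== 'object' || entry === null || typeof entry.model !== 'string') {
      throw new Error(`Env var ${name}[${idx}] must have a string "model" field`)
    }
    return {model: entry.model}
  })
}

// Hypothetical loader: embedding models are required, generation models are not.
const loadModelConfig = (env: Env) => ({
  embeddingModels: parseModelList(env, 'AI_EMBEDDING_MODELS', true),
  generationModels: parseModelList(env, 'AI_GENERATION_MODELS', false)
})
```

Passing the environment in as a plain object (for example `loadModelConfig(process.env)` in Node) keeps the validation easy to test without mutating global state.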

Signed-off-by: Matt Krick <matt.krick@gmail.com>
@github-actions github-actions bot added size/l and removed size/s labels Apr 1, 2024
@github-actions github-actions bot added size/xl and removed size/l labels Apr 1, 2024

github-actions bot commented Apr 1, 2024

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR will be delayed and might be rejected due to its size.

@mattkrick mattkrick removed the request for review from tianrunhe April 1, 2024 21:46
@mattkrick mattkrick merged commit 341b4b7 into master Apr 1, 2024
5 checks passed
@mattkrick mattkrick deleted the feat/embedder-errors branch April 1, 2024 21:46
@github-actions github-actions bot mentioned this pull request Apr 2, 2024
24 tasks