Describe the bug
Failure on data ingestion into qdrant using text-embedding-ada-002 embedding
BadRequestError: Error code: 400 - {'error': {'message': 'This model does not support specifying dimensions.', 'type': 'invalid_request_error',
'param': None, 'code': None}}
The issue seems to be in the OpenAIEmbeddingProvider.get_embedding method in r2r/embeddings/openai/openai_base.py, which always passes dimensions in the request. Per https://platform.openai.com/docs/api-reference/embeddings/create, dimensions is an optional integer parameter.
So for the "text-embedding-ada-002" model, the code shouldn't send the dimensions value.
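A possible shape for the fix, as a minimal sketch only: build the request arguments separately and include dimensions only for models that accept it. The helper name build_embedding_kwargs and the MODELS_SUPPORTING_DIMENSIONS tuple are hypothetical and not part of r2r; the sketch assumes that only the text-embedding-3 family accepts the parameter.

```python
from typing import Any, Dict, Optional

# Assumption: only the text-embedding-3 models accept `dimensions`;
# text-embedding-ada-002 rejects it with a 400, as in the error above.
MODELS_SUPPORTING_DIMENSIONS = (
    "text-embedding-3-small",
    "text-embedding-3-large",
)


def build_embedding_kwargs(
    model: str, text: str, dimensions: Optional[int] = None
) -> Dict[str, Any]:
    """Hypothetical helper: return request kwargs, omitting `dimensions`
    when the model does not support it or no value was configured."""
    kwargs: Dict[str, Any] = {"model": model, "input": text}
    if dimensions is not None and model in MODELS_SUPPORTING_DIMENSIONS:
        kwargs["dimensions"] = dimensions
    return kwargs


# With the config from this issue, `dimensions` is dropped for ada-002.
print(build_embedding_kwargs("text-embedding-ada-002", "hello", 1536))
# For a text-embedding-3 model the configured value is kept.
print(build_embedding_kwargs("text-embedding-3-small", "hello", 512))
```

The resulting dict could then be passed to the OpenAI client as client.embeddings.create(**kwargs), so the provider would not need a separate code path per model.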
To Reproduce
Steps to reproduce the behavior:
Use a config of
{
  "app": {
    "max_logs": 100,
    "max_file_size_in_mb": 50
  },
  "completions": {
    "provider": "openai"
  },
  "embedding": {
    "provider": "openai",
    "search_model": "text-embedding-ada-002",
    "search_dimension": 1536,
    "batch_size": 128,
    "text_splitter": {
      "type": "recursive_character",
      "chunk_size": 512,
      "chunk_overlap": 20
    },
    "rerank_model": "None"
  },
  "eval": {
    "provider": "local",
    "llm": {
      "model": "gpt-4o",
      "provider": "openai"
    },
    "sampling_fraction": 1.0
  },
  "ingestion": {
    "selected_parsers": {
      "csv": "default",
      "docx": "default",
      "html": "default",
      "json": "default",
      "md": "default",
      "pdf": "default",
      "pptx": "default",
      "txt": "default",
      "xlsx": "default",
      "gif": "default",
      "png": "default",
      "jpg": "default",
      "jpeg": "default",
      "svg": "default"
    }
  },
  "logging": {
    "provider": "local",
    "log_table": "logs",
    "log_info_table": "log_info"
  },
  "prompt": {
    "provider": "local"
  },
  "vector_database": {
    "provider": "qdrant",
    "collection_name": "blahblahblah"
  }
}
and ingest any data files.
Expected behavior
Data files are vectorized and uploaded to qdrant.
Additional context
I installed the r2r package, programmatically provided a list of files, and called r2r.aingest_files, which triggered the issue.