New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Milvus / Qdrant / Pinecone: Update CDK #30689
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
destination-qdrant test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-qdrant docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-qdrant/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-qdrant test
destination-milvus test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-milvus docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ❌ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-milvus/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-milvus test
destination-qdrant test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-qdrant docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-qdrant/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-qdrant test
destination-pinecone test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-pinecone docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-pinecone/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-pinecone test
destination-milvus test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-milvus docker image for platform linux/x86_64 | ✅ |
Unit tests | ❌ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-milvus/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-milvus test
destination-milvus test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-milvus docker image for platform linux/x86_64 | ✅ |
Unit tests | ❌ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-milvus/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-milvus test
destination-pinecone test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-pinecone docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-pinecone/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-pinecone test
destination-qdrant test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-qdrant docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-qdrant/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-qdrant test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I performed a full code readthrough. A couple questions but no blockers. 👍
- "${indexing.url}" | ||
- api.openai.com | ||
- api.cohere.ai | ||
- "${embedding.api_base}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my own understanding, I'm curious how these dynamic refs are parsed. I see embedding.api_base
matches the config setting name but I'm not sure about indexing.url
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's referencing this config value:
def _init_indexer(self, config: ConfigModel): | ||
self.embedder = embedder_map[config.embedding.mode](config.embedding) | ||
if config.embedding.mode == "azure_openai" or config.embedding.mode == "openai": | ||
self.embedder = embedder_map[config.embedding.mode](config.embedding, config.processing.chunk_size) | ||
else: | ||
self.embedder = embedder_map[config.embedding.mode](config.embedding) | ||
self.indexer = MilvusIndexer(config.indexing, self.embedder.embedding_dimensions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this and embedder_map
be moved into the CDK?
Not a blocker for this round, but it seems it would be good to have this logic centrally managed. (Also appears for Milvus.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that would make sense. I'll create an issue for that
for key, value in metadata.items(): | ||
normalized_key = key | ||
# the primary key can't be set directly with auto_id, so we prefix it with an underscore | ||
if key == self._primary_key: | ||
normalized_key = f"_{key}" | ||
result[normalized_key] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I 100% follow the issue or the mitigation. I understand that we can't modify the primary key when it is of type auto_id
- so are we setting a different key using the underscore? Is it in the case of the 'natural' primary key of the stream colliding with a built-in column name?
Does the problem become less if (in the future) we generate our own unique keys instead of using auto_id?
Seems like this is not a big issue either way (as long as we stop the failure scenario) so I'm not asking to change anything - just would be good for me to understand longer-term.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Milvus collection has to be set to auto_id, but if that's the case Milvus will complain if an entity contains a value for the primary key (as it will pick a value itself). So let's say you create a collection with primary key id
set to auto_id, but a record coming from some source contains an id
field as well, it will error.
This field changes the logic so the id
field from the Airbyte record is stored as _id
in the collection. This is necessary because one record might be split into multiple entities, so _id
won't be guaranteed to be unique.
destination-qdrant test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-qdrant docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-qdrant/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-qdrant test
destination-milvus test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-milvus docker image for platform linux/x86_64 | ✅ |
Unit tests | ❌ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-milvus/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-milvus test
destination-pinecone test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-pinecone docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-pinecone/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-pinecone test
destination-qdrant test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-qdrant docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-qdrant/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-qdrant test
destination-pinecone test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-pinecone docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-pinecone/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-pinecone test
destination-milvus test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-milvus docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-milvus/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-milvus test
destination-qdrant test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-qdrant docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-qdrant/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-qdrant test
destination-pinecone test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-pinecone docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-pinecone/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-pinecone test
destination-milvus test report (commit
|
Step | Result |
---|---|
Connector package install | ✅ |
Build destination-milvus docker image for platform linux/x86_64 | ✅ |
Unit tests | ✅ |
Integration tests | ✅ |
Acceptance tests | ✅ |
Code format checks | ✅ |
Validate airbyte-integrations/connectors/destination-milvus/metadata.yaml | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=destination-milvus test
What
Based off #30718
This PR upgrades the CDK for Milvus and Qdrant desinations to allow processing of large records as well as adding support for Azure OpenAI embeddings and adding more text splitting options.
It also sets allowed hosts correctly.