Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Document Loader for Datastore. #7

Merged
merged 16 commits into from
Feb 16, 2024
Merged

feat: Document Loader for Datastore. #7

merged 16 commits into from
Feb 16, 2024

Conversation

JU-2094
Copy link
Member

@JU-2094 JU-2094 commented Feb 14, 2024

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Add Document Loader for Firestore in Datastore Mode

@JU-2094 JU-2094 requested a review from a team as a code owner February 14, 2024 05:22
@product-auto-label product-auto-label bot added the api: datastore Issues related to the googleapis/langchain-google-datastore-python API. label Feb 14, 2024
@kurtisvg kurtisvg self-assigned this Feb 14, 2024
src/langchain_google_datastore/document_loader.py Outdated Show resolved Hide resolved
TYPE = "datastore_type"


class DocumentConverter:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a DocumentConverter class here? What value does it add over these staticmethods just being functions?

Comment on lines 34 to 38
def convertFirestoreEntity(
entity: Entity,
page_content_properties: List[str] = [],
metadata_properties: List[str] = [],
) -> Document:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some comments? convertFirestoreEntity -- am I converting to a FirestoreEntity? from a FirestoreEntity?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for all 4 of these functions

Comment on lines 164 to 171
val_converted = {
k: DocumentConverter._convertFromLangChain(v, client)
for k, v in val.items()
}
elif isinstance(val, list):
val_converted = [
DocumentConverter._convertFromLangChain(v, client) for v in val
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these for covering the nested cases?

It would probably be more readable to short circuit these at the top

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG. I will move this to the TOP in both PRs

Comment on lines +27 to +29
@pytest.fixture
def test_case() -> TestCase:
return TestCase()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this being used for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the assert, that ensures two list have the same elements. The name in python3 is misleading:

test_case.assertCountEqual

https://docs.python.org/3.6/library/unittest.html#unittest.TestCase.assertCountEqual

Previously called assertItemsEqual

src/langchain_google_datastore/document_loader.py Outdated Show resolved Hide resolved
@kurtisvg kurtisvg assigned JU-2094 and unassigned kurtisvg Feb 16, 2024
src/langchain_google_datastore/document_converter.py Outdated Show resolved Hide resolved
src/langchain_google_datastore/document_converter.py Outdated Show resolved Hide resolved
src/langchain_google_datastore/document_loader.py Outdated Show resolved Hide resolved
src/langchain_google_datastore/document_loader.py Outdated Show resolved Hide resolved
@JU-2094 JU-2094 merged commit 28a9c01 into main Feb 16, 2024
8 checks passed
@JU-2094 JU-2094 deleted the jfurbina-doc-loader branch February 16, 2024 23:15
@release-please release-please bot mentioned this pull request Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: datastore Issues related to the googleapis/langchain-google-datastore-python API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants