@e-palmisano

INTPYTHON-809 - Migrate langchain-mongodb to LangChain 1.0

Issue Key

Summary

This PR migrates the langchain-mongodb package to be compatible with LangChain 1.0+. The main focus is on removing dependencies on legacy packages (langchain-classic) where possible and updating retrievers to use the modern langchain_core APIs. This includes:

  • Refactoring MongoDBAtlasParentDocumentRetriever to extend BaseRetriever directly from langchain_core instead of the legacy ParentDocumentRetriever
  • Fixing critical bugs in the add_documents() method discovered during migration
  • Updating MongoDBAtlasSelfQueryRetriever to explicitly use langchain-classic for components not yet available in LangChain 1.0
  • Removing the langchain < 1.0 version pin to allow users to upgrade to LangChain 1.0+
  • Updating dependencies for compatibility with langgraph-checkpoint>=3.0.1 and aiohttp>=3.13.2

Changes in this PR

1. MongoDBAtlasParentDocumentRetriever Refactoring

File: libs/langchain-mongodb/langchain_mongodb/retrievers/parent_document.py

  • Removed dependency on langchain-classic: Changed base class from ParentDocumentRetriever (legacy) to BaseRetriever from langchain_core
  • Added child_splitter field: Required for splitting parent documents into chunks
  • Added search_kwargs field: Enables passing additional search parameters to vector search
  • Implemented add_documents() method: Complete implementation for adding parent documents and child chunks to the vectorstore

Critical Bug Fixes in add_documents():

  • Fixed a call to the non-existent docstore.add_documents() by switching to docstore.mset(), which is the correct API for MongoDBDocStore
  • Fixed indentation bug where vectorstore.add_documents() and docstore.mset() were called inside the document loop instead of after processing all documents
  • Added proper key-value pair formatting for mset(): [(id, Document), ...]
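The corrected control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in parent_document.py: `Document`, `InMemoryDocStore`, `InMemoryVectorStore`, and the standalone `add_documents()` function are hypothetical stand-ins written with the standard library only, and the `doc_id` metadata key is taken from the test plan in this PR. The point is that chunks and `(id, Document)` pairs are accumulated inside the loop, while `vectorstore.add_documents()` and `docstore.mset()` each run once after it.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a LangChain document (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class InMemoryDocStore:
    """Hypothetical docstore exposing only mset(), the call named in this PR."""

    def __init__(self) -> None:
        self.data: dict = {}
        self.mset_calls = 0

    def mset(self, pairs) -> None:
        # pairs is a list of (id, Document) tuples
        self.mset_calls += 1
        self.data.update(pairs)


class InMemoryVectorStore:
    """Hypothetical vector store that just records the chunks it receives."""

    def __init__(self) -> None:
        self.chunks: list = []
        self.add_calls = 0

    def add_documents(self, docs) -> None:
        self.add_calls += 1
        self.chunks.extend(docs)


def add_documents(parents, split, vectorstore, docstore, id_key="doc_id"):
    """Split each parent into chunks, then write everything once after the loop."""
    chunks = []
    pairs = []
    for parent in parents:
        parent_id = str(uuid.uuid4())
        for text in split(parent.page_content):
            # Each chunk carries its parent's id so retrieval can find the parent.
            chunks.append(Document(text, {**parent.metadata, id_key: parent_id}))
        pairs.append((parent_id, parent))
    # Both writes sit outside the loop: one call each, covering all parents.
    vectorstore.add_documents(chunks)
    docstore.mset(pairs)
    return [parent_id for parent_id, _ in pairs]


vs, ds = InMemoryVectorStore(), InMemoryDocStore()
ids = add_documents(
    [Document("a b c d"), Document("e f")],
    lambda text: text.split(),  # toy splitter standing in for child_splitter
    vs,
    ds,
)
```

With the two writes inside the loop, as in the bug described above, each store would be called once per parent document instead of once per batch.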

Updated Docstrings:

  • Removed references to legacy ParentDocumentRetriever and MultiVectorRetriever classes
  • Added documentation about LangChain 1.0+ compatibility
  • Fixed example code to use correct class name MongoDBAtlasParentDocumentRetriever

2. MongoDBAtlasSelfQueryRetriever Dependencies

File: libs/langchain-mongodb/langchain_mongodb/retrievers/self_querying.py

  • Updated imports to use langchain-classic for AttributeInfo and SelfQueryRetriever
  • These classes remain in langchain-classic under LangChain 1.0 because they are considered legacy components
  • Added documentation noting the langchain-classic dependency requirement

3. Documentation Updates

File: libs/langchain-mongodb/CHANGELOG.md

  • Added version 0.8.0 entry with detailed breakdown of:
    • Breaking changes (LangChain 1.0+ requirement, API changes)
    • Bug fixes (docstore and indentation issues)
    • Enhancements (new fields, updated docs)

4. Dependency Updates

Files: pyproject.toml, uv.lock

  • Removed pin langchain < 1.0
  • Updated to require langchain >= 1.0
  • Added/updated dependencies:
    • langgraph-checkpoint >= 3.0.1
    • aiohttp >= 3.13.2

Test Plan

Unit Tests

  • Existing tests: All existing unit tests for MongoDBAtlasParentDocumentRetriever pass with the refactored implementation
  • Key tested scenarios:
    • Document splitting and chunk creation
    • Metadata propagation (doc_id field)
    • Vector search with parent document lookup
    • Docstore operations using mset()
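The "vector search with parent document lookup" scenario can be illustrated with a small sketch. It assumes that each chunk returned by vector search carries its parent's id under the `doc_id` metadata key, and that the docstore offers a batch lookup that returns `None` for missing keys; `parents_for_hits` and the `mget` callable are hypothetical names used for illustration, not the retriever's actual internals.

```python
def parents_for_hits(hit_metadatas, mget, id_key="doc_id"):
    """Map chunk hits to their parent documents, keeping first-seen order."""
    ordered_ids = []
    for metadata in hit_metadatas:
        parent_id = metadata.get(id_key)
        # Several chunks may share one parent; keep each parent id once.
        if parent_id is not None and parent_id not in ordered_ids:
            ordered_ids.append(parent_id)
    # Drop ids the docstore does not know about.
    return [doc for doc in mget(ordered_ids) if doc is not None]


# Toy docstore contents and a batch lookup over them (both hypothetical).
store = {"p1": "Parent one full text", "p2": "Parent two full text"}


def lookup(keys):
    return [store.get(key) for key in keys]


# Chunk hits in ranked order: two chunks from p2, one from p1, one orphan.
hits = [{"doc_id": "p2"}, {"doc_id": "p1"}, {"doc_id": "p2"}, {"doc_id": "missing"}]
parents = parents_for_hits(hits, lookup)
```

Deduplicating by parent id is what makes the retriever return whole parent documents rather than one result per matching chunk.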

Integration Tests

  • Verified that MongoDBAtlasParentDocumentRetriever.from_connection_string() correctly initializes the retriever
  • Tested add_documents() with multiple parent documents
  • Verified retrieval returns complete parent documents (not chunks)
  • Confirmed MongoDBAtlasSelfQueryRetriever works with langchain-classic installed

Manual Testing

  • Created a sample script using MongoDBAtlasParentDocumentRetriever with LangChain 1.0
  • Verified document indexing and retrieval workflow end-to-end
  • Confirmed no import errors or deprecation warnings with LangChain 1.0+

Compatibility Testing

  • Tested with langchain==1.0.0
  • Tested with langchain-core==1.0.0
  • Verified langchain-classic is correctly used only where necessary

Checklist

Checklist for Author

  • Did you update the changelog (if necessary)?
  • Is the intention of the code captured in relevant tests?
  • If there are new TODOs, has a related JIRA ticket been created?
  • Has a MongoDB Employee run the patch build of this PR?

Checklist for Reviewer {@primary_reviewer}

  • Does the title of the PR reference a JIRA Ticket?
  • Do you fully understand the implementation? (Would you be comfortable explaining how this code works to someone else?)
  • Have you checked for spelling & grammar errors?
  • Is all relevant documentation (README or docstring) updated?

Focus Areas for Reviewer

  1. add_documents() implementation in parent_document.py: This method was completely rewritten to fix critical bugs. Please verify:

    • The logic for splitting documents into chunks is correct
    • The use of docstore.mset() with key-value pairs is appropriate
    • The operations are correctly placed outside the document loop
  2. Breaking changes: This PR introduces breaking changes for users:

    • MongoDBAtlasParentDocumentRetriever no longer requires langchain-classic but has a different API surface
    • MongoDBAtlasSelfQueryRetriever now requires explicit installation of langchain-classic
    • Verify the CHANGELOG adequately documents these changes
  3. LangChain 1.0 compatibility:

    • Ensure all imports from langchain_core are correct
    • Verify no deprecated APIs are being used
    • Check that the migration path is clear for users
  4. Backward compatibility concerns:

    • Are there any users who might be affected by the removal of the langchain < 1.0 pin?
    • Should we provide a migration guide in the documentation?

@e-palmisano e-palmisano changed the title from "- Migrate to updated LangChain and LangGraph APIs." to "Update to langchain 1.0 and langgraph 1.0" on Nov 11, 2025
@e-palmisano e-palmisano changed the title from "Update to langchain 1.0 and langgraph 1.0" to "INTPYTHON-809 - Update to langchain 1.0 and langgraph 1.0" on Nov 11, 2025
@blink1073
Collaborator

Hi @e-palmisano, thank you for creating this PR. We had to prioritize getting security fixes for the langgraph packages, so there are now merge conflicts in this PR. I think we can make langchain_classic a test dependency, and an optional dependency for the self-querying retriever. It could raise an informative error on class instantiation if the import was not found.
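One way the optional-dependency suggestion could look is sketched below. This is only an illustration of the pattern (defer the import to instantiation and raise an informative error if it fails); `require_optional` and `SelfQueryRetrieverSketch` are hypothetical names, and the install hint is an assumed pip command rather than wording from the package.

```python
import importlib


def require_optional(module_name: str, install_hint: str):
    """Import an optional module, or raise an ImportError that says how to get it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"Install it with: {install_hint}"
        ) from exc


class SelfQueryRetrieverSketch:
    """Hypothetical class that needs an optional package only when instantiated."""

    required_module = "langchain_classic"
    install_hint = "pip install langchain-classic"  # assumed install command

    def __init__(self) -> None:
        # Importing here, not at module top level, keeps the package importable
        # for users who never touch the self-querying retriever.
        self._classic = require_optional(self.required_module, self.install_hint)
```

Because the import is deferred to `__init__`, importing the package never fails for users who do not have the optional dependency; only constructing the retriever does, and the message tells them what to install.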

@blink1073
Collaborator

Closing in favor of #271. The 0.8 release is now compatible with langchain>1.0.

@blink1073 blink1073 closed this Nov 19, 2025