add missing source field to pymupdf output (#2110)
Adds a `source` key to the PyMuPDF loader's metadata, consistent with the
other loaders, for use with the `Sources` vector workflows.
timothyasp committed Mar 28, 2023
1 parent a554e94 commit b25dbcb
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions langchain/document_loaders/pdf.py
@@ -156,6 +156,7 @@ def load(self, **kwargs: Optional[Any]) -> List[Document]:
                 page_content=page.get_text(**kwargs).encode("utf-8"),
                 metadata=dict(
                     {
+                        "source": file_path,
                         "file_path": file_path,
                         "page_number": page.number + 1,
                         "total_pages": len(doc),
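
The change can be illustrated with a minimal, standard-library-only sketch. It does not import the actual loader: `Document` here is a hypothetical stand-in dataclass, and `build_page_metadata` and `collect_sources` are illustrative helper names that do not appear in the commit. The sketch only assumes that a sources-style workflow reads `metadata["source"]` from each document, which is the key this commit adds.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for the loader's Document type."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_page_metadata(file_path: str, page_index: int, total_pages: int) -> Dict[str, Any]:
    """Build the metadata keys visible in the diff; page_index is zero-based."""
    return {
        "source": file_path,  # key added by this commit
        "file_path": file_path,
        "page_number": page_index + 1,
        "total_pages": total_pages,
    }


def collect_sources(docs: List[Document]) -> List[str]:
    """Return the distinct `source` values in first-seen order."""
    seen: Dict[str, None] = {}
    for doc in docs:
        # Before this commit, this lookup would raise KeyError for PyMuPDF output.
        seen.setdefault(doc.metadata["source"], None)
    return list(seen)


# Two pages from the same (made-up) file share a single source entry.
docs = [
    Document(f"page {i + 1} text", build_page_metadata("example.pdf", i, 2))
    for i in range(2)
]
print(collect_sources(docs))
```

Keeping `file_path` alongside `source` means code that already reads `file_path` should continue to work, while code that expects `source` no longer needs a special case for this loader.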
