Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
UnstructuredURLLoader: allow url failures, keep processing (#1954)
By default, UnstructuredURLLoader now continues processing remaining `urls` if encountering an error for a particular url. If failure of the entire loader is desired as was previously the case, use `continue_on_failure=False`. E.g., this fails splendidly, courtesy of the 2nd url: ``` from langchain.document_loaders import UnstructuredURLLoader urls = [ "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023", "https://doesnotexistithinkprobablynotverynotlikely.io", "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023", ] loader = UnstructuredURLLoader(urls=urls, continue_on_failure=False) data = loader.load() ``` Issue: #1939
- Loading branch information