Harrison/unstructured validation #2111

Merged (2 commits) on Mar 28, 2023


hwchase17
Contributor

No description provided.

kravetsmic and others added 2 commits March 28, 2023 13:11
Added a `headers` parameter to `UnstructuredURLLoader`.
If the installed version of `unstructured` is older than 0.5.7 and `headers` is not
an empty dict, the user sees a warning ("You are using old version of
unstructured. The headers parameter is ignored").

Steps to reproduce:
```bash
poetry add unstructured="0.5.6"
```
```python
from langchain.document_loaders import UnstructuredURLLoader
urls = [
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023",
    "https://doesnotexistithinkprobablynotverynotlikely.io",
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023",
]
# With unstructured 0.5.6 installed, creating the loader with non-empty headers logs the warning.
loader = UnstructuredURLLoader(
    urls=urls, continue_on_failure=False, headers={"User-Agent": "value"}
)
```
Logs:
```
You are using old version of unstructured. The headers parameter is ignored
```
In this case, `headers` is not passed to the `partition_html` function.
If the user creates an `UnstructuredURLLoader` without the `headers`
parameter, or with an empty dict, no warning is shown.

If the `unstructured` version is 0.5.7 or later, no warning is shown
when an `UnstructuredURLLoader` is created with the `headers` parameter.
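
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the names `MIN_HEADERS_VERSION`, `parse_version`, and `effective_headers` are hypothetical, and the sketch assumes plain numeric version strings such as `0.5.6`.

```python
import warnings
from typing import Dict, Optional, Tuple

# Assumption taken from the description: headers are supported from 0.5.7 on.
MIN_HEADERS_VERSION = (0, 5, 7)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string like '0.5.6' into (0, 5, 6) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def effective_headers(
    installed_version: str, headers: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return the headers that would actually be forwarded (hypothetical helper).

    Non-empty headers on an old version trigger the warning and are dropped;
    missing or empty headers never trigger it.
    """
    headers = headers or {}
    if headers and parse_version(installed_version) < MIN_HEADERS_VERSION:
        warnings.warn(
            "You are using old version of unstructured. "
            "The headers parameter is ignored"
        )
        return {}
    return headers
```

Comparing integer tuples rather than raw strings avoids mis-ordering versions such as `0.5.10` and `0.5.7`, which a plain string comparison would get wrong.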

---

Closes issue #1944
hwchase17 merged commit 6e85cbc into master on Mar 28, 2023
9 checks passed
hwchase17 deleted the harrison/unstructured-validation branch on March 28, 2023 20:27

2 participants