Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to pass kwargs to loader classes in DirectoryLoader, add ability to modify encoding and BeautifulSoup behaviour in BSHTMLLoader #2275

Merged

Conversation

samuelwcm
Copy link
Contributor

@samuelwcm samuelwcm commented Apr 1, 2023

Solves #2247. Noted that the only test I added checks for the BeautifulSoup behaviour change. Happy to add a test for DirectoryLoader if deemed necessary.

@@ -15,3 +18,25 @@ def test_bs_html_loader() -> None:

assert metadata["title"] == "Chew dad's slippers"
assert metadata["source"] == str(file_path)


@pytest.mark.skipif(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test will only run on Windows? Not sure this will be run during checks.

Copy link
Contributor

@hwchase17 hwchase17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@hwchase17 hwchase17 merged commit 1ddd6db into langchain-ai:master Apr 1, 2023
9 checks passed
@samuelwcm
Copy link
Contributor Author

@hwchase17 Just spotted #2250 - solves the same problem as my PR, but the parameter is just named encoding rather than open_encoding. Any thoughts on this? Should I raise another PR renaming it to be in line with TextLoader (obviously considering this will be a breaking change)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants