Describe the bug
docs_path and pdf_path are typically given with a trailing slash "/", as below:
docs_path = "tests/data/html/"
pdf_path = "tests/data/pdf/"
doc_preprocessor = HTMLDocPreprocessor(docs_path, max_docs=max_docs)
corpus_parser = Parser(
    session,
    parallelism=PARALLEL,
    structural=True,
    lingual=True,
    visual=True,
    pdf_path=pdf_path,
)
While docs_path can be without a trailing slash, pdf_path cannot.
To Reproduce
Steps to reproduce the behavior:
pdf_path = "tests/data/pdf"
Expected behavior
pdf_path without a trailing slash works too.
Error Logs/Screenshots
If applicable, add error logs or screenshots to help explain your problem.
I/O Error: Couldn't open file 'tests/data/pdf112823.PDF': No such file or directory.
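The path in the error message looks like the directory and the file name were concatenated without a separator. The following is a minimal sketch of that failure mode and one possible fix, not the library's actual code: `pdf_file_path` is a hypothetical helper, and the file name `112823.PDF` is only inferred from the log above.

```python
import os


def pdf_file_path(pdf_path: str, file_name: str) -> str:
    """Hypothetical helper: build the PDF path whether or not
    pdf_path ends with a slash."""
    return os.path.join(pdf_path, file_name)


# Plain concatenation drops the separator when the slash is missing,
# which matches the path shown in the error log.
broken = "tests/data/pdf" + "112823.PDF"
print(broken)

# os.path.join inserts the separator only when it is missing.
print(pdf_file_path("tests/data/pdf", "112823.PDF"))
print(pdf_file_path("tests/data/pdf/", "112823.PDF"))
```

If the parser builds PDF paths by string concatenation, switching to `os.path.join` (or `pathlib.Path`) would let both spellings of `pdf_path` work.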
Environment (please complete the following information):