Issue: Need a trace or debug feature in Lanchain DirectoryLoader #7725
Comments
Answer generated by a 🤖

Answer

I understand that you're having trouble with the DirectoryLoader in LangChain. You've set the debug mode to true, but it doesn't seem to be working as expected. You're unable to identify which file is causing the issue when loading multiple PDFs, and you're looking for a way to trace or debug the loader code to help you pinpoint the problematic file.

Here's a potential solution that might help. You can modify the load_file method of DirectoryLoader so that it logs each file before loading it. Here's how you can do it:

    import logging

    def load_file(self, item: Path, path: Path, docs: List[Document], pbar: Optional[Any]) -> None:
        if item.is_file():
            if _is_visible(item.relative_to(path)) or self.load_hidden:
                try:
                    logging.debug(f"Processing file: {item}")  # Add this line
                    sub_docs = self.loader_cls(str(item), **self.loader_kwargs).load()
                    docs.extend(sub_docs)
                except Exception as e:
                    if self.silent_errors:
                        logging.warning(e)
                    else:
                        raise e
                finally:
                    if pbar:
                        pbar.update(1)

This modification will log the name of the file being processed at the debug level. To see these logs, you'll need to set the logging level to debug in your application. You can do this by adding the following line at the start of your application:

    logging.basicConfig(level=logging.DEBUG)

This will set the root logger level to DEBUG, so all debug (and higher level) logs will be shown. If you only want to see debug logs from LangChain, you can set the level on its logger specifically:

    logging.getLogger('langchain').setLevel(logging.DEBUG)

Note that this second setting only affects messages emitted through a logger named langchain or one of its children. The logging.debug call in the snippet above goes to the root logger, so it is controlled by the root level set with basicConfig.

Please note that this is just one potential solution. There might be other ways to achieve the same result depending on your specific use case and environment. Also, as an AI, I'm unable to view images or follow links. Therefore, I can't see the screenshot you've attached or access the URL you've provided.

If you find this solution helpful and believe it could benefit others, I encourage you to make a pull request to the LangChain repository with these changes. This way, other users facing the same issue can benefit from your contribution.

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
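The difference between the root logger and a named logger matters for the two logging settings mentioned in this answer. The following is a minimal sketch using only the standard library; the logger name langchain.document_loaders.directory is an assumption chosen to mirror the package path, and the sketch does not import LangChain at all. It keeps the root logger at WARNING, enables DEBUG on the langchain logger only, and shows which of two debug messages gets through.

```python
import io
import logging


def capture_debug_output() -> str:
    """Emit one debug record via the root logger and one via a 'langchain.*' logger."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))

    root = logging.getLogger()
    previous_root_level = root.level
    root.addHandler(handler)
    root.setLevel(logging.WARNING)  # root stays at WARNING: its own debug records are dropped

    lc_logger = logging.getLogger("langchain")
    previous_lc_level = lc_logger.level
    lc_logger.setLevel(logging.DEBUG)  # enable DEBUG for 'langchain' and its children only
    try:
        # Module-level call: handled by the root logger, filtered out at WARNING.
        logging.debug("Processing file: a.pdf (root logger)")
        # Hypothetical child logger name: inherits DEBUG from 'langchain', so it is emitted.
        logging.getLogger("langchain.document_loaders.directory").debug(
            "Processing file: a.pdf (named logger)"
        )
    finally:
        # Restore global logging state so the sketch has no lasting side effects.
        root.removeHandler(handler)
        root.setLevel(previous_root_level)
        lc_logger.setLevel(previous_lc_level)
    return stream.getvalue()


if __name__ == "__main__":
    print(capture_debug_output(), end="")
```

Only the message sent through the named logger appears. For the per-logger setting to be useful, a patched loader would therefore need to log through a module-level logger such as logging.getLogger(__name__) rather than calling logging.debug directly.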
Description: Added debugging output in DirectoryLoader to identify the file being processed.
Issue: [Need a trace or debug feature in Lanchain DirectoryLoader #7725](#7725)
Dependencies: No additional dependencies are required.
Tag maintainer: @rlancemartin, @eyurtsev

This PR enhances the DirectoryLoader with debugging output to help diagnose issues when loading documents. This new feature does not add any dependencies and has been tested on a local machine.
Hi, @botchagalupe! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you were requesting a trace or debug feature for the DirectoryLoader in LangChain. You were experiencing issues with loading PDF files and wanted a way to identify which file is causing the problem. Dosu-beta suggested modifying the load_file method of DirectoryLoader to log each file as it is processed.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!
Modifying the source cannot possibly be the answer. Isn't this super basic?
Issue you'd like to raise.
I've set "langchain.debug=True"; however, it does not work for the DirectoryLoader. I have a notebook that tried to load a dozen or more PDFs, and typically, at least one of the files fails (see attached). I looked at the code, and as far as I can tell, there is no trace or debug feature in (https://github.com/hwchase17/langchain/tree/master/langchain/document_loaders).
My issue is that the loader code is a black box. I can't tell which file is failing; therefore, I have to process each one individually to find out which one is failing. It would be beneficial if a trace/debugger could help me identify which file it's failing on.
TIA
Suggestion:
Please make a debug option for "https://github.com/hwchase17/langchain/tree/master/langchain/document_loaders" code.
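Until such an option exists, the failing file can be found without touching library code by loading files one at a time and recording which ones raise. The following is a minimal sketch, not part of LangChain: load_each and its make_loader parameter are hypothetical names introduced here, and the only assumption about the loader is that it is constructed from a path string and exposes a load() method returning a list, as the loader_cls call in DirectoryLoader does.

```python
import logging
from pathlib import Path
from typing import Any, Callable, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def load_each(
    paths: Iterable[Path],
    make_loader: Callable[[str], Any],
) -> Tuple[List[Any], List[Tuple[Path, Exception]]]:
    """Load files one by one, recording which ones fail instead of stopping.

    `make_loader` is any callable that takes a path string and returns an
    object with a `load()` method (hypothetical parameter for this sketch).
    """
    docs: List[Any] = []
    failures: List[Tuple[Path, Exception]] = []
    for path in paths:
        logger.debug("Processing file: %s", path)
        try:
            docs.extend(make_loader(str(path)).load())
        except Exception as exc:  # broad on purpose: we want the culprit, whatever it raised
            logger.warning("Failed to load %s: %s", path, exc)
            failures.append((path, exc))
    return docs, failures
```

A call might look like load_each(sorted(Path("pdfs").glob("*.pdf")), PyPDFLoader), where the pdfs directory is a placeholder and PyPDFLoader stands for whichever loader class is already being passed to DirectoryLoader. The returned failures list pairs each problem file with the exception it raised, which removes the need to rerun the notebook once per file.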