Could not browse (read) the uploaded pdf file #8

Closed
libragirl-dewiyana opened this issue Nov 1, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@libragirl-dewiyana

libragirl-dewiyana commented Nov 1, 2023

Environment: Safari Version 17.0 (19616.1.27.211.1)
Frequency: every time
Steps to reproduce error:

  1. Enter the ChatGPT API key
  2. Browse for a PDF file from local storage
  3. The error occurs when the file is processed
TypeError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).
Traceback:
File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script
    exec(code, module.__dict__)
File "/app/document-qa/streamlit_app.py", line 262, in <module>
    st.session_state['doc_id'] = hash = st.session_state['rqa'][model].create_memory_embeddings(tmp_file.name,
File "/app/document-qa/document_qa/document_qa_engine.py", line 204, in create_memory_embeddings
    texts, metadata, ids = self.get_text_from_document(pdf_path, chunk_size=chunk_size, perc_overlap=perc_overlap)
File "/app/document-qa/document_qa/document_qa_engine.py", line 169, in get_text_from_document
@lfoppiano
Owner

Hi @libragirl-dewiyana, thank you for reporting the problem. It seems to be an issue with extracting text from the PDF document.
Could you share the PDF documents that you are using when you experience this problem?

NOTE: If the documents are public, you can upload them here by replying and dragging and dropping them onto the text form. If they are not public, could you send them via email (FOPPIANO.Luca@nims.go.jp)?

lfoppiano added the bug label on Nov 1, 2023
@libragirl-dewiyana
Author

Thank you for your reply. I just sent it via email to your address above.

@lfoppiano
Owner

Dear @libragirl-dewiyana,
I tested the PDF you sent me but I could not find anything wrong with it.

image

Can you send me a screenshot the next time you have a problem?

On a Mac you can press Command + Shift + 4 and then select the area to capture. You can then drag and drop the screenshot here or send it via email.

Thanks!

@libragirl-dewiyana
Author

image
Thank you for your reply. I just tried the above PDF file (SID-47482.pdf) again, and there was no problem browsing it. Since I can't reproduce the bug that I experienced yesterday, I'm not able to provide an error screenshot for that file.
However, I can provide screenshots of earlier errors with different PDF files, as follows.

First image: October 30th, 9:35 am, for the file SID-47456.pdf
image
Second image: October 30th, 9:38 am, for the file SID-47152.pdf
image
For the second image, I also included a comparison with an earlier successful attempt at browsing the same file (SID-47152.pdf) on Oct 27th, 9:24 am.

All attempts were made in the same environment: Safari Version 17.0 (19616.1.27.211.1)
I'll send both files I mentioned above via email.
Thank you.

@lfoppiano
Owner

@libragirl-dewiyana thanks for the screenshots.
It seems that these errors were due to bugs that should have been fixed in recent development. In fact, I have updated the application several times in the last few days.

It seems to work well now, so I think we can close this issue.

Feel free to open a new one if you have other problems.

@libragirl-dewiyana
Author

image
Just now (Mon, Nov 6, 15:00) I tried to browse the previous PDF file (SID-47482.pdf) again, but the same error message occurred as before. I wonder why the error occurs randomly, even though the environment is the same. Thank you.

@lfoppiano
Owner

lfoppiano commented Nov 6, 2023

Oh! I'll check the log again and let you know.

Meanwhile, could you try this URL from now on? It should be working better: https://lfoppiano-document-qa.hf.space/
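
For anyone trying to pin down an intermittent failure like this one, a minimal sketch of a repeat-and-log helper may be useful. It calls a function several times and records the full traceback of each failed attempt, which helps confirm whether an error is really random. Everything here is hypothetical and not taken from the document-qa code: `probe`, `Attempt`, the logger name, and `make_flaky_extractor` (a stand-in that fails on every second call). The `TypeError` message in the stand-in is invented, since the actual cause is not shown in this thread.

```python
import logging
import traceback
from dataclasses import dataclass
from typing import Any, Callable, List, Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pdf_repro")  # hypothetical logger name


@dataclass
class Attempt:
    index: int
    ok: bool
    error: Optional[str] = None  # full traceback text when the call failed


def probe(fn: Callable[..., Any], *args: Any, attempts: int = 5, **kwargs: Any) -> List[Attempt]:
    """Call fn repeatedly and record which attempts fail, with full tracebacks."""
    results: List[Attempt] = []
    for i in range(1, attempts + 1):
        try:
            fn(*args, **kwargs)
            results.append(Attempt(index=i, ok=True))
        except Exception:
            tb = traceback.format_exc()
            logger.error("attempt %d failed:\n%s", i, tb)
            results.append(Attempt(index=i, ok=False, error=tb))
    return results


def make_flaky_extractor() -> Callable[[str], str]:
    """Stand-in for a real extraction step (assumption): fails on every second call."""
    calls = {"n": 0}

    def extract(path: str) -> str:
        calls["n"] += 1
        if calls["n"] % 2 == 0:
            # Invented error message, for illustration only.
            raise TypeError("'NoneType' object is not subscriptable")
        return f"text from {path}"

    return extract


if __name__ == "__main__":
    for attempt in probe(make_flaky_extractor(), "SID-47482.pdf", attempts=4):
        print(attempt.index, "ok" if attempt.ok else "failed")
```

In a real check, the stand-in would be replaced by the actual extraction call, and the recorded tracebacks compared across attempts on the same file.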

@libragirl-dewiyana
Author

The above URL is working. Thank you very much!

@lfoppiano
Owner

I've opened #11 for what I think is the problem. If you still have trouble, please write there directly.

Terima kasih (thank you)!
