PDF Tables with RowSpan and ColSpan not interpreted correctly #305

Closed
nkidambi opened this issue Oct 26, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@nkidambi
Collaborator

Describe the bug
I have attached a publicly available PDF here. Page 3 contains a table of VA pension benefits. For the question "Can you tell me the full eligibility rules for receiving VA pension", the answer returned is incorrect. I have attached a screenshot of the issue.

To Reproduce
Steps to reproduce the behavior:

  1. Install the vNext-Dev version of the Information Assistant using the bge embedding model
  2. Upload the attached PDF file
  3. Ask the question "Can you tell me the full eligibility rules for receiving VA pension" and see the error shown in the attached screenshot.

Expected behavior
A clear explanation of the eligibility conditions that matches ALL the content on Page 3 of the attached document.

Screenshots

Table Parsing

Alpha version details

  • GitHub branch: vNext-Dev

Additional context
summaryofvanationalguardandreserve.pdf

@dayland
Contributor

dayland commented Nov 2, 2023

@nkidambi, thanks for the sample. We have added this to our backlog as an improvement to be made. Unfortunately, it will not be in our pending 0.4-Delta or v1.0 releases.

@dayland dayland added the enhancement New feature or request label Nov 2, 2023
@ArpitaisAn0maly
Contributor

I have tested this use case, and IA returns correct responses with GPT-4. GPT-3.5, as expected, cannot analyze the table at the level of detail required here. Given the gap in performance between GPT-3.5 and GPT-4 and the influence of chunking strategy on generation, it may be worth exploring the chunking strategy for your use case. While IA works well with GPT-4 here, we are seeing inconsistencies with GPT-3.5 on tables. You may also have to reduce your top K to a very small value to get what you need with GPT-4. Adding functionality to analyze tables with GPT-3.5 is on our backlog. Thanks.
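As a rough illustration of why chunking matters for tables with RowSpan and ColSpan, here is a minimal sketch that copies each spanned cell's text into every grid position it covers, so that each row is self-contained before it is chunked. This is not IA's actual pipeline: the `expand_spans` function, the `(text, rowspan, colspan)` cell layout, and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical cell layout: (text, rowspan, colspan). Not IA's data model.
Cell = Tuple[str, int, int]


def expand_spans(rows: List[List[Cell]]) -> List[List[str]]:
    """Copy each spanned cell's text into every grid position it covers."""
    grid: List[List[str]] = []
    # Column index -> (text, number of further rows the cell still covers).
    pending: Dict[int, Tuple[str, int]] = {}
    for row in rows:
        out: List[str] = []
        cells = iter(row)
        col = 0
        while True:
            if col in pending:
                # This column is covered by a rowspan from an earlier row.
                text, remaining = pending[col]
                out.append(text)
                if remaining > 1:
                    pending[col] = (text, remaining - 1)
                else:
                    del pending[col]
                col += 1
                continue
            cell = next(cells, None)
            if cell is None:
                break
            text, rowspan, colspan = cell
            # Repeat the text across every column the cell spans.
            for _ in range(colspan):
                out.append(text)
                if rowspan > 1:
                    pending[col] = (text, rowspan - 1)
                col += 1
        grid.append(out)
    return grid


# Made-up sample table, not taken from the attached PDF.
sample = [
    [("Group A", 2, 1), ("Conditions", 1, 2)],
    [("First", 1, 1), ("Second", 1, 1)],
]
for line in expand_spans(sample):
    print(" | ".join(line))
```

With the spans expanded, every row carries its own group label and header text, so a chunk boundary that falls between rows no longer separates a value from the cell that qualifies it. The sketch assumes a well-formed table where spans do not overlap.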

@nkidambi
Collaborator Author

nkidambi commented Nov 6, 2023

Thank you!
