Hello @kyleclo, I found an issue in `PDFPlumberParser.parse`: it indexes `all_row_ids[-1]` (and, a few lines later, `all_word_ids[-1]`), which raises an `IndexError` when no words are detected on the page. I could try to fix it by first checking whether the list is empty, but if you know a better fix, please let me know.
Here is the page screenshot:
And the paper:
f87f9a26543e03c985867d0dbff1b900ecb6e46d.pdf
Here is the stack trace:
```
File ~/Documents/codes/git/ai2/s2/mmda/src/mmda/parsers/pdfplumber_parser.py:170, in PDFPlumberParser.parse(self, input_pdf_path)
    166 all_tokens.extend(fine_tokens)
    167 all_row_ids.extend(
    168     [i + last_row_id + 1 for i in line_ids_of_fine_tokens]
    169 )
--> 170 last_row_id = all_row_ids[-1]
    171 all_word_ids.extend(
    172     [i + last_word_id + 1 for i in word_ids_of_fine_tokens]
    173 )
    174 last_word_id = all_word_ids[-1]

IndexError: list index out of range
```
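
The empty-list check proposed above could look roughly like the sketch below. It is only an illustration, not the parser's actual code: `extend_with_offset` is a hypothetical helper, and the starting value of `-1` for the running id is an assumption inferred from the `i + last_row_id + 1` expression in the trace.

```python
from typing import List


def extend_with_offset(all_ids: List[int], page_ids: List[int], last_id: int) -> int:
    """Append page-local ids shifted past last_id and return the new last id.

    Hypothetical helper (not part of mmda) mirroring the pattern in the trace.
    """
    all_ids.extend(i + last_id + 1 for i in page_ids)
    # Guard: if nothing has been collected yet (e.g. the first page has no
    # words), keep the previous offset instead of indexing an empty list.
    return all_ids[-1] if all_ids else last_id


all_row_ids: List[int] = []
last_row_id = -1  # assumed initial value

# A page with no detected words no longer raises IndexError.
last_row_id = extend_with_offset(all_row_ids, [], last_row_id)
# Later pages continue numbering from the preserved offset.
last_row_id = extend_with_offset(all_row_ids, [0, 0, 1], last_row_id)
last_row_id = extend_with_offset(all_row_ids, [0, 1], last_row_id)
```

The same guard would apply to `all_word_ids` and `last_word_id`. Note that the error can only occur while the accumulated list is still empty, so an empty page that follows non-empty pages is already safe.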