This is a follow-up to the earlier issue #43, which was closed. All the details provided in #43 still apply; here is the updated information:
I upgraded tabula.py to use the latest jar (tabula-1.0.1-jar-with-dependencies.jar). That reduced the warnings, but I still get some:
Aug 31, 2017 11:42:02 AM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont
WARNING: Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
Aug 31, 2017 11:42:03 AM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont
WARNING: Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
Aug 31, 2017 11:42:03 AM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont
WARNING: Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
The main issue is that common cell headers (a header cell spanning several columns) do not get read in, and I am not sure whether the warnings are related. Please find the PDF file here: ufile.io/5xuti
For instance, in the table on page 1, the common cell header 'Three Months Ended March 31' gets dropped.
I don't think even tabula-java can extract the "common cell header". And even if tabula-java could, tabula-py converts its output into a DataFrame, and I don't think a DataFrame can handle a combined cell.
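If the spanning label does reach pandas as an ordinary row, one possible workaround is to rebuild it as the top level of a two-level column header with `pandas.MultiIndex`. This is only a rough sketch under assumptions: the rows below are made up (only the label 'Three Months Ended March 31' comes from the PDF in this issue), and `promote_spanning_header` is a hypothetical helper, not part of tabula-py or tabula-java.

```python
import pandas as pd

# Hypothetical rows, shaped like a table read without a header row.
# The spanning label appears once; the other cells it covers are empty.
raw = pd.DataFrame([
    ["", "Three Months Ended March 31", None],
    ["", "2017", "2016"],
    ["Revenue", "100", "90"],
    ["Net income", "10", "8"],
])


def promote_spanning_header(df):
    """Turn the first two rows into a two-level column header.

    Assumption: every empty cell in the first row belongs to the
    nearest non-empty label on its left.
    """
    top = []
    last = ""
    for cell in df.iloc[0]:
        # Carry the last seen label rightwards over empty cells.
        if pd.notna(cell) and cell != "":
            last = cell
        top.append(last)
    second = list(df.iloc[1])

    body = df.iloc[2:].reset_index(drop=True)
    body.columns = pd.MultiIndex.from_arrays([top, second])
    return body


table = promote_spanning_header(raw)
print(table[("Three Months Ended March 31", "2017")].tolist())
```

The carry-right rule will over-fill if a genuinely empty header cell sits to the right of a spanning label, so it would need adjusting per table. It also does nothing for the case reported here, where the label is dropped before it ever reaches pandas.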