Here are some PDF files from ISTEX that fail (found after processing 1000 random documents) but did not fail with the latest pdf2xml:
D3B2DA15EBD9A692BF1EF4D32606F95A72D5D381
5A09169C31467704EBB453123479708334DDAF35
45BCDD6CD0ECF1D7C6B9169E999C63BFF30DB501
7528880E3DCB09F09E214AFEC57C3A4FCEA15905
774EE3CD645A861B5F5184F96F80A837412887FA
F8E939EACBC26F4F39B309AAAC7ABB8FC8A86C59
864EFF775D7F56E7223EAD95801A6A07ACD8CF71
0ACFDDBB83BF9A5ABAD34686AC4C8CE9317BDB2E
122C63850FE715C35B2B7A5FF376E484E5627C75
7F6B6DE03BEA6EAE36896867F88670B0E62F6EAA
86EDACFC946D09D3F2F7703448EA7E2544CC9AE4
7EE3BDA171BB275B860191E918289EC4F1289566
ED9964BA3659F48C2E9227DE095836E47845B509
83FAB54C06DCFB813657DCCAFF3C66AD67CC95FE
38BD0E2B812BB737321F7D9E76AF0E95A9593E3D
DD522EB94B2A865F00B4DDFC345780B119579DD7
5EA4AAF2C4674DC7C8AFC750AF5320C0F7489FCE
DD338AD05CEA42CF737A187344B1337385CC6FFB
052DFBD14E0015CA914E28A0A561675D36FFA2CC
C3D11DEE82F3403336BE55E9F94DB6E9A6343E1B
The following 3 fail with both pdfalto and pdf2xml, if I am not mistaken:
2DE3AF6CC5E90F16E64866D3784DC06B33705360
5E00837DC8C8EF0C9B4D16603261C993135EBCD5
8AD9F55CF0BC915BB6B448B1B536D3A7DC08239D
To get them: https://api.istex.fr/document/*ISTEXID*/fulltext/pdf
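For anyone who wants to reproduce this, here is a minimal sketch of how the IDs above could be turned into download URLs and fetched. Only the URL pattern comes from this issue; the helper names `pdf_url` and `download_pdf` and the output directory `istex_pdfs` are hypothetical, and the sketch assumes the endpoint can be reached without extra authentication, which may not hold for every ISTEX document.

```python
import urllib.request
from pathlib import Path

# URL pattern quoted in this issue; {istex_id} stands in for *ISTEXID*.
ISTEX_PDF_URL = "https://api.istex.fr/document/{istex_id}/fulltext/pdf"


def pdf_url(istex_id: str) -> str:
    """Build the full-text PDF URL for one ISTEX identifier."""
    return ISTEX_PDF_URL.format(istex_id=istex_id)


def download_pdf(istex_id: str, out_dir: str = "istex_pdfs") -> Path:
    """Fetch one PDF and save it as <out_dir>/<istex_id>.pdf.

    Hypothetical helper: assumes no authentication is needed and
    lets any HTTP error from urllib propagate to the caller.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    target = out / f"{istex_id}.pdf"
    with urllib.request.urlopen(pdf_url(istex_id)) as response:
        target.write_bytes(response.read())
    return target
```

Calling `download_pdf("2DE3AF6CC5E90F16E64866D3784DC06B33705360")` would then save that document locally, ready to be passed to pdfalto or pdf2xml.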
Sorry, this is relevant to the GROBID pdfalto integration!