Cannot decode afii characters (ISO 10036) #1381
Labels
help wanted
We appreciate help everywhere - this one might be an easy start!
workflow-arabic-text-extraction
Related to text extraction, but with a focus on Arabic text
extracted from #1379
PS: In the extraction result, the Arabic characters are replaced with /afiinnnn glyph names. This is because the data uses the ISO 10036 standard, and I have not been able to find any free information on how to transcode it.
file 02voc.pdf
test code:
Originally posted by @pubpub-zz in #1379 (comment)
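One possible workaround, sketched here as an assumption rather than a confirmed fix: the /afiinnnn tokens look like the AFII glyph names that also appear in the Adobe Glyph List, which pairs each name with a Unicode code point. The snippet below post-processes already-extracted text with a small excerpt of that mapping. The function name `decode_afii` and the table `AFII_TO_UNICODE` are hypothetical, and the two ranges shown (basic Arabic letters only) should be checked against the full glyph list before relying on them.

```python
import re

# Partial, illustrative mapping (assumption: values follow the Adobe Glyph List).
# afii57409..afii57434 -> U+0621..U+063A, afii57440..afii57450 -> U+0640..U+064A
AFII_TO_UNICODE = {f"afii{57409 + i}": chr(0x0621 + i) for i in range(26)}
AFII_TO_UNICODE.update({f"afii{57440 + i}": chr(0x0640 + i) for i in range(11)})

_AFII_TOKEN = re.compile(r"/(afii\d+)")


def decode_afii(text: str) -> str:
    """Replace /afiiNNNNN tokens with their Unicode character when known.

    Tokens missing from the table are left unchanged so nothing is lost.
    """
    def _sub(match: re.Match) -> str:
        return AFII_TO_UNICODE.get(match.group(1), match.group(0))

    return _AFII_TOKEN.sub(_sub, text)
```

For example, `decode_afii("/afii57415/afii57416")` would give the two letters U+0627 and U+0628 (alef, beh) under this mapping. A proper fix would more likely belong inside the library's glyph-name lookup than in a post-processing step like this.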