What is the expected behavior? (add screenshot)
The selected area should cover the text displayed. I presume this means loading the fonts from the PDF into the TextLayer, and that pdf.js skips this to improve performance (ref), so would a setting to turn off this optimization be possible? (I tried the `textLayerMode` and `enhanceTextSelection` settings, and adding an extracted `font-family` codename to the TextLayer span CSS.)
I have generated a screenshot of expected behaviour by rendering using SVG, and deleting the TextLayer div:
Going by the deeper shading, I suspect my diagnosis is nonsense and there are several spans overlaid. (Unfortunately I have textbooks for the entire school curriculum and I'm trying to launch in time for lockdown study. I only have evenings, so I am stuck with these awful PDFs.)
I would love to extract the font though so I can extract and display text outside pdf.js, if possible. Any pointers very gratefully received - I'm still struggling to navigate the core.
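As a starting point for inspecting text outside the viewer, here is a minimal sketch that summarizes which internal fonts a page's text items use. It assumes an object shaped like the result of pdf.js `page.getTextContent()`, that is `{ items, styles }` with `str`, `fontName`, `transform`, `width` and `height` on each item. The helper names `summarizeFonts` and `itemBox` are made up for illustration, and the sketch does not extract the embedded font data itself.

```javascript
// Sketch only: operates on a plain object shaped like the result of
// pdf.js page.getTextContent(), i.e. { items: [...], styles: {...} }.
// summarizeFonts and itemBox are hypothetical helpers, not pdf.js APIs.

// Group text items by internal font name. For each font, record the CSS
// fallback family listed in styles, the item count, and the first string seen.
function summarizeFonts(textContent) {
  const byFont = new Map();
  for (const item of textContent.items) {
    const style = textContent.styles[item.fontName] || {};
    if (!byFont.has(item.fontName)) {
      byFont.set(item.fontName, {
        fontFamily: style.fontFamily || null,
        itemCount: 0,
        sample: item.str,
      });
    }
    byFont.get(item.fontName).itemCount += 1;
  }
  return byFont;
}

// Approximate box of one item in PDF user space, ignoring rotation and skew.
// transform is [a, b, c, d, e, f]; e and f give the baseline origin.
function itemBox(item) {
  const [, , , , x, y] = item.transform;
  return { x, y, width: item.width, height: item.height };
}
```

In real use the input would come from something like `const doc = await pdfjsLib.getDocument(src).promise; const page = await doc.getPage(1); summarizeFonts(await page.getTextContent());`. If the `sample` strings for a font look like the garbled codes in this report, the font probably uses a custom encoding, and comparing `itemBox` results with the TextLayer spans may show where the selection areas drift.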
What went wrong? (add screenshot)
As you can see from the first screenshot, the selected areas do not match the text areas:
In the second screenshot you can faintly see the actual characters on the TextLayer spans (I made the span color black):
Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
I am running the latest code from Git in Node.
technicaltitch changed the title from "TextLayer not well aligned for Amharic fonts extracted from the PDF" to "TextLayer not well aligned for Amharic fonts extracted from this PDF" on Mar 28, 2020
File:
Social Studies in Amharic Grade 5 Student Book.pdf
Configuration:
Steps to reproduce the problem:
ƒ T>’>e‚
" (the right character codes in a standard font).