Text not selectable with current Linux binaries #3

Closed
obilodeau opened this issue Jun 29, 2015 · 36 comments

@obilodeau
Contributor

I checked out master, built my own phantomjs in the submodule and rendered some slides to PDF. The resulting PDF, although very nice, doesn't have selectable text.

For me this is still way better than the print-pdf option, which has many problems, so I don't really mind. But since your README mentions selectable text, I thought I should let you know.

@astefanutti
Owner

Thanks for the report. Indeed, selectable text is supposed to work and even if you don't need it, that still impacts file size significantly.

Could you please specify which OS / architecture you built the PhantomJS submodule on? This is a known issue on 64-bit Mac, though it's supposed to be fixed by astefanutti/phantomjs@d32178f. Another lead would be to build the version before the Qt 5.4.1 upgrade:

git reset HEAD~1
git submodule update

That may help identify the root cause of the issue / regression.

@obilodeau
Contributor Author

> OS / architecture

Ubuntu x86_64

> Another lead would be to build the version before the Qt 5.4.1 upgrade

$ git submodule update
fatal: reference is not a tree: f4ab03d601071e9d94346fd36dd832e7b29c5ada
Unable to checkout 'f4ab03d601071e9d94346fd36dd832e7b29c5ada' in submodule path 'phantomjs'

Not sure what's going on here. I see f4ab03 in your phantomjs fork on github and the submodule seems correctly configured:

$ cat .gitmodules 
[submodule "phantomjs"]
    path = phantomjs
    url = https://github.com/astefanutti/phantomjs.git

@astefanutti
Owner

Unfortunately, I haven't been able to generate and test binaries for Ubuntu yet. There may be some missing dependencies, as documented in:
http://doc.qt.io/qt-5/linux-requirements.html
https://wiki.qt.io/Building_Qt_5_from_Git

I see at least the following dependencies relating to font management / rendering:

sudo apt-get install libfontconfig1-dev libfreetype6-dev fontconfig

@obilodeau
Contributor Author

I have no problem building phantomjs. I have a problem because I can't check out commit f4ab03d from your phantomjs fork. This is because you rebased your patches on top of phantomjs and pushed the changes to the same branch, thus overwriting the previous history.

Once a commit is no longer referenced by any ref such as a branch or a tag, git will eventually garbage collect it and it will no longer be reachable. To avoid this in the future, you might want to tag the commits that your decktape submodule refers to.

I forked your fork, cherry-picked your patches on top of c4df640 and tagged it as decktape-fbe46421. I'm rebuilding phantomjs right now and we'll see if it fixes the issue.
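
The tagging advice above can be illustrated with a throwaway repository. This is a minimal sketch, not the fork's actual history: the repository, the commit messages and the tag name pinned-demo are made up for the demonstration. It shows that a tagged commit still resolves after its branch has been rewritten.

```shell
#!/bin/sh
# Sketch: keep a submodule-pinned commit reachable after its branch is rewritten.
set -eu

demo=$(mktemp -d)            # throwaway repository, not the real fork
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "patch pinned by the parent repository"
pinned=$(git rev-parse HEAD)

# Tag the commit that the parent repository records for the submodule.
git tag "pinned-demo" "$pinned"

# Simulate a rebase followed by a force push: the branch drops $pinned.
git reset -q --hard HEAD~1
git commit -q --allow-empty -m "rewritten patch"

# The tag is a ref, so the old commit stays referenced and is not pruned.
tagged=$(git rev-parse "pinned-demo^{commit}")
echo "pinned=$pinned"
echo "tagged=$tagged"
```

In a real fork the tag would also have to be pushed (git push origin followed by the tag name) so that the pinned commit stays available on the remote.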

@obilodeau
Contributor Author

> Another lead would be to build the version before the Qt 5.4.1 upgrade

I finally built the previous phantomjs and I still experience the issue. I can select monospace text but not text that uses the fancy reveal.js default web font (which is used everywhere).

I think that someone will have to make a patch similar to yours for the Linux platform in ariya/phantomjs#10373. This is beyond my competence, but I would be willing to test any fixes.

@astefanutti
Owner

OK, thanks for the test. I've squashed the two commits to get rid of the garbage collected one as I haven't noticed any regression introduced by the rebase as far as DeckTape is concerned.

It looks like this is a Linux-specific issue with embedding / rendering Web fonts. I'll try to dig into it further ASAP.

@morckx

morckx commented Jul 3, 2015

I also had the problem that some fonts rendered as graphics instead of as text with a self-compiled phantomjs, on Ubuntu 15.04.
My solution was to simply use the pre-built Windows version of the forked phantomjs with Ubuntu's standard version of Wine. That worked flawlessly, except that the rendering of OpenType fonts was not perfect. However, I simply converted them to TrueType fonts and had a perfect result.

@mojavelinux
Contributor

This doesn't seem to be a problem with deck.js, so perhaps it has something to do with web fonts only.

@mojavelinux
Contributor

Upon further exploration, it appears that PhantomJS is converting the web font glyphs into paths. That's why the text cannot be selected.

You can find this out by converting one of the PDF pages to SVG using Inkscape. This reveals that what is in the PDF are actually paths and not embedded font glyphs. Otherwise, Inkscape would try to map the glyphs to text objects.

$ inkscape --without-gui --file=slide-1.pdf --export-plain-svg=slide-1.svg

So this appears to be a PhantomJS (and perhaps WebKit) issue.
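
One rough way to act on the SVG check above is to count elements in the exported file. This is only a sketch under assumptions: the helper name count_svg_elements is made up here, and it assumes the plain SVG writes unprefixed path and text tags. A page with many path elements and no text elements would suggest that the glyphs were converted to outlines.

```shell
#!/bin/sh
# Count opening tags of one SVG element type in a file.
# $1 = element name (e.g. path or text), $2 = SVG file.
count_svg_elements() {
    # grep -o prints each match on its own line; wc -l counts them.
    # The tag name must be followed by whitespace, '>', '/' or end of line,
    # so that "text" does not also count "textPath".
    grep -oE "<$1([[:space:]>/]|\$)" "$2" | wc -l | tr -d ' '
}

# When called with one SVG file as argument, print both counts.
if [ "$#" -eq 1 ]; then
    echo "path elements: $(count_svg_elements path "$1")"
    echo "text elements: $(count_svg_elements text "$1")"
fi
```

Saved under a hypothetical name such as count-svg.sh, it could be run against the slide-1.svg file produced by the Inkscape command above.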

@mojavelinux
Contributor

The workaround to this problem seems to be to install the TTF or OTF font on your system (and perhaps disable the web fonts import). Then, the custom font can be selected.

@mojavelinux
Contributor

It's rather odd that it makes a difference that the font was loaded via the web, but that seems to be what the results are telling me.

@astefanutti
Owner

Reported as a regression in ariya/phantomjs#13997.

@mojavelinux
Contributor

> and perhaps disable the web fonts import

If the font is on your system, and the @font-face declaration looks for a local font first (such as when using Google Fonts), you don't need to disable the font import.

@mojavelinux
Contributor

From upstream:

> The problem is that nearly all of the text is being emitted as vector graphics.

@mojavelinux
Contributor

...we know that this is limited to text that uses web fonts that aren't on the local system.

@mojavelinux
Contributor

As a workaround, you can grab the webfont TTF files and put them into ~/.fonts. You could write a script that puts them there before the export and removes them afterwards. Not ideal, but gets the job done.
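
The script idea above could look roughly like the following. It is a sketch under assumptions, not a tested part of DeckTape: the function names are made up, the fonts are assumed to be .ttf files in a local directory, and it relies on fontconfig scanning subdirectories of the font directory. Interrupt handling is left out.

```shell
#!/bin/sh
# Refresh the fontconfig cache for one directory, if fc-cache is installed.
refresh_font_cache() {
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache -f "$1" >/dev/null 2>&1 || true
    fi
}

# Usage: with_temp_fonts SRC_DIR FONT_DIR COMMAND [ARGS...]
# Copies SRC_DIR/*.ttf into a temporary subdirectory of FONT_DIR, runs
# COMMAND, then removes the subdirectory again and returns COMMAND's status.
with_temp_fonts() {
    src_dir=$1
    font_dir=$2
    shift 2

    mkdir -p "$font_dir"
    # A dedicated subdirectory keeps fonts that were already installed untouched.
    tmp_dir=$(mktemp -d "$font_dir/tmp-webfonts.XXXXXX")
    cp "$src_dir"/*.ttf "$tmp_dir"/ || { rm -rf "$tmp_dir"; return 1; }
    refresh_font_cache "$font_dir"

    status=0
    "$@" || status=$?

    rm -rf "$tmp_dir"
    refresh_font_cache "$font_dir"
    return "$status"
}
```

With the web font files saved in a hypothetical ./fonts directory, an export could then be wrapped as with_temp_fonts ./fonts "$HOME/.fonts" followed by the usual export command.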

@astefanutti
Owner

I've provided the minimal test case below to the upstream ariya/phantomjs#13997 issue:

<html>
<style>
    @font-face {
        font-family: 'Ubuntu Mono';
        font-style: normal;
        font-weight: 400;
        src: url(http://astefanutti.io/further-cdi/fonts/UbuntuMono-Regular.woff) format('woff');
    }
    body {
        font-family: 'Ubuntu Mono', monospace;
    }
</style>
<body>
    <div>
        Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
        tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
        quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
        consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
        cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat
        non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    </div>
</body>
</html>

When executing phantomjs rasterize.js test.html test.pdf, the output PDF is 6 KB on Mac OS X and Windows while it's 122 KB on Debian 7. Besides, the text is not selectable in the latter.

So at least the PhantomJS dev team now has a clear reproducer. The good news is that it's a regression apparently (I haven't verified that yet). So that may ease the resolution. If not, I'll try to jump into it ASAP as this is a major limitation for Linux users.

@patrickdepinguin

I have not seen progress on this issue either here or on the referenced phantomjs issue. What is the current status of this issue?

@astefanutti
Owner

@patrickdepinguin unfortunately, there has been no progress since the update above. It's a major issue though, so I'll jump into it ASAP to move it forward.

I encourage you to weigh in on ariya/phantomjs#13997 as well, as PhantomJS devs tend to prioritize based on user feedback.

@patrickdepinguin

@astefanutti Thanks.
One note: I'm trying to convert a remark.js slideshow to PDF. In the CSS I specified plain font names; they appear in my ~/.fonts directory and the HTML displays them correctly.
However, when using decktape, the resulting fonts would fall back to a standard sans-serif font. It seems that decktape/phantomjs does not find the custom fonts.
For this reason, I was forced to use a CSS @font-face rule to download the fonts from the web. This does give a correct PDF but, as described in this issue, one that is very large and has no selectable text.
Strangely enough, the @font-face rule does not work in the HTML document: I get errors that the font cannot be downloaded due to cross-site requests. This could be true, but then I don't understand how decktape/phantomjs gets them.

@astefanutti
Owner

I've isolated a minimal Qt application that demonstrates this is a regression from Qt 4.8 to Qt 5.x. I've created QTBUG-52417 and I'll try to dig into Qt to eventually have a fix.

@astefanutti self-assigned this on Apr 14, 2016
@astefanutti changed the title from "text not selectable with current master" to "Text not selectable with current Linux binaries" on May 12, 2016
@willemmulder

Is there any update on this?

Thanks for looking into this!!

@astefanutti
Owner

@willemmulder, the next step is to apply the workaround proposed in ariya/phantomjs#13997 (comment) and look into a definitive solution to QTBUG-52417.

@willemmulder

But that fix is for Qt code, right? Would PhantomJS be able to 'package' (excuse the wording) that fix so that a new Phantom release fixes the Qt problem, even when Qt itself is not yet fixed?

(in other words: does Phantom reference Qt, or does it actually contain Qt?)

@astefanutti
Owner

@willemmulder, yes, PhantomJS uses its own version of Qt. For instance, the fix could be merged into https://github.com/Vitallium/qtbase.

@willemmulder

@astefanutti Ah that would be perfect! I don't know anything about this myself, but would be really grateful once this is in...!

@astefanutti
Owner

An update on this is available at ariya/phantomjs#13997 (comment).

So there is a working solution. I'm still investing some time in iterating on a proper solution, and I'll update the binaries / Docker image ASAP.

@astefanutti
Owner

astefanutti commented Sep 8, 2016

I think I'll go with the fix that I have as it works for all the test examples that I have.

@astefanutti added this to the 1.0.0 milestone on Sep 20, 2016
@astefanutti
Owner

I've just released version 1.0.0, which contains the fix for this issue. On this occasion, I've managed to produce a statically built / linked version of the PhantomJS Linux binary, which I successfully tested on CentOS, Debian and Ubuntu, see #66.

The Docker image has been updated with this statically linked binary.

I'm so glad it's fixed! Thanks for your patience!

@willemmulder

Great; thank you so much! If I wanted to get the standalone PhantomJS binary, where could I go get it?

@astefanutti
Owner

astefanutti commented Sep 23, 2016

@willemmulder it's all in the 1.0.0 release here: https://github.com/astefanutti/decktape/releases/tag/v1.0.0 (phantomjs-linux-x86-64 download).

@astefanutti
Owner

@willemmulder BTW, I was thinking a plugin for Presenteer.js may be worth it 😉!

@willemmulder

Hey @astefanutti yeah that would be great!

I'm not sure how Decktape works?

Note that there is no Presenteer.js 'wrapper' to actually create presentations. Presenteer.js is (currently) focused on developers, not end-users. Is that something that would need fixing first, or can Decktape work with any Presenteer.js webpage, as long as the Presenteer.js API is clear?

@astefanutti
Owner

@willemmulder DeckTape would just need to rely on the Presenteer.js API. DeckTape basically needs a nextSlide and (slideCount or hasNextSlide) API.

@willemmulder

Okay, so how do we go from here? :-) Is there an instruction-page that I could follow? Or is there something else you need?

@astefanutti
Owner

@willemmulder it's just a matter of creating a plugins/presenteer.js file. You can take the remark.js one as an example of a straightforward implementation.
