Remove hyphenate.js #352

Closed
wants to merge 4 commits into from

brad-anderson
Member

I know Andrei is passionate about hyphenation but the current client-side javascript hyphenation slows down the website significantly for all users. If hyphenation is desired it should be done during page generation.

Replaced by #367

@andralex
Member

No please. There are many things to improve before we get to remove hyphenation. Removing hyphenation would entail giving up on justified formatting so it would be a net downgrade in readability.

@andralex andralex closed this Jul 15, 2013
@andralex
Member

A better move: we should upgrade to the latest hyphenator, something we haven't done in a while. Newer releases support CSS3 hyphenation and add more optimizations: https://code.google.com/p/hyphenator/

@brad-anderson
Member Author

I already upgraded hyphenate.js to a version that supports CSS3 (hyphenate.js 4.0) about a year ago (4d58b3b). It helped a little bit but it's still slow. I don't see any optimizations in the changelog since the 4.0 release we are using. We should upgrade if we are going to continue using hyphenate.js, of course, but I don't think it will solve the performance problem.

@brad-anderson
Member Author

@andralex Hyphenate.js upgrade pull request #353.

@andralex
Member

@eco great, thanks. We're still two releases behind and at least 4.1.0 does add optimizations, see https://code.google.com/p/hyphenator/wiki/en_VersionHistory: faster pattern checking (async), globally hide and unhide text by setting CSS classes (faster).

There is also advice on writing fast pages, see https://code.google.com/p/hyphenator/wiki/en_Optimizations, paragraph starting with "What an webauthor can do: ..." (we don't do that). I'm not sure where we are with sync/async/flicker control for our site; a good start would be to explore the different hyphenation modes.

In brief, no problem with removing this if necessary, but there are a bunch of things we could and should do (with hyphenator and orthogonal to it) to improve page speed before removing this valuable functionality.

@braddr
Member

braddr commented Jul 15, 2013

If this feature costs a perceptible performance difference or is the source of shifting content, then I believe that we're making the wrong customer experience trade-off here. Hyphenation and text justification are of marginal value, imho. Snappy pages that only render once are of high value.

@andralex
Member

@braddr we're in good shape thanks to @eco

@MartinNowak
Member

Come on guys, there's no reason why hyphenation for static content has to be done by the client.
It shows up in the timeline profile at 2 seconds, and despite the missing CDN it's the number one reason why the documentation loads sluggishly.
So please integrate any offline implementation with the makefile.

Generated from:
http://hyphenator.googlecode.com/svn/tags/Version%204.2.0/mergeAndPack.html

Vim apparently has a bug when copying and pasting text from Chrome.
@brad-anderson
Member Author

Obviously I'm in favor of this because I opened this pull request. The current state of things after I worked on a compromise is that fast CSS3 hyphens support is now used on every browser but Chrome and Opera.

Doing it in a makefile was also what I felt should happen until I read a line in hyphenator's FAQ:

> A html file that has been hyphenated on the server would be full of &shy;'s. That's ugly and nobody knows how that would be treated by search engines

The last thing D needs is making everything harder to search for. The only two options we are left with are no hyphenation on Chrome and Opera or to use hyphenator (we needn't worry about other browsers as the CSS3 hyphens support works great and is very fast in my testing).

@quickfur
Member

@andralex I disagree that this is "valuable functionality". It's a major time waster on Opera (my main browser), and makes the site hard to use for me. After turning off JS on dlang.org, I found that the site is significantly more usable and suffers no noticeable degradation in readability. Judging by the repeated complaints about this on the D forum, I'm far from the only one experiencing this. Please reconsider this pull.

I agree that if hyphenation is deemed necessary, then it should be done server-side, not client-side. Repeatedly hyphenating static HTML pages on the client-side simply makes no sense.

@MartinNowak
Member

I hosted a copy of the documentation with hyphenate.js disabled and
server-side compression (mod_deflate) at http://dlang.dawg.eu.

Just an extreme example.
http://dlang.org/phobos/std_datetime.html 7.5s until DOMContent event, 20s until Load event
http://dlang.dawg.eu/phobos/std_datetime.html 800ms until DOMContent event, 1.2s until Load event
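To put the two measurements side by side, here is a small sketch that computes the ratios from the figures quoted above. The numbers are copied from this comment; the object layout and the `speedup` helper are only illustrative, and the comparison also includes whatever server and network differences exist between the two hosts.

```javascript
// Timings quoted in the comment above, converted to milliseconds.
const timings = {
  "dlang.org": { domContentLoaded: 7500, load: 20000 },
  "dlang.dawg.eu": { domContentLoaded: 800, load: 1200 },
};

// Ratio of the slower timing to the faster one.
function speedup(slowMs, fastMs) {
  return slowMs / fastMs;
}

const domSpeedup = speedup(
  timings["dlang.org"].domContentLoaded,
  timings["dlang.dawg.eu"].domContentLoaded
);
const loadSpeedup = speedup(
  timings["dlang.org"].load,
  timings["dlang.dawg.eu"].load
);

console.log(`DOMContent: ${domSpeedup.toFixed(1)}x, Load: ${loadSpeedup.toFixed(1)}x`);
```

On these figures the copy without hyphenate.js reaches the DOMContent event roughly 9 times sooner and the Load event roughly 17 times sooner.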

@brad-anderson
Member Author

Just to be clear, Jan enabled mod_deflate on dlang.org, so the only thing
that differs is hyphenator (and whatever server performance/geography
differences there are). This shows how truly slow it is. I've been
testing with the massive changelog page and getting similar numbers.

On the subject of server configuration, I'm probably going to prepare some
small configurations we can send to Jan to enable mod_expires and get
caching going (unless someone else beats me to it first).

@quickfur
Member

To back up @dawgfoto 's numbers, I just did a quick test: (1) Clear browser cache, enable JS, reload dlang.org: total load time: 7s. (2) Clear browser cache, disable JS, reload dlang.org: 1.9s. Guess which setting I will be using from now on.

@MartinNowak
Member

And another update, running listanchors() with jQuery(document).ready(listanchors) instead of <body onload="listanchors()" class='hyphenate'> gets rid of the annoying relayout. #365
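The difference between the two hooks is that `onload` waits for every subresource, while a DOM-ready callback fires as soon as the markup has been parsed. A dependency-free sketch of the same idea follows; it is not the code from #365, and the `onDomReady` helper is a hypothetical name. The document object is passed in as a parameter so the helper can also be exercised outside a browser.

```javascript
// Run `fn` once the DOM has been parsed, without waiting for images and
// other subresources the way <body onload="..."> does.
// `doc` is a parameter (instead of the global `document`) so that the
// helper can be tried with a stand-in object.
function onDomReady(doc, fn) {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the DOMContentLoaded event.
    doc.addEventListener("DOMContentLoaded", fn, { once: true });
  } else {
    // The DOM is already available: call right away.
    fn();
  }
}

// In a page this would be used as: onDomReady(document, listanchors);
```

Unlike `jQuery(document).ready`, this sketch calls `fn` synchronously when the DOM is already parsed, which is close enough for a one-off setup function.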

@MartinNowak
Member

No mod_deflate on my end btw.

@andralex
Member

The whole listanchors business should be gotten rid of, it's a net negative. Ideas for a better design?

andralex added 2 commits July 22, 2013 10:53
Restore unicode characters in hyphenate.js
build list anchors as soon as the DOM is ready
@brad-anderson
Member Author

Seems like there are four options:

  1. Extend DDOC in a way that would support generation of lists like these.
  2. Write a tool that can generate them that can run in the Makefile.
  3. Write them by hand for modules missing them (as is done for std.algorithm and std.range).
  4. Just ignore the problem because the ddox stuff (if it ever gets merged) will solve this and other problems (Add necessary files and new build targets for ddox based HTML documentation #267).

I prefer 4 because it brings more to the table overall and is the least amount of extra work. 1 feels like something Walter wouldn't go for and I don't know if I like it either. 2 and 3 just require a dedicated person or two to do the work.

@braddr
Member

braddr commented Jul 22, 2013

I strongly favor static generation over dynamic generation when the dynamic output is identical to the static one. The way I'd frame this: what cannot be done at doc build time? (Note: not what isn't done currently; I mean what can't be done even with work on the generator.) Serving static files is significantly more efficient than any other mechanism.

Personally, the order you listed them matches my order of preferences.
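As a rough illustration of option 2 (a tool run from the Makefile), the sketch below extracts named anchors from generated HTML and renders a list of links at build time. It assumes the list is simply a set of links to `<a name="...">` anchors already present in the page, which this thread does not confirm; `collectAnchorNames` and `renderAnchorList` are hypothetical names, and a real tool would use an HTML parser and escape the names.

```javascript
// Collect the values of name="..." attributes on <a> tags, in document order.
// A regular expression is enough for a sketch; it is not a robust HTML parser.
function collectAnchorNames(html) {
  const names = [];
  const pattern = /<a\b[^>]*\sname\s*=\s*"([^"]+)"/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// Render the collected names as a flat list of in-page links.
// Names are assumed to be plain identifiers that need no escaping.
function renderAnchorList(names) {
  return names.map((name) => `<a href="#${name}">${name}</a>`).join(" | ");
}
```

The generated string would be written into the page once during the doc build, so nothing has to run in the browser.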

@MartinNowak
Member

How about going only with css hyphens which are already enabled? http://caniuse.com/css-hyphens

@quickfur
Member

@dawgfoto +1, use CSS hyphenation if the browser supports it, otherwise just leave it alone. Get rid of JS hyphenation.
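With CSS-only hyphenation no script is needed: browsers without support simply ignore the declaration. For anyone who wants to check what a given browser exposes, here is a small diagnostic sketch. The `findHyphensProperty` helper is a hypothetical name, and the presence of the property does not prove that the `auto` value is implemented.

```javascript
// Candidate property names: the unprefixed one plus the vendor-prefixed
// forms as they appear on a CSSStyleDeclaration object.
const HYPHENS_PROPERTIES = ["hyphens", "webkitHyphens", "MozHyphens", "msHyphens"];

// Returns the first hyphens property the given style object knows about,
// or null when there is none. A browser can expose the property and still
// not implement `auto`, so treat a non-null result as a hint only.
function findHyphensProperty(style) {
  for (const name of HYPHENS_PROPERTIES) {
    if (name in style) {
      return name;
    }
  }
  return null;
}

// In a browser console: findHyphensProperty(document.documentElement.style)
```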

@brad-anderson
Member Author

This would be my preferred approach as well. hyphens: auto is already in place so we'd just need to get rid of hyphenator.

@MartinNowak
Member

Can you update this pull request then, and leave the class="hyphenate" and class="donthyphenate" in place?

@brad-anderson
Member Author

I've rebased it against master and modified it as needed. The new commit isn't showing up here though. My guess is GitHub doesn't bother to resync closed pull requests. I can create a new pull request if necessary.

@brad-anderson
Member Author

Can one of you org members reopen this?

@jmdavis jmdavis reopened this Jul 24, 2013
@brad-anderson
Member Author

Thanks for reopening, Jonathan, but GitHub is not picking up the current branch state for some reason. Opened #367 to replace this.

@MartinNowak
Member

> Thanks for reopening, Jonathan, but GitHub is not picking up the current branch state for some reason. Opened #367 to replace this.

To trigger an update you need to force push a new commit hash.
A simple way to get one is to amend nothing. The updated commit date changes the hash.

@MartinNowak
Member

How about using hyphenate.d instead?
We'd need to find a way to process HTML files or doc comments though.
I guess it would be pretty simple to integrate that with ddox (@s-ludwig).
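To make the build-time idea concrete, here is a toy sketch of inserting soft hyphens (U+00AD, the character behind `&shy;`) into text during page generation. It does not reflect how hyphenate.d works: the `BREAKS` table and its entries are made up for illustration, whereas a real tool would compute break points from hyphenation patterns and would only touch text nodes, not tags or code.

```javascript
const SOFT_HYPHEN = "\u00AD";

// Hypothetical break-point table (lowercase word -> syllables).
// A Map is used so that words like "constructor" do not collide with
// properties inherited by plain objects.
const BREAKS = new Map([
  ["hyphenation", ["hy", "phen", "ation"]],
  ["documentation", ["doc", "u", "men", "ta", "tion"]],
]);

// Insert soft hyphens into a single word, preserving its original case.
function hyphenateWord(word) {
  const parts = BREAKS.get(word.toLowerCase());
  if (!parts) {
    return word; // Unknown word: leave it untouched.
  }
  const pieces = [];
  let pos = 0;
  for (const part of parts) {
    pieces.push(word.slice(pos, pos + part.length));
    pos += part.length;
  }
  return pieces.join(SOFT_HYPHEN);
}

// Apply hyphenateWord to every run of ASCII letters in plain text.
function hyphenateText(text) {
  return text.replace(/[A-Za-z]+/g, (word) => hyphenateWord(word));
}
```

The output renders identically to the input until a line needs to break inside one of the known words.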

@brad-anderson
Member Author

Nice. My concern is still that adding &shy; throughout the middle of words has an unknown effect on search engine results and rankings.

@MartinNowak
Member

While this is a valid concern, I don't think that is an issue for us. Google does correctly ignore soft hyphens so searching on dlang.org will work. Most search engines seem to handle HTML4 correctly.
I'm not convinced that high ranking of our documentation for unrelated searches is that important.
You'll always find D related stuff by adding D, dlang or site:dlang.org to your query.
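What "ignoring soft hyphens" means for matching can be sketched as a normalization step. This is only an illustration of the idea, not a description of how any search engine is implemented; `stripSoftHyphens` and `matchesQuery` are hypothetical helpers.

```javascript
// Remove soft hyphens, both as the raw U+00AD character and as the
// &shy; entity as it would appear in page source.
function stripSoftHyphens(text) {
  return text.replace(/\u00AD|&shy;/g, "");
}

// Case-insensitive substring match on the normalized text.
function matchesQuery(pageText, query) {
  return stripSoftHyphens(pageText).toLowerCase().includes(query.toLowerCase());
}
```

An indexer that normalizes this way would treat `next&shy;Per&shy;mu&shy;ta&shy;tion` and `nextPermutation` as the same word.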

If we decided SEO was important to us we should tackle this thoroughly rather than making decisions based on rumors.
A quick test searching for popFront shows that we're ranking very low currently. Google, Bing, Yahoo!, DuckDuckGo and Ixquick don't show D related results unless you add empty or D to the query.
You can get similar results for findSplit, nextPermutation or partition3.

Having a look at an SEO report for dlang.org, improving the load time is probably the most rewarding task on this list.

@brad-anderson
Member Author

My main point is that hyphenation does not add enough value to risk even a small dip in page ranking and searchability. D is already criticized for being difficult to search for (as you just showed yourself), so much so that a FAQ entry had to be added to help alleviate the problem. site:dlang.org works well and I use it extensively (the keyword d in my browser combines it with I'm Feeling Lucky, which has been great: http://www.google.com/search?q=site:dlang.org+%s&btnI). We can't really expect everyone to know or use that, though.

If you can confirm adding &shy; everywhere won't hurt search results then I'm all for hyphenation done during page generation. Page source would be a bit ugly but that's not really a major problem. I'm sure Andrei would appreciate it.

@MartinNowak
Member

My argument was slightly different.

  • I don't think that search engine ranks for searches unrelated to D are important.
    The D-related ones are very specific anyhow because dlang.org ranks extremely
    high for the keyword 'D'.
  • The small survey just showed that nobody paid attention to this until now.
    So a potential drop from 1000th to 1010th rank for searching findSplit is really
    not an argument.
  • Even if we cared about all of this, there are plenty of other opportunities
    for optimization, the most valuable being load time.

Frankly the documentation is currently a pain.
As shown by http://dlang.dawg.eu there are exactly two things to fix besides #365,
getting the site on a CDN and getting rid of hyphenate.js. No two ways about this and the
sooner we get this done the better.
So I'm for merging #367 ASAP, integrating a hyphenation alternative soon and getting the CDN
organized soonish.

NB:
DPL-docs also suffers terribly from hyphenate.js load times.

@brad-anderson
Member Author

I just think the overall visibility of D on search engines matters. A higher pagerank means someone searching for "alternative to C++" is more likely to find D and that's the sort of thing that is important. Reference lookup searches are not all that important as you've said. I think you are probably right that &shy; doesn't affect page ranking though.

I think we are in complete agreement about the course of action. I was going to send some Apache configuration to Jan to get proper caching of the site, but a CDN would be even better.
