Long words disappearing #39

Open
bminde opened this issue Mar 6, 2014 · 11 comments

@bminde

bminde commented Mar 6, 2014

Long words are not shown; the reader skips straight to the next word.

E.g., on this page http://www.nrk.no/nordland/en-av-fire-disponert-for-narkolepsi-1.11585097 the words "forskningsartikkel", "svineinfluensaviruset", "Pandemrix-vaksinen" and "årsakssammenhengen" are not shown.

@0xE282B0

0xE282B0 commented Mar 6, 2014

spritz.js:

173        var tail = 22 - (word.length + 7);
174        word = '.......' + word + ('.'.repeat(tail));
310        String.prototype.repeat = function( num ){
311            return new Array( num + 1 ).join( this );
312        }

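"forskningsartikkel" has length 18, so tail = 22 - (18 + 7) = -3 and repeat() ends up calling new Array(-2):

Uncaught RangeError: Invalid array length (on line 311)

The maximum word length is therefore 16 (at 16 characters tail is -1, and new Array(0) is still valid).
I think we need to split long words. The question is where.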

Any suggestions?
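
As a stopgap for the RangeError described above, the tail length could be clamped at zero. This is only a sketch: `padWord`, `FRAME_WIDTH` and `LEAD` are hypothetical names (the constants mirror the literals 22 and 7 from the quoted lines), and it relies on the native `String.prototype.repeat` rather than the polyfill on line 310. It stops the exception but does not make a long word fit the frame, so splitting is still needed.

```javascript
// Hypothetical constants mirroring the literals 22 and 7 in the quoted code.
const FRAME_WIDTH = 22;
const LEAD = 7;

// padWord is a hypothetical helper, not a function that exists in spritz.js.
function padWord(word) {
  // Clamp at 0 so words longer than FRAME_WIDTH - LEAD never yield a negative count.
  const tail = Math.max(0, FRAME_WIDTH - (word.length + LEAD));
  return '.'.repeat(LEAD) + word + '.'.repeat(tail);
}

console.log(padWord('word'));
console.log(padWord('forskningsartikkel'));
```

Short words are still padded to the full frame width; words over the limit come back with the leading dots only and overflow the frame.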

@Miserlou
Owner

Miserlou commented Mar 6, 2014

Ah! Great catch. I guess it's not going to work all that well for German, Norwegian, etc. right now.

22 is an arbitrary number. We could just raise that up. Would that solve your problem?

@0xE282B0

0xE282B0 commented Mar 6, 2014

We can use soft hyphens (U+00AD, `&shy;` in HTML) to split long words. For German and English there is, for example, hyphenator.js https://code.google.com/p/hyphenator/ which can also be used as a bookmarklet. But I have no solution for Norwegian or other languages.
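
A minimal sketch of how soft hyphens could drive the split, assuming the word has already been annotated with U+00AD by some hyphenation step. `splitAtSoftHyphens` is a hypothetical helper, not part of spritz.js or hyphenator.js, and the break points in the usage line are illustrative rather than linguistically verified.

```javascript
const SHY = '\u00AD'; // soft hyphen

// Hypothetical helper: greedily packs syllables into chunks of at most maxLen
// characters, ending every chunk except the last with a visible hyphen.
// A single syllable longer than maxLen - 1 is not split further.
function splitAtSoftHyphens(word, maxLen) {
  const syllables = word.split(SHY);
  const chunks = [];
  let current = '';
  for (const syl of syllables) {
    // The +1 reserves room for the trailing hyphen on non-final chunks.
    if (current !== '' && current.length + syl.length + 1 > maxLen) {
      chunks.push(current + '-');
      current = syl;
    } else {
      current += syl;
    }
  }
  chunks.push(current);
  return chunks;
}

console.log(splitAtSoftHyphens('for\u00ADsk\u00ADnings\u00ADar\u00ADtik\u00ADkel', 10));
```

Words without any soft hyphen come back as a single chunk, so the helper can be applied to every word and only changes the ones that were annotated.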

@elicwhite
Contributor

On the official spritz example, I think they actually split words in the middle, and use dashes to show parts of the word over multiple frames.
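
The mid-word splitting described above could be sketched as follows when no hyphenation data is available. `splitFixed` is a hypothetical helper; whether the official Spritz reader picks its break points this way is not confirmed here.

```javascript
// Hypothetical helper: cuts a word into pieces of at most maxLen characters,
// ending every piece except the last with a dash.
function splitFixed(word, maxLen) {
  if (maxLen < 2) {
    throw new RangeError('maxLen must be at least 2');
  }
  const chars = Array.from(word); // iterate by code point, not UTF-16 unit
  if (chars.length <= maxLen) {
    return [word];
  }
  const step = maxLen - 1; // reserve one column for the dash
  const parts = [];
  let i = 0;
  while (chars.length - i > maxLen) {
    parts.push(chars.slice(i, i + step).join('') + '-');
    i += step;
  }
  parts.push(chars.slice(i).join(''));
  return parts;
}

console.log(splitFixed('svineinfluensaviruset', 10));
```

Each piece would then be shown as its own frame. The cut points ignore syllable boundaries, which is the main argument in this thread for a real hyphenator.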

@F30

F30 commented Mar 10, 2014

This is really annoying and currently makes OpenSpritz hardly usable for German texts. Hyphenation is the way to go in my opinion.

@tomByrer
Contributor

@F30, how long are typical German words, please?

@kukulski

I use the hyphenator that @smielke mentions; it exposes a hyphenateWord method. I only hyphenate words that are too long (presently judged by character length, but I'll be upgrading to base this on rendered width in ens). See: https://github.com/kukulski/readifry/blob/master/main.js

@F30

F30 commented Mar 16, 2014

@tomByrer Hmm, hard to estimate. We do have words like „Ver­mö­gens­zu­ord­nungs­zu­stän­dig­keits­über­tra­gungs­ver­ord­nung“ [1], but such words are of course the exception rather than the rule. As the graphic in [2] is 404'ing, I unfortunately couldn't find a source for the word length distribution, but the article states an average length of 10.6 characters.

I agree, however, that it would be sensible to add hyphenation for long words instead of just increasing the maximum to some arbitrary value.
As far as I know, there are both algorithmic and word-list-based approaches to hyphenation. The one @smielke mentioned seems to be algorithmic, which of course appears preferable for a JS solution. It also promises to support more languages than he mentioned [3].


[1] http://www.sprachlog.de/2013/06/05/das-neue-laengste-wort-des-deutschen/
[2] http://www.duden.de/sprachwissen/sprachratgeber/durchschnittliche-laenge-eines-deutschen-wortes
[3] http://code.google.com/p/hyphenator/wiki/en_AddNewLanguage

@tomByrer
Contributor

> average length of 10.6 characters

With a max length of 18 characters, hyphenating does sound best.

@Miserlou
Owner

Hyphenator is a good idea here. Is there a preferred method, or should we just rip out the readifry one?


@kukulski

I broke out my hyphenation wrapper to make it easy to pick up: https://github.com/kukulski/readifry/blob/gh-pages/HyphenHelper.js
