Finalise wiki link format #337

Jermolene · 2014-01-11T22:43:05Z

TiddlyWiki5 currently uses the same regular expression for matching WikiLinks as TiddlyWiki Classic:

// Character classes used to build the WikiLink pattern
var textPrimitives = {
    upperLetter: "[A-Z\u00c0-\u00de\u0150\u0170]",
    // digits, underscore and minus all count as "lowercase"
    lowerLetter: "[a-z0-9_\\-\u00df-\u00ff\u0151\u0171]",
    anyLetter:   "[A-Za-z0-9_\\-\u00c0-\u00de\u00df-\u00ff\u0150\u0170\u0151\u0171]",
    // same as anyLetter, but without underscore and minus
    anyLetterStrict: "[A-Za-z0-9\u00c0-\u00de\u00df-\u00ff\u0150\u0170\u0151\u0171]"
};

// Prefix that suppresses a WikiLink
textPrimitives.unWikiLink = "~";
// upper+ lower+ upper any*
textPrimitives.wikiLink = textPrimitives.upperLetter + "+" +
    textPrimitives.lowerLetter + "+" +
    textPrimitives.upperLetter +
    textPrimitives.anyLetter + "*";

In plain language:

- An uppercase letter is defined as the letters A-Z plus the characters: ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞŐŰ
- A lowercase letter is defined as the letters a-z and the digits 0-9 plus the characters: underscore, minus, and ßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿőű
- Any letter is defined as either an uppercase letter or a lowercase letter
- A link mustn't be preceded by any letter (except minus and underscore)
- A link is:
    - one or more uppercase letters
    - followed by one or more lowercase letters
    - followed by one uppercase letter
    - followed by any combination of upper and lowercase letters

There are several obvious problems:

- The characters × and ÷ shouldn't be classified as letters
- The rule that allows links to be preceded by minus or underscore means that the string "HelloThere" in the text "something_HelloThere" is wikified, which seems incorrect
- Allowing the minus sign as a lowercase letter means that many compound nouns in German are erroneously rendered as links
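The rules and the problems listed above can be reproduced with a short script. This is only a sketch: the character classes are copied from the snippet at the top of this issue, but `findWikiLinks` and the way it checks the preceding character are assumptions made for illustration, not the actual TW5 parser code.

```javascript
// Character classes copied from the snippet at the top of this issue.
const textPrimitives = {
    upperLetter: "[A-Z\u00c0-\u00de\u0150\u0170]",
    lowerLetter: "[a-z0-9_\\-\u00df-\u00ff\u0151\u0171]",
    anyLetter: "[A-Za-z0-9_\\-\u00c0-\u00de\u00df-\u00ff\u0150\u0170\u0151\u0171]",
    anyLetterStrict: "[A-Za-z0-9\u00c0-\u00de\u00df-\u00ff\u0150\u0170\u0151\u0171]"
};
textPrimitives.wikiLink = textPrimitives.upperLetter + "+" +
    textPrimitives.lowerLetter + "+" +
    textPrimitives.upperLetter +
    textPrimitives.anyLetter + "*";

// Hypothetical helper (not TW5 code): returns every substring that the
// rules described above would turn into a link.
function findWikiLinks(text) {
    const linkRegExp = new RegExp(textPrimitives.wikiLink, "g");
    // A link is rejected only if the preceding character is a "strict" letter,
    // so minus and underscore do not block it.
    const strictLetter = new RegExp("^" + textPrimitives.anyLetterStrict + "$");
    const links = [];
    let match;
    while ((match = linkRegExp.exec(text)) !== null) {
        const previous = match.index > 0 ? text.charAt(match.index - 1) : "";
        if (!strictLetter.test(previous)) {
            links.push(match[0]);
        }
    }
    return links;
}

console.log(findWikiLinks("something_HelloThere"));  // [ 'HelloThere' ]
console.log(findWikiLinks("Mund-zu-Mund-Beatmung")); // [ 'Mund-zu-Mund-Beatmung' ]
console.log(findWikiLinks("A\u00f7B"));              // [ 'A÷B' ], because ÷ counts as lowercase
console.log(findWikiLinks("xHelloThere"));           // [], blocked by the preceding letter
```

Each of the first three calls is a false positive of the kind described in the list above.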

Generally, though, the rules are arguably far too loose. I think it makes sense for the wikilink rules to be on the conservative side, as it is easier to explicitly link a text than it is to suppress a wiki link.


pmario · 2014-01-12T16:11:24Z

> Generally, though, the rules are arguably far too loose. I think it makes sense for the wikilink rules to be on the conservative side, as it is easier to explicitly link a text than it is to suppress a wiki link.

hmmm,
I think [[create a wikilink]] is more work than ~NoLink. Both escape mechanisms are annoying if you have a lot of them. So imo it depends on the use case and probably the language settings.

Jermolene · 2014-01-12T16:17:59Z

It's not just the work of typing that we should be concerned with, it's also the cognitive overhead of links appearing unexpectedly versus explicitly forcing a link. The trouble with complex rules is that the user has to be able to rerun the rules in their head in order to understand what's going on.

pmario · 2014-01-12T18:49:26Z

I think the rules are not complex. There's just a bug with the "word" detection.

something_HelloThere and something-HelloThere are treated as two words, of which the second is CamelCase. If something_HelloThere and something-HelloThere were treated the same as somethingHelloThere, there would be no problem.

So for me this is just a word detection bug

The question is whether underscores and hyphens join words, so that the parser sees them as one.

In German there are hyphenation rules that are not "optional" (rules 26-30) [1],
e.g. Mund-zu-Mund-Beatmung (TW5 does create a wikilink here).

If we apply the first assumption, that hyphens and underscores join words, we get perfect CamelCase words. Those words can be escaped with ~, e.g. ~Mund-zu-Mund-Beatmung, which imo is a simple rule that actually works.

The problem I see is that adding ~ sucks if you have a lot of compound words like this.
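To make the ~ escape concrete, here is a rough sketch. It assumes the optional `unWikiLink` prefix is simply matched in front of the link pattern; the `wikify` helper and the `[[...]]` output markers are made up for illustration, and the check on the preceding character is left out.

```javascript
// Character classes and prefix as quoted at the top of this issue.
const upperLetter = "[A-Z\u00c0-\u00de\u0150\u0170]";
const lowerLetter = "[a-z0-9_\\-\u00df-\u00ff\u0151\u0171]";
const anyLetter = "[A-Za-z0-9_\\-\u00c0-\u00de\u00df-\u00ff\u0150\u0170\u0151\u0171]";
const unWikiLink = "~";
const wikiLink = upperLetter + "+" + lowerLetter + "+" + upperLetter + anyLetter + "*";

// Hypothetical helper: marks detected links with [[...]] and drops the
// tilde from suppressed ones. Not the real TW5 renderer.
function wikify(text) {
    const pattern = new RegExp(unWikiLink + "?" + wikiLink, "g");
    return text.replace(pattern, (candidate) => {
        if (candidate.charAt(0) === unWikiLink) {
            // Escaped: emit the word as plain text, without the tilde.
            return candidate.slice(1);
        }
        return "[[" + candidate + "]]";
    });
}

console.log(wikify("~Mund-zu-Mund-Beatmung und HelloThere"));
// Mund-zu-Mund-Beatmung und [[HelloThere]]
```

Every compound word needs its own tilde, which is exactly the annoyance described above.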

I'd use \rules except wikilink in this case, or I'd like a way to switch off automatic wikilinks globally and only allow explicit [[wikilinks]].

IMO it depends on the use case whether Mund-zu-Mund-Beatmung is a CamelCase word or not.

If I write an online "first aid" manual that contains the text "Mund-zu-Mund-Beatmung", I'd like the automatic link to an "explanation tiddler".
- Since TW5 actually does what I expect, I have no reason to complain.
- If this rule were changed, I might have a reason :)
If I write a German prose "single tiddler paper" that contains a lot of compound words, I'd probably have a language-specific problem. Maybe this can be solved with "translation plugins".

So I really think it depends very much on the use case.
e.g. "egoism" in German can be "Ichsucht" or "Ich-Sucht". While the first one is "just a word", the second one "wants to indicate importance", which may need a more detailed explanation -> new tiddler -> so it should be a wikilink.

[1] http://www.duden.de/sprachwissen/rechtschreibregeln/bindestrich

ghost · 2014-01-16T15:38:28Z

Hi... I hope you won't mind a little input from a very appreciative and fairly new user.

Regarding this:

I find any sort of link inference annoying (in practice and in theory).
I obviously find the situation worse if such inference is active by default.
It's obviously even worse if one can't ever turn the inference off at all!

In the interest of simplicity (and interoperability), though, it's better to have fewer settings to toggle, so I'd actually vote for no inference at all.

But of course you have the question of backward compatibility to consider.

I believe that 'no inference' (i.e. explicit links only) is the Markdown (and the MediaWiki) way, and I am very keen (for reasons of interoperability etc.) on seeing increased support for that format in TW5; I actually visited here today to raise what turns out to be #352!

Reasons for finding link inference annoying:

I'm always typing upper-camel-case and similar words.
I'm forced to pepper my text with tildes, which reduces portability.
It should be up to me what gets (potentially, eventually) construed as a link; that's why I'm using a wiki!
Conversely, it's a poor use of developer time (and a complication of your code base) trying to concoct universal rules to second-guess requirements that might satisfy everyone!

I hope these viewpoints are of some use, though I'm sure they are obvious to you anyway.

Jermolene · 2014-01-16T15:57:14Z

Hi @pipedelimited that's very useful feedback, thank you. I think you might like the proposal in #345. It's basically a way to allow authors to control the wikification of their own tiddlers whilst giving them interoperability so that their content can be displayed properly in other wikis.

tgirod · 2014-01-17T09:55:48Z

Just a quick word from another somewhat new user to say that I totally agree with @pipedelimited - from my personal experience with wikis, I always end up disabling CamelCase links.

Emphasis is already expressed by enclosing text between double chars - isn't it logical to do the same for links?

Spangenhelm · 2014-01-29T17:55:48Z

+1 for CamelCase and ~tildes, very annoying when using lots of them

pmario · 2014-05-16T15:19:34Z

> There are several obvious problems:
>
> The characters × and ÷ shouldn't be classified as letters

IMO this is a bug and needs to be fixed anyway

pmario · 2014-05-16T15:20:12Z

A related discussion at the group: https://groups.google.com/forum/?fromgroups=#!topic/tiddlywiki/jE46tM1RzPQ

Some info about the Unicode characters: http://www.fileformat.info/info/unicode/block/latin_supplement/images.htm

ssokolow · 2014-07-15T06:20:10Z

For the record, if one of my TWClassic wikis isn't running DisableWikiLinksPlugin from TiddlyTools, it's either because I forgot or because I haven't touched it since I discovered that plugin.

matthias-ronge · 2014-07-24T11:33:31Z

I would like to provide this example image (sentences from German Wikipedia article about Wikipedia itself) to show that this is a real issue in our language. Here are six undesired links showing up in four sentences:

I vote for no inference at all, too. Another reason is that practically none of the links automatically match existing wiki entries, because of linguistic inflection (singular, plural, genitive, compound nouns, …), and it seems to me that the common case is that you need the “pipe”, like:
If you want to [[sail on tidal waters|Sailing on tidal waters]], you will need to …

Jermolene · 2014-07-24T13:50:00Z

I'll try to include this ticket in 5.0.14. My view is that automatic wikilinking should only occur for classical camelcase words. (In your example above @matthias-ronge, "ShareAlike" would be the only wikified word).

pmario · 2014-08-08T06:51:28Z

@Jermolene
Is this issue still on your radar for beta 14? .. It would be nice, if so.


Jermolene · 2014-08-09T10:06:48Z

I've made a series of changes to the camelcase rules for 5.0.14:

- Removed support for underscore and dash within camelcase words (d7390db)
- Stop classifying "÷" (\u00f7) as a lower case letter (f8548cc)
- Stop classifying "×" (\u00d7) as an upper case letter (9c8564d)
- Disabled camelcase recognition when preceded by any letter, dash or underscore

The idea is that these changes take us back to a much stricter, more conservative recognition scheme, with fewer false positives.
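As a rough illustration of the stricter scheme, the sketch below rebuilds the character classes from the list of changes above. The exact ranges, the `strictPrimitives` name and the `findStrictWikiLinks` helper are assumptions reconstructed from that list, not code copied from the commits.

```javascript
// Assumed character classes after the 5.0.14 changes (reconstructed from the
// list of changes, not copied from the commits). × (\u00d7) and ÷ (\u00f7)
// are excluded from the letter ranges, and "_" and "-" are no longer letters.
const strictPrimitives = {
    upperLetter: "[A-Z\u00c0-\u00d6\u00d8-\u00de\u0150\u0170]",
    lowerLetter: "[a-z0-9\u00df-\u00f6\u00f8-\u00ff\u0151\u0171]",
    anyLetter: "[A-Za-z0-9\u00c0-\u00d6\u00d8-\u00de\u00df-\u00f6\u00f8-\u00ff\u0150\u0170\u0151\u0171]"
};
strictPrimitives.wikiLink = strictPrimitives.upperLetter + "+" +
    strictPrimitives.lowerLetter + "+" +
    strictPrimitives.upperLetter +
    strictPrimitives.anyLetter + "*";

// Hypothetical helper: a candidate is rejected when the character directly
// before it is any letter, a dash or an underscore.
function findStrictWikiLinks(text) {
    const linkRegExp = new RegExp(strictPrimitives.wikiLink, "g");
    const blockingPrefix = new RegExp("^(?:" + strictPrimitives.anyLetter + "|[_\\-])$");
    const links = [];
    let match;
    while ((match = linkRegExp.exec(text)) !== null) {
        const previous = match.index > 0 ? text.charAt(match.index - 1) : "";
        if (!blockingPrefix.test(previous)) {
            links.push(match[0]);
        }
    }
    return links;
}

console.log(findStrictWikiLinks("something_HelloThere"));         // []
console.log(findStrictWikiLinks("My-HelloThere"));                // []
console.log(findStrictWikiLinks("Mund-zu-Mund-Beatmung"));        // []
console.log(findStrictWikiLinks("See HelloThere and ShareAlike")); // [ 'HelloThere', 'ShareAlike' ]
```

Under these assumptions only classical camelcase words survive, which matches the intent stated above.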

AwesomeAxolotl · 2014-08-09T10:54:09Z

Sounds nice, when can we expect 5.0.14 to be out?
I've downloaded TiddlyWiki for the first time today, stumbled over the problem (being a German speaker) and found it a nice coincidence that it just seems to be fixed by today.

Jermolene · 2014-08-09T10:57:38Z

Thanks @mindfaQ - 5.0.14 will be out in the next couple of days.

pmario · 2014-08-09T10:58:23Z

I did a short test and it seems nice till now :) thx a lot!

Jermolene · 2014-08-09T10:59:38Z

Great, glad to hear that @pmario

tgirod · 2014-09-12T21:03:54Z

Just a quick word on disabling wikilinks - for a project I have to generate tiddlers automatically from an external source which is not wiki-aware, and I get a few false wikilinks. Not many, but enough to make my sense of consistency tingle.

As regular imports need to be done, it is not feasible to correct the links manually. Correcting them automatically would mean reimplementing TiddlyWiki's wikilink rules in my generator, which is boring.

pmario · 2014-09-13T07:11:15Z

Hi @tgirod Your importer can add the following line to the imported tiddler.

\rules except wikilink

What do your false positive links look like?
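A generator could add that pragma roughly like this. It is a minimal sketch: `disableWikiLinks` is a made-up helper name, and it assumes the importer has the tiddler text available as a plain string before saving it.

```javascript
// The pragma line from the comment above; it has to come first in the text.
const WIKILINK_PRAGMA = "\\rules except wikilink";

// Hypothetical importer step: prepend the pragma unless it is already there.
function disableWikiLinks(tiddlerText) {
    if (tiddlerText.startsWith(WIKILINK_PRAGMA)) {
        return tiddlerText;
    }
    return WIKILINK_PRAGMA + "\n" + tiddlerText;
}

console.log(disableWikiLinks("Part T4LB07 is in stock."));
// \rules except wikilink
// Part T4LB07 is in stock.
```

Because the helper checks for an existing pragma, running the import repeatedly does not stack up duplicate lines.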

tgirod · 2014-09-14T10:08:23Z

Hi @pmario
OK, I didn't know about this feature, it's pretty nice. Most of my positives look like regular WikiWords, so it's normal to see them wikified. I think there is only one that could justify a change in the wikilink detection: T4LB07

A mix of uppercase letters and numbers can appear as a designation code, or a password, or something else and should not be wikified by default, I think.
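The reason T4LB07 gets wikified is that digits sit in the lowercase class. The sketch below shows this with an ASCII-only simplification of the pattern; `explainMatch` and the `noDigitLowercase` variant are purely hypothetical and only illustrate what a change could look like.

```javascript
// ASCII-only simplification: digits count as lowercase letters, as in the
// character classes quoted at the top of this issue.
const currentRule = /^([A-Z]+)([a-z0-9]+)([A-Z])([A-Za-z0-9]*)$/;
// Hypothetical variant: digits are no longer accepted as lowercase letters.
const noDigitLowercase = /^([A-Z]+)([a-z]+)([A-Z])([A-Za-z0-9]*)$/;

// Hypothetical helper: shows which part of the word satisfies which step
// of the rule, or returns null if the word is not a link.
function explainMatch(rule, word) {
    const m = rule.exec(word);
    if (m === null) {
        return null;
    }
    return { upper: m[1], lower: m[2], hump: m[3], rest: m[4] };
}

console.log(explainMatch(currentRule, "T4LB07"));
// { upper: 'T', lower: '4', hump: 'L', rest: 'B07' }
console.log(explainMatch(noDigitLowercase, "T4LB07"));     // null
console.log(explainMatch(noDigitLowercase, "HelloThere"));
// { upper: 'H', lower: 'ello', hump: 'T', rest: 'here' }
```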

pmario · 2014-09-14T11:12:16Z

> T4LB07
>
> A mix of uppercase letters and numbers can appear as a designation code, or a password, or something else and should not be wikified by default, I think.

I think you should create a new issue for this one.

Jermolene mentioned this issue Mar 22, 2014

New Tiddler auto creates references #501

Closed

pmario mentioned this issue Jul 24, 2014

Many dead links showing up #697

Closed

Jermolene closed this as completed in d7390db Aug 8, 2014

Jermolene pushed a commit that referenced this issue Aug 8, 2014

Stop classifying "×" (\u00d7) as a lower case letter

9c8564d

Part of #337

Jermolene pushed a commit that referenced this issue Aug 8, 2014

Stop classifying "÷" (\u00f7) as an upper case letter

f8548cc

Part of #337

Jermolene pushed a commit that referenced this issue Aug 8, 2014

Disable wikilinking when preceded with dash or underscore

2526bfb

“HelloThere” in “My-HelloThere” shouldn’t be wikified. Part of #337

mdrda mentioned this issue Apr 20, 2019

Unicode CamelCase issue #3927

Open
