Bold inside a word (or sentence in this case). #640

Closed
graywolf opened this issue Mar 26, 2019 · 12 comments

Comments

@graywolf
Contributor

graywolf commented Mar 26, 2019

Currently it's not possible to make just part of a word bold (as already mentioned in #545). That works reasonably well for English text, but it does not play well with languages that do not use spaces to separate words.

I don't know much about Vim programming, but if I figure out how to add an alternative sequence for bold, something reasonably unlikely to appear in normal text (for example !*), would that be considered for merging, or is this something the project will not accept?

Basically, I would be able to write !*どれ!*ですか。 to make the どれ part bold without needing to surround it with spaces.
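
To make the problem concrete, here is a minimal sketch in Python rather than Vim script. Both patterns are hypothetical approximations written for illustration; they are not vimwiki's actual syntax patterns, and Python's `\w` is only a stand-in for Vim's iskeyword-based notion of a word character. The first pattern only accepts `*` markers that sit at word edges, the second accepts them anywhere.

```python
import re

# Hypothetical "markers must sit at word edges" rule (NOT vimwiki's real pattern):
# the opening * may not follow a word character, the closing * may not precede one.
BOUNDED_BOLD = re.compile(r"(?<!\w)\*(\S(?:.*?\S)?)\*(?!\w)")

# Hypothetical relaxed rule: the markers may appear anywhere, even inside a word.
INTRAWORD_BOLD = re.compile(r"\*(\S(?:.*?\S)?)\*")


def bold_spans(pattern, text):
    """Return the text found between each pair of bold markers."""
    return [m.group(1) for m in pattern.finditer(text)]
```

Under these assumptions, `bold_spans(BOUNDED_BOLD, "*どれ*ですか。")` finds nothing, because the closing marker is followed directly by で, while `bold_spans(INTRAWORD_BOLD, "*どれ*ですか。")` returns `["どれ"]`.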

@hq6
Contributor

hq6 commented Mar 26, 2019

I suspect this might cause problems with having to double-escape the sequence !* or whatever other sequence you choose. I think it might make more sense for it to be an option which, if enabled, would cause * to work inside words.

@EinfachToll @Nudin @ranebrown, do you want to weigh in on this?

@Nudin
Member

Nudin commented Mar 26, 2019

@hq6 I'm not convinced that there isn't a more elegant way to solve this. I'd hope for a better solution with an improved regexp that doesn't depend explicitly on spaces. \<\> should match word boundaries, but its implementation in Vim is rather primitive and based on iskeyword.
This leads me to wonder: how do you navigate word-wise in CJK texts? @graywolf Can you educate us a bit on that? Have you changed any settings for Japanese text handling?

@Nudin
Member

Nudin commented Mar 26, 2019

Something like \(\W\|\<\) could work as a delimiter, but it is not as elegant as getting a regex that really works word-wise.

@ranebrown
Contributor

I agree, this shouldn't be an option but rather an improvement to the existing syntax definition / regex patterns. From what I have found, the commonly accepted way to handle this is to allow only asterisks, not underscores, for intraword emphasis (CommonMark Spec, Meta StackExchange). E.g. foo*bar* should render italic and foo**bar** should render bold. Following this path would change the default syntax, since italic is currently only defined using underscores, but I don't see another good option, since words containing underscores that you wouldn't want emphasized are fairly common.
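
The rule described above can be sketched roughly as follows. This is a simplified Python illustration, not the CommonMark flanking algorithm and not vimwiki's syntax definition; the pattern names and the `emphasis_spans` helper are made up for the example. Asterisks are accepted inside words, underscores only when they are not glued to a word character on the outside.

```python
import re

# Bold: double asterisks, accepted anywhere (also inside a word).
STAR_BOLD = re.compile(r"\*\*([^\s*](?:[^*]*?[^\s*])?)\*\*")

# Italic: a single asterisk (not part of **), accepted anywhere.
STAR_ITALIC = re.compile(r"(?<!\*)\*(?!\*)([^\s*](?:[^*]*?[^\s*])?)\*(?!\*)")

# Italic: underscores only count when no word character touches them outside,
# so identifiers such as snake_case_name are left alone.
UNDERSCORE_ITALIC = re.compile(r"(?<!\w)_([^\s_](?:[^_]*?[^\s_])?)_(?!\w)")


def emphasis_spans(text):
    """Return (kind, content) pairs for every emphasis span found in text."""
    spans = [("bold", m.group(1)) for m in STAR_BOLD.finditer(text)]
    spans += [("italic", m.group(1)) for m in STAR_ITALIC.finditer(text)]
    spans += [("italic", m.group(1)) for m in UNDERSCORE_ITALIC.finditer(text)]
    return spans
```

With these assumed patterns, `foo*bar*` yields an italic span, `foo**bar**` a bold span, and `snake_case_name` yields nothing.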

@hq6
Contributor

hq6 commented Mar 27, 2019

@ranebrown I think your suggestion makes sense, although I think a backwards-incompatible change might justify a version jump by itself, possibly paired with other backwards-incompatible changes.

@Nudin, I'm not sure I understand your suggestion. Are you suggesting that we should try to discover word boundaries in a more general-purpose way and still use * and _ around word boundaries for a different definition of word boundaries?

@Nudin
Member

Nudin commented Mar 27, 2019

> @Nudin, I'm not sure I understand your suggestion. Are you suggesting that we should try to discover word boundaries in a more general-purpose way and still use * and _ around word boundaries for a different definition of word boundaries?

Yes.

@graywolf
Contributor Author

graywolf commented Mar 27, 2019

@Nudin A necessary disclaimer: I'm not a native speaker. I'm just studying the language and trying to use vimwiki to take notes, so most of my Japanese texts are short (one or a few sentences), and usage by a native speaker could be quite different.

With that out of the way: I use the normal word movements (e, b) without any special settings or plugins. Take the example sentence これはだれの犬ですか。, which from a linguistic point of view is これ, は, だれ, の, 犬, です, か, 。. Vim, however, treats it as これはだれの, 犬, ですか, 。 for word movement (afaict).

This works (I think) because Japanese is written with three scripts used together, and Vim seems to place word boundaries where the script changes. How it works for Chinese, for example (which has only one script afaik), I have no idea; it probably does not work in a usable way.
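
The script-change behaviour described above can be imitated with a rough Python sketch. It is only an approximation built on a few Unicode code-point ranges; it is not how Vim implements word motions, and the `script_of` and `script_runs` names are invented for the example.

```python
from itertools import groupby


def script_of(ch):
    """Coarse script class for one character, based on code-point ranges only."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    if ch.isspace():
        return "space"
    if ch.isalnum():
        return "other-word"
    return "punct"


def script_runs(text):
    """Split text into runs of characters that share a script class."""
    return ["".join(run) for kind, run in groupby(text, key=script_of)
            if kind != "space"]
```

Under these assumptions, `script_runs("これはだれの犬ですか。")` gives `["これはだれの", "犬", "ですか", "。"]`: runs break where the script changes, not where the linguistic words end.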

As for improving automatic detection of word boundaries, this Quora question https://www.quora.com/What-are-some-Japanese-tokenizers-or-tokenization-strategies , especially the answer from Graham Neubig, is pretty interesting. So while it's possible to do somewhat reliably (I read somewhere about 95% accuracy for TinySegmenter), it's probably no small feat to implement.

And imho you would also need a way to flag parts of a file as being in a particular language, since if a file is a mix of English and Japanese (as mine are), each language would need its own word-splitting method applied to the right parts.

@hq6
Contributor

hq6 commented Mar 27, 2019

I do not have a strong opinion on whether it makes more sense to change what the syntax means inside words vs trying to find word boundaries in a different way.

The disadvantage of the former is that we're breaking backwards compatibility, while the disadvantage of the latter is that it might be trickier to implement correctly. The advantage of the former is that it might generalize more easily to other use cases.

If there are more native speakers of languages that do not use spaces to break words (who use vimwiki), perhaps they could weigh in on this issue?

@graywolf
Contributor Author

graywolf commented Mar 28, 2019

Here is a patch I'm using in the meantime: https://github.com/graywolf/vimwiki/commits/enforceable_typefaces . It seems to work well.

@mgarort

mgarort commented Apr 14, 2020

Hi, one question on this topic: what would be the problem with using <b> </b> to define exactly where you want the bold to happen? I am able to bold only parts of words like this.

@graywolf
Contributor Author

> what would be the problem with using <b> </b>

Mainly that the tags are still visible there and not hidden:

[screenshot: 2020-04-15-160755_222x62_scrot]

Compare

foo *bar*
foo <b>bar</b>
_CURSOR_

with the cursor on the _CURSOR_ line.

tinmarino added a commit to tinmarino/vimwiki that referenced this issue Aug 2, 2020
…iki#847, vimwiki#640)

- Less code, Easyer to maintain, to add a markup language
- Faster to load, and to highlight
- Support multiline tags: vimwiki#847
- Support nested tags
- Support intraword tags: this<b>bold</b>type vimwiki#640
@tinmarino
Member

Fixed: (1b16720, or search for "Typeface" in commit titles, or see the last commit referring to this issue)

The syntax items are now redefined as regions, which made it possible to solve this (see $VIMRUNTIME/wiki/markdown.vim from Tim Pope).
This resolved both this issue and #847 (which received the same comment).
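
As a loose analogy for why start/end regions help, here is a small Python sketch. It is not the vimwiki implementation (which lives in Vim syntax definitions); the `styled_runs` helper and the restriction to `<b>` and `<i>` tags are assumptions made for illustration. Because a region is delimited only by its start and end markers, it does not care about word boundaries or line breaks, and tracking the open markers gives nesting for free.

```python
import re

# Start and end markers for the two hypothetical regions handled here.
TAG = re.compile(r"<(/?)(b|i)>")


def styled_runs(text):
    """Split text into (content, active_styles) runs delimited by <b>/<i> tags."""
    runs, active, pos = [], [], 0
    for m in TAG.finditer(text):
        if m.start() > pos:
            # Plain content before this tag inherits the currently open regions.
            runs.append((text[pos:m.start()], tuple(active)))
        is_closing, name = m.group(1), m.group(2)
        if is_closing:
            if name in active:
                active.remove(name)  # region ends here
        else:
            active.append(name)      # region starts here
        pos = m.end()
    if pos < len(text):
        runs.append((text[pos:], tuple(active)))
    return runs
```

For example, `styled_runs("this<b>bold</b>type")` returns `[("this", ()), ("bold", ("b",)), ("type", ())]`, and the same function handles a region that spans a newline or contains a nested `<i>` region.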

deepredsky pushed a commit to deepredsky/vimwiki that referenced this issue Jan 16, 2021
…iki#847, vimwiki#640)

- Less code, Easyer to maintain, to add a markup language
- Faster to load, and to highlight
- Support multiline tags: vimwiki#847
- Support nested tags
- Support intraword tags: this<b>bold</b>type vimwiki#640
jls83 pushed a commit to jls83/vimwiki that referenced this issue Jan 17, 2023
…iki#847, vimwiki#640)

- Less code, Easyer to maintain, to add a markup language
- Faster to load, and to highlight
- Support multiline tags: vimwiki#847
- Support nested tags
- Support intraword tags: this<b>bold</b>type vimwiki#640