The citation key displays Chinese pinyin instead of Chinese character #9605
I want to mention that any dumb auto-converter solution is unreliable, since Hanzi are used in many Asian countries and territories, in many variants. For example, the Hanzi 【出来】 may be chu lai in Chinese, de ki in Japanese, xuý lãi in Vietnamese, or cheok rae in Korean. In my workflow, calibre uses a dumb converter that turns my Chinese items into ugly "ASCII equivalents" of whatever language it guesses: sometimes Chinese (Mandarin or Cantonese), sometimes Japanese... So I suggest letting the user decide which romanization to use. |
Implementation note: Use https://github.com/houbb/pinyin. I found it via https://search.maven.org/search?q=pinyin4j - and it seems to be the most maintained one. If that does not work, try https://github.com/belerweb/pinyin4j. I checked https://sourceforge.net/p/pinyin4j/news/ and think, the library needs to be configured to (in bold)
|
I would again state that it may not be a good idea to do this naively in the main program, since one cannot easily distinguish Chinese from other CJK languages without native speakers or AI, and it would definitely BREAK other users' experience, especially for Japanese users, who also use Hanzi (Kanji in Japanese) but with a totally different romanization. |
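To illustrate the point above, here is a minimal Java sketch (the class and method names are hypothetical, not JabRef code). It uses the standard `Character.UnicodeScript` API to detect Han ideographs. The check can tell that a key contains Hanzi, but it reports the same script for Chinese and Japanese text, so it cannot decide which romanization applies.

```java
// Hypothetical helper for illustration only; not part of JabRef.
final class HanScriptCheck {

    /** Returns true if the text contains at least one Han (CJK ideograph) code point. */
    static boolean containsHan(String text) {
        return text.codePoints()
                   .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    /** Returns true if every character of the text is 7-bit ASCII. */
    static boolean isAscii(String text) {
        return text.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        // U+51FA U+6765 is 出来, which is read differently in Chinese and Japanese.
        // The script check only says "Han"; it says nothing about the language.
        System.out.println(containsHan("\u51FA\u6765"));
        System.out.println(containsHan("WanZheng2016"));
        // U+4E07 U+5F81 is 万征, so this key is not pure ASCII.
        System.out.println(isAscii("\u4E07\u5F812016"));
    }
}
```

A check like this could at most flag keys that need attention; choosing the reading still needs extra information from the user or the entry.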
BibTeX allows for using the field "language" to indicate the language of the entry. Maybe, one could use that as input for the citation key generator. @clzls I assume you are on UTF8 and use the non-ASCII characters also for your citation keys? The issue is complicated with many different user "profiles". Maybe we need a preference? |
The implementation is complicated since we need to accommodate users with different languages while ensuring a smooth user experience for those accustomed to the current system. Maybe we could offer users the option to enable this function (default off). Other preconditions are also needed, such as ensuring that romanization only occurs when a valid "language" field is specified (as mentioned by koppor). Perhaps we can extend romanization support not only to Chinese but also to other languages (Korean, Japanese, etc.). Alternatively, a semi-automatic approach could suffice. We could introduce new options in the right-click menu (see figure below). We could also use “check integrity” to collect entries with non-ASCII citation keys into one group, followed by “cleanup entries” for this group (see figure below). This approach would also be convenient for the people who need it. |
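The opt-in idea above could look roughly like the following Java sketch. Everything here is an assumption for illustration: the class name, the accepted spellings of the "language" field, and the pluggable romanizer are hypothetical and not JabRef API. The key is only romanized when the preference is enabled and the entry declares a Chinese language; in every other case it is returned unchanged.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch; none of these names exist in JabRef.
final class OptInKeyRomanizer {

    // Assumed spellings of the BibTeX "language" field that select pinyin.
    private static final Set<String> CHINESE = Set.of("chinese", "zh", "zh-cn");

    private final boolean enabled;               // user preference, default off
    private final UnaryOperator<String> pinyin;  // pluggable romanizer (e.g. backed by a pinyin library)

    OptInKeyRomanizer(boolean enabled, UnaryOperator<String> pinyin) {
        this.enabled = enabled;
        this.pinyin = pinyin;
    }

    /** Romanizes the key only if the feature is on and the language field says Chinese. */
    String keyFor(String rawKey, Optional<String> languageField) {
        if (!enabled) {
            return rawKey;
        }
        boolean chinese = languageField
                .map(l -> l.trim().toLowerCase(Locale.ROOT))
                .filter(CHINESE::contains)
                .isPresent();
        return chinese ? pinyin.apply(rawKey) : rawKey;
    }

    public static void main(String[] args) {
        // Stand-in romanizer covering only 万 (U+4E07) and 征 (U+5F81).
        UnaryOperator<String> fake = s -> s.replace("\u4E07", "Wan").replace("\u5F81", "Zheng");
        OptInKeyRomanizer romanizer = new OptInKeyRomanizer(true, fake);
        System.out.println(romanizer.keyFor("\u4E07\u5F812016", Optional.of("Chinese")));
        System.out.println(romanizer.keyFor("\u4E07\u5F812016", Optional.of("japanese")));
    }
}
```

Passing the romanizer in as a function keeps the gate independent of any particular pinyin library, and leaves room for other languages to supply their own converter.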
Yes I do, and a bunch of my papers were written in Chinese, using tons of packages to tweak LaTeX compilers, to make them happy dealing with non-ASCII characters... (no one would write papers full of something like
Looks good to me. Implemented this way, it works like an opt-in extension and is extensible to any language that needs ASCII equivalents (even Europeans may need it, for Danish or Greek, I think). I would go even further and suggest that dynamically loadable custom formatters may be an even better solution, so that everyone would be happy... |
At a LaTeX conference, I learned from the LaTeX developers that it is now also possible to use Unicode with |
Does it work for BibTeX too? |
According to the LaTeX 3 team: Yes. Just ensure that you run the latest TeXLive 😅 |
I use the latest MiKTeX, BibTeX and pdflatex, the citation key like
|
@ehehela I asked LaTeX pros. It works on TeXLive. See https://chat.stackexchange.com/transcript/message/65511308#65511308 OK, it seems, some more "magic" is needed:
\documentclass{article}
\DeclareUnicodeCharacter{4EFB}{CJK Ideograph 4efb}
\DeclareUnicodeCharacter{653F}{CJK Ideograph 653f}
\begin{document}
\cite{任政2018}
\begin{thebibliography}{99}
\bibitem{任政2018} xxxx
\end{thebibliography}
\end{document} |
Actually, that resolves the error, but the cite doesn't work. It doesn't need the definitions, but it does (currently) need something safe as the first token.
Although the official position is that cite keys should use ascii characters, |
@koppor and @davidcarlisle Thank you. The source code of the first test is:
The source code of the second test is:
|
FYI: My thesis is using |
As for JabRef, I am not yet sure what would be the best option to have the correct language in the entry preview, but when it comes to rendering the entry in LaTeX, there seems to be a limitation of pdflatex that can be worked around with xelatex, special commands/syntax or other packages. Are you aware of Babel? There is also LuaLaTeX. |
@ThiloteE lualatex is the way to go :). pdflatex and xelatex should only be used if absolutely necessary :) |
Idea: Maybe, some of the Apache Lucene functionality can be used. There are these |
Here, an alternative citation key generation scheme is recommended for Chinese bibliographies: using the Chinese pinyin of the authors rather than Chinese characters, which are non-ASCII.
For example:
WanZheng2016 or WanZ2016 is preferred over the default 万征2016.
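The two key shapes in the example could be built as in the Java sketch below. It assumes the pinyin syllables are already available (here `wan` and `zheng` for 万征) and that the first syllable is the family name; the class and method names are hypothetical and the romanization step itself is out of scope.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical formatter for illustration; not JabRef code.
final class PinyinKeyFormats {

    /** Upper-cases the first letter of a syllable, e.g. "wan" becomes "Wan". */
    static String capitalize(String syllable) {
        if (syllable.isEmpty()) {
            return syllable;
        }
        return syllable.substring(0, 1).toUpperCase(Locale.ROOT) + syllable.substring(1);
    }

    /** Full form: every syllable capitalized, followed by the year. */
    static String fullKey(List<String> syllables, int year) {
        StringBuilder key = new StringBuilder();
        for (String s : syllables) {
            key.append(capitalize(s));
        }
        return key.append(year).toString();
    }

    /** Short form: first syllable in full (assumed family name), then initials, then the year. */
    static String abbreviatedKey(List<String> syllables, int year) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < syllables.size(); i++) {
            String s = syllables.get(i);
            if (s.isEmpty()) {
                continue;
            }
            key.append(i == 0 ? capitalize(s) : s.substring(0, 1).toUpperCase(Locale.ROOT));
        }
        return key.append(year).toString();
    }

    public static void main(String[] args) {
        List<String> syllables = List.of("wan", "zheng"); // pinyin of 万征, supplied by hand
        System.out.println(fullKey(syllables, 2016));
        System.out.println(abbreviatedKey(syllables, 2016));
    }
}
```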