Problem with Japanese abbreviations. #2

uhho · 2015-04-07T03:10:23Z

(moved from foretagsplatsen/Numeral-js/issues/9)

I've already reported this issue here adamwdraper/Numeral-js/issues/248, but as the original branch is not maintained anymore, I think it would be better to focus on your branch.

Currently, Numeral.js groups numbers into four groups: thousands, millions, billions and trillions.

But in the case of Japanese (and I suppose it also applies to Chinese), numbers are grouped in a slightly different way: every hundred, thousand and ten thousand.

100 -> 1百
1,000 -> 1千
10,000 -> 1万
100,000 -> 10万
1,000,000 -> 1百万
10,000,000 -> 1千万
100,000,000 -> 1億

Before I start fixing this issue, I'd like to discuss how to approach that problem.
Ideas:

add more abbreviations in library core
let languages define their own formatting/unformatting functions

Sample formatting function here: http://jsfiddle.net/tuknLbz8/1/

Anyway, I think we need a more flexible architecture if we want to support even more complicated numeral systems like the Indian numbering system.
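
As a rough illustration of the second idea (a language-defined formatting function), here is a minimal sketch that reproduces the mapping listed above. The name `abbreviateJa` and the unit tables are assumptions made for this example; they are not part of Numeral.js or numbro, and this is not the code from the linked jsfiddle.

```javascript
// Hypothetical helper, not an existing Numeral.js/numbro API.
// Large units step by 10^4; 百 and 千 abbreviate the coefficient in front of them.
const BIG_UNITS = [
  { value: 1e12, symbol: "兆" },
  { value: 1e8, symbol: "億" },
  { value: 1e4, symbol: "万" },
];
const SMALL_UNITS = [
  { value: 1e3, symbol: "千" },
  { value: 1e2, symbol: "百" },
];

// Keep at most one decimal place; String() drops a trailing ".0".
function trim(n) {
  return String(Math.round(n * 10) / 10);
}

function abbreviateJa(value) {
  // Largest 10^4-based unit that fits, if any.
  const big = BIG_UNITS.find((u) => value >= u.value);
  const coefficient = big ? value / big.value : value;
  // Abbreviate the coefficient itself with 千 or 百 when it is large enough.
  const small = SMALL_UNITS.find((u) => coefficient >= u.value);
  const head = small
    ? trim(coefficient / small.value) + small.symbol
    : trim(coefficient);
  return head + (big ? big.symbol : "");
}

console.log(abbreviateJa(1000000)); // "1百万"
```

The sketch only produces a single abbreviation per number and does not handle negative values, rounding at unit boundaries, or unformatting.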

NadyaNayme · 2015-04-13T17:29:42Z

Please correct me if I'm wrong, seeing as you're the one living in Japan and my Japanese is rusty from lack-of-use.

Shouldn't "1百" simply be "百"? In similar fashion, wouldn't 10万 be 十万? While I'm sure both are understandable, I'm thinking of "正しい文法" ("correct grammar").

Per the "languages" section of the website I think letting languages define their own formatting is the better idea.

uhho · 2015-04-14T02:46:17Z

@kyokou Thank you very much for the comment!
Yes - both are correct and commonly used here, depending on the context.

To sum up, there are three systems:

Arabic numerals => 10,000 100,000
Arabic numerals + kanji (ideographs) => 1万 10万
Kanji only => 一万 十万

As you mentioned, one thousand is an exception - you can omit the number in front of the kanji (千 = 1千). It's similar to "one hundred" and "a hundred" in English.

In particular, the second system is commonly used for counting large amounts of money.
So "two hundred twenty million yen" becomes "2億2千万円".

But reading your comment and thinking about it more deeply made me think that if a language has more than one numeral system, the user should be able to switch between them.

For example:

// Standard usage
numbro(100000).format('$0,0'); // ¥100,000
numbro(100000).format('0a$'); // 10万円 <- currency symbol has to be different!

// Selecting numeral system
numbro(100000).format('0a$', 1); // 十万円

In the above case, the second argument is the index of the selected numeral system (defined in the language configuration file), but I'm not sure if it's a good idea.

Any ideas?

NadyaNayme · 2015-04-14T19:32:04Z

Thanks for the info! I've never actually seen the second system used. It's still logical and easy to follow though, so I'm not too surprised it exists.

As for the different formatting for individual languages - I'm not sure the best way to go about that. I think it is relatively uncommon for languages to have multiple ways of writing. For completeness it would be best to add them, but it should be planned and thought out in a maintainable fashion.

Your solution seems like it would work fine - so long as it is standardized.

For example:

[0] - Native (一、二、三、四)
[1] - Arabic Numerals (1, 2, 3, 4) 
[2] - Hybrid/Other, if applicable (1百、1千)

BenjaminVanRyseghem · 2015-04-14T20:28:07Z

I am not a big fan of magic index referring to hidden things 😸
So if we go in this direction (I am still not convinced as I am still not sure I understand everything), maybe introducing special function per language would be better

I do not know Japanese at all, so I will try to summarize what I understand, please correct me if I am wrong.

there is a native way to write currency in Japanese, where everything is in kanji.
To me, it sounds like the English equivalent would be something like two k€ (I use euro as it's postfixed)
there is a half-Arabic/half-Japanese way, whose English equivalent would be: 2 k€

Am I right? 😄

NadyaNayme · 2015-04-14T21:53:10Z

Your example is correct. I'm assuming the "k" is shorthand for "thousand"?

Native: two thousand € // equal to 二千円
Arabic: 2,000 € // equal to ¥2,000
Hybrid: 2 thousand € // equal to 2千円

"Native" probably isn't the best way to describe it, since all of the above are natively used... was just the most accurate word I could think of.

Magic index referring to hidden things could be standardized and documented to not make it so hidden or magical. My only issue with magic index referring to hidden things is when it isn't documented; but I can see why you would be against it.

We could make the index less magical by passing a string as the second param instead of an index value.

numeral(100000).format('$0,0', "arabic"); // ¥100,000
numeral(100000).format('0a$', "hybrid"); // 10万円 <- currency symbol has to be different!
numeral(100000).format('0a$', "native"); // 十万円

BenjaminVanRyseghem · 2015-04-14T22:01:38Z

Thanks for the clarifications 😄 (indeed, the k stood for kilo)

Could you please explain why the currency symbol has to be different?

If for all other languages we use something like 2 k€, why not use a similar approach here and use 2万円?

Is ¥2k completely weird?
(sorry for all the questions, but I try to have a clear understanding of the situation 😸 )

NadyaNayme · 2015-04-14T23:53:45Z

The rules are a little complex. Without trying to get too specific into nuance and semantics, this is the easiest way I can sum it up:

円 can be used anywhere, including price tags
¥ can only be used for prices/value
If the number uses kanji, 円 must be used over ¥.

¥100 // OK
100円 // Also OK
百円 // OK
¥百 // NOT OK (or at least, I've never ever ever seen this)
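
The third rule could be sketched as a small helper. This is only an illustration: `withYen` and the kanji character class are assumptions made for the example, not part of numbro or Numeral.js, and the choice of ¥ for non-kanji numbers is arbitrary since both forms are acceptable there.

```javascript
// Hypothetical helper, not an existing numbro/Numeral.js API.
// Kanji that can appear in a formatted number (digits and unit characters).
const KANJI_NUMERAL = /[〇一二三四五六七八九十百千万億兆]/;

function withYen(numberText) {
  // Rule: if the number uses kanji, 円 must be used instead of ¥.
  if (KANJI_NUMERAL.test(numberText)) {
    return numberText + "円";
  }
  // Otherwise both ¥100 and 100円 are fine; this sketch picks the prefixed ¥.
  return "¥" + numberText;
}

console.log(withYen("100")); // "¥100"
console.log(withYen("百")); // "百円"
```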

When to use 円 depends on context. When writing vertically, kanji + 円 is preferred over the other forms. Same with formal/old documents/literature. There are more scenarios as well and not all of them are "hard rules, must do this or it's wrong".

It's kind of hard to explain since I'm not 100% familiar with all scenarios/contexts or without a mini-Japanese lesson in formality and counters.

@lukaszkrawczyk
If I'm wrong about ¥百, let me know. I'd love to learn more! I've not once seen this (although after you mentioned it, I have seen the "hybrid" style for populations of cities and a few other large numbers)

BenjaminVanRyseghem · 2015-04-15T07:12:56Z

Thanks for the explanations once again 😄

My proposal for currencies (which is in fact not the point of this issue 😸 maybe it should be moved to a separate issue):

we keep the currency symbol: $, €, or ¥. We use this symbol when there is no average, so the abbreviations are not used (and we are sure not to mix the symbol with kanji)
we introduce a translatedSymbol (or we can find a better name): 円. We use this symbol when average is used, so we are sure to have a consistent translated text. Of course, if this symbol is not defined, we fall back to currencySymbol

What do you guys think about that?
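
A minimal sketch of the fallback described in this proposal. The `languages` object, the `pickCurrencySymbol` function and the `isAveraged` flag are hypothetical names used for illustration; `translatedSymbol` is the name suggested above and does not exist in numbro.

```javascript
// Hypothetical language definitions; only the two symbol fields matter here.
const languages = {
  "ja-JP": { currencySymbol: "¥", translatedSymbol: "円" },
  "en-US": { currencySymbol: "$" }, // no translatedSymbol defined
};

// Hypothetical helper: choose the symbol depending on whether the
// number is being averaged (abbreviated) or not.
function pickCurrencySymbol(language, isAveraged) {
  if (isAveraged && language.translatedSymbol !== undefined) {
    return language.translatedSymbol;
  }
  // No average, or no translated symbol: fall back to currencySymbol.
  return language.currencySymbol;
}

console.log(pickCurrencySymbol(languages["ja-JP"], true)); // "円"
console.log(pickCurrencySymbol(languages["ja-JP"], false)); // "¥"
console.log(pickCurrencySymbol(languages["en-US"], true)); // "$"
```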

uhho · 2015-04-16T03:10:21Z

@kyokou Yes, everything is correct. I couldn't explain it better ;)

@BenjaminVanRyseghem , are you OK with @kyokou 's solution?
If yes, I will modify the code, write some tests, add examples and send a PR next week.

On the other hand, I've been thinking about publishing a separate library to deal with this problem.
A kind of plug-in for the numbro library, where the user could define a custom formatting function and so on.
I'll think about that a little more.

Regarding issues with currency, let's move your proposal to different issue.
Because I've got several ideas how we could improve currency formatting as well. 😈

BenjaminVanRyseghem · 2015-04-16T10:14:48Z

@lukaszkrawczyk I am not sure we agreed on a solution yet 😄 and to be honest, I am still not super convinced about the introduction of a new argument.

But we can continue to discuss it 😸

edit: after reading again the thread, I am not sure we are talking about the same things 😉

BenjaminVanRyseghem · 2015-04-16T10:16:45Z

@lukaszkrawczyk I would rather push things directly into numbro instead of having another layer of external dependencies, don't you think?

NadyaNayme · 2015-04-16T16:21:13Z

@BenjaminVanRyseghem
I think he's talking about my "pass a string as second argument" instead of "have a magical index as second argument" solution.

EG:

numeral(100000).format('$0,0', "arabic"); // ¥100,000
numeral(100000).format('$0,0');           // also ¥100,000
numeral(100000).format('0a$', "hybrid");  // 10万円 <- currency symbol has to be different!
numeral(100000).format('0a$', "native");  // 十万円

Without solving the currency issue and leaving just the number:

numeral(100000).format('0,0', "arabic"); // 100,000
numeral(100000).format('0,0');           // also 100,000
numeral(100000).format('0a', "hybrid");  // 10万
numeral(100000).format('0a', "native");  // 十万

Because this may have to change on-the-fly and possibly per-call, I would prefer a second argument for Japanese over a setting in the configuration file.
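
To make the idea concrete, here is a rough standalone sketch of selecting a numeral system by name. Everything in it (`formatJa`, `toKanji`, `toHybrid`, the `SYSTEMS` map) is hypothetical and written only for this example; it does not touch the real `format` method and it ignores the format string and currency entirely.

```javascript
// Hypothetical sketch: none of these names exist in numbro or Numeral.js.
const DIGITS = ["", "一", "二", "三", "四", "五", "六", "七", "八", "九"];
const MYRIADS = ["", "万", "億", "兆"]; // 10^0, 10^4, 10^8, 10^12

// Converts 1..9999 to kanji, omitting 一 before 十, 百 and 千.
function groupToKanji(n) {
  const places = [[1000, "千"], [100, "百"], [10, "十"]];
  let out = "";
  for (const [value, symbol] of places) {
    const digit = Math.floor(n / value) % 10;
    if (digit > 0) out += (digit > 1 ? DIGITS[digit] : "") + symbol;
  }
  return out + DIGITS[n % 10];
}

// Kanji-only form for non-negative safe integers.
function toKanji(n) {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError("toKanji expects a non-negative safe integer");
  }
  if (n === 0) return "〇";
  let out = "";
  // Walk the number in groups of four digits, lowest group first.
  for (let i = 0; n > 0; i++, n = Math.floor(n / 10000)) {
    const group = n % 10000;
    if (group > 0) out = groupToKanji(group) + MYRIADS[i] + out;
  }
  return out;
}

// Arabic coefficient followed by the largest 10^4-based unit that fits.
function toHybrid(n) {
  for (let i = MYRIADS.length - 1; i > 0; i--) {
    const unit = 10000 ** i;
    if (n >= unit) return String(n / unit) + MYRIADS[i];
  }
  return String(n);
}

const SYSTEMS = new Map([
  ["arabic", (n) => n.toLocaleString("en-US")],
  ["hybrid", toHybrid],
  ["native", toKanji],
]);

function formatJa(n, system = "arabic") {
  const formatter = SYSTEMS.get(system);
  if (!formatter) throw new Error("Unknown numeral system: " + system);
  return formatter(n);
}

console.log(formatJa(100000)); // "100,000"
console.log(formatJa(100000, "hybrid")); // "10万"
console.log(formatJa(100000, "native")); // "十万"
```

Using names instead of numeric indexes keeps the call sites readable, and an unknown name fails loudly instead of silently picking another system.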

BenjaminVanRyseghem · 2015-04-16T16:48:17Z

sounds like a good idea for all the languages with a different alphabet 😄

@lukaszkrawczyk if you want to give this a try, I will be very pleased to read your code 😉

uhho · 2015-04-17T00:57:52Z

@kyokou Correct. Sorry for being unclear.
@BenjaminVanRyseghem OK, let's do it!

BenjaminVanRyseghem · 2015-07-02T15:39:39Z

@lukaszkrawczyk any progress here? 😄

ArmorDarks · 2017-03-29T20:19:15Z

As another solution, GNU uses @ to denote variations in locales, like en-US@euro to display English (USA) with Euro as a currency, so maybe we can use same principle in formatCurrency?

numbro(100).formatCurrency('0,0.00 $')
numbro(100).formatCurrency('0,0.00 $@arabic')
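
For illustration only, splitting such a string into its format and variant parts could look like the sketch below. `splitFormatVariant` is a hypothetical helper; numbro does not support this `@` syntax.

```javascript
// Hypothetical parser for the "format@variant" idea; not a numbro API.
function splitFormatVariant(formatString) {
  const at = formatString.lastIndexOf("@");
  if (at === -1) {
    // No modifier: the whole string is the format.
    return { format: formatString, variant: null };
  }
  return {
    format: formatString.slice(0, at),
    variant: formatString.slice(at + 1),
  };
}

console.log(splitFormatVariant("0,0.00 $@arabic"));
// { format: '0,0.00 $', variant: 'arabic' }
```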

BenjaminVanRyseghem · 2017-03-30T09:22:47Z

@ArmorDarks I would like to keep locales out of the format string

ghost · 2019-03-30T07:11:41Z

@BenjaminVanRyseghem I would like to help with this issue. But I don't think the OP shows how large numbers should be displayed.

Let's see what it should look like:

Powers of 10	Number	English	Japanese
1	10	10	10
2	100	100	100
3	1,000	1k	1,000
4	10,000	10k	1万
5	100,000	100k	10万
6	1,000,000	1m	100万
7	10,000,000	10m	1,000万
8	100,000,000	100m	1億
9	1,000,000,000	1b	10億
10	10,000,000,000	10b	100億
12	1,000,000,000,000	1tr	1兆

This is more reasonable, and I think it is easier to implement. Since the only difference between these two systems is how we separate large numbers (that is, >= 1,000), the problem of "百" or "1百" doesn't exist. The Japanese numeral system (which is the same as the Chinese one) is based on 10^4, while Western systems are based on 10^3.

I have found a web page that might help: https://www.trussel.com/jnumbers.htm

When the OP mentions that we should write "100" as "百", it is just like saying we should write "1 hundred" in English, which doesn't make sense, because that is how we read the number, not how we write it.
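
A minimal sketch of the 10^4-based abbreviation from the Japanese column of the table. `formatMyriad` and `MYRIAD_UNITS` are hypothetical names for this example and are not part of numbro; decimals and negative numbers are not given any special treatment.

```javascript
// Hypothetical helper, not an existing numbro API.
// Units step by 10^4 instead of the Western 10^3.
const MYRIAD_UNITS = [
  { value: 1e12, symbol: "兆" },
  { value: 1e8, symbol: "億" },
  { value: 1e4, symbol: "万" },
];

function formatMyriad(n) {
  // Largest unit that fits; numbers below 10,000 get no unit at all.
  const unit = MYRIAD_UNITS.find((u) => n >= u.value);
  const coefficient = unit ? n / unit.value : n;
  // The coefficient keeps ordinary thousands separators, e.g. "1,000万".
  return coefficient.toLocaleString("en-US") + (unit ? unit.symbol : "");
}

console.log(formatMyriad(1000)); // "1,000"
console.log(formatMyriad(10000000)); // "1,000万"
console.log(formatMyriad(100000000)); // "1億"
```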

uhho mentioned this issue Apr 16, 2015

Problem with currency symbols #14

Open

gwynjudd mentioned this issue Mar 29, 2017

en-IN localization and customizable grouping #266

Open

ArmorDarks mentioned this issue Mar 29, 2017

Fixed ru currency symbol #251

Open

ArmorDarks mentioned this issue Jan 30, 2018

Feature discussion: Set thousand and decimal separator to an instance #332

Open
