This repository has been archived by the owner on Sep 8, 2020. It is now read-only.

MB, MiB and M; binary prefix standards and IEC 80000-13:2008 vs JEDEC 100B.01. #201

Closed
EliteTK opened this issue Jun 2, 2015 · 15 comments

Comments

@EliteTK
Contributor

EliteTK commented Jun 2, 2015

Ok, here goes.

I would like to kindly ask if it would be possible to standardise the usage of binary prefixes for memory measurement in htop to use the IEC 80000-13:2008 standard and its accompanying set of binary prefixes.

It has always struck me as annoying that whenever I find software which attempts to measure memory and I see the dreaded "MB", I am left wondering whether the software really means MB in the SI prefix sense of a million bytes, or whether it means, in the far more colloquial sense, 1024^2 bytes. This is especially annoying when no information is given about which convention the measurement follows, and no option is given to also display the value in plain bytes.

This is possibly why there are now two standards that try to settle this. IEC 80000-13:2008 outlines the following prefixes:

Short  Prefix  Bytes
Ki     Kibi    1024
Mi     Mebi    1024^2
Gi     Gibi    1024^3
Ti     Tibi    1024^4
Pi     Pebi    1024^5
Ei     Exbi    1024^6
Zi     Zebi    1024^7
Yi     Yobi    1024^8
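
As a rough illustration of how the IEC prefixes in the table could be applied when displaying a byte count, here is a minimal Python sketch. The names `IEC_PREFIXES` and `format_iec` are hypothetical and chosen for this example only; this is not htop's code.

```python
# Hypothetical helper, not htop code: format a byte count with IEC binary prefixes.
IEC_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]


def format_iec(num_bytes: int) -> str:
    """Scale num_bytes by powers of 1024 and attach the matching IEC prefix."""
    value = float(num_bytes)
    index = 0
    # Divide by 1024 until the value drops below 1024 or the prefixes run out.
    while value >= 1024 and index < len(IEC_PREFIXES) - 1:
        value /= 1024
        index += 1
    if index == 0:
        # Below 1 KiB there is no prefix, so print the exact byte count.
        return f"{num_bytes} B"
    return f"{value:.1f} {IEC_PREFIXES[index]}B"


print(format_iec(1536))       # 1.5 KiB
print(format_iec(1024 ** 2))  # 1.0 MiB
```

The one-decimal rounding is an arbitrary choice for the sketch; a real tool would pick its precision to suit the available column width.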

And JEDEC 100B.01 introduced the following definitions:

Short  Prefix  Bytes
K      Kilo    1024
M      Mega    1024^2
G      Giga    1024^3

Somewhere along the line, the units K, M and G also started to be used alone, without an accompanying B. I think I once read that these were standardised by GNU at some point, but I could not find any information on that now.

In any case, it appears that htop currently uses both the JEDEC standard and, in some cases, the shorthand M to mean 1048576 bytes.

There are reasons why staying with the JEDEC standard might be helpful: it might be less confusing for people not familiar with the other standard, and it appears quite popular.

However, I feel that there are great shortcomings of the JEDEC standard.

Firstly, it is easily confused with SI prefixes, which are a long-standing standard for prefixing SI units, units which are commonly referred to as "metric". It seems strange that for some unknown reason, when we start talking about memory, we should suddenly switch to a completely different definition of a term. If you actually want to say 1000 bytes, you can no longer say kilo because someone may confuse it for 1024 bytes. (This can't be confused on paper, since the metric prefix shorthand for kilo is k whereas the JEDEC shorthand is K, but the point stands for both mega and giga.) Even NIST frowns upon the misuse of SI prefixes for anything but metric applications.
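
To put numbers on the ambiguity, the following sketch compares the decimal (SI) and binary readings of kilo, mega and giga. It is illustrative arithmetic only; the function name `binary_excess_percent` is made up for this example.

```python
# Illustrative arithmetic only: how far the binary reading of each prefix
# drifts from the SI (decimal) reading.
PREFIXES = ["kilo", "mega", "giga"]


def binary_excess_percent(power: int) -> float:
    """Percentage by which 1024**power exceeds 1000**power."""
    return (1024 ** power / 1000 ** power - 1) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}: {1000 ** power} vs {1024 ** power} bytes "
          f"(+{binary_excess_percent(power):.2f}%)")
```

The gap grows with each step, from about 2.4% at kilo to about 7.4% at giga, so the ambiguity matters more as the quantities get larger.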

Secondly, it has the shortcoming that it defines only three multiples. In the modern day and age, we more and more often have to deal with amounts of data in multiples of 1 000 000 000 bytes. This might not be the case for computer memory yet (which is all that is currently relevant to htop), but maybe one day htop will expand to measuring disk space, or we actually will start using multiples of 1 000 000 000 bytes of data, and we will need the extra definitions.

Finally, I feel that as htop is mainly centred around power users who spend a lot of time in a terminal, those users are already quite likely to be surrounded by applications which use Mega to mean 1 000 000 and Mebi to mean 1024². This is why I feel there is no serious reason to be different and I think it is time to switch to the IEC standard.

If however, switching over the few areas of code which do use MB to MiB is not feasible for whatever reason, it would at least be helpful to put a message, somewhere, informing any curious users about which exact standard the application follows, so that in the case that the user does need to know, they do not need to delve into source code to figure it out.

Edit: Whoops, that was meant to be KB, MB and GB not K, M and G.
Edit2: Wait, never mind, it was right in the first place.

@nfnty

nfnty commented Jun 3, 2015

+1

@hishamhm
Owner

hishamhm commented Jun 8, 2015

Thanks for pointing out that units are inconsistent.

For the pragmatic reason that htop is a terminal app and screen space (i.e., character cells) is a scarce resource, I'd prefer to standardize on the single-character JEDEC 100B.01. If you send me a patch removing the instances of KB, MB, etc., and adding a note about units to the manpage, I'll be very happy to apply it. Thank you!

@Earnestly

Two characters is not going to make much difference and is certainly not worth
dropping just for that reason alone. This would be inconsistent with lots of
other output htop adds which actually wastes space, notably all the space
between columns and a bottom bar which isn't necessary if you're familiar with
the keys already.

We don't really use 12x80 anymore and even VT100 (1978) terminals supported 132
columns.

It makes much more sense to use nomenclature that actually has a real standard.
(The JEDEC usage is no longer relevant: the IEEE/ASTM SI 10-1997 standard says
that "this practice frequently leads to confusion and is deprecated".)

@hishamhm
Owner

hishamhm commented Jun 8, 2015

I'm not going to change, for example, VIRT and RES columns from "M" to "MiB". If a change towards consistency is to be done, it will be from MB → M (ie, shortening the long units, and not the other way around).

@Earnestly

Sorry, but what consistency are you referring to? I thought this was about
using the correct suffix, not consistency?

Okay, so assuming you go with M (and assuming M on its own has some tacit
meaning), which is the SI prefix for 'mega'. I assume all memory measurement
must be measured in 1000 and not 1024?

@hishamhm
Owner

hishamhm commented Jun 8, 2015

> Sorry, but what consistency are you referring to?

First, internal consistency, as the UI currently uses "M" in some places and "MB" in others. Then, consistency with the other GNU userland tools, which use K, M and G based on powers of 2 (coreutils in particular).

Paraphrasing the original post, htop is indeed centred around users who spend a lot of time in a terminal, and those users are surrounded by applications which use M to mean 1024² (see standard commands ls -h, du -h, free -h...). This is one of the reasons why I feel there is no sufficient motivation to be different.

> Okay, so assuming you go with M, which is the SI prefix for 'mega'. I assume all memory measurement must be measured in 1000 and not 1024?

No, I will not change the meaning of "100M" displayed on the UI from one release to the next.

@Earnestly

What does 100M mean? 100 or 102?

Also, I'm sorry, but coreutils (GNU) is also wrong. Yes, I almost exclusively
use the terminal.

If you wish to side with GNU for your standard definition then it just has to
be documented as such.

@hishamhm
Owner

hishamhm commented Jun 8, 2015

> What does 100M mean? 100 or 102?

100 * 1024 * 1024
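
For readers who want to check the arithmetic, here is a minimal sketch of reading a value such as "100M" under the power-of-two convention described in this thread. The names `BINARY_SUFFIXES` and `suffixed_to_bytes` are hypothetical, and the sketch handles only whole numbers followed by K, M or G.

```python
# Hypothetical sketch: interpret a value such as "100M" with K, M and G
# treated as powers of 1024. Not taken from htop.
BINARY_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def suffixed_to_bytes(text: str) -> int:
    """Convert e.g. '100M' to bytes, treating the suffix as a power of 1024."""
    number, suffix = text[:-1], text[-1]
    return int(number) * BINARY_SUFFIXES[suffix]


print(suffixed_to_bytes("100M"))  # 104857600
```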

> If you wish to side with GNU for your standard definition then it just has to
> be documented as such.

Agreed!

@EliteTK
Contributor Author

EliteTK commented Jun 8, 2015

Alright, I'll try to put together a patch before the end of the week.

@hishamhm
Owner

hishamhm commented Jun 8, 2015

Awesome, thank you!!

@Earnestly

For what it's worth here is the units(7) manual.

@eworm-de
Contributor

eworm-de commented Jun 9, 2015

Ah, interesting discussion. My pull request #143 is related; I did not get any feedback, though.

@EliteTK
Contributor Author

EliteTK commented Jun 10, 2015

I made a pull request #205.

@eworm-de
Contributor

I have reworked my code, but did not send a pull request so far:
EliteTK/htop@issue-201...eworm-de:dynamic-unit

Waiting for a decision on this issue first.

@hishamhm
Owner

#205 was merged, so I'm closing this.
