### The `humanize` Library

The `humanize` library's documentation can be found [here](https://python-humanize.readthedocs.io/en/latest/).

This library displays times, time intervals and certain quantities in a more human-readable format. It's not widely known, but it saves you a lot of tedious code you would otherwise have to write yourself to get the same functionality.

From a display perspective, would you rather see information given this way:

```
784620802048 bytes
```

or this way?

```
784.6 GB
```

> Later we'll get into whether k, M, G, etc. are factors of `1,000` or `1,024`.

You can certainly write code to format this number in that way, but you'll have to account for handling different sizes, such as:
```
784620802048347 bytes --> 784.6 TB
784620802048 bytes --> 784.6 GB
6208020 bytes --> 6.2 MB
1245 bytes --> 1.2 kB
100 bytes --> 100 bytes
```
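
To give a sense of that tedium, here is a minimal hand-rolled sketch for decimal (factor of `1,000`) units. The function name `format_size` is hypothetical and not part of any library, and the sketch ignores edge cases such as negative values or rounding right at a unit boundary.

```python
def format_size(num_bytes: int) -> str:
    """Format a byte count using decimal (factor 1,000) units."""
    if num_bytes < 1000:
        return f"{num_bytes} bytes"

    value = float(num_bytes)
    unit = "bytes"
    # Divide by 1,000 until the value drops below 1,000 or we run out of units.
    for unit in ("kB", "MB", "GB", "TB", "PB"):
        value /= 1000
        if value < 1000:
            break
    return f"{value:.1f} {unit}"


for size in (784620802048347, 784620802048, 6208020, 1245, 100):
    print(f"{size} bytes --> {format_size(size)}")
```

Even this short version needs a special case for plain bytes and a hard-coded list of units, and it does not handle binary units at all.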

Similarly, you might want to display large numbers in a more readable, or even abbreviated form, such as:
```
1456893245 --> 1,456,893,245
1456893245 --> 1.5 billion
```

This is precisely what the `humanize` library can do for us, and a lot more than what we just discussed.

So, let's dig into that library - it's simple to use and very quick to learn.

First, the documentation for this library is OK, but it misses some things.

For example, nothing in the docs tells you the name of the package you should pip install (it is simply `humanize`). You can see this on [PyPI](https://pypi.org/project/humanize/), or by looking at the source code [here](https://github.com/python-humanize/humanize/blob/main/setup.cfg) (the `name` setting).

Take a look at that GitHub repo, because you'll likely refer to it to see what the library can really do.

In [1]:
import humanize

#### Numbers

We have quite a variety of ways to format numbers using `humanize`.

Some of those options are included in the [docs](https://python-humanize.readthedocs.io/en/latest/), but many are not, so you'll need to look at the [code](https://github.com/python-humanize/humanize/blob/main/src/humanize/number.py) itself (and don't worry, the functions are themselves well documented!).

For example, we can add thousand separators to numbers:

In [2]:
humanize.intcomma(1234567890.45678)

'1,234,567,890.45678'

Here are a couple of formatting options that are not mentioned in the docs.

One is for converting numbers to their "nth" version:

In [3]:
for i in range(11):
    print(humanize.ordinal(i))

0th
1st
2nd
3rd
4th
5th
6th
7th
8th
9th
10th


Another one is the `clamp` function, which formats a number and, when the number falls below a floor or above a ceiling, shows that bound with a `<` or `>` prefix instead:

In [4]:
humanize.clamp(80, floor=100, ceil=200)

'<100'

In [5]:
humanize.clamp(280, floor=100, ceil=200)

'>200'

Another really interesting one (also not documented) is the `metric` function, which humanizes values using the standard [SI prefixes](https://physics.nist.gov/cuu/Units/prefixes.html):

In [6]:
humanize.metric(1234, 'g')

'1.23 kg'

In [7]:
humanize.metric(0.0001, 'F')

'100 μF'

You can also deal with floats and scientific notation:

In [8]:
humanize.fractional(3.14)

'3 7/50'

In [9]:
humanize.scientific(0.5)

'5.00 x 10⁻¹'

In [10]:
from math import pi, sin
humanize.scientific(sin(pi/3), precision=10)

'8.6602540378 x 10⁻¹'

There are a few more number-related functions that are documented, so check them out using the docs link at the top of this notebook.

#### Dates, Times and Time Deltas

Humanize is really handy for dealing with datetimes and time deltas.

Let's take a look at the `naturaltime` function:

In [11]:
from datetime import datetime

In [12]:
humanize.naturaltime(datetime.now())

'now'

You can see that instead of displaying the value of `datetime.now()` (the **local** current time), it displays the string `now`. 

> Notice that this means the reference point for calculating the difference is, by default, the **local** current time.

In fact, this makes it very clear:

In [13]:
humanize.naturaltime(datetime.utcnow())

'6 hours from now'

Obviously this is not correct - `utcnow()` should be now, not offset by 6 hours! (The 6 hours is simply my local offset from UTC.)

We can tell humanize what that relative `now` is:

In [14]:
humanize.naturaltime(datetime.utcnow(), when=datetime.utcnow())

'now'

Since we should always be working with UTC times internally in our apps, you'll want to use that `when` option almost always for absolute times.
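
One caveat: `datetime.utcnow()` is deprecated as of Python 3.12. A minimal standard-library sketch of an equivalent naive UTC timestamp follows; the helper name `naive_utc_now` is hypothetical, introduced here only for illustration.

```python
from datetime import datetime, timezone


def naive_utc_now() -> datetime:
    """Return the current UTC time as a naive datetime (no tzinfo attached)."""
    # Take an aware UTC timestamp, then drop tzinfo so it compares cleanly
    # with other naive UTC datetimes.
    return datetime.now(timezone.utc).replace(tzinfo=None)


when = naive_utc_now()
print(when.tzinfo)  # None
```

The result can be used wherever `datetime.utcnow()` appears in this section, for example as the `when` argument of `naturaltime`.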

Let's look at a few more examples of how humanize deals with absolute times:

In [15]:
from datetime import timedelta

In [16]:
when = datetime.utcnow()
d1 = when + timedelta(days=1, minutes=3, seconds=23)
d2 = when - timedelta(days=90, hours=3)

In [17]:
humanize.naturaltime(d1, when=when)

'a day from now'

In [18]:
humanize.naturaltime(d2, when=when)

'2 months ago'

It can humanize time deltas in pretty much the same way, just without the tense ("from now", "ago"):

In [19]:
humanize.naturaldelta(timedelta(minutes=10, seconds=30, milliseconds=200))

'10 minutes'

Time deltas of less than a second are, by default, shown as `a moment`:

In [20]:
humanize.naturaldelta(timedelta(microseconds=200))

'a moment'

We can lower the minimum unit used:

In [21]:
humanize.naturaldelta(timedelta(microseconds=200), minimum_unit="milliseconds")

'0 milliseconds'

In [22]:
humanize.naturaldelta(timedelta(microseconds=200), minimum_unit="microseconds")

'200 microseconds'

#### File Sizes

These functions are handy for displaying file sizes in a more human-readable fashion:

In [23]:
humanize.naturalsize(1024)

'1.0 kB'

We can specify the precision of the formatted number:

In [24]:
humanize.naturalsize(1024, format="%.3f")

'1.024 kB'

One important distinction we have to make is whether a kilo is 1000 (decimal system) or 1024 (binary system).

Typically memory in a computer is measured in a binary system, so `256MB` of memory means `256 * 1024 * 1024 = 268,435,456 bytes`.

On the other hand, hard drive manufacturers use the decimal system, so `256MB` is equivalent to `256 * 1000 * 1000 = 256,000,000 bytes`.

This can be quite confusing, and the IEC (International Electrotechnical Commission) finally set up some standards regarding this:

The prefixes **k**(ilo), **M**(ega), **G**(iga), **T**(era), etc. are **decimal** based, so factors of `1,000`.

For a binary system, the symbols are as follows:

**Ki**, **Mi**, **Gi**, **Ti**, etc.
with corresponding names:
kibi, mebi, gibi, tebi, etc.

and are therefore factors of `1,024`.

For more info you can check out this [wikipedia article](https://en.wikipedia.org/wiki/Kilobyte) on the subject.

This way, no more ambiguity... as long as folks actually use it :-) 

`humanize` is able to humanize bytes into either decimal or binary:

In [25]:
value = 1_000_000_000

In [26]:
humanize.naturalsize(value)

'1.0 GB'

In [27]:
humanize.naturalsize(value, binary=True)

'953.7 MiB'
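
Both results are consistent with simple arithmetic, which can be checked with a few lines of plain Python (no `humanize` needed):

```python
value = 1_000_000_000

gb = value / 1000**3    # decimal gigabytes
mib = value / 1024**2   # binary mebibytes
gib = value / 1024**3   # binary gibibytes

print(f"{gb:.1f} GB")    # 1.0 GB
print(f"{mib:.1f} MiB")  # 953.7 MiB
print(f"{gib:.3f} GiB")  # 0.931 GiB: below 1, so MiB is the natural unit
```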

#### Localization

Lastly, `humanize` also supports localization, as part of its internationalization support (usually abbreviated as **i18n**).

You'll have noticed that all the humanizations so far were in (US) English. As far as I can tell from the library's `i18n` module, that is the default output, and you get another language by explicitly activating a locale.

`humanize` provides a bunch of locale definitions (and you can always add one yourself if you need to). The library does not document what those locales are, but you can find them [here](https://github.com/python-humanize/humanize/tree/main/src/humanize/locale).

For example, we could set our locale to `fr_FR`:

In [28]:
humanize.i18n.activate('fr_FR')

<gettext.GNUTranslations at 0x107d2f550>

And let's look at some of our previous examples:

In [29]:
for i in range(11):
    print(humanize.ordinal(i))

0e
1er
2e
3e
4e
5e
6e
7e
8e
9e
10e


In [30]:
when = datetime.utcnow()
d1 = when + timedelta(days=1, minutes=3, seconds=23)
d2 = when - timedelta(days=90, hours=3)

# d1 and d2 are UTC-based, so pass the same UTC reference point explicitly
print(humanize.naturaltime(d1, when=when))
print(humanize.naturaltime(d2, when=when))

dans un jour
il y a 2 mois


Let's try another locale:

In [31]:
humanize.i18n.activate('bn_BD')

for i in range(11):
    print(humanize.ordinal(i))
    
when = datetime.utcnow()
d1 = when + timedelta(days=1, minutes=3, seconds=23)
d2 = when - timedelta(days=90, hours=3)

# d1 and d2 are UTC-based, so pass the same UTC reference point explicitly
print(humanize.naturaltime(d1, when=when))
print(humanize.naturaltime(d2, when=when))

0th
1st
2nd
3rd
4th
5th
6th
7th
8th
9th
10th
আজ থেকে এক দিন পরে
2 মাস আগে


As you can see, not all locales support everything, so your mileage may vary. And if you're up to it, you can always contribute to that library with your own locale definition. I'm sure the author of the library would appreciate the help!