
Collators missing for comparing sizes in power-of-two units (KiB, MiB, GiB, etc.) #1663

@dardhal

Description


lnav version
Code compiled from current upstream

Describe the bug
This is related to #1647, so the two issues may be best addressed together.

I started using the SQL collators (specifically "measure_with_units") to compare and sort amounts of data (file sizes) in human-readable form. The docs describe it as follows:
https://docs.lnav.org/en/latest/sqlext.html#collators

Collators
...
    measure_with_units - Compare numbers with unit suffixes. The currently supported suffixes are:
    * Sizes with an E/P/T/G/M/K prefix.
    * Seconds with an f/p/n/u/m prefix.
    * Durations of the form HH:MM:SS or HH:MM:SS

The naming is a source of some confusion and controversy, but as explained in #1647, it seems that the correct prefixes for power-of-two amounts carry an "i" (1024 bytes = 1 KiB, not 1 KB, and so on).
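To make the difference concrete, here is a small sketch of the arithmetic. It assumes the strict reading in which a plain prefix (KB, MB, ...) is a power of 1000 and an "i" prefix (KiB, MiB, ...) is a power of 1024; many tools use the plain prefixes for powers of 1024 as well, and this is not a statement about how lnav interprets them. The names `SI`, `IEC` and `relative_gap` are made up for illustration.

```python
# Assumption: plain prefixes are powers of 1000, "i" prefixes are powers of 1024.
SI = {"KB": 1000**1, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
IEC = {"KiB": 1024**1, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def relative_gap(si_unit, iec_unit):
    """Return how much larger the IEC unit is than the SI unit, as a fraction."""
    return IEC[iec_unit] / SI[si_unit] - 1


# Print each pair side by side; the gap widens with every prefix step.
for si_unit, iec_unit in zip(SI, IEC):
    gap = relative_gap(si_unit, iec_unit)
    print(f"1 {iec_unit} = {IEC[iec_unit]} bytes ({gap:.1%} more than 1 {si_unit})")
```

Under that reading the gap grows from about 2.4% at KiB/KB to about 10% at TiB/TB, which is why the suffix matters for large files.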

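The "measure_with_units" collator works well when the data uses B (for bytes), KB, MB, GB, and so on. It does not work when the data uses KiB, MiB, GiB, and so on. In the example below the rows should be sorted from largest to smallest, but a KiB row ends up between GiB rows: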

;SELECT log_time,path,size,tier FROM file_location ORDER BY size COLLATE measure_with_units DESC   
2026-04-10 07:05:15.000000 /data/container.63.cdsf              1023.13 GiB Active
2026-04-10 07:05:15.000000 /data/container.18.cdsf              1023.13 GiB Active
2026-04-10 07:05:15.000000 /data/hdkhajdhjkadhkjas52544D0.trace 1023.06 KiB Active
2026-04-10 07:05:15.000000 /data/container.76.cdsf              1023.00 GiB Active
2026-04-10 07:05:15.000000 /data/dlksdasdasldkasldaksldkasldaks   1023.00 B Active
2026-04-10 07:05:15.000000 /data/dasjdklajsdklakljsdkljajkldkla   1023.00 B Active
2026-04-10 07:05:15.000000 /data/lñdaskdaksdkalñkdlñaklñdklñas0   1023.00 B Active
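For reference, the ordering I would expect can be sketched with Python's built-in `sqlite3` module and a custom collation. This is only an illustration of the requested behaviour, not lnav's implementation: the collation name `size_with_iec_units`, the helper names, and the choice to treat plain prefixes as powers of 1000 and "i" prefixes as powers of 1024 are all assumptions. The sample rows reuse sizes from the output above.

```python
import re
import sqlite3

# Assumption: "KiB"-style suffixes are powers of 1024, "KB"-style powers of 1000.
_PREFIXES = "KMGTPE"
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(?:([KMGTPE])(i)?)?B\s*$")


def size_in_bytes(text):
    """Convert a string such as '1023.13 GiB' or '37.49 TB' to a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, prefix, binary = match.groups()
    base = 1024 if binary else 1000
    exponent = _PREFIXES.index(prefix) + 1 if prefix else 0
    return float(number) * base**exponent


def collate_sizes(a, b):
    """Collation callback: negative, zero or positive, like strcmp."""
    left, right = size_in_bytes(a), size_in_bytes(b)
    return (left > right) - (left < right)


conn = sqlite3.connect(":memory:")
# Hypothetical collation name; lnav does not ship a collation called this.
conn.create_collation("size_with_iec_units", collate_sizes)
conn.execute("CREATE TABLE file_location (path TEXT, size TEXT)")
conn.executemany(
    "INSERT INTO file_location VALUES (?, ?)",
    [
        ("/data/a", "1023.13 GiB"),
        ("/data/b", "1023.06 KiB"),
        ("/data/c", "1023.00 GiB"),
        ("/data/d", "1023.00 B"),
    ],
)
rows = conn.execute(
    "SELECT size FROM file_location "
    "ORDER BY size COLLATE size_with_iec_units DESC"
).fetchall()
sizes = [row[0] for row in rows]
print(sizes)
```

With a collation like this, both GiB rows sort ahead of the KiB row, and the plain-byte row comes last.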

However, sorting works when the units are written without the "i":

;SELECT log_time,path,size,tier FROM file_location ORDER BY size COLLATE measure_with_units DESC
         log_time                                                                    path                                                              size     tier
2026-04-10 07:05:15.000000 /data/1DDASDASDASDASD/container.1.cdsf                 37.49 TB Active
2026-04-10 07:05:15.000000 /data/1DCDASDASDADA48/container.1.cdsf                 37.47 TB Active
2026-04-10 07:05:15.000000 /data/ASD21D9DASDB9DC/container.1.cdsf                 37.43 TB Active
2026-04-10 07:05:15.000000 /data/1DC1DASDADF5132/container.1.cdsf                 37.40 TB Active
2026-04-10 07:05:15.000000 /data/1DASDA7FDASDAS4/container.1.cdsf                 37.40 TB Active
2026-04-10 07:05:15.000000 /data/1DCDASD3DASDAF6/container.1.cdsf                 37.40 TB Active

To Reproduce
See the queries above. I logged this as a bug rather than a feature request because I believe the correct prefixes for power-of-two amounts are those with the "i". It could equally be treated as a feature request to accept the "i" prefixes alongside the current ones.

Thank you.
