@gazpachoking gazpachoking commented Jul 17, 2023

Motivation for changes:

We had written parsing functions for human-readable file sizes in several different places. I consolidated these into one main function in our tools package. While doing this, I noticed a mix of file sizes handled in bytes and in mebibytes, so I've changed everything to deal in bytes only. The main change here is the content_size field. I've updated all the plugins I could find to use the field with bytes, but we'll still need to write an upgrade action and do a version bump in case anyone was using this field manually in their config (or in a custom plugin).

Detailed changes:

  • All plugins reading or writing to content_size field have been changed to use bytes as the unit.
  • content_size plugin has been changed to allow human readable sizes, e.g. 3GiB
  • parse_size jinja filter now does an unanchored regex search, but requires word boundaries around file size matches
  • 'case' option to parse_size jinja filter was removed. It's always case-insensitive now
  • The parse_size jinja filter has always returned bytes. Now that the content_size field uses bytes as well, users can set that field easily using this filter
  • Sizes in config schemas need the 'B' now, e.g. 1GB, not 1G
  • The formerly unused utility function format_filesize was updated: it now takes a number of bytes and outputs a human-readable size. This is now used in log methods when showing file sizes.
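
To make the behaviour described in the list above concrete, here is a minimal sketch of a parser and a formatter with those properties: unanchored, case-insensitive, word boundaries around the match, a mandatory 'B', and bytes as the only internal unit. The names `parse_size_sketch` and `format_size_sketch`, the regex, and the output format are assumptions for illustration only, not the functions from this PR.

```python
import re

# Hypothetical prefix table. Every prefix is treated as binary (powers of 1024).
_PREFIXES = ["", "K", "M", "G", "T", "P"]

# Unanchored and case-insensitive, with word boundaries around the match.
# The trailing "B" is mandatory, so "1G" does not match but "1GB" does.
_SIZE_RE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(?:([KMGTP])i?)?B\b", re.IGNORECASE)


def parse_size_sketch(text: str) -> int:
    """Find the first file size in `text` and return it as a number of bytes."""
    match = _SIZE_RE.search(text)
    if not match:
        raise ValueError(f"no file size found in {text!r}")
    number, prefix = match.groups()
    exponent = _PREFIXES.index((prefix or "").upper())
    return int(float(number) * 1024 ** exponent)


def format_size_sketch(num_bytes: int) -> str:
    """Render a number of bytes as a human-readable size in binary units."""
    size = float(num_bytes)
    for prefix in _PREFIXES:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit we know about.
        if abs(size) < 1024 or prefix == _PREFIXES[-1]:
            unit = f"{prefix}iB" if prefix else "B"
            return f"{size:.2f} {unit}"
        size /= 1024
    raise AssertionError("unreachable: the last prefix always returns")
```

Because the search is unanchored, a size embedded in a longer string such as a release title is still found, while the word boundaries keep it from matching inside other tokens.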

Addressed issues/feature requests:

Config usage if relevant (new plugin or updated schema):

set:
  # Previously, the parse_size filter produced bytes, but content_size wanted mebibytes, so this didn't work properly without extra division in the template.
  content_size: "{{ title|parse_size }}"
content_size:
  # Previously, this only accepted integers, interpreted as a number of mebibytes. Allowing units is much nicer.
  min: 10 GiB
  max: 20 GiB
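
The unit mismatch that the config comments describe can be worked through in a few lines. This is only an arithmetic sketch: `old_content_size_value` and `within_limits` are hypothetical helpers, and treating the min/max bounds as inclusive is an assumption, not something this PR states.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def old_content_size_value(size_bytes: int) -> float:
    """Before this change: the field wanted mebibytes, so a byte count
    (such as the output of parse_size) had to be divided by 1024**2."""
    return size_bytes / MIB


def within_limits(size_bytes: int, min_bytes: int, max_bytes: int) -> bool:
    """After this change: the field and the limits are all in bytes,
    so they compare directly. Inclusive bounds are assumed here."""
    return min_bytes <= size_bytes <= max_bytes


entry_size = 15 * GIB  # e.g. the byte count produced for "15 GiB"
print(old_content_size_value(entry_size))           # value the old field expected, in MiB
print(within_limits(entry_size, 10 * GIB, 20 * GIB))  # limits from the config above
```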

To Do:

  • Right now sizes in the config are always in binary units, with no way to specify metric ones. Should we require binary units to be specified explicitly, to allow for metric specification? i.e. 1 GB would be metric, and 1 GiB binary. Currently, both are binary. This was not strictly true: each plugin ends up choosing this itself. But perhaps we should make the mode that allows metric sizes the default, rather than opt-in.
  • Add special handling for --dump, such that content_size field prints in human readable units?
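
The first To Do item can be sketched as follows, assuming a hypothetical `allow_metric` switch: when it is on, a unit without the "i" (1 GB) is read as a power of 1000 and a unit with it (1 GiB) as a power of 1024; when it is off, both are binary. The function name, the parameter, and its default are illustrative and not a decision made in this PR.

```python
import re

_PREFIXES = "KMGTP"
# Capture the optional "i" separately so the two spellings can be told apart.
_SIZE_RE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(?:([KMGTP])(i)?)?B\b", re.IGNORECASE)


def parse_size_with_metric(text: str, allow_metric: bool = True) -> int:
    """Return a size in bytes, optionally distinguishing GB from GiB."""
    match = _SIZE_RE.search(text)
    if not match:
        raise ValueError(f"no file size found in {text!r}")
    number, prefix, binary_marker = match.groups()
    if prefix is None:
        # Plain bytes, e.g. "512 B".
        return int(float(number))
    exponent = _PREFIXES.index(prefix.upper()) + 1
    # Metric only when it is allowed and the unit has no "i".
    base = 1000 if allow_metric and binary_marker is None else 1024
    return int(float(number) * base ** exponent)
```

With this shape, making metric the default versus opt-in comes down to the default value of the switch.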

@gazpachoking gazpachoking merged commit e0fb206 into develop Jul 29, 2023
@gazpachoking gazpachoking deleted the content_size_bytes branch August 30, 2023 01:17