Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add structured version info for compression modules #117404

Open
serhiy-storchaka opened this issue Mar 31, 2024 · 6 comments
Open

Add structured version info for compression modules #117404

serhiy-storchaka opened this issue Mar 31, 2024 · 6 comments
Labels
3.13 bugs and security fixes type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Mar 31, 2024

Feature or enhancement

The zlib module has two attributes that denote the zlib library version:

But the version string is not comfortable to use. You need to convert it to a sequence of numbers for comparison, this is boring and errorprone. It would be better to provide a structural version, like sys.version_info.

Other compression modules do not provide versions at all. The bzlib library has function BZ2_bzlibVersion() which returns the version string. The lzma version has a number of constants and functions (LZMA_VERSION_STRING, LZMA_VERSION, lzma_version_string(), lzma_version_number()) which return static and runtime versions as a string or packed into a number.

#115989 by @ivq initially included addition of lzma.LZMA_VERSION and lzma.LZMA_RUNTIME_VERSION for version strings (clearly inspired by zlib). @gpshead renamed them to lzma.LZMA_HEADER_VERSION_STRING and lzma.LZMA_VERSION_STRING and added lzma.LZMA_HEADER_VERSION and lzma.LZMA_VERSION for versions packed into a number.

I propose to add attributes for version strings and version named tuples for all compression modules. I think that named tuples are more useful than packed numbers. I created a patch for this, but I am not sure about names. Should we use "RUNTIME_" or "HEADER_" prefixes? Should we add "_STRING" or "_info" suffix? Do we need the version of the library that was used for building the module or the version of the library actually loaded by the interpreter is enough?

Linked PRs

@serhiy-storchaka serhiy-storchaka added type-feature A feature request or enhancement 3.13 bugs and security fixes labels Mar 31, 2024
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Mar 31, 2024
* Add named tuples zlib_version and zlib_runtime_version in the zlib
  module.
* Add string BZLIB_VERSION and named tuple bzlib_version in the bz2
  module.
* Add strings LZMA_VERSION and LZMA_RUNTIME_VERSION and named tuples
  lzma_version and lzma_runtime_version in the lzma module.
@gpshead
Copy link
Member

gpshead commented Mar 31, 2024

Thanks for looking at the larger picture. On that lzma PR in particular, the reason I went for "LZMA_VERSION" and "LZMA_HEADER_VERSION" is that the only one most people are going to ever find reason to use is the runtime version so it should be the shorter name. the header version generally isn't important, we expose it because we can.

Same with string versions, having a properly consumable tuple or numeric version is useful. The strings are a mere convenience. We could do without them but making them available doesn't hurt.

I also think we should reconsider the naming: don't include the name prefix on the module member name. It's in the lzma or zlib module, it doesn't need a LZMA_ or ZLIB_ or similar prefix on the name. (It should also be excluded from __all__)

For existing names that Modules already have, there is no reason to deprecate the existing ones. But adding new ones should use the brief names making the structured version info the simplest one type and get at. lzma.version for example.

Priority:

  • expose runtime versions in structured form via a simple to type name such as .version.

Everything else is a bonus feature.

@serhiy-storchaka
Copy link
Member Author

Thank you for your feedback Gregory.

Some information and reasoning:

  1. In the curses module we use ncurses_version. The ncurses_ prefix is importand, because ncurses is just one of many implementations of curses. Other implementations have different versions, and AFAIK they do not provide version programmatically. This is perhaps a lesser issue for zlib, bzlib and lzma libraries, the alternative implementations, if exists, most likely provide a version compatible with the main implementation. But zlib-ng provides also its own version which can be exposed with the zlibng_ prefix.
  2. They are needed to be in __all__ to show them in the module help() output. This is yet one reason to use unique prefixes. Unless we reuse __version__ for this (which for now should be a string). But it may make more difficult for users to discover this feature.
  3. The string representation can contain additional information not exposed in the named tuple. See sys.version and sys.version_info for example. I tried to use a hybrid approach in Tkinter: info_patchlevel() returns a named tuple, but its string representation was a version string -- so you can use it in both purposes. But it was only possible because the named tuple contained all information of version string. In other cases we need to add more fields (maybe hidden).

@serhiy-storchaka
Copy link
Member Author

The sqlite3 module exposes version, version_info, sqlite_version and sqlite_version_info. The former two are not the Sqlite version, they are the version of the pysqlite3 module when it was developed outside of the stdlib. Currently they are deprecated, because they do not contain useful information and can only cause confusion.

@gpshead
Copy link
Member

gpshead commented Apr 1, 2024

#98567 regarding zlib version when linked to zlib-ng is an example of why structured version info is useful. And perhaps why the string version can also be useful: it allows alternative implementations to identify themselves.

@gpshead
Copy link
Member

gpshead commented Apr 1, 2024

I do not think CPython should be in the business of proactively identifying alternative implementations. There are infinite possibilities there and we'd always be chasing down ones to add specific ID logic for. We should not waste effort on that unsolvable task. ie we should never have a .zlibng_version attribute.

We do not want users to have to use hasattr to try and find attributes that may or may not exist based on some choice someone made at build time.

When alternative implementations are used, it is up to them to provide "equivalent to" version numbers. They likely already do for the purposes of being drop in replacement compatible at the C level.

@gpshead
Copy link
Member

gpshead commented Apr 1, 2024

They are needed to be in all to show them in the module help() output.

I don't really care if they are in __all__ or not, so if desired, fine. Anyone using from module import * has shot themselves in the foot already. =)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants