Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[utilities] Relax from hard dependency of python3-magic #3025

Merged

Conversation

pmoravec
Copy link
Contributor

For compatibility reasons on some distros, sos should not have a hard dependency on 'magic' python library. It should attempt to use it for detection of binary file content, but should fall back to previous "read the very first byte" method otherwise.

Resolves: #3025
Relates: #3021

Signed-off-by: Pavel Moravec pmoravec@redhat.com


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname email@example.com?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?

@pmoravec
Copy link
Contributor Author

Cc @nkshirsagar - please let me know if this PR would resolve the #3021 (in upstream).

@packit-as-a-service
Copy link

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3025
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

Copy link
Member

@TurboTurtle TurboTurtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the failover method makes sense, but is there a way we can make python3-magic a similarly weak dep for setup.py?

Since the current approach depends on a method that only exists in 0.4.19, should we leave the dependency in there, but simply fallback to the "read a byte" approach when an older version is installed?

@pmoravec
Copy link
Contributor Author

Having the failover method makes sense, but is there a way we can make python3-magic a similarly weak dep for setup.py?

distutils.core can define just hard dependencies ('requires', 'provides', 'obsoletes', 'depends') but not a weak one :(

Since the current approach depends on a method that only exists in 0.4.19, should we leave the dependency in there, but simply fallback to the "read a byte" approach when an older version is installed?

You mean "try importing magic library, if succeeds and version is >=0.4.20, always use the method from magic; otherwise always fallback to the 'read a byte' approach"? That makes sense, and keep that in a variable to test inside file_is_binary method.

@TurboTurtle
Copy link
Member

You mean "try importing magic library, if succeeds and version is >=0.4.20, always use the method from magic; otherwise always fallback to the 'read a byte' approach"? That makes sense, and keep that in a variable to test inside file_is_binary method.

Yes, but also something that provides for "Hey, $user, the sos team really suggests having python3-magic installed here, and it should be installed alongside sos if at all possible, but just in case that can't be done, you're still covered." Effectively, try to ensure that python3-magic gets installed (so, I guess, keep it as a dep), but provide failover in case that's not possible but we still want to support generating an archive.

@nkshirsagar
Copy link
Contributor

Cc @nkshirsagar - please let me know if this PR would resolve the #3021 (in upstream).
yes it would. Thanks!

We have patched downstream with a very similar patch for bionic/focal Ubuntu versions, pasted below,

Description: Revert to old way of binary file detection
 Reverting to older (4.3) way of detecting binary files 
 because sos 4.4 uses python3-magic version 0.4.20 which is 
 not present in bionic.
 .
 sosreport (4.4-1ubuntu0.18.04.1) bionic; urgency=medium
 .
   * New 4.4 upstream. (LP: #1986611)
 .
   * For more details, full release note is available here:
     - https://github.com/sosreport/sos/releases/tag/4.4
 .
   * Former patches, now fixed:
     - d/p/0002-fix-setup-py.patch
     - d/p/0003-mention-sos-help-in-sos-manpage.patch
 .
   * Remaining patches:
     - d/p/0001-debian-change-tmp-dir-location.patch
 .
   * New patches:
     - d/p/0002-revert-to-old-style-binary-file-detection.patch
Author: Nikhil Kshirsagar <nikhil.kshirsagar@canonical.com>
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1986611

--- sosreport-4.4.orig/requirements.txt
+++ sosreport-4.4/requirements.txt
@@ -2,5 +2,4 @@ pycodestyle>=2.4.0
 coverage>=4.0.3
 Sphinx>=1.3.5
 pexpect>=4.0.0
-python_magic>=0.4.20
 pyyaml
--- sosreport-4.4.orig/setup.py
+++ sosreport-4.4/setup.py
@@ -107,7 +107,7 @@ setup(
     ],
     cmdclass=cmdclass,
     command_options=command_options,
-    requires=['pexpect', 'python_magic', 'pyyaml']
+    requires=['pexpect', 'pyyaml']
     )
 
 
--- sosreport-4.4.orig/sos/utilities.py
+++ sosreport-4.4/sos/utilities.py
@@ -19,7 +19,6 @@ import tempfile
 import threading
 import time
 import io
-import magic
 
 from contextlib import closing
 from collections import deque
@@ -75,17 +74,14 @@ def file_is_binary(fname):
     :returns:   True if binary, else False
     :rtype:     ``bool``
     """
-    try:
-        _ftup = magic.detect_from_filename(fname)
-        _mimes = ['text/', 'inode/']
-        return (
-            _ftup.encoding == 'binary' and not
-            any(_ftup.mime_type.startswith(_mt) for _mt in _mimes)
-        )
-    except Exception:
-        # if for some reason this check fails, don't blindly remove all files
-        # but instead rely on other checks done by the component
-        return False
+    with open(fname, 'tr') as tfile:
+        try:
+            # when opened as above (tr), reading binary content will raise
+            # an exception
+            tfile.read(1)
+            return False
+        except UnicodeDecodeError:
+            return True
 
 
 def find(file_pattern, top_dir, max_depth=None, path_pattern=None):

So we are going ahead with our 4.4 releases with that for now, and for 4.5 we will revert the patch so as to use the upstream source.

pmoravec added a commit to pmoravec/sos that referenced this pull request Sep 13, 2022
For compatibility reasons on some distros, sos should not have a hard
dependency on 'magic' python library. It should attempt to use it for
detection of binary file content, but should fall back to previous "read
the very first byte" method otherwise.

Resolves: sosreport#3025
Relates: sosreport#3021

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
@pmoravec pmoravec force-pushed the sos-pmoravec-dont-hard-depend-on-magic branch from 49401a6 to 429fae6 Compare September 13, 2022 07:17
@pmoravec
Copy link
Contributor Author

Good point with the warning, I added it.

Yes the downstream Ubuntu patch was an inspiration :) - my question was rather to ensure the upstream solution will properly work on Ubuntu without a need of a downstream patch.

@nkshirsagar
Copy link
Contributor

Good point with the warning, I added it.

Yes the downstream Ubuntu patch was an inspiration :) - my question was rather to ensure the upstream solution will properly work on Ubuntu without a need of a downstream patch.

The warning "It is recommended to install proper python3-magic package with the module." is moot on Ubuntu bionic/focal versions because we know they will not have the required python3-magic package. The upstream patch caters to both situations but the debian patch simply assumes no such python3-magic version will be available, so no warning is needed, and we simply revert to older approach for those distros. Thinking ahead, for 4.5 we could potentially revert the debian patch, but we'd end up spewing a warning.

@pmoravec
Copy link
Contributor Author

Feel free to propose any better warning text (in either better meaning or better English), I know I am not the best writer for either :).

@nkshirsagar
Copy link
Contributor

Feel free to propose any better warning text (in either better meaning or better English), I know I am not the best writer for either :).

The warning is perfect for distributions where the upgrade of that package is actually an option :-)

sos/utilities.py Outdated
import magic
magic.detect_from_filename(__file__)
magic_mod = True
except ImportError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd also need to catch AttributeError for cases where older magic is installed and detect_from_filename() is not present.

pmoravec added a commit to pmoravec/sos that referenced this pull request Sep 13, 2022
For compatibility reasons on some distros, sos should not have a hard
dependency on 'magic' python library. It should attempt to use it for
detection of binary file content, but should fall back to previous "read
the very first byte" method otherwise.

Resolves: sosreport#3025
Relates: sosreport#3021

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
@pmoravec pmoravec force-pushed the sos-pmoravec-dont-hard-depend-on-magic branch 2 times, most recently from 43d41ed to c9861b2 Compare September 14, 2022 20:46
sos/utilities.py Outdated
import magic
magic.detect_from_filename(__file__)
magic_mod = True
except ImportError, AttributeError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax error is due to these being non-parenthesized.

For compatibility reasons on some distros, sos should not have a hard
dependency on 'magic' python library. It should attempt to use it for
detection of binary file content, but should fall back to previous "read
the very first byte" method otherwise.

Resolves: sosreport#3025
Relates: sosreport#3021

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
@pmoravec pmoravec force-pushed the sos-pmoravec-dont-hard-depend-on-magic branch from c9861b2 to 2cf0d3c Compare September 18, 2022 19:54
@TurboTurtle TurboTurtle merged commit 4245de0 into sosreport:main Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants