imp.find_module reacts badly to iterator #78598

Closed
PhillipMFeldman mannequin opened this issue Aug 16, 2018 · 5 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@PhillipMFeldman
Mannequin

PhillipMFeldman mannequin commented Aug 16, 2018

BPO 34417
Nosy @brettcannon, @Phillip_M_Feldman, @ericsnowcurrently

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2018-08-17.16:05:35.996>
created_at = <Date 2018-08-16.21:51:44.942>
labels = ['type-feature', 'library']
title = 'imp.find_module reacts badly to iterator'
updated_at = <Date 2018-08-21.18:37:08.513>
user = 'https://github.com/PhillipMFeldman'

bugs.python.org fields:

activity = <Date 2018-08-21.18:37:08.513>
actor = 'Phillip.M.Feldman@gmail.com'
assignee = 'none'
closed = True
closed_date = <Date 2018-08-17.16:05:35.996>
closer = 'eric.snow'
components = ['Library (Lib)']
creation = <Date 2018-08-16.21:51:44.942>
creator = 'Phillip.M.Feldman@gmail.com'
dependencies = []
files = []
hgrepos = []
issue_num = 34417
keywords = []
message_count = 5.0
messages = ['323623', '323660', '323820', '323837', '323838']
nosy_count = 3.0
nosy_names = ['brett.cannon', 'Phillip.M.Feldman@gmail.com', 'eric.snow']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue34417'
versions = ['Python 2.7']

@PhillipMFeldman
Mannequin Author

PhillipMFeldman mannequin commented Aug 16, 2018

imp.find_module goes down in flames if one tries to pass an iterator rather than a list of folders. Firstly, the message that it produces is somewhat misleading:

RuntimeError: sys.path must be a list of directory names

Secondly, it would be helpful if one could pass an iterator. I'm thinking in particular of the situation where one wants to import something from a large folder tree, and the module in question is likely to be found early in the search process, so that it is more efficient to explore the folder tree incrementally.

@PhillipMFeldman PhillipMFeldman mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Aug 16, 2018
@ericsnowcurrently
Member

There are several issues at hand here, Phillip. I'll enumerate them below.

Thanks for taking the time to let us know about this. However, I'm closing this issue since realistically the behavior of imp.find_module() isn't going to change, particularly in Python 2.7. Even though the issue is closed, feel free to reply, particularly about how you are using imp.find_module() (we may be able to point you toward how to use importlib instead).

Also, I've changed this issue's type to "enhancement". imp.find_module() is working as designed, so what you are looking for is a feature request. Consequently there's a much higher bar for justifying a change. Here are reasons why the requested change doesn't reach that bar:

  1. Python 2.7 is closed to new features.

So imp.find_module() is not going to change.

  2. Python 2.7 is nearing EOL.

We highly recommend that everyone move to Python 3 as soon as possible. Hopefully you are in a position to do so. If you're stuck on Python 2.7 then you miss the advantages of importlib, along with a ton of other benefits.

If you are not going to be able to migrate before 2020 then send an email to python-list@python.org asking for recommendations on what to do.

  3. Starting in Python 3.4, using the imp module is discouraged/deprecated.

"Deprecated since version 3.4: The imp package is pending deprecation in favor of importlib." [1]

The importlib package should have everything you need. What are you using imp.find_module() for? We should be able to demonstrate the equivalent using importlib.

  4. The import machinery is designed around using a list (the builtin type, not the concept) for the "module search path".
  • imp.find_module(): "the list of directory names given by sys.path is searched" [2]
  • imp.find_module(): "Otherwise, path must be a list of directory names" [2]
  • importlib.find_loader() (deprecated): "optionally within the specified path" (which defaults to sys.path) [3]
  • importlib.util.find_spec(): doesn't even have a "path" parameter [4]
  • ModuleSpec.submodule_search_locations: "List of strings for where to find submodules" [5]
  • sys.path: "A list of strings that specifies the search path for modules. ... Only strings and bytes should be added to sys.path; all other data types are ignored during import." [6]

[1] https://docs.python.org/3/library/imp.html#module-imp
[2] https://docs.python.org/3/library/imp.html#imp.find_module
[3] https://docs.python.org/3/library/importlib.html#importlib.find_loader
[4] https://docs.python.org/3/library/importlib.html#importlib.util.find_spec
[5] https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec.submodule_search_locations
[6] https://docs.python.org/3/library/sys.html#sys.path
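
Editorial note: the thread never shows the importlib equivalent mentioned above, so here is a minimal sketch of one way it might look on Python 3, assuming a plain source module on disk. The helper names `find_module_spec` and `load_from_spec` and the module name `demo_mod_a` are made up for illustration. Unlike imp.find_module(), this returns a ModuleSpec rather than a (file, pathname, description) tuple.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def find_module_spec(name, path=None):
    """Rough importlib counterpart of imp.find_module(name, path).

    `path` is a list of directory names, or None to search sys.path.
    Returns a ModuleSpec, or raises ImportError if nothing is found.
    (Hypothetical helper, not part of the standard library.)
    """
    spec = importlib.machinery.PathFinder.find_spec(name, path)
    if spec is None:
        raise ImportError("No module named %r" % name)
    return spec


def load_from_spec(spec):
    """Create and execute the module described by `spec`."""
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo: write a throwaway module into a temporary folder and locate it.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "demo_mod_a.py"), "w") as f:
        f.write("ANSWER = 42\n")

    spec = find_module_spec("demo_mod_a", [folder])
    module = load_from_spec(spec)
    print(spec.origin, module.ANSWER)
```

Note that the search path is still passed as a list here, in line with the documentation quoted above.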

@ericsnowcurrently ericsnowcurrently added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Aug 17, 2018
@PhillipMFeldman
Mannequin Author

PhillipMFeldman mannequin commented Aug 21, 2018

It appears that the importlib package has the same issue: One can't
provide an iterator for the path. When searching a large folder tree for
an item that is likely to be found early in the search process (i.e., at a
high level in the folder tree), the available functionality is massively
inefficient. So, I wrote my own wrapper for imp.find_module to do this
job, and will eventually modify this code to use importlib instead of
imp.
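
Editorial note: a minimal sketch of the incremental search described here, written against importlib on Python 3. It is one possible approach, not code from this thread. The names `find_spec_lazily`, `walk_folders`, and `lazy_demo_mod` are hypothetical. Each folder is wrapped in a one-element list before it is handed to the finder, so the finder still receives a list while the caller can supply any iterable, including a generator.

```python
import importlib.machinery
import os
import tempfile


def find_spec_lazily(name, folders):
    """Search `folders` (any iterable, including a generator) one at a time.

    Consumption of `folders` stops at the first folder that contains the
    module, so a lazily generated folder tree is only explored as far as
    needed. (Hypothetical helper, not part of importlib.)
    """
    for folder in folders:
        spec = importlib.machinery.PathFinder.find_spec(name, [folder])
        if spec is not None:
            return spec
    raise ImportError("No module named %r in the given folders" % name)


def walk_folders(root):
    """Yield `root` and then each of its subfolders, top-down and lazily."""
    for base, _subdirs, _files in os.walk(root):
        yield base


# Demo: the module sits at the top of the tree, so only one folder is visited.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "deep", "deeper"))
    with open(os.path.join(root, "lazy_demo_mod.py"), "w") as f:
        f.write("VALUE = 1\n")

    visited = []

    def traced(folders):
        # Record each folder as it is pulled from the underlying iterator.
        for folder in folders:
            visited.append(folder)
            yield folder

    spec = find_spec_lazily("lazy_demo_mod", traced(walk_folders(root)))
    print(spec.origin)
    print("folders visited:", len(visited))
```

One caveat with this sketch: if a folder contains a subdirectory with the same name as the requested module, the finder may report it as a namespace package, so a stricter version would check what kind of spec came back.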

@brettcannon
Member

Saying "the available functionality is massively inefficient" is unnecessarily hostile towards those of us who actually wrote and maintain that code. Without diving into the code, chances are that requirement is there so that the C code can use macros to access the list as efficiently as possible.

Now if you want to propose specific changes to importlib's code for it to work with iterables instead of just lists then we would be happy to review the pull request.

@PhillipMFeldman
Mannequin Author

PhillipMFeldman mannequin commented Aug 21, 2018

My apologies for the tone of my remark. I am grateful to you and others
who donate their time to develop the code.

I'm attaching the wrapper code that I created to work around the problem.

Phillip

import imp
import itertools
import os

def expander(paths='./*'):
   r"""
   OVERVIEW

   This function is a generator, i.e., creates an iterator that recursively
   searches a list of folders in an incremental fashion. This approach is
   advantageous when the folder tree(s) to be searched are large and the
   item of interest is likely to be found early in the process.

   INPUTS

   `paths` must be either (a) a list of folder paths (each of which is a
   string) or (b) a single string containing one or more folder paths
   separated by the OS-specific path delimiter.

   Each path in `paths` must be either (a) an existing folder or (b) an
   existing folder followed by '/*' or '\*'. In case (a), the folder string
   is copied from the input (`paths`) to the output result verbatim. In case
   (b), the folder string is replaced by an expanded list that includes not
   only the base (the portion of the path that remains after the '/*' or
   '\*' has been removed), but all subfolders as well.

   RETURN VALUES

   The returned value is an iterator.

   Invoking the `next` method of the iterator produces one folder path at a
   time.
   """

   if isinstance(paths, basestring):
      paths= paths.split(os.pathsep)
   elif not isinstance(paths, list):
      raise TypeError("paths must be either a string or a list of strings.")

   found= set()

   for path in paths:
      if path.endswith('/*') or path.endswith('\\*'):

         # A recursive search of subfolders is required:
         for item in os.walk(path[:-2]):
            base= os.path.abspath(item[0])
            new= [os.path.join(base, nested) for nested in item[1]]

            # Yield the base folder as well as its subfolders, as the
            # docstring promises; `found` suppresses duplicates.
            for folder in [base] + new:
               if folder not in found:
                  found.add(folder)
                  yield folder
      else:
         # No recursive search is required; pass the path through verbatim:
         if path not in found:
            found.add(path)
            yield path

   # end for path in paths

def find_module(module_name, in_folders=[]):
   """
   This function finds a module and returns the fully-qualified file name.
   Folders from `in_folders`, if specified, are searched first, followed by
   folders in the global `import_path` list.

   If any folder name in `in_folders` or `import_path` ends with an asterisk,
   indicating that a recursive search is required, `files.expander` is
   invoked to create iterators that return one folder at a time, and
   `imp.find_module` is invoked separately for each of these folders.

   EXPLICIT INPUTS

   `module_name` is the unqualified name of the module to be found.

   `in_folders` is an optional list of additional folders to be searched
   before the folders in `import_path` are searched.

   IMPLICIT INPUTS

   `import_path` is obtained from the global namespace.

   RETURN VALUES

   If `find_module` is able to find the requested module, it returns the same
   three return values (`f`, `filename`, and `description`) that
   `imp.find_module` would return.
   """

   if isinstance(in_folders, basestring):
      in_folders= [in_folders]
   elif not isinstance(in_folders, list):
      raise TypeError("If specified, `in_folders` must be either a string or a "
        "list of strings.  (A string is wrapped to produce a length-1 list).")

   if any(item.endswith('*') for item in in_folders) or \
      any(item.endswith('*') for item in import_path):

      last_error= None

      for folder in itertools.chain(
        expander(in_folders), expander(import_path)):
         try:
            # Search one folder at a time so that the expansion stays lazy:
            return imp.find_module(module_name, [folder])
         except ImportError as ex:
            last_error= ex

      # Nothing was found; re-raise the most recent failure, if any:
      if last_error is not None:
         raise last_error

   else:
      return imp.find_module(module_name, in_folders + import_path)
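
Editorial note: the wrapper above targets Python 2.7 (`basestring`, `imp`). For readers on Python 3, the following is a rough sketch of how the folder-expanding generator might be ported. It is an assumption-laden illustration rather than the author's code, and the name `expand_folders` is made up here. The demo pulls only the first two folders to show that the tree is walked incrementally.

```python
import itertools
import os
import tempfile


def expand_folders(paths):
    """Lazily yield folders from `paths`, expanding entries ending in '/*'.

    `paths` is a list of folder strings or a single os.pathsep-separated
    string. Entries ending in '/*' or '\\*' are walked recursively (base
    folder first); other entries are passed through verbatim. Duplicates
    are skipped. (Hypothetical Python 3 port, for illustration only.)
    """
    if isinstance(paths, str):
        paths = paths.split(os.pathsep)

    seen = set()
    for path in paths:
        if path.endswith(("/*", "\\*")):
            # os.walk is itself a generator, so subfolders are listed on demand.
            candidates = (
                os.path.abspath(base)
                for base, _subdirs, _files in os.walk(path[:-2])
            )
        else:
            candidates = [path]

        for folder in candidates:
            if folder not in seen:
                seen.add(folder)
                yield folder


# Demo: take only the first two folders from a three-level tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "one", "two"))
    folders = expand_folders([root + "/*"])
    first_two = list(itertools.islice(folders, 2))
    print(first_two)
```

Because the result is an ordinary generator, it can be fed one folder at a time to whichever finder is in use, wrapping each folder in a one-element list.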

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022