Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shutil.copytree glob-style filtering [patch] #46915

Closed
tarekziade mannequin opened this issue Apr 20, 2008 · 16 comments
Closed

shutil.copytree glob-style filtering [patch] #46915

tarekziade mannequin opened this issue Apr 20, 2008 · 16 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@tarekziade
Copy link
Mannequin

tarekziade mannequin commented Apr 20, 2008

BPO 2663
Nosy @birkenfeld, @abalkin, @giampaolo, @tarekziade
Files
  • copytree.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2008-07-05.10:13:53.082>
    created_at = <Date 2008-04-20.21:28:22.746>
    labels = ['type-feature', 'library']
    title = 'shutil.copytree glob-style filtering [patch]'
    updated_at = <Date 2008-07-05.10:13:53.072>
    user = 'https://github.com/tarekziade'

    bugs.python.org fields:

    activity = <Date 2008-07-05.10:13:53.072>
    actor = 'georg.brandl'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-07-05.10:13:53.082>
    closer = 'georg.brandl'
    components = ['Library (Lib)']
    creation = <Date 2008-04-20.21:28:22.746>
    creator = 'tarek'
    dependencies = []
    files = ['10417']
    hgrepos = []
    issue_num = 2663
    keywords = ['patch']
    message_count = 16.0
    messages = ['65652', '65663', '65670', '65682', '65906', '65907', '65908', '65909', '65919', '65924', '65925', '66149', '67158', '67225', '67269', '69276']
    nosy_count = 6.0
    nosy_names = ['georg.brandl', 'belopolsky', 'gustavo', 'draghuram', 'giampaolo.rodola', 'tarek']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue2663'
    versions = ['Python 2.6', 'Python 3.0']

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented Apr 20, 2008

    Here's a first draft of a small addon to shutil.copytree.

    This patch allows excluding some folders or files from the copy, given
    glob-style patterns. A callable can also be provided instead of the
    patterns, for a more complex filtering.

    I didn't upgrade Doc/shutil.rst yet in this patch, as this can be done
    when the change will be accepted and in its final shape I guess.

    @tarekziade tarekziade mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Apr 20, 2008
    @abalkin
    Copy link
    Member

    abalkin commented Apr 21, 2008

    On the interface, I would suggest renaming 'exclude' to 'ignore' for
    consistency with filecmp.dircmp. Also consider detecting file separator
    in the patterns and interpreting them as an absolute (if
    pattern.startswith(pathsep)) or relative with respect to src.

    On the implementation, consider making 'exclude_files' a set for a
    faster lookup. It should also be possible to refactor the code to
    avoid checking the type of 'exclude' on every file and every recursion.

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented Apr 22, 2008

    I changed the patch based on all remarks. For the absolute path, I was
    wondering if it would be useful since calls are recursive, relative to
    the visited directory.

    @draghuram
    Copy link
    Mannequin

    draghuram mannequin commented Apr 22, 2008

    Is there any reason for rmtree also to not support this exclusion
    feature? Both copytree and rmtree explicitly iterate over list of names
    and as I see it, this exclusion is really about which names to ignore.
    Already, copytree and rmtree have inconsistencies (rmtree has 'onerror'
    while 'copytree' doesn't) and it would be nice to not add more.

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented Apr 27, 2008

    Agreed, rmtree should have it as well. I'll add that in the patch as well,

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented Apr 27, 2008

    while working on the patch to add the same feature in rmtree, I realized
    this is a non sense since the root folder itself is removed at the end
    of the function when all its content is removed.

    So, unless we change this behavior, which I doubt it is a good idea, it
    won't be possible.

    Maybe another API could be added in shutil, in order to do any kind of
    treatment in a tree, like removing files, or whatever, and without
    copying it like copytree does.

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented Apr 27, 2008

    I have thaught of various ways to write this new API for the deletion
    use case, but I think nothing makes it easier and shorter than a simple
    os.walk call.

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented Apr 28, 2008

    This patch includes the documentation for shutils.rst as well. (I
    removed the older patches)

    @draghuram
    Copy link
    Mannequin

    draghuram mannequin commented Apr 28, 2008

    My update with email failed so I am just copying my response here:

    while working on the patch to add the same feature in rmtree, I realized
    this is a non sense since the root folder itself is removed at the end
    of the function when all its content is removed.

    Indeed. Sorry about that.

    So, unless we change this behavior, which I doubt it is a good idea, it
    won't be possible.

    I agree. But in general, it would be nice to separate file list
    generation and the actual operation. Something similar to shell where
    it resolves the pattern while the actual command itself cares only
    about the files passed to it. This is not necessarily a comment on
    this patch which I am hoping I can check it out soon.

    @draghuram
    Copy link
    Mannequin

    draghuram mannequin commented Apr 28, 2008

    The patch looks good to me.

    @draghuram
    Copy link
    Mannequin

    draghuram mannequin commented Apr 28, 2008

    I forgot to add that the example provided in rst doc is incorrect. The
    copytree() in that example should be given destination path as well. In
    addition, the docstring for copytree mentions "which is a directory
    list". "directory list" is a bit vague and should ideally be replaced by
    something like "list of elements" (which is what appears in the doc) or
    "list of entries".

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented May 3, 2008

    Right, thanks.

    I have corrected the doc, and pushed some examples at the bottom of the
    module documentation.

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented May 21, 2008

    patch with the new trunk

    @birkenfeld
    Copy link
    Member

    Hi Tarek,

    here's a review:

    • The new docs are not very clear about ignore_patterns being a function
      factory. E.g.:

    """The callable must return a list of folder and file names relative to
    the path, that will be ignored in the copy process.
    :func:`ignore_patterns` is an example of such callable."""

    Rather, the *return value* of ignore_patterns is an example of such a
    callable.

    • The new docs should also note that copytree is called recursively, and
      therefore the ignore callable will be called once for each directory
      that is copied.

    • Instead of "path and its elements" the terminology should be
      "directory" and "the list of its contents, as returned by os.listdir()".
      Likewise, "folder" should be "directory".

    • The second new example makes me wonder if *ignore* is the correct name
      for the parameter. Is *filter* better?

    • A nit; the signature should be "copytree(src, dst[, symlinks[, ignore]])".

    • The patch adds a space in the definition of rmtree().

    @tarekziade
    Copy link
    Mannequin Author

    tarekziade mannequin commented May 23, 2008

    Thanks Georg, I have changed the patch accordingly.

    There's one issue left: the name of the parameter (ignore)

    I have renamed it like this on Alexander suggestion, for consistency
    with filecmp.dircmp which uses ignore.

    By the way, I was wondering: do we need to used reStructuredText as well
    in function doctstrings ?

    @birkenfeld
    Copy link
    Member

    Committed in r64722. Thanks everyone!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants