Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys.audit() is called multiple times for glob.glob() #82330

Closed
serhiy-storchaka opened this issue Sep 13, 2019 · 9 comments
Closed

sys.audit() is called multiple times for glob.glob() #82330

serhiy-storchaka opened this issue Sep 13, 2019 · 9 comments
Labels
3.8 (EOL) end of life 3.9 only security fixes stdlib Python modules in the Lib dir

Comments

@serhiy-storchaka
Copy link
Member

BPO 38149
Nosy @serhiy-storchaka, @zooba, @miss-islington, @tirkarthi
PRs
  • bpo-38149: Call sys.audit() only once per call for glob.glob(). #18360
  • [3.8] bpo-38149: Call sys.audit() only once per call for glob.glob(). (GH-18360) #18373
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-02-06.11:08:20.855>
    created_at = <Date 2019-09-13.09:07:51.017>
    labels = ['3.8', 'library', '3.9']
    title = 'sys.audit() is called multiple times for glob.glob()'
    updated_at = <Date 2020-02-06.11:08:20.853>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2020-02-06.11:08:20.853>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-02-06.11:08:20.855>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2019-09-13.09:07:51.017>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38149
    keywords = ['patch']
    message_count = 9.0
    messages = ['352248', '352252', '352264', '352269', '352297', '352303', '352304', '361474', '361475']
    nosy_count = 4.0
    nosy_names = ['serhiy.storchaka', 'steve.dower', 'miss-islington', 'xtreak']
    pr_nums = ['18360', '18373']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue38149'
    versions = ['Python 3.8', 'Python 3.9']

    @serhiy-storchaka
    Copy link
    Member Author

    sys.audit() for "glob.glob" is called in a recursive function _iglob(). So in process of executing glob.glob() it can be called multiple times, with the original pattern and with patterns for parent directories, while there are metacharacters in them.

    I do not think it was intentional.

    @serhiy-storchaka serhiy-storchaka added 3.8 (EOL) end of life 3.9 only security fixes stdlib Python modules in the Lib dir labels Sep 13, 2019
    @zooba
    Copy link
    Member

    zooba commented Sep 13, 2019

    Provided it's called with different arguments each time (which it is), there isn't a problem here. Audit hooks are supposed to be informative, not definitive (that is, you almost always need to take the surrounding context into consideration, which is why they are better for logging actions rather than blocking actions).

    Though it might be unintentional that glob.glob() recursively handles each path segment when it could more efficiently work forward resolving each wildcard segment as it goes. Is that what you mean?

    @serhiy-storchaka
    Copy link
    Member Author

    Using recursion is rather an implementation detail, because splitpath() splits a path from the end (Path.glob() does it from other side). Also, it may be a tiny bit more efficient if the pattern contains a long literal prefix.

    @serhiy-storchaka
    Copy link
    Member Author

    Ah, and when added sys.audit() for glob.glob(), should not it be added for Path.glob() too? And for os.walk()?

    @zooba
    Copy link
    Member

    zooba commented Sep 13, 2019

    when added sys.audit() for glob.glob(), should not it be added for Path.glob() too? And for os.walk()?

    Sure, those would make sense.

    They all go via scandir() which has its own event too (and will do for each directory), but being able to see that it came from a glob or walk call could be useful information.

    Maybe moving the sys.audit call in glob.glob further down in the function would avoid some of the recursion? I don't want to duplicate where it's raised too much, as that provides opportunities to bypass it, but this one is only slightly informative - os.scandir is the real one that you'd worry about seeing.

    @serhiy-storchaka
    Copy link
    Member Author

    I think it would be better to call sys.audit() in glob.iglob(), before calling the top-level _iglob(). It will give you a general context, and at every recursion level sys.audit() will be called for os.scandir().

    @serhiy-storchaka
    Copy link
    Member Author

    It is harder to avoid repeating calls in os.walk(), because it is recursive itself. But you can introduce a recursive helper _walk() and make os.walk() just calling sys.audit() and _walk().

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 54b4f14 by Serhiy Storchaka in branch 'master':
    bpo-38149: Call sys.audit() only once per call for glob.glob(). (GH-18360)
    54b4f14

    @miss-islington
    Copy link
    Contributor

    New changeset 708f472 by Miss Islington (bot) in branch '3.8':
    bpo-38149: Call sys.audit() only once per call for glob.glob(). (GH-18360)
    708f472

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life 3.9 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants