Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-32689: Updates shutil.move to allow for Path objects to be used as source arg #5393

Closed

Conversation

emilyemorehouse
Copy link
Member

@emilyemorehouse emilyemorehouse commented Jan 28, 2018

Updates shutil.move function to allow for Path objects to be used as source argument by casting to string before performing rstrip. Added documentation on why rstrip needs to be used in the helper _basename method.

https://bugs.python.org/issue32689

Lib/shutil.py Outdated
# 'foo'
# >>> os.path.basename('/bar/foo/')
# ''
# _basename('/bar/foo/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing '>>> '

@@ -550,7 +557,10 @@ def move(src, dst, copy_function=copy2):
os.rename(src, dst)
return

real_dst = os.path.join(dst, _basename(src))
# Using _basename instead of os.path.basename is important, as we must
# ignore any trailing slash to avoid the basename returning ''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the need for str() too here? (It's clear from the PR title, but it won't be clear immediately to future readers of the code.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

…be used as source argument by casting to string before performing rstrip. Added documentation on why rstrip needs to be used in the helper _basename method.
@gvanrossum
Copy link
Member

This will LGTM once you add a Misc/NEWS entry (see the "test" failure).

# ignore any trailing slash to avoid the basename returning ''
# Forcing src to a string allows flexibility of objects being passed in,
# including Path objects
real_dst = os.path.join(dst, _basename(str(src)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to not use os.fspath(src) here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to use os.fspath(src) instead of casting to a string?

I suppose it would guard users against passing in anything that isn't a string, bytes or path. Otherwise, if it is a path, __fspath__ simply calls str() anyway. However, _basename relies on having a string as its argument, bytes would indeed break that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fspath() is better. I'm surprised there are no other functions in shutil.py with the same need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fspath() is used elsewhere in shutil.py, but it seems that only shutil.move needs to handle its basename differently. If we really want to use fspath() now, I can look at refactoring _basename(), otherwise I can add this to my list of things to improve for shutil.py.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shutil doesnt explicitly document that operations don't support all path-like objects (str, bytes, and os.PathLike), though the copy* functions do say their arguments should be strings . For the most part shutil does work with path-like objects, but _basename doesn't.

Since _basename breaks the abstraction of os.path.basename, it's _basename that needs to be fixed, not everywhere that it gets called, even if for now that's only in one place. It should use path = os.fspath(path). If this returns bytes, it should also use sep = os.fsencode(sep).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that's actually a typo in the PR name (it wouldn't let me edit it after I realized). This modifies shutil.move not shutil.copy, which doesn't specify that its arguments are strings.

I'm not sure I'd say that _basename breaks the abstraction of os.path.basename, it just gives us part of the path with different requirements. But, I didn't write the original code :)

I'm happy to take another look at this though.

@gvanrossum
Copy link
Member

gvanrossum commented Feb 14, 2018 via email

@gvanrossum gvanrossum changed the title bpo-32689: Updates shutil.copy to allow for Path objects to be used as source arg bpo-32689: Updates shutil.move to allow for Path objects to be used as source arg Jul 10, 2018
@gvanrossum
Copy link
Member

I have superpowers that let me edit the subject. :-) I'll look into whether you can have them too (also, I thought everyone could edit their own subject -- unsure why you couldn't in this case.)

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emilyemorehouse: would you mind to add an unit test to make sure that the reported bug no longer fails? https://bugs.python.org/issue32689#msg310900

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@emilyemorehouse
Copy link
Member Author

@vstinner of course! i'm planning on tackling this during the sprint, probably won't have time before then

@vstinner
Copy link
Member

Ok, fine ;-)

# >>> os.path.basename('/bar/foo/')
# ''
# >>> _basename('/bar/foo/')
# 'foo'
sep = os.path.sep + (os.path.altsep or '')
Copy link
Contributor

@BoboTiG BoboTiG Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind considering patching here with something like:

    if isinstance(path, os.PathLike):
        return path.name

    sep = os.path.sep + (os.path.altsep or '')
    return os.path.basename(path.rstrip(sep))

?

That would be the only place to apply changes then.

@vstinner
Copy link
Member

@emilyemorehouse: Ping? I still don't see a new test to the change? Are you available to add it? Or would you prefer that I try to find someone else to write it?

@csabella
Copy link
Contributor

@emilyemorehouse It seems like this is ready to go pending the addition of the test that Victor requested. Would you be able to add that test for this change to make it into 3.8? Thanks!

@BoboTiG
Copy link
Contributor

BoboTiG commented Jul 16, 2019

@emilyemorehouse Do you mind if I continue the work on that subject? I am currently pacthing shutil._basename() manually, and that is quite ugly :/

@maxwellmckinnon
Copy link
Contributor

How is this for making it into 3.8? Pathlib really needs it. Needing to do str() at random functions instead of having something that cleaning does the __fspath__ consistently is a dealbreaker for a library that is supposed to make path handling simple and work with everything else.

In other words, shutil is not PEP 519 compliant, and it would be great if it were.

@maxwellmckinnon
Copy link
Contributor

Ah, the problem is that os.fspath(x) returns bytes if x is bytes, which breaks _basename. Oh well.

_basename already breaks on bytes. Not an introduced break from this code. I'll add a test, and let's get this in please!

# ignore any trailing slash to avoid the basename returning ''
# Forcing src to a string allows flexibility of objects being passed in,
# including Path objects
real_dst = os.path.join(dst, _basename(str(src)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?

e.g.

real_dst = os.path.join(dst, _basename(src))
becomes
real_dst = Path(dst) / Path(src).name

I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.

Here's the performance difference for this step.

In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.

In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

62us to rename. 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.

What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question, would it then be poor form to drop _basename from python3.8? Not sure on the policy regarding removing privates without a deprecation warning. Could always deprecate it for removal in 3.9.

@maxwellmckinnon
Copy link
Contributor

maxwellmckinnon commented Aug 18, 2019

Added a test that fails but is addressed to pass in this PR

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike FAILED                                                               [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestMove.test_move_file_pathlike ___________________________________________________

self = <test.test_shutil.TestMove testMethod=test_move_file_pathlike>

    def test_move_file_pathlike(self):
        # Move a file to another location on the same filesystem.
        src = pathlib.Path(self.src_file)
>       self._check_move_file(src, self.dst_dir, self.dst_file)

Lib/test/test_shutil.py:1563:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Lib/test/test_shutil.py:1545: in _check_move_file
    shutil.move(src, dst)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:562: in move
    real_dst = os.path.join(dst, _basename(src))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/var/folders/r2/psq74t5x3nbfzlph8bh2pvdw0000gn/T/tmp9ie0wh9_/foo')

    def _basename(path):
        # A basename() variant which first strips the trailing slash, if present.
        # Thus we always get the last component of the path, even for directories.
        sep = os.path.sep + (os.path.altsep or '')
>       return os.path.basename(path.rstrip(sep))
E       AttributeError: 'PosixPath' object has no attribute 'rstrip'

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:526: AttributeError
============================================== 1 failed, 102 deselected in 0.30 seconds ===============================================

After change:

========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items / 102 deselected / 1 selected

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [100%]

============================================== 1 passed, 102 deselected in 0.06 seconds ===============================================

Running all the tests

╰─ pytest Lib/test/test_shutil.py -v
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items

Lib/test/test_shutil.py::TestShutil::test_chown PASSED                                                                          [  0%]
Lib/test/test_shutil.py::TestShutil::test_copy PASSED                                                                           [  1%]
Lib/test/test_shutil.py::TestShutil::test_copy2 PASSED                                                                          [  2%]
Lib/test/test_shutil.py::TestShutil::test_copy2_symlinks PASSED                                                                 [  3%]
Lib/test/test_shutil.py::TestShutil::test_copy2_xattr SKIPPED                                                                   [  4%]
Lib/test/test_shutil.py::TestShutil::test_copy_return_value PASSED                                                              [  5%]
Lib/test/test_shutil.py::TestShutil::test_copy_symlinks PASSED                                                                  [  6%]
Lib/test/test_shutil.py::TestShutil::test_copyfile_named_pipe PASSED                                                            [  7%]
Lib/test/test_shutil.py::TestShutil::test_copyfile_return_value PASSED                                                          [  8%]
Lib/test/test_shutil.py::TestShutil::test_copyfile_same_file PASSED                                                             [  9%]
Lib/test/test_shutil.py::TestShutil::test_copyfile_symlinks PASSED                                                              [ 10%]
Lib/test/test_shutil.py::TestShutil::test_copymode_follow_symlinks PASSED                                                       [ 11%]
Lib/test/test_shutil.py::TestShutil::test_copymode_symlink_to_symlink PASSED                                                    [ 12%]
Lib/test/test_shutil.py::TestShutil::test_copymode_symlink_to_symlink_wo_lchmod SKIPPED                                         [ 13%]
Lib/test/test_shutil.py::TestShutil::test_copystat_handles_harmless_chflags_errors PASSED                                       [ 14%]
Lib/test/test_shutil.py::TestShutil::test_copystat_symlinks PASSED                                                              [ 15%]
Lib/test/test_shutil.py::TestShutil::test_copytree_dangling_symlinks PASSED                                                     [ 16%]
Lib/test/test_shutil.py::TestShutil::test_copytree_named_pipe PASSED                                                            [ 17%]
Lib/test/test_shutil.py::TestShutil::test_copytree_retains_permissions PASSED                                                   [ 18%]
Lib/test/test_shutil.py::TestShutil::test_copytree_return_value PASSED                                                          [ 19%]
Lib/test/test_shutil.py::TestShutil::test_copytree_simple PASSED                                                                [ 20%]
Lib/test/test_shutil.py::TestShutil::test_copytree_special_func PASSED                                                          [ 21%]
Lib/test/test_shutil.py::TestShutil::test_copytree_symlink_dir PASSED                                                           [ 22%]
Lib/test/test_shutil.py::TestShutil::test_copytree_symlinks PASSED                                                              [ 23%]
Lib/test/test_shutil.py::TestShutil::test_copytree_winerror PASSED                                                              [ 24%]
Lib/test/test_shutil.py::TestShutil::test_copytree_with_exclude PASSED                                                          [ 25%]
Lib/test/test_shutil.py::TestShutil::test_copyxattr SKIPPED                                                                     [ 26%]
Lib/test/test_shutil.py::TestShutil::test_copyxattr_symlinks SKIPPED                                                            [ 27%]
Lib/test/test_shutil.py::TestShutil::test_disk_usage PASSED                                                                     [ 28%]
Lib/test/test_shutil.py::TestShutil::test_dont_copy_file_onto_link_to_itself PASSED                                             [ 29%]
Lib/test/test_shutil.py::TestShutil::test_dont_copy_file_onto_symlink_to_itself PASSED                                          [ 30%]
Lib/test/test_shutil.py::TestShutil::test_make_archive PASSED                                                                   [ 31%]
Lib/test/test_shutil.py::TestShutil::test_make_archive_cwd PASSED                                                               [ 32%]
Lib/test/test_shutil.py::TestShutil::test_make_archive_owner_group PASSED                                                       [ 33%]
Lib/test/test_shutil.py::TestShutil::test_make_tarball PASSED                                                                   [ 33%]
Lib/test/test_shutil.py::TestShutil::test_make_tarfile_in_curdir PASSED                                                         [ 34%]
Lib/test/test_shutil.py::TestShutil::test_make_zipfile PASSED                                                                   [ 35%]
Lib/test/test_shutil.py::TestShutil::test_make_zipfile_in_curdir PASSED                                                         [ 36%]
Lib/test/test_shutil.py::TestShutil::test_on_error PASSED                                                                       [ 37%]
Lib/test/test_shutil.py::TestShutil::test_register_archive_format PASSED                                                        [ 38%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_does_not_choke_on_failing_lstat PASSED                                         [ 39%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_dont_delete_file PASSED                                                        [ 40%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_errors PASSED                                                                  [ 41%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_fails_on_symlink PASSED                                                        [ 42%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_on_symlink PASSED                                                              [ 43%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_uses_safe_fd_version_if_available PASSED                                       [ 44%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_works_on_bytes PASSED                                                          [ 45%]
Lib/test/test_shutil.py::TestShutil::test_rmtree_works_on_symlinks PASSED                                                       [ 46%]
Lib/test/test_shutil.py::TestShutil::test_tarfile_root_owner PASSED                                                             [ 47%]
Lib/test/test_shutil.py::TestShutil::test_tarfile_vs_tar PASSED                                                                 [ 48%]
Lib/test/test_shutil.py::TestShutil::test_unpack_archive_bztar PASSED                                                           [ 49%]
Lib/test/test_shutil.py::TestShutil::test_unpack_archive_gztar PASSED                                                           [ 50%]
Lib/test/test_shutil.py::TestShutil::test_unpack_archive_tar PASSED                                                             [ 51%]
Lib/test/test_shutil.py::TestShutil::test_unpack_archive_xztar PASSED                                                           [ 52%]
Lib/test/test_shutil.py::TestShutil::test_unpack_archive_zip PASSED                                                             [ 53%]
Lib/test/test_shutil.py::TestShutil::test_unpack_registry PASSED                                                                [ 54%]
Lib/test/test_shutil.py::TestShutil::test_unzip_zipfile PASSED                                                                  [ 55%]
Lib/test/test_shutil.py::TestShutil::test_zipfile_vs_zip PASSED                                                                 [ 56%]
Lib/test/test_shutil.py::TestWhich::test_absolute_cmd PASSED                                                                    [ 57%]
Lib/test/test_shutil.py::TestWhich::test_basic PASSED                                                                           [ 58%]
Lib/test/test_shutil.py::TestWhich::test_cwd PASSED                                                                             [ 59%]
Lib/test/test_shutil.py::TestWhich::test_empty_path PASSED                                                                      [ 60%]
Lib/test/test_shutil.py::TestWhich::test_empty_path_no_PATH PASSED                                                              [ 61%]
Lib/test/test_shutil.py::TestWhich::test_environ_path PASSED                                                                    [ 62%]
Lib/test/test_shutil.py::TestWhich::test_non_matching_mode PASSED                                                               [ 63%]
Lib/test/test_shutil.py::TestWhich::test_nonexistent_file PASSED                                                                [ 64%]
Lib/test/test_shutil.py::TestWhich::test_pathext_checking SKIPPED                                                               [ 65%]
Lib/test/test_shutil.py::TestWhich::test_relative_cmd PASSED                                                                    [ 66%]
Lib/test/test_shutil.py::TestWhich::test_relative_path PASSED                                                                   [ 66%]
Lib/test/test_shutil.py::TestMove::test_destinsrc_false_negative PASSED                                                         [ 67%]
Lib/test/test_shutil.py::TestMove::test_destinsrc_false_positive PASSED                                                         [ 68%]
Lib/test/test_shutil.py::TestMove::test_dont_move_dir_in_itself PASSED                                                          [ 69%]
Lib/test/test_shutil.py::TestMove::test_existing_file_inside_dest_dir PASSED                                                    [ 70%]
Lib/test/test_shutil.py::TestMove::test_move_as_rename_return_value PASSED                                                      [ 71%]
Lib/test/test_shutil.py::TestMove::test_move_dangling_symlink PASSED                                                            [ 72%]
Lib/test/test_shutil.py::TestMove::test_move_dir PASSED                                                                         [ 73%]
Lib/test/test_shutil.py::TestMove::test_move_dir_altsep_to_dir SKIPPED                                                          [ 74%]
Lib/test/test_shutil.py::TestMove::test_move_dir_other_fs PASSED                                                                [ 75%]
Lib/test/test_shutil.py::TestMove::test_move_dir_sep_to_dir PASSED                                                              [ 76%]
Lib/test/test_shutil.py::TestMove::test_move_dir_special_function PASSED                                                        [ 77%]
Lib/test/test_shutil.py::TestMove::test_move_dir_symlink PASSED                                                                 [ 78%]
Lib/test/test_shutil.py::TestMove::test_move_dir_to_dir PASSED                                                                  [ 79%]
Lib/test/test_shutil.py::TestMove::test_move_dir_to_dir_other_fs PASSED                                                         [ 80%]
Lib/test/test_shutil.py::TestMove::test_move_file PASSED                                                                        [ 81%]
Lib/test/test_shutil.py::TestMove::test_move_file_other_fs PASSED                                                               [ 82%]
Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [ 83%]
Lib/test/test_shutil.py::TestMove::test_move_file_special_function PASSED                                                       [ 84%]
Lib/test/test_shutil.py::TestMove::test_move_file_symlink PASSED                                                                [ 85%]
Lib/test/test_shutil.py::TestMove::test_move_file_symlink_to_dir PASSED                                                         [ 86%]
Lib/test/test_shutil.py::TestMove::test_move_file_to_dir PASSED                                                                 [ 87%]
Lib/test/test_shutil.py::TestMove::test_move_file_to_dir_other_fs PASSED                                                        [ 88%]
Lib/test/test_shutil.py::TestMove::test_move_return_value PASSED                                                                [ 89%]
Lib/test/test_shutil.py::TestCopyFile::test_move_dir_caseinsensitive PASSED                                                     [ 90%]
Lib/test/test_shutil.py::TestCopyFile::test_w_dest_close_fails PASSED                                                           [ 91%]
Lib/test/test_shutil.py::TestCopyFile::test_w_dest_open_fails PASSED                                                            [ 92%]
Lib/test/test_shutil.py::TestCopyFile::test_w_source_close_fails PASSED                                                         [ 93%]
Lib/test/test_shutil.py::TestCopyFile::test_w_source_open_fails PASSED                                                          [ 94%]
Lib/test/test_shutil.py::TermsizeTests::test_bad_environ PASSED                                                                 [ 95%]
Lib/test/test_shutil.py::TermsizeTests::test_does_not_crash PASSED                                                              [ 96%]
Lib/test/test_shutil.py::TermsizeTests::test_fallback PASSED                                                                    [ 97%]
Lib/test/test_shutil.py::TermsizeTests::test_os_environ_first PASSED                                                            [ 98%]
Lib/test/test_shutil.py::TermsizeTests::test_stty_match SKIPPED                                                                 [ 99%]
Lib/test/test_shutil.py::PublicAPITests::test_module_all_attribute PASSED                                                       [100%]

================================================ 96 passed, 7 skipped in 1.25 seconds =================================================

@maxwellmckinnon
Copy link
Contributor

@emilyemorehouse: would you mind to add an unit test to make sure that the reported bug no longer fails? https://bugs.python.org/issue32689#msg310900

@vstinner Test is added, code added to make it pass. Performance testing was also done to check out a few other viable solutions to ensure mass renaming doesn't get a performance regression as a result of this change.

@gvanrossum
Copy link
Member

This PR has been stuck for over two weeks. Who's waiting for whom?

@ammaraskar
Copy link
Member

Looks like reviewers were waiting on @emilyemorehouse to add tests. @maxwellmckinnon made their own PR iterating on the feedback here: #15326

@emilyemorehouse: Would you like to finish this off yourself or do you mind if Maxwell continues it?

@maxwellmckinnon
Copy link
Contributor

I guess no one had any luck pinging @emilyemorehouse over the last month?

I found Emily’s email and pinged her there to come take a look.

Should we just merge the additional tweaks (giving Emily 1st credit of course) and new tests if nothing happens in two more weeks?

@gvanrossum
Copy link
Member

gvanrossum commented Sep 29, 2019 via email

@maxwellmckinnon
Copy link
Contributor

From Emily over email: “I haven't had the time to finish it, feel free to propose yours!”

#15326

No worries on getting this in for anything. I imagine it can always be put in as a patch release to 3.7? All it does is take the tediousness out of getting hit with “that one shutil function that doesn’t work with pathlib”, as the stackoverflow on this bug can attest to.

miss-islington pushed a commit that referenced this pull request Oct 1, 2019
…s source arg (GH-15326)

Important work originally done by @emilyemorehouse two years ago and nearly ready to go in.

This bug has affected many people and in some cases has been a dealbreaker to the adoption of the otherwise wonderful pathlib and PEP519. https://stackoverflow.com/questions/33625931/copy-file-with-pathlib-in-python.

This adds the outstanding test request from that PR @vstinner (#5393).

Test fails without the change, passes with it, along with every other test in test_shutil.

Some variants were experimented with to make the one line change and the most performant one was picked.


# Added Test for PathLike directory destination, the current fail case

```
Lib/test/test_shutil.py::TestMove::test_move_file_pathlike FAILED                                                               [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestMove.test_move_file_pathlike ___________________________________________________

self = <test.test_shutil.TestMove testMethod=test_move_file_pathlike>

    def test_move_file_pathlike(self):
        # Move a file to another location on the same filesystem.
        src = pathlib.Path(self.src_file)
>       self._check_move_file(src, self.dst_dir, self.dst_file)

Lib/test/test_shutil.py:1563:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Lib/test/test_shutil.py:1545: in _check_move_file
    shutil.move(src, dst)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:562: in move
    real_dst = os.path.join(dst, _basename(src))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/var/folders/r2/psq74t5x3nbfzlph8bh2pvdw0000gn/T/tmp9ie0wh9_/foo')

    def _basename(path):
        # A basename() variant which first strips the trailing slash, if present.
        # Thus we always get the last component of the path, even for directories.
        sep = os.path.sep + (os.path.altsep or '')
>       return os.path.basename(path.rstrip(sep))
E       AttributeError: 'PosixPath' object has no attribute 'rstrip'

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:526: AttributeError
============================================== 1 failed, 102 deselected in 0.30 seconds ===============================================
```

After change:

```
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items / 102 deselected / 1 selected

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [100%]

============================================== 1 passed, 102 deselected in 0.06 seconds ===============================================
```

Running all the tests in test_shutil.py
```
╰─ pytest Lib/test/test_shutil.py -v
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items

Lib/test/test_shutil.py::TestShutil::test_chown PASSED                                                                          [  0%]
Lib/test/test_shutil.py::TestShutil::test_copy PASSED                                                                           [  1%]
...
Lib/test/test_shutil.py::TermsizeTests::test_stty_match SKIPPED                                                                 [ 99%]
Lib/test/test_shutil.py::PublicAPITests::test_module_all_attribute PASSED                                                       [100%]

================================================ 96 passed, 7 skipped in 1.25 seconds =================================================
```

# Performance Considerations
Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?

e.g.

`real_dst = os.path.join(dst, _basename(src))`
becomes
`real_dst = Path(dst) / Path(src).name`

I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.

Here's the performance difference for this step. 

```
In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```

Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.

```
In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
62us to rename. 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.

What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.



https://bugs.python.org/issue32689



Automerge-Triggered-By: @gvanrossum
@gvanrossum
Copy link
Member

Superseded by #15326.

@gvanrossum gvanrossum closed this Oct 1, 2019
@gvanrossum gvanrossum deleted the fix-issue-32689 branch October 1, 2019 02:45
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
…s source arg (pythonGH-15326)

Important work originally done by @emilyemorehouse two years ago and nearly ready to go in.

This bug has affected many people and in some cases has been a dealbreaker to the adoption of the otherwise wonderful pathlib and PEP519. https://stackoverflow.com/questions/33625931/copy-file-with-pathlib-in-python.

This adds the outstanding test request from that PR @vstinner (python#5393).

Test fails without the change, passes with it, along with every other test in test_shutil.

Some variants were experimented with to make the one line change and the most performant one was picked.


# Added Test for PathLike directory destination, the current fail case

```
Lib/test/test_shutil.py::TestMove::test_move_file_pathlike FAILED                                                               [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestMove.test_move_file_pathlike ___________________________________________________

self = <test.test_shutil.TestMove testMethod=test_move_file_pathlike>

    def test_move_file_pathlike(self):
        # Move a file to another location on the same filesystem.
        src = pathlib.Path(self.src_file)
>       self._check_move_file(src, self.dst_dir, self.dst_file)

Lib/test/test_shutil.py:1563:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Lib/test/test_shutil.py:1545: in _check_move_file
    shutil.move(src, dst)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:562: in move
    real_dst = os.path.join(dst, _basename(src))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/var/folders/r2/psq74t5x3nbfzlph8bh2pvdw0000gn/T/tmp9ie0wh9_/foo')

    def _basename(path):
        # A basename() variant which first strips the trailing slash, if present.
        # Thus we always get the last component of the path, even for directories.
        sep = os.path.sep + (os.path.altsep or '')
>       return os.path.basename(path.rstrip(sep))
E       AttributeError: 'PosixPath' object has no attribute 'rstrip'

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:526: AttributeError
============================================== 1 failed, 102 deselected in 0.30 seconds ===============================================
```

After change:

```
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items / 102 deselected / 1 selected

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [100%]

============================================== 1 passed, 102 deselected in 0.06 seconds ===============================================
```

Running all the tests in test_shutil.py
```
╰─ pytest Lib/test/test_shutil.py -v
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items

Lib/test/test_shutil.py::TestShutil::test_chown PASSED                                                                          [  0%]
Lib/test/test_shutil.py::TestShutil::test_copy PASSED                                                                           [  1%]
...
Lib/test/test_shutil.py::TermsizeTests::test_stty_match SKIPPED                                                                 [ 99%]
Lib/test/test_shutil.py::PublicAPITests::test_module_all_attribute PASSED                                                       [100%]

================================================ 96 passed, 7 skipped in 1.25 seconds =================================================
```

# Performance Considerations
Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?

e.g.

`real_dst = os.path.join(dst, _basename(src))`
becomes
`real_dst = Path(dst) / Path(src).name`

I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.

Here's the performance difference for this step. 

```
In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```

Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.

```
In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
62us to rename. 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.

What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.



https://bugs.python.org/issue32689



Automerge-Triggered-By: @gvanrossum
sneakers-the-rat added a commit to sneakers-the-rat/pynwb that referenced this pull request Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.