A simple/experimental parallel file copy utility.
File transfers on a network file system often benefit from parallelization. Python's `shutil.copytree(..., copy_function=...)` provides a hook that we can leverage to submit each copy task to a work queue managed by `concurrent.futures.ThreadPoolExecutor`. The implementation has pleasantly few lines:
```python
import concurrent.futures
import shutil

class MultithreadedCopier(concurrent.futures.ThreadPoolExecutor):
    def copy(self, source, dest):
        self.submit(shutil.copy2, source, dest)  # queue the copy on the pool

with MultithreadedCopier(max_workers=workers) as copier:
    shutil.copytree(src, dest, copy_function=copier.copy)
```

That's it! In practice, we add a little more code for a CLI, exception handling, and to make it behave like `cp -r src dst` for multiple `src` and for when `dst` exists.
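For illustration, a minimal CLI wrapper around the same idea might look like the sketch below. This is an assumption-laden sketch, not the actual pcopy implementation: only the `-w` flag matches the real interface, and the real tool additionally handles multiple sources, existing destinations, and exceptions from the worker threads.

```python
# Hypothetical sketch of a CLI wrapper, not the actual pcopy code.
import argparse
import concurrent.futures
import shutil


class MultithreadedCopier(concurrent.futures.ThreadPoolExecutor):
    def copy(self, source, dest):
        self.submit(shutil.copy2, source, dest)


def main():
    parser = argparse.ArgumentParser(prog="pcopy", description="parallel cp -r")
    parser.add_argument("src")
    parser.add_argument("dst")
    parser.add_argument("-w", "--workers", type=int, default=None,
                        help="number of copy threads (default: executor default)")
    args = parser.parse_args()

    # All copies are queued during the tree walk and finish when the
    # executor shuts down at the end of the with block.
    with MultithreadedCopier(max_workers=args.workers) as copier:
        shutil.copytree(args.src, args.dst, copy_function=copier.copy)


if __name__ == "__main__":
    main()
```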
This package is designed to be used from the command line. Install it with pip (probably in a venv) and use the `pcopy` command as one would use `cp -r`:
```bash
pip install git+https://github.com/lgarrison/pcopy
pcopy src dst
```

Or, install it as a "tool" with uv or pipx:
```bash
uv tool install git+https://github.com/lgarrison/pcopy
pcopy src dst
```

Select the number of threads with `-w`:
```bash
pcopy -w 4 src dst
```

Some questions to explore in the future:
- This only parallelizes the file copies (by releasing the GIL), not the directory walk or the directory creation. Can we make a fast Python version that parallelizes those parts? (A rough sketch of one approach follows this list.)
- Would nogil Python help with that version? Would nogil Python help with the current version?
- Can we get decent small-file/many-dir performance out of Python, or should we use a compiled language?
- What if we want to distribute the work over many nodes?
- Would multi-process be better in any sense than multi-threaded?
- A progress meter would be nice, but that would probably require walking the tree twice.
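On the first point, one possible sketch (not part of pcopy; the function name and the symlink handling are assumptions for illustration) is to have worker threads recurse into subdirectories themselves, so directory creation and scanning are also spread across the pool:

```python
# Hypothetical sketch: parallelize the directory walk as well as the copies.
# Simplified: symlinks are followed and copied as regular files.
import concurrent.futures
import os
import shutil


def copy_tree_parallel(src, dst, workers=None):
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []

        def walk(s, d):
            os.makedirs(d, exist_ok=True)
            with os.scandir(s) as it:
                for entry in it:
                    target = os.path.join(d, entry.name)
                    if entry.is_dir(follow_symlinks=False):
                        # Each subdirectory becomes its own task on the pool.
                        futures.append(pool.submit(walk, entry.path, target))
                    else:
                        futures.append(pool.submit(shutil.copy2, entry.path, target))

        walk(src, dst)
        # Drain the queue; a walk task appends its children before it
        # finishes, so the list only empties once everything is done.
        # .result() also re-raises any exception from a worker.
        while futures:
            futures.pop().result()
```

Whether this actually beats the simple copytree-based version likely depends on filesystem latency and how much of the walk is GIL-bound, which is exactly the open question above.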