Use multithreading for file copying on Windows #51
@@ -119,6 +119,22 @@ mcd_root/
- Python >= 3.8
- Option `backup_format` is `plain`

### copy_thread_active
Please keep the English documentation in sync.

Note:

- In general, multithreaded copying can increase the speed by 4-5x (test result)
If you want to show a performance difference, please include the concrete test scenario.
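The tests are described in the PR.
Also, because of differences in world size, system environment and so on, the improvement can vary widely between use cases; in the less favorable cases it may be only around 20%.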
A PR is not documentation. In the README, either leave it out or write it up properly.

- In general, multithreaded copying can increase the speed by 4-5x (test result)

- A thread count above 8 is not recommended
Please provide a source for this recommendation.
@@ -1,13 +1,15 @@
 {
   "id": "quick_backup_multi",
-  "version": "1.9.0",
+  "version": "1.10.0-beta",
Please do not change the version number; that is not something a feature PR like this should do.
   "name": "Quick Backup Multi",
   "description": {
     "en_us": "A backup / restore plugin, with multiple backup slot",
     "zh_cn": "多槽位备份/回档插件"
   },
   "author": [
-    "Fallen_Breath"
+    "Fallen_Breath",
You are only a contributor, not a maintainer/author.
@@ -30,6 +30,44 @@ class CopyWorldIntent(Enum):
    backup = auto()
    restore = auto()

# 多线程复制文件
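Please write comments in English, and do not write meaningless comments of this kind that merely restate the name in another language.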
step = int(len(files_from)//thread_counts)
f = lambda _list: [_list[i:i+step] for i in range(0,len(_list),step)] # 切分文件为thread_count份
files_from,files_to = f(files_from),f(files_to)
for thread in range(thread_counts): # 多线程复制
Consider using a thread pool, `ThreadPoolExecutor`. With a thread pool there is no need to split the copy tasks manually or to manage thread lifecycles.
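To illustrate the suggestion, here is a minimal sketch of copying a list of files through `ThreadPoolExecutor`. The helper name `copy_files_with_pool` and its parameters are hypothetical and not part of the plugin; only `shutil.copy2` and the `concurrent.futures` API are taken as given.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def copy_files_with_pool(pairs: List[Tuple[str, str]], max_workers: int = 4) -> int:
    """Copy every (src, dst) pair using a thread pool (hypothetical helper)."""
    with ThreadPoolExecutor(max_workers=max(1, max_workers)) as pool:
        # One task per file: the pool distributes the work, so no manual chunking.
        futures = [pool.submit(shutil.copy2, src, dst) for src, dst in pairs]
        # result() re-raises any exception that occurred in a worker thread.
        for future in futures:
            future.result()
    return len(futures)
```

Leaving the `with` block waits for all submitted tasks, so no explicit `join()` calls are needed.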
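I am still experimenting with different ways of invoking threads. Thanks for the pointer; I will try the thread pool and measure its timing.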
s_time = time.time()
for i in threads:
    i.join()
server_inst.logger.info(time.time()-s_time)
If the timing output is only for debugging, please remove it.
for file_from, file_to in zip(files_from,files_to):
    try:
        shutil.copy2(file_from,file_to)
    except PermissionError:
Please do not suppress exceptions without a reason.
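This is because session.lock can in some cases be locked by Minecraft and cannot be copied, which raises a PermissionError. Would it be a better solution to check the file name inside the except block and re-raise the exception for any file other than session.lock?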
Files such as `session.lock` can already be ignored through the `ignored_files` config option.
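As a sketch of that alternative: `shutil.copytree` accepts an `ignore` callback, so locked files can be skipped by name instead of swallowing `PermissionError`. The `IGNORED_FILES` list and the `ignore_listed` and `copy_world` functions below are stand-ins for the plugin's `ignored_files` option and are assumptions, not the plugin's actual code.

```python
import shutil
from typing import List, Set

# Stand-in for the plugin's `ignored_files` option (assumed value).
IGNORED_FILES = ['session.lock']


def ignore_listed(directory: str, names: List[str]) -> Set[str]:
    """copytree `ignore` callback: return the names in `directory` to skip."""
    return {name for name in names if name in IGNORED_FILES}


def copy_world(src_path: str, dst_path: str) -> None:
    # Ignored files are never opened, so a lock held by the server cannot
    # raise PermissionError here; every other error still propagates.
    shutil.copytree(src_path, dst_path, ignore=ignore_listed)
```

With this approach no `except PermissionError: pass` is needed, and an unexpected permission problem on any other file stays visible.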
    shutil.copy2(file_from,file_to)
except PermissionError:
    pass
if id != None:
If the timing output is only for debugging, please remove it.
try:
    MultithreadedCopy(src_path, dst_path, config.copy_thread_active)
except Exception as e:
    server_inst.logger.warn(f"多线程复制出错,使用常规复制 {e}")
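- Please use English
- Please explain why the retry is needed: when would the multithreaded copy fail while the single-threaded copy succeeds?
- If something goes wrong, perform a proper recovery instead of simply retrying with the regular copy method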
So far in my debugging I have not seen a case where the multithreaded copy fails.
@@ -30,6 +30,44 @@ class CopyWorldIntent(Enum):
    backup = auto()
    restore = auto()

# 多线程复制文件
class MultithreadedCopy:
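Considering how this class is used:
QuickBackupM/quick_backup_multi/__init__.py
Lines 172 to 174 in eb06543
try:
    MultithreadedCopy(src_path, dst_path, config.copy_thread_active)
except Exception as e:
Please do not use a class as if it were a function; defining an ordinary function is enough. The member functions can all be defined as nested functions inside it.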
shutil.copytree(src_path, dst_path, ignore=lambda path, files: set(filter(config.is_file_ignored, files)), copy_function=copy_file_fast)
if config.copy_thread_active >= 1:
    try:
        MultithreadedCopy(src_path, dst_path, config.copy_thread_active)
Please support the `config.is_file_ignored` feature correctly.
See the review items on the Files changed page.
Use a thread pool, supporting:
def copy_tree_fast(src_path: str, dst_path: str, ignore=None, copy_function: Callable[[str, str], object] = shutil.copy2):
    with ThreadPoolExecutor(max_workers=max(1, config.copy_thread_active), thread_name_prefix='QBMFileCopier') as pool:
        def threaded_copy(s: str, d: str):
            tasks.append((s, d, pool.submit(copy_function, s, d)))
        tasks = []
        shutil.copytree(src_path, dst_path, ignore=ignore, copy_function=threaded_copy)
        # expose the possible exceptions
        for src, dst, future in tasks:
            try:
                future.result()
            except Exception:
                server_inst.logger.error('Failed to copy file from {} to {}'.format(src, dst))
                raise
@@ -130,7 +148,7 @@
     server_inst.logger.info('copying {} -> {}'.format(src_path, dst_path))
     if os.path.isdir(src_path):
-        shutil.copytree(src_path, dst_path, ignore=lambda path, files: set(filter(config.is_file_ignored, files)), copy_function=copy_file_fast)
+        copy_tree_fast(src_path, dst_path, ignore=lambda path, files: set(filter(config.is_file_ignored, files)), copy_function=copy_file_fast)
     elif os.path.isfile(src_path):
         dst_dir = os.path.dirname(dst_path)
         if not os.path.isdir(dst_dir):
@@ -11,6 +11,7 @@ class Configuration(Serializable):
    size_display: bool = True
    turn_off_auto_save: bool = True
    enable_copy_file_range: bool = False
    copy_thread_active: int = 4
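Please use `0` or `1` as the default value so that multithreaded copying is off by default. After all, this feature is not free of side effects.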
Parallel copying has been implemented in f39a393.
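Multithreading on Windows
Ordinary copy functions such as `copytree` are scheduled inefficiently on Windows and cannot make full use of the disk.
Using multiple threads can increase the speed by about 300%-600%.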
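Additions
Function
The `shutil.copytree` function is first used to obtain the file list, and the `threading` module creates the threads. The files to copy are split into n parts according to the thread count and distributed to the threads, which greatly improves backup speed on Windows.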
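The approach described above can be sketched roughly as follows. This is a reconstruction from the description, not the PR's code: the names `collect_copy_pairs`, `split_chunks` and `copy_with_threads` are hypothetical. Passing a recording function as `copy_function` makes `shutil.copytree` create the directory structure while only collecting the file pairs.

```python
import shutil
import threading
from typing import List, Tuple

Pair = Tuple[str, str]


def collect_copy_pairs(src_path: str, dst_path: str) -> List[Pair]:
    """Let copytree create the directories and record the files to copy."""
    pairs: List[Pair] = []
    shutil.copytree(src_path, dst_path, copy_function=lambda s, d: pairs.append((s, d)))
    return pairs


def split_chunks(items: list, n: int) -> List[list]:
    """Split items into at most n chunks of near-equal size."""
    n = max(1, n)
    step = max(1, -(-len(items) // n))  # ceiling division, never zero
    return [items[i:i + step] for i in range(0, len(items), step)]


def copy_with_threads(src_path: str, dst_path: str, thread_count: int) -> int:
    """Copy a directory tree with one thread per chunk; returns the file count."""
    pairs = collect_copy_pairs(src_path, dst_path)
    errors: List[BaseException] = []

    def worker(chunk: List[Pair]) -> None:
        try:
            for src, dst in chunk:
                shutil.copy2(src, dst)
        except Exception as e:  # collected here, re-raised by the caller below
            errors.append(e)

    threads = [threading.Thread(target=worker, args=(chunk,))
               for chunk in split_chunks(pairs, thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]
    return len(pairs)
```

The sketch uses ceiling division for the chunk size so that a file count smaller than the thread count does not produce a zero step, and so that the number of chunks never exceeds the thread count.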
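Config
copy_thread_active
Default: 4
Multiple threads are created to issue disk requests concurrently and speed up copying; this option controls the number of threads created.
When this option is set to 0, multithreaded copying is disabled and the traditional copy method is used.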
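One way to read this option in code, as a sketch: the `Config` dataclass and the `pick_copy_mode` function below are hypothetical and only mirror the semantics described above. Treating negative values like 0 is an assumption of this sketch, not something the description states.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical stand-in for the plugin configuration;
    # 4 is the default proposed in this PR description.
    copy_thread_active: int = 4


def pick_copy_mode(config: Config) -> str:
    """Return 'traditional' when the option is 0 (or negative), else 'threaded'."""
    if config.copy_thread_active <= 0:
        return 'traditional'
    return 'threaded'
```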
Tests
Environment:
Results:
Speed improved by about 300%-600%.
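Since no environment details are recorded here, a reproducible measurement could be sketched like this. The script is an assumption about how such a test might be run, not the procedure behind the figures above; `make_tree`, `threaded_copytree`, `time_copy` and the file counts are hypothetical, and the outcome will depend on the disk, file sizes and operating system.

```python
import os
import shutil
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def make_tree(root: str, file_count: int = 200, size: int = 4096) -> None:
    """Create file_count files of `size` random bytes under root (synthetic data)."""
    os.makedirs(root, exist_ok=True)
    for i in range(file_count):
        with open(os.path.join(root, 'file_{}.bin'.format(i)), 'wb') as fh:
            fh.write(os.urandom(size))


def threaded_copytree(src: str, dst: str, workers: int) -> None:
    """Copy a tree, submitting each file copy to a thread pool."""
    with ThreadPoolExecutor(max_workers=max(1, workers)) as pool:
        futures = []
        shutil.copytree(src, dst, copy_function=lambda s, d: futures.append(pool.submit(shutil.copy2, s, d)))
        for future in futures:
            future.result()  # surface any copy error


def time_copy(copy: Callable[[str, str], object], src: str) -> float:
    """Copy src into a fresh temporary directory and return elapsed seconds."""
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        copy(src, os.path.join(tmp, 'copy'))
        return time.perf_counter() - start


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as base:
        src = os.path.join(base, 'world')
        make_tree(src)
        print('single thread:', time_copy(lambda s, d: shutil.copytree(s, d), src))
        print('4 threads:', time_copy(lambda s, d: threaded_copytree(s, d, 4), src))
```

Reporting the disk type, world size, file count and Python version alongside such timings would make the claimed speedup verifiable.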